Our goal is to explore methods for combining structured but incomplete information from dictionaries with the unstructured but more complete information available in corpora for t...
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
In this paper we look at the problem of cleansing noisy text using a statistical machine translation model. Noisy text is produced in informal communications such as Short Message...
Danish Contractor, Tanveer A. Faruquie, L. Venkata...
Recent advances in statistical machine translation have used approximate beam search for NP-complete inference within probabilistic translation models. We present an alternative ap...
Abhishek Arun, Barry Haddow, Philipp Koehn, Adam L...
Targeted paraphrasing is a new approach to the problem of obtaining cost-effective, reasonable-quality translation that makes use of simple and inexpensive human computations by m...
Philip Resnik, Olivia Buzek, Chang Hu, Yakov Kronr...