Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of th...
Data selection has emerged as a common issue in language technologies. We define data selection as choosing the subset of training data that is most effective for a given tas...
Jonathan Clark, Robert E. Frederking, Lori S. Levi...
Long-span features, such as syntax, can improve language models for tasks like speech recognition and machine translation. However, these language models can be difficult to u...
The authors propose a model for analyzing English sentences that include coordinate conjunctions such as "and", "or", "but" and equivalent words. Synt...
For many years, statistical machine translation relied on generative models to provide bilingual word alignments. In 2005, several independent efforts showed that discriminative m...