We describe the Arabic broadcast transcription system elded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include improvements ...
Brian Kingsbury, Hagen Soltau, George Saon, Stephe...
Clustering is a basic task in a variety of machine learning applications. Partitioning a set of input vectors into compact, wellseparated subsets can be severely affected by the p...
Pedro A. Forero, Vassilis Kekatos, Georgios B. Gia...
We present an unsupervised model for joint phrase alignment and extraction using nonparametric Bayesian methods and inversion transduction grammars (ITGs). The key contribution is...
Graham Neubig, Taro Watanabe, Eiichiro Sumita, Shi...
Abstract. We describe a framework for Adaptive Multilingual Information Retrieval (AMIR) which allows multilingual resource discovery and delivery using on-the-fly machine transla...
Long-span features, such as syntax, can improve language models for tasks such as speech recognition and machine translation. However, these language models can be difficult to u...