Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to genera...
Matthew G. Snover, Bonnie J. Dorr, Richard M. Schw...
This paper reports experiments in which pCRU — a generation framework that combines probabilistic generation methodology with a comprehensive model of the generation space — i...
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
This paper presents two different approaches to automatic captioning of geo-tagged images by summarizing multiple web-documents that contain information related to an image’s lo...