Data sparseness is one of the factors that degrade statistical machine translation (SMT). Existing work has shown that using morphosyntactic information is an effective solution t...
We describe a methodology for rapid experimentation in statistical machine translation which we use to add a large number of features to a baseline system exploiting features from...
Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur,...
Extractive summarization techniques cannot generate document summaries shorter than a single sentence, something that is often required. An ideal summarization system would unders...
Michele Banko, Vibhu O. Mittal, Michael J. Witbroc...
For the patent classification task of the 2010 CLEF-IP evaluation we have used three different approaches combining semantics and statistics-driven techniques: first approach is b...
Franck Derieux, Mihaela Bobeica, Delphine Pois, Je...
Today there is a relatively large body of work on automatic acquisition of lexicosyntactical preferences (subcategorization) from corpora. Various techniques have been developed t...