Current statistical machine translation (SMT) systems are trained on sentencealigned and word-aligned parallel text collected from various sources. Translation model parameters ar...
Spyros Matsoukas, Antti-Veikko I. Rosti, Bing Zhan...
Unlike conventional data or text, Web pages typically contain a large amount of information that is not part of the main contents of the pages, e.g., banner ads, navigation bars, ...
This paper addresses personal E-mail filtering by casting it in the framework of text classification. Modeled as semi-structured documents, Email messages consist of a set of field...
Sliding window classifiers are among the most successful and widely applied techniques for object localization. However, training is typically done in a way that is not specific to...
Dataset shift from the training data in a source domain to the data in a target domain poses a great challenge for many statistical learning methods. Most algorithms can be viewed ...