BLEU is the de facto standard for evaluation and development of statistical machine translation systems. We describe three real-world situations involving comparisons between diff...
David Chiang, Steve DeNeefe, Yee Seng Chan, Hwee T...
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
For most English words, dictionaries give various senses: e.g., “bank” can stand for a financial institution, shore, set, etc. Automatic selection of the sense intended in a gi...
Alexander F. Gelbukh, Grigori Sidorov, Sang-Yong H...
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, ...
Yaser Al-Onaizan, Radu Florian, Martin Franz, Hany...