This paper describes our first participation in the Indian language sub-task of the main Adhoc monolingual and bilingual track in CLEF1 competition. In this track, the task is to...
Current re-ranking algorithms for machine translation rely on log-linear models, which have the potential problem of underfitting the training data. We present BoostedMERT, a nove...
We present a novel method for discovering and modeling the relationship between informal Chinese expressions (including colloquialisms and instant-messaging slang) and their forma...
This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...
We describe a machine learning approach for predicting sponsored search ad relevance. Our baseline model incorporates basic features of text overlap and we then extend the model t...
Dustin Hillard, Stefan Schroedl, Eren Manavoglu, H...