This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT). We make use of the collocation probabilities, which are estimated from monoli...
Document images undergo various degradation processes. Numerous models of these degradation processes have been proposed in the literature. In this paper we propose a modelbased r...
Supervised sequence-labeling systems in natural language processing often suffer from data sparsity because they use word types as features in their prediction tasks. Consequently...
In cross-language information retrieval it is often important to align words that are similar in meaning in two corpora written in different languages. Previous research shows tha...
We present Promodes, an algorithm for unsupervised word decomposition, which is based on a probabilistic generative model. The model considers segment boundaries as hidden variable...