Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...
The distortion cost function used in Mosesstyle machine translation systems has two flaws. First, it does not estimate the future cost of known required moves, thus increasing sea...
Spence Green, Michel Galley, Christopher D. Mannin...
Background: Associating literature with pathways poses new challenges to the Text Mining (TM) community. There are three main challenges to this task: (1) the identification of th...
Kanae Oda, Jin-Dong Kim, Tomoko Ohta, Daisuke Okan...
In this paper, we propose a new application of Bayesian language model based on Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distr...
This paper proposes a new method for automatic acquisition of Chinese bracketing knowledge from English-Chinese sentencealigned bilingual corpora. Bilingual sentence pairs are fir...