In this paper, with a belief that a language model that embraces a larger context provides better prediction ability, we present two extensions to standard n-gram language models ...
This paper proposes a forest-based tree sequence to string translation model for syntaxbased statistical machine translation, which automatically learns tree sequence to string tr...
Hui Zhang, Min Zhang, Haizhou Li, AiTi Aw, Chew Li...
Online communities are increasingly important in modern social life. Yet, the diverse collection of guidelines that have directed online community design may not be keeping pace w...
Truecasing is the process of restoring case information to badly-cased or noncased text. This paper explores truecasing issues and proposes a statistical, language modeling based ...
Lucian Vlad Lita, Abraham Ittycheriah, Salim Rouko...
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...