We present three novel methods of compactly storing very large n-gram language models. These methods use substantially less space than all known approaches and allow n-gram probab...
Language resources, including corpus and tools, are normally required to be combined in order to achieve a user's specific task. However, resources tend to be developed indep...
Yoshinobu Kano, Ruben Dorado, Luke McCrohon, Sophi...
The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web ï...
There have been several authoring methods proposed in the literature that are model based, essentially following the Model Driven Design philosophy. While useful, such methods nee...