Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
We attemped to improve recognition accuracy by reducing the inadequacies of the lexicon and language model. Specifically we address the following three problems: (1) the best size...
Richard M. Schwartz, Long Nguyen, Francis Kubala, ...
This paper presents the results of the State University of New York at Buffalo (UB) in the Mono-lingual and Multi-lingual tasks at CLEF 2004. For these tasks we used an approach ba...
In this paper we show how to train statistical machine translation systems on reallife tasks using only non-parallel monolingual data from two languages. We present a modificatio...
In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to avoid a data sparseness problem for spoken language in that it is difficult to collect traini...