Language modeling for an inflected language such as Arabic poses new challenges for speech recognition and machine translation due to its rich morphology. Rich morphology results i...
Speech recognition of inflectional and morphologically rich languages like Czech is currently quite a challenging task, because simple n-gram techniques are unable to capture impo...
We present a simple method for language independent and task independent text categorization learning, based on character-level n-gram language models. Our approach uses simple in...
Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
We attemped to improve recognition accuracy by reducing the inadequacies of the lexicon and language model. Specifically we address the following three problems: (1) the best size...
Richard M. Schwartz, Long Nguyen, Francis Kubala, ...