Offline handwriting recognition--the transcription of images of handwritten text--is an interesting task, in that it combines computer vision with sequence learning. In most syste...
We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four “morphologically rich” languages: Finnish, Estonian, Turkish, and ...
We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letteralignment and letter n-gram pairs learned from ...
Bing Zhao, Nguyen Bach, Ian R. Lane, Stephan Vogel
We present a joint morphological-lexical language model (JMLLM) for use in statistical machine translation (SMT) of language pairs where one or both of the languages are morpholog...
In this paper, we argue that n-gram language models are not sufficient to address word reordering required for Machine Translation. We propose a new distortion model that can be u...