We analyze subword-based language models (LMs) in large-vocabulary continuous speech recognition across four “morphologically rich” languages: Finnish, Estonian, Turkish, and ...
-- In this paper, an efficient approach to segment Persian off-line handwritten text-line into characters is presented. The proposed algorithm first traces the baseline of the inpu...
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, ...
Yaser Al-Onaizan, Radu Florian, Martin Franz, Hany...
In this paper, we present a multi-level recognizer for online Arabic handwriting. In Arabic script (handwritten and printed), cursive writing – is not a style – it is an inher...
Arabic, a highly inflected language, requires good stemming for effective information retrieval, yet no standard approach to stemming has emerged. We developed several light stemm...
Leah S. Larkey, Lisa Ballesteros, Margaret E. Conn...