The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
This paper presents a maximum entropy machine translation system using a minimal set of translation blocks (phrase-pairs). While recent phrase-based statistical machine translatio...
Duration of phonemic segments provide important cues for distinguishing words in languages such as Arabic. Recently, we proposed a discriminatively estimated joint acoustic, durat...
In this work, we focus on an improvement of a multi-script handwritting recognition system using a HMM based classifiers combination. The improvement relies on the use of Dempster-...
We present a method to transliterate names in the framework of end-to-end statistical machine translation. The system is trained to learn when to transliterate. For Arabic to Engl...