Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model which is essential for achieving good translation quality. ...
We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...
Statistical Machine Translation (SMT) is based on alignment models which learn from bilingual corpora the word correspondences between source and target language. These models are...
We present a novel approach to integrate transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, based on conditional and joint pr...
Nadir Durrani, Hassan Sajjad, Alexander Fraser, He...
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike most western languages. Therefore Chinese word segmentation is considered an ...