We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate se...
Majid Razmara, George Foster, Baskaran Sankaran, A...
The dominant practice of statistical machine translation (SMT) uses the same Chinese word segmentation specification in both alignment and translation rule induction steps in buil...
Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang,...
Some Statistical Machine Translation systems never see the light because the owner of the appropriate training data cannot release them, and the potential user of the system canno...
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of ...