CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...
We investigate the use of Fisher's exact significance test for pruning the translation table of a hierarchical phrase-based statistical machine translation system. In additio...
Minimum Error Rate Training (MERT) and Minimum Bayes-Risk (MBR) decoding are used in most current state-of-theart Statistical Machine Translation (SMT) systems. The algorithms wer...
Shankar Kumar, Wolfgang Macherey, Chris Dyer, Fran...
As tokenization is usually ambiguous for many natural languages such as Chinese and Korean, tokenization errors might potentially introduce translation mistakes for translation sy...
Xinyan Xiao, Yang Liu, Young-Sook Hwang, Qun Liu, ...
This paper proposes to introduce a novel reordering model in the open-source Moses toolkit. The main idea is to provide weighted reordering hypotheses to the SMT decoder. These hy...