ACL Anthology

Empirical Methods for Compound Splitting

Compounded words are a challenge for NLP applications such as machine translation (MT). We introduce methods to learn splitting rules from monolingual and parallel corpora. We evaluate them against a gold standard and measure their impact on performance of statistical MT systems. Results show accuracy of 99.1% and performance gains for MT of 0.039 BLEU on a German-English noun phrase translation task.
Philipp Koehn, Kevin Knight
Added: 31 Oct 2010
Updated: 31 Oct 2010
Type: Conference
Year: 2003
Where: EACL
Authors: Philipp Koehn, Kevin Knight