EMNLP
2009

Improved Word Alignment with Statistics and Linguistic Heuristics

We present a method to align words in a bitext that combines elements of a traditional statistical approach with linguistic knowledge. We demonstrate this approach for Arabic-English, using an alignment lexicon produced by a statistical word aligner, as well as linguistic resources ranging from an English parser to heuristic alignment rules for function words. These linguistic heuristics have been generalized from a development corpus of 100 parallel sentences. Our aligner, UALIGN, outperforms both the commonly used GIZA++ aligner and the state-of-the-art LEAF aligner on F-measure and produces superior scores in end-to-end sta
Ulf Hermjakob
Added 17 Feb 2011
Updated 17 Feb 2011
Type Conference
Year 2009
Where EMNLP
Authors Ulf Hermjakob