LREC
2010

NP Alignment in Bilingual Corpora

We created a simple gold standard for English-Hungarian NP-level alignment on Orwell's 1984 by manually verifying the automatically generated NP chunking and manually aligning the maximal NPs and PPs. Because the results depend heavily on the quality of the NP chunking, we tested our alignment algorithms in two settings: on real-world (machine-obtained) chunkings, where the baseline algorithm, which propagates GIZA++ word alignments to the NP level, scores in the .35 range, and on the gold chunkings, where the baseline reaches .4 and our current system reaches .74.
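The abstract does not spell out how word alignments are propagated to the NP level. As a rough illustration only, the sketch below assumes one plausible reading: each source chunk is linked to the target chunk with which it shares the most word-level links. The function name `propagate_alignments`, the span representation, and the majority-vote rule are all assumptions made for this example, not the authors' published procedure, and the word links are taken as plain index pairs rather than read from any particular aligner's output format.

```python
from collections import Counter


def propagate_alignments(src_chunks, tgt_chunks, word_links):
    """Link each source chunk to the target chunk sharing the most word links.

    Hypothetical helper for illustration; not the paper's algorithm.
    src_chunks, tgt_chunks: lists of (start, end) token spans, end exclusive.
    word_links: iterable of (src_token_index, tgt_token_index) pairs.
    Returns a dict mapping source chunk index -> target chunk index.
    """
    def chunk_of(token, chunks):
        # Return the index of the chunk containing this token, if any.
        for i, (start, end) in enumerate(chunks):
            if start <= token < end:
                return i
        return None

    # Count word links between every (source chunk, target chunk) pair.
    votes = Counter()
    for s, t in word_links:
        si = chunk_of(s, src_chunks)
        ti = chunk_of(t, tgt_chunks)
        if si is not None and ti is not None:
            votes[(si, ti)] += 1

    # Keep the best-supported target chunk per source chunk.
    # Ties go to the pair that was seen first (an arbitrary choice).
    best = {}
    for (si, ti), n in votes.items():
        if si not in best or n > votes[(si, best[si])]:
            best[si] = ti
    return best


# Toy example: two chunks per side, six word links.
src_chunks = [(0, 2), (3, 5)]
tgt_chunks = [(0, 1), (2, 5)]
word_links = [(0, 0), (1, 0), (3, 2), (4, 3), (4, 0), (2, 1)]
chunk_alignment = propagate_alignments(src_chunks, tgt_chunks, word_links)
```

Under this reading, errors in chunk boundaries shift tokens into the wrong chunk or out of any chunk, which is one way the chunking quality could affect alignment scores as the abstract reports.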
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Gabor Recski, András Rung, Attila Zséder, András Kornai