NP Alignment in Bilingual Corpora

Download www.kornai.com

We created a simple gold standard for English-Hungarian NP-level alignment, based on Orwell's 1984, by manually verifying the automatically generated NP chunking and manually aligning the maximal NPs and PPs. Since the results depend heavily on the quality of the NP chunking, we tested our alignment algorithms both on real-world (machine-obtained) chunkings, where results are in the .35 range for the baseline algorithm, which propagates GIZA++ word alignments to the NP level, and on the gold chunkings, where the baseline reaches .4 and our current system reaches .74.
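
The abstract says the baseline propagates GIZA++ word alignments to the NP level but does not say how. One plausible reading, sketched below in Python as an assumption and not as the authors' implementation, is to let each word link vote for the pair of chunks that contain its two tokens and to keep, for every source chunk, the target chunk with the most votes. The function name `propagate_to_chunks`, the span representation (token offsets, end exclusive), and the tie-breaking rule are all hypothetical.

```python
from collections import Counter


def propagate_to_chunks(word_links, src_chunks, tgt_chunks):
    """Lift word-level links to chunk-level links by majority vote.

    word_links: iterable of (src_token_idx, tgt_token_idx) pairs,
                e.g. parsed from a word aligner's output (assumed format).
    src_chunks, tgt_chunks: lists of (start, end) token spans, end exclusive.
    Returns a dict mapping source chunk index -> target chunk index.
    """
    def token_to_chunk(spans):
        # Map every token position inside a chunk to that chunk's index.
        lookup = {}
        for chunk_idx, (start, end) in enumerate(spans):
            for tok in range(start, end):
                lookup[tok] = chunk_idx
        return lookup

    src_of = token_to_chunk(src_chunks)
    tgt_of = token_to_chunk(tgt_chunks)

    # Each word link whose two ends both fall inside chunks casts one vote.
    votes = Counter()
    for s, t in word_links:
        if s in src_of and t in tgt_of:
            votes[(src_of[s], tgt_of[t])] += 1

    # Keep the best-supported target chunk per source chunk.
    # Ties go to the lower target index (an arbitrary, hypothetical choice).
    best = {}
    for (sc, tc), n in sorted(votes.items()):
        if sc not in best or n > votes[(sc, best[sc])]:
            best[sc] = tc
    return best
```

Links whose tokens fall outside any chunk (verbs, for instance) are simply ignored in this sketch, which is one reason chunking quality would directly limit the quality of the chunk-level alignment.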

Gabor Recski, András Rung, Attila Zséder, András Kornai

Education | English-Hungarian NP-level Alignment | LREC 2010 | NP Chunking | Simple Gold Standard |

» Using Movie Subtitles for Creating a Large-Scale Bilingual Corpora

» French-English Terminology Extraction from Comparable Corpora

» Flow Network Models for Word Alignment and Terminology Extraction from Bilingual Corpora

» Aligning More Words with High Precision for Small Bilingual Corpora

» Using cognates to align sentences in bilingual corpora

» Alignment of Shared Forests for Bilingual Corpora

» Word Sense Acquisition from Bilingual Comparable Corpora

» EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	LREC
Authors	Gabor Recski, András Rung, Attila Zséder, András Kornai
