EMNLP
2011

Relaxed Cross-lingual Projection of Constituent Syntax

We propose a relaxed correspondence assumption for cross-lingual projection of constituent syntax, which allows a supposed constituent of the target sentence to correspond to an unrestricted treelet in the source parse. This relaxed assumption tolerates the syntactic non-isomorphism between languages and lets us learn the syntactic idiosyncrasies specific to the target language, rather than a strained grammar projected directly from the source-language syntax. Based on this assumption, we also propose a novel constituency projection method that induces a projected constituent treebank from a source-parsed bilingual corpus. Experiments show that a parser trained on the projected treebank dramatically outperforms previous projected and unsupervised parsers.
Wenbin Jiang, Qun Liu, Yajuan Lv
Added 20 Dec 2011
Updated 20 Dec 2011
Type Journal
Year 2011
Where EMNLP
Authors Wenbin Jiang, Qun Liu, Yajuan Lv