COLING
2010

Discriminative Induction of Sub-Tree Alignment using Limited Labeled Data

We employ a Maximum Entropy model to conduct sub-tree alignment between bilingual phrasal structure trees. Various kinds of lexical and structural knowledge are explored to measure the syntactic similarity across Chinese-English bilingual tree pairs. In the experiments, we evaluate the sub-tree alignment using both a gold-standard treebank and an automatically parsed corpus with manually annotated sub-tree alignments. Compared with a heuristic similarity-based method, the proposed method significantly improves performance using only a limited amount of sub-tree-aligned data. To examine its effectiveness for multilingual applications, we further explore different approaches to applying sub-tree alignment in both phrase-based and syntax-based SMT systems. We then compare the performance with that of the widely used word alignment. Experimental results on benchmark data show that sub-tree alignment benefits both systems by relaxing the constraints of word alignment.
Jun Sun, Min Zhang, Chew Lim Tan
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2010
Where COLING
Authors Jun Sun, Min Zhang, Chew Lim Tan