Discriminative Induction of Sub-Tree Alignment using Limited Labeled Data

We employ a Maximum Entropy model to conduct sub-tree alignment between bilingual phrasal structure trees. Various kinds of lexical and structural knowledge are explored to measure the syntactic similarity across Chinese-English bilingual tree pairs. In the experiments, we evaluate sub-tree alignment using both a gold-standard treebank and an automatically parsed corpus with manually annotated sub-tree alignments. Compared with a heuristic similarity-based method, the proposed method significantly improves performance using only limited sub-tree-aligned data. To examine its effectiveness for multilingual applications, we further try different approaches to applying sub-tree alignment in both phrase-based and syntax-based SMT systems. We then compare the performance with that of the widely used word alignment. Experimental results on benchmark data show that sub-tree alignment benefits both systems by relaxing the constraint of word alignment.

Jun Sun, Min Zhang, Chew Lim Tan

COLING 2010 | Computational Linguistics | Sub-tree Alignment | Used Word Alignment | Word Alignment |

Added	13 May 2011
Updated	13 May 2011
Type	Journal
Year	2010
Where	COLING
Authors	Jun Sun, Min Zhang, Chew Lim Tan

Sciweavers
