Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study

A central problem of word sense disambiguation (WSD) is the lack of manually sense-tagged data required for supervised learning. In this paper, we evaluate an approach that automatically acquires sense-tagged training data from English-Chinese parallel corpora and uses the data to disambiguate the nouns in the SENSEVAL-2 English lexical sample task. Our investigation reveals that this method of acquiring sense-tagged data is promising. On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between training on manually sense-tagged data and training on the automatically acquired data is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage. Our analysis also highlights the importance of domain dependence in evaluating WSD programs.
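
The abstract does not spell out the acquisition procedure. One way sense-tagged examples might be harvested from word-aligned parallel text is sketched below, assuming each sense of an English noun has been paired in advance with Chinese translations that signal it. The `SENSE_TRANSLATIONS` table, the sense labels, the alignment format, and the function name are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical sense inventory (illustrative values only): for each English
# noun, map a Chinese translation to the sense label it is assumed to signal.
SENSE_TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "channel": {
        "频道": "channel%tv",
        "海峡": "channel%water",
        "渠道": "channel%means",
    },
}

# One aligned sentence pair: English tokens, Chinese tokens, and word
# alignment links given as (english_index, chinese_index) pairs.
AlignedPair = Tuple[List[str], List[str], List[Tuple[int, int]]]


def tag_occurrences(
    target: str, pairs: List[AlignedPair]
) -> List[Tuple[List[str], int, str]]:
    """Return (english_tokens, target_position, sense) for every occurrence
    of `target` whose aligned Chinese words point to exactly one sense."""
    translations = SENSE_TRANSLATIONS.get(target, {})
    examples = []
    for en_tokens, zh_tokens, links in pairs:
        for en_idx, token in enumerate(en_tokens):
            if token.lower() != target:
                continue
            # Collect the senses signalled by the Chinese words aligned
            # to this occurrence of the target noun.
            senses = {
                translations[zh_tokens[zh_idx]]
                for e_idx, zh_idx in links
                if e_idx == en_idx and zh_tokens[zh_idx] in translations
            }
            # Keep only unambiguous occurrences; skip unknown or
            # conflicting translations rather than guess a sense.
            if len(senses) == 1:
                examples.append((en_tokens, en_idx, senses.pop()))
    return examples
```

Occurrences aligned to a translation outside the table are dropped in this sketch, so the harvested data can cover fewer senses than a manually tagged set, which relates to the sense coverage advantage the abstract mentions.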

Hwee Tou Ng, Bin Wang, Yee Seng Chan

ACL 2003 | ACL 2007 | English-Chinese Parallel Corpora | Sense-tagged Data | Word Sense Disambiguation

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	ACL
Authors	Hwee Tou Ng, Bin Wang, Yee Seng Chan
