ACL
2004

Relieving the Data Acquisition Bottleneck in Word Sense Disambiguation

Supervised learning methods for WSD yield better performance than unsupervised methods. Yet the availability of clean training data for the former is still a severe challenge. In this paper, we present an unsupervised bootstrapping approach for WSD which exploits huge amounts of automatically generated noisy data for training within a supervised learning framework. The method is evaluated using the 29 nouns in the English Lexical Sample task of SENSEVAL2. Our algorithm does as well as supervised algorithms on 31% of this test set, which is an improvement of 11% (absolute) over state-of-the-art bootstrapping WSD algorithms. We identify seven different factors that impact the performance of our system.
Mona T. Diab
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where ACL
Authors Mona T. Diab