
NAACL
2003

Unsupervised methods for developing taxonomies by combining syntactic and statistical information

This paper describes an unsupervised algorithm for placing unknown words into a taxonomy and evaluates its accuracy on a large and varied sample of words. The algorithm works by first using a large corpus to find semantic neighbors of the unknown word, which we accomplish by combining latent semantic analysis with part-of-speech information. We then place the unknown word in the part of the taxonomy where these neighbors are most concentrated, using a class-labelling algorithm developed especially for this task. This method is used to reconstruct parts of the existing WordNet database, obtaining results for common nouns, proper nouns and verbs. We evaluate the contribution made by part-of-speech tagging and show that automatic filtering using the class-labelling algorithm gives a fourfold improvement in accuracy.
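The two steps in the abstract (find semantic neighbors of the same part of speech, then pick the taxonomy node where those neighbors are most concentrated) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy taxonomy `PARENT`, the hand-made vectors in `VECTORS` (stand-ins for reduced latent semantic analysis vectors), the function names, and the scoring rule (reward 1/distance^2 for each neighbor below a candidate class, a fixed penalty otherwise) are all hypothetical choices made for this example.

```python
import math

# Hypothetical toy taxonomy: each word maps to its parent (hypernym).
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "horse": "animal",
    "oak": "plant",
}

# Hypothetical vectors keyed by (word, part-of-speech tag); stand-ins for
# the reduced vectors a latent semantic analysis step would produce.
VECTORS = {
    ("dog", "n"): [0.9, 0.1, 0.0],
    ("cat", "n"): [0.8, 0.2, 0.1],
    ("horse", "n"): [0.7, 0.1, 0.3],
    ("oak", "n"): [0.1, 0.9, 0.2],
    ("run", "v"): [0.85, 0.1, 0.05],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(target_vec, pos, k):
    """Return the k words with the given POS tag most similar to target_vec."""
    scored = [
        (cosine(target_vec, vec), word)
        for (word, tag), vec in VECTORS.items()
        if tag == pos  # part-of-speech filter: ignore words with other tags
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]


def ancestors(word):
    """Map the word and each of its hypernyms to their distance from the word."""
    chain, dist = {}, 0
    while word is not None:
        chain[word] = dist
        word = PARENT.get(word)
        dist += 1
    return chain


def best_class(neighbors, penalty=0.25):
    """Pick the hypernym under which the neighbors are most concentrated.

    Assumed scoring rule (illustrative only): a neighbor at distance d > 0
    below a candidate adds 1 / d**2; any other neighbor subtracts `penalty`.
    """
    chains = {n: ancestors(n) for n in neighbors}
    candidates = {h for chain in chains.values() for h, d in chain.items() if d > 0}

    def score(h):
        total = 0.0
        for n in neighbors:
            d = chains[n].get(h)
            total += 1.0 / d**2 if d else -penalty
        return total

    # Sorting makes tie-breaking deterministic.
    return max(sorted(candidates), key=score)


unknown_vec = [0.85, 0.15, 0.1]  # hypothetical vector for the unknown word
neighbors = nearest_neighbors(unknown_vec, pos="n", k=3)
label = best_class(neighbors)
```

With these toy inputs the three noun neighbors are "cat", "dog" and "horse", and the chosen class is "animal". The general class "entity" also covers all three but scores lower because it is further from each neighbor, which is the trade-off between coverage and specificity that a class-labelling step has to make.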
Dominic Widdows
Added 31 Oct 2010
Updated 31 Oct 2010
Type: Conference
Year: 2003
Where: NAACL
Authors: Dominic Widdows