Sciweavers

RANLP
2003

Roget's thesaurus and semantic similarity

Roget’s Thesaurus has not been sufficiently appreciated in Natural Language Processing. We show that Roget’s and WordNet are birds of a feather. In a few typical tests, we compare how the two resources help measure semantic similarity. One of the benchmarks is Miller and Charles’ list of 30 noun pairs to which human judges had assigned similarity measures. We correlate these measures with those computed by several NLP systems. The 30 pairs can be traced back to Rubenstein and Goodenough’s 65 pairs, which we have also studied. Our Roget’s-based system gets correlations of .878 for the smaller and .818 for the larger list of noun pairs; this is quite close to the .885 that Resnik obtained when he employed humans to replicate the Miller and Charles experiment. We further evaluate our measure by using Roget’s and WordNet to answer 80 TOEFL, 50 ESL and 300 Reader’s Digest questions: the correct synonym must be selected amongst a group of four words. Our system gets 78.75%, ...
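The two evaluations described in the abstract can be illustrated with a minimal sketch. It assumes that the correlation in question is Pearson's r and that a synonym question is answered by picking the candidate with the highest similarity to the target word; the abstract does not spell out either detail. The function names, the toy score table and all numbers below are hypothetical and do not come from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def pick_synonym(target, candidates, similarity):
    """Return the candidate that the similarity function scores highest."""
    return max(candidates, key=lambda word: similarity(target, word))


# Hypothetical similarity scores, invented for illustration only.
# A real system would derive them from Roget's Thesaurus or WordNet.
TOY_SCORES = {
    ("big", "large"): 0.9,
    ("big", "green"): 0.1,
    ("big", "quick"): 0.2,
    ("big", "salty"): 0.05,
}


def toy_similarity(a, b):
    """Look up a score in the toy table; unknown pairs score 0."""
    return TOY_SCORES.get((a, b), 0.0)


# Invented human ratings and system scores for three word pairs.
human_ratings = [3.9, 1.2, 0.4]
system_scores = [16.0, 6.0, 2.0]

r = pearson(human_ratings, system_scores)
answer = pick_synonym("big", ["large", "green", "quick", "salty"], toy_similarity)
```

In this sketch, accuracy on a question set would be the fraction of questions for which `pick_synonym` returns the keyed answer.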
Mario Jarmasz, Stan Szpakowicz
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2003
Where RANLP
Authors Mario Jarmasz, Stan Szpakowicz