Thesaurus Extension Using Web Search Engines

Download ki.informatik.uni-mannheim.de

Maintaining and extending large thesauri is an important challenge facing digital libraries and IT businesses alike. In this paper we describe a method that builds on and extends existing methods from the areas of thesaurus maintenance, natural language processing, and machine learning to (a) extract a set of novel candidate concepts from text corpora and (b) generate a small ranked list of suggestions for the position of these concepts in an existing thesaurus. Based on a modification of the standard tf-idf term weighting, we extract relevant concept candidates from a document corpus. We then apply a pattern-based machine learning approach to content extracted from web search engine snippets to determine the type of relation between the candidate terms and existing thesaurus concepts. The approach is evaluated in a large-scale experiment using the MeSH and WordNet thesauri as testbeds.
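
As a rough illustration of the candidate-extraction step, the sketch below ranks corpus terms that are not yet in a thesaurus by standard tf-idf. It does not reproduce the modified weighting used in the paper, which the abstract does not specify. The function name rank_candidates, the known_terms parameter, and the whitespace tokenization are assumptions made for this example only.

```python
import math
from collections import Counter


def rank_candidates(docs, known_terms, top_k=5):
    """Rank terms absent from the thesaurus by their best tf-idf score.

    Hypothetical helper: plain tf-idf, not the paper's modified weighting.
    """
    # Naive tokenization (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = {}
    for tokens in tokenized:
        if not tokens:
            continue
        tf = Counter(tokens)
        for term, count in tf.items():
            if term in known_terms:
                continue  # already a thesaurus concept, not a candidate
            idf = math.log(n_docs / df[term])
            score = (count / len(tokens)) * idf
            # Keep the highest score the term reaches in any document.
            scores[term] = max(scores.get(term, 0.0), score)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    corpus = [
        "aspirin treats headache",
        "ibuprofen treats pain",
        "aspirin aspirin relieves pain",
    ]
    for term, score in rank_candidates(corpus, {"pain", "headache"}):
        print(f"{term}\t{score:.3f}")
```

In a fuller pipeline, the top-ranked candidates would then be paired with existing thesaurus concepts and passed to the relation-classification step described in the abstract.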

Robert Meusel, Mathias Niepert, Kai Eckert, Heiner Stuckenschmidt

Education | ICADL 2010 | Machine Learning | Thesauri | Thesaurus Maintenance |

» Building a web thesaurus from web link structure

» Web Text Corpus for Natural Language Processing

» ProThes: thesaurus-based metasearch engine for a specific application domain

» Searching the Web From Keywords to Semantic Queries

» Query Recommendation Using Large-Scale Web Access Logs and Web Page Archive

» Using web search engines to improve text recognition

» An Efficient Method for Tagging a Query with Category Labels Using Wikipedia towards Enhan...

» Measuring Semantic Similarity between Named Entities by Searching the Web Directory

Post Info

Added	19 Jul 2010
Updated	19 Jul 2010
Type	Conference
Year	2010
Where	ICADL
Authors	Robert Meusel, Mathias Niepert, Kai Eckert, Heiner Stuckenschmidt
