Sciweavers

PVLDB
2010

Towards The Web of Concepts: Extracting Concepts from Large Datasets

Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form the backbone of the next generation of search technology, we develop a novel technique to extract concepts from large datasets. We approach the problem of concept extraction from corpora as a market-baskets problem [2], adapting statistical measures of support and confidence. We evaluate our concept extraction algorithm on datasets containing data from a large number of users (e.g., the AOL query log data set [11]), and we show that a high-precision concept set can be extracted.
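The support-and-confidence idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, by analogy with association-rule mining, that the support of a word sequence is the number of queries containing it and that its confidence is its support divided by the support of its prefix (all words but the last). These definitions, the thresholds, the function names, and the toy queries are assumptions made for illustration; the paper's own measures and algorithm may differ.

```python
from collections import Counter


def ngrams(words, n):
    """Return all contiguous n-word sequences in a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def support_counts(queries, max_n=3):
    """Count, for each n-gram of up to max_n words, how many queries contain it."""
    counts = Counter()
    for query in queries:
        words = query.lower().split()
        seen = set()
        for n in range(1, max_n + 1):
            seen.update(ngrams(words, n))
        counts.update(seen)  # a query counts once per distinct n-gram
    return counts


def confidence(counts, gram):
    """Assumed measure: support of the n-gram over the support of its prefix."""
    if len(gram) < 2:
        raise ValueError("confidence needs an n-gram of at least two words")
    return counts[gram] / counts[gram[:-1]]


def extract_candidates(queries, min_support=2, min_confidence=0.5, max_n=3):
    """Keep multi-word n-grams that pass both (hypothetical) thresholds."""
    counts = support_counts(queries, max_n)
    return sorted(
        gram
        for gram, support in counts.items()
        if len(gram) >= 2
        and support >= min_support
        and confidence(counts, gram) >= min_confidence
    )


# Hypothetical toy query log, not taken from the AOL data set.
queries = [
    "new york hotels",
    "new york times",
    "new york weather",
    "new car prices",
]
candidates = extract_candidates(queries)
print(candidates)  # [('new', 'york')]
```

Here "new york" appears in 3 of the 4 queries containing "new", so its support is 3 and its confidence is 0.75, while every other multi-word sequence appears only once and falls below the support threshold.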
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where PVLDB
Authors Aditya G. Parameswaran, Hector Garcia-Molina, Anand Rajaraman