Data extraction from the web using wild card queries

This paper presents an overview of our framework for searching and retrieving facts and relationships within natural language text sources. In this framework, an extraction task over a text collection is expressed as a query that combines text fragments with wild cards, and the query result is a set of facts in the form of unary, binary and general n-ary tuples. Despite being both simple and declarative, the framework can be applied to a wide range of extraction tasks. We report some of our work on expanding queries and ranking the results. We also report some of our experiments and evaluations of the proposed querying framework.
Categories and Subject Descriptors H.3.3 [Information Systems]: Information Search and Retrieval; H.5.2 [Information Systems]: User Interfaces
General Terms Algorithms, Experimentation, Measurement
Keywords DeWild, Data Extraction, Web Search, Ranking
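To make the idea of a wild card query concrete, here is a minimal Python sketch. It is not the authors' DeWild system: the `*` symbol, the function names `compile_query` and `extract`, and the rule that each wild card matches exactly one word are all assumptions made for illustration. It shows how a query mixing text fragments with wild cards could yield unary or binary tuples from plain sentences.

```python
import re

def compile_query(query):
    """Turn a query such as '* is the capital of *' into a regex.

    Assumption: '*' is the wild card symbol and matches a single word.
    """
    fragments = [f.strip() for f in query.split("*")]
    slot = r"(\w+)"  # one captured word per wild card
    pieces = []
    for i, frag in enumerate(fragments):
        if frag:
            pieces.append(re.escape(frag))  # literal text fragment
        if i < len(fragments) - 1:
            pieces.append(slot)  # a wild card sat between fragments
    return re.compile(r"\b" + r"\s+".join(pieces) + r"\b", re.IGNORECASE)

def extract(query, texts):
    """Return one tuple per match; tuple arity equals the number of wild cards."""
    pattern = compile_query(query)
    return [m.groups() for text in texts for m in pattern.finditer(text)]

texts = [
    "Ottawa is the capital of Canada.",
    "Paris is the capital of France and Rome is the capital of Italy.",
    "We visited cities such as Edmonton last year.",
]
print(extract("* is the capital of *", texts))  # binary tuples
print(extract("cities such as *", texts))       # unary tuples
```

A sketch like this does no query expansion or ranking, both of which the abstract names as part of the actual framework.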
Davood Rafiei, Haobin Li
Added 24 Jul 2010
Updated 24 Jul 2010
Type Conference
Year 2009
Where CIKM