Exploring a Few Good Tuples from Text Databases

Information extraction from text databases is a useful paradigm to populate relational tables and unlock the considerable value hidden in plain-text documents. However, information extraction can be expensive, due to various complex text processing steps necessary in uncovering the hidden data. There are a large number of text databases available, and not every text database is necessarily relevant to every relation. Hence, it is important to be able to quickly explore the utility of running an extractor for a specific relation over a given text database before carrying out the expensive extraction task. In this paper, we present a novel exploration methodology of finding a few good tuples for a relation that can be extracted from a database which allows for judging the relevance of the database for the relation. Specifically, we propose the notion of a good(k, ℓ) query as one that can return any k tuples for a relation among the top-ℓ fraction of tuples ranked by their aggregated confid...
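To make the definition in the abstract concrete, here is a minimal Python sketch of the check it describes: whether a set of returned tuples contains at least k tuples from the top fraction of all tuples ranked by confidence. This is an illustration only, not the authors' algorithm. The names `top_fraction`, `is_good`, and `frac`, the sample tuples and scores, and the choice to round the cutoff up are all assumptions. It also assumes every tuple's score is already known, which is exactly the expensive extraction the paper sets out to avoid.

```python
import math


def top_fraction(scored_tuples, frac):
    """Return the set of tuples in the top `frac` fraction by score.

    `scored_tuples` is a list of (tuple_id, confidence) pairs.
    Assumption: the cutoff position is rounded up.
    """
    ranked = sorted(scored_tuples, key=lambda pair: pair[1], reverse=True)
    cutoff = math.ceil(frac * len(ranked))
    return {tuple_id for tuple_id, _ in ranked[:cutoff]}


def is_good(returned, scored_tuples, k, frac):
    """True if at least k of the returned tuples lie in the top fraction."""
    top = top_fraction(scored_tuples, frac)
    return len(set(returned) & top) >= k


# Hypothetical tuples with made-up confidence scores.
scored = [("t1", 0.9), ("t2", 0.8), ("t3", 0.4), ("t4", 0.2)]

print(is_good(["t1", "t2", "t4"], scored, k=2, frac=0.5))
print(is_good(["t1", "t3"], scored, k=2, frac=0.5))
```

With these sample scores the top half is t1 and t2, so the first call satisfies the condition for k = 2 and the second does not.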
Alpa Jain, Divesh Srivastava
Added 20 Oct 2009
Updated 20 Oct 2009
Type Conference
Year 2009
Where ICDE