Open Information Extraction from the Web


Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-tag new training examples. This manual labor scales linearly with the number of target relations. This paper introduces Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input. The paper also introduces TEXTRUNNER, a fully implemented, highly scalable OIE system where the tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries. We report on experiments over a 9,000,000 Web page corpus that compare TEXTRUNNER with KNOWITALL, a state-of-the-art Web IE system. TEXTRUN...
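
The abstract describes tuples that carry a probability and are indexed for user queries. The sketch below is only a toy illustration of that idea, not TEXTRUNNER's actual implementation: the names `Extraction` and `TupleIndex`, the token-based inverted index, and the sample tuples and scores are all assumptions made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """A relational tuple (arg1, relation, arg2) with a confidence score."""
    arg1: str
    relation: str
    arg2: str
    probability: float  # hypothetical score in [0, 1]


class TupleIndex:
    """Toy inverted index mapping lowercase tokens to extractions."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, extraction):
        # Index every token of every field so any part of a tuple is searchable.
        for field in (extraction.arg1, extraction.relation, extraction.arg2):
            for token in field.lower().split():
                self._postings[token].add(extraction)

    def query(self, text, min_probability=0.0):
        tokens = text.lower().split()
        if not tokens:
            return []
        # Intersect postings: every query token must occur in the tuple.
        matches = set.intersection(
            *(self._postings.get(token, set()) for token in tokens)
        )
        kept = [e for e in matches if e.probability >= min_probability]
        # Most confident extractions first.
        return sorted(kept, key=lambda e: e.probability, reverse=True)


# Made-up tuples and scores, for illustration only.
index = TupleIndex()
index.add(Extraction("Edison", "invented", "the phonograph", 0.93))
index.add(Extraction("Edison", "was born in", "Ohio", 0.81))
index.add(Extraction("Tesla", "invented", "the phonograph", 0.12))

results = index.query("invented phonograph", min_probability=0.5)
```

Here the probability threshold filters out the low-confidence tuple, so the query returns only the extraction whose hypothetical score is at least 0.5.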

Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni


Artificial Intelligence | IJCAI 2007 | Information Extraction | Small Homogeneous Corpora | Target Relations


» A Novel WebOriented Writing Environment Using Objects Facts Acquired from the Web

» An analysis of open information extraction based on semantic role labeling

» Open Information Extraction Using Wikipedia

» Information Extraction from Tree Documents by Learning Subtree Delimiters

» DBpedia: A Nucleus for a Web of Open Data

» Extracting Sequences from the Web

» From information to knowledge: harvesting entities and relationships from web sources

» Facilitating situation assessment through gir with multiscale open source web documents

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2007
Where	IJCAI
Authors	Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni
