Sciweavers

SIGMOD
2010
ACM

I4E: interactive investigation of iterative information extraction

Information extraction systems are increasingly used to mine structured information from unstructured text documents. A commonly used unsupervised technique is to build iterative information extraction (IIE) systems that learn task-specific rules, called patterns, to generate the desired tuples. Often, the output of an information extraction system contains unexpected results, which may be due to an incorrect pattern, an incorrect tuple, or both. In such scenarios, users and developers of the extraction system could benefit greatly from an investigation tool that helps them quickly reason about and repair the output. In this paper, we develop an approach for interactive post-extraction investigation for IIE systems. We formalize three important phases of this investigation: explaining the IIE result, diagnosing the influential and problematic components, and repairing the output of the information extraction system. We show how to characterize the execution of an IIE syste...
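To make the pattern/tuple loop described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's system: the function name `run_iie`, the "text between two values" notion of a pattern, and the sample sentences are all assumptions made for illustration. It alternates between learning patterns from known tuples and applying patterns to find new tuples, and it records which pattern produced each tuple and which tuple produced each pattern, which is the kind of information an explanation of an unexpected result would need.

```python
import re
from collections import defaultdict

def run_iie(sentences, seed_tuples, iterations=2):
    """Toy iterative extraction loop (illustrative only, not the paper's algorithm)."""
    tuples = set(seed_tuples)
    patterns = set()
    tuple_sources = defaultdict(set)    # tuple   -> patterns that extracted it
    pattern_sources = defaultdict(set)  # pattern -> tuples it was learned from
    for _ in range(iterations):
        # Step 1: learn patterns. Here a "pattern" is simply the text found
        # between the two values of a known tuple (a strong simplification).
        for a, b in list(tuples):
            for s in sentences:
                m = re.search(re.escape(a) + r"(.+?)" + re.escape(b), s)
                if m:
                    patterns.add(m.group(1))
                    pattern_sources[m.group(1)].add((a, b))
        # Step 2: apply every known pattern to extract (possibly new) tuples.
        for p in list(patterns):
            regex = re.compile(r"(\w+)" + re.escape(p) + r"(\w+)")
            for s in sentences:
                for a, b in regex.findall(s):
                    tuples.add((a, b))
                    tuple_sources[(a, b)].add(p)
    return tuples, patterns, tuple_sources, pattern_sources

# Hypothetical input: the target relation is (capital, country).
sentences = [
    "Paris is the capital of France",
    "Rome is the capital of Italy",
    "Rome , a city in Italy",
    "Milan , a city in Italy",
]
tuples, patterns, tuple_sources, pattern_sources = run_iie(
    sentences, {("Paris", "France")}
)

# "Explain" the unexpected tuple by walking its provenance one step back.
bad = ("Milan", "Italy")
for p in tuple_sources[bad]:
    print(bad, "came from pattern", repr(p), "learned from", pattern_sources[p])
```

In this toy run, the correct tuple (Rome, Italy) yields the overly general pattern " , a city in ", which in turn extracts the incorrect tuple (Milan, Italy). Tracing that chain backward corresponds loosely to the explain and diagnose phases named in the abstract; removing the pattern and everything derived from it would be one simple form of repair.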
Anish Das Sarma, Alpa Jain, Divesh Srivastava
Added 06 Dec 2010
Updated 06 Dec 2010
Type Conference
Year 2010
Where SIGMOD
Authors Anish Das Sarma, Alpa Jain, Divesh Srivastava