
CICLING
2005
Springer


Incremental Information Extraction Using Tree-Based Context Representations



Abstract. The purpose of information extraction (IE) is to find desired pieces of information in natural language texts and store them in a form that is suitable for automatic processing. Providing annotated training data to adapt a trainable IE system to a new domain requires a considerable amount of work. To address this, we explore incremental learning. Here training documents are annotated sequentially by a user and immediately incorporated into the extraction model. Thus the system can support the user by proposing extractions based on the current extraction model, reducing the workload of the user over time. We introduce an approach to modeling IE as a token classification task that allows incremental training. To provide sufficient information to the token classifiers, we use rich, tree-based context representations of each token as feature vectors. These representations make use of the heuristically deduced document structure in addition to linguistic and semantic informatio...
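
The incremental loop described in the abstract (propose extractions, let the user correct them, update the model immediately) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the class `IncrementalTokenClassifier`, the helper `context_features`, the label names, and the choice of a simple multiclass perceptron as the online learner. In particular, the flat window features below are only a stand-in for the tree-based context representations the abstract mentions.

```python
def context_features(tokens, i, window=2):
    """Hypothetical flat context: the token itself plus neighbours within a window.

    This is a simplification; the abstract describes richer, tree-based contexts.
    """
    feats = {f"tok={tokens[i].lower()}", f"cap={tokens[i][:1].isupper()}"}
    for d in range(1, window + 1):
        left = tokens[i - d].lower() if i - d >= 0 else "<s>"
        right = tokens[i + d].lower() if i + d < len(tokens) else "</s>"
        feats.add(f"l{d}={left}")
        feats.add(f"r{d}={right}")
    return feats


class IncrementalTokenClassifier:
    """Multiclass perceptron over token features (an assumed learner, for illustration)."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = {label: {} for label in self.labels}

    def _predict(self, feats):
        # Score each label by summing the weights of the active features.
        def score(label):
            w = self.weights[label]
            return sum(w.get(f, 0.0) for f in feats)
        # On ties, max() returns the first label in self.labels.
        return max(self.labels, key=score)

    def propose(self, tokens):
        """Suggest a label per token using the current model."""
        return [self._predict(context_features(tokens, i)) for i in range(len(tokens))]

    def train(self, tokens, gold):
        """Incorporate one annotated document; returns the number of corrected mistakes."""
        mistakes = 0
        for i, y in enumerate(gold):
            feats = context_features(tokens, i)
            pred = self._predict(feats)
            if pred != y:
                mistakes += 1
                for f in feats:
                    self.weights[y][f] = self.weights[y].get(f, 0.0) + 1.0
                    self.weights[pred][f] = self.weights[pred].get(f, 0.0) - 1.0
        return mistakes


if __name__ == "__main__":
    clf = IncrementalTokenClassifier(["O", "SPEAKER"])
    doc = "Speaker : Alice Smith".split()
    print(clf.propose(doc))                          # proposals before any training
    clf.train(doc, ["O", "O", "SPEAKER", "SPEAKER"])  # user-corrected annotation
    print(clf.propose("Speaker : Bob Jones".split()))
```

Because `train` updates weights one document at a time, each newly annotated document can influence the proposals for the next one without retraining from scratch, which is the property the abstract relies on to reduce the user's workload.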

Christian Siefkes


CICLING 2005 | Extraction Model | Feature Vectors | Natural Language Processing | Trainable IE


Related Content

» Incremental Ontology-Based Extraction and Alignment in Semistructured Documents

» Hybrid visual and conceptual image representation within active relevance feedback context

» Towards Querying of Traceability Information in the Context of Software Evolution

» Combining Text Vector Representations for Information Retrieval

» GridEnabling the Global Geodynamics Project Automatic RDF Extraction from the ESML Data De...

» EITH A Unifying Representation for Database Schema and Application Code in Enterprise Kno...

» MetaFac community discovery via relational hypergraph factorization

» Bayesian Information Extraction Network

» Issues and Methodology for Template Design for Information Extraction

Post Info

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	CICLING
Authors	Christian Siefkes
