Regular Expression Learning for Information Extraction

Regular expressions have served as the dominant workhorse of practical information extraction for several years. However, there has been little work on reducing the manual effort involved in building high-quality, complex regular expressions for information extraction tasks. In this paper, we propose ReLIE, a novel transformation-based algorithm for learning such complex regular expressions. We evaluate the performance of our algorithm on multiple datasets and compare it against the CRF algorithm. We show that ReLIE, in addition to being an order of magnitude faster, outperforms CRF under conditions of limited training data and cross-domain data. Finally, we show how the accuracy of CRF can be improved by using features extracted by ReLIE.
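To make the idea of transformation-based regex learning concrete, here is a minimal Python sketch. It is not the ReLIE implementation: the abstract only says the algorithm is transformation-based, so the greedy search loop, the F1 objective, the whole-string matching, the `REWRITES` table, and the toy phone-number examples are all assumptions chosen for illustration. The sketch starts from a loose seed regex, tries one token-level rewrite at a time, and keeps a rewrite only while it improves F1 on labeled examples.

```python
import re

# Hypothetical single-token rewrites (not taken from the paper): each one
# narrows a character class or a quantifier.
REWRITES = [
    (r"\w+", r"\d+"),    # restrict word characters to digits
    (r"\d+", r"\d{3}"),  # restrict an open-ended run to exactly 3 digits
    (r"\d+", r"\d{4}"),  # restrict an open-ended run to exactly 4 digits
    (r"\W", "-"),        # restrict any non-word character to a hyphen
]

# Toy labeled data (invented for this sketch).
SEED = [r"\w+", r"\W", r"\w+"]
POSITIVES = ["555-1234", "867-5309"]
NEGATIVES = ["foo-bar", "12-345678", "a1-b2"]


def f1(pattern, positives, negatives):
    """F1 of a regex used as a whole-string classifier on labeled examples."""
    rx = re.compile(pattern)
    tp = sum(1 for s in positives if rx.fullmatch(s))
    fp = sum(1 for s in negatives if rx.fullmatch(s))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / len(positives)
    return 2 * precision * recall / (precision + recall)


def candidates(tokens):
    """Yield every token list reachable by applying one rewrite to one token."""
    for i, tok in enumerate(tokens):
        for old, new in REWRITES:
            if tok == old:
                yield tokens[:i] + [new] + tokens[i + 1:]


def learn(seed, positives, negatives):
    """Greedy hill climbing: apply the best rewrite until F1 stops improving."""
    tokens = list(seed)
    best = f1("".join(tokens), positives, negatives)
    while True:
        scored = [(f1("".join(c), positives, negatives), c)
                  for c in candidates(tokens)]
        if not scored:
            break
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no single rewrite improves the score
        tokens, best = top, top_score
    return "".join(tokens), best


if __name__ == "__main__":
    pattern, score = learn(SEED, POSITIVES, NEGATIVES)
    print(pattern, score)
```

On this toy data the search narrows the seed `\w+\W\w+` to `\d{3}\W\w+`, which accepts both positive examples and rejects all three negatives. A realistic learner would score extracted spans inside documents rather than whole strings and would draw on a much richer set of transformations.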

Yunyao Li, Rajasekar Krishnamurthy, Sriram Raghavan, Shivakumar Vaithyanathan, H. V. Jagadish

Complex Regular Expressions | EMNLP 2008 | Information Extraction | Natural Language Processing | Regular Expressions |

» Classifying Sentences Using Induced Structure

» The SystemT IDE an integrated development environment for information extraction rules

» A Framework for Frequent Sequence Mining under Generalized Regular Expression Constraints

» Output Regularized Metric Learning with Side Information

» Semisupervised multitask learning of structured prediction models for web information extr...

» Wiki Vandalysis Wikipedia Vandalism Analysis Lab Report for PAN at CLEF 2010

» Distributed Information Regularization on Graphs

» Reggae Automated Test Generation for Programs Using Complex Regular Expressions

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	EMNLP
Authors	Yunyao Li, Rajasekar Krishnamurthy, Sriram Raghavan, Shivakumar Vaithyanathan, H. V. Jagadish
