Textractor: A Framework for Extracting Relevant Domain Concepts from Irregular Corporate Textual Datasets

Download www.let.rug.nl

Various information extraction (IE) systems for corporate usage exist. However, none of them targets the product development and/or customer service domain, despite its significant application potential and benefits. This domain also poses new scientific challenges, such as the lack of external knowledge resources and irregularities like ungrammatical constructs in textual data, which compromise successful information extraction. To address these issues, we describe the development of Textractor, an application for accurately extracting relevant concepts from irregular textual narratives in datasets of product development and/or customer service organizations. The extracted information can subsequently be fed to a host of business intelligence activities. We present novel algorithms, combining both statistical and linguistic approaches, for the accurate discovery of relevant domain concepts from highly irregular/ungrammatical texts. Evaluations on real-life corporate data revealed that Te...

Ashwin Ittoo, Laura Maruster, Hans Wortmann, Gosse Bouma

BIS 2010 | Business | Development And/or Customer | Domain Concepts | Information Extraction

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	BIS
Authors	Ashwin Ittoo, Laura Maruster, Hans Wortmann, Gosse Bouma
