Sciweavers

AIM
2005

Semantic Integration in Text: From Ambiguous Names to Identifiable Entities

Intelligent access to information requires semantic integration of structured databases with unstructured textual resources. While the semantic integration problem has been widely studied in the database domain on structured data, it has been neither fully recognized nor studied on unstructured or semi-structured textual resources. This paper presents a first step towards this goal by studying semantic integration in natural language texts -- identifying whether different mentions of real-world entities, within and across documents, actually represent the same concept. We present a machine learning study of this problem with two approaches. The first is discriminative -- a pairwise local classifier is trained in a supervised way to determine whether two given mentions represent the same real-world entity. This is followed, potentially, by a global clustering algorithm that uses the classifier as its similarity metric. Our second approach is a global generative model, at the heart of which...
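The first approach in the abstract (a pairwise decision followed by clustering that reuses it as a similarity metric) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the token-overlap score standing in for the trained classifier, the 0.5 threshold, and the single-link clustering rule are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_score(a: str, b: str) -> float:
    """Hypothetical stand-in for a trained pairwise classifier.

    Returns the Jaccard overlap of the lowercased name tokens.
    A real system would return a learned probability instead.
    """
    ta = set(a.lower().replace(".", " ").split())
    tb = set(b.lower().replace(".", " ").split())
    union = ta | tb
    if not union:
        return 0.0
    return len(ta & tb) / len(union)


def cluster_mentions(mentions, score=pairwise_score, threshold=0.5):
    """Single-link clustering (an assumed choice) over pairwise scores.

    Two mentions end up in the same cluster if they are connected by
    a chain of pairs whose score is at least `threshold`.
    """
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if score(mentions[i], mentions[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return list(clusters.values())


if __name__ == "__main__":
    names = ["John Kennedy", "Dan Roth", "Kennedy", "J. Kennedy", "Roth"]
    for group in cluster_mentions(names):
        print(group)
```

In this sketch, "John Kennedy" and "J. Kennedy" score below the threshold against each other but are still grouped because both link to "Kennedy". That is one way a global clustering step can go beyond purely local pairwise decisions.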
Xin Li, Paul Morie, Dan Roth
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2005
Where AIM
Authors Xin Li, Paul Morie, Dan Roth