CIKM
2006
Springer

Representing documents with named entities for story link detection (SLD)

Several information organization, access, and filtering systems can benefit from different kinds of document representations than those used in traditional Information Retrieval (IR). Topic Detection and Tracking (TDT) is an example of such an application. In this paper we demonstrate that named entities serve as better units for document representation than all words. To test this hypothesis, we study the effect of word-based and entity-based representations on Story Link Detection (SLD), a core task in TDT research. The experiments on TDT corpora show that entity-based representations give significant improvements for SLD. We also propose a mechanism to expand the set of named entities used for document representation, which enhances the performance in some cases. We then take a step further and analyze the limitations of using only named entities for the document representation. Our studies and experiments indicate that adding additional topical terms can help i...
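To make the idea of an entity-based representation concrete, here is a minimal sketch of a story link decision. It is not the authors' method: the abstract does not specify the weighting scheme, similarity measure, or threshold, so raw entity counts, cosine similarity, and a threshold of 0.2 are all assumptions chosen for illustration. The entity lists are assumed to come from some named entity recognizer that is not shown, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A story with no entities cannot be compared; treat it as unlinked.
        return 0.0
    return dot / (norm_a * norm_b)


def stories_linked(entities_a, entities_b, threshold=0.2):
    """Decide whether two stories are linked, given their named entities.

    entities_a, entities_b: lists of entity strings, assumed to be
    pre-extracted by a named entity recognizer (hypothetical input).
    threshold: illustrative cutoff, not a value taken from the paper.
    """
    # Represent each story as a bag of lowercased named entities.
    vec_a = Counter(e.lower() for e in entities_a)
    vec_b = Counter(e.lower() for e in entities_b)
    return cosine_similarity(vec_a, vec_b) >= threshold


story_1 = ["United Nations", "Iraq", "Kofi Annan"]
story_2 = ["Iraq", "Baghdad", "United Nations"]
story_3 = ["Wall Street", "Federal Reserve"]

print(stories_linked(story_1, story_2))  # two shared entities
print(stories_linked(story_1, story_3))  # no shared entities
```

A word-based baseline would build the same vectors from all tokens in each story instead of from the entity lists, which is the comparison the abstract describes.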
Chirag Shah, W. Bruce Croft, David Jensen
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CIKM