Selecting Labels for News Document Clusters

This work deals with determining meaningful and terse cluster labels for news document clusters. We analyze a number of alternatives for selecting headlines and/or sentences of documents in a document cluster (obtained as a result of an entity-event-duration query), and formalize an approach to extracting a short phrase from well-supported headlines/sentences of the cluster that can serve as the cluster label. Our technique maps a sentence into a set of significant stems to approximate its semantics for comparison. Finally, a cluster label is extracted from a selected headline/sentence as a contiguous sequence of words, restoring the word-order information lost in the formalization of semantic equivalence.
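The pipeline outlined in the abstract (stem sets, support among headlines, contiguous label extraction) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' formalization: the stopword list, the crude suffix stripper `stem`, the Jaccard `threshold` used to decide when one headline supports another, and the `min_share` cutoff for "core" stems are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "for",
             "over", "as", "and", "by", "with", "at"}

def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def significant_stems(sentence):
    """Map a sentence to its set of non-stopword stems."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return frozenset(stem(w) for w in words if w not in STOPWORDS)

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_headline(headlines, threshold=0.3):
    """Pick the headline supported by the most other headlines.

    Assumption: 'support' means Jaccard overlap of stem sets >= threshold.
    Ties go to the earliest headline.
    """
    stems = [significant_stems(h) for h in headlines]
    def support(i):
        return sum(1 for j in range(len(stems))
                   if j != i and jaccard(stems[i], stems[j]) >= threshold)
    return headlines[max(range(len(headlines)), key=support)]

def extract_label(headline, headlines, min_share=0.6):
    """Return the shortest contiguous word span of `headline` that covers
    every stem occurring in at least `min_share` of the headlines."""
    counts = Counter(s for h in headlines for s in significant_stems(h))
    core = {s for s, c in counts.items() if c / len(headlines) >= min_share}
    words = headline.split()
    word_stems = [stem(re.sub(r"[^a-z]", "", w.lower())) for w in words]
    target = core & set(word_stems)
    if not target:
        return headline
    best = (0, len(words))
    for start in range(len(words)):
        seen = set()
        for end in range(start, len(words)):
            if word_stems[end] in target:
                seen.add(word_stems[end])
            if seen == target:
                if end + 1 - start < best[1] - best[0]:
                    best = (start, end + 1)
                break
    return " ".join(words[best[0]:best[1]])

# Made-up headlines, purely for illustration.
headlines = [
    "Officials say Toyota recalls cars over brake fault",
    "Toyota recalls cars worldwide",
    "Brake fault forces Toyota to recall cars",
    "Stocks fall as markets react",
]
chosen = select_headline(headlines)
print(extract_label(chosen, headlines))  # Toyota recalls cars
```

Comparing stem sets discards word order, which is why the label is read back off the chosen headline as a contiguous span rather than assembled from the stems themselves.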
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where NLDB
Authors Krishnaprasad Thirunarayan, Trivikram Immaneni, Mastan Vali Shaik