Sciweavers

DGO
2006

Automatically labeling hierarchical clusters

Government agencies must often quickly organize and analyze large amounts of textual information, for example comments received as part of notice-and-comment rulemaking. Hierarchical organization is popular because it represents information at different levels of detail and is convenient for interactive browsing. Good hierarchical clustering algorithms are available, but there are few good solutions for automatically labeling the nodes in a cluster hierarchy. This paper presents a simple algorithm that automatically assigns labels to hierarchical clusters. The algorithm evaluates candidate labels using information from the cluster, the parent cluster, and corpus statistics. A trainable threshold enables the algorithm to assign just a few high-quality labels to each cluster. Experiments with Open Directory Project (ODP) hierarchies indicate that the algorithm creates cluster labels that are similar to labels created by ODP editors.
Categories and Subject Descriptors H.3.1 [Information ...
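The abstract names the inputs to the labeling step (the cluster, its parent cluster, and corpus statistics) and a threshold, but not the scoring function itself. The sketch below is only an illustration under stated assumptions, not the authors' method: it scores a term by how much more often it appears in the cluster's documents than in the parent's, weighted by a smoothed inverse document frequency, and keeps the terms that clear a fixed threshold. The function names, the scoring formula, the toy documents, and the threshold value are all hypothetical; in the paper the threshold is trained rather than hand-picked.

```python
import math
from collections import Counter


def score_candidates(cluster_docs, parent_docs, corpus_df, corpus_size):
    """Score each term in the cluster (hypothetical formula, not the paper's).

    cluster_docs / parent_docs: lists of tokenized documents (lists of str).
    corpus_df: mapping term -> number of corpus documents containing it.
    corpus_size: total number of documents in the corpus.
    """
    # Document frequency of each term inside the cluster and inside the parent.
    cluster_df = Counter(t for doc in cluster_docs for t in set(doc))
    parent_df = Counter(t for doc in parent_docs for t in set(doc))

    scores = {}
    for term, df in cluster_df.items():
        p_cluster = df / len(cluster_docs)
        p_parent = parent_df.get(term, 0) / len(parent_docs)
        # Smoothed IDF: terms that are rare in the corpus get a larger weight.
        idf = math.log((1 + corpus_size) / (1 + corpus_df.get(term, 0)))
        # Terms that are just as common in the parent score zero.
        scores[term] = (p_cluster - p_parent) * idf
    return scores


def select_labels(scores, threshold, max_labels=3):
    """Keep at most max_labels terms whose score reaches the threshold."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked if score >= threshold][:max_labels]


# Toy data (invented for illustration): a "tax" cluster under an "energy" parent.
cluster = [["tax", "credit", "energy"], ["tax", "deduction", "energy"]]
parent = cluster + [["energy", "solar", "panel"], ["energy", "wind", "turbine"]]
corpus_df = {"energy": 900, "tax": 50, "credit": 40, "deduction": 30}

scores = score_candidates(cluster, parent, corpus_df, corpus_size=1000)
labels = select_labels(scores, threshold=1.0)
print(labels)
```

In this toy example "energy" occurs in every document of both the cluster and its parent, so it scores zero and is rejected as a label for the child cluster. That is the intuition behind comparing a cluster against its parent: a good label should distinguish the cluster from its siblings, not restate the parent's topic.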
Pucktada Treeratpituk, Jamie Callan
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where DGO
Authors Pucktada Treeratpituk, Jamie Callan