Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this pa...
Yunhua Hu, Ya-nan Qian, Hang Li, Daxin Jiang, Jian...
We propose and test an objective criterion for evaluation of clustering performance: How well does a clustering algorithm run on unlabeled data aid a classification algorithm? The...
Documents and authors can be clustered into “knowledge communities” based on the overlap in the papers they cite. We introduce a new clustering algorithm, Streemer, which fin...
Vasileios Kandylas, S. Phineas Upham, Lyle H. Unga...
The co-association (CA) matrix was previously introduced to combine multiple partitions. In this paper, we analyze the CA matrix, and address its difference from the similarity ma...
Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large sets of documents into a small number of meaningful ...