Phrases have been considered more informative feature terms for improving the effectiveness of document clustering. In this paper, we propose a phrase-based document similarity t...
Correlation Clustering was defined by Bansal, Blum, and Chawla as the problem of clustering a set of elements based on a possibly inconsistent binary similarity function between e...
To analyze the linear correlations of numeric attributes of government data, this paper proposes a method based on the clustering algorithm. A clustering method is adopted to prun...
We study the problem of automatically identifying “hotspots” on the real-time web. Concretely, we propose to identify highly dynamic ad-hoc collections of users – what we ref...
The manipulation of large-scale document data sets often involves processing a wealth of features that correspond to the available terms in the document space. The employm...