We consider the problem of clustering data lying on multiple subspaces of unknown and possibly different dimensions. We show that one can represent the subspaces with a set of pol...
The Visual Thesaurus is a new query approach for cases where no starting image is available. It is a concise representation of all similar regions in a panel of visual patches; the user arra...
Clustering identifies densely populated subgroups in data, while correlation analysis finds dependencies between the attributes of the data set. In this paper, we combin...
High-performance document clustering systems enable similar documents to automatically self-organize into groups. In the past, the large amount of computational time needed to clu...
G. Adam Covington, Charles L. G. Comstock, Andrew ...
High dimensionality remains a significant challenge for document clustering. Recent approaches used frequent itemsets and closed frequent itemsets to reduce dimensionality, and to...