Sparse principal component analysis (PCA) adds extra constraints or penalty terms to standard PCA to achieve sparsity. In this paper, we first introduce an efficient algor...
Population-based real-life datasets often contain smaller clusters of unusual sub-populations. While these clusters, called 'hot spots', are small and sparse, they are usuall...
Clustering large data sets of high dimensionality has always been a serious challenge for clustering algorithms. Many recently developed clustering algorithms have attempted to ad...
A vast number of documents on the Web have duplicates, which poses a challenge for developing efficient methods that compute clusters of similar documents. In this paper we use ...
As large-scale databases become commonplace, there has been significant interest in mining them for commercial purposes. One of the basic tasks that underlies many of these mining ...