Abstract. Clustering high dimensional data with sparse features is challenging because pairwise distances between data items are not informative in high dimensional space. To addre...
Abstract. In this paper, we study the problem of projected outlier detection in high dimensional data streams and propose a new technique, called Stream Projected Outlier deTector ...
Ji Zhang, Qigang Gao, Hai H. Wang, Qing Liu, Kai X...
Abstract. Many applications of machine learning involve sparse high-dimensional data, where the number of input features is (much) larger than the number of data samples, d ≫ n. Predi...
The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are high dimen...
The k-means algorithm with cosine similarity, also known as the spherical k-means algorithm, is a popular method for clustering document collections. However, spherical k-means ca...