Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques f...
High dimensionality remains a significant challenge for document clustering. Recent approaches used frequent itemsets and closed frequent itemsets to reduce dimensionality, and to...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Feature Filtering is an approach that is widely used for dimensionality reduction in text categorization. In this approach feature scoring methods are used to evaluate features le...
Nayer M. Wanas, Dina A. Said, Nevin M. Darwish, Na...
Latent semantic analysis (LSA), as one of the most popular unsupervised dimension reduction tools, has a wide range of applications in text mining and information retrieval. The k...
Xi Chen, Yanjun Qi, Bing Bai, Qihang Lin, Jaime G....