We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Tasks of data mining and information retrieval depend on a good distance function for measuring similarity between data instances. The most effective distance function must be for...
Frequent-pattern mining has been studied extensively on scalable methods for mining various kinds of patterns including itemsets, sequences, and graphs. However, the bottleneck of...
The search for frequent subgraphs is becoming increasingly important in many application areas including Web mining and bioinformatics. Any use of graph structures in mining, howev...
Numerous applications of data mining to scientific data involve the induction of a classification model. In many cases, the collection of data is not performed with this task in m...