We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as the underlying process that generates the corpus and utilize a novel,...
Identifiers represent an important source of information for programmers understanding and maintaining a system. Self-documenting identifiers reduce the time and effort necessa...
Efficient mining of frequent patterns from large databases has been an active area of research because it is the most expensive step in association rule mining. In this paper, we pr...
Classification problems in critical applications such as health care or security often require very high reliability because of the high costs of errors. In order to achieve this r...
Distance permutation indexes support fast proximity searching in high-dimensional metric spaces. Given some fixed reference sites, for each point in a database the index stores a...
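As an illustration of the last abstract, here is a minimal sketch of how a distance permutation might be computed. The abstract is cut off before it says what the index stores, so this assumes, based only on the name, that each database point is represented by the ordering of the reference sites from nearest to farthest. The function names, the Euclidean metric, and the sample data are all hypothetical and are not taken from the paper.

```python
import math

def distance_permutation(point, sites):
    """Return the site indices ordered from nearest to farthest from `point`.

    Assumption: Euclidean distance; any metric could be substituted.
    Ties are broken by site index so the result is deterministic.
    """
    return tuple(
        sorted(range(len(sites)), key=lambda i: (math.dist(point, sites[i]), i))
    )

def build_index(points, sites):
    """Compute one permutation per database point (hypothetical helper)."""
    return [distance_permutation(p, sites) for p in points]

# Hypothetical reference sites and database points in the plane.
sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(1.0, 2.0), (9.0, 2.0), (2.0, 8.0)]

index = build_index(points, sites)
for p, perm in zip(points, index):
    print(p, perm)
```

Under this assumption, points that lie close together tend to see the sites in the same order, so comparing permutations can serve as a cheap filter before computing exact distances.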