In k-means clustering we are given a set of n data points in d-dimensional space ℝ^d and an integer k, and the problem is to determine a set of k points in ℝ^d, called centers, to mi...
Tapas Kanungo, David M. Mount, Nathan S. Netanyahu...
Abstract. Nearest neighbor searching is a fundamental computational problem. A set of n data points is given in real d-dimensional space, and the problem is to preprocess these poi...
The wide availability of large-scale databases requires more efficient and scalable tools for data understanding and knowledge discovery. In this paper, we present a method to ...
Duy-Dinh Le, Shin'ichi Satoh, Michael E. Houle, Da...
It is relatively common for different people or organizations to share the same name. Given the increasing amount of information available online, this results in the ever-growing...
Motivation: High-throughput expression profiling allows researchers to study gene activities globally. Genes with similar expression profiles are likely to encode proteins that ma...