High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k-nearest-neighbor (knn) gr...
This paper is concerned with efficient querying of very large multi-resolution datasets on storage and compute clusters. We present a suite of services that support storage, index...
Similarity search has proven suitable for searching very large collections of unstructured data objects. We are interested in efficient parallel query processing under si...
We study a generalization of the k-median problem with respect to an arbitrary dissimilarity measure D. Given a finite set P, our goal is to find a set C of size k such that the s...
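As an illustration of the problem setting only, the following minimal sketch evaluates a k-median style cost for an arbitrary dissimilarity measure D. It assumes the standard k-median objective, the sum over all p in P of the dissimilarity to the closest center in C; that objective is an assumption here, not a statement taken from the paper. The function names (`kmedian_cost`, `brute_force_kmedian`, `squared_euclidean`) are hypothetical, and the exhaustive search restricts centers to points of P, which is a simplifying assumption and not the paper's algorithm.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Dissimilarity = Callable[[Point, Point], float]


def kmedian_cost(points: Sequence[Point],
                 centers: Sequence[Point],
                 dissim: Dissimilarity) -> float:
    """Sum, over all points, of the dissimilarity to the closest center.

    Assumes the standard k-median objective; `dissim` may be any
    dissimilarity measure D and need not be a metric.
    """
    return sum(min(dissim(p, c) for c in centers) for p in points)


def brute_force_kmedian(points: Sequence[Point],
                        k: int,
                        dissim: Dissimilarity):
    """Try every k-subset of `points` as the center set C.

    Restricting centers to the input points is an assumption made for
    illustration; the search is exponential in k and only usable on
    tiny inputs.
    """
    best = min(combinations(points, k),
               key=lambda cs: kmedian_cost(points, cs, dissim))
    return best, kmedian_cost(points, best, dissim)


def squared_euclidean(p: Point, q: Point) -> float:
    """One example of a non-metric dissimilarity measure D."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


if __name__ == "__main__":
    P = [(0.0,), (1.0,), (10.0,), (11.0,)]
    C, cost = brute_force_kmedian(P, 2, squared_euclidean)
    print(C, cost)
```

Passing D as a plain function keeps the cost evaluation independent of the chosen measure, so other dissimilarities can be swapped in without touching the search.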
While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimension...