Supporting top-k queries over distributed collections of schemaless XML data poses two challenges. While XML supports expressive query languages such as XPath and XQuery, these la...
We present similarity-based methods to cluster digital photos by time and image content. The approach is general, unsupervised, and makes minimal assumptions regarding the structu...
Matthew L. Cooper, Jonathan Foote, Andreas Girgens...
Clustering the results of a search helps the user to overview the information returned. In this paper, we regard the clustering task as indexing the search results. Here, an index...
With the explosion in the amount of semi-structured data users access and store in personal information management systems, there is a need for complex search tools to retrieve of...
We propose a partitioning scheme for similarity search indexes that is called Maximal Metric Margin Partitioning (MMMP). MMMP divides the data on the basis of its distribution pat...