Sciweavers

» Clustering performance data efficiently at massive scales
KDD
2003
ACM
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...
Stephen D. Bay, Mark Schwabacher
SIGMOD
2007
ACM
Indexing dataspaces
Dataspaces are collections of heterogeneous and partially unstructured data. Unlike data-integration systems that also offer uniform access to heterogeneous data sources, dataspac...
Xin Dong, Alon Y. Halevy
PCM
2001
Springer
An Adaptive Index Structure for High-Dimensional Similarity Search
A practical method for creating a high dimensional index structure that adapts to the data distribution and scales well with the database size, is presented. Typical media descrip...
Peng Wu, B. S. Manjunath, Shivkumar Chandrasekaran
CVPR
2010
IEEE
Semi-supervised Hashing for Scalable Image Retrieval
Large scale image search has recently attracted considerable attention due to easy availability of huge amounts of data. Several hashing methods have been proposed to allow approx...
Jun Wang, Sanjiv Kumar, Shih-Fu Chang
ICML
2003
IEEE
Learning Distance Functions using Equivalence Relations
We address the problem of learning distance metrics using side-information in the form of groups of "similar" points. We propose to use the RCA algorithm, which is a sim...
Aharon Bar-Hillel, Tomer Hertz, Noam Shental, Daph...