Sciweavers

1081 search results - page 101 / 217
» A Database Interface for Clustering in Large Spatial Databas...
KDD
2009
ACM
198 views, Data Mining
16 years 4 months ago
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
VLDB
2002
ACM
154 views, Database
15 years 3 months ago
I/O-Conscious Data Preparation for Large-Scale Web Search Engines
Given that commercial search engines cover billions of web pages, efficiently managing the corresponding volumes of disk-resident data needed to answer user queries quickly is a f...
Maxim Lifantsev, Tzi-cker Chiueh
CIKM
2009
Springer
15 years 10 months ago
Mining tourist information from user-supplied collections
Tourist photographs constitute a large part of the images uploaded to photo sharing platforms. But filtering methods are needed before one can extract useful knowledge from noisy ...
Adrian Popescu, Gregory Grefenstette, Pierre-Alain...
DOLAP
2006
ACM
15 years 10 months ago
Pre-aggregation with probability distributions
Motivated by the increasing need to analyze complex, uncertain multidimensional data this paper proposes probabilistic OLAP queries that are computed using probability distributio...
Igor Timko, Curtis E. Dyreson, Torben Bach Pederse...
KDD
2005
ACM
205 views, Data Mining
15 years 9 months ago
Feature bagging for outlier detection
Outlier detection has recently become an important problem in many industrial and financial applications. In this paper, a novel feature bagging approach for detecting outliers in...
Aleksandar Lazarevic, Vipin Kumar