Sciweavers

4085 search results - page 474 / 817
» Benchmarking Data Mining Algorithms
DMDW
2000
93 views, Management
15 years 7 months ago
Storing auxiliary data for efficient maintenance and lineage tracing of complex views
As views in a data warehouse become more complex, the view maintenance process can become very complicated and potentially very inefficient. Storing auxiliary views in the warehou...
Yingwei Cui, Jennifer Widom
SIGMOD
1998
ACM
99 views, Database
15 years 10 months ago
CURE: An Efficient Clustering Algorithm for Large Databases
Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clust...
Sudipto Guha, Rajeev Rastogi, Kyuseok Shim
SIGMOD
2010
ACM
277 views, Database
15 years 11 months ago
A comparison of join algorithms for log processing in MapReduce
The MapReduce framework is increasingly being used to analyze large volumes of data. One important type of data analysis done with MapReduce is log processing, in which a click-st...
Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Ju...
SIGMOD
2001
ACM
200 views, Database
16 years 6 months ago
Data Bubbles: Quality Preserving Performance Boosting for Hierarchical Clustering
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Kr...
WSDM
2009
ACM
125 views, Data Mining
16 years 1 month ago
Less is more: sampling the neighborhood graph makes SALSA better and faster
In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorith...
Marc Najork, Sreenivas Gollapudi, Rina Panigrahy