Sciweavers

679 search results - page 48 / 136
» Scaling Clustering Algorithms to Large Databases
IFIP12
2008
15 years 3 months ago
P-Prism: A Computationally Efficient Approach to Scaling up Classification Rule Induction
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unse...
Frederic T. Stahl, Max A. Bramer, Mo Adda
SIGMOD
2001
ACM
16 years 1 month ago
Epsilon Grid Order: An Algorithm for the Similarity Join on Massive High-Dimensional Data
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Christian Böhm, Bernhard Braunmüller, Fl...
CIKM
2009
Springer
15 years 8 months ago
Packing the most onto your cloud
Parallel dataflow programming frameworks such as Map-Reduce are increasingly being used for large scale data analysis on computing clouds. It is therefore becoming important to a...
Ashraf Aboulnaga, Ziyu Wang, Zi Ye Zhang
KDD
2005
ACM
15 years 7 months ago
Mining closed relational graphs with connectivity constraints
Relational graphs are widely used in modeling large scale networks such as biological networks and social networks. In this kind of graph, connectivity becomes critical in identif...
Xifeng Yan, Xianghong Jasmine Zhou, Jiawei Han
SIGMOD
2009
ACM
16 years 1 month ago
Distributed data-parallel computing using a high-level programming language
The Dryad and DryadLINQ systems offer a new programming model for large scale data-parallel computing. They generalize previous execution environments such as SQL and MapReduce in...
Michael Isard, Yuan Yu