Sciweavers

25 search results - page 2 / 5
» ScalParC: A New Scalable and Efficient Parallel Classificati...
DPD
2002
13 years 5 months ago
Parallel Mining of Outliers in Large Database
Data mining is a new, important, and fast-growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas like mon...
Edward Hung, David Wai-Lok Cheung
IPPS
2006
IEEE
13 years 11 months ago
Design and analysis of a multi-dimensional data sampling service for large scale data analysis applications
Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper we present a scalable sampling imple...
Xi Zhang, Tahsin M. Kurç, Joel H. Saltz, Sr...
KDD
1998
ACM
13 years 9 months ago
On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases
For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
Goetz Graefe, Usama M. Fayyad, Surajit Chaudhuri
IFIP12
2008
13 years 6 months ago
P-Prism: A Computationally Efficient Approach to Scaling up Classification Rule Induction
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unse...
Frederic T. Stahl, Max A. Bramer, Mo Adda
DMKD
1997
ACM
13 years 9 months ago
A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation becau...
Zhexue Huang