Sciweavers

» ScalParC: A New Scalable and Efficient Parallel Classificati...
DPD
2002
13 years 5 months ago
Parallel Mining of Outliers in Large Database
Data mining is a new, important and fast growing database application. Outlier (exception) detection is one kind of data mining, which can be applied in a variety of areas like mon...
Edward Hung, David Wai-Lok Cheung
IPPS
2006
IEEE
13 years 11 months ago
Design and analysis of a multi-dimensional data sampling service for large scale data analysis applications
Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper we present a scalable sampling imple...
Xi Zhang, Tahsin M. Kurç, Joel H. Saltz, Sr...
KDD
1998
ACM
13 years 10 months ago
On the Efficient Gathering of Sufficient Statistics for Classification from Large SQL Databases
For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
Goetz Graefe, Usama M. Fayyad, Surajit Chaudhuri
IFIP12
2008
13 years 7 months ago
P-Prism: A Computationally Efficient Approach to Scaling up Classification Rule Induction
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unse...
Frederic T. Stahl, Max A. Bramer, Mo Adda
DMKD
1997
ACM
13 years 10 months ago
A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation becau...
Zhexue Huang