Mining Classification Rules from Datasets with Large Number of Many-Valued Attributes

Decision tree induction algorithms scale well to large datasets because of their univariate, divide-and-conquer approach. However, they may fail to discover effective knowledge when the input dataset consists of a large number of uncorrelated many-valued attributes. In this paper we present an algorithm, Noah, that tackles this problem by applying a multivariate search. Performing a multivariate search consumes much more computation time and memory, which may be prohibitive for large datasets. We remedy this problem by exploiting effective pruning strategies and efficient data structures. We applied our algorithm to a real marketing application of cross-selling. Experimental results revealed that the application database was too complex for C4.5, which failed to discover any useful knowledge. The application database was also too large for various well-known rule discovery algorithms, which were not able to complete their task. The pruning techniques used in Noah are gen...
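To make the idea of a multivariate search with pruning concrete, here is a minimal sketch. It is not the Noah algorithm, whose details the abstract does not give; it is a generic level-wise search over conjunctions of attribute-value tests that prunes by minimum support. The function name `mine_rules`, its parameters, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def covers(cond, row):
    """True if the row satisfies every (attribute, value) test in cond."""
    return all(row[a] == v for a, v in cond)


def mine_rules(rows, target, min_support=2, min_confidence=0.8, max_len=2):
    """Level-wise search over conjunctions of (attribute, value) tests.

    Illustrative only: a generic support-pruned search, not Noah itself.
    Returns tuples (conditions, predicted_label, support, confidence).
    """
    attrs = [a for a in rows[0] if a != target]
    rules = []

    # Level 1: single tests that cover at least min_support rows.
    counts = Counter((a, r[a]) for r in rows for a in attrs)
    frequent = [frozenset([c]) for c, n in counts.items() if n >= min_support]

    level = 1
    while frequent and level <= max_len:
        for cond in frequent:
            covered = [r for r in rows if covers(cond, r)]
            label, hits = Counter(r[target] for r in covered).most_common(1)[0]
            confidence = hits / len(covered)
            if confidence >= min_confidence:
                rules.append((dict(cond), label, len(covered), confidence))

        # Pruning: only conjunctions that met min_support are extended,
        # because adding a test can never increase the number of covered rows.
        candidates = set()
        for x, y in combinations(frequent, 2):
            joined = x | y
            distinct_attrs = {a for a, _ in joined}
            if len(joined) == level + 1 and len(distinct_attrs) == level + 1:
                candidates.add(joined)
        frequent = [
            c for c in candidates
            if sum(covers(c, r) for r in rows) >= min_support
        ]
        level += 1
    return rules


# Hypothetical toy data: neither attribute predicts "buy" on its own,
# but every pair of values does.
rows = [
    {"a": "x", "b": "p", "buy": "yes"}, {"a": "x", "b": "p", "buy": "yes"},
    {"a": "x", "b": "q", "buy": "no"},  {"a": "x", "b": "q", "buy": "no"},
    {"a": "y", "b": "p", "buy": "no"},  {"a": "y", "b": "p", "buy": "no"},
    {"a": "y", "b": "q", "buy": "yes"}, {"a": "y", "b": "q", "buy": "yes"},
]
rules = mine_rules(rows, target="buy")
for conditions, label, support, confidence in rules:
    print(conditions, "->", label, support, confidence)
```

On this toy data no single-attribute test reaches the confidence threshold, while each two-attribute conjunction does, which is the kind of case where a univariate split criterion has nothing to work with. The candidate set grows quickly with the number of attributes and values, which is why pruning matters for this style of search.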

Giovanni Giuffrida, Wesley W. Chu, Dominique M. Ha

Computer Science | Database | EDBT 2000 | Large Datasets | Multivariate Search | Uncorrelated Many-valued Attributes |

» Using Classification to Evaluate the Output of Confidence-Based Association Rule Mining

» Scalable Rule-Based Gene Expression Data Classification

» Brute-Force Mining of High-Confidence Classification Rules

» Association Rule Mining Algorithms for Set-Valued Data

» Experiences in Building a Tool for Navigating Association Rule Result Sets

» Efficiently Mining Frequent Patterns from Dense Datasets Using a Cluster of Computers

» Algorithms for Mining Distance-Based Outliers in Large Datasets

» A Rough Set Approach to Attribute Generalization in Data Mining

Post Info

Added	24 Aug 2010
Updated	24 Aug 2010
Type	Conference
Year	2000
Where	EDBT
Authors	Giovanni Giuffrida, Wesley W. Chu, Dominique M. Hanssens
