A parallel, distributed algorithm for relational frequent pattern discovery from very large data sets



The amount of data produced by ubiquitous computing applications is growing quickly, owing to the pervasive presence of small devices endowed with sensing, computing and communication capabilities. The heterogeneity and strong interdependence that characterize ‘ubiquitous data’ call for a (multi-)relational approach to their analysis. However, relational data mining algorithms do not scale well, and very large data sets can hardly be processed. In this paper we propose an extension of a relational algorithm for multi-level frequent pattern discovery that resorts to data sampling and distributed computation in Grid environments in order to overcome the computational limits of the original serial algorithm. The set of patterns discovered by the new algorithm approximates the set of exact solutions found by the serial algorithm. The quality of the approximation depends on three parameters: the proportion of data in each sample, the minimum support thresholds and the number of samples in whic...
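The general idea of approximating exact frequent patterns by mining several samples can be illustrated with a small sketch. This is not the authors' algorithm: it mines plain itemsets rather than multi-level relational patterns, it runs the samples in a serial loop rather than on a Grid, and the voting rule (`min_votes`, the number of samples in which a pattern must be frequent) is an assumption, since the abstract is cut off before the third parameter is fully described. All function and parameter names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Exact miner: itemsets (frozensets) with relative support >= min_support.

    Brute-force enumeration up to max_size items; fine for a toy example only.
    """
    if not transactions:
        return set()
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s for s, c in counts.items() if c / n >= min_support}


def approximate_frequent_itemsets(transactions, sample_fraction, min_support,
                                  n_samples, min_votes, seed=0):
    """Mine each random sample independently, then keep the itemsets that are
    frequent in at least min_votes samples (assumed combination rule)."""
    rng = random.Random(seed)
    size = max(1, int(sample_fraction * len(transactions)))
    votes = Counter()
    for _ in range(n_samples):
        # Each sample could be mined on a separate node; here it is serial.
        sample = rng.sample(transactions, size)
        votes.update(frequent_itemsets(sample, min_support))
    return {s for s, v in votes.items() if v >= min_votes}


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
    exact = frequent_itemsets(data, min_support=0.5)
    approx = approximate_frequent_itemsets(
        data, sample_fraction=0.5, min_support=0.5, n_samples=5, min_votes=3
    )
    print(sorted(map(sorted, exact)))
    print(sorted(map(sorted, approx)))
```

In this sketch the three knobs mirror the parameters named in the abstract: a smaller `sample_fraction` makes each sample cheaper to mine but noisier, while `min_support` and `min_votes` trade missed patterns against spurious ones.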

Annalisa Appice, Michelangelo Ceci, Antonio Turi, Donato Malerba


Algorithm | Frequent Pattern | IDA 2011 | Information Technology | Large Data Sets


» Efficiently Mining Frequent Patterns from Dense Datasets Using a Cluster of Computers

» Efficient Frequent Pattern Mining in Relational Databases

» Parallel mining of closed quasicliques

» A MultiLevel Parallel Implementation of a Program for Finding Frequent Patterns in a Large...

» Finding Frequent Patterns in a Large Sparse Graph

» Finding All Frequent Patterns Starting from the Closure

» Discovery of Frequent DATALOG Patterns

» SLPMiner An Algorithm for Finding Frequent Sequential Patterns Using LengthDecreasing Supp...

Post Info

Added	14 May 2011
Updated	14 May 2011
Type	Journal
Year	2011
Where	IDA
Authors	Annalisa Appice, Michelangelo Ceci, Antonio Turi, Donato Malerba

