Mining Frequent Patterns without Candidate Generation

13 years 9 months ago


Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist a large number of patterns and/or long patterns. In this study, we propose a novel frequent-pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a condensed, smaller data structure, FP-tree, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern-fragment growth method to avoid the costly generati...
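To make the two ideas named in the abstract concrete (a prefix tree that compresses the database, and pattern growth in place of candidate generation), here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `FPNode`, `build_fp_tree`, `fp_growth`, and `mine` are hypothetical, transactions are represented as `(items, count)` pairs so that conditional pattern bases can reuse the same builder, and no single-path or other optimization is attempted.

```python
from collections import defaultdict


class FPNode:
    """One prefix-tree node: an item, its count, a parent link, and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(weighted_transactions, min_support):
    """Build a tree from (items, count) pairs (hypothetical helper).

    Returns (root, header, frequent): header maps each frequent item to the
    nodes that carry it; frequent maps each frequent item to its support.
    """
    # First scan: count the support of every single item.
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = FPNode(None, None)
    header = defaultdict(list)
    # Second scan: insert each transaction with its frequent items sorted by
    # descending support, so that common prefixes share nodes.
    for items, count in weighted_transactions:
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return root, header, frequent


def fp_growth(weighted_transactions, min_support, suffix=()):
    """Yield (itemset, support) pairs by growing patterns from `suffix`."""
    _, header, frequent = build_fp_tree(weighted_transactions, min_support)
    for item, item_support in frequent.items():
        pattern = tuple(sorted(suffix + (item,)))
        yield pattern, item_support
        # Conditional pattern base: the prefix path above every node that
        # carries `item`, weighted by that node's count.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the conditional base; no candidate sets are generated.
        yield from fp_growth(conditional, min_support, pattern)


def mine(transactions, min_support):
    """Return {itemset tuple: support} for plain lists of items."""
    return dict(fp_growth([(t, 1) for t in transactions], min_support))


patterns = mine([["a", "b"], ["a", "b", "c"], ["a", "c"]], min_support=2)
```

The sketch reads the original database only twice, once to count items and once to build the tree; every later step works on prefix paths taken from a tree rather than on the raw transactions. Rebuilding a small tree for each conditional pattern base keeps the code short at the cost of some efficiency.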

Jiawei Han, Jian Pei, Yiwen Yin
