Sciweavers

KDD
2005
ACM

Fast discovery of unexpected patterns in data, relative to a Bayesian network

We consider a model in which background knowledge on a given domain of interest is available in terms of a Bayesian network, in addition to a large database. The mining problem is to discover unexpected patterns: our goal is to find the strongest discrepancies between network and database. This problem is intrinsically difficult because it requires inference in a Bayesian network and processing the entire, potentially very large, database. A sampling-based method that we introduce is efficient and yet provably finds the approximately most interesting unexpected patterns. We give a rigorous proof of the method's correctness. Experiments shed light on its efficiency and practicality for large-scale Bayesian networks and databases.
Categories and Subject Descriptors H.2.8 [Database Management]: Database Applications - Data Mining
General Terms Algorithms, Experimentation, Performance
Keywords Bayesian Networks, Association Rules, Sampling
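The mining problem stated in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that the "discrepancy" of an attribute set is the largest absolute difference between its joint probability estimated from a database sample and from a sample drawn from the network, and it gives none of the approximation guarantees the abstract mentions. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def joint_frequencies(records, attrs):
    """Relative frequency of each value combination of `attrs` in `records`."""
    counts = Counter(tuple(r[a] for a in attrs) for r in records)
    total = len(records)
    return {values: c / total for values, c in counts.items()}


def discrepancy(data_sample, network_sample, attrs):
    """Largest absolute gap between the two estimated joint distributions.

    Assumption: this scoring rule is chosen for illustration only.
    """
    p_data = joint_frequencies(data_sample, attrs)
    p_net = joint_frequencies(network_sample, attrs)
    keys = set(p_data) | set(p_net)
    return max(abs(p_data.get(k, 0.0) - p_net.get(k, 0.0)) for k in keys)


def most_unexpected(data_sample, network_sample, attributes, max_size=2, top_k=3):
    """Rank attribute sets of up to `max_size` attributes by discrepancy."""
    scored = []
    for size in range(1, max_size + 1):
        for attrs in combinations(attributes, size):
            scored.append((discrepancy(data_sample, network_sample, attrs), attrs))
    # Highest discrepancy first; ties broken by attribute names.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


# Toy database sample: A and B always agree.
data_sample = [{"A": 1, "B": 1}] * 50 + [{"A": 0, "B": 0}] * 50

# Toy sample standing in for draws from a network that treats A and B
# as independent and uniform (hypothetical background knowledge).
network_sample = [
    {"A": a, "B": b} for a in (0, 1) for b in (0, 1) for _ in range(25)
]

ranking = most_unexpected(data_sample, network_sample, ["A", "B"])
```

In this toy setup the single-attribute marginals agree between the two samples, so only the pair `("A", "B")` scores above zero: the database shows a dependency that the assumed network does not encode.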
Szymon Jaroszewicz, Tobias Scheffer
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2005
Where KDD
Authors Szymon Jaroszewicz, Tobias Scheffer