
KDD

2007

ACM

The discovery of subsets with special properties from binary data has been one of the key themes in pattern discovery. Pattern classes such as frequent itemsets stress the co-occurrence of the value 1 in the data. While this choice makes sense in the context of sparse binary data, it disregards potentially interesting subsets of attributes that have some other type of dependency structure. We consider the problem of finding all subsets of attributes that have low complexity. The complexity is measured by either the entropy of the projection of the data on the subset, or the entropy of the data for the subset when modeled using a Bayesian tree with downward- or upward-pointing edges. We show that the entropy measure on sets has a monotonicity property, and thus a levelwise approach can find all low-entropy itemsets. We also show that the tree-based measures are bounded above by the entropy of the corresponding itemset, allowing similar algorithms to be used for finding low-entropy trees...
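The levelwise idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `entropy` and `low_entropy_sets`, the row-tuple data layout, and the Apriori-style candidate generation are all assumptions made for illustration. The sketch relies only on the monotonicity property stated above (adding attributes to a set never lowers the entropy of the projection), so a set can be low-entropy only if all of its subsets are. The tree-based measures are not covered.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(rows, attrs):
    """Empirical entropy (in bits) of the rows projected onto the columns in attrs."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return sum(c / n * log2(n / c) for c in counts.values())


def low_entropy_sets(rows, n_attrs, threshold):
    """Levelwise search for every attribute set with entropy <= threshold.

    Hypothetical helper: returns a dict mapping sorted attribute tuples
    to their entropy.
    """
    result = {}
    level = [(a,) for a in range(n_attrs)]  # start from single attributes
    while level:
        # Keep the candidates of this level that pass the threshold.
        kept = {}
        for s in level:
            h = entropy(rows, s)
            if h <= threshold:
                kept[s] = h
        result.update(kept)

        # Build the next level: join kept sets that share a prefix, and
        # prune any candidate with a subset that was not kept
        # (justified by monotonicity of entropy).
        candidates = set()
        for s, t in combinations(sorted(kept), 2):
            if s[:-1] == t[:-1]:
                cand = s + (t[-1],)
                if all(sub in kept for sub in combinations(cand, len(cand) - 1)):
                    candidates.add(cand)
        level = sorted(candidates)
    return result
```

For example, on a dataset where two columns always agree and a third is independent of them, calling `low_entropy_sets(rows, 3, 1.0)` would be expected to report the pair of agreeing columns but no set that combines either of them with the independent column.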

Added | 30 Nov 2009
Updated | 30 Nov 2009
Type | Conference
Year | 2007
Where | KDD
Authors | Eino Hinkkanen, Hannes Heikinheimo, Heikki Mannila, Jouni K. Seppänen, Taneli Mielikäinen
