SDM
2008
SIAM

A general framework for estimating similarity of datasets and decision trees: exploring semantic similarity of decision trees

Decision trees are among the most popular pattern types in data mining because of their intuitive representation. However, little attention has been paid to defining measures of semantic similarity between decision trees. In this work, we present a general framework for similarity estimation whose special cases include estimating the semantic similarity between decision trees, as well as various forms of similarity estimation on classification datasets with respect to different probability distributions defined over the attribute-class space of the datasets. The similarity estimation is based on the partitions that the decision trees induce on the attribute space of the datasets. We use the framework to estimate the semantic similarity of decision trees induced from different subsamples of classification datasets, and we evaluate its performance against the empirical semantic similarity, which we estimate on independent hold-out test sets. The avai...
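To make the hold-out baseline mentioned in the abstract concrete, here is a minimal sketch. It assumes that the empirical semantic similarity of two trees can be read as the fraction of hold-out instances on which they predict the same class; the abstract does not spell out the definition, so this reading is an assumption. The two trees, their thresholds, the function name empirical_agreement, and the hold-out points are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence

Instance = Sequence[float]
Classifier = Callable[[Instance], str]


def tree_a(x: Instance) -> str:
    # Hypothetical tree: split on attribute 0, then on attribute 1.
    if x[0] <= 0.5:
        return "neg"
    return "pos" if x[1] > 0.3 else "neg"


def tree_b(x: Instance) -> str:
    # Hypothetical stump: a single split on attribute 0.
    return "neg" if x[0] <= 0.6 else "pos"


def empirical_agreement(f: Classifier, g: Classifier,
                        holdout: Sequence[Instance]) -> float:
    """Fraction of hold-out instances on which f and g predict the same class.

    Assumed stand-in for the empirical semantic similarity; not taken
    from the paper itself.
    """
    if not holdout:
        raise ValueError("hold-out set must not be empty")
    agree = sum(1 for x in holdout if f(x) == g(x))
    return agree / len(holdout)


# Hypothetical hold-out instances (attribute 0, attribute 1).
holdout = [(0.2, 0.9), (0.55, 0.5), (0.7, 0.1), (0.9, 0.8)]
similarity = empirical_agreement(tree_a, tree_b, holdout)
print(similarity)
```

The trees disagree only in the regions of the attribute space where their partitions assign different classes, which is the intuition behind estimating similarity from the induced partitions rather than from the tree structure.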
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2008
Where SDM
Authors Irene Ntoutsi, Alexandros Kalousis, Yannis Theodoridis