
MLDM
2005
Springer

Using Clustering to Learn Distance Functions for Supervised Similarity Assessment

Assessing the similarity between objects is a prerequisite for many data mining techniques. This paper introduces a novel approach to learning distance functions that maximize the clustering of objects belonging to the same class. Objects belonging to a data set are clustered with respect to a given distance function, and the local class density information of each cluster is then used by a weight adjustment heuristic to modify the distance function so that the class density is increased in the attribute space. This process of interleaving clustering with distance function modification is repeated until a "good" distance function has been found. We implemented our approach using the k-means clustering algorithm. We evaluated our approach using seven UCI data sets for a traditional 1-nearest-neighbor (1-NN) classifier and a compressed 1-NN classifier, called NCC, that uses the learnt distance function and cluster centroids instead of all the points of a training set. The expe...
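The loop described in the abstract (cluster, inspect class purity per cluster, adjust attribute weights, repeat) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not specify the weight adjustment heuristic or the form of the distance function, so the code below assumes a weighted Manhattan distance and substitutes a naive hill-climbing search over the weights. All names (`dist`, `kmeans`, `purity`, `learn_weights`) and the toy data are hypothetical.

```python
import random

def dist(x, y, w):
    # Assumed form: weighted Manhattan distance; w holds one weight per attribute.
    return sum(wi * abs(a - b) for wi, a, b in zip(w, x, y))

def kmeans(points, k, w, iters=20, seed=0):
    # Plain k-means that assigns points using the weighted distance above.
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(p, centroids[c], w))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def purity(assign, labels, k):
    # Fraction of points that share the majority class of their cluster.
    total = 0
    for c in range(k):
        in_cluster = [l for l, a in zip(labels, assign) if a == c]
        if in_cluster:
            total += max(in_cluster.count(v) for v in set(in_cluster))
    return total / len(labels)

def learn_weights(points, labels, k, rounds=10, step=2.0):
    # Stand-in heuristic (an assumption, not the paper's): scale one weight
    # up or down, re-cluster, and keep the change only if purity improves.
    n_attr = len(points[0])
    w = [1.0 / n_attr] * n_attr
    best = purity(kmeans(points, k, w), labels, k)
    for _ in range(rounds):
        improved = False
        for i in range(n_attr):
            for factor in (step, 1.0 / step):
                cand = list(w)
                cand[i] *= factor
                total = sum(cand)
                cand = [c / total for c in cand]  # keep weights summing to 1
                score = purity(kmeans(points, k, cand), labels, k)
                if score > best:
                    w, best, improved = cand, score, True
        if not improved:
            break
    return w, best

# Toy data: attribute 0 separates the two classes, attribute 1 is noise.
rng = random.Random(1)
points, labels = [], []
for i in range(40):
    cls = i % 2
    points.append((cls + rng.uniform(-0.2, 0.2), rng.uniform(0, 10)))
    labels.append(cls)

before = purity(kmeans(points, 2, [0.5, 0.5]), labels, 2)
weights, after = learn_weights(points, labels, k=2)
print("purity with uniform weights:", before)
print("purity with learnt weights: ", after, "weights:", weights)
```

Because the search only accepts a change when purity increases, the learnt weights never score worse on the training data than the uniform starting point. Whether they generalize is a separate question that the sketch does not address.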
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where MLDM
Authors Christoph F. Eick, Alain Rouhana, Abraham Bagherjeiran, Ricardo Vilalta