Measuring similarity or distance between two entities is a key step for several data mining and knowledge discovery tasks. The notion of similarity for continuous data is relative...
The concept of similarity is fundamentally important in almost every scientific field. Clustering, distance-based outlier detection, classification, regression and sea...
Determining similarities among data objects is a core task of content-based multimedia retrieval systems. Approximating data object contents via flexible feature representations, ...
This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods we...
In this paper, we present a framework for categorical data analysis which allows such data sets to be explored using a rich set of techniques that are only applicable to continuou...