The objective of data reduction is to obtain a compact representation of a large data set to facilitate repeated use of non-redundant information with complex and slow learning alg...
The advent of social network sites in the last years seems to be a trend that will likely continue. What naive technology users may not realize is that the information they provide...
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
For categorical data there does not exist any similarity measure which is as straight forward and general as the numerical distance between numerical items. Due to this it is ofte...
Motivation: In the field of bioinformatics there is an emerging need to integrate all knowledge discovery steps into a standardized modular framework. Indeed, component-based deve...