The ultimate goal of data mining is to extract knowledge from massive data. Knowledge is ideally represented as human-comprehensible patterns from which end-users can gain intuiti...
The segmentation of time-series is a constrained clustering problem: the data points should be grouped by their similarity, but with the constraint that all points in a cluster mus...
We show that an important and computationally challenging solution space feature of the graph coloring problem (COL), namely the number of clusters of solutions, can be accurately...
: We propose a set of statistical metrics for making a comprehensive, fair, and insightful evaluation of features, clustering algorithms, and distance measures in representative sa...
This paper presents an approach based on Information Retrieval (IR) techniques for extracting and representing the unstructured information in large software systems such that it ...