The present paper analyzes the usefulness of the normalized compression distance for the problem of clustering the hemagglutinin (HA) sequences of influenza virus data for the HA gene...
Clustering of EST data is a method for the non-redundant representation of an organism's transcriptome. During clustering of large amounts of EST data, usually some large clusters ...
Cluster validation to determine the right number of clusters is an important issue in clustering processes. In this work, a strategy to address the problem of cluster val...
Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
Given the ubiquity of time series data, the data mining community has spent significant time investigating the best time series similarity measure to use for various tasks and dom...
Qiang Zhu 0002, Gustavo E. A. P. A. Batista, Thana...