Abstract. Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have ...
Sampling has been recognized as an important technique to improve the efficiency of clustering. However, with sampling applied, those points which are not sampled will not have t...
In the context of large databases, data preparation takes a greater importance : instances and explanatory attributes have to be carefully selected. In supervised learning, instanc...
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation becau...
Abstract. Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and...