While previous CPU- or memory-centric load-balancing schemes can make effective use of global CPU and memory resources in a cluster system, the cluster exhib...
Xiao Qin, Hong Jiang, Yifeng Zhu, David R. Swanson
This article presents ways to identify, select, and transform data into other structures. Relation selection and transformation may change data quantity and...
Partitioning a large set of objects into homogeneous clusters is a fundamental operation in data mining. The k-means algorithm is best suited for implementing this operation becau...
Missing data handling is an important preparation step for most data discrimination or mining tasks. Inappropriate treatment of missing data may cause large errors or false result...
Exploratory data analysis is inherently an iterative, interactive endeavor. In the context of massive data sets, however, many current data analysis algorithms will not scale appr...