Data clustering represents an important tool in exploratory data analysis. The lack of objective criteria render model selection as well as the identification of robust solutions...
Clustering methods can be either data-driven or need-driven. Data-driven methods intend to discover the true structure of the underlying data while need-driven methods aims at org...
Biclustering refers to simultaneously capturing correlations present among subsets of attributes (columns) and records (rows). It is widely used in data mining applications includ...
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented cooperates...
We analyze a massive social network, gathered from the records of a large mobile phone operator, with more than a million users and tens of millions of calls. We examine the distr...