P2P systems represent a large portion of the Internet traffic which makes the data discovery of great importance to the user and the broad Internet community. Hence, the power of ...
Semi-supervised clustering uses a small amount of supervised data to aid unsupervised learning. One typical approach specifies a limited number of must-link and cannotlink constra...
A computationally fast procedure for identifying outliers is presented, that is particularly effective in high dimensions. This algorithm utilizes simple properties of principal c...
Abstract. The high dimensionality of the data generated by social networks has been a big challenge for researchers. In order to solve the problems associated with this phenomenon,...
Inference of latent variables from complicated data is one important problem in data mining. The high dimensionality and high complexity of real world data often make accurate infe...