Background: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usua...
Luis A. Diago, Persy Morell, Longendri Aguilera, E...
In recent years interest has grown in “mining” large databases to extract novel and interesting information. Knowledge Discovery in Databases (KDD) has been recognised as an em...
In this paper, we propose GAD (General Activity Detection) for fast clustering on large-scale data. Within this framework we design a set of algorithms for different scenarios: (...
Jiawei Han, Liangliang Cao, Sangkyum Kim, Xin Jin,...
Population-based real-life datasets often contain smaller clusters of unusual sub-populations. While these clusters, called 'hot spots', are small and sparse, they are usuall...
A vast number of documents on the Web have duplicates, which poses a challenge for developing efficient methods that would compute clusters of similar documents. In this paper we use ...