Characterising the differences between two databases is an often occurring problem in Data Mining. Detection of change over time is a prime example, comparing databases from two b...
This paper offers a local distributed algorithm for expectation maximization in large peer-to-peer environments. The algorithm can be used for a variety of well-known data mining...
Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
Many real-world graphs have been shown to be scale-free— vertex degrees follow power law distributions, vertices tend to cluster, and the average length of all shortest paths is...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...