Entities on social systems, such as users on Twitter, and images on Flickr, are at the core of many interesting applications: they can be ranked in search results, recommended to ...
In this paper, we consider a novel scheme referred to as Cartesian contour to concisely represent the collection of frequent itemsets. Different from the existing works, this sche...
Complexity of post-genomic data and multiplicity of mining strategies are two limits to Knowledge Discovery in Databases (KDD) in life sciences. Because they provide a semantic fr...
A key method for privacy preserving data mining is that of randomization. Unlike k-anonymity, this technique does not include public information in the underlying assumptions. In ...
We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...