Emerging data stream management systems approach the challenge of massive data distributions which arrive at high speeds while there is only small storage by summarizing and minin...
In this paper, we examine the performance of frequent pattern mining algorithms on a modern processor. A detailed performance study reveals that even the best frequent pattern min...
We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
In various applications such as data cleansing, being able to retrieve categorical or numerical attributes based on notions of approximate match (e.g., edit distance, numerical di...
Liang Jin, Nick Koudas, Chen Li, Anthony K. H. Tun...
Many database applications have the emerging need to support fuzzy queries that ask for strings that are similar to a given string, such as “name similar to smith” and “tele...