We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
We describe a simple randomized construction for generating pairs of hash functions h1, h2 from a universe U to ranges V = [m] = {0, 1, . . . , m - 1} and W = [m] so that for ever...
Features in many real world applications such as Cheminformatics, Bioinformatics and Information Retrieval have complex internal structure. For example, frequent patterns mined fr...
Nowadays, graph-based knowledge discovery algorithms do not consider numeric attributes (they are discarded in the preprocessing step, or they are treated as alphanumeric values w...
Oscar E. Romero, Jesus A. Gonzalez, Lawrence B. Ho...
This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe...
Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In ...