We consider the problem of finding related tables in a large corpus of heterogeneous tables. Detecting related tables provides users a powerful tool for enhancing their tables wit...
Anish Das Sarma, Lujun Fang, Nitin Gupta 0003, Alo...
A major challenge of the anti-virus (AV) industry is how to effectively process the huge influx of malware samples they receive every day. One possible solution to this problem i...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Clustering is one of the most important tasks for geographic knowledge discovery. However, existing clustering methods have two severe drawbacks for this purpose. First, spatial c...
Existing solutions to the automated physical design problem in database systems attempt to minimize execution costs of input workloads for a given storage constraint. In t...