The problem of finding clusters in data is challenging when clusters are of widely differing sizes, densities and shapes, and when the data contains large amounts of noise and out...
Researchers in the social and behavioral sciences routinely rely on quasi-experimental designs to discover knowledge from large databases. Quasi-experimental designs (QEDs) exploi...
David D. Jensen, Andrew S. Fast, Brian J. Taylor, ...
Condensed representations of pattern collections have been recognized to be important building blocks of inductive databases, a promising theoretical framework for data mining, and...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
The issue of data mining in time series databases is of utmost importance for many practical applications and has attracted a lot of research in the past years. In this paper, we ...