The goal of clustering is to identify distinct groups in a dataset. The basic idea of model-based clustering is to approximate the data density by a mixture model, typically a mix...
Many applications in surveillance, monitoring, scientific discovery, and data cleaning require the identification of anomalies. Although many methods have been developed to iden...
We describe a data mining system to detect frauds that are camouflaged to look like normal activities in domains with high number of known relationships. Examples include accounti...
Protein secondary structure prediction and high-throughput drug screen data mining are two important applications in bioinformatics. The data is represented in sparse feature spac...
Steven Eschrich, Nitesh V. Chawla, Lawrence O. Hal...
Spatial scan statistics are used to determine hotspots in spatial data, and are widely used in epidemiology and biosurveillance. In recent years, there has been much effort invest...
Deepak Agarwal, Andrew McGregor, Jeff M. Phillips,...