Sciweavers

KDD
2009
ACM

CoCo: coding cost for parameter-free outlier detection

How can we automatically spot all outstanding observations in a data set? This question arises in a large variety of applications, e.g. in economics, biology and medicine. Existing approaches to outlier detection suffer from one or more of the following drawbacks: The results of many methods depend strongly on suitable parameter settings, e.g. the minimum cluster size or the number of desired outliers, which are very difficult to estimate without background knowledge of the data. Many methods implicitly assume Gaussian or uniformly distributed data, and/or their results are difficult to interpret. To cope with these problems, we propose CoCo, a technique for parameter-free outlier detection. The basic idea of our technique relates outlier detection to data compression: Outliers are objects which cannot be effectively compressed given the data set. To avoid the assumption of a certain data distribution, CoCo relies on a very general data model combining the Exponential Power Distribution wit...
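The link between compression and outlierness described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only a univariate Exponential Power Distribution (the rest of the data model is cut off in the abstract above), and the function names, the parameters `mu`, `alpha`, `beta` and `resolution`, and the toy data are all hypothetical choices made for illustration. The parameters are fixed by hand here rather than estimated from the data.

```python
import math


def epd_log2_density(x, mu=0.0, alpha=1.0, beta=2.0):
    """Base-2 log of the Exponential Power Distribution density.

    Density: f(x) = beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha) ** beta)
    beta = 2 gives a Gaussian shape, beta = 1 a Laplacian shape.
    (Parameter names are illustrative assumptions, not taken from the paper.)
    """
    log_norm = math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)
    log_pdf = log_norm - (abs(x - mu) / alpha) ** beta
    return log_pdf / math.log(2.0)


def coding_cost_bits(x, mu=0.0, alpha=1.0, beta=2.0, resolution=0.01):
    """Approximate number of bits needed to encode x at the given resolution.

    Uses the standard relation cost = -log2(f(x) * resolution): values that
    are unlikely under the model need more bits.
    """
    return -(epd_log2_density(x, mu, alpha, beta) + math.log2(resolution))


# Hypothetical toy data: four values near the centre and one far away.
data = [0.1, -0.3, 0.2, 0.05, 4.0]
costs = [coding_cost_bits(x) for x in data]

for x, c in zip(data, costs):
    print(f"x = {x:6.2f}  coding cost = {c:6.2f} bits")
```

Under these assumptions the value 4.0 receives a much higher coding cost than the other four values, which is the sense in which an outlier "cannot be effectively compressed" given the rest of the data.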
Added 20 May 2010
Updated 20 May 2010
Type Conference
Year 2009
Where KDD
Authors Christian Böhm, Katrin Haegler, Nikola S. Müller, Claudia Plant