Outlier Detection for High Dimensional Data

The outlier detection problem has important applications in the field of fraud detection, network robustness analysis, and intrusion detection. Most such applications are high dimensional domains in which the data can contain hundreds of dimensions. Many recent algorithms use concepts of proximity in order to find outliers based on their relationship to the rest of the data. However, in high dimensional space, the data is sparse and the notion of proximity fails to retain its meaningfulness. In fact, the sparsity of high dimensional data implies that every point is an almost equally good outlier from the perspective of proximity-based definitions. Consequently, for high dimensional data, the notion of finding meaningful outliers becomes substantially more complex and non-obvious. In this paper, we discuss new techniques for outlier detection which find the outliers by studying the behavior of projections from the data set.
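The claim that proximity loses meaning in high dimensions can be illustrated with a small experiment. The sketch below is not the authors' projection-based method; it only measures how the gap between the nearest and farthest neighbor shrinks relative to the nearest distance as dimensionality grows. The function name `relative_contrast`, the uniform random data, and the sample sizes are all assumptions made for illustration.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points, all drawn uniformly
    from the unit hypercube (an assumed, synthetic data model)."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # A large contrast means "near" and "far" are clearly distinguishable;
    # a contrast close to zero means all points look about equally distant.
    for d in (2, 10, 100, 1000):
        print(d, round(relative_contrast(500, d), 3))
```

On such synthetic data the contrast typically falls sharply as the number of dimensions increases, which is one way to see why every point looks like an almost equally good outlier under purely distance-based definitions.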
Charu C. Aggarwal, Philip S. Yu
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2001
Authors Charu C. Aggarwal, Philip S. Yu