Parallel Algorithms for Distance-Based and Density-Based Outliers

An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism. Outlier detection has many applications, such as data cleaning, fraud detection and network intrusion detection. The existence of outliers can indicate individuals or groups whose behavior is very different from that of most individuals in the dataset. In this paper we design two parallel algorithms. The first finds distance-based outliers using nested loops together with randomization and a pruning rule. The second detects density-based local outliers. In both cases data parallelism is used. We show that both algorithms reach near-linear speedup. Our algorithms are tested on four real-world datasets from the UCI Machine Learning Database Repository.
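The first algorithm's ingredients (nested loops, randomization, a pruning rule, data parallelism) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not define the outlier score, so the sketch assumes it is the distance to the k-th nearest neighbor and reports the top n points by that score. All function and parameter names (`knn_score`, `top_outliers_block`, `parallel_outliers`, `k`, `n`, `workers`) are hypothetical, and a thread pool stands in for whatever parallel runtime the paper used.

```python
import heapq
import math
import random
from concurrent.futures import ThreadPoolExecutor


def knn_score(i, data, order, k, cutoff):
    """Distance from data[i] to its k-th nearest neighbor, or None if pruned.

    Assumes k < len(data). `order` is a shuffled scan order for the inner loop.
    """
    nearest = []  # max-heap (negated values) of the k smallest distances so far
    for j in order:
        if j == i:
            continue
        d = math.dist(data[i], data[j])
        if len(nearest) < k:
            heapq.heappush(nearest, -d)
        elif d < -nearest[0]:
            heapq.heapreplace(nearest, -d)
        # Pruning rule: the running k-th nearest distance can only shrink, so
        # once it drops below the cutoff this point cannot be a top-n outlier.
        if len(nearest) == k and -nearest[0] < cutoff:
            return None
    return -nearest[0]


def top_outliers_block(block, data, order, k, n):
    """Top-n (score, index) pairs among the candidate indices in `block`."""
    top = []      # min-heap; top[0] is the weakest outlier kept so far
    cutoff = 0.0  # score of the weakest kept outlier once n are found
    for i in block:
        score = knn_score(i, data, order, k, cutoff)
        if score is None:
            continue
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return top


def parallel_outliers(data, k=3, n=2, workers=4, seed=0):
    """Split candidates across workers, then merge the per-block top-n lists."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)  # randomization: neighbors found early
    blocks = [order[w::workers] for w in range(workers)]  # data parallelism
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda b: top_outliers_block(b, data, order, k, n), blocks
        )
        merged = [pair for part in parts for pair in part]
    return heapq.nlargest(n, merged)


if __name__ == "__main__":
    grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
    points = grid + [(100.0, 100.0), (50.0, -50.0)]
    for score, index in parallel_outliers(points, k=3, n=2):
        print(index, round(score, 2))
```

Each worker prunes with its own local cutoff, which never exceeds the final global cutoff, so merging the per-block lists still yields the global top n. In CPython, threads will not show the speedup the abstract reports; a process pool or a message-passing runtime would be the natural substitute.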

Elio Lozano, Edgar Acuña

Data Mining | Density-based Local Outliers | ICDM 2005 | Outliers | Parallel Algorithm

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	ICDM
Authors	Elio Lozano, Edgar Acuña
