
ANLP
2000

Detecting Errors within a Corpus using Anomaly Detection

We present a method for automatically detecting errors in a manually marked corpus using anomaly detection. Anomaly detection is a method for determining which elements of a large data set do not conform to the whole. This method fits a probability distribution over the data and applies a statistical test to detect anomalous elements. In the corpus error detection problem, anomalous elements are typically marking errors. We present the results of applying this method to the tagged portion of the Penn Treebank corpus.
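The general idea of fitting a distribution and testing each element against it can be illustrated with a small sketch. This is not the paper's model: it is a simplified stand-in that assumes a smoothed per-word tag distribution and a fixed probability cutoff in place of the statistical test. The function names, the `alpha` smoothing constant, and the `threshold` value are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def fit_tag_model(tagged_tokens, alpha=0.5):
    """Estimate P(tag | word) from (word, tag) pairs with add-alpha smoothing.

    Assumption: a per-word tag distribution stands in for whatever
    distribution the actual method fits over the corpus.
    """
    tagset = {tag for _, tag in tagged_tokens}
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word][tag] += 1

    def prob(word, tag):
        word_counts = counts[word]
        total = sum(word_counts.values())
        return (word_counts[tag] + alpha) / (total + alpha * len(tagset))

    return prob


def flag_anomalies(tagged_tokens, threshold=0.1, alpha=0.5):
    """Return (index, word, tag) for tokens whose tag is improbable for the word.

    Assumption: a simple probability cutoff replaces a proper statistical
    test. The model is fit on the full corpus, including the token under test.
    """
    prob = fit_tag_model(tagged_tokens, alpha)
    return [
        (i, word, tag)
        for i, (word, tag) in enumerate(tagged_tokens)
        if prob(word, tag) < threshold
    ]


# Toy corpus: "the" is tagged DT twenty times and NN once.
corpus = [("the", "DT")] * 20 + [("the", "NN")] + [("dog", "NN")] * 5
suspects = flag_anomalies(corpus)
```

On this toy corpus the single "the"/NN token is the only one that falls below the cutoff, so it is the candidate a human annotator would be asked to review. In practice the cutoff would need tuning, since rare but correct taggings also receive low probability.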
Eleazar Eskin
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where ANLP
Authors Eleazar Eskin