Error Detection and Impact-Sensitive Instance Ranking in Noisy Datasets

Given a noisy dataset, locating erroneous instances and attributes and ranking suspicious instances by their impact on system performance is an interesting and important research issue. In this paper we propose an Error Detection and Impact-sensitive instance Ranking (EDIR) mechanism to address this problem. Given a noisy dataset D, we first train a benchmark classifier T from D. Instances that cannot be effectively classified by T are treated as suspicious and forwarded to a subset S. For each attribute Ai, we switch Ai and the class label C to train a classifier APi that predicts Ai. Given an instance Ik in S, we use APi and the benchmark classifier T to locate the erroneous value of each attribute Ai. To rank the instances in S quantitatively, we define an impact measure based on the Information-gain Ratio (IR): we calculate IRi between attribute Ai and the class C and use IRi as the impact-sensitive weight of Ai. The sum of impact-sensitive weights from all located erroneous attributes...
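The abstract walks through a concrete procedure, so a short sketch may help make it concrete. Below is a minimal Python illustration of the EDIR steps, assuming a categorical dataset held in a pandas DataFrame and scikit-learn decision trees as the classifiers; the helper names (gain_ratio, edir_rank), the ordinal encoding of attributes, and the use of training-set misclassifications to form the suspicious subset S are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier


def gain_ratio(x: pd.Series, y: pd.Series) -> float:
    """Information-gain ratio between a categorical attribute x and the class y."""
    def entropy(s: pd.Series) -> float:
        p = s.value_counts(normalize=True).to_numpy()
        return float(-(p * np.log2(p)).sum())

    h_y = entropy(y)
    h_y_given_x = sum((len(g) / len(y)) * entropy(g) for _, g in y.groupby(x))
    split_info = entropy(x)
    return 0.0 if split_info == 0 else (h_y - h_y_given_x) / split_info


def edir_rank(D: pd.DataFrame, class_col: str):
    X, y = D.drop(columns=[class_col]), D[class_col]

    # Step 1: benchmark classifier T trained on the (noisy) dataset D.
    x_enc = OrdinalEncoder().fit(X)
    T = DecisionTreeClassifier(random_state=0).fit(x_enc.transform(X), y)

    # Step 2: instances T cannot classify correctly form the suspicious subset S.
    # (Using training-set misclassification here is a simplification.)
    S = D.index[T.predict(x_enc.transform(X)) != y.to_numpy()]

    # Step 3: for each attribute Ai, swap Ai with the class label C and train
    # an attribute predictor APi from the remaining attributes plus C.
    predictors = {}
    for ai in X.columns:
        feats = D.drop(columns=[ai])
        enc_i = OrdinalEncoder().fit(feats)
        ap_i = DecisionTreeClassifier(random_state=0).fit(enc_i.transform(feats), D[ai])
        predictors[ai] = (ap_i, enc_i)

    # Impact-sensitive weight of Ai: information-gain ratio IRi between Ai and C.
    weights = {ai: gain_ratio(D[ai], y) for ai in X.columns}

    # Steps 4-5: flag attribute values that disagree with APi's prediction and
    # score each suspicious instance by the sum of the flagged attributes' weights.
    scores = {}
    for k in S:
        row_score = 0.0
        for ai, (ap_i, enc_i) in predictors.items():
            row = D.loc[[k]].drop(columns=[ai])
            if ap_i.predict(enc_i.transform(row))[0] != D.at[k, ai]:
                row_score += weights[ai]
        scores[k] = row_score

    # Rank suspicious instances by descending impact-sensitive score.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In practice the suspicious set would more likely be built with cross-validation rather than resubstitution, and the gain-ratio weights would be computed on discretized attributes if any are numeric; those refinements are left out of this sketch.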
Xingquan Zhu, Xindong Wu, Ying Yang
Added: 30 Oct 2010
Updated: 30 Oct 2010
Type: Conference
Year: 2004
Where: AAAI
Authors: Xingquan Zhu, Xindong Wu, Ying Yang