CICLING
2011
Springer

Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features

Wikipedia is an online encyclopedia that anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatiotemporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions.
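As a rough illustration of what combining the three feature families could look like, the sketch below merges per-revision metadata, reputation, and language features into one namespaced feature vector, and drops one family at a time to inspect its contribution. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`combine_features`, `ablate`) and every feature name and value are hypothetical, and no particular classifier is implied.

```python
from typing import Dict

# Hypothetical helper: the prefixes "meta", "rep", and "nlp" are labels chosen
# for this sketch, not names used by STiki, WikiTrust, or the paper.
def combine_features(metadata: Dict[str, float],
                     reputation: Dict[str, float],
                     language: Dict[str, float]) -> Dict[str, float]:
    """Merge three per-revision feature groups into one namespaced vector."""
    combined: Dict[str, float] = {}
    groups = (("meta", metadata), ("rep", reputation), ("nlp", language))
    for prefix, group in groups:
        for name, value in group.items():
            # Prefixing keeps features from different groups from colliding.
            combined[f"{prefix}.{name}"] = float(value)
    return combined


def ablate(features: Dict[str, float], prefix: str) -> Dict[str, float]:
    """Return a copy of the vector with one feature group removed."""
    return {k: v for k, v in features.items() if not k.startswith(prefix + ".")}


# Illustrative values only; these are not measurements from the paper.
example = combine_features(
    metadata={"hour_of_day": 3.0},
    reputation={"author_reputation": 0.2},
    language={"uppercase_ratio": 0.9},
)
without_reputation = ablate(example, "rep")
```

A combined vector like `example` could then be passed to any standard classifier, and comparing results on `example` against `without_reputation` is one simple way to gauge how much a single family of features contributes.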
Added 25 Aug 2011
Updated 25 Aug 2011
Type Journal
Year 2011
Where CICLING
Authors B. Thomas Adler, Luca de Alfaro, Santiago Moisés Mola-Velasco, Paolo Rosso, Andrew G. West