Sciweavers

LREC
2010

Corpus and Evaluation Measures for Automatic Plagiarism Detection

Easy access to texts in digital libraries and on the WWW has led to an increased number of plagiarism cases in recent years, which renders manual plagiarism detection infeasible at scale. Various methods for automatic plagiarism detection have been developed, whose objective is to assist human experts in analyzing documents for plagiarism. Unlike in other tasks in natural language processing and information retrieval, it is not possible to publish a collection of real plagiarism cases for evaluation purposes, since they cannot be properly anonymized. Therefore, current evaluations found in the literature cannot be compared with one another and are often not even reproducible. Our contribution in this respect is a newly developed large-scale corpus of artificial plagiarism and new detection performance measures tailored to the evaluation of plagiarism detection algorithms.
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Alberto Barrón-Cedeño, Martin Potthast, Paolo Rosso, Benno Stein