Sciweavers

CLEF
2010
Springer

FastDocode: Finding Approximated Segments of N-Grams for Document Copy Detection - Lab Report for PAN at CLEF 2010

Plagiarism is regarded as one of the main problems that the information technology revolution has brought to our society; such practices could be identified using pattern matching algorithms and intelligent data analysis approaches. Furthermore, a fast document copy detection algorithm could be used in large-scale plagiarism detection applications in academia, scientific research, patents, and knowledge management, among others. Although plagiarism detection has been tackled by exhaustive comparison of source and suspicious documents, approximate algorithms could lead to interesting results. This paper presents an approach for plagiarism detection based on finding approximate segments of n-grams. Results on a training dataset of plagiarized documents from PAN'09, and a further evaluation in the PAN'10 plagiarism detection challenge, showed that its trade-off between speed and performance could improve on other plagiarism detection algorithms.
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2010
Where CLEF
Authors Gabriel Oberreuter, Gaston L'Huillier, Sebastián A. Ríos, Juan D. Velásquez