FastDocode: Finding Approximated Segments of N-Grams for Document Copy Detection - Lab Report for PAN at CLEF 2010

Download www.uni-weimar.de

Plagiarism has emerged as one of the main problems that the information technology revolution has brought to our society, yet pattern matching algorithms and intelligent data analysis approaches can help identify such practices. Furthermore, a fast document copy detection algorithm could be used in large-scale plagiarism detection applications in academia, scientific research, patents, and knowledge management, among others. Although plagiarism detection has been tackled by exhaustive comparison of source and suspicious documents, approximate algorithms could lead to interesting results. In this paper, an approach for plagiarism detection is presented. Results on a learning dataset of plagiarized documents from PAN'09, and a further evaluation in the PAN'10 plagiarism detection challenge, showed that the approach's trade-off between speed and performance could improve on other plagiarism detection algorithms.
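The abstract does not spell out the algorithm, so the following is only a generic illustration of the n-gram idea named in the title, not the authors' FastDocode method. It is a minimal sketch that assumes whitespace tokenization and word trigrams; the function names `word_ngrams` and `shared_ngram_ratio` are hypothetical.

```python
from typing import Set, Tuple


def word_ngrams(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    # range() is empty when the text has fewer than n tokens.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngram_ratio(suspicious: str, source: str, n: int = 3) -> float:
    """Fraction of the suspicious text's n-grams that also occur in the source.

    Illustrative assumption: a high ratio hints at copied passages.
    This is not the scoring used in the paper.
    """
    susp = word_ngrams(suspicious, n)
    if not susp:
        return 0.0
    return len(susp & word_ngrams(source, n)) / len(susp)


source_doc = "the quick brown fox jumps over the lazy dog"
suspicious_doc = "a quick brown fox jumps high"
ratio = shared_ngram_ratio(suspicious_doc, source_doc)
```

Comparing sets of n-grams avoids aligning the two documents word by word, which is one simple way to trade some accuracy for speed. A real detector would also need to locate the matching segments, which this sketch does not attempt.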

Gabriel Oberreuter, Gaston L'Huillier, Sebastián A. Ríos, Juan D. Velásquez

Algorithms | CLEF 2010 | Information Technology | Plagiarism Detection | Plagiarism Detection Challenge |

Post Info

Added	13 May 2011
Updated	13 May 2011
Type	Journal
Year	2010
Where	CLEF
Authors	Gabriel Oberreuter, Gaston L'Huillier, Sebastián A. Ríos, Juan D. Velásquez
