Sciweavers

GECCO
2008
Springer

Evolving similarity functions for code plagiarism detection

Students are often asked to submit electronic copies of their program code as part of assessment in computer science courses. To counter code plagiarism, educational institutions use tools to detect similarity between submissions. Previous research has identified that using a modified text search engine to identify similar code within large code collections is both efficient and effective. The similarity functions used internally by such search engines have historically been devised manually by experts in the field; in this work, we investigate the practicality of using evolutionary computing techniques to evolve similarity functions. We use particle swarm optimisation to find optimal values of variables in human-constructed similarity functions, and use genetic programming to generate new similarity functions specifically for this task. We show empirically that our optimised similarity functions perform better than standard Okapi BM25 across a range of collections. Our results ...
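To make the parameter-tuning idea concrete, the following is a minimal sketch, not the authors' implementation. It scores token lists with Okapi BM25, whose free parameters `k1` and `b` are the kind of variables a particle swarm could tune, and runs a small textbook PSO over them. The toy collection, the margin-based fitness, the parameter bounds, and the PSO coefficients are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for `query` (token lists) within `docs`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in set(query):
        if tf[term] == 0:
            continue  # term absent from doc: contributes nothing
        df = sum(1 for d in docs if term in d)
        # Non-negative IDF variant (the "+ 1" inside the log is an assumption).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1.0) / norm
    return score


def pso(fitness, bounds, n_particles=10, n_iters=30, seed=0):
    """Maximise `fitness` over a box; returns (best_position, best_fitness)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # common textbook coefficients (assumed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Clamp each coordinate to its allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i][:], fit
    return g_pos, g_fit


# Invented toy collection: docs[0] plays the role of a known near-copy of `query`.
docs = [
    "int total = 0 ; for j in n total += j".split(),
    "float avg = sum / n ; return avg".split(),
    "print hello world".split(),
    "while x > 0 x -= 1".split(),
]
query = "int sum = 0 ; for k in n sum += k".split()


def fitness(params):
    """Score margin between the known copy and the best other document."""
    k1, b = params
    scores = [bm25_score(query, d, docs, k1, b) for d in docs]
    return scores[0] - max(scores[1:])


best_params, best_margin = pso(fitness, bounds=[(0.01, 3.0), (0.0, 1.0)])
```

Here `best_params` holds the tuned `[k1, b]` pair for the toy data. A realistic fitness would be an effectiveness measure computed over collections with known plagiarism cases rather than a single margin.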
Added 09 Nov 2010
Updated 09 Nov 2010
Type Conference
Year 2008
Where GECCO
Authors Victor Ciesielski, Nelson Wu, Seyed M. M. Tahaghoghi