A Lightweight Program Similarity Detection Model using XML and Levenshtein Distance

Program plagiarism is one of the most significant problems in Computer Science education. The most common forms of plagiarism include modifying comments, reordering statements, and changing variable names. Detecting such simple changes, however, requires excessive string comparisons. This paper presents a lightweight program similarity detection model. Unlike other detection models, our model avoids global string comparisons; string matching is applied only locally, when comparing control sequences. To this end we use XML and the Levenshtein distance algorithm. XML's tree-like representation reduces the intensive string comparisons needed to handle the simple modifications, and the Levenshtein distance algorithm makes our model reliable in the presence of logic changes. Our approach is based on the XPDec model and is capable of distinguishing a flat structure from a nested structure of control sequences. This improvement will lead to simple and reliable implementations of program similarity detection systems.
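The idea of comparing control sequences with Levenshtein distance can be illustrated with a minimal Python sketch. This is not the authors' implementation or the XPDec model: the XML element names (`func`, `for`, `if`), the open/close token encoding, and the helper names `control_sequence` and `similarity` are all assumptions made for illustration. The sketch only shows how a tree-shaped XML encoding lets a nested structure and a flat structure produce different token sequences, which an edit distance can then compare.

```python
from xml.etree import ElementTree as ET


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def control_sequence(node):
    """Flatten an XML control-structure tree into a token list.

    Every element contributes an opening and a closing token (an assumed
    encoding), so nesting is preserved: a loop containing a branch differs
    from a loop followed by a branch.
    """
    tokens = [node.tag]
    for child in node:
        tokens.extend(control_sequence(child))
    tokens.append("/" + node.tag)
    return tokens


def similarity(xml_a, xml_b):
    """Normalized similarity in [0, 1]; 1.0 means identical sequences."""
    seq_a = control_sequence(ET.fromstring(xml_a))
    seq_b = control_sequence(ET.fromstring(xml_b))
    longest = max(len(seq_a), len(seq_b))
    return 1.0 - levenshtein(seq_a, seq_b) / longest


# Hypothetical encodings of two programs; tag names are illustrative only.
nested = "<func><for><if/></for></func>"  # if inside the loop
flat = "<func><for/><if/></func>"         # if after the loop

print(control_sequence(ET.fromstring(nested)))
print(control_sequence(ET.fromstring(flat)))
print(similarity(nested, flat))
```

Because comments and variable names never enter the XML tree in this sketch, renaming or re-commenting leaves the token sequence unchanged, and string matching is confined to the short control sequences rather than the full source text.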

Seo-Young Noh, Sangwoo Kim, Cheonyoung Jung

FECS 2006 | FECS 2007 | Levenshtein Distance Algorithm | Program Similarity Detection | String Comparisons |

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2006
Where	FECS
Authors	Seo-Young Noh, Sangwoo Kim, Cheonyoung Jung
