GECCO
2007
Springer

Using code metric histograms and genetic algorithms to perform author identification for software forensics

We have developed a technique to characterize software developers' styles using a set of source code metrics. This style fingerprint can be used to identify the likely author of a piece of code from a pool of candidates. Author identification has applications in criminal justice, corporate litigation, and plagiarism detection. Furthermore, we can identify candidate developers who share similar styles, making our technique useful for software maintenance as well. Our method measures the differences between histogram distributions of code metrics. Identifying a combination of metrics that is effective in distinguishing developer styles is key to the utility of the technique. Our case study involves 18 metrics, and the time required to search the problem space exhaustively prevented us from adding more metrics. Using a genetic algorithm to perform the search, we were able to find good metric combinations in hours as opposed to weeks. The genetic algorithm has enabled...
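The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the metric names, the sum-of-absolute-differences distance, the toy data, and the genetic algorithm operators (tournament selection, single-point crossover, bit-flip mutation, elitism) are all assumptions chosen for brevity. Each author is represented by one normalized histogram per metric, a sample is assigned to the author with the smallest total histogram distance over the selected metrics, and a genetic algorithm searches bitmasks that select which metrics to use.

```python
import random
from collections import Counter

# Hypothetical metric names; the paper's 18 metrics are not listed in the abstract.
METRICS = ["line_length", "indent_width", "comment_ratio"]

def histogram(values):
    """Normalized histogram: metric value -> relative frequency."""
    if not values:
        return {}
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def distance(h1, h2):
    """Sum of absolute bin differences (an assumed distance measure)."""
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in set(h1) | set(h2))

def profile(raw):
    """Build one histogram per metric from raw measurements."""
    return {m: histogram(raw[m]) for m in METRICS}

def classify(sample, profiles, mask):
    """Return the author whose histograms are closest on the selected metrics."""
    def total(author):
        return sum(distance(sample[m], profiles[author][m])
                   for m, on in zip(METRICS, mask) if on)
    return min(profiles, key=total)

def fitness(mask, profiles, labeled):
    """Fraction of labeled samples attributed to the correct author."""
    if not any(mask):
        return 0.0
    hits = sum(classify(s, profiles, mask) == a for a, s in labeled)
    return hits / len(labeled)

def evolve(profiles, labeled, pop_size=8, generations=10, rate=0.1, seed=0):
    """Search metric bitmasks with a simple genetic algorithm."""
    rng = random.Random(seed)
    n = len(METRICS)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    score = lambda m: fitness(m, profiles, labeled)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: carry the two best masks forward unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents, then single-point crossover.
            a, b = (max(rng.sample(pop, 2), key=score) for _ in range(2))
            cut = rng.randrange(1, n)
            # Bit-flip mutation with probability `rate` per bit.
            child = [bit ^ (rng.random() < rate) for bit in a[:cut] + b[cut:]]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)

# Toy data, invented for illustration only.
profiles = {
    "alice": profile({"line_length": [40, 40, 60, 80],
                      "indent_width": [4, 4, 4, 4],
                      "comment_ratio": [1, 2, 1, 2]}),
    "bob": profile({"line_length": [40, 60, 80, 80],
                    "indent_width": [2, 2, 2, 4],
                    "comment_ratio": [1, 2, 1, 2]}),
}
sample = profile({"line_length": [40, 60, 80],
                  "indent_width": [4, 4, 4],
                  "comment_ratio": [2, 1]})

author = classify(sample, profiles, [0, 1, 0])  # use only indent_width
best_mask = evolve(profiles, [("alice", sample)])
```

With n metrics there are 2^n - 1 non-empty combinations, which is why an exhaustive search grows impractical as metrics are added and a heuristic search over bitmasks becomes attractive.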
Robert Charles Lange, Spiros Mancoridis
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2007
Where GECCO
Authors Robert Charles Lange, Spiros Mancoridis