Sciweavers

IJDE
2007

Identifying Authorship by Byte-Level N-Grams: The Source Code Author Profile (SCAP) Method

Source code author identification deals with identifying the most likely author of a computer program, given a set of predefined author candidates. There are several scenarios where digital evidence of this kind plays a role in investigation and adjudication, such as code authorship disputes, intellectual property infringement, tracing the source of code left in the system after a cyber attack, and so forth. As in any identification task, the disputed program is compared to undisputed, known programming samples by the predefined author candidates. We present a new approach, called the SCAP (Source Code Author Profiles) approach, based on byte-level n-gram profiles representing the source code author’s style. The SCAP method extends a method originally applied to natural language text authorship attribution; we show that an n-gram approach also suits the characteristics of source code analysis. The methodological extension includes a simplified profile and a less complicated, but mor...
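To make the idea of a byte-level n-gram profile concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a profile is the set of the most frequent byte n-grams in an author's known samples. Because the abstract above is cut off before it names the similarity measure, the sketch scores two profiles by the number of n-grams they share; that measure is an illustrative assumption. The function names and the defaults `n=3` and `size=1500` are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def byte_ngram_profile(source: str, n: int = 3, size: int = 1500) -> Set[bytes]:
    """Return the `size` most frequent byte-level n-grams of `source`.

    Assumption: the profile keeps only the n-grams, not their frequencies.
    """
    data = source.encode("utf-8")
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    return {gram for gram, _ in counts.most_common(size)}


def profile_overlap(a: Set[bytes], b: Set[bytes]) -> int:
    """Score two profiles by the number of n-grams they have in common.

    Assumption: chosen for illustration; the abstract is truncated before
    it describes the similarity measure actually used.
    """
    return len(a & b)


def most_likely_author(disputed: str, known: Dict[str, str],
                       n: int = 3, size: int = 1500) -> str:
    """Pick the candidate whose known-sample profile overlaps most with
    the profile of the disputed program."""
    target = byte_ngram_profile(disputed, n, size)
    scores = {
        author: profile_overlap(target, byte_ngram_profile(sample, n, size))
        for author, sample in known.items()
    }
    return max(scores, key=scores.get)
```

Working on raw bytes means layout habits such as spacing, brace placement, and line breaks contribute to the profile alongside identifiers and keywords, with no language-specific parsing required.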
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2007
Where IJDE
Authors Georgia Frantzeskou, Efstathios Stamatatos, Stefanos Gritzalis, Carole E. Chaski, Blake Stephen Howald