
Problems Creating Task-relevant Clone Detection Reference Data

One prevalent method for evaluating the results of automated software analysis tools is to compare the tools’ output to the judgment of human experts. This evaluation strategy is commonly assumed in software clone detector research. We report our experiences from a study in which several human judges tried to establish “reference sets” of function clones for several medium-sized software systems written in C. The study employed multiple judges and followed a process typical of inter-coder reliability assurance, wherein coders discussed classification discrepancies until consensus was reached. A high level of disagreement was found for reference sets made specifically for reengineering task contexts. The results, although preliminary, raise questions about the limitations of prior clone detector evaluations and other similar tool evaluations. Implications are drawn for future work on reference data generation, tool evaluation, and benchmarking efforts.
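Agreement in multi-judge studies like the one described is often quantified with a chance-corrected statistic such as Cohen’s kappa. The sketch below is illustrative only and not taken from the paper; the judge names, verdict labels, and candidate function pairs are hypothetical.

```python
# Minimal sketch: Cohen's kappa for two judges labeling the same
# candidate clone pairs. All data below is hypothetical.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two judges' labels."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of candidates both judges labeled alike.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each judge labeled independently at random,
    # keeping their own label frequencies (marginals).
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[l] * counts_b[l] for l in counts_a) / (n * n)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

# Hypothetical verdicts on five candidate function pairs:
judge1 = ["clone", "clone", "not", "clone", "not"]
judge2 = ["clone", "not",   "not", "clone", "clone"]
print(f"kappa = {cohens_kappa(judge1, judge2):.2f}")  # kappa = 0.17: low agreement
```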
Type: Conference
Year: 2003
Where: WCRE (IEEE)
Authors: Andrew Walenstein, Nitin Jyoti, Junwei Li, Yun Yang, Arun Lakhotia