Automated duplicate detection for bug tracking systems

Bug tracking systems are important tools that guide the maintenance activities of software developers. The utility of these systems is hampered by an excessive number of duplicate bug reports; in some projects, as many as a quarter of all reports are duplicates. Developers must identify duplicate bug reports manually, a time-consuming process that adds to the already high cost of software maintenance. To save developer time, we propose a system that automatically classifies incoming bug reports as duplicates as they arrive. The system uses surface features, textual semantics, and graph clustering to predict duplicate status. Using a dataset of 29,000 bug reports from the Mozilla project, we perform experiments that include a simulation of a real-time bug reporting environment. Our system is able to reduce development cost by filtering out 8% of duplicate bug reports while allowing at least one report for each real defect to reach developers.
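To make the idea of filtering reports as they arrive concrete, the following is a minimal illustrative sketch, not the authors' system. It covers only a crude stand-in for the textual component: bag-of-words cosine similarity against earlier reports, with no surface features or graph clustering. The class name `DuplicateFilter`, the `submit` method, and the threshold of 0.6 are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class DuplicateFilter:
    """Hypothetical online filter: flags a report as a likely duplicate
    when it is textually close to a report that was already let through."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed cutoff, not taken from the paper
        self.seen = []              # (report_id, term counts) of passed reports

    def submit(self, report_id, text):
        """Return the id of the closest earlier report if the similarity
        reaches the threshold; otherwise store the report and return None."""
        vec = Counter(tokenize(text))
        best_id, best_score = None, 0.0
        for other_id, other_vec in self.seen:
            score = cosine_similarity(vec, other_vec)
            if score > best_score:
                best_id, best_score = other_id, score
        if best_score >= self.threshold:
            return best_id
        self.seen.append((report_id, vec))
        return None


flt = DuplicateFilter()
flt.submit(1, "Browser crashes when opening a large PDF file")
print(flt.submit(2, "Crashes when opening large PDF file in browser"))
```

In this sketch only reports that pass the filter are stored, so the first report with a given wording always reaches developers. A real system would need to tune the threshold against labeled data to limit false positives.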
Nicholas Jalbert, Westley Weimer
Added 29 May 2010
Updated 29 May 2010
Type Conference
Year 2008
Where DSN