Learning a Metric for Code Readability

In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from 120 human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80% effective, and better than a human on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with three measures of software quality: code changes, automated defect reports, and defect log messages. We measure these correlations on over 2.2 million lines of code, as well as longitudinally, over many releases of selected projects. Finally, we discuss the implications of this study on programming language design and engineering practice. For example, our data suggests that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.
Raymond P. L. Buse, Westley Weimer
Added 31 Jan 2011
Updated 31 Jan 2011
Type Journal
Year 2010
Where TSE
Authors Raymond P. L. Buse, Westley Weimer