MSR
2011
ACM

Modeling the evolution of topics in source code histories

Studying the evolution of topics (collections of co-occurring words) in a software project is an emerging technique to automatically shed light on how the project is changing over time: which topics are becoming more actively developed, which ones are dying down, or which topics are lately more error-prone and hence require more testing. Existing techniques for modeling the evolution of topics in software projects suffer from issues of data duplication, i.e., when the repository contains multiple copies of the same document, as is the case in source code histories. To address this issue, we propose the Diff model, which applies a topic model only to the changes of the documents in each version instead of to the whole document at each version. A comparative study with a state-of-the-art topic evolution model shows that the Diff model can detect more distinct topics as well as more sensitive and accurate topic evolutions, which are both useful for analyzing source code histories. Cat...
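The core idea described in the abstract, feeding a topic model only what changed between versions, can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `delta_documents` and the sample `versions` list are hypothetical, and line-level diffing with Python's standard `difflib` is an assumption made for illustration. The sketch only builds the per-version "delta" documents; fitting a topic model to them is left out.

```python
import difflib


def delta_documents(versions):
    """Return (version_index, added_lines, removed_lines) for each version.

    Hypothetical helper: each version is compared with its predecessor,
    and only the changed lines are kept. The first version has no
    predecessor, so all of its lines count as added.
    """
    previous = []
    deltas = []
    for index, text in enumerate(versions):
        current = text.splitlines()
        added, removed = [], []
        for line in difflib.ndiff(previous, current):
            if line.startswith("+ "):
                added.append(line[2:])
            elif line.startswith("- "):
                removed.append(line[2:])
            # Unchanged lines ("  ") and hint lines ("? ") are ignored.
        deltas.append((index, added, removed))
        previous = current
    return deltas


# Made-up version history of a single document.
versions = [
    "open file\nread bytes",
    "open file\nread bytes\nparse header",
    "open file\nparse header",
]
deltas = delta_documents(versions)
for index, added, removed in deltas:
    print(index, added, removed)
```

Under these assumptions, unchanged lines such as `open file` appear only once across the delta documents instead of once per version, which is the kind of duplication the abstract says the Diff model is meant to avoid.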
Added 16 Sep 2011
Updated 16 Sep 2011
Type Journal
Year 2011
Where MSR
Authors Stephen W. Thomas, Bram Adams, Ahmed E. Hassan, Dorothea Blostein