Sciweavers

KBSE
1999
IEEE

Automatic Software Clustering via Latent Semantic Analysis

This paper describes the initial results of applying Latent Semantic Analysis (LSA) to program source code and associated documentation. Latent Semantic Analysis is a corpus-based statistical method for inducing and representing aspects of the meanings of words and passages (of natural language) as reflected in their usage. This methodology is assessed for application to the domain of software components (i.e., source code and its accompanying documentation). The intent of applying Latent Semantic Analysis to software components is to automatically induce a specific semantic meaning of a given component. Here LSA is used as the basis for clustering software components. Results of applying this method to the LEDA library and the MINIX operating system are given. Applying Latent Semantic Analysis to source code and internal documentation in support of software reuse is a new application of this method and a departure from its usual application domain of natural language.
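The general idea in the abstract, using LSA vectors as the basis for grouping software components, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four toy components and their word lists are hypothetical, raw term counts are used without any weighting, the rank k = 2 is arbitrary, and the grouping step is a simple similarity threshold rather than whatever clustering procedure the authors used.

```python
import numpy as np

# Hypothetical toy "components": words drawn from identifiers and comments.
components = {
    "list_insert": "insert node list next pointer",
    "list_remove": "remove node list next pointer",
    "proc_sched": "schedule process queue priority",
    "proc_kill": "kill process signal queue",
}


def lsa_vectors(docs, k):
    """Return one k-dimensional LSA vector per document (one row each)."""
    vocab = sorted({w for text in docs for w in text.split()})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-document matrix of raw counts (no weighting, for brevity).
    a = np.zeros((len(vocab), len(docs)))
    for j, text in enumerate(docs):
        for w in text.split():
            a[index[w], j] += 1.0
    # Truncated SVD: keep only the k largest singular values.
    _, s, vt = np.linalg.svd(a, full_matrices=False)
    return (vt[:k, :] * s[:k, None]).T


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def cluster(names, vectors, threshold):
    """Group components connected by cosine similarity >= threshold."""
    label = list(range(len(names)))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                # Merge j's group into i's group.
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    groups = {}
    for name, l in zip(names, label):
        groups.setdefault(l, []).append(name)
    return sorted(groups.values())


names = list(components)
vectors = lsa_vectors(list(components.values()), k=2)
clusters = cluster(names, vectors, threshold=0.5)
print(clusters)
```

On this toy input the two list routines fall into one group and the two process routines into another, because each pair shares most of its vocabulary and the pairs share none. Real source code would need tokenization of identifiers and comments, some term weighting, and a much larger rank.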
Jonathan I. Maletic, Naveen Valluri
Added 04 Aug 2010
Updated 04 Aug 2010
Type Conference
Year 1999
Where KBSE
Authors Jonathan I. Maletic, Naveen Valluri