Architecture Recovery Using Latent Semantic Indexing and K-Means: An Empirical Evaluation

RISI, MICHELE; SCANNIELLO, GIUSEPPE; TORTORA, Genoveffa
2010-01-01

Abstract

A number of clustering-based approaches and tools have been proposed to partition a software system into subsystems. Most of these approaches are semiautomatic and thus require human decisions to identify the best partition of software entities into clusters among the possible partitions. In addition, some approaches are conceived for software systems implemented in a particular programming language (e.g., C and C++). In this paper we present an approach to automate the partitioning of a given software system into subsystems. In particular, the approach first analyzes the software entities (e.g., programs or classes) and then computes the dissimilarity between these entities using Latent Semantic Indexing. Software entities are then grouped by iteratively applying the k-means clustering algorithm. The approach has been implemented in a prototype of a supporting software system, realized as an Eclipse plug-in. Finally, to assess the approach and the plug-in, we have conducted an empirical investigation on three open source software systems implemented in Java and C/C++.
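
The pipeline summarized in the abstract (represent entities as text, project them with Latent Semantic Indexing, then cluster with k-means) can be illustrated with a minimal sketch. This is not the authors' implementation or the Eclipse plug-in: the function names, the whitespace tokenizer, the raw term counts, the rank, the fixed number of clusters, and the toy entities are all assumptions made for illustration. The sketch runs k-means once, so it does not reproduce the iterative strategy the paper describes. It uses only the Python standard library and obtains the latent space by power iteration on the entity-by-entity Gram matrix; a practical version would more likely call a linear algebra library's SVD.

```python
import math
import random
from collections import Counter


def term_document_matrix(docs):
    """Rows are terms, columns are entities; cells are raw term counts."""
    counts = [Counter(text.lower().split()) for text in docs]
    vocab = sorted(set().union(*counts))
    return [[c[term] for c in counts] for term in vocab]


def top_eigenpairs(sym, k, iters=500, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(sym)
    rng = random.Random(seed)
    m = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient of the unit vector v gives its eigenvalue.
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((max(lam, 0.0), v))
        # Deflate so the next pass converges to the next eigenvector.
        for i in range(n):
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


def lsi_coordinates(docs, rank):
    """Entity coordinates in the rank-dimensional latent space.

    If A = U S V^T, the eigenpairs of A^T A are (s_i^2, v_i), so the
    coordinates of entity d are s_i * v_i[d] for the top `rank` pairs.
    """
    a = term_document_matrix(docs)
    n = len(docs)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, rank)
    return [[math.sqrt(lam) * vec[d] for lam, vec in pairs] for d in range(n)]


def sq_dist(x, y):
    return sum((p - q) ** 2 for p, q in zip(x, y))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centroids = [points[0][:]]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(far[:])
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_entities(docs, rank=2, k=2):
    coords = lsi_coordinates(docs, rank)
    # On unit vectors, squared Euclidean distance equals 2 * (1 - cosine),
    # so k-means here effectively works on cosine dissimilarity.
    unit = []
    for v in coords:
        length = math.sqrt(sum(x * x for x in v)) or 1.0
        unit.append([x / length for x in v])
    return kmeans(unit, k)


# Hypothetical entities: bags of identifiers taken from five source files.
entities = [
    "parse token lexer grammar parse token",
    "lexer token grammar syntax parse",
    "grammar syntax parse lexer",
    "socket connect send packet socket",
    "packet send receive socket connect",
]
labels = cluster_entities(entities, rank=2, k=2)
print(labels)
```

On this toy input the parser-related and network-related entities share no vocabulary, so they should fall into two separate clusters. Real source code would need identifier splitting, stop-word removal, and term weighting before the matrix is built, none of which is shown here.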
2010
978-1-4244-8289-4

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/3015986

Citations
  • Scopus 24