A probabilistic based approach towards software system clustering
Scanniello G.
2010-01-01
Abstract
In this paper we present a clustering-based approach to partitioning software systems into meaningful subsystems. In particular, the approach uses lexical information extracted from four zones in Java classes, each of which may contribute differently to software system partitioning. To weigh these zones automatically, we introduce a probabilistic model and apply the Expectation-Maximization (EM) algorithm. To group classes according to the considered lexical information, we customize the well-known K-Medoids algorithm. To assess the approach and the supporting system that implements it, we conducted a case study on six open source software systems. © 2010 IEEE.