An empirical comparison of validation methods for software prediction models

Gravino C.
2021

Abstract

Model validation methods (e.g., k-fold cross-validation) use historical data to predict how well an estimation technique (e.g., random forest) will perform on current or future data. Studies on software development effort estimation (SDEE) and software fault prediction (SFP) have used and investigated different model validation methods. However, there are no conclusive indications as to which model validation method has the greatest impact on the prediction accuracy and stability of estimation techniques. Some studies have investigated model validation methods with either SDEE or SFP data only. To the best of our knowledge, no study in the literature has employed different validation methods with both SDEE and SFP data. The aim of this paper is to consider ten methods from the families of cross-validation (CV) and bootstrap validation to identify which one yields better prediction accuracy for both types of data. We also evaluate which model validation methods allow the estimation techniques to provide stable performance (i.e., with lower variance). To this end, we present an empirical study involving six datasets from the SDEE domain and six datasets from the SFP domain. The results reveal that repeated 10-fold CV with SDEE data and optimistic boot with SFP data are the model validation methods that provide better prediction accuracy in a greater number of experiments than the other model validation methods. Furthermore, the choice of model validation method can improve prediction accuracy by up to 60% with SDEE data and by up to 36% with SFP data. The analysis also reveals that repeated 5-fold CV produces more stable performance when the experiments are repeated on the same data.
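
As a rough illustration of what "repeated k-fold CV" and "stability (lower variance)" mean in the abstract, the sketch below repeats k-fold CV several times on the same data and reports the mean and variance of the resulting error estimates. It is not the paper's experimental setup: the data values, the mean-of-training-set predictor, the mean absolute error metric, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
import statistics


def k_fold_indices(n, k, rng):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_error(y, k, rng):
    """One k-fold CV pass: mean absolute error of a mean-of-training-set predictor."""
    errors = []
    for fold in k_fold_indices(len(y), k, rng):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = statistics.mean(train)  # stand-in for a real estimation technique
        errors.extend(abs(y[i] - prediction) for i in fold)
    return statistics.mean(errors)


def repeated_cv(y, k, repeats, seed=0):
    """Repeat k-fold CV with fresh shuffles; return one error estimate per repetition."""
    rng = random.Random(seed)
    return [cv_error(y, k, rng) for _ in range(repeats)]


# Hypothetical effort values (person-hours), not taken from any dataset in the paper.
efforts = [120, 95, 300, 210, 150, 80, 400, 175, 260, 130, 90, 220]
estimates = repeated_cv(efforts, k=5, repeats=10)
print("mean MAE:", statistics.mean(estimates))
print("variance across repetitions:", statistics.variance(estimates))
```

A lower variance across repetitions corresponds to the more stable behaviour the abstract refers to; a real study would replace the mean predictor with an actual estimation technique.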
Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4779919