Evaluating the impact of feature selection consistency in software prediction

Gravino C.
2022

Abstract

Many empirical software engineering studies have employed feature selection algorithms to exclude irrelevant and redundant features from datasets, with the aim of improving both the prediction accuracy of machine learning-based estimation models and their generalizability. However, little has been done to investigate how consistently these feature selection algorithms produce features/metrics across different training samples, which is important for interpreting the trained models. Because model interpretation largely depends on the features of the analyzed datasets, feature selection algorithms should be evaluated in terms of how consistently they extract features from the employed datasets. In this study, we consider eight feature selection algorithms and evaluate how consistently they select features across the folds of k-fold cross-validation, as well as when small changes are made to the training data. To provide a stable and generalizable conclusion, we investigate data from two domains: six datasets from Software Development Effort Estimation (SDEE) and six datasets from Software Fault Prediction (SFP). Our results reveal that a feature selection algorithm can produce 20-100% inconsistent features on an SDEE dataset and 18.8-95.3% inconsistent features on an SFP dataset. The analysis also reveals that, for SDEE datasets, the most consistent feature selection algorithm is not necessarily the most accurate one (i.e., the one leading to the best prediction accuracy), whereas for SFP datasets the most consistent feature selection algorithm is also the most accurate in predicting faults.
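
To make the idea of selection consistency across cross-validation folds concrete, the following is a minimal sketch and not the procedure used in the study. The abstract does not state which consistency measure or which eight algorithms were used, so everything here is an assumption for illustration: the selector `select_top_k` is a hypothetical correlation-based filter, the fold split is a simple interleaved one, and consistency is taken to be the mean pairwise Jaccard similarity of the selected feature subsets.

```python
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_top_k(rows, target, k):
    """Hypothetical filter selector: the k features most correlated with the target."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], target)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def selections_across_folds(rows, target, k, n_folds):
    """Run the selector on the training part of each fold (every n_folds-th row held out)."""
    selections = []
    for fold in range(n_folds):
        train = [i for i in range(len(rows)) if i % n_folds != fold]
        selections.append(
            select_top_k([rows[i] for i in train], [target[i] for i in train], k)
        )
    return selections


def mean_pairwise_jaccard(selections):
    """Average Jaccard similarity over all pairs of subsets (1.0 means identical).

    Assumes at least two non-empty subsets.
    """
    return mean(len(a & b) / len(a | b) for a, b in combinations(selections, 2))


# Toy data (made up for illustration): feature 0 is exactly half the target.
rows = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 2, 1], [5, 7, 0], [6, 1, 1]]
target = [2, 4, 6, 8, 10, 12]
subsets = selections_across_folds(rows, target, k=1, n_folds=3)
stability = mean_pairwise_jaccard(subsets)
print(f"inconsistent share: {100 * (1 - stability):.1f}%")
```

Under this assumed measure, a value of `1 - stability` close to zero means the selector picks nearly the same features in every fold, while values near one correspond to the highly inconsistent selections the abstract reports at the upper end of its ranges.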
Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4779914