Improving software effort estimation using bio-inspired algorithms to select relevant features: An empirical study

Ali A.; Gravino C.
2021-01-01

Abstract

Context: Bio-inspired feature selection algorithms have attracted the attention of researchers in the domain of Software Development Effort Estimation (SDEE) because they can improve the prediction accuracy of existing estimation techniques, such as machine learning methods.
Objective: This paper analyzes different feature selection algorithms and assesses the role they can play in increasing the accuracy of software development effort predictions.
Method: We performed an empirical study considering bio-inspired feature selection algorithms commonly used in the domain of SDEE, i.e., Genetic Algorithm (GA), Particle Swarm Optimization, Ant Colony Optimization, Tabu Search, Harmony Search (HS), and the Firefly algorithm, and four traditional non-bio-inspired algorithms, i.e., Best-First Search (BFS), Greedy Stepwise, Subset Forward Selection, and Random Search. All of them were used in combination with five widely used estimation techniques and applied to eight widely used SDEE datasets.
Results: The analysis suggests that almost all (bio-inspired) feature selection algorithms outperformed the baseline estimation techniques (i.e., techniques employed without any feature selection algorithm) in the majority of the experiments; hence, we conclude that feature selection algorithms can help increase prediction accuracy in the domain of SDEE. HS and GA are considered the best-performing bio-inspired algorithms because they provided significantly better results than the non-bio-inspired algorithms in a greater number of experiments. Moreover, when we compared the employed bio-inspired algorithms with one another, GA and HS again came out as the best-performing bio-inspired feature selection algorithms.
Conclusion: Based on our results, if we had to pick a feature selection algorithm (from among both the bio- and non-bio-inspired ones) to recommend for future investigations, we would suggest HS because it provided better effort predictions in more combinations of datasets and estimation techniques than the other considered bio- and non-bio-inspired algorithms. Among the non-bio-inspired algorithms, BFS is the one that provided the best predictions.
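
To make the comparison described in the abstract concrete, the following is a minimal sketch of wrapper-style feature selection with a simple genetic algorithm, set against a baseline that uses all features. It is not the authors' experimental setup: the synthetic data, the 1-nearest-neighbour estimator, the leave-one-out mean absolute error, and the names `make_data`, `loo_mae`, and `ga_select` are all assumptions chosen for illustration.

```python
import random

def make_data(n_projects=40, seed=1):
    """Synthetic projects (an assumption): two informative and two noise features."""
    rng = random.Random(seed)
    rows, efforts = [], []
    for _ in range(n_projects):
        size, team = rng.uniform(0, 1), rng.uniform(0, 1)
        noise_a, noise_b = rng.uniform(0, 1), rng.uniform(0, 1)
        rows.append([size, team, noise_a, noise_b])
        efforts.append(100 * size + 20 * team)  # effort depends only on size and team
    return rows, efforts

def loo_mae(rows, efforts, mask):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour estimator
    restricted to the features where mask[j] is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty feature subset cannot predict anything
    total = 0.0
    for i, row in enumerate(rows):
        # Most similar other project, measured on the selected features only.
        nearest = min(
            (k for k in range(len(rows)) if k != i),
            key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in cols),
        )
        total += abs(efforts[i] - efforts[nearest])
    return total / len(rows)

def ga_select(rows, efforts, generations=30, pop_size=20, p_mut=0.1, seed=0):
    """Search feature bitmasks with a small genetic algorithm (illustrative settings)."""
    rng = random.Random(seed)
    n = len(rows[0])

    def fitness(mask):
        return loo_mae(rows, efforts, mask)

    # Seeding with the all-features mask plus elitism guarantees the result
    # is never worse than the no-selection baseline.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0][:]]           # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each bit flips with probability p_mut.
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = children
    best = min(pop, key=fitness)
    return best, fitness(best)

rows, efforts = make_data()
baseline = loo_mae(rows, efforts, [1, 1, 1, 1])  # no feature selection
mask, selected = ga_select(rows, efforts)
print("baseline MAE:", round(baseline, 2))
print("selected mask:", mask, "MAE:", round(selected, 2))
```

Because the error is measured on the same data that guides the search, this sketch only shows the mechanics; a fair comparison would evaluate the selected subset on held-out projects.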

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4758109