An Evidence-Based Study on the Relationship of Software Engineering Practices on Code Smells in Python ML Projects

Giordano, Giammaria; Della Porta, Antonio; Ferrucci, Filomena; Palomba, Fabio
2026

Abstract

The rapid adoption of Machine Learning (ML) technologies has introduced new challenges for code quality. Code smells, i.e., suboptimal design and implementation choices applied when developing source code, represent a particularly prevalent problem. While software engineering (SE) practices are often recommended to improve maintainability, their actual impact on code smells in ML projects remains unclear. In this paper, we present an evidence-based empirical study of 566 real-world Python ML projects from the NICHE dataset, labeled according to adherence to eight established SE practices. Using static analysis and statistical testing, we assess the relationship between these practices and the presence of ten Python-specific code smells. Our results show that projects adopting SE practices exhibit significantly fewer code smells. In particular, Continuous Integration is negatively correlated with the Complex Container Comprehension smell. These findings highlight the importance of engineering discipline in managing code quality in ML development.
ISBN: 9783032042064; 9783032042071
Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4919409