Transparent Machine Learning for Type 1 Diabetes Diagnosis from Gene Expression Data

Rosa Carotenuto; Viviana Pentangelo; Antonio Della Porta; Fabio Palomba
In press

Abstract

Early diagnosis of Type 1 Diabetes (T1D) is essential for effective intervention but remains challenging with conventional clinical methods. In this study, we investigate the potential of machine learning (ML) models to classify pediatric subjects as T1D or healthy based on microarray gene expression data. We train and evaluate three models (Support Vector Machine, Random Forest, and XGBoost) and assess their predictive performance and interpretability. The best-performing model (SVM) achieves an accuracy of 80.8% and an AUC-ROC of 87.6%. To understand model behavior, we apply SHAP and Anchor explanations, which identify key genes such as PTPRN2 and HLA-DQB1 as major contributors to classification outcomes. These results demonstrate the feasibility of combining predictive accuracy with model transparency, laying the groundwork for clinically meaningful and explainable decision support systems for early T1D detection.
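
As a reading aid for the two metrics reported in the abstract, the following is a minimal sketch of how accuracy and AUC-ROC can be computed for a binary T1D/healthy classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, and the AUC is computed with the standard pairwise-ranking formulation (the probability that a randomly chosen positive subject receives a higher score than a randomly chosen negative one).

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of subjects whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via pairwise ranking: share of (positive, negative) pairs
    in which the positive subject has the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC-ROC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = T1D, 0 = healthy); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 to turn scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(f"accuracy = {acc:.3f}, AUC-ROC = {auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the two figures are typically reported together.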

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4919778