Transparent Machine Learning for Type 1 Diabetes Diagnosis from Gene Expression Data
Rosa Carotenuto;Viviana Pentangelo;Antonio Della Porta;Fabio Palomba
In press
Abstract
Early diagnosis of Type 1 Diabetes (T1D) is essential for effective intervention but remains challenging with conventional clinical methods. In this study, we investigate the potential of machine learning (ML) models to classify pediatric subjects as T1D or healthy based on microarray gene expression data. We train and evaluate three models (Support Vector Machine, Random Forest, and XGBoost) and assess their predictive performance and interpretability. The best-performing model (SVM) achieves an accuracy of 80.8\% and an AUC-ROC of 87.6\%. To understand model behavior, we apply SHAP and Anchor explanations, which identify key genes such as \textit{PTPRN2} and \textit{HLA-DQB1} as major contributors to classification outcomes. These results demonstrate the feasibility of combining predictive accuracy with model transparency, laying the groundwork for clinically meaningful and explainable decision support systems for early T1D detection.