
From Rules to Reports: Enhancing Diabetes Prediction Interpretability with Anchor and LLMs

Antonio Della Porta;Viviana Pentangelo;Fabio Palomba
In press

Abstract

Diabetes is a widespread chronic disease that requires timely and accurate diagnosis to prevent severe complications. Machine Learning (ML) models have shown great potential in predicting diabetes risk, but their lack of interpretability remains a major barrier to clinical adoption. Explainable AI (XAI) techniques, such as Anchor, offer rule-based insights that are closer to natural language but still fall short of full transparency. With the emergence of Large Language Models (LLMs), it is now possible to enhance these explanations and make them more accessible. In this study, we present a pipeline that combines ML prediction, Anchor-based explanation, and LLM-augmented natural-language reporting. We trained four ML models and selected the best-performing one, an SVM with an accuracy of 82%, which we then paired with Anchor to generate rule-based explanations. These explanations were refined through five iterations of prompt tuning with ChatGPT 3.5 and evaluated qualitatively for clarity and precision. The resulting natural-language reports were integrated into DIA, a web-based tool designed to deliver interpretable, human-centered diabetes predictions.
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11386/4919815
Warning: the displayed data have not been validated by the university.
