AI Unreliable Answers: A Case Study on ChatGPT
Amaro I.; Della Greca A.; Francese R.; Tortora G.; Tucci C.
2023-01-01
Abstract
ChatGPT is a general-domain chatbot that has attracted great attention, stimulating worldwide discussion on the power and consequences of the diffusion of Artificial Intelligence in every field, ranging from education, research, and music to software development, health care, cultural heritage, and entertainment. In this paper, we investigate whether and when the answers provided by ChatGPT are unreliable, and how this is perceived by expert users such as Computer Science students. To this aim, we first analyze the reliability of ChatGPT's answers by testing its narrative, problem-solving, searching, and logic capabilities, and we report examples of its answers. We then conducted a user study in which 15 participants who already knew the chatbot submitted a set of predetermined queries that generated both correct and incorrect answers, after which we collected data on their satisfaction. Results revealed that even though the present version of ChatGPT is sometimes unreliable, people still plan to use it. We therefore recommend always using the present version of ChatGPT with the support of human verification and interpretation.