Efficient Transformers for on-robot Natural Language Understanding
Greco A.;Roberto A.;Saggese A.;Vento M.
2022-01-01
Abstract
The main task of a social robot is to interact with humans through spoken natural language. This implies that it must be able to understand the intent of the user and the entities involved. Recently, different solutions have been proposed to deal with the Natural Language Understanding (NLU) task. Extremely accurate results have been obtained by architectures based on transformers, but they require high computational resources to work in real time. Unfortunately, these resources are not available on the embedded systems mounted on board the robot. For these reasons, in this paper we experimentally evaluate the most promising transformers for NLU on the popular ATIS and SNIPS datasets and measure their inference time on the NVIDIA Jetson Xavier NX embedded system. The experimental analysis demonstrates that the ALBERT model achieves performance comparable to the popular BERT architecture (just a 2% drop on entity recognition), while gaining a speed-up of more than 3x. Building on the insights from our analysis, we finally developed a real system for restaurant search that runs the model on an NVIDIA Jetson Xavier NX mounted on board a social robot, obtaining positive user feedback about its effectiveness and responsiveness.
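As a rough illustration of the kind of on-device benchmark described in the abstract, the sketch below loads an ALBERT sequence-classification head with Hugging Face Transformers and times single-utterance intent prediction. The checkpoint name, label count, example utterance, and timing loop are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the paper's code): measuring per-utterance inference
# latency of an ALBERT intent classifier with Hugging Face Transformers.
import time
import torch
from transformers import AlbertTokenizerFast, AlbertForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU on a Jetson-class board
tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = AlbertForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=7  # e.g. the 7 SNIPS intents (assumed label set)
).to(device).eval()

utterance = "find me a nearby italian restaurant for two people"
inputs = tokenizer(utterance, return_tensors="pt").to(device)

with torch.no_grad():
    # Warm-up passes so the first-call overhead does not skew the timing.
    for _ in range(5):
        model(**inputs)
    if device == "cuda":
        torch.cuda.synchronize()
    runs = 50
    start = time.perf_counter()
    for _ in range(runs):
        logits = model(**inputs).logits
    if device == "cuda":
        torch.cuda.synchronize()

print(f"predicted intent id: {logits.argmax(-1).item()}")
print(f"mean inference time: {(time.perf_counter() - start) / runs * 1000:.1f} ms")
```

The same loop can be repeated with a BERT checkpoint to reproduce the kind of ALBERT-versus-BERT latency comparison the abstract reports; entity (slot) recognition would instead use a token-classification head, which is not shown here.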