HeRoN: A Mediated RL-LLM Framework for Adaptive NPC Behavior in Interactive Environments

Gaetano Cimino; Vincenzo Deufemia; Andrea Selice
In press

Abstract

Non-Player Characters (NPCs) play a central role in interactive environments, where adaptive and context-aware behavior is essential for engaging gameplay. Traditional rule-based and utility-driven approaches often lack adaptability and fail to produce contextually coherent behavior. Reinforcement Learning (RL) and Large Language Models (LLMs) offer complementary strengths but exhibit distinct limitations: RL suffers from training inefficiency and limited generalization, whereas LLMs are prone to hallucinations and context drift. This paper introduces HeRoN, a mediated framework that integrates RL and LLMs through functional separation and critique-based refinement to enable coherent and strategically adaptive NPC behavior. The architecture comprises an RL-controlled NPC policy for action execution, an LLM-based strategy generator that provides context-aware action proposals, and a lightweight reviewer that refines these proposals to enforce consistency with environment constraints. Through experiments in two structurally distinct custom game environments, we show that early LLM-mediated guidance improves exploration efficiency and generalization. Compared to standard RL baselines, HeRoN achieves up to an 81% improvement in task success rate while substantially reducing constraint-violating actions.
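
The three-component separation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `reviewer`, `select_action`, and `guidance_steps` are hypothetical, the LLM and the RL policy are replaced by plain callables, and the reviewer here only accepts or rejects a proposal against the set of valid actions, which is a simplification of the critique-based refinement the abstract mentions.

```python
from typing import Callable, List, Optional

Action = str
Chooser = Callable[[List[Action]], Action]


def reviewer(proposal: Action, valid_actions: List[Action]) -> Optional[Action]:
    """Hypothetical reviewer: keep a proposal only if the environment allows it."""
    return proposal if proposal in valid_actions else None


def select_action(
    step: int,
    guidance_steps: int,       # assumed cutoff for "early" LLM guidance
    valid_actions: List[Action],
    propose: Chooser,          # stand-in for the LLM strategy generator
    policy: Chooser,           # stand-in for the RL-controlled NPC policy
) -> Action:
    """Use a reviewed LLM proposal early on, otherwise fall back to the policy."""
    if step < guidance_steps:
        reviewed = reviewer(propose(valid_actions), valid_actions)
        if reviewed is not None:
            return reviewed
    # Either guidance is over or the proposal violated a constraint.
    return policy(valid_actions)


if __name__ == "__main__":
    actions = ["attack", "heal"]
    first_valid: Chooser = lambda valid: valid[0]   # trivial policy stub
    hallucinating: Chooser = lambda valid: "fly"    # proposes an invalid action
    print(select_action(0, 5, actions, hallucinating, first_valid))
```

In this sketch a constraint-violating proposal never reaches the environment, because the reviewer rejects it and the policy's own choice is used instead.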

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4948713