ARETO: A joint entity and relation extraction model for the triple overlapping problem

Li, Ling-Huey; Castiglione, Arcangelo
2026

Abstract

Entity-relation triple extraction is a crucial task in knowledge graph research, and its performance can suffer when triples overlap. Existing studies often represent relational triples with a matrix structure, in which each matrix encodes the correlations among all tokens. This structure is poorly suited to triple overlap, especially the Entity Pair Overlap (EPO) problem: it struggles to distinguish multiple relational triples that involve the same pair of entities, which leads to ambiguity and reduced extraction accuracy. To address this issue, we propose ARETO, a joint extraction model that uses an adjacency list structure to handle triple overlap. This structure represents entity relations without losing information, and so avoids the complex mapping problems that the matrix structure faces in EPO cases. ARETO also includes a triple-element decoupling module that reduces errors caused by feature confusion. The module extracts triple elements step by step in "subject-object-relation" order, which improves accuracy and addresses the EPO problem. In addition, we identify and define SEO_R, a new type of relation overlap in which triples share the same entity and relation. Experimental results show that ARETO achieves state-of-the-art F1 scores of 93.0% and 93.8% on the NYT and WebNLG-star datasets, respectively, with an accuracy improvement of 0.4%, and reaches F1 scores of 95.3% and 94.7% on EPO and SEO_R overlaps, outperforming the baseline models. The source code is available at: https://github.com/104wucan/ARETO.
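
To make the adjacency list idea concrete, the following is a minimal illustrative sketch and not ARETO's implementation (that is in the repository linked above). It assumes gold triples given as (subject, relation, object) tuples. The function names and example triples are hypothetical, and SEO_R is simplified here to triples that share the same subject and relation. The point it shows is that one subject-object entry can hold several relations, so EPO triples stay distinct instead of competing for a single cell.

```python
from collections import defaultdict


def build_adjacency_list(triples):
    """Map each subject to {object: sorted list of relations} (hypothetical helper)."""
    adjacency = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        adjacency[subject][obj].add(relation)
    return {
        subject: {obj: sorted(relations) for obj, relations in neighbours.items()}
        for subject, neighbours in adjacency.items()
    }


def find_epo_pairs(adjacency):
    """Return (subject, object) pairs linked by more than one relation."""
    return [
        (subject, obj)
        for subject, neighbours in adjacency.items()
        for obj, relations in neighbours.items()
        if len(relations) > 1
    ]


def find_seo_r_groups(triples):
    """Return {(subject, relation): objects} for groups with more than one object.

    Simplifying assumption: the shared entity is always the subject.
    """
    groups = defaultdict(set)
    for subject, relation, obj in triples:
        groups[(subject, relation)].add(obj)
    return {key: sorted(objs) for key, objs in groups.items() if len(objs) > 1}


# Made-up example triples, not taken from NYT or WebNLG.
triples = [
    ("Rome", "capital_of", "Italy"),
    ("Rome", "located_in", "Italy"),
    ("Rome", "located_in", "Europe"),
]
adjacency = build_adjacency_list(triples)
epo_pairs = find_epo_pairs(adjacency)
seo_r_groups = find_seo_r_groups(triples)
```

In this example the pair ("Rome", "Italy") carries two relations (an EPO case), while ("Rome", "located_in") points to two different objects (an SEO_R case under the simplification above).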

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4937795