ARETO: A joint entity and relation extraction model for the triple overlapping problem
Li, Ling-Huey; Castiglione, Arcangelo
2026
Abstract
Entity-relation triple extraction is a crucial task in knowledge graph research, and its performance can be affected by overlapping triples. Existing studies often represent relational triples with a matrix structure, in which each matrix encodes the correlations among all tokens. However, this structure is not well suited to managing triple overlap, especially in cases involving the Entity Pair Overlap (EPO) problem: it struggles to differentiate between multiple relational triples that involve the same pair of entities, which leads to ambiguity and reduced extraction accuracy. To address this issue, we propose ARETO, a joint extraction model that incorporates an adjacency list structure to handle triple overlap. This structure represents entity relations without losing information, thus avoiding the complex mapping problems associated with the matrix structure in EPO cases. Moreover, ARETO includes a triple-element decoupling module that reduces errors caused by feature confusion, employing a step-by-step "subject-object-relation" extraction method to improve accuracy and effectively address the EPO problem. Furthermore, this work identifies and defines SEO_R, a new type of relation overlap in which triples share the same entity and relation. Experimental results demonstrate that ARETO achieves state-of-the-art F1 scores (93.0% and 93.8%) on the NYT and WebNLG-star datasets, with an accuracy improvement of 0.4%, and reaches F1 scores of 95.3% and 94.7% for overlaps between EPO and SEO_R, outperforming other baseline models. The source code of our work is available at: https://github.com/104wucan/ARETO.
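To make the overlap types in the abstract concrete, the following is a minimal illustrative sketch, not the ARETO implementation. It assumes triples are plain (subject, relation, object) tuples, and the function names (`build_adjacency`, `epo_pairs`, `seo_r_groups`) and the example sentence data are hypothetical. It shows how an adjacency list can keep several relations for the same entity pair, and how EPO and SEO_R cases might be detected. SEO_R is read here as two or more distinct triples that share one entity and one relation, which is one interpretation of the definition above.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Adjacency list: subject -> object -> set of relations.

    Several relations between the same entity pair are kept side by side,
    so nothing is overwritten when an entity pair overlaps (EPO).
    """
    adj = defaultdict(lambda: defaultdict(set))
    for subj, rel, obj in triples:
        adj[subj][obj].add(rel)
    return adj


def epo_pairs(adj):
    """Entity pairs linked by more than one relation (EPO cases)."""
    return {
        (subj, obj): rels
        for subj, objs in adj.items()
        for obj, rels in objs.items()
        if len(rels) > 1
    }


def seo_r_groups(triples):
    """(entity, relation) keys shared by two or more distinct triples.

    Assumption: this is one reading of SEO_R as described in the abstract.
    """
    shared = defaultdict(set)
    for subj, rel, obj in triples:
        shared[(subj, rel)].add((subj, rel, obj))
        shared[(obj, rel)].add((subj, rel, obj))
    return {key: group for key, group in shared.items() if len(group) > 1}


# Hypothetical example data, not taken from NYT or WebNLG.
triples = [
    ("Paris", "capital_of", "France"),
    ("Paris", "located_in", "France"),
    ("Obama", "visited", "Paris"),
    ("Obama", "visited", "Berlin"),
]

adj = build_adjacency(triples)
print(epo_pairs(adj))         # the (Paris, France) pair carries two relations
print(seo_r_groups(triples))  # (Obama, visited) is shared by two triples
```

In this sketch the entity pair (Paris, France) holds both relations in one set, whereas a single entity-pair matrix cell with one label per cell could record only one of them.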


