Lost in LOD: Analyzing the Linked Open Data Cloud Quality Maze

Pellegrino M. A.; Tuozzo G.
2026

Abstract

Since 2007, the Linked Open Data (LOD) Cloud has served as a central hub for datasets that follow Linked Data (LD) principles, offering a large repository of interconnected information. Over time, it has undergone multiple quality assessments to check that datasets are accessible, well maintained, and standards-compliant. Building on metadata assessments performed over time, this paper examines the current quality of the LOD Cloud by analyzing 1,658 datasets from the December 2024 snapshot, evaluated against 52 quality metrics. Using a reproducible methodology, it reports the results of the quality assessment and of a trend analysis that assesses progress, identifies persistent problems, and verifies how datasets registered in the LOD Cloud evolve over time. The results show that many earlier issues persist. Datasets still lack consistency in metadata structure, licenses, and distribution formats. Moreover, most remain available mainly as archived versions, while real-time access is often poorly maintained. Well-curated, up-to-date datasets are the exception rather than the rule.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4946457