Lost in LOD: Analyzing the Linked Open Data Cloud Quality Maze
Pellegrino M. A.; Tuozzo G.
2026
Abstract
Since 2007, the Linked Open Data (LOD) Cloud has served as a central hub for datasets that follow Linked Data (LD) principles, offering a large repository of interconnected information. Over time, it has undergone multiple quality assessments to check that datasets are accessible, well maintained, and compliant with standards. Building on metadata assessments performed over time, this paper examines the current quality of the LOD Cloud by analyzing 1,658 datasets from the December 2024 snapshot, evaluated against 52 quality metrics. It proposes a reproducible methodology and reports both the quality assessment and a trend analysis that assesses progress, identifies persistent problems, and verifies how datasets registered in the LOD Cloud evolve over time. The results show that many earlier issues persist: datasets still lack consistency in metadata structure, licenses, and distribution formats. Moreover, they mostly remain available as archived versions, while real-time access is often poorly maintained. Well-curated, up-to-date datasets are the exception rather than the rule.


