Imputation of Missing Values through Profiling Metadata

Breve B.; Caruccio L.; Deufemia V.; Polese G.
2022-01-01

Abstract

Among the many problems in managing database instances, missing values represent a crucial factor that can severely compromise the integrity and meaningfulness of the data. The data imputation research field therefore focuses on filling missing values with plausible candidates while preserving the overall semantic integrity of the database instance. To keep imputation times low while maintaining high accuracy, recent research proposals have turned to metadata. This discussion paper presents RENUVER, a novel data imputation algorithm that relies on Relaxed Functional Dependencies (rfds) to identify the candidate values that best guarantee the semantic integrity of the data. Experimental results on real-world datasets highlight the effectiveness of RENUVER in terms of both filling accuracy and imputation time, also in comparison with other well-known approaches.
Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4823365