Functional dependencies (fds) were conceived in the early ’70s, and were mainly used to verify database design and assess data quality. Nowadays they are automatically discovered from data since they can be exploited for many different purposes, such as query relaxation, data cleansing, and record matching. In the context of big data, the speed at which new data is being created demand for new efficient algorithms for fd discovery. In this paper, we propose an incremental discovery algorithm for fds, which is able to update the set of holding fds upon modifications to the data instance, without having to restart the discovery process from scratch. It exploits a bit-vector representation of fds, and an upward/downward search strategy aiming to reduce the overall search space. Experimental results show that such algorithm could achieve extremely better time performances with respect to a complete re-execution of the discovery algorithm.

Incremental discovery of functional dependencies with a bit-vector algorithm

Loredana Caruccio;Stefano Cirillo;Vincenzo Deufemia;Giuseppe Polese
2019-01-01

Abstract

Functional dependencies (fds) were conceived in the early ’70s, and were mainly used to verify database design and assess data quality. Nowadays they are automatically discovered from data since they can be exploited for many different purposes, such as query relaxation, data cleansing, and record matching. In the context of big data, the speed at which new data is being created demand for new efficient algorithms for fd discovery. In this paper, we propose an incremental discovery algorithm for fds, which is able to update the set of holding fds upon modifications to the data instance, without having to restart the discovery process from scratch. It exploits a bit-vector representation of fds, and an upward/downward search strategy aiming to reduce the overall search space. Experimental results show that such algorithm could achieve extremely better time performances with respect to a complete re-execution of the discovery algorithm.
2019
9781510892880
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11386/4730437
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 25
  • ???jsp.display-item.citation.isi??? ND
social impact