Dependency Visualization in Data Stream Profiling

Breve B.;Caruccio L.;Cirillo S.;Deufemia V.;Polese G.
2021-01-01

Abstract

Data stream profiling concerns the automatic extraction of metadata from a data stream that cannot be stored. Among the metadata of interest, functional dependencies (FDs) and their extensions, relaxed functional dependencies (RFDs), represent an important semantic property of data. Many algorithms now exist for automatically discovering them from static datasets, and some are being proposed for data streams. However, one of the main problems is that the streaming nature of the data requires a different monitoring paradigm, since the large number of (R)FDs that might hold on a given dataset changes continuously as new data are read from the stream. In this paper, we present a tool for visualizing RFDs discovered from a data stream. The tool lets users explore results for different types of RFDs and uses quantitative measures to monitor how discovery results evolve. Moreover, the tool enables the comparison of RFDs discovered across several executions and provides visual manipulation operators to dynamically compose and filter results. A user study was conducted to assess the effectiveness of the proposed visualization tool.
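
To make concrete the idea that the set of valid dependencies changes as tuples are read, the following is a minimal sketch in Python of monitoring a single exact FD over a stream. It is not the tool or the discovery algorithm described in the paper: the class name `StreamingFDMonitor`, its fields, and the sample `zip`/`city` records are all hypothetical, and only a plain FD (not a relaxed one) is checked. The sketch keeps a mapping from left-hand-side values to the right-hand-side value first seen with them, rather than the raw tuples.

```python
from typing import Dict, Hashable, Mapping, Sequence, Tuple


class StreamingFDMonitor:
    """Hypothetical helper: tracks whether a candidate FD lhs -> rhs
    still holds as tuples arrive one at a time."""

    def __init__(self, lhs: Sequence[str], rhs: str) -> None:
        self.lhs = tuple(lhs)
        self.rhs = rhs
        # Maps each observed LHS value combination to the first RHS value seen with it.
        self.seen: Dict[Tuple[Hashable, ...], Hashable] = {}
        self.holds = True
        self.tuples_read = 0
        self.violations = 0

    def update(self, row: Mapping[str, Hashable]) -> bool:
        """Consume one tuple and return whether the FD still holds."""
        self.tuples_read += 1
        key = tuple(row[attr] for attr in self.lhs)
        value = row[self.rhs]
        # setdefault stores the value for a new key, or returns the stored one.
        previous = self.seen.setdefault(key, value)
        if previous != value:
            # Same LHS values, different RHS value: the FD is violated.
            self.violations += 1
            self.holds = False
        return self.holds


# Illustrative (made-up) stream: the FD zip -> city holds until the last tuple.
stream = [
    {"zip": "84084", "city": "Fisciano", "name": "Ann"},
    {"zip": "84100", "city": "Salerno", "name": "Bob"},
    {"zip": "84084", "city": "Fisciano", "name": "Cal"},
    {"zip": "84084", "city": "Salerno", "name": "Dan"},
]
monitor = StreamingFDMonitor(lhs=["zip"], rhs="city")
history = [monitor.update(row) for row in stream]
print(history)  # [True, True, True, False]
```

The `history` list is the kind of per-tuple signal that a quantitative measure could summarize over time. A relaxed dependency would replace the strict equality test with a similarity threshold or tolerate a bounded share of violations, which this sketch does not attempt.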

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4768172