On the Discovery of Relaxed Functional Dependencies
CARUCCIO, Loredana; DEUFEMIA, Vincenzo; POLESE, Giuseppe
2016-01-01
Abstract
Functional dependencies (fds) express important relationships among data and serve several purposes, including schema normalization and data cleansing. However, emerging application domains pose problems, such as identifying data inconsistencies or patterns of semantically related data, that have made it necessary to relax the fd definition by introducing approximations in data comparison and/or validity. Moreover, while fds were originally specified at design time, the availability of massive data and computational power has led to many algorithms that discover them automatically from data, including algorithms for some types of relaxed fds. In this paper we present a technique that exploits lattice-based fd discovery algorithms to detect relaxed fds. We also introduce an algorithm that determines a proper distance threshold for a given relaxed fd holding over the entire database.
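To make the notions in the abstract concrete, the following minimal Python sketch contrasts an exact fd check with a relaxed one that approximates data comparison through a distance function, and then finds by brute force the smallest right-hand-side threshold for which the relaxed fd holds on every tuple pair. It is an illustration only, not the lattice-based technique or the threshold algorithm of the paper; the function names, the pairwise strategy, the per-attribute thresholds `lhs_eps` and `rhs_eps`, and the sample relation are all assumptions made for the example.

```python
from itertools import combinations


def holds_fd(rows, lhs, rhs):
    """Exact fd lhs -> rhs: rows that agree on lhs must agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


def holds_relaxed_fd(rows, lhs, rhs, dist, lhs_eps, rhs_eps):
    """Relaxed fd (assumed form): rows within lhs_eps on every lhs
    attribute must be within rhs_eps on every rhs attribute."""
    for r1, r2 in combinations(rows, 2):
        if all(dist(r1[a], r2[a]) <= lhs_eps for a in lhs):
            if any(dist(r1[a], r2[a]) > rhs_eps for a in rhs):
                return False
    return True


def min_rhs_threshold(rows, lhs, rhs, dist, lhs_eps):
    """Smallest rhs_eps for which the relaxed fd holds on all rows
    (quadratic brute force, for illustration only)."""
    worst = 0
    for r1, r2 in combinations(rows, 2):
        if all(dist(r1[a], r2[a]) <= lhs_eps for a in lhs):
            worst = max(worst, max(dist(r1[a], r2[a]) for a in rhs))
    return worst


def abs_diff(x, y):
    """Hypothetical distance function for numeric attributes."""
    return abs(x - y)


# Hypothetical sample relation.
rows = [
    {"age": 30, "salary": 50},
    {"age": 30, "salary": 52},
    {"age": 31, "salary": 53},
    {"age": 40, "salary": 80},
]

exact = holds_fd(rows, ["age"], ["salary"])
relaxed = holds_relaxed_fd(rows, ["age"], ["salary"], abs_diff, 1, 3)
threshold = min_rhs_threshold(rows, ["age"], ["salary"], abs_diff, 1)
```

On this sample the exact fd `age -> salary` fails because two rows share an age but differ in salary, whereas the relaxed version holds once salaries are allowed to differ by up to 3 for ages within 1 of each other. The pairwise scan is quadratic in the number of rows, so it only serves to show what a threshold valid over the entire database means.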