Compression and indexing of genomic data with confidentiality protection / Ferdinando Montecuollo, 2015 Apr 30, Academic Year 2013-2014.

Compression and indexing of genomic data with confidentiality protection

Montecuollo, Ferdinando
2015

Abstract

Compressing data while preserving specific security properties that guarantee users' privacy is a live issue. At the same time, high-throughput systems in genomics (e.g. the so-called Next Generation Sequencers) generate massive amounts of genetic data at affordable cost. As a consequence, huge DBMSs integrating many types of genomic information, clinical data and other information (personal, environmental, historical, etc.) are on the way. They will give an unprecedented capability for large-scale, comprehensive and in-depth analysis of human beings and diseases; however, they will also pose a formidable threat to users' privacy. Whilst clinical data can be stored confidentially with well-known methods from the field of relational databases, the same is not true for genomic data; the main goal of my research work was therefore the design of new compressed indexing schemes for managing genomic data with confidentiality protection. To process huge amounts of such data effectively, a key requirement is the ability to perform high-speed search operations in secondary storage, operating directly on the data in compressed and encrypted form; I therefore devoted considerable effort to obtaining algorithms and data structures that enable pattern search on compressed and encrypted data in secondary storage, so that the data need not be preloaded into main memory before those operations start. [edited by Author]
30-apr-2015
Systems biology
Confidentiality protection
Genomic sequences
Indexed data compression
Tagliaferri, Roberto
Leone, Antonietta
Files in this item:
108029604304619694906716462172027622479.pdf: open access, other attached material, 2.86 MB, Adobe PDF
137810161322407549620429259355793787776.pdf: open access, other attached material, 29.67 kB, Adobe PDF
166527093388294444938734498415714173450.pdf: open access, other attached material, 29.57 kB, Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4923818
Warning: the data shown have not been validated by the university.
