Data Compression and Clustering: A Blind Approach to Classification

CARPENTIERI, Bruno
2012-01-01

Abstract

Data compression is today essential for a wide range of applications; for example, the Internet and World Wide Web infrastructures benefit from compression. New general-purpose compression methods are continually being developed, in particular those that allow indexing over compressed data or provide error resilience. Compression also inspires information-theoretic tools for pattern discovery and classification; in particular, data compression can be used as a metric for clustering. This leads to a powerful clustering strategy that uses no “semantic” information about the data to be classified, performing instead a “blind” yet effective classification based only on the compressibility of the digital data and not on its “meaning”. Here we experiment with this strategy and show its effectiveness.
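
The abstract does not say which compression-based measure the paper uses. As an illustration only, the idea of using compressibility as a distance can be sketched with the normalized compression distance (NCD), a widely used formulation: two objects are considered close when compressing them together costs little more than compressing the larger of them alone. The choice of NCD, the use of zlib as the compressor, and the function names below are all assumptions made for this sketch and are not taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Values near 0 suggest similar
    content; values near 1 suggest unrelated content.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(items: list[bytes]) -> list[list[float]]:
    """Pairwise NCD values, usable as input to any distance-based clustering method."""
    return [[ncd(a, b) for b in items] for a in items]


if __name__ == "__main__":
    # Hypothetical sample data: two related texts and one unrelated byte pattern.
    text_a = b"the quick brown fox jumps over the lazy dog " * 20
    text_b = b"the quick brown fox leaps over the lazy dog " * 20
    other = bytes((i * 97 + 13) % 256 for i in range(1000))
    for row in distance_matrix([text_a, text_b, other]):
        print(["%.3f" % d for d in row])
```

The resulting matrix never inspects what the bytes mean, which is the sense in which such a classification is “blind”. With a real compressor the measure is only approximate: the distance of an object to itself is small but not exactly zero, and the values are not perfectly symmetric.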
Year: 2012
ISBN: 9781618041265

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4250103