An improved privacy attack on smartphones exploiting the accelerometer

Roberto De Prisco;Alfredo De Santis;Delfina Malandrino;Rocco Zaccagnino
2023-01-01

Abstract

We define and implement a novel side-channel attack that exploits a smartphone’s accelerometer to eavesdrop on entire words that the device itself reproduces through its loudspeakers. The proposed approach consists of two modules: (i) a deep learning-based system that uses a Convolutional Neural Network (CNN) to recognize a set of significant speech units from the spectrogram representation of the corresponding acceleration signals; (ii) an evolutionary segmentation method that, given the accelerometer measurements corresponding to an input speech, searches for the best way to split them so that the CNN maintains high classification performance on each resulting segment, so that a significant percentage of the words in the original speech is recognized. Experiments assessing the effectiveness of the attack show that the percentage of recognized words is higher for short speeches and decreases as the speeches get longer. For speeches ranging from 5 to 60 s in length, the recognition percentage goes from about 80% for the shortest speeches down to about 54% for the longest ones.
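The abstract does not give the details of the segmentation module, so the following is only a minimal sketch of the general idea of an evolutionary search over cut points, not the authors' algorithm. All names (`split`, `fitness`, `mutate`, `evolve_segmentation`, `toy_score`) are hypothetical, and `toy_score` is a stand-in for the per-segment confidence that a trained CNN would provide; here it simply rewards segments with low variance.

```python
import random
import statistics
from typing import Callable, List, Sequence

Score = Callable[[Sequence[float]], float]


def split(signal: Sequence[float], cuts: List[int]) -> List[Sequence[float]]:
    """Return the segments of `signal` delimited by the sorted cut indices."""
    bounds = [0, *cuts, len(signal)]
    return [signal[a:b] for a, b in zip(bounds, bounds[1:])]


def fitness(signal: Sequence[float], cuts: List[int], score: Score) -> float:
    """Mean per-segment score (assumed proxy for classifier confidence)."""
    segments = split(signal, cuts)
    return sum(score(s) for s in segments) / len(segments)


def mutate(cuts: List[int], n: int, rng: random.Random, max_shift: int = 5) -> List[int]:
    """Shift one cut by a small random offset; keep cuts distinct and in [1, n-1]."""
    new = list(cuts)
    i = rng.randrange(len(new))
    moved = min(n - 1, max(1, new[i] + rng.randint(-max_shift, max_shift)))
    if moved in new:  # would collide with an existing cut: leave unchanged
        return list(cuts)
    new[i] = moved
    return sorted(new)


def evolve_segmentation(signal: Sequence[float], n_cuts: int, score: Score,
                        generations: int = 200, pop_size: int = 20,
                        seed: int = 0) -> List[int]:
    """Elitist evolutionary search: keep the best half, refill with mutants."""
    rng = random.Random(seed)
    n = len(signal)
    population = [sorted(rng.sample(range(1, n), n_cuts)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(signal, c, score), reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), n, rng) for _ in survivors]
        population = survivors + children
    return max(population, key=lambda c: fitness(signal, c, score))


def toy_score(segment: Sequence[float]) -> float:
    """Hypothetical stand-in for CNN confidence: high when the segment is homogeneous."""
    return 1.0 / (1.0 + statistics.pvariance(segment))


# Toy signal with one obvious boundary at index 30.
signal = [0.0] * 30 + [1.0] * 30
best = evolve_segmentation(signal, n_cuts=1, score=toy_score)
print(best, fitness(signal, best, toy_score))
```

In a real attack pipeline the scoring function would presumably be replaced by the classifier's confidence on the spectrogram of each candidate segment, and the number of cuts would itself be part of the search; both choices are assumptions here.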

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4823369