Malware detection in mobile environments based on Autoencoders and API-images

D'Angelo G.; Ficco M.; Palmieri F.
2020-01-01

Abstract

Due to their open nature and popularity, Android-based devices are one of the main targets of malware attacks that may adversely affect the privacy of their users. Given Android's large market share, effective tools are needed to reliably detect zero-day malware on these platforms. Accordingly, several static and dynamic analysis methods based on Neural Networks and Deep Learning have been proposed in the literature. Although machine learning can be considered the most promising approach for classifying applications as malicious or legitimate, its success strongly depends on the choice of the features used to build the detection model. Choosing them is not an easy task, and it requires a systematic solution. This work therefore represents the sequences of API calls invoked by apps during their execution as sparse, image-like matrices (API-images), which can be used as fingerprints of the apps' behavior over time. Autoencoders are then used to autonomously extract the most representative and discriminating features from these matrices. Once provided to an artificial neural network-based classifier, these features have proven effective in detecting malware, even when the network is trained on a reduced number of samples. Experimental results show that the resulting framework outperforms more complex and sophisticated machine learning approaches in malware classification.
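The idea of turning an API call trace into a sparse, image-like matrix can be illustrated with a minimal sketch. The abstract does not state how the matrix is laid out, so the layout below is an assumption rather than the authors' encoding: one row per API in a fixed vocabulary, one column per position in the trace, and a cell set to 1 when that API was invoked at that step. The function name `build_api_image`, the vocabulary, and the trace are all hypothetical.

```python
import numpy as np


def build_api_image(call_sequence, api_vocabulary, max_steps):
    """Encode a sequence of API calls as a sparse binary matrix.

    Assumed layout (not taken from the paper): row i is the i-th API in
    api_vocabulary, column t is the t-th call in the trace, and a cell
    is 1 when that API was invoked at that step.
    """
    index = {name: i for i, name in enumerate(api_vocabulary)}
    image = np.zeros((len(api_vocabulary), max_steps), dtype=np.uint8)
    # Traces longer than max_steps are truncated; shorter ones leave
    # the remaining columns at zero.
    for step, name in enumerate(call_sequence[:max_steps]):
        row = index.get(name)
        if row is not None:  # calls outside the vocabulary are skipped
            image[row, step] = 1
    return image


# Hypothetical vocabulary and trace, for illustration only.
vocabulary = ["openConnection", "getDeviceId", "sendTextMessage", "read"]
trace = ["getDeviceId", "openConnection", "read", "read", "sendTextMessage"]
api_image = build_api_image(trace, vocabulary, max_steps=8)
```

Under this assumed encoding, a flattened matrix such as `api_image.ravel()` would be the kind of input an autoencoder could compress into a short feature vector, which a neural network classifier would then consume, in line with the pipeline the abstract describes.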

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4746272