An Android Malware Multi-class Classification Explained Through Genetic Programming

D'Angelo G.; Palmieri F.; Robustelli A.
2024-01-01

Abstract

Interest in applying Artificial Intelligence algorithms within security contexts is growing rapidly, particularly for tasks related to malware detection and classification. Over the last decade, numerous Machine Learning (ML) and Deep Learning (DL)-based techniques have been proposed to address the growth of malicious applications, focusing on features derived from dynamic malware analysis. However, these approaches are often considered black boxes because of their limited ability to explain the results they produce. In contrast, this study seeks to develop a new model for identifying malware families in an interpretable manner. The methodology employs Genetic Programming to construct a multi-class classifier characterized by a mathematical formula expressing the relationship between dynamic features and the considered malware families. Experimental results, based on Android applications from the Unisa Malware Dataset (UMD), show that our approach achieves average scores comparable to those of the best-known Machine Learning techniques.
Year: 2024
ISBN: 9783031652226
ISBN: 9783031652233
Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4898255