An Android Malware Multi-class Classification Explained Through Genetic Programming
D'Angelo G.;Palmieri F.;Robustelli A.
2024-01-01
Abstract
Interest in applying Artificial Intelligence algorithms in security contexts is growing rapidly, particularly for tasks related to malware detection and classification. Over the last decade, numerous Machine Learning (ML) and Deep Learning (DL) techniques have been proposed to address the growth of malicious applications, many of them relying on features derived from dynamic malware analysis. However, these approaches are often regarded as black boxes because of their limited ability to explain the results they produce. In contrast, this study seeks to develop a new model that identifies malware families in an interpretable manner. The methodology employs Genetic Programming to construct a multi-class classifier characterized by a mathematical formula that expresses the relationship between the dynamic features and the malware families considered. Experimental results, based on Android applications from the Unisa Malware Dataset (UMD), show that the approach achieves average scores comparable to those of the best-known Machine Learning techniques.