Exploring Perturbation Patterns and Impact in Adversarial Machine Learning: A Systematic Literature Review

Sheykina A.; Palomba F.; De Lucia A.
2026

Abstract

Adversarial attacks have gained growing attention due to their ability to mislead machine learning models by introducing carefully crafted perturbations. These attacks span a wide variety of domains, from image recognition to graph-based and NLP models. In this context, understanding the nature of such perturbations is crucial for both detecting attacks and designing effective defenses. Despite the abundance of research on adversarial machine learning, little is known about how perturbation types vary based on the category of the targeted feature, or how the amount of perturbation influences the attack’s impact. To this end, we conducted a systematic study to (i) identify and classify the perturbation strategies used in adversarial attacks and (ii) analyze the relationship between the strength of perturbation, the perturbation type, and the impact on model behavior. Our findings show that while many attacks apply minimal but targeted changes, the perturbation type plays a major role in determining the success of the attack. Furthermore, attacks with similar perturbation magnitudes may have vastly different impacts depending on their semantic focus. These insights can support the prioritization of defense mechanisms by focusing on high-impact perturbations, and they lay the groundwork for improved adversarial detection systems based on perturbation-level analysis.
ISBN: 9783032041999; 9783032042002
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11386/4943675
Citations
  • PMC: ND
  • Scopus: 0
  • Web of Science: 0