Federated and Generative Data Sharing for Data-Driven Security: Challenges and Approach
Ficco M.
2022-01-01
Abstract
Modern cyber-attacks are evolving into Advanced Persistent Threats (APTs). APTs are orchestrated by cybercriminals or state-sponsored groups, who carry out carefully planned, stealthy, targeted attacks that span a long period of time. It is difficult to defend against APTs, mostly because of the absence of high-quality data to build detectors and train personnel. In fact, new attacks are continuously crafted, and most organizations are unwilling to share data about attacks they have experienced. In this paper, we argue for an approach to the automatic generation of representative datasets of APTs, without forcing organizations to disclose their sensitive information. We propose to adopt the Federated Learning paradigm to train a Generative Machine Learning model, which will generate new traces of network and host events representative of real APT attacks. Blockchain-based strategies will overcome the typical shortcomings of a centralized approach, such as a single point of failure and malicious clients. The generated APT datasets can be leveraged for training and assessing AI-based APT detectors, and for emulating attacks in live cyber-range exercises.