Williams College, Williamstown, MA 01267, USA; e-mail: aes7@williams.edu
MIT CSAIL, Cambridge, MA 02139, USA; e-mails: hembergerik@csail.mit.edu, mtulla@mit.edu, unamay@csail.mit.edu
2023 Volume 38
RESEARCH ARTICLE   Open Access    

Adversarial agent-learning for cybersecurity: a comparison of algorithms

  • Abstract: We investigate artificial intelligence and machine learning methods for optimizing the adversarial behavior of agents in cybersecurity simulations. Our cybersecurity simulations integrate the modeling of agents launching Advanced Persistent Threats (APTs) with the modeling of agents using detection and mitigation mechanisms against APTs. This simulates the phenomenon of attacks and defenses coevolving. The simulations and machine learning are used to search for optimal agent behaviors. The central question is: under what circumstances is one training method more advantageous than another? We adapt and compare a variety of deep reinforcement learning (DRL), evolution strategies (ES), and Monte Carlo Tree Search methods within Connect 4, a baseline game environment, and on two cybersecurity simulations: SNAPT, which supports a simple APT threat model, and CyberBattleSim, an open-source cybersecurity simulation. Our results show that attackers trained by DRL, by ES, and by the two algorithms used in alternation can all effectively choose complex exploits that thwart a defense. The algorithm that combines DRL and ES achieves the best comparative performance when attackers and defenders are trained simultaneously, rather than when each is trained against its non-learning counterpart.
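The combined training scheme the abstract describes can be illustrated with a minimal, hypothetical sketch: a toy attacker parameter vector is improved against a fixed defender by alternating a (1+1)-ES mutation step with a finite-difference ascent step (standing in for a DRL policy-gradient update). The quadratic payoff, function names, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import random

# Toy adversarial payoff: attacker params x vs. defender params y (illustrative).
# The attacker maximizes payoff; payoff peaks at 0 when x matches y.
def payoff(x, y):
    return -sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def es_step(x, y, sigma=0.1):
    """(1+1)-ES style mutation: keep the Gaussian-perturbed candidate if it scores better."""
    cand = [xi + random.gauss(0, sigma) for xi in x]
    return cand if payoff(cand, y) > payoff(x, y) else x

def grad_step(x, y, lr=0.05, eps=1e-4):
    """Finite-difference ascent, a stand-in for a gradient-based (DRL-like) update."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((payoff(xp, y) - payoff(xm, y)) / (2 * eps))
    return [xi + lr * gi for xi, gi in zip(x, grad)]

def train_alternating(x, y, rounds=200):
    """Alternate ES and gradient-style updates against a fixed defender."""
    for t in range(rounds):
        x = es_step(x, y) if t % 2 == 0 else grad_step(x, y)
    return x

random.seed(0)
defender = [1.0, -2.0, 0.5]           # fixed defender parameters (toy)
attacker = train_alternating([0.0, 0.0, 0.0], defender)
```

In the paper's setting both sides learn, so the defender's parameters would themselves be updated between attacker rounds; here the defender is frozen only to keep the sketch short.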

  • Cite this article

    Alexander Shashkov, Erik Hemberg, Miguel Tulla, Una-May O’Reilly. 2023. Adversarial agent-learning for cybersecurity: a comparison of algorithms. The Knowledge Engineering Review 38(1), doi: 10.1017/S0269888923000012



    • https://en.wikipedia.org/wiki/Connect_Four.

    • © The Author(s), 2023. Published by Cambridge University Press.