A Multi-Objective Reinforcement Learning Framework for Security Enhancement in Autonomous Vehicles

Document Type : Research Article

Authors

1 Department of Computer Engineering, Faculty of Engineering, University of Guilan

2 Faculty of Computer Science and Engineering, Shahid Beheshti University

10.22042/isecure.2026.242014
Abstract
Autonomous vehicles must balance road-safety objectives with growing cybersecurity threats. In this paper, we present a reinforcement-learning framework that jointly optimizes driving performance and resilience to Denial-of-Service (DoS) attacks. The problem is formulated as a multi-objective Markov Decision Process that integrates a safety reward with a security reward, while the partial observability of attacks is captured via a Bayesian belief state. A Proximal Policy Optimization (PPO) agent controls steering, throttle, and dedicated mitigation actions. The system is implemented in the CARLA simulator with camera and LiDAR inputs and evaluated on urban driving scenarios. Experimental results demonstrate that the agent sustains stable lane-keeping and target-speed performance while substantially reducing collision-prone incidents and retaining more than 90% of the nominal travel distance under attack. The framework outperforms both a safety-only PPO baseline and a rule-based security countermeasure.
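As a rough illustration of the formulation sketched in the abstract, the two core ingredients are a scalarized multi-objective reward and a Bayesian belief over the hidden "under attack" state. The sketch below is an assumption-laden toy, not the paper's implementation: the weight `w_security`, the anomaly likelihoods, and the function names are all illustrative choices.

```python
def update_attack_belief(prior, obs_anomalous,
                         p_anom_given_attack=0.9,
                         p_anom_given_normal=0.1):
    """One Bayesian update of the belief that a DoS attack is active.

    The attack state is not directly observable; only a noisy anomaly
    signal (e.g., from network or sensor monitoring) is seen each step.
    Likelihood values here are illustrative placeholders.
    """
    like_attack = p_anom_given_attack if obs_anomalous else 1 - p_anom_given_attack
    like_normal = p_anom_given_normal if obs_anomalous else 1 - p_anom_given_normal
    evidence = like_attack * prior + like_normal * (1 - prior)
    return like_attack * prior / evidence


def combined_reward(r_safety, r_security, w_security=0.5):
    """Scalarize the multi-objective reward: safety term plus a
    weighted security term, so a single PPO agent can optimize both."""
    return r_safety + w_security * r_security
```

With these pieces, the belief value can be appended to the agent's observation vector so that PPO conditions its driving and mitigation actions on the current attack probability; an anomalous observation at a 0.5 prior, for example, pushes the belief sharply toward 1 under the placeholder likelihoods above.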

Keywords



Articles in Press, Accepted Manuscript
Available Online from 12 March 2026