A Novel Reinforcement Learning-based Congestion Control Algorithm for DDoS-Induced Adversarial Conditions in Blockchain and Distributed Networks
Volume 18, Issue 1, January 2026, Pages 49-60
https://doi.org/10.22042/isecure.2025.515662.1221
Ehsan Abedini, Amir Jalaly Bidgoly, Mohsen Nickray
Abstract Distributed Denial-of-Service (DDoS) attacks are among the most critical security threats to distributed network infrastructures, including blockchain systems. These attacks degrade performance, cause congestion, and disrupt service delivery or transaction processing. Traditional mitigation techniques have undergone extensive development, but they often fail to detect and manage traffic patterns intelligently and struggle to adapt to dynamic conditions in decentralized environments. This paper proposes a reinforcement learning-based congestion control (CC) method that dynamically adjusts the congestion window (CWND), following traditional TCP principles, based on signals such as delay and packet loss. What distinguishes our approach is that the RL agent interprets persistent or abnormal congestion patterns as potential indicators of adversarial high-load conditions (e.g., DDoS-induced congestion) and adapts its CWND adjustments to reduce their adverse effects. Leveraging the Q-learning algorithm, the proposed approach adapts dynamically to fluctuating traffic and conditions. Its learning capability enables continuous monitoring of traffic behavior and timely responsiveness to anomalies, including sustained congestion patterns often associated with adversarial traffic surges. Simulation results across various DDoS scenarios, evaluated against conventional CC algorithms, demonstrate considerable improvements in key performance indicators: reduced latency, enhanced bandwidth utilization, improved stability, decreased packet loss, and increased throughput. The proposed Q-learning-based CC operates at the peer-to-peer layer, regulating flow among blockchain nodes. It is independent of consensus mechanisms while indirectly improving consensus efficiency by reducing message delays and packet loss. This method offers a scalable and intelligent solution for CC under adversarial conditions, thereby contributing to improved robustness and efficiency in both general distributed systems and blockchain networks.
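As a rough illustration of the idea summarized above, the following is a minimal sketch of tabular Q-learning applied to CWND adjustment. It is not the authors' implementation: the state bins, the three-action set, the reward weights, and every name (`discretize`, `reward`, `QAgent`, `apply_action`) are assumptions made here for illustration; the abstract states only that Q-learning adjusts CWND from signals such as delay and packet loss.

```python
import random
from collections import defaultdict

# Hypothetical action set: shrink, hold, or grow CWND by one segment.
ACTIONS = (-1, 0, 1)


def discretize(delay_ms, loss_rate):
    """Map raw delay/loss signals to a coarse state (bin edges are illustrative)."""
    delay_bin = 0 if delay_ms < 50 else 1 if delay_ms < 200 else 2
    loss_bin = 0 if loss_rate < 0.01 else 1 if loss_rate < 0.05 else 2
    return (delay_bin, loss_bin)


def reward(throughput, delay_ms, loss_rate):
    """Assumed reward shape: favor throughput, penalize delay and loss."""
    return throughput - 0.01 * delay_ms - 10.0 * loss_rate


class QAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # Q-table keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, r, next_state):
        """Standard Q-learning temporal-difference update."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def apply_action(cwnd, action, min_cwnd=1, max_cwnd=1000):
    """Apply the chosen CWND delta, clamped to assumed bounds."""
    return max(min_cwnd, min(max_cwnd, cwnd + action))
```

In a simulation loop, one would observe delay and loss each round, call `choose`, apply the action to CWND, measure the resulting reward, and call `update`. Under these assumptions, sustained high-delay, high-loss states would accumulate low Q-values for the "grow" action.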
Bypassing Web Application Firewalls Using Deep Reinforcement Learning
Volume 14, Issue 2, July 2022, Pages 131-145
https://doi.org/10.22042/isecure.2022.323140.744
Mojtaba Hemmati, Mohammad Ali Hadavi
Abstract Web application firewalls (WAFs) are used for protecting web applications from attacks such as SQL injection, cross-site request forgery, and cross-site scripting. As a result of the growing complexity of web attacks, WAFs need to be tested and updated on a regular basis. There are various tools and techniques to verify the correct performance of WAFs, but most of them are manual or rely on brute-force attacks and therefore suffer from poor efficacy. In this work, we propose a solution based on Reinforcement Learning (RL) to discover malicious payloads that can bypass WAFs. We provide an RL framework with an environment compatible with the OpenAI Gym toolset standards. This environment is used to train agents to perform WAF circumvention tasks. The agent mutates the syntax of a malicious payload using a set of modification operators as actions, without changing its semantics. Then, based on the WAF's reaction to the payload, the environment assigns a reward to the agent. Eventually, based on the rewards, the agent learns a suitable sequence of mutations for any malicious payload. The payloads that bypass the WAF can reveal rule defects, which can then be used for rule tuning in rule-based WAFs. They can also enrich the datasets used to retrain machine learning-based WAFs. We use Q-learning, advantage actor-critic (A2C), and proximal policy optimization (PPO) algorithms with deep neural networks. Our solution is successful in evading signature-based and machine learning-based WAFs. While we focus on SQL injection in this work, the method can be readily extended to any string-based injection attack.
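The agent-environment loop described above can be sketched with a toy environment that mirrors the classic Gym `reset`/`step` shape. This is an illustrative stand-in, not the authors' framework: the class name `ToyFilterEnv`, the two mutation operators, the reward values, and the naive substring filter playing the role of the WAF are all assumptions made here.

```python
class ToyFilterEnv:
    """Toy environment: a naive substring filter stands in for the WAF."""

    def __init__(self, seed_payload="select 1", max_steps=5):
        self.seed_payload = seed_payload
        self.max_steps = max_steps
        # Hypothetical mutation operators (the action set). Each changes
        # the payload's syntax while leaving its SQL meaning unchanged.
        self.operators = [
            lambda s: s.swapcase(),          # action 0: flip letter case
            lambda s: s.replace(" ", "\t"),  # action 1: space -> tab
        ]
        self.reset()

    def reset(self):
        """Start a new episode from the seed payload."""
        self.payload = self.seed_payload
        self.steps = 0
        return self.payload

    def _blocked(self, payload):
        """Toy rule: block any payload containing the literal 'select '."""
        return "select " in payload

    def step(self, action):
        """Apply one mutation and reward the agent by the filter's reaction."""
        self.payload = self.operators[action](self.payload)
        self.steps += 1
        bypassed = not self._blocked(self.payload)
        reward = 1.0 if bypassed else -0.1  # assumed reward values
        done = bypassed or self.steps >= self.max_steps
        return self.payload, reward, done, {}
```

An agent trained against such an interface would learn which operator sequences earn the positive reward; in the toy case, each payload that passes corresponds to a gap in the filter rule, which is the kind of signal the abstract proposes feeding back into rule tuning.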
