Efficient Dynamic Spectrum Anti-jamming Access With Large Action Space: An Action Space Decomposition-Based Approach
IEEE Wireless Communications Letters (2024)
Abstract
In this letter, we study the problem of dynamic spectrum anti-jamming access with an exponentially growing action space. Traditional deep reinforcement learning methods, which have been restricted to scenarios with relatively small action spaces, perform poorly when the action space is large because of low exploration efficiency. To address this challenge, we propose an efficient algorithm called Proximal Policy Optimization with Action Branching and Dynamic Action Masking (PPO-ABM). The joint action space is decoupled using an action branching architecture, so that the number of output nodes in the neural network grows linearly rather than exponentially. A sequential decision scheme based on dynamic action masking is further proposed to eliminate invalid actions and accelerate convergence. Simulation results show that PPO-ABM converges rapidly and achieves near-optimal performance even as the action space grows exponentially. The performance of PPO-ABM is 53.14% higher than that of the baseline when there are 115 actions.
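The two ideas in the abstract, action branching and dynamic action masking, can be illustrated with a minimal Python sketch. It is not the authors' implementation and omits PPO training entirely. The function names (`output_nodes`, `masked_softmax`, `sample_joint_action`) are hypothetical, and the validity rule used for masking (no two branches may pick the same channel) is an assumption made purely for illustration; the abstract does not state which actions are invalid.

```python
import math
import random


def output_nodes(num_branches, options_per_branch):
    """Return (joint, branched) output-layer sizes for a discrete policy."""
    joint = options_per_branch ** num_branches    # one node per joint action
    branched = options_per_branch * num_branches  # one small head per branch
    return joint, branched


def masked_softmax(logits, mask):
    """Softmax over entries whose mask is True; masked entries get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    peak = max(l for l, ok in zip(logits, mask) if ok)  # for numerical stability
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_joint_action(branch_logits, rng):
    """Choose one sub-action per branch in sequence, updating the mask each step.

    ASSUMPTION (illustrative only): a sub-action becomes invalid for later
    branches once an earlier branch has chosen it.
    """
    num_options = len(branch_logits[0])
    mask = [True] * num_options
    action = []
    for logits in branch_logits:
        probs = masked_softmax(logits, mask)
        valid = [i for i, ok in enumerate(mask) if ok]
        choice = rng.choices(valid, weights=[probs[i] for i in valid], k=1)[0]
        action.append(choice)
        mask[choice] = False  # dynamic mask: exclude this option from now on
    return action
```

For example, `output_nodes(5, 10)` returns `(100000, 50)`: five branches with ten options each would need 100000 output nodes as a single joint head but only 50 when branched. Calling `sample_joint_action([[0.0] * 4 for _ in range(3)], random.Random(0))` yields three distinct sub-actions, because each choice is masked out before the next branch is sampled.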
Key words
Dynamic spectrum anti-jamming access, large action space, deep reinforcement learning, action branching, dynamic action masking