Joint Optimization of Handover Control and Power Allocation Based on Multi-Agent Deep Reinforcement Learning

IEEE Transactions on Vehicular Technology (2020)

Abstract
In this paper, we study the handover (HO) and power allocation problem in a two-tier heterogeneous network (HetNet), which consists of a macro base station and some millimeter-wave (mmWave) small base stations. We establish an HO management and power allocation scheme to maximize the overall throughput while reducing the HO frequency. In particular, considering the interrelationship among decisions made by different user equipments (UEs), we first model the HO and power allocation problem as a fully cooperative multi-agent task, in which all agents, i.e., UEs, share the same objective. Then, to solve the multi-agent task and obtain decentralized policies for each UE, we develop a multi-agent reinforcement learning (MARL) algorithm based on the proximal policy optimization (PPO) method by introducing the centralized training with decentralized execution framework. That is, we use global information to train policies for each UE, and after training is finished, each UE obtains a decentralized policy that can be executed based only on its own local observation. Specifically, we introduce a counterfactual baseline to address the credit assignment problem in centralized learning. Owing to the centralized training, the decentralized policies learned by multi-agent PPO (MAPPO) can work more cooperatively. Finally, simulation results demonstrate that our method achieves better performance than existing works.
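The abstract names two ingredients, a counterfactual baseline for credit assignment and the PPO objective, without giving their formulas. The sketch below illustrates the commonly used forms of both: a COMA-style baseline that marginalizes one agent's action under its own policy while the other agents' actions stay fixed, and the standard PPO clipped surrogate. The function names, the `clip_eps` value, and all numbers are hypothetical and chosen for illustration; the paper's exact formulation may differ.

```python
import math


def counterfactual_advantage(q_values, policy_probs, taken_action):
    """COMA-style advantage for a single agent (assumed form, not from the paper).

    q_values[k]     : centralized critic's value of the joint action in which
                      this agent picks action k and the other agents' actions
                      are held fixed.
    policy_probs[k] : this agent's current probability of picking action k.
    """
    # Baseline: expected critic value over this agent's own actions only.
    baseline = sum(p * q for p, q in zip(policy_probs, q_values))
    return q_values[taken_action] - baseline


def ppo_clip_loss(new_logp, old_logp, advantage, clip_eps=0.2):
    """Negative PPO clipped surrogate for one sample (to be minimized)."""
    ratio = math.exp(new_logp - old_logp)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return -min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical example: one UE choosing among three candidate actions.
q = [1.0, 2.0, 4.0]        # critic values, other UEs' actions fixed
pi = [0.5, 0.25, 0.25]     # the UE's current policy
adv = counterfactual_advantage(q, pi, taken_action=2)
loss = ppo_clip_loss(math.log(0.5), math.log(0.25), adv)
print(adv, loss)
```

The baseline depends only on the other agents' actions, so subtracting it isolates the contribution of the one agent's choice. During decentralized execution only the per-UE policy is needed; the centralized critic is used in training alone.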
Keywords
Handover, multi-agent deep reinforcement learning, power allocation, HetNet, mmWave