Structured Reinforcement Learning for Delay-Optimal Data Transmission in Dense mmWave Networks
arXiv (2024)
Abstract
We study the data packet transmission problem (mmDPT) in dense cell-free
millimeter wave (mmWave) networks, i.e., users sending data packet requests to
access points (APs) via uplinks and APs transmitting requested data packets to
users via downlinks. Our objective is to minimize the average delay in the
system due to APs' limited service capacity and unreliable wireless channels
between APs and users. This problem can be formulated as a restless multi-armed
bandit problem with a fairness constraint (RMAB-F). Since finding the optimal
policy for RMAB-F is intractable, existing learning algorithms are
computationally expensive and not suitable for practical dynamic dense mmWave
networks. In this paper, we propose a structured reinforcement learning (RL)
solution for mmDPT by exploiting the inherent structure encoded in RMAB-F. To
achieve this, we first design a low-complexity and provably asymptotically
optimal index policy for RMAB-F. Then, we leverage this structural information
to develop a structured RL algorithm called mmDPT-TS, which provably achieves
an Õ(√T) Bayesian regret. More importantly, mmDPT-TS is
computationally efficient and thus amenable to practical implementation, as it
fully exploits the structure of the index policy for making decisions. Extensive
emulations based on data collected in realistic mmWave networks demonstrate
significant gains of mmDPT-TS over existing approaches.