AoI Minimization using Multi-agent Proximal Policy Optimization in UAVs-assisted Sensor Networks

ICC 2023 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (2023)

Abstract
Unmanned Aerial Vehicle (UAV) swarms can be employed to collect time-sensitive data from ground sensors in remote and hostile areas. Inadequate design of the UAVs' trajectories and data collection schedule incurs delay and degrades the information freshness of the ground sensors. This paper aims to jointly optimize the trajectories and data collection schedules of multiple UAVs to minimize the average Age of Information (AoI), adapting to the AoI of the ground sensors and the trajectories of the UAVs. The optimization is formulated as a multi-agent Markov decision process (MMDP), where the network states consist of the AoI at the ground sensors and the flight trajectories. In practice, a multi-UAV-assisted sensor network involves a large number of network states and actions in the MMDP. Exploring the actions of multiple agents in a large state space results in considerable training uncertainty that destabilizes the AoI minimization. To stabilize training on the formulated MMDP, we propose an onboard Proximal Policy Optimization-based flight resource allocation scheme (PPO-FRAS), which performs on-policy learning to optimize the trajectories of the UAVs and the data collection schedule of the ground sensors. Numerical results show that the proposed PPO-FRAS achieves 28% and 59% lower AoI than the existing trajectory planning solution based on Deep Q-Network and the greedy algorithm, respectively.
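The abstract does not spell out the AoI model or the training objective, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. It assumes a common slotted AoI model (every sensor's age grows by one per slot and resets to 1 when its data is collected) and the standard per-sample PPO clipped surrogate. The function names `step_aoi`, `average_aoi`, and `ppo_clipped_term`, and the default `eps=0.2`, are hypothetical and not taken from the paper.

```python
from typing import List, Optional


def step_aoi(aoi: List[int], collected: Optional[int]) -> List[int]:
    """Advance all sensors by one slot (assumed slotted AoI model).

    The sensor whose data is collected in this slot resets to age 1;
    every other sensor ages by one slot. `collected=None` means that
    no sensor is served in this slot.
    """
    return [1 if i == collected else age + 1 for i, age in enumerate(aoi)]


def average_aoi(schedule: List[Optional[int]], num_sensors: int) -> float:
    """Time- and sensor-averaged AoI for a given collection schedule."""
    aoi = [1] * num_sensors  # assumption: all sensors start fresh
    total = 0
    for collected in schedule:
        aoi = step_aoi(aoi, collected)
        total += sum(aoi)
    return total / (len(schedule) * num_sensors)


def ppo_clipped_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample.

    `ratio` is pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] bounds how far one update can move the policy.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Two sensors over three slots: serve sensor 0, then sensor 1, then idle.
    print(average_aoi([0, 1, None], num_sensors=2))
    # A large ratio with a positive advantage is capped by the clip.
    print(ppo_clipped_term(ratio=1.5, advantage=2.0))
```

In a learning setup along these lines, the per-slot reward could be the negative of the summed AoI, so that maximizing return corresponds to minimizing average AoI; whether PPO-FRAS uses exactly this reward is not stated in the abstract.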
Keywords
UAVs, Age of information, Proximal policy optimization