Multiexperience-Assisted Efficient Multiagent Reinforcement Learning

IEEE Transactions on Neural Networks and Learning Systems (2024)

Abstract
Recently, multiagent reinforcement learning (MARL) has shown great potential for learning cooperative policies in multiagent systems (MASs). However, a noticeable drawback of current MARL is its low sample efficiency, which requires a huge number of interactions with the environment. This interaction burden greatly hinders the real-world application of MARL. Fortunately, effectively incorporating experience knowledge can help MARL quickly find effective solutions, which significantly alleviates the drawback. In this article, a novel multiexperience-assisted reinforcement learning (MEARL) method is proposed to improve the learning efficiency of MASs. Specifically, monotonicity-constrained reward shaping is designed using expert experience to provide additional individual rewards that guide multiagent learning efficiently, while guaranteeing that the team optimization objective remains invariant. Furthermore, a reward distribution estimator is developed to model the implicit reward distribution of the environment from transition experience, i.e., collected samples of state-action pairs, rewards, and next states. This estimator predicts the expected reward of each agent for the action taken, enabling accurate estimation of the state value function and accelerating its convergence. The performance of MEARL is evaluated on two multiagent environment platforms: our designed unmanned aerial vehicle combat (UAV-C) and StarCraft II Micromanagement (SCII-M). Simulation results demonstrate that the proposed MEARL greatly improves the learning efficiency and performance of MASs and outperforms state-of-the-art methods on multiagent tasks.
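The abstract does not give implementation details, but a minimal sketch may help convey the reward-distribution-estimator idea: a per-agent model regressed on transition experience (state, action, reward) to predict the expected reward of the action taken. The network architecture, names, and training step below are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of a per-agent reward estimator trained on replayed
# transitions (obs, action, reward); shapes and hyperparameters are assumed.
import torch
import torch.nn as nn


class RewardEstimator(nn.Module):
    """Predicts the expected reward of an agent for a taken discrete action."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_actions, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar expected reward
        )

    def forward(self, obs: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # obs: (batch, obs_dim) floats; action: (batch,) long action indices
        a_onehot = nn.functional.one_hot(action, self.n_actions).float()
        return self.net(torch.cat([obs, a_onehot], dim=-1)).squeeze(-1)


def update(estimator, optimizer, obs, action, reward):
    """One regression step toward the observed rewards in a sampled batch."""
    pred = estimator(obs, action)
    loss = nn.functional.mse_loss(pred, reward)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, the learned expectation could stand in for noisy sampled rewards when estimating the state value function, which is consistent with the abstract's claim that the estimator accelerates value convergence.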
Keywords
Experience assist, multiagent reinforcement learning (MARL), reward distribution estimation, reward shaping