Improving Fast Adaptation for Newcomers in Multi-Robot Reinforcement Learning System.
SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI (2019)
Abstract
Multi-robot systems have been adopted as a form of ubiquitous intelligent system to perform critical tasks in various fields. In such systems, multi-agent reinforcement learning (MARL) is regarded as a promising technology for supporting decision-making. However, existing MARL approaches assume either a predefined system configuration or a unified model for agents with identical roles, and thus cannot effectively handle dynamic changes in the number of robots, which are very common in the real world. This "adaptation" problem seriously hinders the development of intelligence in multi-robot systems. In this paper, we propose a novel meta-MADDPG approach that enables new robots to integrate quickly into an existing multi-robot system. We build on the MADDPG (Multi-Agent Deep Deterministic Policy Gradient) algorithm and distill the meta-knowledge of a specific robot team by training a meta-actor and a meta-critic simultaneously. The meta-actor learns an experienced policy network that lets new robots take reasonable actions directly when the situation is urgent, while the meta-critic trains a value network that evaluates the current situation to guide the further improvement of new robots. Our experiments on a typical application case (multi-robot collision avoidance) indicate that the meta-knowledge significantly speeds up adaptation for newcomers. Our source code is available at https://github.com/liyiying/meta-MADDPG.
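The limitation noted above, that a predefined system configuration cannot absorb a change in team size, can be illustrated with a small sketch. It relies only on the standard MADDPG structure, in which a centralized critic scores the joint observations and actions of all agents. This is not the paper's meta-MADDPG implementation: the dimensions `OBS_DIM` and `ACT_DIM`, the toy linear critic, and all function names are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical per-robot sizes, chosen only for this illustration.
OBS_DIM, ACT_DIM = 4, 2


def critic_input_dim(n_agents, obs_dim=OBS_DIM, act_dim=ACT_DIM):
    """Width of the joint (observations, actions) vector a centralized critic consumes."""
    return n_agents * (obs_dim + act_dim)


def make_linear_critic(n_agents, rng):
    """Toy linear critic Q(x) = w . x, sized for a fixed team of n_agents."""
    return rng.standard_normal(critic_input_dim(n_agents))


def joint_input(observations, actions):
    """Concatenate every robot's observation, then every robot's action."""
    return np.concatenate([np.concatenate(observations), np.concatenate(actions)])


rng = np.random.default_rng(0)

# A critic built for a team of three robots evaluates that team without trouble.
w = make_linear_critic(3, rng)
obs3 = [rng.standard_normal(OBS_DIM) for _ in range(3)]
act3 = [rng.standard_normal(ACT_DIM) for _ in range(3)]
q3 = float(w @ joint_input(obs3, act3))

# When a fourth robot joins, the joint input grows and no longer fits the critic.
obs4 = obs3 + [rng.standard_normal(OBS_DIM)]
act4 = act3 + [rng.standard_normal(ACT_DIM)]
x4 = joint_input(obs4, act4)
shape_mismatch = x4.shape[0] != w.shape[0]
```

In this toy setting the critic sized for three robots cannot even accept the four-robot input, which is the kind of configuration dependence that motivates distilling team knowledge a newcomer can reuse.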
Keywords
multi-robot system, ubiquitous intelligence, fast adaptation, meta-learning, deep reinforcement learning