A Q-values Sharing Framework for Multiple Independent Q-learners

Adaptive Agents and Multi-Agent Systems (2019)

Abstract
Within a multiagent reinforcement learning (MARL) framework, cooperative agents can communicate with one another to accelerate joint learning. In the teacher-student paradigm applied in MARL, a more experienced agent (the advisor) can advise another agent (the advisee) on which action to take in a given state. However, when agents must cooperate with one another, the advisee may fail to coordinate well with others because their policies may have changed. It takes an advisee a long time to learn the same best actions the advisor has learned, especially when the amount of advice is limited. We propose a partaker-sharer advising framework (PSAF) for independent Q-learners with limited communication in cooperative MARL. In PSAF, multiple independent Q-learners share their maximum Q-values with one another at every time step, which accelerates the overall learning process. We perform experiments in the Predator-Prey domain and the Half Field Offense (HFO) game. The results show that our approach significantly outperforms existing advising methods.
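The abstract only outlines the mechanism, so the sketch below is a hypothetical, simplified reading of it: two tabular independent Q-learners, where a "partaker" may request a "sharer"'s maximum Q-value for a state and fold it into its own temporal-difference target, with both requests and responses limited by communication budgets. All class and function names (`QLearner`, `request_max_q`, the budget fields) are illustrative assumptions, not the paper's actual interface, and the paper's PSAF algorithm may differ in when and how sharing is triggered.

```python
import random
from collections import defaultdict

class QLearner:
    """Tabular independent Q-learner with budgets for asking/giving advice.
    (Illustrative sketch; not the paper's exact PSAF algorithm.)"""
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1,
                 ask_budget=100, give_budget=100):
        self.Q = defaultdict(float)      # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.ask_budget = ask_budget     # times this agent may request a Q-value
        self.give_budget = give_budget   # times this agent may share a Q-value

    def max_q(self, state):
        return max(self.Q[(state, a)] for a in self.actions)

    def act(self, state):
        # Standard epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])

    def update(self, state, action, reward, next_state, shared_max_q=None):
        # If a peer shared its maximum Q-value for next_state, bootstrap from
        # the larger of the local and shared estimates (one plausible way to
        # use a shared max Q-value; an assumption, not the paper's rule).
        local_max = self.max_q(next_state)
        target_max = (max(local_max, shared_max_q)
                      if shared_max_q is not None else local_max)
        td_target = reward + self.gamma * target_max
        self.Q[(state, action)] += self.alpha * (td_target - self.Q[(state, action)])

def request_max_q(partaker, sharer, state):
    """Partaker asks; sharer answers with its max Q-value if budgets allow."""
    if partaker.ask_budget > 0 and sharer.give_budget > 0:
        partaker.ask_budget -= 1
        sharer.give_budget -= 1
        return sharer.max_q(state)
    return None  # no communication left; learn independently
```

Once either budget is exhausted, `request_max_q` returns `None` and each agent falls back to ordinary independent Q-learning, which matches the abstract's limited-communication setting.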
Keywords
multiagent learning, Q-learning, reinforcement learning