M3: Modularization for Multi-task and Multi-agent Offline Pre-training

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
Learning a multi-task policy is crucial in multi-agent reinforcement learning (MARL). Recent work has focused on online multi-task reinforcement learning, where a policy is jointly trained from scratch with the aim of generalizing well to few-shot or even zero-shot tasks. However, existing online methods require a very large number of interactions and are therefore unsuitable for environments where interactions are expensive. In this work, we introduce modularization for multi-task and multi-agent offline pre-training (M3) to learn high-level transferable policy representations. We argue that a discrete policy representation is critical for multi-task offline learning, and accordingly we use contexts as a task prompt to improve the adaptability of pre-trained models to various tasks. To disentangle the variation among multiple agents, which are heterogeneous and non-stationary even when they receive the same task, we employ an agent-invariant VQ-VAE to identify each agent. We encapsulate the pre-trained model as part of an online MARL algorithm and fine-tune it to improve generalization. We also theoretically analyze the generalization error of our method. We test the proposed method on the challenging StarCraft Multi-Agent Challenge (SMAC) tasks, and empirical results show that it outperforms state-of-the-art MARL methods in few-shot and even zero-shot settings across multiple tasks.
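
For readers unfamiliar with the discrete representation mentioned in the abstract, the following is a minimal sketch of the generic vector-quantization lookup at the core of any VQ-VAE: a continuous embedding is replaced by its nearest codebook entry. It is not the paper's agent-invariant variant; the function names, the codebook values, and the two-dimensional embeddings are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    embedding: Sequence[float],
    codebook: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Return the index and value of the codebook entry nearest to `embedding`.

    This is the standard nearest-neighbour step of vector quantization;
    training the codebook and the encoder is out of scope for this sketch.
    """
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(embedding, codebook[i]),
    )
    return index, list(codebook[index])


# Hypothetical codebook with three discrete codes (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# A continuous embedding, e.g. an encoder output, mapped to a discrete code.
index, code = quantize([0.9, 1.2], codebook)
print(index, code)
```

The returned index is the discrete symbol; downstream components would consume either the index or the corresponding codebook vector.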