Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
CoRR (2024)
Abstract
To overcome the sim-to-real gap in reinforcement learning (RL), learned
policies must maintain robustness against environmental uncertainties. While
robust RL has been widely studied in single-agent regimes, it remains
understudied in multi-agent environments, despite the fact that the
challenges posed by environmental uncertainty are often exacerbated by
strategic interactions. This work focuses on learning in distributionally
robust Markov games (RMGs), a robust variant of standard Markov games, wherein
each agent aims to learn a policy that maximizes its own worst-case performance
when the deployed environment deviates within its own prescribed uncertainty
set. This results in a set of robust equilibrium strategies for all agents that
align with classic notions of game-theoretic equilibria. Assuming a
non-adaptive sampling mechanism from a generative model, we propose a
sample-efficient model-based algorithm (DRNVI) with finite-sample complexity
guarantees for learning robust variants of various notions of game-theoretic
equilibria. We also establish an information-theoretic lower bound for solving
RMGs, which confirms the near-optimal sample complexity of DRNVI with respect
to problem-dependent factors such as the size of the state space, the target
accuracy, and the horizon length.
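
To make the setup concrete, the robust objective can be written as follows. This is a sketch in generic notation (finite horizon H, nominal kernel P^0, and an uncertainty set \mathcal{U}^{\sigma_i} of radius \sigma_i for agent i); the paper's exact definitions, such as the divergence used to define the uncertainty set, may differ.

$$
V_{i,1}^{\pi}(s) \;=\; \inf_{P \,\in\, \mathcal{U}^{\sigma_i}(P^0)} \mathbb{E}_{\pi,\,P}\!\left[\sum_{h=1}^{H} r_{i,h}(s_h, \boldsymbol{a}_h) \,\middle|\, s_1 = s\right]
$$

A profile \pi^\star is then a robust Nash equilibrium if no agent can improve its own worst-case value by a unilateral deviation:

$$
V_{i,1}^{\pi^\star}(s) \;\ge\; V_{i,1}^{\pi_i,\, \pi_{-i}^\star}(s) \qquad \text{for all } i,\ \pi_i,\ s.
$$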
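
As a further illustration, below is a minimal, self-contained sketch of the core computation that model-based robust dynamic programming rests on: a finite-horizon robust Bellman backup over a total-variation (TV) uncertainty set. This is a single-agent simplification, not the paper's DRNVI; in the multi-agent case the per-state maximization is replaced by an equilibrium computation (e.g., an NE or CCE of the induced stage game). All function names and the choice of TV uncertainty are illustrative assumptions.

import numpy as np

def worst_case_expectation(p, V, sigma):
    """Minimize p' @ V over the TV ball {p' : 0.5 * ||p' - p||_1 <= sigma}.

    The adversary greedily moves up to `sigma` of probability mass from the
    highest-value next states onto the single lowest-value next state.
    """
    p_adv = p.astype(float).copy()
    worst = np.argmin(V)                       # state the adversary inflates
    budget = sigma
    for i in np.argsort(V)[::-1]:              # drain high-value states first
        if i == worst or budget <= 0:
            continue
        moved = min(p_adv[i], budget)
        p_adv[i] -= moved
        p_adv[worst] += moved
        budget -= moved
    return p_adv @ V

def robust_value_iteration(P, r, H, sigma):
    """Finite-horizon robust dynamic programming against a nominal model.

    P: (S, A, S) nominal transition kernel, r: (S, A) rewards in [0, 1].
    Returns per-step robust values V[h] and a greedy deterministic policy.
    """
    S, A, _ = P.shape
    V = np.zeros((H + 1, S))                   # V[H] = 0: terminal values
    pi = np.zeros((H, S), dtype=int)
    for h in range(H - 1, -1, -1):             # backward induction over steps
        Q = np.empty((S, A))
        for s in range(S):
            for a in range(A):
                Q[s, a] = r[s, a] + worst_case_expectation(P[s, a], V[h + 1], sigma)
        pi[h] = Q.argmax(axis=1)
        V[h] = Q.max(axis=1)
    return V, pi

# Toy usage: robust values weakly shrink as the uncertainty radius grows.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(4, 2))     # nominal kernel, S=4, A=2
r = rng.random((4, 2))
V0, _ = robust_value_iteration(P, r, H=5, sigma=0.0)
V1, _ = robust_value_iteration(P, r, H=5, sigma=0.2)
assert (V1 <= V0 + 1e-9).all()

The greedy mass-moving step is an exact primal solution of the inner minimization for TV balls: shifting mass m from one state to another costs exactly m in TV distance, so the adversary's budget is spent where it hurts the value most.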