Modeling Others using Oneself in Multi-Agent Reinforcement Learning.
ICML, pp. 4254-4263, 2018.
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players' hidden goals from their observed behavior in order to solve the tasks. We...