Coordination in Adversarial Sequential Team Games via Multi-Agent Deep Reinforcement Learning

arXiv (2019)

Abstract
Many real-world applications involve teams of agents that have to coordinate their actions to reach a common goal against potential adversaries. This paper focuses on zero-sum games where a team of players faces an opponent, as is the case, for example, in Bridge, collusion in poker, and collusion in bidding. The possibility for the team members to communicate before gameplay (that is, to coordinate their strategies ex ante) makes the use of behavioral strategies unsatisfactory. We introduce Soft Team Actor-Critic (STAC) as a solution to the team's coordination problem that does not require any prior domain knowledge. STAC allows team members to effectively exploit ex ante communication via exogenous signals that are shared among the team. STAC reaches near-optimal coordinated strategies both in perfectly observable and partially observable games, where previous deep RL algorithms fail to reach optimal coordinated behaviors.
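
To illustrate why independent (behavioral) strategies can be unsatisfactory when a team can coordinate ex ante, the sketch below uses a hypothetical toy game that is not taken from the paper and is not an implementation of STAC. Two team members each pick an action in {0, 1}, and the adversary guesses an action. The team earns 1 only if both members pick the same action and the adversary guesses the other one. The function names and the grid search are assumptions made for illustration only.

```python
from fractions import Fraction

# Hypothetical toy game (not from the paper): the team wins (payoff 1) only if
# both members play the same action AND the adversary guesses the other action.

def team_value_independent(p, q):
    """Team value when members mix independently.

    p, q: probabilities that member 1 and member 2 play action 0.
    The adversary best-responds by guessing the more likely matched action.
    """
    win_if_adversary_guesses_0 = (1 - p) * (1 - q)  # both members play 1
    win_if_adversary_guesses_1 = p * q              # both members play 0
    return min(win_if_adversary_guesses_0, win_if_adversary_guesses_1)

def team_value_with_signal(s):
    """Team value when a shared exogenous signal picks the joint action.

    s: probability that the signal tells both members to play 0.
    The members always match, so the adversary only chooses which action to guess.
    """
    return min(1 - s, s)

# Exact rational grid over mixing probabilities (a coarse search, not a solver).
grid = [Fraction(i, 100) for i in range(101)]

best_independent = max(team_value_independent(p, q) for p in grid for q in grid)
best_correlated = max(team_value_with_signal(s) for s in grid)

print("best independent value:", best_independent)
print("best value with shared signal:", best_correlated)
```

In this toy game the team's value under independent mixing is bounded by 1/4, whereas a shared fair coin that selects the joint action guarantees 1/2. The gap comes entirely from the shared signal, which is the kind of ex ante coordination the abstract refers to.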