SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems
CoRR (2024)
Abstract
Recent advances in multi-agent reinforcement learning (MARL) have opened up
vast application prospects, including swarm control of drones, collaborative
manipulation by robotic arms, and multi-target encirclement. However, potential
security threats during MARL deployment need more attention and thorough
investigation. Recent research reveals that an attacker can rapidly exploit
the victim's vulnerabilities and generate adversarial policies, leading to the
victim's failure in specific tasks, for example by reducing the winning rate of
a superhuman-level Go AI to around 20%. However, existing studies predominantly
focus on two-player competitive environments and assume that attackers possess
complete global state observation.
In this study, we unveil, for the first time, the capability of attackers to
generate adversarial policies even when restricted to partial observations of
the victims in multi-agent competitive environments. Specifically, we propose a
novel black-box attack (SUB-PLAY), which incorporates the concept of
constructing multiple subgames to mitigate the impact of partial observability
and suggests the sharing of transitions among subpolicies to improve the
exploitative ability of attackers. Extensive evaluations demonstrate the
effectiveness of SUB-PLAY under three typical partial observability
limitations. Visualization results indicate that adversarial policies induce
significantly different activations of the victims' policy networks.
Furthermore, we evaluate three potential defenses aimed at exploring ways to
mitigate security threats posed by adversarial policies, providing constructive
recommendations for deploying MARL in competitive environments.
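
The two ideas named above, splitting the attack into subgames and sharing transitions among subpolicies, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the subgame index is taken to be the number of victims the attacker currently observes, each subgame keeps its own replay buffer, and a transition is copied into the buffers of subgames within a fixed `share_radius`. The names `Transition`, `subgame_index`, `SharedSubgameBuffers`, and `share_radius` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple, Sequence


class Transition(NamedTuple):
    """One attacker experience tuple (hypothetical layout)."""
    obs: tuple
    action: int
    reward: float
    next_obs: tuple


def subgame_index(visible_mask: Sequence[bool]) -> int:
    """Assumption: a subgame is identified by how many victims are observed."""
    return sum(bool(v) for v in visible_mask)


class SharedSubgameBuffers:
    """One replay buffer per subgame, with sharing between nearby subgames."""

    def __init__(self, num_victims: int, share_radius: int = 1):
        # Subgames 0..num_victims: from "no victim observed" to "all observed".
        self.num_subgames = num_victims + 1
        self.share_radius = share_radius
        self.buffers = defaultdict(list)

    def add(self, visible_mask: Sequence[bool], transition: Transition) -> int:
        """Store a transition in its own subgame and in neighbouring ones."""
        k = subgame_index(visible_mask)
        lo = max(0, k - self.share_radius)
        hi = min(self.num_subgames - 1, k + self.share_radius)
        for j in range(lo, hi + 1):
            # Assumed sharing rule: copy to every subgame within share_radius.
            self.buffers[j].append(transition)
        return k


if __name__ == "__main__":
    store = SharedSubgameBuffers(num_victims=3, share_radius=1)
    t = Transition(obs=(0,), action=1, reward=0.5, next_obs=(1,))
    k = store.add([True, False, False], t)
    print("subgame:", k, "buffers filled:", sorted(store.buffers))
```

In this sketch each subpolicy would train only on its own buffer, so sharing gives rarely visited subgames more data at the cost of some mismatch in observability. A probabilistic or weighted sharing rule would be an equally plausible design.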