NA^2Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning
ICML 2023
Abstract
Value decomposition is widely used in cooperative multi-agent reinforcement
learning; however, its implicit credit assignment mechanism is not yet fully
understood due to black-box networks. In this work, we study an interpretable
value decomposition framework via the family of generalized additive models. We
present a novel method, named Neural Attention Additive Q-learning
(NA^2Q), providing inherent intelligibility of collaboration
behavior. NA^2Q can explicitly factorize the optimal joint
policy into individual policies, using enriched shape functions to model all
possible coalitions of agents. Moreover, we construct identity semantics to
promote estimating credits together with the global state and individual value
functions, where local semantic masks help us diagnose whether each agent
captures task-relevant information. Extensive experiments show that
NA^2Q consistently achieves superior performance compared to
different state-of-the-art methods on all challenging tasks, while yielding
human-like interpretability.
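
The additive structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a generalized-additive form in which the joint value is a bias plus a credit-weighted sum of shape-function outputs, one per coalition of agents. The function names (`softmax`, `additive_q_tot`), the truncation to coalitions of size one and two, the toy shape functions, and the use of fixed logits in place of attention weights computed from the global state and identity semantics are all assumptions made for illustration.

```python
import itertools
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def additive_q_tot(agent_qs, shape_fns, credit_logits, bias=0.0):
    """Hypothetical GAM-style mixer.

    Computes Q_tot = bias + sum_k credit_k * f_k(Q-values of coalition k),
    where coalitions are all agent subsets of size 1 and 2 (an assumed
    truncation; the abstract speaks of all possible coalitions).
    Returns the joint value and the per-coalition contributions.
    """
    n = len(agent_qs)
    coalitions = [
        c for order in (1, 2) for c in itertools.combinations(range(n), order)
    ]
    # Assumed stand-in for attention: a softmax over fixed per-coalition logits.
    credits = softmax([credit_logits[c] for c in coalitions])
    contributions = {}
    for coalition, credit in zip(coalitions, credits):
        f = shape_fns[len(coalition)]  # one toy shape function per order
        contributions[coalition] = credit * f(*(agent_qs[i] for i in coalition))
    return bias + sum(contributions.values()), contributions


# Toy inputs: three agents, identity and product shape functions, equal logits.
agent_qs = [1.0, 2.0, 3.0]
shape_fns = {1: lambda q: q, 2: lambda qa, qb: qa * qb}
credit_logits = {
    c: 0.0 for order in (1, 2) for c in itertools.combinations(range(3), order)
}
q_tot, contributions = additive_q_tot(agent_qs, shape_fns, credit_logits)
for coalition, value in contributions.items():
    print(coalition, round(value, 4))
print("Q_tot:", round(q_tot, 4))
```

Because every term is tied to a named coalition, the per-coalition contributions can be read off directly, which is the sense in which an additive mixer is easier to inspect than a black-box one. In a learned model the shape functions and credits would be neural networks rather than the fixed toy functions used here.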
Keywords
neural attention additive model, interpretable