Multi-agent Adversarial Inverse Reinforcement Learning with Latent Variables

AAMAS '20: International Conference on Autonomous Agents and Multiagent Systems, Auckland, New Zealand, May 2020

Abstract
We introduce an algorithm for inferring reward functions from expert human trajectories in multiagent environments. Current techniques exhibit poor sample efficiency, lack stability in training, or scale poorly to large numbers of agents. We focus on settings with a large, variable number of agents and address the resulting challenges by exploiting similarities between agent behaviors. In particular, we learn a shared reward function using adversarial inverse reinforcement learning and a continuous latent variable. We demonstrate our algorithm on two real-world settings: traffic on highways and in terminal airspace.
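The abstract describes a single reward function shared across agents, conditioned on a continuous latent variable and trained inside an adversarial inverse reinforcement learning (AIRL) loop. The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of how such a latent-conditioned AIRL discriminator could be structured; the class names, network sizes, and the policy interface (`log_pi`) are assumptions, and the AIRL shaping term and the procedure for inferring the latent variable are omitted.

```python
# Hypothetical sketch (not the authors' implementation): a reward network
# shared by all agents, conditioned on a continuous latent z, used inside
# an AIRL-style discriminator. Dimensions and names are illustrative.
import torch
import torch.nn as nn

class LatentConditionedReward(nn.Module):
    """Shared reward r(s, a, z): one network reused by every agent,
    conditioned on a per-agent continuous latent variable z."""
    def __init__(self, obs_dim, act_dim, latent_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act, z):
        return self.net(torch.cat([obs, act, z], dim=-1)).squeeze(-1)

class AIRLDiscriminator(nn.Module):
    """AIRL-style discriminator D = exp(f) / (exp(f) + pi(a|s)),
    whose logit is f(s, a, z) - log pi(a|s); shaping term omitted."""
    def __init__(self, reward_fn):
        super().__init__()
        self.reward_fn = reward_fn

    def forward(self, obs, act, z, log_pi):
        f = self.reward_fn(obs, act, z)
        return f - log_pi  # logit of D; expert labeled 1, policy labeled 0

def discriminator_loss(disc, expert_batch, policy_batch):
    """Binary cross-entropy on expert (label 1) vs. policy (label 0) samples.
    Each batch is a tuple (obs, act, z, log_pi) of matching tensors."""
    bce = nn.BCEWithLogitsLoss()
    exp_logits = disc(*expert_batch)
    pol_logits = disc(*policy_batch)
    return bce(exp_logits, torch.ones_like(exp_logits)) + \
           bce(pol_logits, torch.zeros_like(pol_logits))
```

In a full algorithm of this kind, the latent z would typically be inferred per agent from its trajectory (e.g., by an encoder), and the learned reward would drive a policy update for the imitating agents; both components are left out of this sketch.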