MA-TREX: Mutli-agent Trajectory-Ranked Reward Extrapolation via Inverse Reinforcement Learning
knowledge science, engineering and management, pp. 3-14, 2020.
Trajectory-ranked reward extrapolation (T-REX) provides a general framework to infer users’ intentions from sub-optimal demonstrations. However, it becomes inflexible when encountering multi-agent scenarios, due to its high complexity caused by rational behaviors, e.g., cooperation and communication. In this paper, we propose a novel Mult...更多
下载 PDF 全文 (上传PDF)