Robustifying a Policy in Multi-Agent RL with Diverse Cooperative Behaviors and Adversarial Style Sampling for Assistive Tasks
CoRR (2024)
Abstract
Autonomous assistance of people with motor impairments is one of the most
promising applications of autonomous robotic systems. Recent studies have
reported encouraging results using deep reinforcement learning (RL) in the
healthcare domain. Previous studies showed that assistive tasks can be
formulated as multi-agent RL, wherein there are two agents: a caregiver and a
care-receiver. However, policies trained in multi-agent RL are often sensitive
to the policies of other agents. In such a case, a trained caregiver's policy
may not work for different care-receivers. To alleviate this issue, we propose
a framework that learns a robust caregiver's policy by training it for diverse
care-receiver responses. In our framework, diverse care-receiver responses are
autonomously learned through trial and error. In addition, to robustify the
caregiver's policy, we propose a strategy for sampling a care-receiver's
response in an adversarial manner during training. We evaluated the
proposed method using tasks in Assistive Gym. We demonstrate that policies
trained with a popular deep RL method are vulnerable to changes in the policies
of other agents and that the proposed framework improves robustness against
such changes.
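
The abstract does not specify how the adversarial sampling is implemented. As a rough illustration only, the idea of favoring care-receiver responses that the caregiver currently handles poorly could be sketched as below. Everything here is an assumption made for illustration and not the authors' method: the discrete set of care-receiver "styles", the softmax over negated returns, the running-average update, and all function and parameter names (`adversarial_style_probs`, `sample_style`, `update_mean_return`, `temperature`, `step_size`).

```python
import math
import random


def adversarial_style_probs(mean_returns, temperature=1.0):
    """Return sampling probabilities over care-receiver styles.

    Hypothetical rule: softmax over negated caregiver returns, so styles
    on which the caregiver earns a lower return are sampled more often.
    """
    scaled = [-r / temperature for r in mean_returns]
    largest = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - largest) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_style(mean_returns, temperature=1.0, rng=random):
    """Draw one style index according to the adversarial probabilities."""
    probs = adversarial_style_probs(mean_returns, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update_mean_return(mean_returns, style, episode_return, step_size=0.1):
    """Update the running estimate of the caregiver's return for one style."""
    mean_returns[style] += step_size * (episode_return - mean_returns[style])
```

In a training loop, one would call `sample_style` before each episode to pick the care-receiver response, run the episode, and then call `update_mean_return` with the caregiver's observed return. A lower `temperature` concentrates sampling on the hardest styles, while a higher one approaches uniform sampling over all styles.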