Relay Hindsight Experience Replay: Self-guided continual reinforcement learning for sequential object manipulation tasks with sparse rewards

arXiv (2023)

Abstract
Learning with sparse rewards remains a challenging problem in reinforcement learning (RL). In particular, for sequential object manipulation tasks, the RL agent generally receives a reward only upon successful completion of the entire task, leading to low exploration efficiency. To address this sample inefficiency, we propose a novel self-guided continual RL framework, named Relay Hindsight Experience Replay (RHER). RHER decomposes a sequential task into several subtasks of increasing complexity, allowing the agent to start from the simplest subtask and gradually complete the full task. Crucially, a Self-Guided Exploration Strategy (SGES) is proposed that uses the already-learned policy of a simpler subtask to guide exploration of a more complex subtask. This strategy allows the agent to overcome the barrier of sparse rewards in sequential tasks and learn efficiently, stage by stage. As a result, the proposed RHER method achieves state-of-the-art performance on the benchmark tasks FetchPush and FetchPickAndPlace. Furthermore, the experimental results demonstrate the superiority and high efficiency of RHER on a variety of single-object and multi-object manipulation tasks (e.g., ObstaclePush, DrawerBox, and TStack). Finally, RHER can also learn a contact-rich task on a real robot from scratch within 250 episodes.
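The core mechanism described above, the Self-Guided Exploration Strategy, can be illustrated with a minimal sketch. It assumes a goal-conditioned, DDPG+HER-style agent with continuous actions in [-1, 1] and a Fetch-like observation layout; all names (gripper_near_object, reach_policy, manip_policy) and the observation layout are illustrative assumptions, not the paper's actual code. The idea: the policy already learned on the simpler subtask (reaching the object) guides the agent until that subtask is solved in the current state, after which the still-learning policy for the harder subtask (moving the object to the goal) explores with noise.

```python
import numpy as np

# Illustrative sketch of a Self-Guided Exploration Strategy (SGES);
# not the authors' implementation. Observation layout is assumed:
# obs[:3] = gripper position, obs[3:6] = object position.

def gripper_near_object(obs, threshold=0.05):
    """Check whether the simpler subtask (reach the object) is already solved."""
    return np.linalg.norm(obs[:3] - obs[3:6]) < threshold

def sges_action(obs, goal, reach_policy, manip_policy, noise_scale=0.2):
    """Pick an exploration action for the harder subtask.

    While the object has not been reached, the already-learned reach policy
    guides exploration; afterwards, the still-learning manipulation policy
    explores, with Gaussian noise added in both cases.
    """
    if not gripper_near_object(obs):
        action = reach_policy(obs, obs[3:6])   # guide: move the gripper to the object
    else:
        action = manip_policy(obs, goal)       # explore: move the object to the goal
    action = action + noise_scale * np.random.randn(*np.shape(action))
    return np.clip(action, -1.0, 1.0)

# Usage with random placeholder policies (stand-ins for learned actors):
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reach_policy = lambda obs, g: rng.uniform(-1, 1, size=4)
    manip_policy = lambda obs, g: rng.uniform(-1, 1, size=4)
    obs = rng.uniform(-1, 1, size=10)
    goal = rng.uniform(-1, 1, size=3)
    print(sges_action(obs, goal, reach_policy, manip_policy))
```

In this reading, the guidance step replaces random exploration of the already-mastered stage, so the agent spends its exploration budget on the part of the task it has not yet solved.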
Keywords
Deep reinforcement learning, Robotic manipulation, Continual learning, Hindsight experience replay, Sparse reward