Multi-Step Hindsight Experience Replay with Bias Reduction for Efficient Multi-Goal Reinforcement Learning

2023 International Conference on Frontiers of Robotics and Software Engineering (FRSE) (2023)

Abstract
Multi-goal reinforcement learning has emerged as a powerful approach for planning and robot manipulation tasks, but it faces challenges such as sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) addresses these challenges by relabeling goals, but it still requires a large number of samples and significant computation. To address these issues, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeling to improve sample efficiency. Despite the advantages of $n$-step relabeling, we prove theoretically and experimentally that the off-policy $n$-step bias it introduces may lead to poor performance in many environments. To address this issue, two bias-reduced MHER algorithms, MHER($\lambda$) and Model-based MHER (MMHER), are presented. MHER($\lambda$) exploits the $\lambda$-return, while MMHER benefits from model-based value expansion. Experimental results on numerous multi-goal robotic tasks show that our solutions successfully alleviate the off-policy $n$-step bias and achieve significantly higher sample efficiency than previous multi-goal RL baselines, with little additional computation beyond HER.
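The two ingredients named in the abstract can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: `n_step_target` is the standard $n$-step bootstrapped target (computed along a trajectory whose rewards have been recomputed for a relabeled goal), and `lambda_target` shows the $\lambda$-return idea behind MHER($\lambda$), an exponentially weighted mix of 1- to $n$-step targets that damps the bias of long, purely off-policy lookaheads. All names (`rewards`, `q_values`, `gamma`, `lam`) are illustrative assumptions.

```python
import numpy as np

def n_step_target(rewards, q_values, gamma, n):
    """n-step bootstrapped target:
    sum_{i=0}^{n-1} gamma^i * r_i + gamma^n * Q(s_n, g).
    `rewards[i]` is the (relabeled-goal) reward at step i;
    `q_values[n]` is the critic's value at the bootstrap state."""
    ret = sum(gamma ** i * rewards[i] for i in range(n))
    return ret + gamma ** n * q_values[n]

def lambda_target(rewards, q_values, gamma, lam, n_max):
    """lambda-return over 1..n_max step targets: weight the n-step
    target by lam^(n-1), then normalize the weights to sum to 1.
    lam=0 recovers the 1-step target; larger lam trusts longer
    (more biased) lookaheads more."""
    targets = [n_step_target(rewards, q_values, gamma, n)
               for n in range(1, n_max + 1)]
    weights = np.array([lam ** (n - 1) for n in range(1, n_max + 1)])
    weights /= weights.sum()
    return float(np.dot(weights, targets))
```

For example, with `gamma=0.5`, rewards `[1, 1, 1]`, and a bootstrap value of `2.0` at step 3, the 3-step target is `1 + 0.5 + 0.25 + 0.125 * 2 = 2.0`; setting `lam=0` makes `lambda_target` collapse to the 1-step target, which is the low-bias end of the trade-off the paper exploits.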
Keywords
Multi-goal Reinforcement Learning, hindsight experience replay, off-policy evaluation