Potential Driven Reinforcement Learning for Hard Exploration Tasks

IJCAI 2020

Abstract
Experience replay plays a crucial role in Reinforcement Learning (RL), enabling the agent to remember and reuse past experience. Most previous methods sample experience transitions using simple heuristics, such as uniform sampling or prioritizing the good transitions. Since humans can learn from both good and bad experiences, more sophisticated experience replay algorithms are needed. Inspired by potential energy in physics, this work introduces the artificial potential field into experience replay and develops Potentialized Experience Replay (PotER), a new and effective sampling algorithm for RL in hard-exploration tasks with sparse rewards. PotER defines a potential energy function for each state in experience replay and helps the agent learn from both good and bad experiences through intrinsic state supervision. PotER can be combined with different RL algorithms as well as with the self-imitation learning algorithm. Experimental analyses and comparisons on multiple challenging hard-exploration environments verify its effectiveness and efficiency.
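
The abstract does not specify the potential energy function or the sampling rule. The sketch below is a minimal, hypothetical illustration of the general idea: each stored state is assigned a potential, and transitions are replayed with probability proportional to that potential, so both rewarding and unrewarding experiences near interesting regions are revisited. The class name, the inverse-distance potential toward the highest-reward state seen so far, and all parameters are assumptions for illustration only, not the paper's actual method.

```python
import numpy as np


class PotentializedReplayBuffer:
    """Minimal, hypothetical sketch of a potential-based replay buffer.

    PotER assigns a potential energy to each stored state and samples
    accordingly; the attractive potential used here (inverse distance to the
    highest-reward state observed so far) is an illustrative assumption,
    not the paper's definition.
    """

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.transitions = []        # (state, action, reward, next_state, done)
        self.best_state = None       # state that yielded the highest reward so far
        self.best_reward = -np.inf

    def potential(self, state):
        # Hypothetical potential: states close to the best observed state
        # get a higher value and are therefore replayed more often.
        if self.best_state is None:
            return 1.0
        dist = np.linalg.norm(np.asarray(state, dtype=float) -
                              np.asarray(self.best_state, dtype=float))
        return 1.0 / (1.0 + dist)

    def add(self, state, action, reward, next_state, done):
        # Track the most rewarding state seen so far as the "attractor".
        if reward > self.best_reward:
            self.best_reward, self.best_state = reward, next_state
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
        self.transitions.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample transitions with probability proportional to the potential
        # of their states, so the agent revisits both good and bad
        # experiences that lie near interesting regions of the state space.
        weights = np.array([self.potential(s) for (s, *_rest) in self.transitions])
        probs = weights / weights.sum()
        idx = np.random.choice(len(self.transitions), size=batch_size, p=probs)
        return [self.transitions[i] for i in idx]
```

Under these assumptions, the buffer can replace a uniform replay buffer in any off-policy RL loop: store transitions with `add` and draw minibatches with `sample` for the update step.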