Informed Policies: Overcoming Subpolicy Dependence

International Conference on Robotics and Automation (2019)

Abstract
Hierarchical reinforcement learning addresses some of the difficulties that reinforcement learning has with long time horizon tasks by decomposing them into subtasks that are then solved separately. This formulation has the limitation that the optimal solutions to the subtasks often do not combine into a globally optimal solution for the overall task. This paper seeks to address this subpolicy independence problem through the use of informed policies. By passing information from subsequent subtask policies back into the current subtask policy, the disconnect between consecutive subtasks can be bridged and the subtask policy can be considered informed. Using this method on an inverted pendulum task, we show that informed policies can outperform uninformed policies.
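The abstract only sketches the mechanism, so the following is a minimal Python illustration of the interface it implies: the current subtask policy receives an extra input derived from the subsequent subtask policy. The Subpolicy class, its linear-tanh form, and the choice of a scalar value estimate as the passed information are assumptions made for illustration, not the paper's actual architecture or training procedure.

```python
import numpy as np

# Hypothetical subpolicy used only for illustration; the paper does not
# specify the policy class or how the informed variant is trained.
class Subpolicy:
    def __init__(self, obs_dim, act_dim, rng):
        self.W = rng.normal(scale=0.1, size=(act_dim, obs_dim))

    def value(self, state):
        # Stand-in scalar estimate of how good `state` is under this subpolicy.
        return float(np.tanh(self.W @ state).sum())

    def act(self, obs):
        return np.tanh(self.W @ obs)


def uninformed_action(policy, state):
    # Baseline: the current subpolicy sees only its own observation.
    return policy.act(state)


def informed_action(policy, next_policy, state):
    # Informed variant (in the sense of the abstract): augment the current
    # subpolicy's input with information flowing back from the subsequent
    # subtask policy, here its value estimate of the current state.
    hint = np.array([next_policy.value(state)])
    return policy.act(np.concatenate([state, hint]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = rng.normal(size=4)
    next_policy = Subpolicy(obs_dim=4, act_dim=1, rng=rng)
    # The informed subpolicy's input is one element wider to accept the hint.
    informed = Subpolicy(obs_dim=5, act_dim=1, rng=rng)
    print(informed_action(informed, next_policy, state))
```

In this sketch the hint lets the current subtask policy trade off its own objective against how favorable its terminal states are for the next subtask, which is the disconnect the informed-policy idea is meant to bridge.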
Keywords
policies, subpolicy dependence