Safety Augmented Value Estimation From Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
IEEE Robotics and Automation Letters (2020)
Abstract
Reinforcement learning (RL) for robotics is challenging due to the difficulty in hand-engineering a dense cost function, which can lead to unintended behavior, and dynamical uncertainty, which makes exploration and constraint satisfaction challenging. We address these issues with a new model-based reinforcement learning algorithm, Safety Augmented Value Estimation from Demonstrations (SAVED), which uses supervision that only identifies task completion and a modest set of suboptimal demonstrations to constrain exploration and learn efficiently while handling complex constraints. We then compare SAVED with 3 state-of-the-art model-based and model-free RL algorithms on 6 standard simulation benchmarks involving navigation and manipulation and a physical knot-tying task on the da Vinci surgical robot. Results suggest that SAVED outperforms prior methods in terms of success rate, constraint satisfaction, and sample efficiency, making it feasible to safely learn a control policy directly on a real robot in less than an hour. For tasks on the robot, baselines succeed less than 5% of the time while SAVED has a success rate of over 75% in the first 50 training iterations. Code and supplementary material are available at https://tinyurl.com/saved-rl.
Keywords
Task analysis, Heuristic algorithms, Cost function, Planning, Uncertainty, Robots, Trajectory, Reinforcement learning, imitation learning, optimal control