Towards Physically Safe Reinforcement Learning under Supervision
arXiv: Learning, Volume abs/1901.06576, 2019.
Abstract:
This paper addresses the question of how a previously available control policy $\pi_s$ can be used as a supervisor to more quickly and safely train a new learned control policy $\pi_L$ for a robot. A weighted average of the supervisor and learned policies is used during trials, with a heavier weight initially on the supervisor, in order to ...
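The weighted-average idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes both policies are deterministic functions returning continuous action vectors, and the names `blended_action` and `supervisor_weight`, the linear decay, and the `start`/`end` values are all hypothetical choices made for illustration. The abstract states only that the supervisor is weighted more heavily at first.

```python
import numpy as np


def blended_action(state, pi_s, pi_l, weight_s):
    """Return the convex combination of supervisor and learned actions.

    weight_s is the weight on the supervisor policy pi_s; the learned
    policy pi_l receives the remaining 1 - weight_s.
    """
    if not 0.0 <= weight_s <= 1.0:
        raise ValueError("weight_s must lie in [0, 1]")
    a_s = np.asarray(pi_s(state), dtype=float)
    a_l = np.asarray(pi_l(state), dtype=float)
    return weight_s * a_s + (1.0 - weight_s) * a_l


def supervisor_weight(trial, num_trials, start=0.9, end=0.0):
    """Hypothetical linear schedule: heavy supervisor weight early on.

    The linear shape and the start/end values are assumptions; the
    abstract does not specify how the weight changes across trials.
    """
    frac = min(max(trial / max(num_trials - 1, 1), 0.0), 1.0)
    return start + (end - start) * frac
```

With a schedule like this, early trials are dominated by the supervisor's action, and the learned policy gains control as the weight on the supervisor decays.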