Safe Policy Learning for Continuous Control
2019.
Abstract:
We study continuous-action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that keep the agent in desirable situations, both during training and at convergence. We formulate these problems as constrained Markov decision processes (CMDPs) a...