Optimizing over a Restricted Policy Class in MDPs

International Conference on Artificial Intelligence and Statistics, 2019.

Abstract:

We address the problem of finding an optimal policy in a Markov decision process (MDP) under a restricted policy class defined by the convex hull of a set of base policies. This problem is of great interest in applications in which a number of reasonably good (or safe) policies are already known and we are interested in optimizing in thei...
