Reinforcement Learning When All Actions are Not Always Available

Yash Chandak
Yash Chandak
Blossom Metevier
Blossom Metevier

CoRR, 2019.

Cited by: 0|Bibtex|Views19
EI
Other Links: dblp.uni-trier.de|arxiv.org

Abstract:

The Markov decision process (MDP) formulation used to model many real-world sequential decision making problems does not capture the setting where the set of available decisions (actions) at each time step is stochastic. Recently, the stochastic action set Markov decision process (SAS-MDP) formulation has been proposed, which captures t...More

Code:

Data:

Full Text
Your rating :
0

 

Tags
Comments