Corruption Robust Exploration in Episodic Reinforcement Learning

Other Links: arxiv.org

Abstract:

We initiate the study of multi-stage episodic reinforcement learning under adversarial manipulations in both the rewards and the transition probabilities of the underlying system. Existing efficient algorithms rely heavily on the "optimism under uncertainty" principle, which dictates their behavior and does not allow flexibility to perfo...
