Observational Overfitting in Reinforcement Learning
International Conference on Learning Representations, 2020.
Abstract:
A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the Markov Decision Process (MDP). We provide a general framework for analyzing this scenario, which we use to design multiple synthe...