Reward-Free Exploration for Reinforcement Learning
ICML, pp. 4870-4879, 2020.
Our planning procedure can be instantiated by any black-box approximate planner, such as value iteration or natural policy gradient.
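As a rough illustration of what a black-box planner could look like in the tabular case, the sketch below runs finite-horizon value iteration on a known transition model. It is not the paper's procedure: the function name `value_iteration`, the array layout `P[s, a, s']`, the step-independent reward `r[s, a]`, and the two-state toy MDP are all assumptions made for this example. In a reward-free setting one would presumably pass in a transition model estimated from the exploration data together with whichever reward function is supplied afterwards.

```python
import numpy as np


def value_iteration(P, r, H):
    """Finite-horizon value iteration on a tabular MDP (illustrative sketch).

    P: array of shape (S, A, S), P[s, a, s2] = probability of s -> s2 under a.
    r: array of shape (S, A), reward for taking action a in state s.
    H: horizon (number of steps per episode).
    Returns (pi, V): greedy policy of shape (H, S) and values of shape (H + 1, S).
    """
    S, A, _ = P.shape
    V = np.zeros((H + 1, S))          # V[H] = 0: no reward after the last step
    pi = np.zeros((H, S), dtype=int)
    for h in range(H - 1, -1, -1):    # backward induction over steps
        Q = r + P @ V[h + 1]          # (S, A, S) @ (S,) -> (S, A)
        pi[h] = Q.argmax(axis=1)
        V[h] = Q.max(axis=1)
    return pi, V


# Hypothetical toy MDP: state 1 is absorbing and pays reward 1 per step;
# in state 0, action 0 stays put and action 1 moves to state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 1] = 1.0
r = np.array([[0.0, 0.0],
              [1.0, 1.0]])

pi, V = value_iteration(P, r, H=3)
print(V[0])      # optimal values at the first step
print(pi[0, 0])  # greedy first action in state 0
```

Because the planner only consumes a model and a reward array, it could be swapped for any other approximate planner with the same inputs, which is the sense in which it is treated as a black box here.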
Exploration is widely regarded as one of the most challenging aspects of reinforcement learning (RL), with many naive approaches succumbing to exponential sample complexity. To isolate the challenges of exploration, we propose a new "reward-free RL" framework. In the exploration phase, the agent first collects trajectories from an MDP $...