Policy Optimization with Model-based Explorations
National Conference on Artificial Intelligence, 2019.
Model-free reinforcement learning methods such as the Proximal Policy Optimization algorithm (PPO) have been successfully applied to complex decision-making problems such as Atari games. However, these methods suffer from high variance and high sample complexity. On the other hand, model-based reinforcement learning methods that learn the tra...