Policy Optimization with Model-based Explorations

Qingpeng Cai
Anxiang Zeng
Chun-Xiang Pan
Hua-Lin He

National Conference on Artificial Intelligence, 2019.


Abstract:

Model-free reinforcement learning methods such as the Proximal Policy Optimization (PPO) algorithm have been successfully applied to complex decision-making problems such as Atari games. However, these methods suffer from high variance and high sample complexity. On the other hand, model-based reinforcement learning methods that learn the tra...
