Phasic Policy Gradient

Karl Cobbe
Jacob Hilton
Oleg Klimov
Other Links: arxiv.org

Abstract:

We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases. In prior methods, one must choose between using a shared network or separate networks to represent the policy and value function. U...
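
The phase separation mentioned in the abstract can be pictured with a toy schedule. This is only a minimal sketch under stated assumptions, not the paper's algorithm: the function name `phasic_schedule`, the phase labels, and the step counts are all hypothetical and chosen for illustration. It shows nothing more than the idea of alternating blocks of policy updates with blocks of value function updates instead of optimizing both in every step.

```python
from typing import Iterator, Tuple


def phasic_schedule(
    num_cycles: int, policy_steps: int, value_steps: int
) -> Iterator[Tuple[int, str]]:
    """Yield (cycle, phase) pairs that alternate two training phases.

    Hypothetical helper: the step counts are illustrative assumptions,
    not values taken from the paper.
    """
    for cycle in range(num_cycles):
        # Phase 1: a block of policy updates.
        for _ in range(policy_steps):
            yield cycle, "policy"
        # Phase 2: a block of value function updates.
        for _ in range(value_steps):
            yield cycle, "value"


# Two cycles, each with two policy steps followed by one value step.
schedule = list(phasic_schedule(num_cycles=2, policy_steps=2, value_steps=1))
```

In a real training loop, each yielded label would select which objective is optimized at that step; the truncated abstract does not specify those objectives, so they are left out here.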
