Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

NIPS, 2017.

Cited by: 101|Bibtex|Views80
EI
Other Links: dblp.uni-trier.de|academic.microsoft.com|arxiv.org

Abstract:

Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. This paper examines, both theoretically and empirically, approaches to merging on- and off-polic...More

Code:

Data:

Full Text
Your rating :
0

 

Tags
Comments