Is the Policy Gradient a Gradient?

Chris Nota

AAMAS, pp. 939-947, 2020.

DOI: https://doi.org/abs/10.5555/3398761.3398871

Abstract:

The policy gradient theorem describes the gradient of the expected discounted return with respect to an agent's policy parameters. However, most policy gradient methods do not use the discount factor in the manner originally prescribed, and therefore do not optimize the discounted objective. It has been an open question in RL as to whic...
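
The distinction drawn in the abstract can be illustrated with a small sketch. It assumes a REINFORCE-style estimator in which each score term, grad log pi(a_t | s_t), is multiplied by a scalar weight. Under the policy gradient theorem for the discounted objective, that weight is gamma^t times the discounted return from step t; the commonly implemented variant drops the outer gamma^t factor. The function names, the flag `keep_outer_discount`, and the reward values are hypothetical and chosen only for illustration; this is not code from the paper.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = sum_{k >= t} gamma^(k - t) * r_k for every step t."""
    returns = []
    running = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def score_weights(rewards, gamma, keep_outer_discount):
    """Scalar weights on grad log pi(a_t | s_t) for one trajectory (illustrative).

    keep_outer_discount=True  -> gamma^t * G_t (form given by the theorem)
    keep_outer_discount=False -> G_t           (form used by most methods)
    """
    returns = discounted_returns(rewards, gamma)
    if keep_outer_discount:
        return [gamma ** t * g for t, g in enumerate(returns)]
    return list(returns)


# Hypothetical three-step trajectory with unit rewards.
rewards = [1.0, 1.0, 1.0]
gamma = 0.5

prescribed = score_weights(rewards, gamma, keep_outer_discount=True)
common = score_weights(rewards, gamma, keep_outer_discount=False)

print(prescribed)  # [1.75, 0.75, 0.25]
print(common)      # [1.75, 1.5, 1.0]
```

The two weightings agree at t = 0 and diverge afterwards: the prescribed form shrinks the contribution of later time steps by gamma^t, while the common form does not.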
