Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, pp. 461-494, 2018.
Abstract:
In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD) reinforcement learning methods can be formally derived, not by starting from their original objective functio...