Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return.

UAI, pp.63-72, (2018)

Cited by 5 | Views 44
EI

Abstract

Temporal-difference (TD) learning methods are widely used in reinforcement learning to estimate the expected return for each state, without a model, because of their significant advantages in computational and data efficiency. For many applications involving risk mitigation, it would also be useful to estimate the variance of the return by...

Authors
Craig Sherstan
Dylan R. Ashley
Brendan Bennett