
R-max - a general polynomial time algorithm for near-optimal reinforcement learning

Journal of Machine Learning Research, no. 2 (2003): 213-231


Abstract

R-MAX is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-MAX, the agent always maintains a complete, but possibly inaccurate model of its environment and acts based on the optimal policy derived from this model. The model is initialized in an optimistic fashi...
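
The loop described in the abstract (keep a complete model, start it out optimistic, act greedily on the policy that is optimal for the current model) can be illustrated with a small sketch. This is not the paper's algorithm as published: it assumes a single-agent, tabular, discounted MDP and uses plain value iteration as the planner, whereas the abstract speaks of average reward. The names `RMaxAgent`, `chain_step`, `run_demo`, the visit threshold `m`, and the discount `gamma` are all hypothetical and chosen only for illustration.

```python
class RMaxAgent:
    """Simplified R-MAX-style agent (illustrative sketch, discounted tabular MDP)."""

    def __init__(self, n_states, n_actions, r_max, m=2, gamma=0.95):
        self.n_states, self.n_actions = n_states, n_actions
        self.r_max, self.m, self.gamma = r_max, m, gamma
        # Visit statistics for every (state, action) pair.
        self.counts = [[0] * n_actions for _ in range(n_states)]
        self.reward_sum = [[0.0] * n_actions for _ in range(n_states)]
        self.next_counts = [[[0] * n_states for _ in range(n_actions)]
                            for _ in range(n_states)]
        # Optimistic initialization: every pair is assumed to yield r_max forever.
        v_max = r_max / (1.0 - gamma)
        self.q = [[v_max] * n_actions for _ in range(n_states)]

    def known(self, s, a):
        # A pair counts as "known" after m visits (m is an assumed threshold).
        return self.counts[s][a] >= self.m

    def plan(self, sweeps=200):
        # Value iteration on the current model; unknown pairs keep the optimistic value.
        v_max = self.r_max / (1.0 - self.gamma)
        for _ in range(sweeps):
            for s in range(self.n_states):
                for a in range(self.n_actions):
                    if not self.known(s, a):
                        self.q[s][a] = v_max
                        continue
                    n = self.counts[s][a]
                    mean_r = self.reward_sum[s][a] / n
                    exp_v = sum(c / n * max(self.q[s2])
                                for s2, c in enumerate(self.next_counts[s][a]) if c)
                    self.q[s][a] = mean_r + self.gamma * exp_v

    def act(self, s):
        # Greedy action under the current model (lowest index wins ties).
        return self.q[s].index(max(self.q[s]))

    def observe(self, s, a, r, s2):
        # Statistics are frozen once a pair is known; replan when a pair becomes known.
        if self.known(s, a):
            return
        self.counts[s][a] += 1
        self.reward_sum[s][a] += r
        self.next_counts[s][a][s2] += 1
        if self.known(s, a):
            self.plan()


def chain_step(s, a):
    """Hypothetical 3-state chain: action 1 moves right, action 0 resets to state 0."""
    if a == 1:
        return min(s + 1, 2), (1.0 if s == 2 else 0.0)
    return 0, 0.1


def run_demo(steps=100):
    agent = RMaxAgent(n_states=3, n_actions=2, r_max=1.0)
    s = 0
    for _ in range(steps):
        a = agent.act(s)
        s2, r = chain_step(s, a)
        agent.observe(s, a, r, s2)
        s = s2
    return [agent.act(state) for state in range(3)], agent
```

Calling `run_demo()` returns the greedy policy and the trained agent. Because unvisited pairs are valued at the maximum possible return, the greedy policy is drawn toward them, so exploration needs no separate mechanism; once every pair is known, the agent is simply acting optimally with respect to its learned model.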
