Protecting against evaluation overfitting in empirical reinforcement learning

Shimon Whiteson, Brian Tanner, Matthew E. Taylor, Peter Stone

Adaptive Dynamic Programming and Reinforcement Learning (2011)

Abstract

Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method.
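
To make the idea of a generalized methodology concrete, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy environment family (`sample_environment`) in which each environment has a hidden target and a reward scale spanning several orders of magnitude, and hypothetical "agents" that are just fixed guesses. Because raw scores from different environments are not commensurate here, the sketch summarizes performance by average rank across sampled environments; this is one plausible summary, chosen for illustration.

```python
import random

def sample_environment(rng):
    # Hypothetical environment family: a hidden target in [0, 1] and a reward
    # scale spanning four orders of magnitude, so raw scores are not comparable.
    return {"target": rng.uniform(0.0, 1.0), "scale": 10 ** rng.uniform(-2.0, 2.0)}

def evaluate(guess, env):
    # Score is the negative scaled distance between the guess and the target.
    return -env["scale"] * abs(guess - env["target"])

def average_ranks(score_table):
    # score_table maps agent name -> list of scores, one per environment.
    # Returns each agent's mean rank (1 = best); tied agents share the mean rank.
    names = list(score_table)
    n_envs = len(score_table[names[0]])
    totals = {name: 0.0 for name in names}
    for i in range(n_envs):
        scores = {name: score_table[name][i] for name in names}
        for name in names:
            better = sum(1 for other in names if scores[other] > scores[name])
            equal = sum(1 for other in names if scores[other] == scores[name])
            totals[name] += better + (equal + 1) / 2
    return {name: totals[name] / n_envs for name in names}

def run_generalized_evaluation(agents, n_envs=200, seed=0):
    # Evaluate every agent on the same environments drawn from the distribution.
    rng = random.Random(seed)
    envs = [sample_environment(rng) for _ in range(n_envs)]
    table = {name: [evaluate(guess, env) for env in envs]
             for name, guess in agents.items()}
    return average_ranks(table)

# "tuned" stands in for an agent overfit to one environment whose target is 0.9;
# "general" guesses the mean of the target distribution.
ranks = run_generalized_evaluation({"general": 0.5, "tuned": 0.9})
print(ranks)
```

An agent tuned to a single environment can look excellent on that environment yet rank worse on average once environments are sampled from the distribution. Ranking within each environment also keeps high-scale environments from dominating the summary, which a mean of raw scores would not.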

Keywords

generalisation (artificial intelligence), learning (artificial intelligence), evaluation overfitting, machine learning, range-adaptive tile coding method, reinforcement learning, remedy generalized methodology
