Reward Predictive Representations Generalize Across Tasks in Reinforcement Learning

bioRxiv (2019)

Cited 24 | Viewed 33
Abstract
One question central to reinforcement learning is which representations can be generalized or re-used across different tasks. Existing algorithms that facilitate transfer are typically limited to cases in which the transition function or the optimal policy is portable to new contexts, but achieving the "deep transfer" characteristic of human behavior has been elusive. This article demonstrates that model reductions that minimize error in predictions of reward outcomes generalize across tasks with different transition and reward functions. Such state representations compress the state space of a task into a lower-dimensional representation by combining states that are equivalent in terms of both the transition and reward functions. Because only state equivalences are considered, the resulting state representation is not tied to the transition and reward functions themselves and thus generalizes across tasks with different reward and transition functions. The presented results motivate further experiments to investigate whether humans or animals learn such a representation, and whether neural systems involved in state representation reflect the modeled equivalence relations.
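
The kind of state equivalence described above can be illustrated with a small, hypothetical check on a tabular MDP. The sketch below is not the paper's implementation; the function name, array layout, and tolerance are assumptions, and it uses one standard formalization of such model reductions: states merged into the same abstract state must agree, for every action, on the expected one-step reward and on the probability of transitioning into each abstract state. Under that condition, reward outcomes remain predictable from the compressed representation alone.

```python
import numpy as np

def is_model_reduction(T, R, phi, atol=1e-8):
    """Check whether a candidate state aggregation `phi` is a valid
    model reduction of a tabular MDP.

    T   : array of shape (S, A, S), transition probabilities T[s, a, s']
    R   : array of shape (S, A), expected one-step rewards R[s, a]
    phi : array of shape (S,), mapping each state to an abstract-state index
    """
    S, A, _ = T.shape
    n_abstract = phi.max() + 1

    # Aggregate transition probabilities over abstract states:
    # T_abs[s, a, k] = P(phi(s') = k | s, a)
    one_hot = np.eye(n_abstract)[phi]      # (S, n_abstract)
    T_abs = T @ one_hot                    # (S, A, n_abstract)

    for k in range(n_abstract):
        members = np.where(phi == k)[0]
        if len(members) < 2:
            continue
        ref = members[0]
        for s in members[1:]:
            # Merged states must match on rewards and on aggregated transitions.
            if not np.allclose(R[s], R[ref], atol=atol):
                return False
            if not np.allclose(T_abs[s], T_abs[ref], atol=atol):
                return False
    return True

# Toy example: two states with identical rewards and identical transition
# behavior once they are grouped together, so both collapse into one
# abstract state.
T = np.array([[[0.5, 0.5]], [[0.5, 0.5]]])  # shape (2, 1, 2)
R = np.array([[1.0], [1.0]])                # shape (2, 1)
phi = np.array([0, 0])
print(is_model_reduction(T, R, phi))        # True
```

Because the check only compares states with one another, the resulting partition does not encode the transition or reward functions themselves, which is why it can be carried to a new task where those functions differ.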