StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning

Jinghuan Shang,Kumara Kahatapitiya,Xiang Li,Michael S. Ryoo

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIX（2022）

引用 0|浏览4

暂无评分

摘要

Reinforcement Learning (RL) can be considered as a sequence modeling task: given a sequence of past state-action-reward experiences, an agent predicts a sequence of next actions. In this work, we propose St ate- A ction- R eward Transformer ( StAR former) for visual RL, which explicitly models short-term state-action-reward representations (StAR-representations), essentially introducing a Markovian-like inductive bias to improve long-term modeling. Our approach first extracts StAR-representations by self-attending image state patches, action, and reward tokens within a short temporal window. These are then combined with pure image state representations—extracted as convolutional features, to perform self-attention over the whole sequence. Our experiments show that StARformer outperforms the state-of-the-art Transformer-based method on image-based Atari and DeepMind Control Suite benchmarks, in both offline-RL and imitation learning settings. StARformer is also more compliant with longer sequences of inputs. Our code is available at https://github.com/elicassion/StARformer .

查看译文

关键词

visual reinforcement learning,reinforcement learning,representations,state-action-reward

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要