Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
arXiv (2024)
Abstract
Representation rank is an important concept for understanding the role of
Neural Networks (NNs) in Deep Reinforcement Learning (DRL): it measures the
expressive capacity of value networks. Existing studies focus on unboundedly
maximizing this rank; however, that approach would introduce overly complex
models into the learning process, thus undermining performance. Hence,
fine-tuning the representation rank presents a challenging yet crucial
optimization problem. To address this issue, we find a guiding principle for
adaptive control of the representation rank. We take the Bellman equation as a
theoretical foundation and derive an upper bound on the cosine similarity of
the representations of consecutive state-action pairs in value networks. We
then leverage this upper bound to propose a novel regularizer, the BEllman
Equation-based automatic rank Regularizer (BEER). This regularizer adaptively
regularizes the representation rank, thereby improving the DRL agent's
performance. We first validate the effectiveness of automatic rank control in
illustrative experiments, then scale BEER up to complex continuous control
tasks by combining it with the deterministic policy gradient method. Across 12
challenging DeepMind Control tasks, BEER outperforms the baselines by a large
margin. Moreover, BEER demonstrates significant advantages in Q-value
approximation. Our code is available at
https://github.com/sweetice/BEER-ICLR2024.
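The core mechanism described above — penalizing the value network only when the cosine similarity of consecutive state-action representations exceeds a Bellman-derived upper bound — can be sketched in minimal NumPy. This is an illustrative toy, not the paper's implementation: the function names and the way the bound and coefficient are supplied (`upper_bound`, `coef`) are assumptions; the paper derives the bound from the Bellman equation rather than taking it as a free parameter.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two representation vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def beer_penalty(phi_t, phi_tp1, upper_bound, coef=1.0):
    """Hinge-style regularization term (hypothetical sketch).

    phi_t, phi_tp1: representations of consecutive state-action pairs,
        e.g. the penultimate-layer features of a value network.
    upper_bound: placeholder for the Bellman-equation-derived bound on
        the cosine similarity (in the paper, computed from TD quantities,
        not a constant).
    The penalty activates only when the similarity exceeds the bound,
    so regularization strength adapts to the current representations.
    """
    sim = cosine_similarity(phi_t, phi_tp1)
    return coef * max(sim - upper_bound, 0.0)
```

In practice such a term would be added to the critic's TD loss; identical consecutive representations (similarity 1.0) are penalized whenever the bound is below 1, while sufficiently dissimilar ones incur no penalty.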