Asymmetric Norms to Approximate the Minimum Action Distance
CoRR (2023)
Abstract
This paper presents a state representation for reward-free Markov decision
processes. The idea is to learn, in a self-supervised manner, an embedding
space where distances between pairs of embedded states correspond to the
minimum number of actions needed to transition between them. Unlike previous
methods, our approach incorporates an asymmetric norm parametrization, enabling
accurate approximations of minimum action distances in environments with
inherent asymmetry. We show how this representation can be leveraged to learn
goal-conditioned policies, providing a notion of similarity between states and
goals and a useful heuristic distance to guide planning. To validate the
approach, we run experiments on both symmetric and asymmetric
environments. The results show that the asymmetric norm parametrization
performs comparably to symmetric norms in symmetric environments and
outperforms them in asymmetric environments.
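
To make the idea of an asymmetric distance in embedding space concrete, here is a minimal sketch. It is not the parametrization used in the paper, which the abstract does not specify. It assumes a simple weighted asymmetric norm in which the positive and negative parts of a vector are scaled by different positive weights. The names `asymmetric_norm`, `embedded_distance`, `pos_w`, and `neg_w`, and the 2-D embeddings, are all hypothetical.

```python
import numpy as np


def asymmetric_norm(x, pos_w, neg_w):
    """Illustrative asymmetric norm (an assumption, not the paper's form).

    Positive components of x are weighted by pos_w and negative components
    by neg_w. With strictly positive weights this is positively homogeneous
    and satisfies the triangle inequality, but N(x) != N(-x) in general.
    """
    x = np.asarray(x, dtype=float)
    return float(np.sum(pos_w * np.maximum(x, 0.0) + neg_w * np.maximum(-x, 0.0)))


def embedded_distance(phi_s, phi_g, pos_w, neg_w):
    """Directed distance from state s to goal g, given their embeddings."""
    diff = np.asarray(phi_g, dtype=float) - np.asarray(phi_s, dtype=float)
    return asymmetric_norm(diff, pos_w, neg_w)


# Hypothetical embeddings of two states and hypothetical weights.
phi_a = np.array([0.0, 0.0])
phi_b = np.array([1.0, 2.0])
pos_w = np.array([1.0, 1.0])  # cost per unit of movement in the + direction
neg_w = np.array([3.0, 3.0])  # cost per unit of movement in the - direction

d_ab = embedded_distance(phi_a, phi_b, pos_w, neg_w)
d_ba = embedded_distance(phi_b, phi_a, pos_w, neg_w)
print(d_ab, d_ba)  # the two directions differ
```

In this toy setup, going from `phi_a` to `phi_b` is cheaper than going back, which a symmetric norm such as the Euclidean norm cannot express. In the setting the abstract describes, the embedding and the norm parameters would be learned so that the directed distance approximates the minimum number of actions between states.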