Asymmetric Norms to Approximate the Minimum Action Distance

CoRR (2023)

Abstract
This paper presents a state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Unlike previous methods, our approach incorporates an asymmetric norm parametrization, enabling accurate approximations of minimum action distances in environments with inherent asymmetry. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning. To validate our approach, we conduct empirical experiments on both symmetric and asymmetric environments. Our results show that our asymmetric norm parametrization performs comparably to symmetric norms in symmetric environments and surpasses them in asymmetric environments.
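To make the idea concrete, below is a minimal sketch of how an asymmetric norm over an embedding difference could parametrize such a distance. The encoder architecture, the max-over-positive-parts norm, and the regression objective on trajectory pairs are illustrative assumptions, not the authors' exact parametrization.

```python
import torch
import torch.nn as nn

class AsymmetricMADEmbedding(nn.Module):
    """Sketch: embed states and measure an asymmetric distance between them.

    The distance d(s, g) = || phi(s) - phi(g) ||_asym uses an asymmetric norm,
    so d(s, g) need not equal d(g, s), matching environments where reaching g
    from s can take more actions than the reverse.
    """

    def __init__(self, state_dim: int, embed_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def asymmetric_norm(self, v: torch.Tensor) -> torch.Tensor:
        # One simple asymmetric norm on R^d: keep only the positive part of
        # each coordinate before aggregating with a max, so flipping the sign
        # of v changes the value. This specific choice is an assumption for
        # illustration, not necessarily the parametrization used in the paper.
        return torch.clamp(v, min=0.0).max(dim=-1).values

    def distance(self, s: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        return self.asymmetric_norm(self.encoder(s) - self.encoder(g))


def mad_loss(model: AsymmetricMADEmbedding,
             s_t: torch.Tensor, s_tk: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    # Hypothetical self-supervised objective: for state pairs (s_t, s_{t+k})
    # drawn from the same trajectory, the number of elapsed steps k upper-bounds
    # the minimum action distance, so we regress the embedded distance toward k.
    pred = model.distance(s_t, s_tk)
    return ((pred - k.float()) ** 2).mean()
```

A symmetric baseline (e.g. an L2 norm on the embedding difference) drops into the same training loop by replacing `asymmetric_norm`, which is how the symmetric-versus-asymmetric comparison in the abstract could be instantiated.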