ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos
CoRR (2024)
Abstract
In this work, we aim to learn a unified vision-based policy for a
multi-fingered robot hand to manipulate different objects in diverse poses.
Though prior work has demonstrated that human videos can benefit policy
learning, performance improvement has been limited by physically implausible
trajectories extracted from videos. Moreover, reliance on privileged object
information such as ground-truth object states further limits the applicability
in realistic scenarios. To address these limitations, we propose a new
framework, ViViDex, to improve vision-based policy learning from human videos. It
first uses reinforcement learning with trajectory-guided rewards to train
state-based policies for each video, obtaining both visually natural and
physically plausible trajectories from the video. We then roll out successful
episodes from state-based policies and train a unified visual policy without
using any privileged information. A coordinate transformation method is
proposed to significantly boost the performance. We evaluate our method on
three dexterous manipulation tasks and demonstrate a large improvement over
state-of-the-art algorithms.
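
The abstract does not spell out the reward or the coordinate transformation, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes a trajectory-guided reward can be approximated by an exponential of the distance between the current position and a reference position taken from the video, and that the coordinate transformation re-expresses world-frame points in some local frame (for example one attached to the hand or the target). The names `tracking_reward`, `world_to_local`, and the parameter `scale` are hypothetical.

```python
import numpy as np


def tracking_reward(state_pos, ref_pos, scale=10.0):
    """Hypothetical trajectory-guided reward term.

    Returns a value in (0, 1] that decays with the Euclidean distance
    between the current position and the reference position.
    The exponential form and `scale` are assumptions for illustration.
    """
    state_pos = np.asarray(state_pos, dtype=float)
    ref_pos = np.asarray(ref_pos, dtype=float)
    dist = np.linalg.norm(state_pos - ref_pos)
    return float(np.exp(-scale * dist))


def world_to_local(points_world, rotation, translation):
    """Express world-frame points in a local frame.

    `rotation` (3x3) and `translation` (3,) give the pose of the local
    frame in the world frame. For a column vector p this computes
    R^T (p - t); with points stored as rows it becomes (p - t) @ R.
    """
    points_world = np.asarray(points_world, dtype=float)
    rotation = np.asarray(rotation, dtype=float)
    translation = np.asarray(translation, dtype=float)
    return (points_world - translation) @ rotation


if __name__ == "__main__":
    # Local frame rotated 90 degrees about z and shifted 1 unit along world x.
    R = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    t = np.array([1.0, 0.0, 0.0])
    print(world_to_local([[1.0, 1.0, 0.0]], R, t))
    print(tracking_reward([0.0, 0.0, 0.0], [0.0, 0.0, 0.1]))
```

Expressing observations in a local frame is a common way to make a single visual policy less sensitive to where the object sits in the workspace; whether this matches the frames used in ViViDex is not stated in the abstract.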