Don't Start from Scratch: Behavioral Refinement via Interpolant-based Policy Diffusion
arXiv (2024)
Abstract
Imitation learning empowers artificial agents to mimic behavior by learning
from demonstrations. Recently, diffusion models, which have the ability to
model high-dimensional and multimodal distributions, have shown impressive
performance on imitation learning tasks. These models learn to shape a policy
by diffusing actions (or states) from standard Gaussian noise. However, the
target policy to be learned is often significantly different from Gaussian and
this mismatch can result in poor performance when using a small number of
diffusion steps (to improve inference speed) and under limited data. The key
idea in this work is that initiating from a more informative source than
Gaussian enables diffusion methods to mitigate the above limitations. We
contribute theoretical results, a new method, and empirical findings that
show the benefits of using an informative source policy. Our method, which we
call BRIDGER, leverages the stochastic interpolants framework to bridge
arbitrary policies, thus enabling a flexible approach towards imitation
learning. It generalizes prior work in that standard Gaussians can still be
applied, but other source policies can be used if available. In experiments on
challenging simulation benchmarks and on real robots, BRIDGER outperforms
state-of-the-art diffusion policies. We provide further analysis on design
considerations when applying BRIDGER.
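The stochastic interpolants framework mentioned above can be illustrated with a minimal sketch: a path x_t that starts at a sample from an arbitrary source policy (t = 0) and ends at a target action (t = 1), plus a noise term that vanishes at both endpoints. The linear interpolant and the noise schedule gamma(t) = sigma * sqrt(t * (1 - t)) below are common illustrative choices, not necessarily the exact ones used by BRIDGER.

```python
import numpy as np

def stochastic_interpolant(x0, x1, t, sigma=0.5, rng=None):
    """Linear stochastic interpolant between a source-policy sample x0
    and a target action x1 at time t in [0, 1]. The noise scale
    gamma(t) = sigma * sqrt(t * (1 - t)) vanishes at both endpoints,
    so the path is pinned to x0 at t=0 and x1 at t=1."""
    rng = np.random.default_rng() if rng is None else rng
    gamma = sigma * np.sqrt(t * (1.0 - t))
    return (1.0 - t) * x0 + t * x1 + gamma * rng.standard_normal(x0.shape)

# Example: bridge a sample from a hypothetical informative source policy
# (a Gaussian centred near the expert action) to the target action.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=0.5, scale=0.2, size=(7,))  # source policy action
x1 = np.ones(7)                                 # target (expert) action
x_mid = stochastic_interpolant(x0, x1, t=0.5, rng=rng)
```

Because any source distribution can play the role of x0, a standard Gaussian is just one special case; replacing it with a more informative policy is what lets the bridging process start closer to the target.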