Gradient-based Planning with World Models
CoRR (2023)

Abstract
The enduring challenge in the field of artificial intelligence has been the
control of systems to achieve desired behaviours. While for systems governed by
straightforward dynamics equations, methods like Linear Quadratic Regulation
(LQR) have historically proven highly effective, most real-world tasks, which
require a general problem-solver, demand world models with dynamics that cannot
be easily described by simple equations. Consequently, these models must be
learned from data using neural networks. Most model predictive control (MPC)
algorithms designed for visual world models have traditionally explored
gradient-free population-based optimisation methods, such as Cross Entropy and
Model Predictive Path Integral (MPPI) for planning. However, we present an
exploration of a gradient-based alternative that fully leverages the
differentiability of the world model. In our study, we conduct a comparative
analysis between our method and other MPC-based alternatives, as well as
policy-based algorithms. In a sample-efficient setting, our method achieves
performance on par with or superior to the alternative approaches on most
tasks. Additionally, we introduce a hybrid model that combines policy networks
with gradient-based MPC, which outperforms pure policy-based methods, thereby
holding promise for gradient-based planning with world models in complex
real-world tasks.
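The core idea of the abstract can be sketched in miniature. The paper plans through a learned, differentiable world model; the toy below substitutes a known linear model (a double integrator, purely an assumption for illustration, not the paper's model) and optimises an H-step action sequence by gradient descent on the rollout cost, backpropagating through the rollout by hand. This is the gradient-based counterpart to sampling planners such as CEM or MPPI.

```python
import numpy as np

# Hedged sketch: gradient-based planning through a known linear "world model"
# x_{t+1} = A x_t + B a_t (a stand-in for a learned differentiable model).
# We minimise the rollout cost sum_t (||x_t||^2 + lam * ||a_t||^2) over the
# action sequence by gradient descent, computing gradients with a manual
# backward pass through the rollout. All names and constants are illustrative.

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])   # position-velocity dynamics
B = np.array([[0.0],
              [0.1]])        # actions push on velocity
lam = 0.01                   # action penalty weight
H = 20                       # planning horizon

def rollout_cost_and_grad(x0, actions):
    # Forward pass: roll the model out for H steps.
    xs = [x0]
    for t in range(H):
        xs.append(A @ xs[-1] + B @ actions[t])
    cost = sum(float(x @ x) for x in xs[1:]) + lam * float((actions ** 2).sum())
    # Backward pass: adjoint adj = dC/dx_{t+1}, propagated through A.
    grad = np.zeros_like(actions)
    adj = np.zeros(2)
    for t in reversed(range(H)):
        adj = adj + 2.0 * xs[t + 1]          # direct cost term at x_{t+1}
        grad[t] = B.T @ adj + 2.0 * lam * actions[t]
        adj = A.T @ adj                       # propagate to x_t
    return cost, grad

x0 = np.array([1.0, 0.0])      # start at position 1, velocity 0
actions = np.zeros((H, 1))     # initial plan: do nothing (cost = 20.0)
for step in range(200):
    cost, grad = rollout_cost_and_grad(x0, actions)
    actions -= 0.05 * grad     # gradient step on the action sequence

final_cost, _ = rollout_cost_and_grad(x0, actions)
```

With a learned neural world model, the hand-written backward pass would be replaced by automatic differentiation, but the planning loop has the same shape: roll out, score, backpropagate into the actions, and step.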