Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning

CDC (2021)

Citations 3 | Views 78
Abstract
Accurately predicting the dynamics of robotic systems is crucial for model-based control and reinforcement learning. The most common way to estimate dynamics is to fit a one-step-ahead prediction model and use it to recursively propagate the predicted state distribution over long horizons. Unfortunately, this approach is known to compound even small prediction errors, making long-term predictions inaccurate. In this paper, we propose a new parametrization of supervised learning on state-action data that predicts stably at longer horizons, which we call a trajectory-based model. This trajectory-based model takes an initial state, a future time index, and control parameters as inputs, and predicts the state at that future time. Our results on simulated and experimental robotic tasks show that trajectory-based models yield significantly more accurate long-term predictions, improved sample efficiency, and the ability to predict task reward.
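The contrast the abstract draws can be sketched in a few lines: a one-step model fed back into itself compounds its per-step error over the horizon, while a trajectory-based model maps (initial state, time index, control parameters) to the future state in a single call. The scalar dynamics, the mis-estimated coefficient `A_HAT`, and the closed-form `trajectory_model` below are all hypothetical stand-ins for illustration, not the paper's actual learned models.

```python
import numpy as np

# Hypothetical true scalar dynamics: exponential decay s_{t+1} = a * s_t.
A_TRUE = 0.90
# A slightly mis-estimated one-step model (stand-in for a learned model).
A_HAT = 0.91

def one_step_rollout(s0, horizon):
    """Recursive one-step prediction: each output is fed back as the next
    input, so the small per-step error in A_HAT compounds geometrically."""
    s = s0
    preds = []
    for _ in range(horizon):
        s = A_HAT * s
        preds.append(s)
    return np.array(preds)

def trajectory_model(s0, t, theta):
    """Trajectory-based parametrization: map (initial state, time index,
    control parameters theta) directly to the state at time t, with no
    recursion. A closed form stands in for a learned regressor here."""
    return s0 * theta ** t

s0, horizon = 1.0, 50
truth = np.array([s0 * A_TRUE ** t for t in range(1, horizon + 1)])
rec = one_step_rollout(s0, horizon)
# Relative error of the recursive rollout grows with the horizon:
rel_err = np.abs(rec - truth) / truth
```

Here `rel_err[0]` is about 1%, while `rel_err[-1]` exceeds 70%: the ~1% per-step error has compounded over 50 steps, which is the failure mode the trajectory-based parametrization is designed to avoid.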
Keywords
long-term dynamics,model-based reinforcement learning,model-based control,one-step ahead prediction model,predicted state distribution,long-term predictions,supervised learning,trajectory-based model,future time index,robotic systems