RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes
arXiv (2024)
Abstract
Reinforcement learning provides an appealing framework for robotic control
due to its ability to learn expressive policies purely through real-world
interaction. However, this requires addressing real-world constraints and
avoiding catastrophic failures during training, which might severely impede
both learning progress and the performance of the final policy. In many
robotics settings, this amounts to avoiding certain "unsafe" states. The
high-speed off-road driving task represents a particularly challenging
instantiation of this problem: a high-return policy should drive as
aggressively and as quickly as possible, which often requires getting close to
the edge of the set of "safe" states, and therefore places a particular burden
on the method to avoid frequent failures.
To both learn highly performant policies and avoid excessive failures, we
propose a reinforcement learning framework that combines risk-sensitive control
with an adaptive action space curriculum.
Furthermore, we show that our risk-sensitive objective automatically avoids
out-of-distribution states when equipped with an estimator for epistemic
uncertainty.
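
As a rough illustration of why a risk-sensitive objective can steer away from out-of-distribution states, the sketch below scores each action by the conditional value at risk (CVaR) of an ensemble's return estimates. The abstract does not say which risk measure or uncertainty estimator is used, so CVaR, the ensemble, and the names `cvar`, `pick_action`, and `ensemble_returns` are all assumptions made for this example, not the paper's implementation. The idea is that ensemble members tend to disagree on unfamiliar state-action pairs, so a lower-tail score penalizes those actions even when their mean estimate is high.

```python
import numpy as np


def cvar(samples, alpha):
    """Mean of the lowest alpha-fraction of the return samples (lower-tail CVaR)."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    # Keep at least one sample so the tail is never empty.
    k = max(1, int(np.ceil(alpha * sorted_samples.size)))
    return sorted_samples[:k].mean()


def pick_action(ensemble_returns, alpha=0.25):
    """Pick the action with the best CVaR score.

    ensemble_returns has shape (n_members, n_actions): one return estimate
    per ensemble member and candidate action (hypothetical layout).
    """
    ensemble_returns = np.asarray(ensemble_returns, dtype=float)
    scores = np.array(
        [cvar(ensemble_returns[:, a], alpha) for a in range(ensemble_returns.shape[1])]
    )
    return int(np.argmax(scores)), scores


# Illustrative numbers only: action 0 is familiar (members agree),
# action 1 has a higher mean but members disagree strongly.
returns = np.array(
    [
        [1.0, 3.0],
        [1.1, 2.5],
        [0.9, -2.0],
        [1.0, 2.9],
    ]
)
risk_averse_choice, scores = pick_action(returns, alpha=0.25)
risk_neutral_choice = int(np.argmax(returns.mean(axis=0)))
```

With these made-up numbers the risk-neutral rule prefers action 1, while the CVaR rule prefers action 0, because a single pessimistic ensemble member dominates the lower tail of action 1. Smaller `alpha` makes the choice more conservative; `alpha=1.0` recovers the plain ensemble mean.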
We implement our algorithm on a small-scale rally car and show that it is
capable of learning high-speed policies for a real-world off-road driving task.
We show that our method greatly reduces the number of safety violations during
the training process, and actually leads to higher-performance policies in both
driving and non-driving simulation environments with similar challenges.