Safe Reinforcement Learning with Model Uncertainty Estimates

2019 International Conference on Robotics and Automation (ICRA), 2019

Citations: 157 | Views: 98
Abstract
Many current autonomous systems are being designed with a strong reliance on black box predictions from deep neural networks (DNNs). However, DNNs tend to be overconfident in predictions on unseen data and can give unpredictable results for far-from-distribution test data. The importance of predictions that are robust to this distributional shift is evident for safety-critical applications, such as collision avoidance around pedestrians. Measures of model uncertainty can be used to identify unseen data, but the state-of-the-art extraction methods such as Bayesian neural networks are mostly intractable to compute. This paper uses MC-Dropout and Bootstrapping to give computationally tractable and parallelizable uncertainty estimates. The methods are embedded in a Safe Reinforcement Learning framework to form uncertainty-aware navigation around pedestrians. The result is a collision avoidance policy that knows what it does not know and cautiously avoids pedestrians that exhibit unseen behavior. The policy is demonstrated in simulation to be more robust to novel observations and take safer actions than an uncertainty-unaware baseline.
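To make the two estimators named in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the toy linear regression task, the tiny randomly initialized network, and every function and parameter name below are assumptions chosen for illustration. It shows the shared idea, which is to produce many slightly different predictions (by resampling the training set, or by keeping dropout active at test time) and to read the spread of those predictions as model uncertainty.

```python
import numpy as np


def mc_dropout_predict(x, w1, w2, n_samples=50, p_drop=0.5, rng=None):
    """MC-Dropout: run several stochastic forward passes with dropout left on."""
    rng = np.random.default_rng() if rng is None else rng
    hidden = np.tanh(x @ w1)  # hidden activations, shape (hidden_dim,)
    preds = []
    for _ in range(n_samples):
        # Keep each hidden unit with probability 1 - p_drop (inverted dropout).
        mask = rng.random(hidden.shape) >= p_drop
        preds.append((hidden * mask / (1.0 - p_drop)) @ w2)
    preds = np.array(preds)
    return preds.mean(), preds.var()


def bootstrap_ensemble(xs, ys, n_members=20, rng=None):
    """Bootstrapping: fit one linear model per resampled copy of the data."""
    rng = np.random.default_rng() if rng is None else rng
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(xs), size=len(xs))  # sample with replacement
        slope, intercept = np.polyfit(xs[idx], ys[idx], deg=1)
        members.append((slope, intercept))
    return np.array(members)  # shape (n_members, 2)


def ensemble_predict(members, x):
    """Mean and variance of the ensemble members' predictions at input x."""
    preds = members[:, 0] * x + members[:, 1]
    return preds.mean(), preds.var()


rng = np.random.default_rng(0)

# Toy training data (assumed): inputs in [-1, 1], noisy linear targets.
xs = rng.uniform(-1.0, 1.0, size=50)
ys = 2.0 * xs + rng.normal(0.0, 0.1, size=50)

members = bootstrap_ensemble(xs, ys, rng=rng)
_, var_seen = ensemble_predict(members, 0.0)     # inside the training range
_, var_unseen = ensemble_predict(members, 10.0)  # far from the training range

# Tiny network with random weights (assumed), used only to show the mechanics.
w1 = rng.normal(size=(2, 16))
w2 = rng.normal(size=16)
mc_mean, mc_var = mc_dropout_predict(np.array([0.5, -0.3]), w1, w2, rng=rng)

print(f"bootstrap variance, seen input:   {var_seen:.6f}")
print(f"bootstrap variance, unseen input: {var_unseen:.6f}")
print(f"MC-Dropout mean and variance:     {mc_mean:.4f}, {mc_var:.4f}")
```

In this sketch the ensemble members agree near the training data and disagree far from it, so the variance acts as a signal for unseen inputs. Each member and each dropout pass is independent of the others, which is why estimates of this kind can be computed in parallel.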
Keywords
model uncertainty estimates, current autonomous systems, strong reliance, black box predictions, deep neural networks, DNNs, unpredictable results, far-from-distribution test data, distributional shift, safety-critical applications, pedestrians, state-of-the-art extraction methods, Bayesian neural networks, MC-Dropout, computationally tractable uncertainty estimates, parallelizable uncertainty estimates, uncertainty-aware navigation, collision avoidance policy, unseen behavior, uncertainty-unaware baseline, safe reinforcement learning framework