Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations

2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Cited by 187 | Views 241
Abstract
Policy search methods in reinforcement learning have demonstrated success in scaling up to larger problems beyond toy examples. However, deploying these methods on real robots remains challenging due to the large sample complexity required during learning and their vulnerability to malicious intervention. We introduce Adversarially Robust Policy Learning (ARPL), an algorithm that leverages active computation of physically-plausible adversarial examples during training to enable robust policy learning in the source domain and robust performance under both random and adversarial input perturbations. We evaluate ARPL on four continuous control tasks and show superior resilience to changes in physical environment dynamics parameters and environment state as compared to state-of-the-art robust policy learning methods. Code, data, and additional experimental results are available at: stanfordvl.github.io/ARPL.
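The abstract describes ARPL as computing physically-plausible adversarial perturbations during training. The sketch below is only an illustration of that general idea, assuming a PyTorch policy and an FGSM-style gradient step on the observation; the names `GaussianPolicy`, `adversarial_state_perturbation`, and `epsilon` are hypothetical, and the actual method also perturbs dynamics parameters and process noise with a controlled perturbation frequency, which this sketch does not reproduce.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Small MLP policy producing mean actions for a continuous-control task (illustrative)."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs):
        return self.net(obs)

def adversarial_state_perturbation(policy, obs, epsilon=0.01):
    """FGSM-style observation perturbation: take the gradient of a surrogate loss
    with respect to the state and step in the sign of that gradient.
    The surrogate objective below is an assumption of this sketch, not the
    exact objective used in the paper."""
    obs = obs.clone().detach().requires_grad_(True)
    action = policy(obs)
    loss = action.pow(2).sum()          # surrogate: push the state so the action changes
    loss.backward()
    perturbation = epsilon * obs.grad.sign()
    return (obs + perturbation).detach()

# Usage sketch: during rollout collection, perturb a fraction of observations
# before feeding them to the policy, then run an ordinary policy-gradient update
# on the perturbed trajectories.
policy = GaussianPolicy(obs_dim=11, act_dim=3)   # Hopper-like dimensions (hypothetical)
obs = torch.randn(1, 11)
adv_obs = adversarial_state_perturbation(policy, obs, epsilon=0.01)
```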
Keywords
adversarial input perturbations,physical environment dynamics parameters,state-of-the-art robust policy learning methods,Adversarially Robust Policy Learning,active construction,physically-plausible perturbations,Policy search methods,reinforcement learning,physically-plausible adversarial examples,robust performance,random input perturbations,ARPL