Reinforcement learning with multi-fidelity simulators

IEEE International Conference on Robotics and Automation (ICRA), 2014

Cited by 85
Abstract
We present a framework for reinforcement learning (RL) in a scenario where multiple simulators are available with decreasing amounts of fidelity to the real-world learning scenario. Our framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing the agent to choose to run trajectories at the lowest level that will still provide it with information. The approach transfers state-action Q-values from lower-fidelity models as heuristics for the "Knows What It Knows" family of RL algorithms, which is applicable over a wide range of possible dynamics and reward representations. Theoretical proofs of the framework's sample complexity are given, and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach allows RL algorithms to find near-optimal policies for the real world with fewer expensive real-world samples than previous transfer approaches or learning without simulators.
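The core mechanism the abstract describes, seeding each costlier simulator with Q-values learned one fidelity level down and padded so they remain optimistic, can be sketched in a few lines. The following is a minimal illustration only, not the paper's KWIK-based algorithm: the `ChainSim` toy MDP, the `beta` tolerance value, and the plain epsilon-greedy Q-learning loop are all assumptions introduced here for concreteness.

```python
import random
from collections import defaultdict

# Hypothetical toy chain MDP standing in for a simulator stack:
# index 0 is the cheapest, most idealized model; higher indices add
# transition noise the way a higher-fidelity simulator adds
# unmodeled real-world dynamics.
class ChainSim:
    def __init__(self, n_states=5, noise=0.0):
        self.n, self.noise = n_states, noise

    def reset(self):
        return 0

    def step(self, s, a):
        # action 1 moves right toward the goal state, action 0 stays;
        # with probability `noise` the move fails
        s2 = min(s + a, self.n - 1)
        if random.random() < self.noise:
            s2 = s
        return s2, (1.0 if s2 == self.n - 1 else 0.0)

def q_learning(sim, q_init, episodes=300, alpha=0.5, gamma=0.95,
               horizon=20, eps=0.1):
    """Tabular Q-learning seeded with (optimistic) values transferred
    from the next-lower-fidelity simulator."""
    q = defaultdict(float, q_init)
    for _ in range(episodes):
        s = sim.reset()
        for _ in range(horizon):
            if random.random() < eps:
                a = random.choice((0, 1))
            else:
                a = max((0, 1), key=lambda act: q[(s, act)])
            s2, r = sim.step(s, a)
            target = r + gamma * max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

# Multi-fidelity loop: learn at the cheapest level first, then pass the
# Q-table upward padded by `beta`, a hypothetical bound on how much
# adjacent fidelity levels can disagree, so the transferred values stay
# optimistic and the costlier simulator mainly needs samples where the
# levels actually differ.
sims = [ChainSim(noise=0.0), ChainSim(noise=0.1), ChainSim(noise=0.3)]
beta = 0.2
q = {}
for level, sim in enumerate(sims):
    q_init = {sa: v + beta for sa, v in q.items()}  # optimism from below
    q = q_learning(sim, q_init)
    print(f"fidelity level {level}: Q(s=0, a=1) = {q[(0, 1)]:.3f}")
```

In the paper itself, the transferred values serve as admissible heuristics inside "Knows What It Knows" (KWIK) learners, which is what yields the stated sample-complexity guarantees; the epsilon-greedy learner above is only a simple stand-in for that machinery.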
Keywords
automobiles, control engineering computing, learning (artificial intelligence), mobile robots, telerobotics, trajectory control, RL algorithms, multifidelity simulators, reinforcement learning, remote controlled car, robotic control algorithm, state-action Q-values, trajectory level