RB2: Robotics Benchmarking with a Twist

Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin Wang, Abitha Thankaraj, Karanbir Chahal, Berk Calli, Saurabh Gupta, David Held, Lerrel Pinto, Deepak Pathak, Vikash Kumar, Abhinav Gupta

semanticscholar(2021)

Abstract
Benchmarks offer a scientific way to compare algorithms using objective performance metrics. Good benchmarks have two features: (a) they should be widely useful for many research groups, and (b) they should produce reproducible findings. In robotics, there is a trade-off between reproducibility and broad accessibility. If the benchmark is kept restrictive (fixed hardware, objects), the numbers are reproducible but the setup becomes less general. On the other hand, a benchmark could be a loose set of protocols (e.g. object set [9]), but the underlying variation in setups makes the results non-reproducible. In this paper, we re-imagine robotics benchmarks as state-of-the-art algorithmic implementations, alongside the usual set of tasks and experimental protocols. The added baseline implementations will provide a way to easily recreate SOTA numbers in a new local robotic setup, thus providing credible relative rankings between existing approaches and new work. However, these “local rankings” could vary between different setups. To resolve this issue, we build a mechanism for pooling experimental data between labs, and thus we establish a single global ranking for existing (and proposed) SOTA algorithms. Our benchmark, called Ranking-Based Robotics Benchmark (RB2), is evaluated on tasks that are inspired by the clinically validated Southampton Hand Assessment Procedures [27]. Our benchmark was run across two different labs and reveals several surprising findings. For example, extremely simple baselines like open-loop behavior cloning outperform more complicated models (e.g. closed-loop, RNN, Offline-RL, etc.) that are preferred by the field. We hope our fellow researchers will use RB2 to improve their research quality and rigour.
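The abstract does not say how data from different labs is pooled into one global ranking. As a purely illustrative sketch, and not the authors' mechanism, the Python below averages each algorithm's local rank across labs (a simple mean-rank aggregation). The function names, lab names, and success rates are all hypothetical placeholders chosen for the example; they are not results from the paper.

```python
from collections import defaultdict


def local_ranking(success_rates):
    """Rank algorithms within one lab: rank 1 is the highest success rate."""
    ordered = sorted(success_rates, key=success_rates.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def pooled_ranking(labs):
    """Order algorithms by their mean local rank across all labs.

    This is an assumed aggregation rule (mean rank), used only to
    illustrate the idea of pooling; ties are broken alphabetically.
    """
    ranks = defaultdict(list)
    for success_rates in labs.values():
        for name, rank in local_ranking(success_rates).items():
            ranks[name].append(rank)
    mean_rank = {name: sum(r) / len(r) for name, r in ranks.items()}
    return sorted(mean_rank, key=lambda name: (mean_rank[name], name))


# Placeholder success rates, invented for this example only.
labs = {
    "lab_a": {"open_loop_bc": 0.80, "closed_loop_bc": 0.60, "offline_rl": 0.50},
    "lab_b": {"open_loop_bc": 0.65, "closed_loop_bc": 0.50, "offline_rl": 0.70},
}

global_order = pooled_ranking(labs)
print(global_order)
```

In this toy input the two labs disagree on the local order, yet pooling still yields a single ordering. A statistical ranking model fitted to the pooled trials would be a natural alternative to mean rank; which rule RB2 uses is not stated in the abstract.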