Cross-Modal Adaptation for RGB-D Detection

2016 IEEE International Conference on Robotics and Automation (ICRA)

Cited by 104 | Viewed 159
Abstract
In this paper, we propose a technique to adapt convolutional neural network (CNN) based object detectors trained on RGB images to effectively leverage depth images at test time to boost detection performance. Given labeled depth images for a handful of categories, we adapt an RGB object detector for a new category such that it can now use depth images in addition to RGB images at test time to produce more accurate detections. Our approach is built upon the observation that the lower layers of a CNN are largely task- and category-agnostic but domain-specific, while the higher layers are largely task- and category-specific but domain-agnostic. We operationalize this observation by proposing a mid-level fusion of RGB and depth CNNs. Experimental evaluation on the challenging NYUD2 dataset shows that our proposed adaptation technique yields an average 21% relative improvement in detection performance over an RGB-only baseline, even when no depth training data is available for the particular category evaluated. We believe our proposed technique will extend advances made in computer vision to RGB-D data, leading to improvements in performance at little additional annotation effort.
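To make the mid-level fusion idea concrete, below is a minimal PyTorch sketch (not the authors' implementation): two modality-specific lower-layer stems produce features that are fused before a shared, category-specific upper head, and the depth stream is simply skipped when no depth image is available at test time. The layer sizes, the three-channel depth encoding, and fusion by averaging are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn


class ModalityStem(nn.Module):
    """Lower layers: domain-specific, task- and category-agnostic."""

    def __init__(self, in_channels):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, stride=2, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)


class MidFusionClassifier(nn.Module):
    """Higher layers: domain-agnostic, task- and category-specific."""

    def __init__(self, num_classes):
        super().__init__()
        self.rgb_stem = ModalityStem(3)    # trained on RGB data
        self.depth_stem = ModalityStem(3)  # adapted for depth (3-channel encoding assumed)
        self.head = nn.Sequential(         # shared across both modalities
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, rgb, depth=None):
        feat = self.rgb_stem(rgb)
        if depth is not None:
            # Mid-level fusion: combine modality-specific features
            # before the shared upper layers (averaging is one simple choice).
            feat = 0.5 * (feat + self.depth_stem(depth))
        return self.head(feat)


model = MidFusionClassifier(num_classes=20)
rgb = torch.randn(1, 3, 224, 224)
depth = torch.randn(1, 3, 224, 224)  # depth rendered as 3 channels (assumption)
print(model(rgb, depth).shape)       # torch.Size([1, 20])
```

Because the upper head is shared and domain-agnostic, a detector head trained only on RGB for a new category can, in this scheme, still benefit from depth features flowing through the adapted lower-layer stem at test time.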
Keywords
cross-modal adaptation, RGB-D detection, convolutional neural network, CNN-based object detectors, RGB images, labeled depth images, RGB object detector, mid-level fusion