ScanRefer: 3D Object Localization in RGB-D Scans Using Natural Language

European Conference on Computer Vision (2020)

Abstract
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions. As input, we assume a point cloud of a scanned 3D scene along with a free-form description of a specified target object. To address this task, we propose ScanRefer, which learns a fused descriptor from 3D object proposals and encoded sentence embeddings. This fused descriptor correlates language expressions with geometric features, enabling regression of the 3D bounding box of the target object. We also introduce the ScanRefer dataset, containing 51,583 descriptions of 11,046 objects from 800 ScanNet [8] scenes. ScanRefer is the first large-scale effort to perform object localization via natural language expressions directly in 3D (Code: https://daveredrum.github.io/ScanRefer/).
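The fusion idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `localize_by_description`, the one-hidden-layer scorer, and all weight shapes are assumptions chosen for illustration, and the sketch selects the best-scoring proposal box rather than regressing a box. It shows only the general pattern of appending one sentence embedding to every proposal feature and scoring the fused vectors.

```python
import numpy as np


def localize_by_description(proposal_feats, proposal_boxes, sentence_emb, w1, b1, w2):
    """Return the proposal box that best matches a description (illustrative only).

    proposal_feats: (K, Dp) geometric features, one row per 3D object proposal
    proposal_boxes: (K, 6) boxes as (cx, cy, cz, dx, dy, dz)
    sentence_emb:   (Dl,) encoded description
    w1, b1, w2:     hypothetical scorer weights, shapes (Dp + Dl, H), (H,), (H,)
    """
    k = proposal_feats.shape[0]
    # Fuse: append the same sentence embedding to every proposal feature.
    tiled = np.tile(sentence_emb, (k, 1))                     # (K, Dl)
    fused = np.concatenate([proposal_feats, tiled], axis=1)   # (K, Dp + Dl)
    # A nonlinear layer lets the language part interact with the geometry part.
    hidden = np.maximum(fused @ w1 + b1, 0.0)                 # ReLU, (K, H)
    logits = hidden @ w2                                      # (K,)
    # Softmax over proposals, shifted for numerical stability.
    exp = np.exp(logits - logits.max())
    conf = exp / exp.sum()
    best = int(np.argmax(conf))
    return proposal_boxes[best], conf
```

In a trained system the weights would be learned from description and box pairs; with random weights, as in a quick test, the selected box is arbitrary and only the shapes and the normalization of the confidences are meaningful.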
Keywords
3D object localization, ScanRefer, scans