iLabel: Revealing Objects in Neural Fields

IEEE Robotics and Automation Letters (2023)

Abstract
A neural field trained with self-supervision to efficiently represent the geometry and colour of a 3D scene tends to automatically decompose it into coherent and accurate object-like regions, which can be revealed with sparse labelling interactions to produce a 3D semantic scene segmentation. Our real-time iLabel system takes input from a hand-held RGB-D camera, requires zero prior training data, and works in an ‘open set’ manner, with semantic classes defined on the fly by the user. iLabel's underlying model is a simple multilayer perceptron (MLP), trained from scratch to learn a neural representation of a single 3D scene. The model is updated continually and visualised in real-time, allowing the user to focus interactions to achieve extremely efficient semantic segmentation. A room-scale scene can be accurately labelled into 10+ semantic categories with around 100 clicks, taking less than 5 minutes. Quantitative labelling accuracy scales powerfully with the number of clicks, and rapidly surpasses standard pre-trained semantic segmentation methods. We also demonstrate a hierarchical labelling variant of iLabel and a ‘hands-free’ mode where the user only needs to supply label names for automatically-generated locations.
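The abstract describes a single MLP that jointly represents a scene's geometry and colour (trained with self-supervision from the RGB-D stream) together with a semantic branch supervised only by sparse user clicks. Below is a minimal PyTorch-style sketch of such a field; the layer sizes, positional encoding, and names (ILabelField, semantic_click_loss) are illustrative assumptions rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn


class ILabelField(nn.Module):
    """Sketch of an iLabel-style neural field: one MLP trunk shared by a
    geometry/colour head (self-supervised from RGB-D) and a small semantic
    head (supervised only at sparsely clicked 3D points)."""

    def __init__(self, num_classes: int, num_freqs: int = 6, hidden: int = 256):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 3 + 3 * 2 * num_freqs  # xyz plus sin/cos positional encoding
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.geo_colour_head = nn.Linear(hidden, 1 + 3)       # occupancy/SDF value + RGB
        self.semantic_head = nn.Linear(hidden, num_classes)   # per-point class logits

    def encode(self, xyz: torch.Tensor) -> torch.Tensor:
        # Frequency encoding of 3D coordinates (illustrative choice).
        feats = [xyz]
        for i in range(self.num_freqs):
            feats += [torch.sin((2 ** i) * xyz), torch.cos((2 ** i) * xyz)]
        return torch.cat(feats, dim=-1)

    def forward(self, xyz: torch.Tensor):
        h = self.trunk(self.encode(xyz))
        geo_colour = self.geo_colour_head(h)
        logits = self.semantic_head(h)
        return geo_colour[..., :1], geo_colour[..., 1:], logits


def semantic_click_loss(model: ILabelField,
                        clicked_xyz: torch.Tensor,
                        clicked_labels: torch.Tensor) -> torch.Tensor:
    """Sparse-click supervision: cross-entropy only at user-clicked 3D points.
    The geometry/colour loss (not shown) uses every sampled point along rays."""
    _, _, logits = model(clicked_xyz)
    return nn.functional.cross_entropy(logits, clicked_labels)
```

Because the semantic head shares the trunk features learned from geometry and colour, a handful of clicked points can label whole object-like regions, which is the mechanism behind the click efficiency the abstract reports.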
Keywords
Deep learning for visual perception, representation learning, semantic scene understanding, SLAM