O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation
arXiv (2024)
Abstract
Online construction of open-ended language scenes is crucial for robotic
applications, where open-vocabulary interactive scene understanding is
required. Recently, neural implicit representation has provided a promising
direction for online interactive mapping. However, implementing open-vocabulary
scene understanding capability into online neural implicit mapping still faces
three challenges: lack of local scene updating ability, blurry spatial
hierarchical semantic segmentation and difficulty in maintaining multi-view
consistency. To this end, we propose O2V-mapping, which uses voxel-based
language and geometric features to build an open-vocabulary field, allowing
local updates during the online training process. Additionally, we
leverage a foundational model for image segmentation to extract language
features on object-level entities, achieving clear segmentation boundaries and
hierarchical semantic features. To preserve consistency of 3D object
properties across different viewpoints, we propose a spatial adaptive
voxel adjustment mechanism and a multi-view weight selection method. Extensive
experiments on open-vocabulary object localization and semantic segmentation
demonstrate that O2V-mapping achieves online construction of language scenes
while enhancing accuracy, outperforming the previous SOTA method.
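The core idea of a voxel-based open-vocabulary field, as described above, is that each voxel stores a language feature that can be updated locally as new observations arrive and queried later with a text embedding. A minimal toy sketch of that idea follows; it is not the paper's implementation (which couples geometric and language features inside a neural implicit representation), and the class name, voxel hashing scheme, and running-mean fusion rule are illustrative assumptions only.

```python
import numpy as np

class VoxelLanguageField:
    """Toy sparse voxel grid holding per-voxel language feature vectors.

    Illustrative sketch only: a plain dict keyed by voxel index, so an
    update touches just the voxels actually observed (local updating),
    unlike a global MLP where every weight would change.
    """

    def __init__(self, voxel_size=0.1, dim=8):
        self.voxel_size = voxel_size
        self.dim = dim
        self.features = {}  # voxel index -> running-mean language feature
        self.counts = {}    # voxel index -> number of fused observations

    def _key(self, point):
        # Quantize a 3D point to its integer voxel index.
        return tuple(np.floor(np.asarray(point) / self.voxel_size).astype(int))

    def update(self, point, feature):
        """Fuse one observed language feature into one voxel (local update)."""
        k = self._key(point)
        f = np.asarray(feature, dtype=float)
        n = self.counts.get(k, 0)
        mean = self.features.get(k, np.zeros(self.dim))
        self.features[k] = (mean * n + f) / (n + 1)  # incremental mean
        self.counts[k] = n + 1

    def query(self, text_embedding):
        """Return the voxel whose stored feature best matches the query
        embedding by cosine similarity (open-vocabulary localization)."""
        q = np.asarray(text_embedding, dtype=float)
        q = q / (np.linalg.norm(q) + 1e-9)
        best_key, best_sim = None, -1.0
        for k, f in self.features.items():
            sim = float(f @ q / (np.linalg.norm(f) + 1e-9))
            if sim > best_sim:
                best_key, best_sim = k, sim
        return best_key, best_sim

# Usage: fuse two hypothetical object-level features, then localize one.
field = VoxelLanguageField(voxel_size=0.5, dim=3)
field.update([0.1, 0.1, 0.1], [1.0, 0.0, 0.0])  # e.g. a "chair"-like feature
field.update([2.0, 0.0, 0.0], [0.0, 1.0, 0.0])  # e.g. a "table"-like feature
key, sim = field.query([1.0, 0.0, 0.0])
print(key, round(sim, 3))  # → (0, 0, 0) 1.0
```

In a real system the per-voxel features would come from a vision-language encoder applied to segmentation masks (the paper uses a foundation model for segmentation), and multi-view observations of the same voxel would be fused with view-dependent weights rather than a plain running mean.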