Open-Set 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning

Yuheng Lu,Chenfeng Xu,Xiaobao Wei,Xiaodong Xie,Masayoshi Tomizuka,Kurt Keutzer,Shanghang Zhang

ICLR 2023（2023）

引用 0|浏览54

暂无评分

摘要

Current point-cloud detection methods have difficulty detecting the open-set objects in the real world, due to their limited generalization capability. Moreover, it is extremely laborious and expensive to collect and fully annotate a point-cloud detection dataset with numerous classes of objects, leading to the limited classes of existing point-cloud datasets and hindering the model to learn general representations to achieve open-set point-cloud detection. Instead of seeking a point-cloud dataset with full labels, we resort to ImageNet1K to broaden the vocabulary of the point-cloud detector. We propose OS-3DETIC, an Open-Set 3D DETector using Image-level Class supervision. Specifically, we take advantage of two modalities, the image modality for recognition and the point-cloud modality for localization, to generate pseudo labels for unseen classes. Then we propose a novel debiased cross-modal cross-task contrastive learning method to transfer the knowledge from image modality to point-cloud modality during training. Without hurting the latency during inference, OS-3DETIC makes the well-known point-cloud detector capable of achieving open-set detection. Extensive experiments demonstrate that the proposed OS-3DETIC achieves at least 10.77 % mAP improvement (absolute value) and 9.56 % mAP improvement (absolute value) by a wide range of baselines on the SUN-RGBD dataset and ScanNet dataset, respectively. Besides, we conduct sufficient experiments to shed light on why the proposed OS-3DETIC works.

查看译文

关键词

open vocabulary,3d detection,contrastive learning

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要