ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation
arXiv (2024)
Abstract
Cross-modal knowledge transfer enhances point cloud representation learning
in LiDAR semantic segmentation. Despite its potential, a weak-teacher problem
arises because vehicle camera images are repetitive and lack diversity, and
ground-truth labels are sparse and inaccurate. To address this, we propose the
Efficient Image-to-LiDAR Knowledge Transfer (ELiTe) paradigm. ELiTe introduces
Patch-to-Point Multi-Stage Knowledge Distillation, transferring comprehensive
knowledge from a Vision Foundation Model (VFM) extensively trained on
diverse open-world images. This enables effective knowledge transfer to a
lightweight student model across modalities. ELiTe employs Parameter-Efficient
Fine-Tuning to strengthen the VFM teacher and expedite large-scale model
training with minimal costs. Additionally, we introduce a Segment Anything
Model (SAM)-based Pseudo-Label Generation approach to enhance low-quality image
labels, facilitating robust semantic representations. Efficient knowledge
transfer in ELiTe yields state-of-the-art results on the SemanticKITTI
benchmark, outperforming real-time inference models. Our approach achieves this
with significantly fewer parameters, confirming its effectiveness and
efficiency.
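
As an illustration only, the patch-to-point idea can be sketched as follows: each LiDAR point that projects into the camera image is paired with the image patch it falls in, and the student's point feature is pulled toward the teacher's patch feature. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the row-major patch grid, the precomputed pixel coordinates `uv`, and the cosine-based loss are all hypothetical choices; the abstract does not specify the loss, the projection, or the stages used.

```python
import numpy as np

def point_to_patch_index(uv, image_hw, patch_size):
    """Map projected pixel coordinates (u, v) to flat indices in a row-major patch grid.

    Assumption: uv already holds valid image-plane coordinates of LiDAR points.
    """
    h, w = image_hw
    rows, cols = h // patch_size, w // patch_size
    col = np.clip(uv[:, 0] // patch_size, 0, cols - 1).astype(int)
    row = np.clip(uv[:, 1] // patch_size, 0, rows - 1).astype(int)
    return row * cols + col

def patch_to_point_loss(point_feats, patch_feats, patch_idx):
    """Mean (1 - cosine similarity) between each point feature and its paired patch feature.

    Assumption: a cosine objective is used purely for illustration.
    """
    target = patch_feats[patch_idx]  # gather the teacher feature for every point
    p = point_feats / (np.linalg.norm(point_feats, axis=1, keepdims=True) + 1e-8)
    t = target / (np.linalg.norm(target, axis=1, keepdims=True) + 1e-8)
    return float(np.mean(1.0 - np.sum(p * t, axis=1)))

# Toy usage: a 32x64 image split into 16x16 patches gives a 2x4 grid (8 patches).
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(8, 4))                  # stand-in teacher patch features
uv = np.array([[0.0, 0.0], [17.0, 3.0], [63.0, 31.0]])  # (u, v) per point
idx = point_to_patch_index(uv, (32, 64), 16)
loss = patch_to_point_loss(patch_feats[idx], patch_feats, idx)
```

In a multi-stage variant, a loss of this kind would be evaluated at several encoder depths and summed; how the stages are chosen and weighted is not stated in the abstract.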