Language-Driven Active Learning for Diverse Open-Set 3D Object Detection
arXiv (2024)
Abstract
Object detection is crucial for ensuring safe autonomous driving. However,
data-driven approaches face challenges when encountering minority or novel
objects in the 3D driving scene. In this paper, we propose VisLED, a
language-driven active learning framework for diverse open-set 3D object
detection. Our method leverages active learning techniques to query diverse and
informative data samples from an unlabeled pool, enhancing the model's ability
to detect underrepresented or novel objects. Specifically, we introduce the
Vision-Language Embedding Diversity Querying (VisLED-Querying) algorithm, which
operates in both open-world exploring and closed-world mining settings. In
open-world exploring, VisLED-Querying selects data points most novel relative
to existing data, while in closed-world mining, it mines new instances of known
classes. We evaluate our approach on the nuScenes dataset and demonstrate its
effectiveness compared to random sampling and entropy-querying methods. Our
results show that VisLED-Querying consistently outperforms random sampling and
performs competitively with entropy-querying, even though entropy-querying is
optimal with respect to the task model. This highlights the potential of VisLED
for improving object detection in autonomous driving scenarios.
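The open-world exploring mode described above selects the unlabeled samples that are most novel relative to the already-labeled pool. A minimal sketch of one way such embedding-diversity querying could work is shown below; the function name, the cosine-similarity novelty criterion, and the use of plain NumPy vectors are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def visled_query(labeled_emb, unlabeled_emb, k):
    """Hypothetical sketch of diversity querying: rank unlabeled samples
    by how dissimilar they are to their nearest labeled neighbor in a
    vision-language embedding space, and return the k most novel ones."""
    # Row-normalize so dot products are cosine similarities.
    L = labeled_emb / np.linalg.norm(labeled_emb, axis=1, keepdims=True)
    U = unlabeled_emb / np.linalg.norm(unlabeled_emb, axis=1, keepdims=True)
    sim = U @ L.T                          # (n_unlabeled, n_labeled)
    nearest = sim.max(axis=1)              # similarity to closest labeled sample
    novelty = -nearest                     # lower similarity = more novel
    return np.argsort(novelty)[::-1][:k]   # indices of the k most novel samples
```

In closed-world mining, the same machinery could instead rank samples by *high* similarity to embeddings of known classes, so that new instances of underrepresented known categories are surfaced rather than entirely novel objects.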