Weakly Supervised Monocular 3D Detection with a Single-View Image
CVPR 2024
Abstract
Monocular 3D detection (M3D) aims for precise 3D object localization from a
single-view image, which usually involves labor-intensive annotation of 3D
detection boxes. Weakly supervised M3D has recently been studied to obviate the
3D annotation process by leveraging many existing 2D annotations, but it often
requires extra training data such as LiDAR point clouds or multi-view images,
which greatly limits its applicability.
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework that
exploits depth information to achieve M3D from a single-view image alone,
without any 3D annotations or other training data. One key design in SKD-WM3D
is a self-knowledge distillation framework, which transforms image features
into 3D-like representations by fusing depth information and effectively
mitigates the inherent depth ambiguity of monocular scenarios with little
computational overhead at inference. In addition, we design an
uncertainty-aware distillation loss and a gradient-targeted transfer modulation
strategy, which facilitate knowledge acquisition and knowledge transfer,
respectively. Extensive experiments show that SKD-WM3D clearly surpasses the
state of the art and is even on par with many fully supervised methods.
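The abstract does not give the exact form of the uncertainty-aware distillation loss, but a common way to weight a distillation objective by predicted uncertainty is to scale each feature-matching error by a learned log-variance and add a regularizing penalty. The sketch below is a generic, illustrative formulation of that idea, not the paper's actual loss; all names (`student_feat`, `teacher_feat`, `log_var`) are hypothetical.

```python
import math

def uncertainty_weighted_distill_loss(student_feat, teacher_feat, log_var):
    """Generic uncertainty-weighted feature distillation (illustrative only).

    Each squared feature-matching error is down-weighted where the predicted
    uncertainty (log variance) is high; the additive log_var term penalizes
    inflating the uncertainty everywhere, so the model only claims high
    uncertainty where the match is genuinely unreliable.
    """
    total = 0.0
    for s, t, lv in zip(student_feat, teacher_feat, log_var):
        total += math.exp(-lv) * (s - t) ** 2 + lv
    return total / len(student_feat)
```

With zero uncertainty (`log_var = 0`) this reduces to a plain mean-squared feature-matching loss; raising `log_var` on an element trades a smaller error weight for a fixed additive penalty.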