You Only Look Bottom-Up for Monocular 3D Object Detection
IEEE Robotics and Automation Letters (2024)
Abstract
Monocular 3D Object Detection is an essential task for autonomous driving.
Meanwhile, accurate 3D object detection from pure images is very challenging
due to the loss of depth information. Most existing image-based methods infer
objects' location in 3D space based on their 2D sizes on the image plane, which
usually ignores the intrinsic positional clues in images, leading to
unsatisfactory performance. Motivated by the fact that humans can leverage
the bottom-up positional clues to locate objects in 3D space from a single
image, in this paper, we explore the position modeling from the image feature
column and propose a new method named You Only Look Bottom-Up (YOLOBU).
Specifically, our YOLOBU leverages Column-based Cross Attention to determine
how much a pixel contributes to pixels above it. Next, the Row-based Reverse
Cumulative Sum (RRCS) is introduced to build the connections of pixels in the
bottom-up direction. Our YOLOBU fully exploits the position clues for monocular
3D detection by building relationships between pixels in a bottom-up manner.
Extensive experiments on the KITTI dataset demonstrate the effectiveness and
superiority of our method.
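The abstract does not give the exact formulation of the Row-based Reverse Cumulative Sum (RRCS), but its description, connecting each pixel to the pixels below it by accumulating features from the bottom row upward, can be sketched as a flip-cumsum-flip along the image-height axis. The function name and the (H, W) feature-map shape below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def reverse_cumsum_bottom_up(feat):
    """Bottom-up cumulative sum over an (H, W) feature map.

    out[i, j] = sum of feat[k, j] for all rows k >= i, i.e. each pixel
    aggregates the features of every pixel below it in its column.
    (A hypothetical sketch of RRCS; the paper's exact operator may differ.)
    """
    # Flip vertically so the bottom row comes first, accumulate, flip back.
    return np.flip(np.cumsum(np.flip(feat, axis=0), axis=0), axis=0)

feat = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])
out = reverse_cumsum_bottom_up(feat)
# Top row aggregates the whole column: [9, 12];
# the bottom row is unchanged: [5, 6].
```

In a detector this would run on feature maps of shape (C, H, W) with the sum taken over the H axis, so ground-contact evidence near the image bottom can propagate upward to object pixels.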
Keywords
monocular 3d object detection