BEVoxSeg: BEV-Voxel Representation for Fast and Accurate Camera-Based 3D Segmentation
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024)
Abstract
Recent research has demonstrated the advantages of bird’s-eye-view (BEV) representation in 3D perception. However, because it lacks height information, BEV representation alone cannot accurately reconstruct the complete surrounding 3D scene. Voxel representation, on the other hand, excels at describing 3D structures, but its memory and computational cost make fast inference challenging. To address these limitations, we propose BEVoxSeg, a method that leverages the computational efficiency of BEV methods while incorporating essential geometric information from voxel features. By combining the advantages of both representations, our approach achieves state-of-the-art results for LiDAR semantic segmentation on nuScenes and demonstrates superior performance in occupancy prediction on the Occ3D-nuScenes dataset.
Keywords
3D perception, Bird’s-eye-view, LiDAR segmentation, occupancy prediction