MFF-Net: Towards Efficient Monocular Depth Completion With Multi-Modal Feature Fusion

IEEE Robotics and Automation Letters(2023)

引用 5|浏览43
暂无评分
摘要
Remarkable progress has been achieved by current depth completion approaches, which produce dense depth maps from sparse depth maps and corresponding color images. However, the performances of these approaches are limited due to the insufficient feature extractions and fusions. In this work, we propose an efficient multi-modal feature fusion based depth completion framework (MFF-Net), which can efficiently extract and fuse features with different modals in both encoding and decoding processes, thus more depth details with better performance can be obtained. In specific, the encoding process contains three branches where different modals of features from both color and sparse depth input can be extracted, and a multi-feature channel shuffle is utilized to enhance these features thus features with better representation abilities can be obtained. Meanwhile, the decoding process contains two branches to sufficiently fuse the extracted multi-modal features, and a multi-level weighted combination is employed to further enhance and fuse features with different modals, thus leading to more accurate and better refined depth maps. Extensive experiments on different benchmarks demonstrate that we achieve state-of-the-art among online methods. Meanwhile, we further evaluate the predicted dense depth by RGB-D SLAM, which is a commonly used downstream robotic perception task, and higher accuracy on vehicle's trajectory can be obtained in KITTI odometry dataset, which demonstrates the high quality of our depth prediction and the potential of improving the related downstream tasks with depth completion results.
更多
查看译文
关键词
RGB-D perception,recognition
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要