Cascaded multi-scale and multi-dimension convolutional neural network for stereo matching.

2018 IEEE Visual Communications and Image Processing (VCIP)（2018）

引用 8|浏览16

暂无评分

摘要

Convolutional neural networks (CNN) have been shown to perform better than the conventional stereo algorithms for stereo estimation. Numerous CNN algorithms focus on the pixel-wise matching cost computation, which is the important building block for many state-of-the-art algorithms. However, these architectures are limited to small and single scale receptive fields and use traditional methods for cost aggregation or even ignore cost aggregation. In this paper, we propose a novel architecture called cascaded multi-scale and multi-dimension network (MSMD) to take them both into consideration. Firstly, we propose a new multi-scale matching cost computation sub-network, in which two different sizes of receptive fields are implemented parallelly. In this way, the network can make the best use of both variants to balance the trade-off between the increase of receptive field and the loss of details. Furthermore, we show that our multi-dimension aggregation sub-network which contains 2D convolution and 3D convolution operations can provide rich context and semantic information for estimating an accurate initial disparity. Finally, experiments on challenging stereo benchmark KITTI demonstrate that the proposed method can achieve competitive results even without any additional post-processing.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要