Digging into contrastive learning for robust depth estimation with diffusion models
arXiv (2024)
Abstract
Recently, diffusion-based depth estimation methods have drawn widespread
attention due to their elegant denoising patterns and promising performance.
However, they are typically unreliable under adverse conditions prevalent in
real-world scenarios, such as rain and snow. In this paper, we propose a
novel robust depth estimation method called D4RD, featuring a custom
contrastive learning mode tailored for diffusion models to mitigate performance
degradation in complex environments. Concretely, we integrate the strength of
knowledge distillation into contrastive learning, building the "trinity"
contrastive scheme. This scheme utilizes the sampled noise of the forward
diffusion process as a natural reference, guiding the predicted noise in
diverse scenes toward a more stable and precise optimum. Moreover, we extend
the noise-level trinity to encompass the more generic feature and image levels,
establishing a multi-level contrast to distribute the burden of robust
perception across the overall network. Before addressing complex scenarios, we
enhance the stability of the baseline diffusion model with three
straightforward yet effective improvements, which facilitate convergence and
remove depth outliers. Extensive experiments demonstrate that D4RD surpasses
existing state-of-the-art solutions on synthetic corruption datasets and
real-world weather conditions. The code for D4RD will be made available for
further exploration and adoption.
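
The noise-level "trinity" described above can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: the function names (`forward_diffuse`, `trinity_noise_loss`), the use of mean squared error, the weight `w_distill`, and the reading of the three parties as the sampled noise, the noise predicted for a clean scene, and the noise predicted for a corrupted view of the same scene. It is not the D4RD implementation.

```python
import numpy as np


def forward_diffuse(depth, noise, alpha_bar):
    """Standard DDPM forward step: x_t = sqrt(a) * x_0 + sqrt(1 - a) * noise."""
    return np.sqrt(alpha_bar) * depth + np.sqrt(1.0 - alpha_bar) * noise


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def trinity_noise_loss(eps_sampled, eps_clean, eps_corrupt, w_distill=1.0):
    """Hypothetical three-way noise loss (an assumption, not the paper's formula).

    eps_sampled: noise drawn in the forward process, used as the shared reference.
    eps_clean:   noise predicted when conditioning on the clean image.
    eps_corrupt: noise predicted when conditioning on the corrupted image.
    """
    # Both predictions are pulled toward the sampled noise (the reference).
    loss_clean = mse(eps_clean, eps_sampled)
    loss_corrupt = mse(eps_corrupt, eps_sampled)
    # Distillation-style term: the corrupted branch is pulled toward the clean one.
    # A real training loop would likely stop gradients through eps_clean here.
    loss_distill = mse(eps_corrupt, eps_clean)
    return loss_clean + loss_corrupt + w_distill * loss_distill


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = rng.random((4, 4))            # toy depth map standing in for x_0
    eps = rng.standard_normal((4, 4))     # sampled forward-process noise
    x_t = forward_diffuse(depth, eps, alpha_bar=0.5)
    # Stand-ins for network outputs: the reference noise plus small errors.
    eps_clean = eps + 0.01 * rng.standard_normal((4, 4))
    eps_corrupt = eps + 0.10 * rng.standard_normal((4, 4))
    print(x_t.shape, trinity_noise_loss(eps, eps_clean, eps_corrupt))
```

In this sketch the sampled noise acts as a fixed target for both branches, so the corrupted-scene prediction is constrained by two anchors rather than by the clean prediction alone.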