D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction
arXiv (2023)
Abstract
Reconstructing hand-held objects from a single RGB image is a challenging
task in computer vision. In contrast to prior works that utilize deterministic
modeling paradigms, we employ a point cloud denoising diffusion model to
account for the probabilistic nature of this problem. At the core of our approach, we introduce
centroid-fixed dual-stream conditional diffusion for monocular hand-held object
reconstruction (D-SCo), which tackles two main challenges. First, to prevent
the object centroid from drifting, we use a novel hand-constrained
centroid fixing paradigm, enhancing the stability of diffusion and reverse
processes and the precision of feature projection. Second, we introduce a
dual-stream denoiser to semantically and geometrically model hand-object
interactions with a novel unified hand-object semantic embedding, enhancing the
reconstruction performance of the hand-occluded region of the object.
Experiments on the synthetic ObMan dataset and three real-world datasets (HO3D,
MOW, and DexYCB) demonstrate that our approach outperforms existing
state-of-the-art methods. Code will be released.
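
The abstract does not spell out how the centroid is held fixed, so the following is only a minimal sketch of one way the idea could look, not the authors' implementation. It assumes a standard DDPM-style forward step on an (N, 3) point cloud and recenters both the points and the sampled noise so the noisy cloud keeps its centroid at the origin. The function names `center` and `forward_diffuse_centered` and the parameter `alpha_bar` are hypothetical, and the hand-constrained estimation of the centroid itself is not shown.

```python
import numpy as np


def center(points):
    """Subtract the per-cloud mean so the centroid sits at the origin."""
    return points - points.mean(axis=0, keepdims=True)


def forward_diffuse_centered(x0, alpha_bar, rng):
    """Sample a noisy cloud x_t from x_0 with zero-mean noise (assumed scheme).

    x0:        (N, 3) array of object points.
    alpha_bar: cumulative noise-schedule value in (0, 1].
    rng:       a numpy random Generator.
    """
    x0 = center(x0)
    # Recentering the noise keeps the centroid of x_t at the origin.
    noise = center(rng.standard_normal(x0.shape))
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return xt, noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.standard_normal((128, 3)) + 5.0  # toy cloud, off-center
    xt, _ = forward_diffuse_centered(cloud, alpha_bar=0.5, rng=rng)
    print(np.abs(xt.mean(axis=0)).max())  # close to zero
```

Because both terms of the sum are zero-mean, the centroid of the noisy cloud stays at the origin at every noise level, which is the kind of stability the centroid-fixing paradigm is described as providing.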