Multi-Person 3D Pose Estimation from Multi-View Uncalibrated Depth Cameras
CoRR (2024)
Abstract
We tackle the task of multi-view, multi-person 3D human pose estimation from
a limited number of uncalibrated depth cameras. Recently, many approaches have
been proposed for 3D human pose estimation from multi-view RGB cameras.
However, these works (1) assume that the number of RGB camera views is large
enough for 3D reconstruction, (2) assume that the cameras are calibrated, and
(3) rely on ground truth 3D poses for training their regression model. In this
work, we propose to
leverage sparse, uncalibrated depth cameras providing RGBD video streams for 3D
human pose estimation. We present a simple pipeline for Multi-View Depth Human
Pose Estimation (MVD-HPE) for jointly predicting the camera poses and 3D human
poses without training a deep 3D human pose regression model. This framework
uses 3D Re-ID appearance features from RGBD images to establish more accurate
cross-view correspondences (for deriving camera positions) than RGB-only
features allow. We further propose (1) depth-guided camera-pose estimation
by leveraging 3D rigid transformations as guidance and (2) depth-constrained 3D
human pose estimation by utilizing depth-projected 3D points as an alternative
objective for optimization. To evaluate the proposed pipeline, we collect three
sets of RGBD videos recorded from multiple sparse-view depth cameras, with
manually annotated ground truth 3D poses. Experiments
show that our proposed method outperforms the current 3D human pose
regression-free pipelines in terms of both camera pose estimation and 3D human
pose estimation.
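
To make the idea of depth-guided camera-pose estimation via 3D rigid transformations more concrete, here is a minimal, generic sketch. It is not the authors' implementation. It assumes a pinhole camera with known intrinsics (fx, fy, cx, cy), and it assumes that corresponding 3D points between two views (for example, matched body keypoints) are already available. The function names backproject and rigid_transform are hypothetical, and the alignment uses the standard Kabsch algorithm as a stand-in.

```python
import numpy as np


def backproject(uv, depth, fx, fy, cx, cy):
    """Lift pixels (N, 2) with depths (N,) to camera-frame 3D points (N, 3).

    Assumes a pinhole model; the intrinsics are illustrative inputs.
    """
    uv = np.asarray(uv, dtype=float)
    depth = np.asarray(depth, dtype=float)
    x = (uv[:, 0] - cx) * depth / fx
    y = (uv[:, 1] - cy) * depth / fy
    return np.stack([x, y, depth], axis=1)


def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~= src @ R.T + t.

    Standard Kabsch algorithm on matched 3D point sets of shape (N, 3).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Given matched 3D points from two depth cameras, rigid_transform(points_cam_a, points_cam_b) returns a relative pose that maps camera A coordinates into camera B coordinates. In a setting like the one described above, such an estimate could serve as guidance for camera-pose estimation.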