End-to-End Human Pose and Mesh Reconstruction with Transformers

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021(2021)

引用 573|浏览377
暂无评分
摘要
We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image. Our method uses a transformer encoder to jointly model vertex-vertex and vertex-joint interactions, and outputs 3D joint coordinates and mesh vertices simultaneously. Compared to existing techniques that regress pose and shape parameters, METRO does not rely on any parametric mesh models like SMPL, thus it can be easily extended to other objects such as hands. We further relax the mesh topology and allow the transformer self-attention mechanism to freely attend between any two vertices, making it possible to learn non-local relationships among mesh vertices and joints. With the proposed masked vertex modeling, our method is more robust and effective in handling challenging situations like partial occlusions. METRO generates new state-of-the-art results for human mesh reconstruction on the public Human3.6M and 3DPW datasets. Moreover, we demonstrate the generalizability of METRO to 3D hand reconstruction in the wild, outperforming existing state-of-the-art methods on FreiHAND dataset.
更多
查看译文
关键词
METRO,transformer encoder,vertex-joint interactions,3D joint coordinates,parametric mesh models,mesh topology,transformer self-attention mechanism,nonlocal relationships,mesh vertices,masked vertex modeling,human mesh reconstruction,3DPW datasets,3D hand reconstruction,MEsh TRansfOrmer,end-to-end Human pose reconstruction,partial occlusions,3D human pose
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要