Expression Invariant 3d Face Modeling From An Rgb-D Video

2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)(2016)

引用 0|浏览72
暂无评分
摘要
We aim to reconstruct an accurate neutral 3D face model from an RGB-D video in the presence of extreme expression changes. Since each depth frame, taken by a low-cost sensor, is noisy, point clouds from multiple frames can be registered and aggregated to build an accurate 3D model. However, direct aggregation of multiple data produces erroneous results in natural interaction (e.g., talking and showing expressions). We propose to analyze facial expression from an RGB frame and neutralize the corresponding 3D point cloud if needed. We first estimate the person's expression by fitting blend-shape coefficients using 2D facial landmarks for each frame and calculate an expression deformity (expression score). With the estimated expression score, we determine whether an input face is neutral or non-neutral. If the face is non-neutral, we proceed to neutralize the expression of the 3D point cloud in that frame. To neutralize the 3D point cloud of a face, we deform our generic 3D face model by applying the estimated blendshape coefficients, find displacement vectors from the deformed generic face to a neutral generic face, and apply the displacement vectors to the input 3D point cloud. After preprocessing frames in a video, we rank frames based on the expression scores and register the ranked frames into a single 3D model. Our system produces a neutral 3D face model in the presence of extreme expression changes even when neutral faces do not exist in the video.
更多
查看译文
关键词
expression invariant 3D face modeling,RGB-D video,depth frame,noisy point clouds,low-cost sensor,multiple data direct aggregation,3D point cloud,blendshape coefficients,2D facial landmarks,displacement vectors,neutral generic face,video preprocessing frames,frame ranking,person expression estimation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要