X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention
SIGGRAPH '24: ACM SIGGRAPH 2024 Conference Papers (2024)
Abstract
We propose X-Portrait, an innovative conditional diffusion model tailored for generating expressive and temporally coherent portrait animation. Specifically, given a single portrait as appearance reference, we aim to animate it with motion derived from a driving video, capturing both highly dynamic and subtle facial expressions along with wide-range head movements. At its core, we leverage the generative prior of a pre-trained diffusion model as the rendering backbone, while achieving fine-grained head pose and expression control with novel controlling signals within the framework of ControlNet. In contrast to conventional coarse explicit controls such as facial landmarks, our motion control module is learned to interpret the dynamics directly from the original driving RGB inputs. The motion accuracy is further enhanced with a patch-based local control module that effectively enhances the motion attention to small-scale nuances like eyeball positions. Notably, to mitigate identity leakage from the driving signals, we train our motion control modules with scaling-augmented cross-identity images, ensuring maximized disentanglement from the appearance reference modules. Experimental results demonstrate the universal effectiveness of X-Portrait across a diverse range of facial portraits and expressive driving sequences, and showcase its proficiency in generating captivating portrait animations with consistently maintained identity characteristics.
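The abstract does not spell out what "scaling-augmented" means in practice. One plausible reading is that each driving image is randomly rescaled before it reaches the motion control module, so that face size and proportions in the driving frame are a less reliable identity cue. The sketch below illustrates only that reading. The function names, the nearest-neighbor resampling, and the scale range are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def scale_augment(image: np.ndarray, scale: float) -> np.ndarray:
    """Rescale an H x W (x C) image about its center, keeping the output size.

    Hypothetical helper: nearest-neighbor sampling, zero fill outside the source.
    """
    h, w = image.shape[:2]
    # Map each output pixel back to a source coordinate around the image center.
    ys = (np.arange(h) - (h - 1) / 2) / scale + (h - 1) / 2
    xs = (np.arange(w) - (w - 1) / 2) / scale + (w - 1) / 2
    yi = np.rint(ys).astype(int)
    xi = np.rint(xs).astype(int)
    # Output pixels whose source coordinate falls outside the image stay zero.
    mask = ((yi >= 0) & (yi < h))[:, None] & ((xi >= 0) & (xi < w))[None, :]
    sampled = image[np.clip(yi, 0, h - 1)][:, np.clip(xi, 0, w - 1)]
    out = np.zeros_like(image)
    out[mask] = sampled[mask]
    return out


def random_scale_augment(image, rng, low=0.8, high=1.2):
    """Apply scale_augment with a scale drawn uniformly from [low, high).

    The range is an assumed placeholder, not a value from the paper.
    """
    return scale_augment(image, rng.uniform(low, high))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    driving_frame = np.ones((8, 8, 3), dtype=np.float32)  # stand-in for a driving image
    augmented = random_scale_augment(driving_frame, rng)
    print(augmented.shape)
```

In a training loop, such an augmentation would be applied to the cross-identity driving image only, leaving the appearance reference untouched.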