Full-Range Virtual Try-On with Recurrent Tri-Level Transform

IEEE Conference on Computer Vision and Pattern Recognition(2022)

引用 13|浏览32
Virtual try-on aims to transfer a target clothing image onto a reference person. Though great progress has been achieved, the functioning zone of existing works is still limited to standard clothes (e.g., plain shirt without complex laces or ripped effect), while the vast complexity and variety of non-standard clothes (e.g., off-shoulder shirt, word-shoulder dress) are largely ignored. In this work, we propose a principled framework, Re-current Tri-Level Transform (RT-VTON), that performs full-range virtual try-on on both standard and non-standard clothes. We have two key insights towards the framework design: 1) Semantics transfer requires a gradual feature transform on three different levels of clothing representations, namely clothes code, pose code and parsing code. 2) Geometry transfer requires a regularized image deformation between rigidity and flexibility. Firstly, we predict the semantics of the “after-try-on” person by recurrently refining the tri-level feature codes using local gated attention and non-local correspondence learning. Next, we design a semi-rigid deformation to align the clothing image and the predicted semantics, which preserves local warping similarity. Finally, a canonical try-on synthesizer fuses all the processed information to generate the clothed person image. Extensive experiments on conventional benchmarks along with user studies demonstrate that our framework achieves state-of-the-art performance both quantitatively and qualitatively. Notably, RT-VTON shows compelling results on a wide range of non-standard clothes. Project page: https://lzqhardworker.github.io/RT-VTON/.
Image and video synthesis and generation
AI 理解论文
Chat Paper