Disentangling the Spatial Structure and Style in Conditional VAE

2020 IEEE International Conference on Image Processing (ICIP) (2020)

Abstract
This paper proposes a conditional variational autoencoder (cVAE) architecture that disentangles the latent vector into a spatial structure code and a style code. The two codes are complementary: one, z(s), is label relevant, and the other, z(u), is label irrelevant. Unlike a traditional cVAE, our network maps the condition label to its relevant code z(s) through a separate module. Depending on whether the label directly relates to the spatial structure of the image, the z(s) output by the condition mapping module is used either as the style code, with spatial dimensions of 1x1, or as the spatial structure code, with a single channel. Given the input image and its corresponding z(s), the encoder produces a posterior distribution that stays close to a common prior regardless of the label, so the z(u) sampled from it becomes label irrelevant. The decoder injects z(s) and z(u) through two typical adaptive normalization modules to reconstruct the input image. Results on two datasets with different types of labels show the effectiveness of our method.
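The abstract names neither the prior nor the two normalization modules, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a diagonal Gaussian posterior for z(u) with a standard normal prior, and it uses plain instance normalization followed by a scale and shift. In a real network the scale and shift would come from learned layers applied to z(s); here they are passed in directly, and every function name is hypothetical. The sketch illustrates the shape difference between the two uses of z(s): one scale/shift pair per channel when z(s) is a 1x1 style code, and one pair per pixel, shared across channels, when z(s) is a single-channel spatial code.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions.

    Assumption: the "common prior" is a standard normal.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_z_u(mu, log_var, rng):
    """Reparameterized draw: z_u = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def instance_norm(feat, eps=1e-5):
    """Normalize each channel of a C x H x W nested list to zero mean, unit variance."""
    out = []
    for channel in feat:
        vals = [v for row in channel for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        std = math.sqrt(var + eps)
        out.append([[(v - mean) / std for v in row] for row in channel])
    return out


def modulate_with_style(feat, gamma, beta):
    """Style case (z_s is C x 1 x 1): one (gamma, beta) pair per channel."""
    normed = instance_norm(feat)
    return [[[gamma[c] * v + beta[c] for v in row] for row in channel]
            for c, channel in enumerate(normed)]


def modulate_with_structure(feat, gamma_map, beta_map):
    """Structure case (z_s is 1 x H x W): one (gamma, beta) pair per pixel,
    shared across channels (a simplification made for this sketch)."""
    normed = instance_norm(feat)
    return [[[gamma_map[i][j] * v + beta_map[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(channel)]
            for channel in normed]


# A posterior that already equals the prior has zero KL cost.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # prints 0.0

# Toy 2-channel, 2x2 feature map modulated in both ways.
feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [4.0, 4.0]]]
z_u = sample_z_u([0.0, 0.0], [0.0, 0.0], random.Random(0))
styled = modulate_with_style(feat, gamma=[2.0, 1.0], beta=[0.5, 0.0])
structured = modulate_with_structure(
    feat, gamma_map=[[1.0, 0.0], [0.0, 1.0]], beta_map=[[0.0, 1.0], [1.0, 0.0]]
)
```

Driving the KL term toward zero for every label is what lets a sample of z(u) carry no label information in this reading of the abstract, while the label enters the decoder only through the modulation parameters derived from z(s).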
Keywords
cVAE, GAN, disentanglement