InDecGAN: Learning to Generate Complex Images From Captions via Independent Object-Level Decomposition and Enhancement

IEEE Transactions on Multimedia (2023)

Abstract
Text-to-image synthesis is a challenging problem: a complex scene contains diverse objects of various sizes, and objects of the same class appear in diverse forms when viewed from different perspectives. Synthesis models therefore have difficulty capturing the varied objects in a complex scene. To alleviate these problems, we devise an independent object-level decomposing and enhancing generative adversarial network, denoted InDecGAN, to synthesize complex images and capture the varied objects in a complex scene. Specifically, InDecGAN fully exploits independent object-level information during training (bounding boxes and high-resolution images of objects) by employing independent object-level pathways to synthesize varied objects. The independent object-level pathway integrates an independent object-level adversarial loss with the bounding-box information to learn the visual features of objects independently; the main pathway then exploits the features provided by the object-level pathway to compose the full scene and synthesize the image. In addition, we analyze the generalization properties of the proposed InDecGAN and demonstrate the improvement from the perspective of the model architecture. Moreover, extensive experiments on a widely used dataset demonstrate that the proposed model with an independent object-level pathway produces synthesized images of significantly improved quality.
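The abstract describes a two-pathway design: object-level pathways generate per-object features placed according to bounding boxes, and a main pathway composes them into the full scene. The paper's code is not given here; the following is a minimal NumPy sketch of that composition idea only. All function names, tensor shapes, and the feature-combination rule are illustrative assumptions, not the authors' implementation, and the adversarial losses are omitted entirely.

```python
import numpy as np

def object_pathway(noise, bbox, canvas_size=(64, 64), feat_dim=4):
    """Hypothetical object-level pathway: produce a feature patch for one
    object and place it on an empty canvas at its bounding box."""
    x, y, w, h = bbox
    rng = np.random.default_rng(0)
    # Stand-in for a learned generator: noise-conditioned feature patch.
    patch = np.tanh(rng.standard_normal((h, w, feat_dim)) + noise)
    canvas = np.zeros((*canvas_size, feat_dim))
    canvas[y:y + h, x:x + w] = patch  # bounding box controls placement/size
    return canvas

def main_pathway(object_maps):
    """Hypothetical main pathway: compose per-object feature maps into one
    scene-level map (here, a simple sum squashed to [-1, 1])."""
    return np.tanh(sum(object_maps))

# Two objects with different boxes, composed into one scene map.
objs = [object_pathway(0.1, (5, 5, 16, 16)),
        object_pathway(-0.2, (30, 20, 12, 12))]
img = main_pathway(objs)
```

In the actual model the patch generator and the composition step would be trained networks, with the object-level adversarial loss applied to the per-object outputs before the main pathway renders the final image.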
Key words
Layout, Task analysis, Generators, Shape, Semantics, Generative adversarial networks, Image synthesis, Complex scene, independent object-level pathway, size information, text-to-image synthesis