CircleGAN: Generative Adversarial Learning across Spherical Circles

NIPS 2020


Abstract

We present a novel discriminator for GANs that improves the realness and diversity of generated samples by learning a structured hypersphere embedding space using spherical circles. The proposed discriminator learns to populate realistic samples around the longest spherical circle, i.e., a great circle, while pushing unrealistic samples toward…

Introduction
  • Generative Adversarial Networks (GANs) [8] aim to learn to produce high-quality data samples, as observed in a target dataset.
  • To this end, a GAN trains a generator, which synthesizes samples, adversarially against a discriminator, which classifies samples as real or fake; the discriminator’s gradient in backpropagation guides the generator to improve the quality of the generated samples.
  • The problem of mode collapse remains even in conditional models.
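The adversarial setup above can be sketched numerically with the standard (non-saturating) GAN losses; the discriminator outputs below are hypothetical numbers, not values from the paper:

```python
import numpy as np

def bce(p, target):
    """Binary cross-entropy per sample for probabilities p in (0, 1)."""
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

# Hypothetical discriminator outputs D(x) for a batch of real samples
# and a batch of generated (fake) samples.
d_real = np.array([0.9, 0.8, 0.95])   # D is fairly sure these are real
d_fake = np.array([0.2, 0.1, 0.3])    # D is fairly sure these are fake

# Discriminator minimizes BCE with targets real = 1, fake = 0.
d_loss = np.mean(bce(d_real, 1.0)) + np.mean(bce(d_fake, 0.0))

# Generator (non-saturating form) maximizes log D(G(z)),
# i.e., minimizes BCE of the fake scores against target 1.
g_loss = np.mean(bce(d_fake, 1.0))

print(round(float(d_loss), 3), round(float(g_loss), 3))
```

In a real training loop, the gradient of `g_loss` with respect to the generator's parameters flows back through the discriminator, which is the guidance mechanism the bullet describes.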
Highlights
  • Generative Adversarial Networks (GANs) [8] aim to learn to produce high-quality data samples, as observed in a target dataset.
  • In conventional GAN frameworks, including [21], the discriminator can be viewed as evaluating realness against a prototype representation: the closer a generated sample is to the prototype of real samples, the more realistic it is judged to be.
  • The results demonstrate that the images synthesized by CircleGAN correspond to their classes and almost overlap the training and validation sets of the dataset in the 2D t-SNE space, whereas the images from Projection-SNGAN do not.
  • We have demonstrated that learning and discriminating sample embeddings using their corresponding spherical circles on a hypersphere is highly effective in generating diverse, high-quality samples.
  • The proposed method provides state-of-the-art performance in unconditional generation and extends to conditional setups with class labels by creating one hypersphere per class.
  • The impressive performance gain over recent methods on standard benchmarks demonstrates the effectiveness of the proposed approach.
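The prototype view of the discriminator mentioned above can be illustrated on a unit hypersphere, scoring a sample by its cosine similarity to the mean direction of real embeddings; all vectors here are synthetic toy data, not the paper's learned embeddings:

```python
import numpy as np

def l2_normalize(v, axis=-1):
    """Project vectors onto the unit hypersphere."""
    return v / np.linalg.norm(v, axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Toy real-sample embeddings clustered around the first axis; the
# prototype is their re-normalized mean direction.
real = l2_normalize(rng.normal(size=(100, 8)) + np.array([3.0] + [0.0] * 7))
prototype = l2_normalize(real.mean(axis=0))

# Prototype-style realness: cosine similarity of a sample to the prototype.
sample_near = l2_normalize(np.array([2.5, 0.1, 0, 0, 0, 0, 0, 0]))
sample_far  = l2_normalize(np.array([-1.0, 0.5, 0, 0, 0, 0, 0, 0]))

score_near = float(sample_near @ prototype)  # close to the prototype
score_far  = float(sample_far @ prototype)   # far from the prototype
```

Under this view, a single prototype direction gives every realistic sample the same target, which is one intuition for why prototype-based scoring can encourage mode collapse; CircleGAN replaces the single prototype with a whole great circle of realistic directions.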
Methods
  • CircleGAN - radius equalization - center estimation - circle learning - score normalization - 2-projection FID(↓)

    12.9 14.6 15.2 15.8 16.8 20.4 is due to the sensitivity of IS, as reported in [3]; IS on the test set of STL10 is 14.8, which is significantly lower than 26.1 on the training set, but for other datasets such as CIFAR10 and CIFAR100, IS values are similar between train and test sets (CIFAR10: 11.2 vs. 11.3, CIFAR100: 14.8 vs. 14.7).

    To further compare CircleGAN to SphereGAN, the authors conduct ablation studies on CIFAR10 with the model using angles smult for score function (Table 1b).
  • Each component (radius equalization, center estimation, circle learning, score normalization, ℓ2-projection) is removed in turn from the full CircleGAN model to measure its effect on FID.
  • The authors remove the radius equalization and center estimation losses, respectively.
  • The results show that the proposed components consistently improve FID. In particular, replacing ℓ2-normalization with ISP significantly deteriorates FID, which implies that the ISP of SphereGAN may be problematic due to the embedding bias of samples.
  • CircleGAN outperforms SphereGAN by a large margin and extends to conditional settings, as demonstrated in the experiments.
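A minimal sketch of the great-circle idea, under stated assumptions: embeddings lie on a unit hypersphere, a pivot axis defines the great circle as the set of directions orthogonal to it, and a sample scores higher the closer its angle to the pivot is to π/2. The pivot and samples here are illustrative toy values, not the paper's learned quantities:

```python
import numpy as np

def l2_normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Assumed setup: a unit "pivot" axis whose orthogonal directions form
# the great circle where realistic samples should concentrate.
pivot = l2_normalize(np.array([0.0, 0.0, 1.0]))

def great_circle_score(v, pivot):
    """Higher when v lies closer to the great circle orthogonal to pivot:
    the angle between v and the pivot is then close to pi/2."""
    angle = np.arccos(np.clip(v @ pivot, -1.0, 1.0))
    return -abs(angle - np.pi / 2)

on_circle = l2_normalize(np.array([1.0, 1.0, 0.0]))   # lies on the great circle
near_pole = l2_normalize(np.array([0.1, 0.0, 1.0]))   # close to the pivot pole

s_real = great_circle_score(on_circle, pivot)
s_fake = great_circle_score(near_pole, pivot)
```

Unlike a single prototype point, a great circle offers a continuum of equally "realistic" directions, which is consistent with the diversity argument made above.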
Results
  • Evaluation Metrics

    The common evaluation metrics for image generation are the Inception Score (IS) [23] and the Fréchet Inception Distance (FID) [10].
  • FID measures a distance between the distribution of real data and that of generated samples in an embedding space, where the embeddings are assumed to be Gaussian distributed.
  • While these metrics are easy to calculate and correlate well with human assessment of generated samples, some concerns have been raised about them.
  • In addition to IS and FID, the authors use two other metrics [24], GAN-train and GAN-test, for evaluation in conditional settings.
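FID, as described above, is the Fréchet distance between two Gaussians fitted to real and generated embeddings. The sketch below assumes diagonal covariances (so the matrix square root in the trace term reduces to an element-wise square root) and uses made-up 2-D statistics:

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """Frechet distance between Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2}); for diagonal C
    the matrix square root is element-wise."""
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

# Hypothetical 2-D embedding statistics for real vs. generated samples.
mu_r, var_r = np.array([0.0, 0.0]), np.array([1.0, 1.0])
mu_g, var_g = np.array([1.0, 0.0]), np.array([4.0, 1.0])

fid = fid_diagonal(mu_r, var_r, mu_g, var_g)
# Mean term: 1; trace term: (1 + 4 - 2*2) + (1 + 1 - 2*1) = 1  →  FID = 2.0
print(fid)
```

In practice the covariances are full matrices estimated from Inception-network activations, and the matrix square root is computed with, e.g., `scipy.linalg.sqrtm`; this diagonal version only illustrates the formula.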
Conclusion
  • The authors have demonstrated that learning and discriminating sample embeddings using their corresponding spherical circles on a hypersphere is highly effective in generating diverse samples of high quality.
  • The impressive performance gain over the recent methods on standard benchmarks demonstrates the effectiveness of the proposed approach.
  • Figure 3: CircleGAN vs. Projection-SNGAN [17] on TinyImagenet: (a) CircleGAN images and embeddings; (b) Projection-SNGAN images and embeddings.
  • For 5 classes, generated images and their t-SNE embeddings are visualized.
  • For t-SNE, the authors train a classifier on the training set and use it to embed the generated images.
  • The authors use 250 images randomly taken from the train and validation sets, respectively.
Tables
  • Table 1: Unconditional GAN results on CIFAR10 and STL10
  • Table 2: Conditional GAN results on CIFAR10
  • Table 3: Conditional GAN results on CIFAR100
  • Table 4: Conditional GAN results on TinyImagenet
Related work
  • 2.1 Generative Adversarial Networks

    Previous work on improving GANs concentrates on addressing the difficulty of training. These studies have been conducted from different aspects, such as network architectures [9, 12, 13, 22], objective functions [11, 19], and regularization techniques [2, 9, 16, 30] that impose a Lipschitz constraint on the discriminator. SphereGAN [21] has shown that using a hypersphere as an embedding space affords stability in GAN training through the boundedness of distances between samples and of their gradients. Our work also adopts a hypersphere embedding space, but proposes a different strategy for structuring and learning the hypersphere, which will be discussed in detail.

    The most relevant line of GAN research to ours concerns the lack of sample diversity. In many cases, GAN objectives can be satisfied by samples of limited diversity, and there is no guarantee that a model in training comes to generate diverse samples. To tackle this lack of diversity, a.k.a. mode collapse, several approaches have been proposed from different perspectives. Chen et al. [5] and Karras et al. [13] modulate normalization layers using a noise vector that is transformed through a sequence of layers. Yang et al. [27] directly penalize mode-collapsing behavior in the generator by maximizing the gradient norm with respect to the noise vector. Yamaguchi and Koyama [26] regularize the discriminator to have local concavity on the support of the generator function, so as to increase its entropy monotonically at every stage of training. Liu et al. [15] propose a spectral regularization to prevent spectral collapse when training with spectral normalization, which is shown to be closely linked to mode collapse. Our approach to combating mode collapse is very different from these.
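The Lipschitz constraint mentioned above is commonly imposed via spectral normalization [16]: dividing each weight matrix by its largest singular value, usually estimated by power iteration. A minimal numpy sketch (the helper name and iteration count are illustrative, not the paper's code):

```python
import numpy as np

def spectral_normalize(W, n_iter=50, rng=None):
    """Divide a weight matrix by its top singular value, estimated via
    power iteration, so the induced linear map is (at most) 1-Lipschitz."""
    rng = rng or np.random.default_rng(0)
    u = rng.normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # estimated top singular value
    return W / sigma

# Toy weight matrix; after normalization its spectral norm is ~1.
W = np.random.default_rng(1).normal(size=(16, 8))
W_sn = spectral_normalize(W)
```

Practical implementations (e.g., in [16]) run only one power-iteration step per training step and reuse `u` across steps; the many iterations here are just to make the toy estimate accurate.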
Funding
  • This research was supported by Basic Science Research Program (NRF-2017R1E1A1A01077999) and Next-Generation Information Computing Development Program (NRF-2017M3C4A7069369), through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (MSIT), and also by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No 2019-0-01906, Artificial Intelligence Graduate School Program (POSTECH))
Reference
  • Albuquerque, I., Monteiro, J., Doan, T., Considine, B., Falk, T., Mitliagkas, I.: Multi-objective training of generative adversarial networks with multiple discriminators. Proceedings of the 36th International Conference on Machine Learning (2019)
  • Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein gan. Proceedings of the 34th International Conference on Machine Learning (2017)
  • Barratt, S., Sharma, R.: A note on the inception score. Proceedings of the 35th International Conference on Machine Learning Workshop (2018)
  • Brock, A., Donahue, J., Simonyan, K.: Large scale gan training for high fidelity natural image synthesis. International Conference on Learning Representations (2019)
  • Chen, T., Lucic, M., Houlsby, N., Gelly, S.: On self modulation for generative adversarial networks. International Conference on Learning Representations (2019)
  • Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics. pp. 215–223 (2011)
  • Gong, M., Xu, Y., Li, C., Zhang, K., Batmanghelich, K.: Twin auxiliary classifiers gan. Advances in Neural Information Processing Systems (2019)
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems. pp. 2672–2680 (2014)
  • Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of wasserstein gans. In: Advances in Neural Information Processing Systems. pp. 5767–5777 (2017)
  • Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: Gans trained by a two time-scale update rule converge to a local nash equilibrium. In: Advances in Neural Information Processing Systems. pp. 6626–6637 (2017)
  • Jolicoeur-Martineau, A.: The relativistic discriminator: a key element missing from standard gan. International Conference on Learning Representations (2019)
  • Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of gans for improved quality, stability, and variation. International Conference on Learning Representations (2018)
  • Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4401–4410 (2019)
  • Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Tech. rep., Citeseer (2009)
  • Liu, K., Tang, W., Zhou, F., Qiu, G.: Spectral regularization for combating mode collapse in gans. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 6382–6390 (2019)
  • Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. International Conference on Learning Representations (2018)
  • Miyato, T., Koyama, M.: cgans with projection discriminator. International Conference on Learning Representations (2018)
  • Neyshabur, B., Bhojanapalli, S., Chakrabarti, A.: Stabilizing gan training with multiple random projections. arXiv preprint arXiv:1705.07831 (2017)
  • Nowozin, S., Cseke, B., Tomioka, R.: f-gan: Training generative neural samplers using variational divergence minimization. In: Advances in Neural Information Processing Systems. pp. 271–279 (2016)
  • Odena, A., Olah, C., Shlens, J.: Conditional image synthesis with auxiliary classifier gans. In: Proceedings of the 34th International Conference on Machine Learning. pp. 2642–2651. JMLR.org (2017)
  • Park, S.W., Kwon, J.: Sphere generative adversarial network based on geometric moment matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4292–4301 (2019)
  • Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. International Conference on Learning Representations (2016)
  • Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training gans. In: Advances in Neural Information Processing Systems. pp. 2234–2242 (2016)
  • Shmelkov, K., Schmid, C., Alahari, K.: How good is my gan? In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 213–229 (2018)
  • Wu, J., Huang, Z., Acharya, D., Li, W., Thoma, J., Paudel, D.P., Gool, L.V.: Sliced wasserstein generative models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3713–3722 (2019)
  • Yamaguchi, S., Koyama, M.: Distributional concavity regularization for gans. International Conference on Learning Representations (2019)
  • Yang, D., Hong, S., Jang, Y., Zhao, T., Lee, H.: Diversity-sensitive conditional generative adversarial networks. International Conference on Learning Representations (2019)
  • Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. Proceedings of the 36th International Conference on Machine Learning (2019)
  • Zhou, Z., Cai, H., Rong, S., Song, Y., Ren, K., Zhang, W., Yu, Y., Wang, J.: Activation maximization generative adversarial nets. International Conference on Learning Representations (2018)
  • Zhou, Z., Liang, J., Song, Y., Yu, L., Wang, H., Zhang, W., Yu, Y., Zhang, Z.: Lipschitz generative adversarial nets. Proceedings of the 36th International Conference on Machine Learning (2019)
Author
Woohyeon Shim
Minsu Cho