Instance Selection for GANs

NeurIPS 2020.

In this work we propose a novel approach to improving sample quality: altering the training dataset via instance selection before model training takes place

Abstract:

Recent advances in Generative Adversarial Networks (GANs) have led to their widespread adoption for the purposes of generating high quality synthetic imagery. While capable of generating photo-realistic images, these models often produce unrealistic samples which fall outside of the data manifold. Several recently proposed techniques at…
Introduction
  • Recent advances in Generative Adversarial Networks (GANs) have enabled these models to be considered a tool of choice for vision synthesis tasks that demand high fidelity outputs, such as image and video generation [5, 11], image editing [36], inpainting [30], and superresolution [27].
  • The majority of current techniques attempt to eliminate low-quality samples after the model is trained, either by truncating the latent space to change the model distribution [2, 10] or by performing some form of rejection sampling informed by a trained discriminator [1, 4, 26]
  • These methods are inefficient with respect to model capacity and training time, since much of the capacity and optimization effort dedicated to representing sparse regions of the data manifold is wasted
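For concreteness, the latent-space truncation mentioned above can be sketched as resampling out-of-range latent coordinates (the "truncation trick" popularized by BigGAN [2]). This is an illustrative helper, not the paper's code; the threshold value is arbitrary and `generator` usage is assumed:

```python
import numpy as np

def truncated_normal(shape, threshold=0.5, rng=None):
    """Sample z ~ N(0, I), resampling any coordinate whose magnitude
    exceeds `threshold` (latent-space truncation). Lower thresholds
    trade sample diversity for higher average fidelity."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(shape)
    while True:
        mask = np.abs(z) > threshold
        if not mask.any():
            return z
        # Redraw only the out-of-range coordinates.
        z[mask] = rng.standard_normal(mask.sum())
```

A generator would then be fed these truncated latents, e.g. `generator(truncated_normal((batch, dim)))`, shifting its output distribution toward denser regions of the model manifold.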
Highlights
  • Recent advances in Generative Adversarial Networks (GANs) have enabled these models to be considered a tool of choice for vision synthesis tasks that demand high fidelity outputs, such as image and video generation [5, 11], image editing [36], inpainting [30], and superresolution [27]
  • We use a variety of evaluation metrics to diagnose the effect that training with instance selection has on the learned distribution, including: (1) Inception Score (IS) [22], (2) Fréchet Inception Distance (FID) [9], (3) Precision and Recall (P&R) [13], and (4) Density and Coverage (D&C) [18]
  • Having established that dataset manifold density is correlated with GAN performance, we explore artificially increasing the overall density of the training set by removing data points that lie in low density regions of the data manifold
  • Despite using a much smaller batch size, our model trained with instance selection achieves a 66% increase in Inception Score and a 16% decrease in FID over the baseline while training in a quarter of the time (Table 2, Figure 6)
  • Popular post-processing methods such as rejection sampling or latent space truncation will likely ignore these regions as represented by the model
  • There are several benefits of taking the instance selection approach: (1) We improve sample quality across a variety of metrics compared to training on uncurated data and compared to post-hoc methods; (2) We demonstrate that reallocating model capacity to denser regions of the data manifold leads to efficiency gains: meaning that we can achieve SOTA quality with a smaller-capacity model trained in far less time; and (3) We show that instance selection and truncation are stackable, leading to mutual benefit
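The instance-selection idea in the highlights above can be sketched as scoring precomputed embeddings under a Gaussian density model and keeping only the densest fraction. The helper name `select_instances` and the interface are illustrative, not the paper's exact implementation, and a full pipeline would first embed images with a pretrained network:

```python
import numpy as np

def select_instances(embeddings, retention_ratio=0.5):
    """Keep the `retention_ratio` fraction of points with the highest
    Gaussian log-likelihood under a Gaussian fit to the embeddings,
    i.e. discard points in low-density regions of the data manifold."""
    mu = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    inv = np.linalg.pinv(cov)
    diff = embeddings - mu
    # Log-likelihood up to a constant = negative squared Mahalanobis distance.
    scores = -np.einsum('ij,jk,ik->i', diff, inv, diff)
    n_keep = int(len(embeddings) * retention_ratio)
    keep = np.argsort(scores)[-n_keep:]
    return np.sort(keep)
```

Training then proceeds on `dataset[select_instances(features, rr)]` with an otherwise unmodified GAN, which is what makes the approach stackable with truncation or rejection sampling.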
Methods
  • The authors review evaluation metrics, motivate selecting instances based on manifold density, and analyze the impact of applying instance selection to GAN training.
Results
  • The authors use a variety of evaluation metrics to diagnose the effect that training with instance selection has on the learned distribution, including: (1) Inception Score (IS) [22], (2) Fréchet Inception Distance (FID) [9], (3) Precision and Recall (P&R) [13], and (4) Density and Coverage (D&C) [18].
  • When calculating FID, the authors follow Brock et al. [2] in using all images in the training set to estimate the reference distribution and sampling 50k images to make up the generated distribution.
  • For P&R and D&C the authors use an InceptionV3 embedding. N and M are set to 10k samples for both the reference and generated distributions, and K is set to 5, as recommended by Naeem et al. [18]
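Of these metrics, FID has a closed form once Gaussians are fit to the embedding statistics: the squared Fréchet distance between two Gaussians. A minimal sketch, operating on precomputed embedding features rather than the full InceptionV3 pipeline:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(real_feats, fake_feats):
    """Frechet distance between Gaussians fit to two sets of embeddings:
    ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^{1/2})."""
    mu_r, mu_f = real_feats.mean(0), fake_feats.mean(0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):
        # Numerical noise in sqrtm can produce tiny imaginary parts.
        covmean = covmean.real
    return float(np.sum((mu_r - mu_f) ** 2)
                 + np.trace(cov_r + cov_f - 2.0 * covmean))
```

Identical feature sets give a score near zero; shifting one set's mean raises the score by the squared shift, which is why FID is sensitive to both fidelity and mode coverage.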
Conclusion
  • While folk wisdom suggests that more data is better, it is known that areas of the data manifold that are sparsely represented pose a challenge to current GANs [10].
  • To directly address this challenge the authors introduce a new tool: dataset curation via instance selection.
  • The authors argue that dataset curation is more important here than in supervised learning, because generative modelling lacks the annotation phase during which humans often perform some kind of formal or informal curation
Tables
  • Table1: Comparison of embedding and scoring functions. Models trained with instance selection significantly outperform models trained without instance selection, despite training on a fraction of the available data. RR is the retention ratio (percentage of dataset trained on). Best results in bold
  • Table2: Performance of models on the 128 × 128 ImageNet image generation task
  • Table3: Performance of models trained on 64 × 64 resolution ImageNet. A retention ratio of less than 100 indicates that instance selection is used. Best results in bold
Related work
  • Generative modelling of images is a very challenging problem due to the high dimensional nature of images and the complexity of the distributions they form. Several different approaches towards image generation have been proposed, with GANs currently being state-of-the-art in terms of image generation quality. In this work we will focus primarily on GANs, but other types of generative models might also benefit from instance selection prior to model fitting.

    2.1 Sample Filtering in GANs

    One way to improve the sample quality from GANs without making any changes to the architecture or optimization algorithm is by applying techniques which automatically filter out poor quality samples from a trained model. Discriminator Rejection Sampling (DRS) [1] accomplishes this by performing rejection sampling on the generator. This process is informed by the discriminator, which is reused to estimate density ratios between the real and generated image manifolds. Metropolis-Hastings GAN (MH-GAN) [26] builds on DRS by i) calibrating the discriminator to achieve more accurate density ratio estimates, and by ii) applying Markov chain Monte Carlo (MCMC) instead of rejection sampling for better performance on high dimensional data. Ding et al [7] further improve density ratio estimates by fine-tuning a pretrained ImageNet classifier for the task. For more efficient sampling, Discriminator Driven Latent Sampling (DDLS) [4] iteratively updates samples in the latent space to push them closer to realistic outputs.
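The rejection step that DRS performs can be sketched roughly as follows. This is a simplified version, assuming the discriminator logit approximates the log density ratio log(p_data/p_g); the actual method [1] also estimates the maximum logit from a burn-in set and applies a logarithmic correction term, both omitted here, and `drs_accept` is an illustrative name:

```python
import numpy as np

def drs_accept(logits, gamma=0.0, rng=None):
    """Accept each generated sample with probability
    sigmoid(D(x) - D_max - gamma), where D(x) is the discriminator
    logit and gamma trades acceptance rate against sample quality.
    Returns a boolean acceptance mask."""
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    x = logits - logits.max() - gamma
    # Numerically stable sigmoid.
    accept_prob = np.exp(np.minimum(x, 0.0)) / (1.0 + np.exp(-np.abs(x)))
    return rng.random(logits.shape) < accept_prob
```

Lowering `gamma` accepts more samples (including lower-quality ones); raising it makes the filter stricter, mirroring the quality/throughput trade-off these post-hoc methods all share.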
Funding
  • Resources used in preparing this research were provided to GWT and TD, in part, by NSERC, the Canada Foundation for Innovation, the Province of Ontario, the Government of Canada through CIFAR, Compute Canada, and companies sponsoring the Vector Institute: http://www.vectorinstitute.ai/#partners
Study subjects and analysis
Instead, we use a single class-conditional BigGAN from [2] that has been pretrained on ImageNet at 128 × 128 resolution. For each class, we sample 700 real images from the dataset and generate 700 class-conditioned samples with the BigGAN. To measure the density of each class manifold we compare three methods: Gaussian likelihood, Probabilistic Principal Component Analysis (PPCA) likelihood, and distance to the K-th nearest neighbour (KNN distance) (§3)
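The simplest of the three density measures to sketch is the KNN-distance score. The following brute-force version is illustrative only (`knn_distance_scores` is a hypothetical helper); the paper's pipeline operates on pretrained-network embeddings, and a real implementation would use a KD-tree or approximate nearest neighbours rather than the full pairwise distance matrix:

```python
import numpy as np

def knn_distance_scores(embeddings, k=5):
    """Score each point by its distance to its k-th nearest neighbour;
    smaller scores indicate denser regions of the manifold."""
    x = np.asarray(embeddings, dtype=float)
    # Full pairwise squared-distance matrix (O(n^2) memory).
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # exclude each point's self-distance
    # k-th smallest distance per row (k=1 is the nearest neighbour).
    kth = np.partition(d2, k - 1, axis=1)[:, k - 1]
    return np.sqrt(kth)
```

Points in a tight cluster get scores near zero while isolated outliers get large scores, so ranking by this score identifies the sparse manifold regions that the three methods are compared on.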

References
  • [1] Samaneh Azadi, Catherine Olsson, Trevor Darrell, Ian J. Goodfellow, and Augustus Odena. Discriminator rejection sampling. arXiv:1810.06758, 2019.
  • [2] Andrew Brock, Jeff Donahue, and Karen Simonyan. Large scale GAN training for high fidelity natural image synthesis. In ICLR, 2019.
  • [3] Joel Luis Carbonera and Mara Abel. A density-based approach for instance selection. In International Conference on Tools with Artificial Intelligence (ICTAI), pages 768–774, 2015.
  • [4] Tong Che, Ruixiang Zhang, Jascha Sohl-Dickstein, Hugo Larochelle, Liam Paull, Yuan Cao, and Yoshua Bengio. Your GAN is secretly an energy-based model and you should use discriminator driven latent sampling. arXiv:2003.06060, 2020.
  • [5] Aidan Clark, Jeff Donahue, and Karen Simonyan. Efficient video generation on complex datasets. arXiv:1907.06571, 2019.
  • [6] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, pages 248–255, 2009.
  • [7] Xin Ding, Z. Jane Wang, and William J. Welch. Subsampling generative adversarial networks: Density ratio estimation in feature space with softplus loss. IEEE Transactions on Signal Processing, 68:1910–1922, 2020.
  • [8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
  • [9] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NeurIPS, pages 6626–6637, 2017.
  • [10] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In CVPR, pages 4401–4410, 2019.
  • [11] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Analyzing and improving the image quality of StyleGAN. arXiv:1912.04958, 2019.
  • [12] Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with limited data. arXiv:2006.06676, 2020.
  • [13] Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Improved precision and recall metric for assessing generative models. In NeurIPS, pages 3929–3938, 2019.
  • [14] Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, and Laurens van der Maaten. Exploring the limits of weakly supervised pretraining. In ECCV, pages 181–196, 2018.
  • [15] Marco Marchesi. Megapixel size image creation using generative adversarial networks. arXiv:1706.00082, 2017.
  • [16] Leland McInnes, John Healy, and James Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv:1802.03426, 2018.
  • [17] Lars Mescheder, Andreas Geiger, and Sebastian Nowozin. Which training methods for GANs do actually converge? arXiv:1801.04406, 2018.
  • [18] Muhammad Ferjad Naeem, Seong Joon Oh, Youngjung Uh, Yunjey Choi, and Jaejun Yoo. Reliable fidelity and diversity metrics for generative models. arXiv:2002.09797, 2020.
  • [19] Fajar Ulin Nuha et al. Training dataset reduction on generative adversarial network. Procedia Computer Science, 144:133–139, 2018.
  • [20] J. Arturo Olvera-López, J. Ariel Carrasco-Ochoa, J. Francisco Martínez-Trinidad, and Josef Kittler. A review of instance selection methods. Artificial Intelligence Review, 34(2):133–143, 2010.
  • [21] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
  • [22] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In NeurIPS, pages 2234–2242, 2016.
  • [23] Samarth Sinha, Anirudh Goyal, Colin Raffel, and Augustus Odena. Top-k training of GANs: Improving generators by making critics less critical. arXiv:2002.06224, 2020.
  • [24] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the Inception architecture for computer vision. In CVPR, pages 2818–2826, 2016.
  • [25] Michael E. Tipping and Christopher M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B, 61(3):611–622, 1999.
  • [26] Ryan Turner, Jane Hung, Yunus Saatci, and Jason Yosinski. Metropolis-Hastings generative adversarial networks. In ICML, 2019.
  • [27] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. ESRGAN: Enhanced super-resolution generative adversarial networks. In ECCV, 2018.
  • [28] Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, and Timothy P. Lillicrap. LOGAN: Latent optimisation for generative adversarial networks. arXiv:1912.00953, 2019.
  • [29] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. Aggregated residual transformations for deep neural networks. In CVPR, pages 1492–1500, 2017.
  • [30] Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. Generative image inpainting with contextual attention. In CVPR, pages 5505–5514, 2018.
  • [31] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. arXiv:1805.08318, 2018.
  • [32] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, pages 586–595, 2018.
  • [33] Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, and Song Han. Differentiable augmentation for data-efficient GAN training. arXiv:2006.10738, 2020.
  • [34] Yang Zhao, Chunyuan Li, Ping Yu, Jianfeng Gao, and Changyou Chen. Feature quantization improves GAN training. arXiv:2004.02088, 2020.
  • [35] Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
  • [36] Jiapeng Zhu, Yujun Shen, Deli Zhao, and Bolei Zhou. In-domain GAN inversion for real image editing. arXiv:2004.00049, 2020.