Unrolled Generative Adversarial Networks.
We introduce a method to stabilize Generative Adversarial Networks (GANs) by defining the generator objective with respect to an unrolled optimization of the discriminator. This allows training to be adjusted between using the optimal discriminator in the generator's objective, which is ideal but infeasible in practice, and using the current value of the discriminator, which is often unstable and leads to poor solutions.
- The use of deep neural networks as generative models for complex data has made great advances in recent years.
- Two models are used to solve a minimax game: a generator which samples data, and a discriminator which classifies the data as real or generated
- In theory these models are capable of modeling an arbitrarily complex probability distribution.
- The flexibility of the GAN framework has enabled a number of successful extensions of the technique, for instance for structured prediction (Reed et al, 2016a;b; Odena et al, 2016), training energy based models (Zhao et al, 2016), and combining the GAN loss with a mutual information loss (Chen et al, 2016)
- The use of deep neural networks as generative models for complex data has made great advances in recent years. This success has been achieved through a surprising diversity of training losses and model architectures, including denoising autoencoders (Vincent et al, 2010), variational autoencoders (Kingma & Welling, 2013; Rezende et al, 2014; Gregor et al, 2015; Kulkarni et al, 2015; Burda et al, 2015; Kingma et al, 2016), generative stochastic networks (Alain et al, 2015), diffusion probabilistic models (Sohl-Dickstein et al, 2015), autoregressive models (Theis & Bethge, 2015; van den Oord et al, 2016a;b), real non-volume preserving transformations (Dinh et al, 2014; 2016), Helmholtz machines (Dayan et al, 1995; Bornschein et al, 2015), and Generative Adversarial Networks (GANs) (Goodfellow et al, 2014)
- While most deep generative models are trained by maximizing log likelihood or a lower bound on log likelihood, Generative Adversarial Networks take a radically different approach that does not require inference or explicit calculation of the data likelihood
- We explore how unrolling compares to historical averaging, and to using the unrolled discriminator to update the generator without backpropagating through the unrolled discriminator updates. In both cases we find that the unrolled objective performs better.
- To evaluate the ability of this approach to improve trainability in a pathological setting with mismatched generator and discriminator, we look to a traditionally challenging family of models to train: recurrent neural networks (RNNs)
- In this work we developed a method to stabilize Generative Adversarial Networks training and reduce mode collapse by defining the generator objective with respect to unrolled optimization of the discriminator
- The GAN learning problem is to find the optimal parameters θ_G^* for a generator function G(z; θ_G) in a minimax objective:

  θ_G^* = argmin_{θ_G} max_{θ_D} f(θ_G, θ_D)        (1)
        = argmin_{θ_G} f(θ_G, θ_D^*(θ_G)),           (2)
  θ_D^*(θ_G) = argmax_{θ_D} f(θ_G, θ_D),             (3)

  where f is commonly chosen to be f(θ_G, θ_D) = E_{x∼p_data}[log(D(x; θ_D))] + E_{z∼N(0,I)}[log(1 − D(G(z; θ_G); θ_D))]. (A toy code sketch of the corresponding unrolled generator objective appears after this list.)
- In this work the authors developed a method to stabilize GAN training and reduce mode collapse by defining the generator objective with respect to unrolled optimization of the discriminator.
- The authors have some initial positive results suggesting it may be sufficient to further perturb the training gradient in the same direction that a single unrolling step perturbs it.
- While this is more computationally efficient, further investigation is required
- Table 1: Unrolled GANs cover more discrete modes when modeling a dataset with 1,000 data modes, corresponding to all combinations of three MNIST digits (10^3 digit combinations). The number of modes covered is given for different numbers of unrolling steps, and for two different architectures
- Table 2: Unrolled GANs better model a continuous distribution. GANs are trained to model randomly colored MNIST digits, where the color is drawn from a Gaussian distribution. The JS divergence between the data and model distributions over digit colors is then reported, along with standard error in the JS divergence. More unrolling steps, and larger models, lead to better JS divergence
- Table 3: GANs trained with unrolling are better able to match images in the training set than standard GANs, likely due to mode dropping by the standard GAN. Results show the MSE between training images and the best reconstruction for a model with the given number of unrolling steps. The fraction of training images best reconstructed by a given model is given in the final column. The best reconstruction is found by optimizing the latent representation z to produce the closest matching pixel output G(z; θ_G). Results are averaged over all 5 runs of each model with different random seeds
- In the zero-step case, reconstruction is poor, and it obtains the lowest error of the four configurations less than 1% of the time.
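The unrolled generator objective summarized above can be made concrete in a few lines. Below is a minimal, hypothetical JAX sketch on a toy one-dimensional GAN: it takes k gradient-ascent steps on the discriminator parameters inside the generator loss and then differentiates through those steps. All names here (gan_value, unrolled_loss, the affine generator, the logistic discriminator, the step sizes) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch only: a toy 1-D GAN in JAX illustrating the unrolled
# generator objective. Model, names, and step sizes are illustrative assumptions.
import jax
import jax.numpy as jnp

def discriminator(x, theta_D):
    # Toy logistic discriminator D(x; theta_D) = sigmoid(w * x + b).
    w, b = theta_D
    return jax.nn.sigmoid(w * x + b)

def generator(z, theta_G):
    # Toy affine generator G(z; theta_G) = mu + sigma * z.
    mu, sigma = theta_G
    return mu + sigma * z

def gan_value(theta_G, theta_D, x_real, z):
    # f(theta_G, theta_D) = E[log D(x)] + E[log(1 - D(G(z)))], as defined above.
    d_real = discriminator(x_real, theta_D)
    d_fake = discriminator(generator(z, theta_G), theta_D)
    return jnp.mean(jnp.log(d_real)) + jnp.mean(jnp.log(1.0 - d_fake))

def unrolled_loss(theta_G, theta_D, x_real, z, k=5, eta=0.1):
    # Unroll k gradient-ascent steps on the discriminator, then evaluate f at the
    # resulting surrogate parameters. Since these steps are ordinary JAX ops,
    # jax.grad below differentiates through the whole unrolled computation.
    for _ in range(k):
        g_D = jax.grad(gan_value, argnums=1)(theta_G, theta_D, x_real, z)
        theta_D = [p + eta * g for p, g in zip(theta_D, g_D)]
    return gan_value(theta_G, theta_D, x_real, z)

# One generator update: descend the unrolled objective f_k(theta_G, theta_D).
x_real = 1.5 + 0.5 * jax.random.normal(jax.random.PRNGKey(0), (256,))  # data ~ N(1.5, 0.5^2)
z = jax.random.normal(jax.random.PRNGKey(1), (256,))                   # latent noise ~ N(0, 1)
theta_G = [jnp.array(0.0), jnp.array(1.0)]
theta_D = [jnp.array(0.0), jnp.array(0.0)]
g_G = jax.grad(unrolled_loss, argnums=0)(theta_G, theta_D, x_real, z)
theta_G = [p - 0.01 * g for p, g in zip(theta_G, g_G)]                  # generator minimizes f_k
```

In the paper, only the generator's objective uses this unrolled surrogate; the discriminator itself is still updated with ordinary gradient steps on f.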
Study subjects and analysis
Giving more information to G by allowing it to ‘see into the future’ may thus help the two models be more balanced. In this section we demonstrate improved mode coverage and stability by applying this technique to five datasets of increasing complexity. Evaluation of generative models is a notoriously hard problem (Theis et al., 2016).
This new dataset has 1,000 distinct modes, corresponding to each combination of the ten MNIST classes in the three channels. We train a GAN on this dataset, and generate samples from the trained model (25,600 samples for all experiments). We then compute the predicted class label of each color channel using a pre-trained MNIST classifier.
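A hypothetical sketch of this mode-counting evaluation follows; `classify_digits` stands in for the pre-trained per-channel MNIST classifier (returning integer labels 0-9) and `generated_batch` for samples from the GAN, neither of which is defined here.

```python
# Hypothetical sketch of the mode-counting evaluation described above.
import jax.numpy as jnp

def count_covered_modes(samples, classify_digits):
    # samples: (N, H, W, 3) generated colored-MNIST images.
    # Classify each of the three channels independently, then encode the triple
    # of predicted digits as a single mode id in [0, 999].
    labels = [classify_digits(samples[..., c]) for c in range(3)]  # three (N,) int arrays
    mode_ids = labels[0] * 100 + labels[1] * 10 + labels[2]
    return int(jnp.unique(mode_ids).size)  # number of the 10^3 modes that appear

# e.g. n_modes = count_covered_modes(generated_batch, classify_digits)
```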
- Guillaume Alain, Yoshua Bengio, Li Yao, Jason Yosinski, Eric Thibodeau-Laufer, Saizheng Zhang, and Pascal Vincent. GSNs: Generative stochastic networks. arXiv preprint arXiv:1503.05571, 2015.
- Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. arXiv preprint arXiv:1606.04474, 2016.
- David Belanger and Andrew McCallum. Structured prediction energy networks. arXiv preprint arXiv:1511.06350, 2015.
- Jorg Bornschein, Samira Shabanian, Asja Fischer, and Yoshua Bengio. Bidirectional helmholtz machines. arXiv preprint arXiv:1506.03877, 2015.
- Michael Bowling and Manuela Veloso. Multiagent learning using a variable learning rate. Artificial Intelligence, 136(2):215–250, 2002.
- Yuri Burda, Roger B. Grosse, and Ruslan Salakhutdinov. Importance weighted autoencoders. arXiv preprint arXiv:1509.00519, 2015.
- Alex J. Champandard. Semantic style transfer and turning two-bit doodles into fine artworks. arXiv preprint arXiv:1603.01768, 2016.
- Tong Che, Yanran Li, Athul Paul Jacob, Yoshua Bengio, and Wenjie Li. Mode regularized generative adversarial networks. arXiv preprint arXiv:1612.02136, 2016.
- Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. arXiv preprint arXiv:1606.03657, 2016.
- John M Danskin. The theory of max-min and its application to weapons allocation problems, volume 5. Springer Science & Business Media, 1967.
- Peter Dayan, Geoffrey E Hinton, Radford M Neal, and Richard S Zemel. The helmholtz machine. Neural computation, 7(5):889–904, 1995.
- Laurent Dinh, David Krueger, and Yoshua Bengio. NICE: non-linear independent components estimation. arXiv preprint arXiv:1410.8516, 2014.
- Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real NVP. arXiv preprint arXiv:1605.08803, 2016.
- Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Alex Lamb, Martin Arjovsky, Olivier Mastropietro, and Aaron Courville. Adversarially learned inference. arXiv preprint arXiv:1606.00704, 2016.
- Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In JMLR W&CP: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), volume 9, pp. 249–256, May 2010.
- Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (eds.), Advances in Neural Information Processing Systems 27, pp. 2672–2680. Curran Associates, Inc., 2014. URL http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
- Karol Gregor, Ivo Danihelka, Alex Graves, and Daan Wierstra. DRAW: A recurrent neural network for image generation. In Proceedings of The 32nd International Conference on Machine Learning, pp. 1462–1471, 2015. URL http://www.jmlr.org/proceedings/papers/v37/gregor15.html.
- Tian Han, Yang Lu, Song-Chun Zhu, and Ying Nian Wu. Alternating back-propagation for generator network, 2016. URL https://arxiv.org/abs/1606.08571.
- Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural Comput., 9(8):1735–1780, November 1997. ISSN 0899-7667. doi: 10.1162/neco.1997.9.8.1735. URL http://dx.doi.org/10.1162/neco.1997.9.8.1735.
- Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pp. 448–456, 2015. URL http://jmlr.org/proceedings/papers/v37/ioffe15.html.
- Justin Johnson, Alexandre Alahi, and Fei-Fei Li. Perceptual losses for real-time style transfer and super-resolution. arXiv preprint arXiv:1603.08155, 2016.
- Anatoli Juditsky, Arkadi Nemirovski, et al. First order methods for nonsmooth convex large-scale optimization, i: general purpose methods. Optimization for Machine Learning, pp. 121–148, 2011.
- Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Diederik P Kingma and Max Welling. Auto-encoding variational bayes, 2013. URL https://arxiv.org/abs/1312.6114.
- Diederik P. Kingma, Tim Salimans, and Max Welling. Improving variational inference with inverse autoregressive flow. 2016.
- Tejas D. Kulkarni, Will Whitney, Pushmeet Kohli, and Joshua B. Tenenbaum. Deep convolutional inverse graphics network. arXiv preprint arXiv:1503.03167, 2015.
- Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, and Wenzhe Shi. Photo-realistic single image super-resolution using a generative adversarial network, 2016. URL https://arxiv.org/abs/1609.04802.
- Dougal Maclaurin, David Duvenaud, and Ryan P. Adams. Gradient-based hyperparameter optimization through reversible learning, 2015.
- Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, and Jeff Clune. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. arXiv preprint arXiv:1605.09304, 2016.
- Sebastian Nowozin, Botond Cseke, and Ryota Tomioka. f-gan: Training generative neural samplers using variational divergence minimization. arXiv preprint arXiv:1606.00709, 2016.
- Augustus Odena, Christopher Olah, and Jonathon Shlens. Conditional image synthesis with auxiliary classifier gans. arXiv preprint arXiv:1610.09585, 2016.
- Barak A. Pearlmutter and Jeffrey Mark Siskind. Reverse-mode ad in a functional framework: Lambda the ultimate backpropagator. ACM Trans. Program. Lang. Syst., 30(2):7:1–7:36, March 2008. ISSN 0164-0925. doi: 10.1145/1330017.1330018. URL http://doi.acm.org/10.1145/1330017.1330018.
- Ben Poole, Alexander A Alemi, Jascha Sohl-Dickstein, and Anelia Angelova. Improved generator objectives for gans. arXiv preprint arXiv:1612.02780, 2016.
- Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
- Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, and Honglak Lee. Learning what and where to draw. In NIPS, 2016a.
- Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text-to-image synthesis. In Proceedings of The 33rd International Conference on Machine Learning, 2016b.
- Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. Stochastic backpropagation and variational inference in deep latent gaussian models. In International Conference on Machine Learning. Citeseer, 2014.
- Tim Salimans, Ian J. Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training gans. arXiv preprint arXiv:1606.03498, 2016.
- Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
- Satinder Singh, Michael Kearns, and Yishay Mansour. Nash convergence of gradient dynamics in general-sum games. In Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence, pp. 541–548. Morgan Kaufmann Publishers Inc., 2000.
- Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In Proceedings of The 32nd International Conference on Machine Learning, pp. 2256–2265, 2015. URL http://arxiv.org/abs/1503.03585.
- Casper Kaae Sonderby, Jose Caballero, Lucas Theis, Wenzhe Shi, and Ferenc Huszar. Amortised map inference for image super-resolution, 2016. URL https://arxiv.org/abs/1610.04490v1.
- L. Theis and M. Bethge. Generative image modeling using spatial lstms. In Advances in Neural Information Processing Systems 28, Dec 2015. URL http://arxiv.org/abs/1506.03478/.
- L. Theis, A. van den Oord, and M. Bethge. A note on the evaluation of generative models. In International Conference on Learning Representations, Apr 2016. URL http://arxiv.org/abs/1511.01844.
- T. Tieleman and G. Hinton. Lecture 6.5—RmsProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 2012.
- Aaron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, abs/1601.06759, 2016a. URL http://arxiv.org/abs/1601.06759.
- Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. Conditional image generation with pixelcnn decoders. arXiv preprint arXiv:1606.05328, 2016b.
- Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res., 11:3371–3408, December 2010. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=1756006.1953039.
- Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579, 2015.
- Chongjie Zhang and Victor R Lesser. Multi-agent learning with policy prediction. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010.
- Junbo Zhao, Michael Mathieu, and Yann LeCun. Energy-based generative adversarial network. arXiv preprint arXiv:1609.03126, 2016.
- Jun-Yan Zhu, Philipp Krahenbuhl, Eli Shechtman, and Alexei A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of European Conference on Computer Vision (ECCV), 2016.