NIPS 2016 Tutorial: Generative Adversarial Networks.

arXiv: Learning, (2017)

Cited: 1252 | Views: 211

Abstract

This report summarizes the tutorial presented by the author at NIPS 2016 on generative adversarial networks (GANs). The tutorial describes: (1) why generative modeling is a topic worth studying, (2) how generative models work, and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods. Finally, the tutorial contains three exercises for readers to complete, and the solutions to these exercises.

Introduction
  • This report summarizes the content of the NIPS 2016 tutorial on generative adversarial networks (GANs) (Goodfellow et al., 2014b).
  • The tutorial was designed primarily to answer the questions that audience members had asked ahead of time, in order to make it as useful as possible to the audience.
  • This tutorial is not intended to be a comprehensive review of the field of GANs; many excellent papers are not described here, because they were not relevant to answering the most frequent questions, and because the tutorial was delivered as a two-hour oral presentation that did not have unlimited time to cover all subjects.
  • The slides for the tutorial are available in PDF and Keynote format; the PDF version is at http://www.iangoodfellow.com/slides/2016-12-04-NIPS.pdf
Highlights
  • This report summarizes the content of the NIPS 2016 tutorial on generative adversarial networks (GANs) (Goodfellow et al., 2014b)
  • The tutorial describes: (1) why generative modeling is a topic worth studying, (2) how generative models work and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods
  • Evaluation of generative models is another highly important research area related to GANs: it is not clear how to quantitatively evaluate generative models
  • GANs are somewhat harder to evaluate than other generative models because it can be difficult to estimate their likelihood (though it is possible; see Wu et al. (2016))
  • The only real requirement imposed on the design of the generator by the generative adversarial networks framework is that the generator must be differentiable
  • GANs can use this supervised ratio estimation technique to approximate many cost functions, including the KL divergence used for maximum likelihood estimation (a minimal training-step sketch follows this list)
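The two points above (a differentiable generator, and a discriminator trained as an ordinary supervised classifier whose output serves as a density-ratio estimate) can be illustrated with a minimal training-step sketch. This is not the tutorial's code: it assumes PyTorch, uses hypothetical toy network sizes, and implements the non-saturating heuristic generator cost described in the tutorial.

```python
# Minimal GAN training-step sketch (assumes PyTorch; sizes below are hypothetical
# toy values, not settings from the tutorial).
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2

# Differentiable generator: maps noise z to a sample G(z).
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
# Discriminator: an ordinary binary classifier; at its optimum its sigmoid output
# equals p_data(x) / (p_data(x) + p_model(x)), i.e. it encodes a density ratio.
D = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def training_step(real_batch):
    n = real_batch.shape[0]
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator update: supervised classification of real vs. generated data.
    fake = G(torch.randn(n, latent_dim)).detach()   # no generator update here
    d_loss = bce(D(real_batch), ones) + bce(D(fake), zeros)
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator update with the non-saturating heuristic cost: maximize log D(G(z)).
    # The gradient reaches G's parameters only because G is differentiable.
    g_loss = bce(D(G(torch.randn(n, latent_dim))), ones)
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```

Because the generator is differentiable, the gradient of the discriminator's score flows through the generated samples back into the generator's parameters; this is exactly the property that breaks down for discrete outputs, as discussed in the Results below.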
Results
  • Evaluation of generative models: another highly important research area related to GANs is that it is not clear how to quantitatively evaluate generative models.
  • The only real requirement imposed on the design of the generator by the GAN framework is that the generator must be differentiable
  • This means that the generator cannot produce discrete data, such as one-hot word or character representations.
  • Removing this limitation is an important research direction that could unlock the potential of GANs for NLP.
  • One possible workaround is to use the REINFORCE algorithm (Williams, 1992) to estimate gradients through the discrete sampling step (a minimal sketch follows this list).
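The REINFORCE workaround treats the discrete sampling step as a stochastic policy and uses the score-function (log-derivative) estimator, with the discriminator's output as a reward, so that no gradient has to flow through the non-differentiable sample. The sketch below only illustrates that idea, assuming PyTorch; the token-level discriminator, the network sizes, and the simple mean baseline are assumptions, not part of the tutorial.

```python
# REINFORCE-style generator update for discrete GAN outputs (sketch; assumes
# PyTorch; the token-level discriminator and sizes are illustrative assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical

latent_dim, vocab_size = 16, 100

# Generator emits logits over a discrete vocabulary instead of continuous values.
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, vocab_size))
# Discriminator scores a one-hot token; real systems would score whole sequences.
D = nn.Sequential(nn.Linear(vocab_size, 64), nn.ReLU(), nn.Linear(64, 1))
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)

def reinforce_generator_step(batch_size=32):
    dist = Categorical(logits=G(torch.randn(batch_size, latent_dim)))
    tokens = dist.sample()                       # non-differentiable sampling step
    one_hot = F.one_hot(tokens, vocab_size).float()

    with torch.no_grad():                        # reward: D's belief the sample is real
        reward = torch.sigmoid(D(one_hot)).squeeze(1)
    baseline = reward.mean()                     # simple variance-reduction baseline

    # Score-function estimator: E[(reward - baseline) * grad log pi(token)].
    loss = -((reward - baseline) * dist.log_prob(tokens)).mean()
    opt_G.zero_grad()
    loss.backward()
    opt_G.step()
    return loss.item()
```

The score-function estimator is unbiased but high-variance; continuous relaxations such as the concrete distribution (Maddison et al., 2016) and Gumbel-softmax (Jang et al., 2016), both cited below, are alternative routes toward GANs over discrete data.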
Conclusion
  • GANs are generative models that use supervised learning to approximate an intractable cost function, much as Boltzmann machines use Markov chains to approximate their cost and VAEs use the variational lower bound to approximate their cost.
  • GANs can use this supervised ratio estimation technique to approximate many cost functions, including the KL divergence used for maximum likelihood estimation (the relevant identities are sketched after this list).
  • GAN training relies on finding the equilibrium of a game with continuous, high-dimensional parameters; researchers should strive to develop better theoretical understanding and better training algorithms for this scenario.
  • Success on this front would improve many other applications besides GANs.
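The ratio-estimation claim above can be made concrete with the standard GAN analysis; the LaTeX sketch below restates the identities, with the maximum-likelihood generator cost following Goodfellow (2014) and σ denoting the logistic sigmoid.

```latex
% For a fixed generator, the optimal discriminator is the density-ratio classifier
\[
  D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{\mathrm{model}}(x)},
  \qquad\text{so}\qquad
  \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{model}}(x)} = \frac{D^{*}(x)}{1 - D^{*}(x)}.
\]
% Different generator costs built on this ratio recover different divergences.
% In particular, with \(\sigma\) the logistic sigmoid, the generator cost
\[
  J^{(G)} = -\tfrac{1}{2}\,\mathbb{E}_{z}\!\left[\exp\!\left(\sigma^{-1}\!\left(D(G(z))\right)\right)\right]
\]
% has the same expected gradient as maximum likelihood (minimizing
% KL(p_data || p_model)) when the discriminator is optimal (Goodfellow, 2014).
```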
Background
  • The term "generative model" refers to any model that takes a training set, consisting of samples drawn from a distribution pdata, and learns to represent an estimate of that distribution somehow.
Study subjects and analysis
  • Goodfellow (2014) demonstrated the following connections between minimax GANs, noise-contrastive estimation (NCE), and maximum likelihood estimation (MLE): all three can be interpreted as strategies for playing a minimax game with the same value function. The biggest difference is in where pmodel lies: for GANs, the generator is pmodel, while for NCE and MLE, pmodel is part of the discriminator. Beyond this, the differences lie in the update strategy. GANs learn both players with gradient descent. MLE learns the discriminator with gradient descent but uses a heuristic update rule for the generator: after each discriminator update step, MLE copies the density model learned inside the discriminator and converts it into a sampler to be used as the generator. NCE never updates the generator; it is just a fixed source of noise.
  • Figure (batch normalization): two minibatches of sixteen samples each, generated by a generator network using batch normalization. These minibatches illustrate a problem that occurs occasionally when using batch normalization: fluctuations in the mean and standard deviation of feature values within a minibatch can have a greater effect than the individual z codes of the images in the minibatch. Here this manifests as one minibatch containing all orange-tinted samples and the other containing all green-tinted samples. The examples within a minibatch should be independent of each other, but batch normalization has caused them to become correlated.
  • Figure (mode collapse): an illustration of the mode collapse problem on a two-dimensional toy dataset. The top row shows the target distribution pdata that the model should learn, a mixture of Gaussians in a two-dimensional space. The lower row shows a series of distributions learned over time as the GAN is trained: rather than converging to a distribution containing all of the modes in the training set, the generator only ever produces a single mode at a time, cycling between modes as the discriminator learns to reject each one. Images from Metz et al. (2016); a toy-data sketch follows this list.
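For intuition about the mode-collapse figure, the following sketch constructs a comparable two-dimensional mixture-of-Gaussians target distribution. It is only a toy reconstruction: the number of modes, their arrangement on a ring, and the noise scale are illustrative assumptions, not the exact configuration used by Metz et al. (2016).

```python
# Toy two-dimensional mixture-of-Gaussians target distribution of the kind used
# to visualize mode collapse. The 8 ring-arranged modes and std = 0.05 are
# illustrative assumptions, not the exact configuration of Metz et al. (2016).
import numpy as np

def sample_toy_pdata(n_samples, n_modes=8, radius=2.0, std=0.05, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # Mode centers evenly spaced on a circle of the given radius.
    angles = 2.0 * np.pi * np.arange(n_modes) / n_modes
    centers = np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)
    # Choose a mode uniformly per sample, then add isotropic Gaussian noise.
    which = rng.integers(0, n_modes, size=n_samples)
    return centers[which] + std * rng.standard_normal((n_samples, 2))

# A mode-collapsed generator covers only one of these clusters at a time,
# cycling between them as the discriminator learns to reject each in turn.
real_batch = sample_toy_pdata(512)
```

Training a small GAN on samples from this distribution and plotting the generator's outputs over time is the standard way such mode-cycling behavior is visualized.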

Reference
  • Abadi, M. and Andersen, D. G. (2016). Learning to protect communications with adversarial neural cryptography. arXiv preprint arXiv:1610.06918.
  • Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org.
  • Ackley, D. H., Hinton, G. E., and Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9, 147–169.
  • Bengio, Y., Thibodeau-Laufer, E., Alain, G., and Yosinski, J. (2014). Deep generative stochastic networks trainable by backprop. In ICML'2014.
  • Brock, A., Lim, T., Ritchie, J. M., and Weston, N. (2016). Neural photo editing with introspective adversarial networks. CoRR, abs/1609.07093.
  • Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., and Abbeel, P. (2016a). InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2172–2180.
  • Chen, X., Kingma, D. P., Salimans, T., Duan, Y., Dhariwal, P., Schulman, J., Sutskever, I., and Abbeel, P. (2016b). Variational lossy autoencoder. arXiv preprint arXiv:1611.02731.
  • Deco, G. and Brauer, W. (1995). Higher order statistical decorrelation without information loss. NIPS.
  • Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In CVPR09.
  • Deng, J., Berg, A. C., Li, K., and Fei-Fei, L. (2010). What does classifying more than 10,000 image categories tell us? In Proceedings of the 11th European Conference on Computer Vision: Part V, ECCV'10, pages 71–84, Berlin, Heidelberg. Springer-Verlag.
  • Denton, E., Chintala, S., Szlam, A., and Fergus, R. (2015). Deep generative image models using a Laplacian pyramid of adversarial networks. NIPS.
  • Dinh, L., Krueger, D., and Bengio, Y. (2014). NICE: Non-linear independent components estimation. arXiv:1410.8516.
  • Dinh, L., Sohl-Dickstein, J., and Bengio, S. (2016). Density estimation using real NVP. arXiv preprint arXiv:1605.08803.
  • Donahue, J., Krahenbuhl, P., and Darrell, T. (2016). Adversarial feature learning. arXiv preprint arXiv:1605.09782.
  • Dumoulin, V., Belghazi, I., Poole, B., Lamb, A., Arjovsky, M., Mastropietro, O., and Courville, A. (2016). Adversarially learned inference. arXiv preprint arXiv:1606.00704.
  • Dziugaite, G. K., Roy, D. M., and Ghahramani, Z. (2015). Training generative neural networks via maximum mean discrepancy optimization. arXiv preprint arXiv:1505.03906.
  • Edwards, H. and Storkey, A. (2015). Censoring representations with an adversary. arXiv preprint arXiv:1511.05897.
  • Fahlman, S. E., Hinton, G. E., and Sejnowski, T. J. (1983). Massively parallel architectures for AI: NETL, thistle, and Boltzmann machines. In Proceedings of the National Conference on Artificial Intelligence AAAI-83.
  • Finn, C. and Levine, S. (2016). Deep visual foresight for planning robot motion. arXiv preprint arXiv:1610.00696.
  • Finn, C., Christiano, P., Abbeel, P., and Levine, S. (2016a). A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models. arXiv preprint arXiv:1611.03852.
  • Finn, C., Goodfellow, I., and Levine, S. (2016b). Unsupervised learning for physical interaction through video prediction. NIPS.
  • Frey, B. J. (1998). Graphical models for machine learning and digital communication. MIT Press.
  • Frey, B. J., Hinton, G. E., and Dayan, P. (1996). Does the wake-sleep algorithm learn good density estimators? In D. Touretzky, M. Mozer, and M. Hasselmo, editors, Advances in Neural Information Processing Systems 8 (NIPS'95), pages 661–670. MIT Press, Cambridge, MA.
  • Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., and Lempitsky, V. (2015). Domain-adversarial training of neural networks. arXiv preprint arXiv:1505.07818.
  • Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org.
  • Goodfellow, I. J. (2014). On distinguishability criteria for estimating generative models. In International Conference on Learning Representations, Workshops Track.
  • Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014a). Explaining and harnessing adversarial examples. CoRR, abs/1412.6572.
  • Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014b). Generative adversarial networks. In NIPS'2014.
  • Gutmann, M. and Hyvarinen, A. (2010). Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. In Proceedings of The Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS'10).
  • Hinton, G. E. (2007). Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10), 428–434.
  • Hinton, G. E. and Sejnowski, T. J. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed Processing, volume 1, chapter 7, pages 282–317. MIT Press, Cambridge.
  • Hinton, G. E., Sejnowski, T. J., and Ackley, D. H. (1984). Boltzmann machines: Constraint satisfaction networks that learn. Technical Report TR-CMU-CS84-119, Carnegie-Mellon University, Dept. of Computer Science.
  • Hinton, G. E., Osindero, S., and Teh, Y. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527–1554.
  • Ho, J. and Ermon, S. (2016). Generative adversarial imitation learning. In Advances in Neural Information Processing Systems, pages 4565–4573.
  • Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift.
  • Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2016). Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004.
  • Jang, E., Gu, S., and Poole, B. (2016). Categorical reparameterization with Gumbel-softmax. arXiv preprint arXiv:1611.01144.
  • Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Kingma, D. P. (2013). Fast gradient-based inference with continuous latent variable models in auxiliary form. Technical report, arXiv:1306.0733.
  • Kingma, D. P., Salimans, T., and Welling, M. (2016). Improving variational inference with inverse autoregressive flow. NIPS.
  • Ledig, C., Theis, L., Huszar, F., Caballero, J., Aitken, A. P., Tejani, A., Totz, J., Wang, Z., and Shi, W. (2016). Photo-realistic single image super-resolution using a generative adversarial network. CoRR, abs/1609.04802.
  • Li, Y., Swersky, K., and Zemel, R. S. (2015). Generative moment matching networks. CoRR, abs/1502.02761.
  • Lotter, W., Kreiman, G., and Cox, D. (2015). Unsupervised learning of visual structure using predictive generative networks. arXiv preprint arXiv:1511.06380.
  • Maddison, C. J., Mnih, A., and Teh, Y. W. (2016). The concrete distribution: A continuous relaxation of discrete random variables. arXiv preprint arXiv:1611.00712.
  • Metz, L., Poole, B., Pfau, D., and Sohl-Dickstein, J. (2016). Unrolled generative adversarial networks. arXiv preprint arXiv:1611.02163.
  • Nguyen, A., Yosinski, J., Bengio, Y., Dosovitskiy, A., and Clune, J. (2016). Plug & play generative networks: Conditional iterative generation of images in latent space. arXiv preprint arXiv:1612.00005.
  • Nowozin, S., Cseke, B., and Tomioka, R. (2016). f-GAN: Training generative neural samplers using variational divergence minimization. arXiv preprint arXiv:1606.00709.
  • Odena, A. (2016). Semi-supervised learning with generative adversarial networks. arXiv preprint arXiv:1606.01583.
  • Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.
  • Pfau, D. and Vinyals, O. (2016). Connecting generative adversarial networks and actor-critic methods. arXiv preprint arXiv:1610.01945.
  • Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
  • Ratliff, L. J., Burden, S. A., and Sastry, S. S. (2013). Characterization and computation of local Nash equilibria in continuous games. In Communication, Control, and Computing (Allerton), 2013 51st Annual Allerton Conference on, pages 917–924. IEEE.
  • Reed, S., van den Oord, A., Kalchbrenner, N., Bapst, V., Botvinick, M., and de Freitas, N. (2016a). Generating interpretable images with controllable structure. Technical report.
  • Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., and Lee, H. (2016b). Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396.
  • Rezende, D. J. and Mohamed, S. (2015). Variational inference with normalizing flows. arXiv preprint arXiv:1505.05770.
  • Rezende, D. J., Mohamed, S., and Wierstra, D. (2014). Stochastic backpropagation and approximate inference in deep generative models. In ICML'2014. Preprint: arXiv:1401.4082.
  • Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L. (2014). ImageNet Large Scale Visual Recognition Challenge.
  • Salakhutdinov, R. and Hinton, G. (2009). Deep Boltzmann machines. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 5, pages 448–455.
  • Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., and Chen, X. (2016). Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pages 2226–2234.
  • Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.
  • Springenberg, J. T. (2015). Unsupervised and semi-supervised learning with categorical generative adversarial networks. arXiv preprint arXiv:1511.06390.
  • Springenberg, J. T., Dosovitskiy, A., Brox, T., and Riedmiller, M. (2015). Striving for simplicity: The all convolutional net. In ICLR.
  • Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. J., and Fergus, R. (2014). Intriguing properties of neural networks. ICLR, abs/1312.6199.
  • Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2015). Rethinking the Inception architecture for computer vision. arXiv e-prints.
  • Theis, L., van den Oord, A., and Bethge, M. (2015). A note on the evaluation of generative models. arXiv:1511.01844.
  • Warde-Farley, D. and Goodfellow, I. (2016). Adversarial perturbations of deep neural networks. In T. Hazan, G. Papandreou, and D. Tarlow, editors, Perturbations, Optimization, and Statistics, chapter 11. MIT Press.
  • Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229–256.
  • Wu, Y., Burda, Y., Salakhutdinov, R., and Grosse, R. (2016). On the quantitative analysis of decoder-based generative models. arXiv preprint arXiv:1611.04273.
  • Zhang, H., Xu, T., Li, H., Zhang, S., Huang, X., Wang, X., and Metaxas, D. (2016). StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. arXiv preprint arXiv:1612.03242.
  • Zhu, J.-Y., Krahenbuhl, P., Shechtman, E., and Efros, A. A. (2016). Generative visual manipulation on the natural image manifold. In European Conference on Computer Vision, pages 597–613. Springer.