On the Limitations of First-Order Approximation in GAN Dynamics.
ICML, pp. 3011–3019, 2018.
While Generative Adversarial Networks (GANs) have demonstrated promising performance on multiple vision tasks, their learning dynamics are not yet well understood, both in theory and in practice. To address this issue, we study GAN dynamics in a simple yet rich parametric model that exhibits several of the common problematic convergence...
- Generative Adversarial Networks (GANs) have recently been proposed as a novel framework for learning generative models (Goodfellow et al., 2014).
- The key idea of GANs is to learn both the generative model and the loss function at the same time (the standard minimax objective is recalled below).
- GANs have shown promising results on a variety of tasks, and there is a large body of work that explores the power of this framework (Goodfellow, 2017).
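For reference, the classical minimax objective from Goodfellow et al. (2014), in which the discriminator $D$ acts as a learned loss for the generator $G$, is

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right].$$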
- The resulting training dynamics are usually described as a game between a generator and a discriminator.
- We provide the first rigorous proof of global convergence and show that a GAN with an optimal discriminator (its classical closed form is recalled below) always converges to an approximate equilibrium.
- We have taken a step towards a principled understanding of GAN dynamics.
- We find an interesting dichotomy: if we take optimal discriminator steps, the training dynamics provably converge, whereas first-order discriminator steps often fail to converge in practice.
- We believe that our results provide new insights into GAN training and point towards a rich algorithmic landscape to be explored in order to further understand GAN dynamics.
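For context, against a fixed generator with density $p_g$, the best-response discriminator for the objective above has the well-known closed form (Goodfellow et al., 2014)

$$D^*(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},$$

so "optimal discriminator steps" means re-solving for the discriminator's best response (within the allowed discriminator class) before each generator update, rather than taking a single gradient step.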
- To illustrate more conclusively that the phenomena demonstrated in Figure 1 are not rare, and that first-order dynamics often fail to converge, the authors conducted the following heatmap experiments (sketched in code below).
- For each of these grid points, the authors randomly chose a set of initial discriminator intervals and ran the first-order dynamics for 3000 iterations with a constant step size of 0.3.
- They repeated this 120 times for each grid point and plotted the probability that the generator converged to the truth, declaring convergence when the total variation (TV) distance between the generator's distribution and the true distribution was less than 0.1.
- They did the same for the optimal discriminator dynamics, and for unrolled discriminator dynamics with 5 unrolling steps, as described in (Metz et al., 2017), which attempt to match the optimal discriminator dynamics.
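The following is a minimal sketch of this experimental protocol. It substitutes a toy bilinear saddle objective f(θ, ψ) = θψ for the paper's GMM/interval model, and the test |θ| < 0.1 for the TV-distance criterion (recall TV(p, q) = ½∫|p − q|); the model, grid range, and convergence check are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def first_order_step(theta, psi, lr=0.3):
    """One simultaneous gradient step on the toy saddle objective
    f(theta, psi) = theta * psi (a stand-in for the paper's model).
    These dynamics are known to spiral away from the saddle (0, 0)."""
    return theta - lr * psi, psi + lr * theta

def converged(theta, tol=0.1):
    """Stand-in for the paper's test (TV distance to the truth < 0.1):
    in this toy the optimum is theta = 0, so we threshold |theta|."""
    return abs(theta) < tol

def convergence_probability(theta0, n_trials=120, n_iters=3000, seed=0):
    """Fraction of random discriminator initializations from which the
    first-order dynamics end up near the optimum, as in the heatmaps."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        theta, psi = theta0, rng.uniform(-1.0, 1.0)
        for _ in range(n_iters):
            theta, psi = first_order_step(theta, psi)
        hits += converged(theta)
    return hits / n_trials

# One row of a heatmap: convergence probability over a grid of
# generator initializations (all 0 here, since the toy game diverges).
grid = np.linspace(-2.0, 2.0, 9)
print([convergence_probability(t) for t in grid])
```

In this toy, simultaneous first-order updates provably spiral away from the saddle, so every printed probability is zero; the paper's heatmaps exhibit the analogous, though more nuanced, failure pattern for the actual GAN dynamics.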
- The authors have taken a step towards a principled understanding of GAN dynamics.
- They find an interesting dichotomy: with optimal discriminator steps, the training dynamics provably converge.
- They show experimentally that the dynamics often fail with first-order discriminator steps.
- They believe that these results provide new insights into GAN training and point towards a rich algorithmic landscape to be explored in order to further understand GAN dynamics.
- GANs have received a tremendous amount of attention over the past two years (Goodfellow, 2017). Hence we compare our results only to the most closely related papers here.
The recent paper (Arora et al., 2017) studies generalization aspects of GANs and the existence of equilibria in the two-player game. In contrast, our paper focuses on the dynamics of GAN training. We provide the first rigorous proof of global convergence and show that a GAN with an optimal discriminator always converges to an approximate equilibrium.
One recently proposed method for improving the convergence of GAN dynamics is the unrolled GAN (Metz et al., 2017). The paper proposes to “unroll” multiple discriminator gradient steps in the generator loss function. The authors argue that this improves the GAN dynamics by bringing the discriminator closer to an optimal discriminator response. Our experiments show that this is not a perfect approximation: the unrolled GAN still fails to converge from multiple initial configurations (however, it does converge more often than a “vanilla” one-step discriminator). A toy sketch of the unrolled update follows below.
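To make the unrolling idea concrete, here is a hedged illustration on the same toy bilinear game f(θ, ψ) = θψ used in the earlier sketch (again an illustrative stand-in, not the paper's model). With discriminator learning rate η, K unrolled ascent steps give ψ_K = ψ + Kηθ, so the unrolled generator loss is θ(ψ + Kηθ) and its total gradient in θ is ψ + 2Kηθ; the extra damping term is what stabilizes the toy dynamics.

```python
def unrolled_step(theta, psi, lr=0.3, eta=0.3, k=5):
    """Generator step through k unrolled discriminator ascent steps on
    f(theta, psi) = theta * psi. For this toy, the unrolled surrogate is
    theta * (psi + k*eta*theta), whose gradient in theta is
    psi + 2*k*eta*theta (computed analytically instead of via autodiff).
    The discriminator itself then takes one ordinary ascent step."""
    grad_theta = psi + 2 * k * eta * theta   # unrolled generator gradient
    theta_new = theta - lr * grad_theta
    psi_new = psi + lr * theta               # one real discriminator step
    return theta_new, psi_new

theta, psi = 1.0, 1.0
for _ in range(3000):
    theta, psi = unrolled_step(theta, psi)
print(theta, psi)  # both contract towards the equilibrium (0, 0)
```

With these constants the linear update matrix has spectral radius below 1, so the iterates contract to (0, 0), whereas the one-step dynamics in the previous sketch spiral outward. This qualitatively mirrors the reported behavior: unrolling converges more often than vanilla one-step updates, without matching the optimal-discriminator dynamics exactly.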
- Jerry Li was supported by NSF Awards CCF-1453261 (CAREER) and CCF-1565235, a Google Faculty Research Award, and an NSF Graduate Research Fellowship
- Aleksander Madry was supported in part by an Alfred P. Sloan Research Fellowship, a Google Research Award, and the NSF grant CCF-1553428
- John Peebles was supported by the NSF Graduate Research Fellowship under Grant No. 1122374 and by NSF Grant No. 1065125
- Ludwig Schmidt was supported by a Google PhD Fellowship
- Arjovsky, M. and Bottou, L. Towards principled methods for training generative adversarial networks. In ICLR, 2017.
- Arjovsky, M., Chintala, S., and Bottou, L. Wasserstein GAN. In ICML, 2017.
- Arora, S. and Zhang, Y. Do GANs actually learn the distribution? An empirical study. In ICLR, 2018.
- Arora, S., Ge, R., Liang, Y., Ma, T., and Zhang, Y. Generalization and equilibrium in generative adversarial nets (GANs). In ICML, 2017.
- Chan, S.-O., Diakonikolas, I., Servedio, R. A., and Sun, X. Efficient density estimation via piecewise polynomial approximation. In STOC, 2014.
- Devroye, L. and Lugosi, G. Combinatorial methods in density estimation. Springer Science & Business Media, 2012.
- Gautschi, W. How (un)stable are Vandermonde systems? Lecture Notes in Pure and Applied Mathematics, 124: 193–210, 1990.
- Goodfellow, I. NIPS 2016 tutorial: Generative adversarial networks. CoRR, abs/1701.00160, 2017.
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In NIPS, 2014.
- Grnarova, P., Levy, K. Y., Lucchi, A., Hofmann, T., and Krause, A. An online learning approach to generative adversarial networks. In ICLR, 2018.
- Hummel, R. and Gidas, B. Zero Crossings and the Heat Equation. New York University, 1984.
- Markov, V. On functions deviating least from zero in a given interval. Izdat. Imp. Akad. Nauk, St. Petersburg, pp. 218–258, 1892.
- Metz, L., Poole, B., Pfau, D., and Sohl-Dickstein, J. Unrolled generative adversarial networks. In ICLR, 2017.
- Nagarajan, V. and Kolter, J. Z. Gradient descent GAN optimization is locally stable. In NIPS, 2017.