Generative Flows with Matrix Exponential

Changyi Xiao

ICML, pp. 10452-10461, 2020.

Abstract:

Generative flow models, which are composed of a sequence of invertible functions, enjoy tractable exact likelihood evaluation and efficient sampling. In this paper, we incorporate the matrix exponential into generative flows. The matrix exponential is a map from matrices to invertible matrices, a property well suited to generative flows...
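Two standard facts about the matrix exponential exp(A) = sum_{k>=0} A^k/k! underpin this construction: it is always invertible, with inverse exp(-A), and det(exp(A)) = e^{tr(A)}. A minimal NumPy/SciPy check of both facts (illustrative, not the paper's code):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4))   # any square matrix, invertible or not

    E = expm(A)                   # matrix exponential: sum_k A^k / k!

    # exp(A) is always invertible, with inverse exp(-A):
    assert np.allclose(E @ expm(-A), np.eye(4))

    # det(exp(A)) = e^{tr(A)}, which makes flow log-determinants cheap:
    assert np.isclose(np.linalg.det(E), np.exp(np.trace(A)))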

Introduction
  • Generative models aim to learn a probability distribution from data sampled from that distribution; in contrast with discriminative models, they do not require a large amount of annotations.
  • Generative flow models transform a simple probability distribution into a complex one through a sequence of invertible functions.
  • They have recently gained popularity due to exact density estimation and efficient sampling.
  • Kingma & Dhariwal (2018) proposed Glow, a generative flow with invertible 1 × 1 convolutions, which significantly improved the density estimation performance of generative flow models and showed that they are capable of realistic synthesis.
  • These flows all have easily computable Jacobian determinants and inverses; the underlying change-of-variables formula is recalled below.
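For reference, exact density estimation in a flow rests on the change-of-variables formula (standard in this literature, not specific to the paper): for an invertible f with z = f(x) and a simple prior p_Z,

    \log p_X(x) = \log p_Z(f(x)) + \log \left| \det \frac{\partial f(x)}{\partial x} \right|

For a composition f = f_K \circ \cdots \circ f_1 the log-determinant terms add, which is why each layer needs a cheap Jacobian determinant (for training) and a cheap inverse (for sampling).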
Highlights
  • Generative models aim to learn a probability distribution from data sampled from that distribution; in contrast with discriminative models, they do not require a large amount of annotations
  • We propose matrix exponential coupling layers to enhance the expressiveness of networks; these can be seen as multivariate affine coupling layers (see the sketch after this list)
  • We propose matrix exponential coupling layers, which are a generalization of affine coupling layers
  • We propose a new type of generative flows, called matrix exponential flows, which utilizes the properties of matrix exponential
  • In order to solve the stability problem, we propose matrix exponential 1 × 1 convolutions and improve the coupling layers
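As a rough illustration of how a matrix exponential coupling layer generalizes affine coupling, here is a hedged PyTorch sketch in which a conditioner network produces a per-position c × c matrix M and a bias t, and the second half of the input is transformed as y2 = exp(M) x2 + t. The helper names, shapes and truncated series are assumptions, not the authors' code:

    import torch

    def mat_exp(M, terms=10):
        # Truncated power series exp(M) = I + M + M^2/2! + ...
        # M: (..., c, c), a batch of small square matrices.
        E = torch.eye(M.shape[-1], device=M.device).expand_as(M).clone()
        term = E.clone()
        for k in range(1, terms + 1):
            term = term @ M / k
            E = E + term
        return E

    def coupling_forward(x1, x2, net):
        # net (hypothetical) maps x1 to a per-position c x c matrix M
        # and a bias t; affine coupling is the special case of diagonal M.
        M, t = net(x1)                 # M: (b, h, w, c, c), t: (b, h, w, c)
        y2 = torch.einsum('bhwij,bhwj->bhwi', mat_exp(M), x2) + t
        # det(exp(M)) = e^{tr(M)} at each position, so the log-determinant
        # is simply the sum of traces of M over all positions.
        logdet = M.diagonal(dim1=-2, dim2=-1).sum(dim=(-1, -2, -3))
        return x1, y2, logdet

The inverse swaps exp(M) for exp(-M), i.e. x2 = exp(-M(x1))(y2 - t(x1)), so invertibility holds for any output of the conditioner network.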
Methods
  • The authors run several experiments to demonstrate the performance of the model.
  • In Section 6.1, the authors compare the performance on density estimation with other generative flows models.
  • In Section 6.2, the authors study the training stability of generative flows models.
  • In Section 6.3, the authors compare three 1 × 1 convolutions.
  • In Section 6.4, the authors analyze the computation of the matrix exponential (see the sketch after this list).
  • In Section 6.5, the authors show samples from the trained models.
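On the computation itself (Section 6.4), a standard method is scaling-and-squaring with a truncated power series: scale M by 2^(-m) so the series converges quickly, then square the result m times. The coefficient m reported in Table 6 plausibly plays this scaling role, though the exact rule below is an assumption:

    import numpy as np

    def expm_scaling_squaring(M, terms=8):
        # Pick m so that ||M|| / 2^m <= 1/2 (illustrative rule).
        m = max(0, int(np.ceil(np.log2(max(np.linalg.norm(M, 2), 1e-12)))) + 1)
        A = M / 2.0 ** m
        # Truncated series exp(A) ~= sum_{k=0}^{terms} A^k / k!
        E = np.eye(M.shape[0])
        term = np.eye(M.shape[0])
        for k in range(1, terms + 1):
            term = term @ A / k
            E = E + term
        # exp(M) = exp(A)^(2^m): square the result m times.
        for _ in range(m):
            E = E @ E
        return E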
Conclusion
  • The authors propose a new type of generative flows, called matrix exponential flows, which utilizes the properties of matrix exponential.
  • The authors incorporate matrix exponential into neural networks and combine it with generative flows.
  • The authors propose matrix exponential coupling layers which are a generalization of affine coupling layers.
  • In order to solve the stability problem, the authors propose matrix exponential 1 × 1 convolutions and improve the coupling layers (a sketch follows this list).
  • The authors hope that more layers based on the matrix exponential can be proposed, or that it can be incorporated into other layers.
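To make the 1 × 1 convolution concrete: parameterizing the c × c channel-mixing weight as W = exp(M) keeps it invertible for every unconstrained M, which is what addresses the stability problem, and the log-determinant reduces to a trace. A minimal PyTorch sketch using torch.matrix_exp (not the authors' code):

    import torch

    def matexp_invconv_1x1(x, M):
        # x: (b, c, h, w); M: (c, c) unconstrained parameter matrix.
        # W = exp(M) cannot become singular during training.
        W = torch.matrix_exp(M)
        y = torch.einsum('ij,bjhw->bihw', W, x)
        # The same W is applied at every spatial position, so
        # log|det| = h * w * log det(W) = h * w * tr(M).
        logdet = x.shape[2] * x.shape[3] * torch.trace(M)
        return y, logdet

    # Inverse: x = torch.einsum('ij,bjhw->bihw', torch.matrix_exp(-M), y)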
Tables
  • Table1: The definition of several related generative flows and our generative flows. These flows all have easily computable Jacobian determinants and inverses. h, w, c denote the height, width and number of channels; the symbols ⊙ and / denote element-wise multiplication and division; x and y denote tensors of shape h × w × c
  • Table2: Density estimation performance on the CIFAR-10, ImageNet 32×32 and ImageNet 64×64 datasets. Results are reported in bits/dim (negative log2-likelihood per dimension; see the conversion after this list). Models in brackets use variational dequantization (Ho et al., 2019)
  • Table3: Comparison of the number of parameters of Glow, Emerging, Flow++ and MEF
  • Table4: Comparison of models with different coupling layers and learning rates. Performance is measured in bits per dimension; the learning rate is given in brackets. Results are obtained by running 3 times with different random seeds; ± reports the standard deviation
  • Table5: Comparison of standard, PLU-decomposition and matrix exponential convolutions. Performance is measured in bits per dimension; computation is measured in running time per epoch. Results are obtained by running 3 times with different random seeds; ± reports the standard deviation
  • Table6: Mean, standard deviation, maximum and minimum of the coefficient m
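For reading Tables 2, 4 and 5: bits per dimension is the negative base-2 log-likelihood averaged over all D = h · w · c dimensions. A small conversion helper from a per-example log-likelihood in nats (illustrative, not from the paper):

    import numpy as np

    def bits_per_dim(log_likelihood_nats, num_dims):
        # bits/dim = -log2 p(x) / D = -(ln p(x) / ln 2) / D
        return -log_likelihood_nats / (np.log(2) * num_dims)

    # e.g. a CIFAR-10 image has D = 32 * 32 * 3 = 3072 dimensions:
    print(bits_per_dim(-7200.0, 32 * 32 * 3))   # ~3.38 bits/dim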
Funding
  • This work is supported by the National Natural Science Foundation of China (61672482) and Zhejiang Lab (No. 2019NB0AB03).
References
  • Behrmann, J., Grathwohl, W., Chen, R. T., Duvenaud, D., and Jacobsen, J.-H. Invertible residual networks. In International Conference on Machine Learning, pp. 573–582, 2019.
  • Chen, R. T., Behrmann, J., Duvenaud, D. K., and Jacobsen, J.-H. Residual flows for invertible generative modeling. In Advances in Neural Information Processing Systems, pp. 9916–9926, 2019.
  • Clevert, D.-A., Unterthiner, T., and Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289, 2015.
  • Culver, W. J. On the existence and uniqueness of the real logarithm of a matrix. Proceedings of the American Mathematical Society, 17(5):1146–1151, 1966.
  • Dinh, L., Krueger, D., and Bengio, Y. NICE: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516, 2014.
  • Dinh, L., Sohl-Dickstein, J., and Bengio, S. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803, 2016.
  • Durkan, C., Bekasov, A., Murray, I., and Papamakarios, G. Neural spline flows. In Advances in Neural Information Processing Systems, pp. 7511–7522, 2019.
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2672–2680, 2014.
  • Grathwohl, W., Chen, R. T., Bettencourt, J., Sutskever, I., and Duvenaud, D. FFJORD: Free-form continuous dynamics for scalable reversible generative models. In International Conference on Learning Representations, 2018.
  • Hall, B. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, volume 222, pp. 31–71. 2015.
  • He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
  • Ho, J., Chen, X., Srinivas, A., Duan, Y., and Abbeel, P. Flow++: Improving flow-based generative models with variational dequantization and architecture design. In International Conference on Machine Learning, pp. 2722–2730, 2019.
  • Hoogeboom, E., van den Berg, R., and Welling, M. Emerging convolutions for generative normalizing flows. In International Conference on Machine Learning, pp. 2771–2780, 2019.
  • Huang, C.-W., Krueger, D., Lacoste, A., and Courville, A. Neural autoregressive flows. In International Conference on Machine Learning, pp. 2083–2092, 2018.
  • Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • Kingma, D. P. and Dhariwal, P. Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pp. 10215–10224, 2018.
  • Kingma, D. P. and Welling, M. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
  • Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., and Welling, M. Improved variational inference with inverse autoregressive flow. In Advances in Neural Information Processing Systems, pp. 4743–4751, 2016.
  • Krizhevsky, A., Hinton, G., et al. Learning multiple layers of features from tiny images. Technical report, 2009.
  • Liou, M. A novel method of evaluating transient response. Proceedings of the IEEE, 54(1):20–23, 1966.
  • Moler, C. and Van Loan, C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review, 45(1):3–49, 2003.
  • Papamakarios, G., Pavlakou, T., and Murray, I. Masked autoregressive flow for density estimation. In Advances in Neural Information Processing Systems, pp. 2338–2347, 2017.
  • Rezende, D. and Mohamed, S. Variational inference with normalizing flows. In International Conference on Machine Learning, pp. 1530–1538, 2015.
  • Rezende, D. J., Mohamed, S., and Wierstra, D. Stochastic backpropagation and approximate inference in deep generative models. In International Conference on Machine Learning, pp. 1278–1286, 2014.
  • van den Oord, A., Kalchbrenner, N., and Kavukcuoglu, K. Pixel recurrent neural networks. In International Conference on Machine Learning, pp. 1747–1756, 2016.
  • Ward, P. N., Smofsky, A., and Bose, A. J. Improving exploration in soft-actor-critic with normalizing flows policies. arXiv preprint arXiv:1906.02771, 2019.