# Generative Flows with Matrix Exponential

ICML, pp. 10452-10461, 2020.


Abstract:

Generative flow models, which are composed of a sequence of invertible functions, enjoy tractable exact likelihood and efficient sampling. In this paper, we incorporate the matrix exponential into generative flows. The matrix exponential is a map from matrices to invertible matrices, and this property is well suited to generative fl…
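The property the abstract leans on can be checked numerically: for any square matrix A, expm(A) is invertible, its inverse is simply expm(-A), and det(expm(A)) = exp(trace(A)) > 0. A minimal sketch using SciPy (illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))   # an arbitrary, possibly singular, matrix

W = expm(A)                   # matrix exponential: always invertible

# det(expm(A)) = exp(trace(A)) > 0, so W can never be singular
print(np.linalg.det(W), np.exp(np.trace(A)))

# the inverse needs no solver: it is expm(-A)
print(np.allclose(np.linalg.inv(W), expm(-A)))
```

This is exactly what makes the map convenient for flows: invertibility comes for free, with no constraint on the parameter matrix A.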


Introduction

- Generative models aim to learn a probability distribution from data sampled from that distribution; in contrast with discriminative models, they do not require a large amount of annotations.
- Generative flow models transform a simple probability distribution into a complex one through a sequence of invertible functions.
- They have gained popularity recently due to exact density estimation and efficient sampling.
- Kingma & Dhariwal (2018) proposed Glow, a generative flow with invertible 1 × 1 convolutions, which significantly improved the density-estimation performance of generative flow models and showed that they are capable of realistic synthesis.
- These flows all have tractable Jacobian determinants and inverses.
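The "exact density estimation" in the bullets above comes from the change-of-variables formula: for an invertible map z = f(x) with a simple base density p_Z, log p(x) = log p_Z(f(x)) + log|det ∂f/∂x|. A toy sketch with a single element-wise affine flow (the function name and setup are illustrative, not from the paper):

```python
import numpy as np

def affine_flow_logp(x, log_s, t):
    """Exact log-density of x under z = exp(log_s) * x + t, z ~ N(0, I).

    For an element-wise affine map, log|det J| is just sum(log_s),
    so the likelihood is exact and cheap.
    """
    z = np.exp(log_s) * x + t
    log_pz = -0.5 * np.sum(z**2 + np.log(2 * np.pi))  # standard normal
    return log_pz + np.sum(log_s)                     # + log|det J|
```

A full flow stacks many such invertible steps; the log-determinants simply add up across the sequence.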

Highlights

- Generative models aim to learn a probability distribution from data sampled from that distribution; in contrast with discriminative models, they do not require a large amount of annotations.
- We propose matrix exponential coupling layers, a generalization of affine coupling layers that can be seen as multivariate affine coupling layers, to enhance the expressiveness of the network.
- We propose a new type of generative flow, called matrix exponential flows, which utilizes the properties of the matrix exponential.
- To solve the stability problem, we propose matrix exponential 1 × 1 convolutions and improve the coupling layers.
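Under my reading, a matrix exponential coupling layer replaces the element-wise scale of an affine coupling layer with a full matrix: y2 = expm(M(x1)) · x2 + t(x1). The layer is invertible for any output M because the matrix exponential always is, and log|det J| = trace(M(x1)). A hedged NumPy sketch (the helper `small_net` is a random stand-in for a learned network, not the paper's architecture):

```python
import numpy as np
from scipy.linalg import expm

def small_net(x1, d2, seed=0):
    # stand-in for a learned network producing a d2 x d2 matrix M and shift t
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(d2 * d2 + d2, x1.size))
    out = W @ x1
    return out[: d2 * d2].reshape(d2, d2), out[d2 * d2:]

def coupling_forward(x):
    x1, x2 = x[: x.size // 2], x[x.size // 2:]
    M, t = small_net(x1, x2.size)
    y2 = expm(M) @ x2 + t          # invertible for ANY matrix M
    logdet = np.trace(M)           # log|det expm(M)| = trace(M)
    return np.concatenate([x1, y2]), logdet

def coupling_inverse(y):
    y1, y2 = y[: y.size // 2], y[y.size // 2:]
    M, t = small_net(y1, y2.size)  # y1 == x1, so M, t are recomputed exactly
    x2 = expm(-M) @ (y2 - t)       # inverse uses expm(-M)
    return np.concatenate([y1, x2])
```

When M is constrained to be diagonal this reduces to the usual affine coupling layer, which is the sense in which it generalizes it.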

Methods

- The authors run several experiments to demonstrate the performance of the model.
- In Section 6.1, the authors compare the performance on density estimation with other generative flows models.
- In Section 6.2, the authors study the training stability of generative flows models.
- In Section 6.3, the authors compare three 1 × 1 convolutions.
- In Section 6.4, the authors analyze the computation of matrix exponential.
- In Section 6.5, the authors show samples from the trained models.
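The computation analyzed in Section 6.4 rests on the power series expm(M) = Σ_k M^k / k!, truncated after a finite number of terms; per my reading, the coefficient m reported in Table 6 relates to this truncation. A minimal truncated-series sketch, checked against SciPy's reference implementation:

```python
import numpy as np
from scipy.linalg import expm

def expm_series(M, m=10):
    """Approximate expm(M) with the first m+1 terms of sum_k M^k / k!."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, m + 1):
        term = term @ M / k        # builds M^k / k! incrementally
        result = result + term
    return result
```

The series converges quickly when ‖M‖ is small; for larger norms, practical implementations use scaling-and-squaring tricks of the kind surveyed by Moler & Van Loan (2003), cited below.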

Conclusion

- The authors propose a new type of generative flow, called matrix exponential flows, which utilizes the properties of the matrix exponential.
- The authors incorporate the matrix exponential into neural networks and combine it with generative flows.
- The authors propose matrix exponential coupling layers, which are a generalization of affine coupling layers.
- To solve the stability problem, the authors propose matrix exponential 1 × 1 convolutions and improve the coupling layers.
- The authors hope that more layers based on the matrix exponential can be proposed, or that it can be incorporated into other layers.
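The 1 × 1 convolution mentioned above applies one c × c matrix W across every spatial position; parameterizing W = expm(M) guarantees invertibility for any unconstrained M and gives log|det| = h · w · trace(M) without ever computing a determinant. A sketch of that idea (a plain NumPy stand-in, not the authors' implementation):

```python
import numpy as np
from scipy.linalg import expm

def conv1x1_expm(x, M):
    """x: (h, w, c) tensor; M: unconstrained (c, c) parameter matrix."""
    W = expm(M)                                   # always invertible
    y = np.einsum('hwc,dc->hwd', x, W)            # 1x1 conv = per-pixel matmul
    h, w, _ = x.shape
    logdet = h * w * np.trace(M)                  # log|det| of full Jacobian
    return y, logdet

def conv1x1_expm_inverse(y, M):
    return np.einsum('hwc,dc->hwd', y, expm(-M))  # inverse conv via expm(-M)
```

Compared with Glow's PLU parameterization, nothing here has to be kept away from singularity during training, which is one way to read the stability claim.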

Tables

- Table 1: The definition of several related generative flows and our generative flows. These flows all have tractable Jacobian determinants and inverses. h, w, c denote the height, width and number of channels. The symbols ⊙ and / denote element-wise multiplication and division. x, y may denote tensors with shape h × w × c.
- Table 2: Density estimation performance on the CIFAR-10, ImageNet 32×32 and ImageNet 64×64 datasets. Results are reported in bits/dim (negative log2-likelihood). In brackets are models that use variational dequantization (Ho et al., 2019).
- Table 3: Comparison of the number of parameters of Glow, Emerging, Flow++ and MEF.
- Table 4: Comparison of models with different coupling layers and learning rates. Performance is measured in bits per dimension. In brackets are the learning rates. Results are obtained by running 3 times with different random seeds; ± reports standard deviation.
- Table 5: Comparison of standard, PLU-decomposition and matrix exponential convolutions. Performance is measured in bits per dimension. Computation is measured in running time per epoch. Results are obtained by running 3 times with different random seeds; ± reports standard deviation.
- Table 6: Mean, standard deviation, maximum and minimum of the coefficient m.

Related work

- This work mainly builds upon the ideas proposed in (Dinh et al., 2016; Kingma & Dhariwal, 2018). Generative flow models can roughly be divided into two categories according to the Jacobian. One is models whose Jacobian is a triangular matrix, which are based on the coupling layers proposed in (Dinh et al., 2014; 2016) or the autoregressive flows proposed in (Kingma et al., 2016; Papamakarios et al., 2017). Ho et al. (2019), Hoogeboom et al. (2019) and Durkan et al. (2019) extended these models with more expressive invertible functions. The other is models with a free-form Jacobian. Behrmann et al. (2019) proposed invertible residual networks and utilized them for density estimation. Chen et al. (2019) further improved the model with an unbiased estimate of the log density. Grathwohl et al. (2018) proposed a continuous-time generative flow with unbiased density estimation.

Funding

- This work is supported by the National Natural Science Foundation of China (61672482) and Zhejiang Lab (No. 2019NB0AB03).

References

- Behrmann, J., Grathwohl, W., Chen, R. T., Duvenaud, D., and Jacobsen, J.-H. Invertible residual networks. In International Conference on Machine Learning, pp. 573– 582, 2019.
- Chen, R. T., Behrmann, J., Duvenaud, D. K., and Jacobsen, J.-H. Residual flows for invertible generative modeling. In Advances in Neural Information Processing Systems, pp. 9916–9926, 2019.
- Clevert, D.-A., Unterthiner, T., and Hochreiter, S. Fast and accurate deep network learning by exponential linear units (elus). arXiv preprint arXiv:1511.07289, 2015.
- Culver, W. J. On the existence and uniqueness of the real logarithm of a matrix. Proceedings of the American Mathematical Society, 17(5):1146–1151, 1966.
- Dinh, L., Krueger, D., and Bengio, Y. Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516, 2014.
- Dinh, L., Sohl-Dickstein, J., and Bengio, S. Density estimation using real nvp. arXiv preprint arXiv:1605.08803, 2016.
- Durkan, C., Bekasov, A., Murray, I., and Papamakarios, G. Neural spline flows. In Advances in Neural Information Processing Systems, pp. 7511–7522, 2019.
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
- Grathwohl, W., Chen, R. T., Bettencourt, J., Sutskever, I., and Duvenaud, D. Ffjord: Free-form continuous dynamics for scalable reversible generative models. In International Conference on Learning Representations, 2018.
- Hall, B. Lie groups, Lie algebras, and representations: an elementary introduction, volume 222, pp. 31–71. 2015.
- He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
- Ho, J., Chen, X., Srinivas, A., Duan, Y., and Abbeel, P. Flow++: Improving flow-based generative models with variational dequantization and architecture design. In International Conference on Machine Learning, pp. 2722– 2730, 2019.
- Hoogeboom, E., Van Den Berg, R., and Welling, M. Emerging convolutions for generative normalizing flows. In International Conference on Machine Learning, pp. 2771– 2780, 2019.
- Huang, C.-W., Krueger, D., Lacoste, A., and Courville, A. Neural autoregressive flows. In International Conference on Machine Learning, pp. 2083–2092, 2018.
- Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Kingma, D. P. and Dhariwal, P. Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pp. 10215–10224, 2018.
- Kingma, D. P. and Welling, M. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
- Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., and Welling, M. Improved variational inference with inverse autoregressive flow. In Advances in neural information processing systems, pp. 4743–4751, 2016.
- Krizhevsky, A., Hinton, G., et al. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
- Liou, M. A novel method of evaluating transient response. Proceedings of the IEEE, 54(1):20–23, 1966.
- Moler, C. and Van Loan, C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM review, 45(1):3–49, 2003.
- Papamakarios, G., Pavlakou, T., and Murray, I. Masked autoregressive flow for density estimation. In Advances in Neural Information Processing Systems, pp. 2338–2347, 2017.
- Rezende, D. and Mohamed, S. Variational inference with normalizing flows. In International Conference on Machine Learning, pp. 1530–1538, 2015.
- Rezende, D. J., Mohamed, S., and Wierstra, D. Stochastic backpropagation and approximate inference in deep generative models. In International Conference on Machine Learning, pp. 1278–1286, 2014.
- Van Oord, A., Kalchbrenner, N., and Kavukcuoglu, K. Pixel recurrent neural networks. In International Conference on Machine Learning, pp. 1747–1756, 2016.
- Ward, P. N., Smofsky, A., and Bose, A. J. Improving exploration in soft-actor-critic with normalizing flows policies. arXiv preprint arXiv:1906.02771, 2019.
