Associative Compression Networks for Representation Learning

arXiv: Neural and Evolutionary Computing, Volume abs/1804.02476, 2018.

We have introduced Associative Compression Networks, a new form of Variational Autoencoder in which associated codes are used to condition the latent prior

Abstract:

This paper introduces Associative Compression Networks (ACNs), a new framework for variational autoencoding with neural networks. The system differs from existing variational autoencoders (VAEs) in that the prior distribution used to model each code is conditioned on a similar code from the dataset. In compression terms this equates to se...
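
One way to picture "conditioned on a similar code from the dataset" is a nearest-neighbour lookup in code space. The sketch below is only an illustration under stated assumptions: Euclidean distance, the neighbourhood size `k`, and the helper names `k_nearest` and `pick_associated_code` are hypothetical choices, not details given in the abstract.

```python
import math
import random


def k_nearest(codes, index, k):
    """Indices of the k codes closest (Euclidean) to codes[index], excluding itself."""
    target = codes[index]
    others = [i for i in range(len(codes)) if i != index]
    others.sort(key=lambda i: math.dist(codes[i], target))
    return others[:k]


def pick_associated_code(codes, index, k, rng):
    """Pick one of the k nearest codes at random to condition the prior on."""
    return codes[rng.choice(k_nearest(codes, index, k))]


# Toy codes (made-up numbers, purely for illustration).
codes = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.0, 2.0]]
associated = pick_associated_code(codes, 0, 2, random.Random(0))
```

Here `associated` is one of the two codes nearest to `codes[0]`; in a full model it would be passed to the prior network to produce p(z|c).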

Introduction
  • Unsupervised learning—the discovery of structure in data without extrinsic reward or supervision signals—is likely to be critical to the development of artificial intelligence, as it enables algorithms to exploit the vast amounts of data for which such signals are partially or completely lacking.
  • Authors have proposed various modifications to correct this shortcoming (the tendency of VAEs with powerful decoders to ignore the latent codes), such as reweighting the coding cost (Higgins et al, 2017) or removing it entirely from the loss function (van den Oord et al, 2017), weakening the decoder by e.g. limiting its range of context (Chen et al, 2016b; Bowman et al, 2015) or adding auxiliary objectives that reward more informative codes, for example by maximising the mutual information between the prior distribution and generated samples (Zhao et al, 2017), a tactic that has been fruitfully applied to Generative Adversarial Networks (Chen et al, 2016a).
  • These approaches have had considerable success at discovering useful and interesting latent representations.
  • In a standard VAE the prior p(z) is constant for all x.
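
To make the "coding cost" mentioned above concrete, the objective that these modifications alter can be written as follows. This is a sketch in assumed notation: q(z|x) is the encoding distribution and p(z) the prior, while r(x|z) for the decoder distribution and the weight β are symbols introduced here for illustration.

```latex
\mathcal{L}_{\beta}(x) =
  \underbrace{\mathbb{E}_{z \sim q(z|x)}\left[-\log r(x|z)\right]}_{\text{reconstruction cost}}
  + \beta \, \underbrace{\mathrm{KL}\left(q(z|x) \,\|\, p(z)\right)}_{\text{coding cost}}
```

With β = 1 this is the usual VAE loss. Reweighting the coding cost corresponds to choosing β ≠ 1, and dropping the term altogether corresponds formally to β = 0.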
Highlights
  • Unsupervised learning—the discovery of structure in data without extrinsic reward or supervision signals—is likely to be critical to the development of artificial intelligence, as it enables algorithms to exploit the vast amounts of data for which such signals are partially or completely lacking
  • Variational Autoencoders (VAEs) (Kingma & Welling, 2013; Rezende et al, 2014) are a family of generative models consisting of two neural networks, an encoder and a decoder, trained in tandem
  • The Associative Compression Networks encoder was a convolutional network fashioned after a VGG-style classifier (Simonyan & Zisserman, 2014), and the encoding distribution q(z|x) was a unit variance Gaussian with mean specified by the output of the encoder network
  • We have introduced Associative Compression Networks (ACNs), a new form of Variational Autoencoder in which associated codes are used to condition the latent prior
  • Our experiments show that the latent representations learned by Associative Compression Networks contain meaningful, high-level information that is not diminished by the use of autoregressive decoders
  • We hope this work will open the door to more holistic, dataset-wide approaches to generative modelling and representation learning
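
The unit-variance Gaussian encoding distribution q(z|x) described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: `encoder_mean` is a hypothetical stand-in for the output of the encoder network, and the standard-normal prior in `kl_to_standard_normal` is an assumption chosen only because it gives a closed form. It is not the ACN prior.

```python
import math
import random


def sample_code(encoder_mean, rng):
    """Draw z ~ N(encoder_mean, I): add unit-variance noise to each element."""
    return [m + rng.gauss(0.0, 1.0) for m in encoder_mean]


def log_q(z, encoder_mean):
    """Log-density of z under the unit-variance Gaussian q(z|x)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - 0.5 * (z_d - m_d) ** 2
        for z_d, m_d in zip(z, encoder_mean)
    )


def kl_to_standard_normal(encoder_mean):
    """KL(N(mean, I) || N(0, I)) in nats; equals 0.5 * ||mean||^2."""
    return 0.5 * sum(m * m for m in encoder_mean)


encoder_mean = [1.0, 2.0]  # made-up encoder output for a single input
z = sample_code(encoder_mean, random.Random(0))
```

Because the variance is fixed, the only quantity the encoder has to emit is the mean, and the coding cost against a fixed standard-normal prior grows with the squared norm of that mean.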
Results
  • The authors present experimental results on four image datasets: binarized MNIST (Salakhutdinov & Murray, 2008), CIFAR10 (Krizhevsky, 2009), ImageNet (Deng et al, 2009) and CelebA (Liu et al, 2015).
  • The ACN prior distribution p(z|c) was parameterised using the outputs of the prior network as follows: $p(z|c) = \prod_{d=1}^{D} \sum_{m=1}^{M} \pi_d^m \, \mathcal{N}(z_d \mid \mu_d^m, \sigma_d^m)$, where $D$ is the dimensionality of $z$, $z_d$ is the $d$th element of $z$, there are $M$ mixture components for each dimension, and all parameters $\pi_d^m$, $\mu_d^m$, $\sigma_d^m$ are emitted by the prior network, with the softmax function used to normalise $\pi_d^m$ and the softplus function used to ensure $\sigma_d^m > 0$.
  • For the unconditional prior p(z) the authors always fit a Gaussian mixture model using Expectation-Maximization, with the number of components optimised on the validation set.
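
The mixture prior described above can be evaluated as in the sketch below. It assumes the prior factorises over the D dimensions of z, with an M-component Gaussian mixture per dimension, and that σ denotes a standard deviation. The argument names `pi_logits`, `mu` and `sigma_raw` are hypothetical stand-ins for the raw prior-network outputs (each a list of D lists of M numbers).

```python
import math


def log_softmax(logits):
    """Log of the softmax-normalised mixture weights, computed stably."""
    top = max(logits)
    log_total = top + math.log(sum(math.exp(v - top) for v in logits))
    return [v - log_total for v in logits]


def softplus(x):
    """Stable log(1 + exp(x)); maps any real number to a positive value."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_normal(x, mean, std):
    """Log-density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def log_mixture_prior(z, pi_logits, mu, sigma_raw):
    """log p(z|c): sum over dimensions of the log of a Gaussian mixture."""
    total = 0.0
    for d, z_d in enumerate(z):
        log_pi = log_softmax(pi_logits[d])
        terms = [
            log_pi[m] + log_normal(z_d, mu[d][m], softplus(sigma_raw[d][m]))
            for m in range(len(log_pi))
        ]
        top = max(terms)  # log-sum-exp over the M components
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total
```

Working in log space avoids underflow when D is large, since the product over dimensions becomes a sum of per-dimension log-mixture terms.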
Conclusion
  • The authors have introduced Associative Compression Networks (ACNs), a new form of Variational Autoencoder in which associated codes are used to condition the latent prior.
  • The authors' experiments show that the latent representations learned by ACNs contain meaningful, high-level information that is not diminished by the use of autoregressive decoders.
  • As well as providing a clear conditioning signal for the samples, these representations can be used to cluster and linearly classify the data, suggesting that they will be useful for other cognitive tasks.
  • The authors have seen that the joint latent and data space learned by the model can be naturally traversed by daydream sampling.
  • The authors hope this work will open the door to more holistic, dataset-wide approaches to generative modelling and representation learning.
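
Daydream sampling, as understood here, alternates between encoding a sample, drawing a latent from the prior conditioned on that code, and decoding a new sample. The loop below is a toy sketch under that reading: `encode`, `sample_prior` and `decode` are hypothetical callables standing in for the trained networks, and the toy lambdas are deterministic arithmetic rather than a model.

```python
import random


def daydream(x0, steps, encode, sample_prior, decode, rng):
    """Return the chain [x0, x1, ..., x_steps] of successive samples."""
    chain = [x0]
    x = x0
    for _ in range(steps):
        c = encode(x)             # code of the current sample
        z = sample_prior(c, rng)  # latent drawn from the prior conditioned on c
        x = decode(z, rng)        # new sample generated from z
        chain.append(x)
    return chain


# Toy stand-ins (assumptions for illustration only).
toy_chain = daydream(
    1.0,
    3,
    encode=lambda x: 0.5 * x,
    sample_prior=lambda c, rng: c + 1.0,
    decode=lambda z, rng: 2.0 * z,
    rng=random.Random(0),
)
```

Each element of the chain is generated from a latent conditioned on the code of the previous element, which is what lets the walk move through the joint latent and data space.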
Tables
  • Table 1: Binarized MNIST test set compression results
  • Table 2: Binarized MNIST test set ACN costs
  • Table 3: Binarized MNIST linear classification results
  • Table 4: CIFAR-10 test set compression results
  • Table 5: CIFAR-10 test set ACN costs
  • Table 6: ImageNet 32x32 test set compression results
  • Table 7: ImageNet test set ACN costs
Reference
  • Bachman, Philip. An architecture for deep, hierarchical generative models. In Advances in Neural Information Processing Systems, pp. 4826–4834, 2016.
  • Bahdanau, Dzmitry, Cho, Kyunghyun, and Bengio, Yoshua. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
  • Bowman, Samuel R, Vilnis, Luke, Vinyals, Oriol, Dai, Andrew M, Jozefowicz, Rafal, and Bengio, Samy. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349, 2015.
  • Chen, Xi, Duan, Yan, Houthooft, Rein, Schulman, John, Sutskever, Ilya, and Abbeel, Pieter. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 2172–2180, 2016a.
  • Chen, Xi, Kingma, Diederik P, Salimans, Tim, Duan, Yan, Dhariwal, Prafulla, Schulman, John, Sutskever, Ilya, and Abbeel, Pieter. Variational lossy autoencoder. arXiv preprint arXiv:1611.02731, 2016b.
  • Chen, Xi, Mishra, Nikhil, Rohaninejad, Mostafa, and Abbeel, Pieter. Pixelsnail: An improved autoregressive generative model. arXiv preprint arXiv:1712.09763, 2017.
  • Deng, Jia, Dong, Wei, Socher, Richard, Li, Li-Jia, Li, Kai, and Fei-Fei, Li. Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 248–255. IEEE, 2009.
  • Doersch, Carl, Gupta, Abhinav, and Efros, Alexei A. Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1422–1430, 2015.
  • Donahue, Jeff, Krahenbuhl, Philipp, and Darrell, Trevor. Adversarial feature learning. arXiv preprint arXiv:1605.09782, 2016.
  • Graves, Alex. Practical variational inference for neural networks. In Advances in Neural Information Processing Systems, pp. 2348–2356, 2011.
  • Graves, Alex, Wayne, Greg, and Danihelka, Ivo. Neural turing machines. arXiv preprint arXiv:1410.5401, 2014.
  • Gregor, Karol, Danihelka, Ivo, Graves, Alex, Rezende, Danilo Jimenez, and Wierstra, Daan. Draw: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623, 2015.
  • Gregor, Karol, Besse, Frederic, Rezende, Danilo Jimenez, Danihelka, Ivo, and Wierstra, Daan. Towards conceptual compression. In Advances In Neural Information Processing Systems, pp. 3549–3557, 2016.
  • Gulrajani, Ishaan, Kumar, Kundan, Ahmed, Faruk, Taiga, Adrien Ali, Visin, Francesco, Vazquez, David, and Courville, Aaron. Pixelvae: A latent variable model for natural images. arXiv preprint arXiv:1611.05013, 2016.
  • Higgins, Irina, Matthey, Loic, Pal, Arka, Burgess, Christopher, Glorot, Xavier, Botvinick, Matthew, Mohamed, Shakir, and Lerchner, Alexander. β-vae: Learning basic visual concepts with a constrained variational framework. ICLR, 2017.
  • Hinton, Geoffrey E and Van Camp, Drew. Keeping neural networks simple by minimizing the description length of the weights. In Proceedings of the sixth annual conference on Computational learning theory, pp. 5–13. ACM, 1993.
  • Kingma, Diederik P and Welling, Max. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  • Krizhevsky, Alex. Learning multiple layers of features from tiny images. 2009.
  • Liu, Ziwei, Luo, Ping, Wang, Xiaogang, and Tang, Xiaoou. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, pp. 3730–3738, 2015.
  • Mnih, Andriy and Rezende, Danilo. Variational inference for monte carlo objectives. In International Conference on Machine Learning, pp. 2188–2196, 2016.
  • Oord, Aaron van den, Kalchbrenner, Nal, and Kavukcuoglu, Koray. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, 2016a.
  • Oord, Aaron van den, Kalchbrenner, Nal, Vinyals, Oriol, Espeholt, Lasse, Graves, Alex, and Kavukcuoglu, Koray. Conditional image generation with pixelcnn decoders. In Proceedings of the 30th International Conference on Neural Information Processing Systems, pp. 4797–4805. Curran Associates Inc., 2016b.
  • Polyak, Boris T and Juditsky, Anatoli B. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838–855, 1992.
  • Rezende, Danilo Jimenez, Mohamed, Shakir, and Wierstra, Daan. Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082, 2014.
  • Rolfe, Jason Tyler. Discrete variational autoencoders. arXiv preprint arXiv:1609.02200, 2016.
  • Salakhutdinov, Ruslan and Murray, Iain. On the quantitative analysis of deep belief networks. In Proceedings of the 25th international conference on Machine learning, pp. 872–879. ACM, 2008.
  • Salimans, Tim, Karpathy, Andrej, Chen, Xi, and Kingma, Diederik P. Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications. arXiv preprint arXiv:1701.05517, 2017.
  • Simonyan, Karen and Zisserman, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • Smilkov, Daniel, Thorat, Nikhil, Nicholson, Charles, Reif, Emily, Viegas, Fernanda B, and Wattenberg, Martin. Embedding projector: Interactive visualization and interpretation of embeddings. arXiv preprint arXiv:1611.05469, 2016.
  • Tieleman, Tijmen and Hinton, Geoffrey. Lecture 6.5 - rmsprop, coursera: Neural networks for machine learning. University of Toronto, Technical Report, 2012.
  • Tomczak, Jakub M and Welling, Max. Vae with a vampprior. arXiv preprint arXiv:1705.07120, 2017.
  • van den Oord, Aaron, Vinyals, Oriol, et al. Neural discrete representation learning. In Advances in Neural Information Processing Systems, pp. 6309–6318, 2017.
  • Veness, Joel, Lattimore, Tor, Bhoopchand, Avishkar, Grabska-Barwinska, Agnieszka, Mattern, Christopher, and Toth, Peter. Online learning with gated linear networks. arXiv preprint arXiv:1712.01897, 2017.
  • Vinyals, Oriol, Blundell, Charles, Lillicrap, Tim, Wierstra, Daan, et al. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pp. 3630–3638, 2016.
  • Wang, Xiaolong and Gupta, Abhinav. Unsupervised learning of visual representations using videos. arXiv preprint arXiv:1505.00687, 2015.
  • Zhao, Shengjia, Song, Jiaming, and Ermon, Stefano. Infovae: Information maximizing variational autoencoders. CoRR, abs/1706.02262, 2017.