Scalable Inference for Logistic-Normal Topic Models

NIPS, pp. 2445-2453, 2013.


Abstract:

Logistic-normal topic models can effectively discover correlation structures among latent topics. However, their inference remains a challenge because of the non-conjugacy between the logistic-normal prior and multinomial topic mixing proportions. Existing algorithms either make restricting mean-field assumptions or are not scalable to large…

Introduction
  • In Bayesian models, though conjugate priors normally result in easier inference problems, non-conjugate priors could be more expressive in capturing desired model properties.
  • One elegant extension of LDA is the logistic-normal topic model [3], which uses a logistic-normal prior to effectively capture correlation structures among topics; its generative process is sketched after this list.
  • Along this line, many subsequent extensions have been developed, including dynamic topic models [4], which handle time series via a dynamic linear system on the Gaussian variables, and infinite CTMs [11], which can resolve the number of topics from data.
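For orientation, here is a minimal sketch of the logistic-normal (correlated topic model) generative process in the standard notation of [3]; the symbols μ, Σ, η_d, θ_d, z_dn, w_dn, and β follow that convention and are not taken from this summary:
\[
\eta_d \sim \mathcal{N}(\mu, \Sigma), \qquad
\theta_{dk} = \frac{\exp(\eta_{dk})}{\sum_{j} \exp(\eta_{dj})}, \qquad
z_{dn} \sim \mathrm{Mult}(\theta_d), \qquad
w_{dn} \sim \mathrm{Mult}(\beta_{z_{dn}}).
\]
The full covariance Σ is what allows topics to be correlated; replacing the first two steps with θ_d ∼ Dir(α) recovers LDA, whose Dirichlet prior is conjugate to the multinomial.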
Highlights
  • In Bayesian models, though conjugate priors normally result in easier inference problems, non-conjugate priors could be more expressive in capturing desired model properties
  • For the most popular latent Dirichlet allocation (LDA) [5], a Dirichlet distribution is used as the conjugate prior for multinomial mixing proportions
  • Many subsequent extensions have been developed, including dynamic topic models [4] that deal with time series via a dynamic linear system on the Gaussian variables and infinite CTMs [11] that can resolve the number of topics from data
  • To address the limitations listed above, we develop a scalable Gibbs sampling algorithm for logistic-normal topic models, without making any restricting assumptions on the posterior distribution
  • We present a block-wise Gibbs sampling algorithm for logistic-normal topic models
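For contrast, LDA's Dirichlet prior is conjugate: a Dir(α) prior combined with multinomial topic counts n_d yields a Dir(α + n_d) posterior in closed form, whereas the logistic-normal prior admits no such closed form. The key device behind data augmentation schemes of this kind is the Pólya-Gamma identity of Polson, Scott, and Windle [13]; the generic symbols a, b, ψ, ω, and κ below are ours, not notation from the paper:
\[
\frac{\left(e^{\psi}\right)^{a}}{\left(1 + e^{\psi}\right)^{b}}
  = 2^{-b}\, e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2}\, p(\omega \mid b, 0)\, \mathrm{d}\omega,
  \qquad \kappa = a - \tfrac{b}{2},
\]
where p(ω | b, 0) is the density of a PG(b, 0) Pólya-Gamma variable. Conditioned on the auxiliary variable ω, a logistic (binomial) likelihood in ψ becomes Gaussian in ψ, restoring conjugacy with the Gaussian part of the logistic-normal prior.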
Methods
  • The authors present qualitative and quantitative evaluation to demonstrate the efficacy and scalability of the Gibbs sampler for CTM.
  • Experiments are conducted on a 40-node cluster, where each node is equipped with two 6-core CPUs (2.93GHz).
  • The authors will use M to denote the number of machines and P to denote the number of CPU cores.
  • The authors compare with the variational CTM (vCTM) [3] and the state-of-the-art LDA implementation, Yahoo! LDA.
  • For a fair comparison, the authors select T for both vCTM and gCTM (their Gibbs-sampling CTM) such that the models converge sufficiently, as discussed in Section 5.3 of the paper.
Conclusion
  • The authors present a scalable Gibbs sampling algorithm for logistic-normal topic models.
  • The authors' method builds on a novel data augmentation formulation and addresses the non-conjugacy without making strict mean-field assumptions.
  • The authors' method enjoys good scalability, suggesting the ability to extract large structures from massive data.
  • The authors are interested in developing scalable sampling algorithms for other logistic-normal topic models, e.g., infinite CTM and dynamic topic models.
  • The fast sampler of Pólya-Gamma distributions can be used in relational and supervised topic models [6, 21]
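As a concrete, illustrative companion to the Pólya-Gamma variables mentioned above, here is a minimal Python sketch of a naive PG(b, c) draw via the truncated sum-of-gammas representation from [13]. It is not the authors' fast sampler and not code from the paper; the function name and truncation level are arbitrary choices made for this sketch.

import numpy as np

def sample_polya_gamma(b, c, trunc=200, rng=None):
    """Naive draw from PG(b, c) by truncating its sum-of-gammas representation
    omega = (1 / (2*pi^2)) * sum_k g_k / ((k - 1/2)^2 + c^2 / (4*pi^2)),
    with g_k ~ Gamma(b, 1).  Truncation adds a small downward bias; practical
    implementations use the dedicated samplers described in [13]."""
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(1, trunc + 1)                    # summation indices 1..trunc
    g = rng.gamma(shape=b, scale=1.0, size=trunc)  # g_k ~ Gamma(b, 1)
    denom = (k - 0.5) ** 2 + (c ** 2) / (4.0 * np.pi ** 2)
    return float(g @ (1.0 / denom)) / (2.0 * np.pi ** 2)

# Sanity check against the known mean E[PG(b, c)] = (b / (2c)) * tanh(c / 2).
if __name__ == "__main__":
    b, c = 1.0, 2.0
    draws = [sample_polya_gamma(b, c) for _ in range(20000)]
    print(sum(draws) / len(draws), (b / (2 * c)) * np.tanh(c / 2))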
Tables
  • Table 1: Training time of vCTM and gCTM (M = 40) on various datasets
Funding
  • This work is supported by the National Basic Research Program (973 Program) of China (Nos. 2013CB329403, 2012CB316301), the National Natural Science Foundation of China (Nos. 61322308, 61305066), the Tsinghua University Initiative Scientific Research Program (No. 20121088071), and the Tsinghua National Laboratory for Information Science and Technology, China.
Reference
  • A. Ahmed, M. Aly, J. Gonzalez, S. Narayanamurthy, and A. Smola. Scalable inference in latent variable models. In International Conference on Web Search and Data Mining (WSDM), 2012.
  • K. Bache and M. Lichman. UCI machine learning repository, 2013.
  • D. Blei and J. Lafferty. Correlated topic models. In Advances in Neural Information Processing Systems (NIPS), 2006.
  • D. Blei and J. Lafferty. Dynamic topic models. In International Conference on Machine Learning (ICML), 2006.
  • D.M. Blei, A.Y. Ng, and M.I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
  • N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized relational topic models with data augmentation. In International Joint Conference on Artificial Intelligence (IJCAI), 2013.
  • M. Hoffman, D. Blei, and F. Bach. Online learning for latent Dirichlet allocation. In Advances in Neural Information Processing Systems (NIPS), 2010.
  • C. Holmes and L. Held. Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Analysis, 1(1):145–168, 2006.
  • D. Mimno, H. Wallach, and A. McCallum. Gibbs sampling for logistic normal topic models with graph-based priors. In NIPS Workshop on Analyzing Graphs, 2008.
  • D. Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed algorithms for topic models. Journal of Machine Learning Research, 10:1801–1828, 2009.
  • J. Paisley, C. Wang, and D. Blei. The discrete infinite logistic normal distribution for mixed-membership modeling. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2011.
  • N. G. Polson and J. G. Scott. Default Bayesian analysis for multi-way tables: a data-augmentation approach. arXiv:1109.4180, 2011.
  • N. G. Polson, J. G. Scott, and J. Windle. Bayesian inference for logistic models using Pólya-Gamma latent variables. arXiv:1205.0310v2, 2013.
  • C. P. Robert. Simulation of truncated normal variables. Statistics and Computing, 5:121–125, 1995.
  • W. Rudin. Principles of mathematical analysis. McGraw-Hill Book Co., 1964.
  • A. Smola and S. Narayanamurthy. An architecture for parallel topic models. In Very Large Data Bases (VLDB), 2010.
  • M. A. Tanner and W. H. Wong. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82(398):528–540, 1987.
  • D. van Dyk and X. Meng. The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1):1–50, 2001.
  • L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2009.
  • A. Zhang, J. Zhu, and B. Zhang. Sparse online topic models. In International Conference on World Wide Web (WWW), 2013.
  • J. Zhu, X. Zheng, and B. Zhang. Improved Bayesian supervised topic models with data augmentation. In Annual Meeting of the Association for Computational Linguistics (ACL), 2013.