A Discrete Variational Recurrent Topic Model Without The Reparametrization Trick

Advances in Neural Information Processing Systems (NeurIPS 2020)


Abstract

We show how to learn a neural topic model with discrete random variables – one that explicitly models each word's assigned topic – using neural variational inference that does not rely on stochastic backpropagation to handle the discrete variables. The model we utilize combines the expressive power of neural methods for representing sequences…
Introduction
  • With the successes of deep learning models, neural variational inference (NVI) [27], often called variational autoencoders (VAEs), has emerged as an important tool for neural-based, probabilistic modeling [19, 40, 37].
  • Topic modeling, with its ability to effectively cluster words into thematically similar groups, has a rich history in NLP and semantics-oriented applications [4, 3, 50, 6, 46, 35, i.a.]
  • Topic models can struggle, however, to capture shorter-range dependencies among the words in a document [8].
  • Two common ways of learning an LDA model are Monte Carlo sampling techniques, which iteratively sample states for the latent random variables in the model, and variational/EM-based methods, which minimize a divergence between the posterior p(θ, z|w) and an approximation q(θ, z; γ, φ) to that posterior, controlled by learnable parameters γ and φ (the standard identity behind this is sketched after this list).
  • The authors focus on variational inference as it cleanly allows neural components to be used
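To make the variational objective in the bullets above concrete, the identity below restates the textbook relationship between the log-evidence, the evidence lower bound (ELBO), and the KL divergence being minimized; it is standard background, not a formula copied from the paper.

```latex
% Standard decomposition of the log-evidence for an LDA-style model with
% latent topic proportions \theta and per-word topic assignments z.
% Since \log p(w) does not depend on (\gamma, \phi), minimizing the KL term
% is equivalent to maximizing the ELBO.
\log p(w)
  = \underbrace{\mathbb{E}_{q(\theta, z;\, \gamma, \phi)}
      \left[ \log \frac{p(w, \theta, z)}{q(\theta, z;\, \gamma, \phi)} \right]}_{\mathrm{ELBO}(\gamma, \phi)}
  + \mathrm{KL}\!\left( q(\theta, z;\, \gamma, \phi) \,\middle\|\, p(\theta, z \mid w) \right)
```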
Highlights
  • With the successes of deep learning models, neural variational inference (NVI) [27], often called variational autoencoders (VAEs), has emerged as an important tool for neural-based, probabilistic modeling [19, 40, 37].
  • NVI is relatively straightforward when dealing with continuous random variables, but necessitates more complicated approaches for discrete random variables (one exact-enumeration alternative is sketched after this list)
  • Following Dieng et al. [8], we find that, for basic language modeling and core topic modeling, identifying which words are/are not stopwords is a sufficient indicator of thematic relevance
  • The model used in this paper is fundamentally an associative-based language model
  • While NVI does provide some degree of regularization, a significant component of the training criterion is still a cross-entropy loss
  • The text the model is trained on can influence the types of implicit biases that are transmitted to the learned syntactic component, the learned thematic component, and the tradeoff(s) between these two dynamics
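As a rough illustration of one way to handle a small discrete latent variable without reparametrization or score-function estimators, the sketch below computes the relevant expectation by enumerating all K states exactly. It assumes PyTorch, uses hypothetical names (q_logits, per_state_log_lik, prior_logits), and illustrates only the general enumeration idea, not the authors' full VRTM objective.

```python
import torch
import torch.nn.functional as F

def elbo_discrete(q_logits, per_state_log_lik, prior_logits):
    """ELBO contribution of a discrete latent z with K states, computed exactly.

    q_logits:          (batch, K) logits of the variational posterior q(z | x)
    per_state_log_lik: (batch, K) log p(x | z = k) for each state k
    prior_logits:      (K,)       logits of the prior p(z)

    Because K is small, the expectation over z is a plain sum over its states,
    so no sampling, reparametrization, or REINFORCE-style estimator is needed.
    """
    log_q = F.log_softmax(q_logits, dim=-1)
    q = log_q.exp()
    log_prior = F.log_softmax(prior_logits, dim=-1)

    expected_ll = (q * per_state_log_lik).sum(dim=-1)   # E_q[log p(x | z)]
    kl = (q * (log_q - log_prior)).sum(dim=-1)          # KL(q(z | x) || p(z))
    return expected_ll - kl                             # per-example lower bound
```

Gradients reach q_logits through the softmax weights themselves, so training proceeds with ordinary backpropagation.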
Methods
  • Table 1: (a) test perplexity for different RNN cells and (b) test perplexity as reported in previous works, on the APNEWS, IMDB and BNC corpora. T denotes the number of topics. Consistent with Wang et al. [49], the authors report the maximum of three VRTM runs.
  • Table 2: (a) SwitchP for VRTM vs. LDA VB [4] and (b) average document-level topic θ entropy, averaged across three runs, for the same corpus and the same number of topics; lower entropy indicates a more thematically consistent document (a sketch of how these quantities are computed follows this list).
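The two evaluation quantities above have simple generic definitions; the snippet below computes test perplexity from per-token log-probabilities and the entropy of a document's topic proportions. Function and variable names are illustrative, not taken from the authors' code.

```python
import numpy as np

def perplexity(token_log_probs):
    """Test perplexity from natural-log per-token probabilities: exp(mean NLL); lower is better."""
    return float(np.exp(-np.mean(token_log_probs)))

def topic_entropy(theta, eps=1e-12):
    """Entropy of a document's topic proportions theta (length K, sums to 1).

    Lower entropy means the probability mass concentrates on fewer topics.
    """
    theta = np.asarray(theta, dtype=float)
    return float(-(theta * np.log(theta + eps)).sum())
```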
Results
  • The authors test the performance of the algorithm on the publicly available APNEWS, IMDB and BNC datasets.3 Roughly, there are between 7.7k and 9.8k vocabulary words in each corpus, with between 15M and 20M training tokens each; Table A1 in the appendix details the statistics of these datasets.
  • APNEWS contains 54k newswire articles, IMDB contains 100k movie reviews, and BNC contains 17k assorted texts, such as journals, books excerpts, and newswire.
  • These are the same datasets, with the same train, validation and test splits, as used by prior work, where additional details can be found [48].
  • Following previous work [8], to avoid overfitting on the BNC dataset the authors grouped 10 documents in the training set into a new pseudo-document
Conclusion
  • The authors incorporated discrete variables into neural variational inference without analytically integrating them out or reparametrizing them and running stochastic backpropagation on them.
  • As a neural topic model, the approach maintains the discrete topic assignments, yielding a simple yet effective way to learn thematic vs. non-thematic word dynamics.
  • While NVI does provide some degree of regularization, a significant component of the training criterion is still a cross-entropy loss.
  • The paper does not examine adjusting this cross-entropy component.
  • Note that the thematic vs non-thematic aspect of this work provides a potential avenue for examining this.
  • While the authors treated l_t as a binary indicator, future work could involve a more nuanced, gradient view (a rough sketch of the binary gating appears after this list)
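As a rough, hypothetical rendering of the thematic vs. non-thematic split (in the spirit of TopicRNN-style models, not the authors' exact parameterization), the sketch below adds a topic contribution to the language model's word logits only for words flagged as thematic; all tensor names are assumptions.

```python
import torch

def combined_word_logits(rnn_logits, topic_logits, l_t):
    """Gate a topic contribution into per-token vocabulary logits.

    rnn_logits:   (batch, vocab) contextual logits from the recurrent language model
    topic_logits: (batch, vocab) logits contributed by the document's topic mixture
    l_t:          (batch,) float in {0, 1}; 1 marks a non-thematic (stop) word

    Thematic words (l_t = 0) receive the topic contribution; stopwords fall back
    on the RNN alone. A graded l_t in [0, 1] would interpolate smoothly instead.
    """
    gate = (1.0 - l_t).unsqueeze(-1)          # 0 for stopwords, 1 for thematic words
    return rnn_logits + gate * topic_logits   # additive, gated topic correction
```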
Tables
  • Table 1: Test set perplexity (lower is better) of VRTM demonstrates the effectiveness of our approach at learning a topic-based language model. In 1a we demonstrate the stability of VRTM using different recurrent cells. In 1b, we demonstrate that our VRTM-LSTM model outperforms prior neural topic models. We do not use pretrained word embeddings.
  • Table 2: We provide both SwitchP [24] results and entropy analysis of the model. These results support the idea that if topic models capture semantic dependencies, then they should capture the topics well, explain the topic assignment for each word, and provide an overall level of thematic consistency across the document (lower θ entropy).
  • Table 3: Nine random topics extracted from a 50-topic VRTM learned on the APNEWS corpus. See Table A2 in the Appendix for topics from IMDB and BNC.
  • Table 4: Seven randomly generated sentences from a VRTM model learned on the three corpora.
  • Table 5: A summary of the datasets used in our experiments. We use the same datasets and splits as in previous work [49].
  • Table 6: Nine random topics extracted from a 50-topic VRTM learned on the APNEWS, IMDB and BNC corpora.
Funding
  • This material is based in part upon work supported by the National Science Foundation under Grant No IIS-1940931
  • This material is also based on research that is in part supported by the Air Force Research Laboratory (AFRL), DARPA, for the KAIROS program under agreement number FA8750-19-2-1003
Study subjects and analysis
documents: 10
We use the publicly provided tokenization and, following past work, we lowercase all text and map infrequent words (those in the bottom 0.01% of frequency) to a special token. Following previous work [8], to avoid overfitting on the BNC dataset we grouped 10 documents in the training set into a new pseudo-document.
3 https://github.com/jhlau/topically-driven-language-model
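A minimal sketch of the preprocessing described above: lowercasing, mapping the rarest words to a special token, and grouping consecutive training documents into pseudo-documents. The cutoff interpretation, token names, and group size are assumptions for illustration, not the exact published pipeline.

```python
from collections import Counter

def preprocess(docs, rare_fraction=0.0001, unk_token="<unk>", group_size=10):
    """Lowercase, replace very rare words with `unk_token`, and build pseudo-documents."""
    tokenized = [doc.lower().split() for doc in docs]

    # One reading of "bottom 0.01% of frequency": words whose count falls below
    # rare_fraction of the total token count are mapped to the special token.
    counts = Counter(w for doc in tokenized for w in doc)
    cutoff = rare_fraction * sum(counts.values())
    rare = {w for w, c in counts.items() if c <= cutoff}
    tokenized = [[unk_token if w in rare else w for w in doc] for doc in tokenized]

    # Concatenate every `group_size` consecutive documents into one pseudo-document.
    return [sum(tokenized[i:i + group_size], [])
            for i in range(0, len(tokenized), group_size)]
```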

Reference
  • Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi, and Kiran Vodrahalli. A compressed sensing view of unsupervised text embeddings, bag-of-n-grams, and lstms. In International Conference on Learning Representations, 2018.
  • Kayhan Batmanghelich, Ardavan Saeedi, Karthik Narasimhan, and Sam Gershman. Nonparametric spherical topic modeling with word embeddings. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2016.
  • David M Blei and John D Lafferty. Dynamic topic models. In Proceedings of the 23rd international conference on Machine learning, pages 113–120. ACM, 2006.
  • David M Blei, Andrew Y Ng, and Michael I Jordan. Latent dirichlet allocation. Journal of machine Learning research, 3(Jan):993–1022, 2003.
  • Shammur Absar Chowdhury and Roberto Zamparelli. RNN simulations of grammaticality judgments on long-distance dependencies. In Proceedings of the 27th International Conference on Computational Linguistics, 2018.
  • Steven P Crain, Shuang-Hong Yang, Hongyuan Zha, and Yu Jiao. Dialect topic modeling for improved consumer medical search. In AMIA Annual Symposium Proceedings, volume 2010, page 132. American Medical Informatics Association, 2010.
  • Rajarshi Das, Manzil Zaheer, and Chris Dyer. Gaussian LDA for topic models with word embeddings. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2015.
  • Adji B. Dieng, Chong Wang, Jianfeng Gao, and John W. Paisley. Topicrnn: A recurrent neural network with long-range semantic dependency. In 5th International Conference on Learning Representations, ICLR, 2017.
  • Jacob Eisenstein, Amr Adel Hassan Ahmed, and Eric P. Xing. Sparse additive generative models of text. In ICML, 2011.
  • Francis Ferraro. Unsupervised Induction of Frame-Based Linguistic Forms. PhD thesis, Johns Hopkins University, 2017.
  • Olivier Ferret. How to thematically segment texts by using lexical cohesion? In 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2, 1998.
  • Mikhail Figurnov, Shakir Mohamed, and Andriy Mnih. Implicit reparameterization gradients. In Advances in Neural Information Processing Systems, pages 441–452, 2018.
  • Jun Gao, Di He, Xu Tan, Tao Qin, Liwei Wang, and Tieyan Liu. Representation degeneration problem in training natural language generation models. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=SkEYojRqtm.
  • Pankaj Gupta, Yatin Chaudhary, Florian Buettner, and Hinrich Schütze. Texttovec: Deep contextualized neural autoregressive topic models of language with distributed compositional prior. arXiv preprint arXiv:1810.03947, 2018.
  • Suchin Gururangan, Tam Dang, Dallas Card, and Noah A. Smith. Variational pretraining for semi-supervised text classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.
  • Eva Hajicová and Jirí Mírovský. Discourse coherence through the lens of an annotated text corpus: A case study. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. European Language Resources Association (ELRA). URL https://www.aclweb.org/anthology/L18-1259.
  • Geoffrey E Hinton and Ruslan R Salakhutdinov. Replicated softmax: an undirected topic model. In Advances in neural information processing systems, pages 1607–1614, 2009.
  • Weonyoung Joo, Wonsung Lee, Sungrae Park, and Il-Chul Moon. Dirichlet variational autoencoder, 2019. URL https://openreview.net/forum?id=rkgsvoA9K7.
  • Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  • Yair Lakretz, German Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, and Marco Baroni. The emergence of number and syntax units in LSTM language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, 2019. Association for Computational Linguistics. URL https://www.aclweb.org/anthology/N19-1002.
  • Hugo Larochelle and Stanislas Lauly. A neural autoregressive topic model. In Advances in Neural Information Processing Systems, pages 2708–2716, 2012.
  • Jey Han Lau, Timothy Baldwin, and Trevor Cohn. Topically driven neural language model. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 355–365, 2017.
  • Shuangyin Li, Yu Zhang, Rong Pan, Mingzhi Mao, and Yang Yang. Recurrent attentional topic model. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
  • Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Courtni Byun, Jordan L. BoydGraber, and Kevin D. Seppi. Automatic evaluation of local topic quality. In ACL, 2019.
  • Chandler May, Francis Ferraro, Alan McCree, Jonathan Wintrode, Daniel Garcia-Romero, and Benjamin Van Durme. Topic identification and discovery on text and speech. In EMNLP, 2015.
  • Paola Merlo and Suzanne Stevenson. Automatic verb classification based on statistical distributions of argument structure. Computational Linguistics, 27(3):373–408, 2001.
  • Yishu Miao, Lei Yu, and Phil Blunsom. Neural variational inference for text processing. In International conference on machine learning, pages 1727–1736, 2016.
  • Tomas Mikolov and Geoffrey Zweig. Context dependent recurrent neural network language model. In 2012 IEEE Spoken Language Technology Workshop (SLT), pages 234–239. IEEE, 2012.
  • Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, and Marc’Aurelio Ranzato. Learning longer memory in recurrent neural networks. arXiv preprint arXiv:1412.7753, 2014.
  • David Mimno, Hanna M Wallach, Edmund Talley, Miriam Leenders, and Andrew McCallum. Optimizing semantic coherence in topic models. In EMNLP, 2011.
  • Andriy Mnih and Karol Gregor. Neural variational inference and learning in belief networks. In Proceedings of the 31st International Conference on International Conference on Machine Learning-Volume 32, pages II–1791. JMLR. org, 2014.
  • Seyedahmad Mousavi, Mehdi Rezaee, Ramin Ayanzadeh, et al. A survey on compressive sensing: Classical results and recent advancements. arXiv: 1908.01014, 2019.
  • Christian A Naesseth, Francisco JR Ruiz, Scott W Linderman, and David M Blei. Reparameterization gradients through acceptance-rejection sampling algorithms. arXiv preprint arXiv:1610.05683, 2016.
  • Ramesh Nallapati, Igor Melnyk, Abhishek Kumar, and Bowen Zhou. Sengen: Sentence generating neural variational topic model. CoRR, abs/1708.00308, 2017. URL http://arxiv.org/abs/1708.00308.
  • Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N Nguyen, David Lo, and Chengnian Sun. Duplicate bug report detection with a combination of information retrieval and topic modeling. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering, pages 70–79. ACM, 2012.
  • Michael John Paul. Topic Modeling with Structured Priors for Text-Driven Science. PhD thesis, Johns Hopkins University, 2015.
  • Rajesh Ranganath, Sean Gerrish, and David Blei. Black box variational inference. In Artificial Intelligence and Statistics, pages 814–822, 2014.
  • Joseph Reisinger, Austin Waters, Bryan Silverthorn, and Raymond J Mooney. Spherical topic models. In Proceedings of the 27th international conference on machine learning (ICML-10), pages 903–910, 2010.
  • Mehdi Rezaee, Francis Ferraro, et al. Event representation with sequential, semi-supervised discrete variables. arXiv: 2010.04361, 2020.
  • Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. Stochastic backpropagation and approximate inference in deep generative models. In International Conference on Machine Learning, pages 1278–1286, 2014.
  • Pannawit Samatthiyadikun and Atsuhiro Takasu. Supervised deep polylingual topic modeling for scholarly information recommendations. In ICPRAM, pages 196–201, 2018.
  • Akash Srivastava and Charles A. Sutton. Autoencoding variational inference for topic models. In ICLR, 2017.
  • Nitish Srivastava, Ruslan Salakhutdinov, and Geoffrey Hinton. Modeling documents with a deep boltzmann machine. In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence, UAI’13, pages 616–624, Arlington, Virginia, United States, 2013. AUAI Press. URL http://dl.acm.org/citation.cfm?id=3023638.3023701.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pages 3104–3112, 2014.
  • Shawn Tan and Khe Chai Sim. Learning utterance-level normalisation using variational autoencoders for robust automatic speech recognition. In 2016 IEEE Spoken Language Technology Workshop (SLT), pages 43–49. IEEE, 2016.
  • Mirwaes Wahabzada, Anne-Katrin Mahlein, Christian Bauckhage, Ulrike Steiner, ErichChristian Oerke, and Kristian Kersting. Plant phenotyping using probabilistic topic models: uncovering the hyperspectral language of plants. Scientific reports, 6:22482, 2016.
  • Hanna M Wallach, David M Mimno, and Andrew McCallum. Rethinking lda: Why priors matter. In Advances in neural information processing systems, pages 1973–1981, 2009.
  • Wenlin Wang, Zhe Gan, Wenqi Wang, Dinghan Shen, Jiaji Huang, Wei Ping, Sanjeev Satheesh, and Lawrence Carin. Topic compositional neural language model. In AISTATS, 2017.
  • Wenlin Wang, Zhe Gan, Hongteng Xu, Ruiyi Zhang, Guoyin Wang, Dinghan Shen, Changyou Chen, and Lawrence Carin. Topic-guided variational auto-encoder for text generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 166–177, 2019.
  • Xiaogang Wang. Action recognition using topic models. In Visual Analysis of Humans, pages 311–332.
  • Tsung-Hsien Wen and Minh-Thang Luong. Latent topic conversational models. arXiv preprint arXiv:1809.07070, 2018.
  • Yinfei Yang, Forrest Bao, and Ani Nenkova. Detecting (un)important content for singledocument news summarization. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017.
  • Hao Zhang, Bo Chen, Dandan Guo, and Mingyuan Zhou. Whai: Weibull hybrid autoencoding inference for deep topic modeling. In ICLR, 2018.
Author
Mohammad Mehdi Rezaee Taghiabadi
Francis Ferraro