We propose a novel model that takes advantage of both explicit and implicit representations for short text classification
Combining Knowledge with Deep Convolutional Neural Networks for Short Text Classification.
IJCAI, pp. 2915–2921, 2017
Text classification is a fundamental task in NLP applications. Most existing work relies on either explicit or implicit text representation to address this problem. While these techniques work well for sentences, they cannot easily be applied to short text because of its shortness and sparsity. In this paper, we propose a framework based...
- Text classification is a crucial technology in many applications, such as web search, ads matching, and sentiment analysis.
- Such characteristics pose major challenges in short text classification.
- To overcome these challenges, researchers need to capture more semantic as well as syntactic information from short texts.
- A crucial step to reach this goal is to use more advanced text representation models.
- According to the different ways of leveraging external sources, previous text representation models can be divided into two categories: explicit representation and implicit representation [Wang and Wang, 2016].
- We propose a deep neural network that fuses the explicit and implicit representations of short texts
- By comparing our model with the WCCNN baseline, we can see that character-level information helps improve the model's performance
- We propose a novel model that takes advantage of both explicit and implicit representations for short text classification
- We utilize character-level information to enhance the embedding of short texts
- Experiments on real data show that our method achieves significant improvement over state-of-the-art methods for short text classification
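The fusion idea summarized in these bullets can be sketched as a toy two-branch convolution: one branch reads word embeddings, the other reads character embeddings, and the max-pooled features of both branches are concatenated before a softmax classifier. This is a minimal numpy sketch of the general word-plus-character CNN pattern, not the paper's actual architecture; all dimensions, filter counts, and random parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 8       # embedding size (toy value)
NUM_FILTERS = 4   # convolution filters per branch (toy value)
WINDOW = 3        # convolution window width (toy value)

def conv1d_maxpool(seq, filters):
    """Slide each filter over the sequence, apply tanh, and max-pool over time.
    seq: (length, EMB_DIM); filters: (NUM_FILTERS, WINDOW, EMB_DIM)."""
    length = seq.shape[0]
    feats = np.full(NUM_FILTERS, -np.inf)
    for i in range(length - WINDOW + 1):
        window = seq[i:i + WINDOW]                                  # (WINDOW, EMB_DIM)
        scores = np.tensordot(filters, window, axes=([1, 2], [0, 1]))  # (NUM_FILTERS,)
        feats = np.maximum(feats, np.tanh(scores))
    return feats

# Toy inputs: a 6-token short text and its 20-character spelling.
word_seq = rng.normal(size=(6, EMB_DIM))    # word-level embeddings
char_seq = rng.normal(size=(20, EMB_DIM))   # character-level embeddings

word_filters = rng.normal(size=(NUM_FILTERS, WINDOW, EMB_DIM))
char_filters = rng.normal(size=(NUM_FILTERS, WINDOW, EMB_DIM))

# Each branch yields a fixed-size vector; fusion here is concatenation,
# followed by a softmax classifier over the joint representation.
joint = np.concatenate([conv1d_maxpool(word_seq, word_filters),
                        conv1d_maxpool(char_seq, char_filters)])

W = rng.normal(size=(3, joint.size))        # 3 output classes (toy value)
logits = W @ joint
probs = np.exp(logits) / np.exp(logits).sum()
```

Concatenation is the simplest fusion choice; it lets the classifier weigh the two granularities independently, which is why the character branch can add information the word branch misses (e.g. for out-of-vocabulary words).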
- The authors compared the method with several state-of-the-art approaches: two feature-based methods and two deep learning based methods.
- Word-Concept Embedding + LR.
- This baseline uses the weighted word embedding as well as concept embedding to represent each short text.
- For the weighted word embedding V_w ∈ R^m, the authors use the tf-idf value of each word as the weight.
- Given the concept vector C, the concept embedding V_c ∈ R^m is the weighted average of the embeddings of its concepts.
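The two weighted averages can be illustrated with a small sketch. The embeddings, tf-idf weights, and concept scores below are made-up values, assuming the simplest reading of this baseline: weight each vector, sum, and normalize by the total weight, then concatenate the word and concept parts as input to the logistic regression classifier.

```python
import numpy as np

# Toy 2-d embeddings; in the paper these would be pre-trained word
# vectors, with concept vectors drawn from a knowledge base.
emb = {
    "apple": np.array([1.0, 0.0]),
    "pie":   np.array([0.0, 1.0]),
    "fruit": np.array([0.8, 0.2]),   # concept (illustrative)
    "food":  np.array([0.5, 0.5]),   # concept (illustrative)
}

def weighted_average(tokens, weights):
    """Weighted average of embeddings: V = sum_i w_i * v_i / sum_i w_i."""
    total = sum(weights)
    return sum(w * emb[t] for t, w in zip(tokens, weights)) / total

# tf-idf weights for the words and scores for the concepts
# (all numbers illustrative, not from the paper).
Vw = weighted_average(["apple", "pie"], [0.7, 0.3])
Vc = weighted_average(["fruit", "food"], [0.9, 0.1])

short_text_repr = np.concatenate([Vw, Vc])   # fed to the LR classifier
```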
- 4.1 Experiment Setup
- To show the effectiveness of the model, the authors conduct experiments on five widely used datasets: TREC, Twitter, AG news, Bing and Movie Review.
- This is a question-answering dataset.
- It involves 6 different types of questions, such as whether the question asks about a location, a person, or numeric information.
- This is a set of tweets with 3 kinds of sentiments: positive, neutral and negative.
- Discussion of Results
The results on all the datasets are shown in Table 3.
- Even without the character-level information, the WCCNN model still outperforms most state-of-the-art methods.
- The authors utilize character-level information to enhance the embedding of short texts.
- With such embedding as the input, the authors build a joint model on the basis of the CNN to perform classification.
- Experiments on real data show that the method achieves significant improvement over state-of-the-art methods for short text classification
- Table1: A Summary of Datasets
- Table2: Hyperparameters
- Table3: Accuracy of Composed Models on Different Datasets
- Table4: The number of unknown words in each embedding
- 2.1 Short Text Understanding
Short Text Understanding has become a hot topic in recent years. The most crucial step in understanding short text is conceptualization. Previous studies rely on either external knowledge bases [Song et al., 2011; Wang et al., 2015] or lexical information [Hua et al., 2015] to obtain the concepts associated with a short text. Another important task in understanding short text is evaluating the similarity between two short texts. This problem has been addressed with either explicit representation [Li et al., 2013] or implicit representation [Kenter and de Rijke, 2015].
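In its simplest form, the implicit-representation approach to short-text similarity mentioned above reduces to comparing averaged word embeddings with cosine similarity. A hedged sketch with toy two-dimensional embeddings (the vectors and vocabulary are made up for illustration):

```python
import numpy as np

# Toy 2-d word embeddings (illustrative values only).
emb = {"play":  np.array([0.9, 0.1]), "game":  np.array([0.8, 0.3]),
       "watch": np.array([0.7, 0.2]), "movie": np.array([0.1, 0.9])}

def text_vector(tokens):
    """Implicit representation: the mean of the tokens' word embeddings."""
    return np.mean([emb[t] for t in tokens], axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim = cosine(text_vector(["play", "game"]), text_vector(["watch", "movie"]))
```

Averaging loses word order, which is one reason the summary's CNN-based models, which see the token sequence, can outperform such bag-of-embeddings representations on short texts.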
2.2 Text Classification
Traditional text classification methods rely on human-designed features. The most widely used feature represents text as a vector of terms, namely "Bag-of-Words". Other studies mainly focus on generating more complex features, such as POS tags and tree kernels [Post and Bergsma, 2013]. Classifiers can then be built using machine learning algorithms such as Naive Bayes and Support Vector Machines [Wang and Manning, 2012]. For the short text classification task, previous studies focus on feature expansion [Shen et al., 2006] by leveraging context information from search engines. However, such methods suffer from a serious data sparsity problem.
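A minimal Bag-of-Words pipeline of the kind described here can be sketched as a multinomial Naive Bayes classifier with add-one smoothing. The tiny labeled corpus below is illustrative and unrelated to the paper's datasets; it mimics the TREC-style question-type labels only for flavor.

```python
from collections import Counter, defaultdict
import math

# Tiny labeled corpus (illustrative, not from the paper's datasets).
train = [("what city is the capital of france", "location"),
         ("where is mount everest", "location"),
         ("who wrote hamlet", "person"),
         ("who painted the mona lisa", "person")]

# Pool each class's tokens: the Bag-of-Words representation per class.
class_docs = defaultdict(list)
for text, label in train:
    class_docs[label].extend(text.split())

vocab = {w for text, _ in train for w in text.split()}

def predict(text):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    tokens = text.split()
    best, best_lp = None, -math.inf
    for label, words in class_docs.items():
        counts = Counter(words)
        # log prior + sum of smoothed log likelihoods
        lp = math.log(sum(1 for _, l in train if l == label) / len(train))
        for t in tokens:
            lp += math.log((counts[t] + 1) / (len(words) + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

print(predict("who directed the film"))   # -> "person"
```

The unseen words "directed" and "film" contribute nothing discriminative here; the prediction hinges entirely on "who". That is the data-sparsity problem in miniature: with only a handful of tokens per text, most words never overlap with the training vocabulary, motivating the feature-expansion and embedding-based approaches the paper builds on.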
- [Bengio et al., 2003] Yoshua Bengio, Rejean Ducharme, Pascal Vincent, and Christian Janvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.
- [Cheng et al., 2015] Jianpeng Cheng, Zhongyuan Wang, Ji-Rong Wen, Jun Yan, and Zheng Chen. Contextual text understanding in distributional semantic space. In CIKM, pages 133–142, 2015.
- [Collobert et al., 2011] Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel P. Kuksa. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493–2537, 2011.
- [Conneau et al., 2016] Alexis Conneau, Holger Schwenk, Loïc Barrault, and Yann LeCun. Very deep convolutional networks for natural language processing. CoRR, abs/1606.01781, 2016.
- [Duchi et al., 2011] John C. Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.
- [Hill et al., 2016] Felix Hill, KyungHyun Cho, Anna Korhonen, and Yoshua Bengio. Learning to understand phrases by embedding the dictionary. TACL, 4:17–30, 2016.
- [Hinton et al., 2012] Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580, 2012.
- [Hu et al., 2016] Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard H. Hovy, and Eric P. Xing. Harnessing deep neural networks with logic rules. In ACL, 2016.
- [Hua et al., 2015] Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou. Short text understanding through lexical-semantic analysis. In ICDE, pages 495–506, 2015.
- [Huang et al., 2012] Eric H. Huang, Richard Socher, Christopher D. Manning, and Andrew Y. Ng. Improving word representations via global context and multiple word prototypes. In ACL, pages 873–882, 2012.
- [Kalchbrenner et al., 2014] Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. A convolutional neural network for modelling sentences. In ACL, pages 655–665, 2014.
- [Kenter and de Rijke, 2015] Tom Kenter and Maarten de Rijke. Short text similarity with word embeddings. In CIKM, pages 1411–1420, 2015.
- [Kim et al., 2016] Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. Character-aware neural language models. In AAAI, pages 2741–2749, 2016.
- [Kim, 2014] Yoon Kim. Convolutional neural networks for sentence classification. In EMNLP, pages 1746–1751, 2014.
- [Le and Mikolov, 2014] Quoc V. Le and Tomas Mikolov. Distributed representations of sentences and documents. In ICML, pages 1188–1196, 2014.
- [Li et al., 2013] Pei-Pei Li, Haixun Wang, Kenny Q. Zhu, Zhongyuan Wang, and Xindong Wu. Computing term similarity by large probabilistic isa knowledge. In CIKM, pages 1401–1410, 2013.
- [Palangi et al., 2016] Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab K. Ward. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. IEEE/ACM Trans. Audio, Speech & Language Processing, 24(4):694–707, 2016.
- [Pang and Lee, 2005] Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In ACL, 2005.
- [Pennington et al., 2014] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. Glove: Global vectors for word representation. In EMNLP, pages 1532–1543, 2014.
- [Post and Bergsma, 2013] Matt Post and Shane Bergsma. Explicit and implicit syntactic features for text classification. In ACL, pages 866–872, 2013.
- [Shen et al., 2006] Dou Shen, Rong Pan, Jian-Tao Sun, Jeffrey Junfeng Pan, Kangheng Wu, Jie Yin, and Qiang Yang. Query enrichment for web-query classification. ACM Trans. Inf. Syst., 24(3):320–352, 2006.
- [Socher et al., 2011] Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In NIPS, pages 801–809, 2011.
- [Socher et al., 2013] Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP, pages 1631–1642, 2013.
- [Song et al., 2011] Yangqiu Song, Haixun Wang, Zhongyuan Wang, Hongsong Li, and Weizhu Chen. Short text conceptualization using a probabilistic knowledgebase. In IJCAI, pages 2330–2336, 2011.
- [Wang and Manning, 2012] Sida I. Wang and Christopher D. Manning. Baselines and bigrams: Simple, good sentiment and topic classification. In ACL, pages 90–94, 2012.
- [Wang and Wang, 2016] Zhongyuan Wang and Haixun Wang. Understanding short texts(tutorial). In ACL, 2016.
- [Wang et al., 2014] Fang Wang, Zhongyuan Wang, Zhoujun Li, and Ji-Rong Wen. Concept-based short text classification and ranking. In CIKM, pages 1069–1078, 2014.
- [Wang et al., 2015] Zhongyuan Wang, Kejun Zhao, Haixun Wang, Xiaofeng Meng, and Ji-Rong Wen. Query understanding through knowledge-based conceptualization. In IJCAI, pages 3264–3270, 2015.
- [Wu et al., 2012] Wentao Wu, Hongsong Li, Haixun Wang, and Kenny Qili Zhu. Probase: a probabilistic taxonomy for text understanding. In SIGMOD, pages 481–492, 2012.
- [Zhang et al., 2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level convolutional networks for text classification. In NIPS, pages 649–657, 2015.