Combining Knowledge with Deep Convolutional Neural Networks for Short Text Classification.

IJCAI, pp. 2915–2921, 2017

Abstract

Text classification is a fundamental task in NLP applications. Most existing work has relied on either explicit or implicit text representations to address this problem. While these techniques work well for sentences, they cannot easily be applied to short text because of its shortness and sparsity. In this paper, we propose a framework based on convolutional neural networks that combines explicit and implicit representations of short text for classification, further utilizing character-level information to enrich the short-text embedding.

Introduction
  • Text classification is a crucial technology in many applications, such as web search, ads matching, and sentiment analysis.
  • The shortness and sparsity of short texts pose major challenges in short text classification.
  • To overcome these challenges, researchers need to capture more semantic and syntactic information from short texts.
  • A crucial step toward this goal is to use more advanced text representation models.
  • According to how they leverage external sources, previous text representation models can be divided into two categories: explicit representation and implicit representation [Wang and Wang, 2016]; a toy sketch contrasting the two follows this list.
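
To make the distinction concrete, here is a minimal, hypothetical sketch (not from the paper): an explicit representation maps a short text to interpretable concepts drawn from a knowledge base such as Probase, while an implicit representation maps it to a dense vector, here a plain average of word embeddings. The CONCEPT_KB and word_vectors lookups below are toy stand-ins for a real taxonomy and real pretrained embeddings.

```python
import numpy as np

# Hypothetical toy knowledge base: word -> associated concepts.
# A real system would query a large taxonomy such as Probase.
CONCEPT_KB = {
    "python": ["programming language", "snake"],
    "pandas": ["software library", "animal"],
}

# Hypothetical pretrained word embeddings (dimension 4 for illustration).
word_vectors = {
    "python": np.array([0.1, 0.3, -0.2, 0.5]),
    "pandas": np.array([0.0, 0.4, -0.1, 0.6]),
}

def explicit_representation(text):
    """Explicit: a bag of interpretable concepts drawn from the KB."""
    concepts = []
    for token in text.lower().split():
        concepts.extend(CONCEPT_KB.get(token, []))
    return concepts

def implicit_representation(text):
    """Implicit: a dense vector, here the mean of word embeddings."""
    vecs = [word_vectors[t] for t in text.lower().split() if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(4)

print(explicit_representation("python pandas"))  # interpretable concept list
print(implicit_representation("python pandas"))  # opaque 4-dim vector
```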
Highlights
  • Text classification is a crucial technology in many applications, such as web search, ads matching, and sentiment analysis
  • We propose a deep neural network that fuses explicit and implicit representations of short texts
  • By comparing our model with the proposed WCCNN baseline, we can see that character-level information helps improve the performance of the model
  • We propose a novel model that takes advantage of both explicit and implicit representations for short text classification
  • We utilize character-level information to enhance the embedding of short text
  • Experiments on real data show that our method achieves significant improvement over state-of-the-art methods for short text classification
Methods
  • The authors compared their method with several state-of-the-art approaches: two feature-based methods and two deep-learning-based methods.
  • Word-Concept Embedding + LR.
  • This baseline uses a weighted word embedding together with a concept embedding to represent each short text.
  • For the weighted word embedding V_w ∈ R^m, the authors use the tf-idf value of each word as its weight.
  • Given the concept vector C, the concept embedding V_c ∈ R^m is the weighted average of the embeddings of its concepts.
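
The formula for the concept embedding is not reproduced here; a natural reading, by symmetry with the tf-idf-weighted word embedding, is a weight-normalized average, V_c = (Σ_i w_i · e(c_i)) / (Σ_i w_i), where e(c_i) is the embedding of concept c_i and w_i its weight in C — this is an assumption, not a quote from the paper. Below is a minimal sketch of the whole Word-Concept Embedding + LR baseline under that assumption, using scikit-learn for the tf-idf weights and the logistic-regression classifier; the `embeddings` lookup is a hypothetical pretrained table.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

EMB_DIM = 50
rng = np.random.default_rng(0)
embeddings = {}  # hypothetical pretrained lookup: token -> (EMB_DIM,) vector

def emb(token):
    # Fall back to a fixed random vector for unseen tokens (illustration only).
    if token not in embeddings:
        embeddings[token] = rng.normal(size=EMB_DIM)
    return embeddings[token]

def weighted_word_embedding(text, vectorizer):
    """V_w: tf-idf-weighted, weight-normalized average of word embeddings."""
    row = vectorizer.transform([text])
    vocab = vectorizer.get_feature_names_out()
    total, vec = 0.0, np.zeros(EMB_DIM)
    for idx in row.nonzero()[1]:
        w = row[0, idx]          # tf-idf value used as the word's weight
        vec += w * emb(vocab[idx])
        total += w
    return vec / total if total > 0 else vec

def concept_embedding(concepts):
    """V_c: assumed weight-normalized average of concept embeddings."""
    total, vec = 0.0, np.zeros(EMB_DIM)
    for concept, w in concepts:
        vec += w * emb(concept)
        total += w
    return vec / total if total > 0 else vec

# Toy training data: (short text, [(concept, weight), ...], label).
data = [
    ("apple releases new phone", [("company", 0.9)], 1),
    ("apple pie recipe", [("fruit", 0.8)], 0),
]
texts = [t for t, _, _ in data]
vectorizer = TfidfVectorizer().fit(texts)
X = np.stack([
    np.concatenate([weighted_word_embedding(t, vectorizer), concept_embedding(c)])
    for t, c, _ in data
])
y = [label for _, _, label in data]
clf = LogisticRegression().fit(X, y)  # the "+ LR" part of the baseline
```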
Results
  • 4.1 Experiment Setup. To show the effectiveness of the model, the authors conduct experiments on five widely used datasets: TREC, Twitter, AG news, Bing, and Movie Review.
  • TREC is a question-answering dataset involving six types of questions, for example whether a question asks about a location, a person, or numeric information.
  • Twitter is a set of tweets with three sentiment labels: positive, neutral, and negative.
Conclusion
  • Discussion of Results

    The results on all the datasets are shown in Table 3.
  • Even without the character-level information, the WCCNN model still outperforms most state-of-the-art methods.
  • The authors utilize character-level information to enhance the embedding of short text.
  • With this embedding as input, the authors build a joint model on top of a CNN to perform classification (a rough sketch follows this list).
  • Experiments on real data show that the method achieves significant improvement over state-of-the-art methods for short text classification
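
As a rough illustration of this joint model, the sketch below combines a word-level and a character-level CNN branch and concatenates their max-pooled features before classification. This is a minimal reconstruction in PyTorch, not the authors' exact architecture; vocabulary sizes, filter counts, and kernel sizes are placeholder assumptions.

```python
import torch
import torch.nn as nn

class JointWordCharCNN(nn.Module):
    """Minimal sketch: word-level and char-level CNN branches, concatenated."""
    def __init__(self, word_vocab=10000, char_vocab=70,
                 word_dim=100, char_dim=16, n_filters=64, n_classes=6):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # 1-D convolutions over the embedding sequences.
        self.word_conv = nn.Conv1d(word_dim, n_filters, kernel_size=3, padding=1)
        self.char_conv = nn.Conv1d(char_dim, n_filters, kernel_size=5, padding=2)
        self.fc = nn.Linear(2 * n_filters, n_classes)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, words), char_ids: (batch, chars)
        w = self.word_emb(word_ids).transpose(1, 2)   # (batch, word_dim, words)
        c = self.char_emb(char_ids).transpose(1, 2)   # (batch, char_dim, chars)
        w = torch.relu(self.word_conv(w)).max(dim=2).values  # global max pool
        c = torch.relu(self.char_conv(c)).max(dim=2).values
        return self.fc(torch.cat([w, c], dim=1))      # class logits

model = JointWordCharCNN()
logits = model(torch.randint(0, 10000, (2, 20)), torch.randint(0, 70, (2, 80)))
print(logits.shape)  # torch.Size([2, 6])
```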
Tables
  • Table1: A Summary of Datasets
  • Table2: Hyperparameters
  • Table3: Accuracy of Composed Models on Different Datasets
  • Table4: The number of unknown words in each embedding
Related work
  • 2.1 Short Text Understanding

    Short text understanding has become a hot topic in recent years. The most crucial step in understanding short text is conceptualization. Previous studies rely on either external knowledge bases [Song et al., 2011; Wang et al., 2015] or lexical information [Hua et al., 2015] to obtain the concepts associated with a short text. Another important task in understanding short text is evaluating the similarity between two short texts. This problem has been addressed with either explicit representation [Li et al., 2013] or implicit representation [Kenter and de Rijke, 2015].

    2.2 Text Classification

    Traditional text classification methods rely on human-designed features. The most widely used approach represents text as a vector of terms, namely "Bag-of-Words". Other studies focus on generating more complex features, such as POS tags and tree kernels [Post and Bergsma, 2013]. Classifiers can be built with machine learning algorithms such as Naive Bayes and Support Vector Machines [Wang and Manning, 2012]. For the short text classification task, previous studies focus on feature expansion [Shen et al., 2006], leveraging context information from search engines. However, such methods suffer seriously from data sparsity.
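
For concreteness, a Bag-of-Words baseline of the kind described above takes only a few lines; this minimal sketch uses scikit-learn's CountVectorizer with a multinomial Naive Bayes classifier, with toy data that is illustrative only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus: each text becomes a "Bag-of-Words" vector of term counts,
# which is then fed to a Naive Bayes classifier.
texts = ["great movie, loved it", "terrible plot and acting",
         "wonderful film", "awful, would not watch again"]
labels = [1, 0, 1, 0]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["loved the film"]))  # likely [1]
```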
References
  • [Bengio et al., 2003] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.
  • [Cheng et al., 2015] Jianpeng Cheng, Zhongyuan Wang, Ji-Rong Wen, Jun Yan, and Zheng Chen. Contextual text understanding in distributional semantic space. In CIKM, pages 133–142, 2015.
  • [Collobert et al., 2011] Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel P. Kuksa. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493–2537, 2011.
  • [Conneau et al., 2016] Alexis Conneau, Holger Schwenk, Loïc Barrault, and Yann LeCun. Very deep convolutional networks for natural language processing. CoRR, abs/1606.01781, 2016.
  • [Duchi et al., 2011] John C. Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159, 2011.
  • [Hill et al., 2016] Felix Hill, KyungHyun Cho, Anna Korhonen, and Yoshua Bengio. Learning to understand phrases by embedding the dictionary. TACL, 4:17–30, 2016.
  • [Hinton et al., 2012] Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580, 2012.
  • [Hu et al., 2016] Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard H. Hovy, and Eric P. Xing. Harnessing deep neural networks with logic rules. In ACL, 2016.
  • [Hua et al., 2015] Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou. Short text understanding through lexical-semantic analysis. In ICDE, pages 495–506, 2015.
  • [Huang et al., 2012] Eric H. Huang, Richard Socher, Christopher D. Manning, and Andrew Y. Ng. Improving word representations via global context and multiple word prototypes. In ACL, pages 873–882, 2012.
  • [Kalchbrenner et al., 2014] Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. A convolutional neural network for modelling sentences. In ACL, pages 655–665, 2014.
  • [Kenter and de Rijke, 2015] Tom Kenter and Maarten de Rijke. Short text similarity with word embeddings. In CIKM, pages 1411–1420, 2015.
  • [Kim et al., 2016] Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. Character-aware neural language models. In AAAI, pages 2741–2749, 2016.
  • [Kim, 2014] Yoon Kim. Convolutional neural networks for sentence classification. In EMNLP, pages 1746–1751, 2014.
  • [Le and Mikolov, 2014] Quoc V. Le and Tomas Mikolov. Distributed representations of sentences and documents. In ICML, pages 1188–1196, 2014.
  • [Li et al., 2013] Pei-Pei Li, Haixun Wang, Kenny Q. Zhu, Zhongyuan Wang, and Xindong Wu. Computing term similarity by large probabilistic isa knowledge. In CIKM, pages 1401–1410, 2013.
  • [Palangi et al., 2016] Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and Rabab K. Ward. Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval. IEEE/ACM Trans. Audio, Speech & Language Processing, 24(4):694–707, 2016.
  • [Pang and Lee, 2005] Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In ACL, 2005.
  • [Pennington et al., 2014] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In EMNLP, pages 1532–1543, 2014.
  • [Post and Bergsma, 2013] Matt Post and Shane Bergsma. Explicit and implicit syntactic features for text classification. In ACL, pages 866–872, 2013.
  • [Shen et al., 2006] Dou Shen, Rong Pan, Jian-Tao Sun, Jeffrey Junfeng Pan, Kangheng Wu, Jie Yin, and Qiang Yang. Query enrichment for web-query classification. ACM Trans. Inf. Syst., 24(3):320–352, 2006.
  • [Socher et al., 2011] Richard Socher, Eric H. Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In NIPS, pages 801–809, 2011.
  • [Socher et al., 2013] Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP, pages 1631–1642, 2013.
  • [Song et al., 2011] Yangqiu Song, Haixun Wang, Zhongyuan Wang, Hongsong Li, and Weizhu Chen. Short text conceptualization using a probabilistic knowledgebase. In IJCAI, pages 2330–2336, 2011.
  • [Wang and Manning, 2012] Sida I. Wang and Christopher D. Manning. Baselines and bigrams: Simple, good sentiment and topic classification. In ACL, pages 90–94, 2012.
  • [Wang and Wang, 2016] Zhongyuan Wang and Haixun Wang. Understanding short texts (tutorial). In ACL, 2016.
  • [Wang et al., 2014] Fang Wang, Zhongyuan Wang, Zhoujun Li, and Ji-Rong Wen. Concept-based short text classification and ranking. In CIKM, pages 1069–1078, 2014.
  • [Wang et al., 2015] Zhongyuan Wang, Kejun Zhao, Haixun Wang, Xiaofeng Meng, and Ji-Rong Wen. Query understanding through knowledge-based conceptualization. In IJCAI, pages 3264–3270, 2015.
  • [Wu et al., 2012] Wentao Wu, Hongsong Li, Haixun Wang, and Kenny Qili Zhu. Probase: a probabilistic taxonomy for text understanding. In SIGMOD, pages 481–492, 2012.
  • [Zhang et al., 2015] Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level convolutional networks for text classification. In NIPS, pages 649–657, 2015.