Attention-based LSTM for Aspect-level Sentiment Classification

EMNLP, pp. 606-615, 2016.

TL;DR: We propose attention-based LSTMs for aspect-level sentiment classification.

Abstract:

Aspect-level sentiment classification is a fine-grained task in sentiment analysis. Since it provides more complete and in-depth results, aspect-level sentiment analysis has received much attention in recent years. In this paper, we reveal that the sentiment polarity of a sentence is not only determined by the content but is also highly related to the concerned aspect.
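For reference, the attention mechanism the abstract alludes to can be written compactly. This paraphrases the paper's formulation: H is the matrix of LSTM hidden states for an N-word sentence, v_a the aspect embedding, e_N a vector of N ones, h_N the last hidden state, and W_h, W_v, w, W_p, W_x learned parameters.

```latex
M = \tanh\left( \begin{bmatrix} W_h H \\ W_v v_a \otimes e_N \end{bmatrix} \right), \quad
\alpha = \mathrm{softmax}\!\left(w^{\top} M\right), \quad
r = H \alpha^{\top}, \quad
h^{*} = \tanh\!\left(W_p r + W_x h_N\right)
```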

Introduction
  • Sentiment analysis (Nasukawa and Yi, 2003), also known as opinion mining (Liu, 2012), is a key NLP task that has received much attention in recent years.
  • Target-dependent sentiment classification can benefit from taking target information into account, as in Target-Dependent LSTM (TD-LSTM) and Target-Connection LSTM (TC-LSTM) (Tang et al., 2015a).
  • Those models can take the target into consideration, but not aspect information, which has proved crucial for aspect-level classification
Highlights
  • Sentiment analysis (Nasukawa and Yi, 2003), also known as opinion mining (Liu, 2012), is a key NLP task that has received much attention in recent years
  • We deal with aspect-level sentiment classification and find that the sentiment polarity of a sentence is highly dependent on both content and aspect
  • Target-dependent sentiment classification can benefit from taking target information into account, as in Target-Dependent LSTM (TD-LSTM) and Target-Connection LSTM (TC-LSTM) (Tang et al., 2015a)
  • Our proposed models can concentrate on different parts of a sentence when different aspects are given, making them more competitive for aspect-level classification (a sketch follows this list)
  • An interesting direction for future work would be to model more than one aspect simultaneously with the attention mechanism
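To make the attention step concrete, here is a minimal NumPy sketch of the aspect-conditioned attention, matching the equations shown after the abstract. The parameters W_h, W_v, and w are learned in the real model; the random values below are purely illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def aspect_attention(H, v_a, W_h, W_v, w):
    """Aspect-conditioned attention over LSTM hidden states.
    H: (d, N) hidden states; v_a: (d_a,) aspect embedding;
    W_h: (d, d), W_v: (d_a, d_a), w: (d + d_a,) learned parameters."""
    N = H.shape[1]
    V = np.tile((W_v @ v_a)[:, None], (1, N))  # repeat the aspect at every position
    M = np.tanh(np.vstack([W_h @ H, V]))       # (d + d_a, N)
    alpha = softmax(w @ M)                     # one weight per word
    r = H @ alpha                              # aspect-specific sentence vector
    return alpha, r

# Toy usage: the same sentence gets different weights under different aspects.
rng = np.random.default_rng(0)
d, d_a, N = 8, 8, 5
H = rng.normal(size=(d, N))
W_h = rng.normal(size=(d, d))
W_v = rng.normal(size=(d_a, d_a))
w = rng.normal(size=d + d_a)
for name in ("food", "service"):
    alpha, r = aspect_attention(H, rng.normal(size=d_a), W_h, W_v, w)
    print(name, np.round(alpha, 2))
```

Because the aspect embedding enters the scoring of every position, the same sentence yields different attention distributions for different aspects, which is exactly the behavior the highlights describe.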
Methods
  • Experimental results are shown in Table 3 and Table 4.
  • Similar to the experiment on aspect-level classification, the models achieve state-of-the-art performance.
  • Comparison with baseline methods (Section 4.3): the authors compare the model with several baselines, including LSTM, TD-LSTM, and TC-LSTM.
  • LSTM: a standard LSTM cannot capture any aspect information in a sentence, so it must produce the same prediction for different aspects of the same sentence (e.g., for the aspects service and food).
  • AE-LSTM and ATAE-LSTM are the proposed attention-based variants; a sketch of how ATAE-LSTM constructs its inputs follows this list. Pos./Neg. indicates binary prediction in which all neutral instances are ignored.
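The difference between the two variants can be stated in a few lines. Below is a minimal sketch of the ATAE-LSTM input construction; the embedding dimensions are illustrative, not the paper's exact settings.

```python
import numpy as np

def atae_inputs(word_embs, aspect_emb):
    """ATAE-LSTM input construction: append the aspect embedding to every
    word embedding, so the LSTM sees the aspect at each time step.
    word_embs: (N, d_w); aspect_emb: (d_a,); returns (N, d_w + d_a)."""
    N = word_embs.shape[0]
    return np.hstack([word_embs, np.tile(aspect_emb, (N, 1))])

# AE-LSTM feeds word_embs unchanged and uses the aspect embedding only
# inside the attention layer; ATAE-LSTM additionally augments the inputs.
sentence = np.random.rand(5, 300)  # e.g., five 300-d GloVe word vectors
aspect = np.random.rand(100)       # illustrative aspect embedding size
print(atae_inputs(sentence, aspect).shape)  # (5, 400)
```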
Results
  • Experimental results indicate that the approach improves performance over several baselines, and further examples demonstrate that the attention mechanism works well for aspect-level sentiment classification.
  • As in the experiment on aspect-level classification, the models achieve state-of-the-art performance
Conclusion
  • The authors have proposed attention-based LSTMs for aspect-level sentiment classification.
  • The proposed models can concentrate on different parts of a sentence when different aspects are given, making them more competitive for aspect-level classification.
  • Although the proposals have shown potential for aspect-level sentiment analysis, different aspects are input separately.
  • An interesting direction for future work would be to model more than one aspect simultaneously with the attention mechanism
Tables
  • Table 1: Aspect distribution per sentiment class. {Fo., Pr., Se., Am., An.} refer to {food, price, service, ambience, anecdotes/miscellaneous}. "Asp." refers to aspect
  • Table 2: Accuracy on aspect-level polarity classification (restaurants). Three-way stands for 3-class prediction. Pos./Neg. indicates binary prediction in which all neutral instances are ignored. Best scores are in bold
  • Table 3: Accuracy on aspect-term polarity classification (restaurants). Three-way stands for 3-class prediction. Pos./Neg. indicates binary prediction in which all neutral instances are ignored. Best scores are in bold
  • Table 4: Accuracy on aspect-term polarity classification (laptops). Three-way stands for 3-class prediction. Pos./Neg. indicates binary prediction in which all neutral instances are ignored. Best scores are in bold
Related work
  • In this section, we briefly review related work on aspect-level sentiment classification and on neural networks for sentiment classification.

    2.1 Sentiment Classification at Aspect-level

    Aspect-level sentiment classification is typically cast as a classification problem in the literature. As mentioned before, it is a fine-grained classification task. The majority of current approaches attempt to detect the polarity of the entire sentence, regardless of the entities or aspects mentioned. Traditional approaches solve this problem by manually designing a set of features. With the abundance of sentiment lexicons (Rao and Ravichandran, 2009; Perez-Rosas et al., 2012; Kaji and Kitsuregawa, 2007), lexicon-based features have been built for sentiment analysis (Mohammad et al., 2013). Most of these studies focus on building sentiment classifiers with features such as bag-of-words and sentiment lexicons, using SVMs (Mullen and Collier, 2004); a minimal sketch of such a pipeline follows this paragraph. However, the results depend heavily on the quality of the features, and feature engineering is labor intensive.
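As a point of comparison with those feature-engineered systems, here is a minimal scikit-learn sketch of such a pipeline. The toy data, bag-of-words features, and LinearSVC classifier are a simplified stand-in, not the exact feature sets of the cited systems.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data; the cited systems add sentiment-lexicon counts, POS features, etc.
texts = ["the food was great", "terrible service", "lovely ambience",
         "awful food", "friendly staff", "overpriced and bland"]
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]

# Bag-of-words (unigrams + bigrams) fed into a linear SVM.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["great service", "bland food"]))
```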
Funding
  • This work was partly supported by the National Basic Research Program (973 Program) under Grant Nos. 2012CB316301 and 2013CB329403, the National Science Foundation of China under Grant Nos. 61272227 and 61332007, and the Beijing Higher Education Young Elite Teacher Project.
  • The work was also supported by the Tsinghua University Beijing Samsung Telecom R&D Center Joint Laboratory for Intelligent Media Computing.
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • Frederic Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian Goodfellow, Arnaud Bergeron, Nicolas Bouchard, David Warde-Farley, and Yoshua Bengio. 2012. Theano: new features and speed improvements. arXiv preprint arXiv:1211.5590.
  • Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, Paul Tucker, Ke Yang, Quoc V. Le, et al. 2012. Large scale distributed deep networks. In Advances in Neural Information Processing Systems, pages 1223–1231.
  • Wankun Deng, Yongbo Wang, Zexian Liu, Han Cheng, and Yu Xue. 2014. HemI: a toolkit for illustrating heatmaps. PLoS ONE, 9(11):e111988.
  • Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. 2014. Adaptive recursive neural network for target-dependent twitter sentiment classification. In ACL (2), pages 49–54.
  • John Duchi, Elad Hazan, and Yoram Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12:2121–2159.
  • David Golub and Xiaodong He. 2016. Character-level question answering with attention. arXiv preprint arXiv:1604.00727.
  • Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems, pages 1684–1692.
  • Sepp Hochreiter and Jurgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2007. Building lexicon for sentiment analysis from massive collection of HTML documents. In EMNLP-CoNLL, pages 1075–1083.
  • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.
  • Bing Liu. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1):1–167.
  • Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In INTERSPEECH, volume 2, page 3.
  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119.
  • Volodymyr Mnih, Nicolas Heess, Alex Graves, et al. 2014. Recurrent models of visual attention. In Advances in Neural Information Processing Systems, pages 2204–2212.
  • Saif M. Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu. 2013. NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets. arXiv preprint arXiv:1308.6242.
  • Tony Mullen and Nigel Collier. 2004. Sentiment analysis using support vector machines with diverse information sources. In EMNLP, volume 4, pages 412–418.
  • Tetsuya Nasukawa and Jeonghee Yi. 2003. Sentiment analysis: Capturing favorability using natural language processing. In Proceedings of the 2nd International Conference on Knowledge Capture, pages 70–77. ACM.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pages 1532–1543.
  • Veronica Perez-Rosas, Carmen Banea, and Rada Mihalcea. 2012. Learning sentiment lexicons in Spanish. In LREC, volume 12, page 73.
  • Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. SemEval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 27–35.
  • Qiao Qian, Bo Tian, Minlie Huang, Yang Liu, Xuan Zhu, and Xiaoyan Zhu. 2015. Learning tag embeddings and tag-specific composition functions in recursive neural network. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, volume 1, pages 1365–1374.
  • Delip Rao and Deepak Ravichandran. 2009. Semi-supervised polarity lexicon induction. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 675–682. Association for Computational Linguistics.
  • Tim Rocktaschel, Edward Grefenstette, Karl Moritz Hermann, Tomas Kocisky, and Phil Blunsom. 2015. Reasoning about entailment with neural attention. arXiv preprint arXiv:1509.06664.
  • Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685.
  • Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. 2011. Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 151–161. Association for Computational Linguistics.
  • Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP, volume 1631, page 1642. Citeseer.
  • Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075.
  • Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu. 2015a. Target-dependent sentiment classification with long short term memory. arXiv preprint arXiv:1512.01100.
  • Duyu Tang, Bing Qin, and Ting Liu. 2015b. Document modeling with gated recurrent neural network for sentiment classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1422–1432.
  • Wenpeng Yin, Hinrich Schutze, Bing Xiang, and Bowen Zhou. 2015. ABCNN: Attention-based convolutional neural network for modeling sentence pairs. arXiv preprint arXiv:1512.05193.