Dependency Graph Enhanced Dual-transformer Structure for Aspect-based Sentiment Classification

ACL, pp. 6578-6588, 2020.

We propose a dependency graph enhanced dual-transformer network that jointly considers the flat representations learnt from the Transformer and the graph-based representations learnt from the corresponding dependency graph, in an iterative interaction manner.

Abstract:

Aspect-based sentiment classification is a popular task aimed at identifying the corresponding emotion of a specific aspect. One sentence may contain various sentiments for different aspects. Many sophisticated methods such as attention mechanisms and Convolutional Neural Networks (CNN) have been widely employed for handling this challenge…
Introduction
  • Aspect-based or aspect-level sentiment classification is a popular task with the purpose of identifying the sentiment polarity of the given aspect (Yang et al., 2017; Zhang and Liu, 2017; Zeng et al., 2019).
  • Giving a specific aspect is crucial for sentiment classification because one sentence sometimes contains several aspects, and these aspects may have different sentiment polarities.
  • Modern neural methods such as Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN) (Dong et al., 2014; Vo and Zhang, 2015) have already been widely applied to aspect-based sentiment classification.
  • CNN-based attention methods (Xue and Li, 2018; Li et al., 2018) have been proposed to enhance phrase-level representations and have achieved encouraging results.
Highlights
  • Aspect-based or aspect-level sentiment classification is a popular task with the purpose of identifying the sentiment polarity of the given aspect (Yang et al., 2017; Zhang and Liu, 2017; Zeng et al., 2019).
  • It is reported that lower performance is achieved even with the gold dependency tree, compared against using only the flat structure (Zhang et al., 2019). To address these two challenges, we propose a dependency graph enhanced dual-transformer network for aspect-based sentiment classification.
  • We can see that these two procedures provide a slight improvement. ‘– BiGCN (+GCN)’ means that we remove the bidirectional connections and use the original Graph Convolutional Network (GCN) only; the results show that the bidirectional GCN outperforms the original GCN owing to the richer connection information. ‘– BiAffine’ indicates that we remove the BiAffine process and use all the outputs of the dual-transformer structure; we can conclude that the BiAffine process is critical for our model, and a simple concatenation of the two sub-modules’ outputs is insufficient.
  • To introduce the Transformer into our task and diminish the error induced by incorrect dependency trees, we propose a dual-transformer structure that models the connections in the dependency tree with a supplementary Graph Convolutional Network (GCN) module, alongside a Transformer-like structure for the self-alignment of the traditional Transformer.
  • The edge information of the dependency trees needs to be exploited in future work.
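The bidirectional GCN described above can be sketched as follows. This is a minimal numpy illustration under assumed details (one weight matrix per edge direction, named `W_out`/`W_in` here, with self-loops and degree normalization), not the paper's exact formulation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def bigcn_layer(H, A, W_out, W_in):
    """One bidirectional GCN layer over a dependency graph (illustrative sketch).
    A[i, j] = 1 when word i is the head of word j; the transpose of A gives the
    reverse direction, and self-loops let each word keep its own features."""
    n = A.shape[0]
    A_out = A + np.eye(n)           # head -> dependent edges + self-loops
    A_in = A.T + np.eye(n)          # dependent -> head edges + self-loops
    # Normalize each row by its degree so messages are averaged, not summed.
    A_out = A_out / A_out.sum(axis=1, keepdims=True)
    A_in = A_in / A_in.sum(axis=1, keepdims=True)
    # Aggregate along each direction with its own weights, then fuse.
    return relu(A_out @ H @ W_out + A_in @ H @ W_in)

# Toy run: 4 words, hidden size 8, dependency edges 1->0, 1->3, 3->2.
rng = np.random.default_rng(0)
A = np.zeros((4, 4))
A[1, 0] = A[1, 3] = A[3, 2] = 1.0
H = rng.standard_normal((4, 8))
W_out, W_in = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
out = bigcn_layer(H, A, W_out, W_in)
print(out.shape)  # (4, 8)
```

Modeling the two edge directions separately is what distinguishes the BiGCN from the original GCN, which collapses the graph into a single symmetric adjacency.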
Methods
  • 5.1 Datasets

    The authors' experiments are conducted on five datasets, including one (Twitter) originally built by Dong et al. (2014); the other four (Lap14, Rest14, Rest15, Rest16) are respectively from SemEval 2014 task 4 (Pontiki et al., 2014), SemEval 2015 task 12 (Pontiki et al., 2015), and SemEval 2016 task 5 (Hercig et al., 2016), and consist of data from two categories: laptop and restaurant. Table 1 reports the Train/Test split sizes for each dataset.
Results
  • As shown in Table 2, the model DGEDT outperforms all other alternatives on all five datasets.
  • The authors conclude that the Transformer-only variant DGEDT (Transformer) obtains better performance than DGEDT (BiGCN) on most datasets.
  • DGEDT combines the two sub-modules and outperforms either single sub-module.
  • Note that the performance of individual modules is already reported in Table 2.
  • The authors can see that these two procedures provide a slight improvement.
  • ‘– BiGCN (+GCN)’ means that the authors remove the bidirectional connections and use the original GCN only; the results show that the bidirectional GCN outperforms the original GCN owing to the richer connection information.
  • ‘– BiAffine’ indicates that the authors remove the BiAffine process and use all the outputs of the dual-transformer structure; the authors conclude that the BiAffine process is critical for the model, and a simple concatenation of the two sub-modules’ outputs is insufficient.
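The BiAffine interaction in the ablation above can be sketched as a mutual attention exchange between the flat (Transformer) channel and the graph (BiGCN) channel. This is a hedged numpy sketch with hypothetical trainable matrices `W1` and `W2`; the paper's exact parameterization may differ:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def biaffine_exchange(H_flat, H_graph, W1, W2):
    """Mutual biaffine attention (illustrative sketch): each channel queries
    the other through a bilinear score, so information flows both ways before
    the next iteration of the dual-transformer."""
    attn_fg = softmax(H_flat @ W1 @ H_graph.T)   # flat positions attend over graph
    attn_gf = softmax(H_graph @ W2 @ H_flat.T)   # graph positions attend over flat
    return attn_fg @ H_graph, attn_gf @ H_flat

# Toy run: 5 words, hidden size 6.
rng = np.random.default_rng(1)
Hf, Hg = rng.standard_normal((5, 6)), rng.standard_normal((5, 6))
W1, W2 = rng.standard_normal((6, 6)), rng.standard_normal((6, 6))
Hf2, Hg2 = biaffine_exchange(Hf, Hg, W1, W2)
print(Hf2.shape, Hg2.shape)  # (5, 6) (5, 6)
```

In contrast, plain concatenation of `Hf` and `Hg` gives each channel no say in which positions of the other channel it reads from, which is the deficit the ablation points at.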
Conclusion
  • Neural structures with syntactical information, such as semantic dependency trees and constituent trees, are widely employed to enhance the word-level representations of traditional neural networks.
  • These structures are often modeled by Tree-LSTMs or GCNs. To introduce the Transformer into the task and diminish the error induced by incorrect dependency trees, the authors propose a dual-transformer structure that models the connections in the dependency tree with a supplementary GCN module, alongside a Transformer-like structure for the self-alignment of the traditional Transformer.
  • Domain-specific knowledge can be incorporated into the method as an external learning source.
Tables
  • Table1: Detailed statistics of five datasets in our experiments
  • Table2: Overall performance of accuracy and F1 on five datasets, AS means aspect-based
  • Table3: Overall ablation results of accuracy on five datasets
Funding
  • This work is supported through grants from the National Natural Science Foundation of China (NSFC-61772378), the National Key Research and Development Program of China (No. 2017YFC1200500), and the Major Projects of the National Social Science Foundation of China (No. 11&ZD189).
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  • Giuseppe Castellucci, Simone Filice, Danilo Croce, and Roberto Basili. 2014. UNITOR: aspect based sentiment analysis with structured learning. In Proceedings of the 8th International Workshop on Semantic Evaluation, SemEval@COLING 2014, Dublin, Ireland, August 23-24, 2014, pages 761–767. The Association for Computer Linguistics.
  • Zhuang Chen and Tieyun Qian. 2019. Transfer capsule network for aspect level sentiment classification. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pages 547–556. Association for Computational Linguistics.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.
  • Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. 2014. Adaptive recursive neural network for target-dependent twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 2: Short Papers, pages 49–54. The Association for Computer Linguistics.
  • Zhengjie Gao, Ao Feng, Xinyu Song, and Xi Wu. 2019. Target-dependent sentiment classification with BERT. IEEE Access, 7:154290–154299.
  • Tomáš Hercig, Tomáš Brychcín, Lukáš Svoboda, and Michal Konkol. 2016. UWB at SemEval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2016, San Diego, CA, USA, June 16-17, 2016, pages 342–349. The Association for Computer Linguistics.
  • Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Binxuan Huang, Yanglan Ou, and Kathleen M. Carley. 2018. Aspect level sentiment classification with attention-over-attention neural networks. In Social, Cultural, and Behavioral Modeling - 11th International Conference, SBP-BRiMS 2018, Washington, DC, USA, July 10-13, 2018, Proceedings, volume 10899 of Lecture Notes in Computer Science, pages 197–206. Springer.
  • Rie Johnson and Tong Zhang. 2015. Semi-supervised convolutional neural networks for text categorization via region embedding. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 919–927.
  • Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pages 1746–1751. ACL.
  • Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. CoRR, abs/1412.6980.
  • Thomas N. Kipf and Max Welling. 2017. Semisupervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.
  • Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Recurrent convolutional neural networks for text classification. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, pages 2267–2273. AAAI Press.
  • Xin Li, Lidong Bing, Wai Lam, and Bei Shi. 2018. Transformation networks for target-oriented sentiment classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pages 946– 956. Association for Computational Linguistics.
  • Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 1412–1421. The Association for Computational Linguistics.
  • Dehong Ma, Sujian Li, Xiaodong Zhang, and Houfeng Wang. 2017. Interactive attention networks for aspect-level sentiment classification. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017, pages 4068– 4074. ijcai.org.
  • Diego Marcheggiani and Ivan Titov. 2017. Encoding sentences with graph convolutional networks for semantic role labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017, pages 1506–1515. Association for Computational Linguistics.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pages 1532–1543. ACL.
  • Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Suresh Manandhar, and Ion Androutsopoulos. 2015. Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval@NAACLHLT 2015, Denver, Colorado, USA, June 4-5, 2015, pages 486–495. The Association for Computer Linguistics.
  • Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. 2014. Semeval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation, SemEval@COLING 2014, Dublin, Ireland, August 23-24, 2014, pages 27–35. The Association for Computer Linguistics.
  • Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Trans. Signal Processing, 45(11):2673–2681.
  • Chi Sun, Luyao Huang, and Xipeng Qiu. 2019. Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 380–385. Association for Computational Linguistics.
  • Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 1556–1566. The Association for Computer Linguistics.
  • Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu. 2016a. Effective lstms for target-dependent sentiment classification. In COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, December 11-16, 2016, Osaka, Japan, pages 3298– 3307. ACL.
  • Duyu Tang, Bing Qin, and Ting Liu. 2016b. Aspect level sentiment classification with deep memory network. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pages 214–224. The Association for Computational Linguistics.
  • 2018. Dating documents using graph convolution networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pages 1605– 1615. Association for Computational Linguistics.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 5998–6008.
  • Duy-Tin Vo and Yue Zhang. 2015. Target-dependent twitter sentiment classification with rich automatic features. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 1347–1353. AAAI Press.
  • Yequan Wang, Minlie Huang, Xiaoyan Zhu, and Li Zhao. 2016. Attention-based LSTM for aspectlevel sentiment classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pages 606–615. The Association for Computational Linguistics.
  • Jiangfeng Zeng, Xiao Ma, and Ke Zhou. 2019. Enhancing attention-based LSTM with position context for aspect-level sentiment classification. IEEE Access, 7:20462–20471.
  • Chen Zhang, Qiuchi Li, and Dawei Song. 2019. Aspect-based sentiment classification with aspectspecific graph convolutional networks. CoRR, abs/1909.03477.
  • Yue Zhang and Jiangming Liu. 2017. Attention modeling for targeted sentiment. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017, Valencia, Spain, April 3-7, 2017, Volume 2: Short Papers, pages 572–577. Association for Computational Linguistics.
  • Yuhao Zhang, Peng Qi, and Christopher D. Manning. 2018. Graph convolution over pruned dependency trees improves relation extraction. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pages 2205–2215. Association for Computational Linguistics.
  • Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, volume 37 of JMLR Workshop and Conference Proceedings, pages 2048–2057. JMLR.org.
  • Wei Xue and Tao Li. 2018. Aspect based sentiment analysis with gated convolutional networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pages 2514–2523. Association for Computational Linguistics.
  • Min Yang, Wenting Tu, Jingxuan Wang, Fei Xu, and Xiaojun Chen. 2017. Attention based LSTM for target dependent sentiment classification. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, pages 5013–5014. AAAI Press.
  • Liang Yao, Chengsheng Mao, and Yuan Luo. 2019. Graph convolutional networks for text classification. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 7370–7377. AAAI Press.