Unsupervised stance detection for arguments from consequences

EMNLP 2020, pp. 50–60.


Abstract

Social media platforms have become an essential venue for online deliberation where users discuss arguments, debate, and form opinions. In this paper, we propose an unsupervised method to detect the stance of argumentative claims with respect to a topic. Most related work focuses on topic-specific supervised models that need to be trained…

Introduction
  • In the context of decision making it is crucial to compare positive and negative effects that result from a potential decision.
  • Stance classification is the task of deciding whether a text is in favor of, against, or unrelated to a given topic.
  • This problem is related to opinion mining, but while opinion mining focuses on the sentiment polarity explicitly expressed by a text, stance detection aims to determine the position that the text holds with respect to a topic that is generally more abstract and might not be mentioned in the text.
  • The text Holocaust denial psychologically harms Holocaust survivors expresses a negative opinion, but its stance towards Criminalization of Holocaust denial is positive.
Highlights
  • Our contribution is three-fold: (i) we propose a fully unsupervised approach for stance detection, focusing on arguments that refer to consequences; (ii) we define rules over grammatical dependencies that exploit sentiment as well as effect words in order to determine good and bad consequences; (iii) we publish a new stance detection dataset that labels claims that refer to consequences, and which was crowdsourced on Amazon Mechanical Turk (AMT)
  • To understand why the annotators disagree on some instances, we investigated them and identified several possible reasons. Complexity: in the topic–claim pair Criminalization of Holocaust denial – Danger of public accepting holocaust denial should be fought by logic, both topic and claim have a negative stance towards Holocaust denial, which suggests the label in favor
  • We propose a fully unsupervised method to detect the stance of arguments from consequences in online debates
  • Its good performance on the claims that refer to consequences reinforces our intuition that designing systems tailored for particular argumentation schemes might be a good alternative to topic-specific models
Results
  • Results and Discussion

    The results that compare the system to BERT and the sentiment detection baseline are presented in Table 6.

    (Footnotes: https://idebate.org/; we worked with the original BERT release: https://github.com/google-research/bert.)

    Concerning the two stance classes, with both lexicon settings, the system is better than BERT at predicting the pro class in arguments from consequences, but is outperformed on the con class.
  • The authors computed the macro-F1 standard deviation of the system with ECF when run on the same 10 folds; the values lie between .03 on debate and .07 on conseq.
  • This indicates that the unsupervised approach is more robust, with more predictable performance.
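
To make the fold-wise comparison concrete, below is a minimal sketch of how such a per-fold macro-F1 mean and standard deviation can be computed. The fold data shown is hypothetical and scikit-learn's f1_score is used as a stand-in; this is not the authors' evaluation code.

```python
import numpy as np
from sklearn.metrics import f1_score

def macro_f1_across_folds(fold_gold, fold_pred):
    """Compute macro-F1 per fold, then its mean and standard deviation.

    fold_gold, fold_pred: lists with one sequence of labels per fold
    (e.g. "pro"/"con"), mirroring the paper's 10-fold comparison.
    """
    scores = [f1_score(gold, pred, average="macro")
              for gold, pred in zip(fold_gold, fold_pred)]
    return float(np.mean(scores)), float(np.std(scores))

# Hypothetical toy folds, only to illustrate the call signature.
gold = [["pro", "con", "pro", "con"], ["con", "con", "pro", "pro"]]
pred = [["pro", "con", "con", "con"], ["con", "pro", "pro", "pro"]]
mean_f1, std_f1 = macro_f1_across_folds(gold, pred)
print(f"macro-F1: {mean_f1:.2f} +/- {std_f1:.2f}")
```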
Conclusion
  • The authors propose a fully unsupervised method to detect the stance of arguments from consequences in online debates.
  • The authors annotated arguments from Debatepedia regarding their stance and whether they involve consequences or not.
  • Besides the future extensions of this approach that the authors mentioned in the results discussion and error analysis, this work opens several interesting research paths.
  • Its good performance on the claims that refer to consequences reinforces the intuition that designing systems tailored for particular argumentation schemes might be a good alternative to topic-specific models.
  • The authors plan to complement this work with approaches for other frequently applied schemes, such as arguments by expert opinion and arguments by example.
Tables
  • Table 1: Example of a topic–claim pair. In Table 1, the target of both topic and claim is medical marijuana. Our solution first determines the stance of the claim and of the topic towards their respective targets Tc and Tt, and then uses these stances and the semantic relation between the targets to determine the claim's stance towards the topic.
  • Table 2: Dependency graph patterns. ∗ ∈ {dobj, nsubjpass, cobj, csubjpass, nmod, xcomp}; ∈ {nsubj, csubj}; † ∈ {amod, nn, advmod}; NegP stands for negative preposition. A minimal pattern-matching sketch is given after the table list.
  • Table 3: Worked-out examples.
  • Table 4: Fleiss' kappa as a function of the number of valid annotations.
  • Table 5: Class distributions.
  • Table 6: Experimental results: F1 scores per stance class (pro and con), macro-F1 (mac), and accuracy (acc). For BERT, we show the mean of the respective cross-validation results and their standard deviation. First, as expected, our system performs better on arguments related to consequences than on other arguments, with a macro-F1 difference of 10pp between conseq and other. Further, our system with both lexicon settings consistently outperforms the sent baseline, but its macro-F1 score is outperformed by BERT on conseq and wiki, and its accuracy is outperformed by BERT on all datasets. This is not surprising, given that we use BERT pre-trained and then fine-tuned on our data. Interestingly, our system with ECF achieves better macro-F1 scores than BERT on the arguments that are not related to consequences (other) and on the complete debate dataset. This indicates that our method can deal reasonably well with arguments that are not from consequences.
  • Table 7: Evaluation of the target identification and stance detection strategies; r denotes the rate of data instances.
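
As referenced in the Table 2 caption, the following is a minimal sketch of how one such dependency graph pattern could be matched, using spaCy as a stand-in parser (the paper uses Stanford CoreNLP). The effect and sentiment mini-lexicons and the single verb pattern shown here are illustrative assumptions, not the authors' actual resources (the paper's lexicon settings are ECF and EWN) or full rule set.

```python
# A minimal sketch of matching one Table 2-style pattern: an effect word
# whose subject contains the target and whose object carries sentiment.
# Assumes spaCy and its small English model are installed
# (pip install spacy && python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

# Illustrative mini-lexicons with +1/-1 signs; placeholders for the
# effect and sentiment lexicons used in the paper.
EFFECT = {"harm": -1, "damage": -1, "help": +1, "improve": +1}
SENTIMENT = {"survivor": +1, "innocent": +1, "crime": -1, "disease": -1}

def stance_toward_target(text, target):
    """Return +1 (in favor), -1 (against), or None if no pattern matches."""
    doc = nlp(text)
    for tok in doc:
        eff = EFFECT.get(tok.lemma_)
        if eff is None:
            continue
        subjects = [c for c in tok.children if c.dep_ in ("nsubj", "csubj")]
        objects = [c for c in tok.children
                   if c.dep_ in ("dobj", "nsubjpass", "xcomp")]
        for subj in subjects:
            span = " ".join(t.text.lower() for t in subj.subtree)
            if target.lower() not in span:
                continue
            for obj in objects:
                sent = SENTIMENT.get(obj.lemma_, 0)
                if sent:
                    dir_ = +1  # no reduction / negative-preposition handling here
                    return dir_ * eff * sent
    return None

# The paper's introductory example: the claim is against its own target.
print(stance_toward_target(
    "Holocaust denial psychologically harms Holocaust survivors",
    "Holocaust denial"))  # expected: -1
```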
Funding
  • This work has been funded by the Deutsche Forschungsgemeinschaft (DFG) within the project ExpLAIN, Grant Number STU 266/14-1, as part of the Priority Program "Robust Argumentation Machines (RATIO)" (SPP-1999).
Study Subjects and Analysis
cases: 4
To infer the stance that a statement expresses towards its target, we use the intuition that the stance is unfavorable when the text expresses negative consequences of the target, and positive otherwise. Thus, we define that the stance towards the target is positive in exactly the following four cases: (i) the target's amplification implies a positive effect over something good (dir = eff = sent = +1); (ii) the target's amplification implies a negative effect over something bad (dir = +1, eff = sent = −1); (iii) the target's reduction implies a negative effect over something good (dir = eff = −1, sent = +1); (iv) the target's reduction implies a positive effect over something bad (dir = −1, eff = +1, sent = −1). Hence, the stance is favorable towards the target if the product of the three components' values is +1. Consequently, we define the stance of a statement towards the target as s = dir · eff · sent and interpret s = +1 as In favor and s = −1 as Against. (The negative prepositions are except, less, minus, opposite, sans, unlike, versus, without, w/o, vice, instead (of), and lack.)
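
Since the rule is pure sign arithmetic, it can be stated in a few lines. The sketch below simply enumerates all sign combinations and confirms that exactly the four cases above come out favorable; it illustrates the formula s = dir · eff · sent and is not the authors' implementation.

```python
from itertools import product

def stance(dir_, eff, sent):
    """s = dir * eff * sent; +1 means In favor, -1 means Against."""
    return dir_ * eff * sent

# Exactly four of the eight sign combinations are favorable,
# matching cases (i)-(iv) above.
favorable = [combo for combo in product((+1, -1), repeat=3)
             if stance(*combo) == +1]
print(favorable)
# [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
```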

Featured Debate Digest articles: 236
To create such a corpus, we run an AMT crowdsourcing study, where we annotate claims and topics extracted from Debatepedia. We only use the 236 Featured Debate Digest articles as they are of higher quality. They contain more than 10,000 arguments labeled by their author as either pro or con the debate's topic.

pairs: 1502
Since the original labels are only pro or con, all pairs that our study determined as neither are removed. This filter resulted in a total of 1502 pairs, out of which 822 have been annotated as relating to consequences.

pairs: 822
We report results both on the 822 pairs that relate to consequences, denoted by conseq, and on the rest of the pairs, denoted by other, as well as on their union, denoted by debate. For checking the performance of the systems on an independent dataset, we also use the claim stance dataset published by Bar-Haim et al. (2017a).
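
The split bookkeeping above is simple enough to state as code. A minimal sketch with hypothetical field names (the released dataset's actual schema may differ):

```python
# Hypothetical records; "consequence" marks pairs annotated as
# relating to consequences. 1502 pairs in total after filtering.
pairs = [
    {"topic": "Medical marijuana", "claim": "...", "stance": "pro",
     "consequence": True},
    # ...
]

conseq = [p for p in pairs if p["consequence"]]      # 822 pairs
other = [p for p in pairs if not p["consequence"]]   # remaining 680 pairs
debate = conseq + other                              # all 1502 pairs
```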

References
  • Aseel Addawood, Jodi Schneider, and Masooda Bashir. 2017. Stance classification of twitter debates: The encryption debate as a use case. In Proceedings of the 8th International Conference on Social Media & Society, #SMSociety17, New York, NY, USA. Association for Computing Machinery.
  • Khalid Al-Khatib, Yufang Hou, Henning Wachsmuth, Charles Jochim, Francesca Bonin, and Benno Stein. 2020. End-to-end argumentation knowledge graph construction. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020).
  • Pranav Anand, Marilyn Walker, Rob Abbott, Jean E. Fox Tree, Robeson Bowmani, and Michael Minor. 2011. Cats rule and dogs drool!: Classifying stance in online debate. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011), pages 1– 9, Portland, Oregon. Association for Computational Linguistics.
  • Roy Bar-Haim, Indrajit Bhattacharya, Francesco Dinuzzo, Amrita Saha, and Noam Slonim. 2017a. Stance classification of context-dependent claims. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 251–261, Valencia, Spain. Association for Computational Linguistics.
  • Roy Bar-Haim, Lilach Edelstein, Charles Jochim, and Noam Slonim. 2017b. Improving claim stance classification with lexical knowledge expansion and context utilization. In Proceedings of the 4th Workshop on Argument Mining. Association for Computational Linguistics.
  • Yoonjung Choi and Janyce Wiebe. 2014. +/EffectWordNet: Sense-level lexicon acquisition for opinion inference. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1181–1191, Doha, Qatar. Association for Computational Linguistics.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Kuntal Dey, Ritvik Shrivastava, and Saroj Kaushik. 2018. Topical stance detection for Twitter: A two-phase LSTM model using attention. In Advances in Information Retrieval, pages 529–536, Cham. Springer International Publishing.
  • Jiachen Du, Ruifeng Xu, Yulan He, and Lin Gui. 2017. Stance classification with target-specific neural attention. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, pages 3988–3994.
  • Adam Faulkner. 2014. Automated classification of stance in student essays: An approach using stance target information and the wikipedia link-based measure. Proceedings of the 27th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2014, pages 174–179.
  • Christiane Fellbaum. 2010. Princeton University: About WordNet.
  • Shalmoli Ghosh, Prajwal Singhania, Siddharth Singh, Koustav Rudra, and Saptarshi Ghosh. 2019. Stance detection in web and social media: A comparative study. In Experimental IR Meets Multilinguality, Multimodality, and Interaction, pages 75–87, Cham. Springer International Publishing.
  • Subrata Ghosh, Konjengbam Anand, Sailaja Rajanala, A Bharath Reddy, and Manish Singh. 2018. Unsupervised stance classification in online debates. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, CoDS-COMAD '18, pages 30–36, Goa, India. ACM.
  • Kazi Saidul Hasan and Vincent Ng. 2013. Extralinguistic constraints on stance recognition in ideological debates. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 816–821, Sofia, Bulgaria. Association for Computational Linguistics.
  • Dirk Hovy, Taylor Berg-Kirkpatrick, Ashish Vaswani, and Eduard Hovy. 2013. Learning whom to trust with MACE. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1120–1130, Atlanta, Georgia. Association for Computational Linguistics.
  • Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’04, page 168–177, New York, NY, USA. Association for Computing Machinery.
  • Anand Konjengbam, Subrata Ghosh, Nagendra Kumar, and Manish Singh. 2018. Debate stance classification using word embeddings. In Big Data Analytics and Knowledge Discovery, pages 382–395, Cham. Springer International Publishing.
  • Dilek Küçük and Fazlı Can. 2020. Stance detection: A survey. ACM Computing Surveys, 53(1).
  • Christopher Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55–60, Baltimore, Maryland. Association for Computational Linguistics.
  • Saif Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, and Colin Cherry. 2016. SemEval-2016 task 6: Detecting stance in tweets. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 31– 41, San Diego, California. Association for Computational Linguistics.
  • Dhanya Sridhar, Lise Getoor, and Marilyn Walker. 2014. Collective stance classification of posts in online debate forums. In Proceedings of the Joint Workshop on Social Dynamics and Personal Attributes in Social Media, pages 109–117, Baltimore, Maryland. Association for Computational Linguistics.
  • Akiko Murakami and Rudy Raymond. 2010. Support or oppose? classifying positions in online debates from reply activities and opinion expressions. In Coling 2010: Posters, pages 869–875, Beijing, China. Coling 2010 Organizing Committee.
  • Pavithra Rajendran, Danushka Bollegala, and Simon Parsons. 2016. Contextual stance classification of opinions: A step towards enthymeme reconstruction in online reviews. In Proceedings of the Third Workshop on Argument Mining (ArgMining2016), pages 31–39, Berlin, Germany. Association for Computational Linguistics.
  • Hannah Rashkin, Sameer Singh, and Yejin Choi. 2016. Connotation frames: A data-driven investigation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 311–321, Berlin, Germany. Association for Computational Linguistics.
  • Paul Reisert, Naoya Inoue, Tatsuki Kuribayashi, and Kentaro Inui. 2018. Feasible Annotation Scheme for Capturing Policy Argument Reasoning using Argument Templates. In Proceedings of the 5th Workshop on Argument Mining, pages 79–89, Brussels, Belgium. Association for Computational Linguistics.
  • Benjamin Schiller, Johannes Daxenberger, and Iryna Gurevych. 2020. Stance detection benchmark: How robust is your stance detection?
  • Parinaz Sobhani, Saif Mohammad, and Svetlana Kiritchenko. 2016. Detecting stance in tweets and analyzing its interaction with sentiment. In Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics, pages 159–169, Berlin, Germany. Association for Computational Linguistics.
  • Swapna Somasundaran and Janyce Wiebe. 2009. Recognizing stances in online debates. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 226–234, Suntec, Singapore. Association for Computational Linguistics.
  • Qingying Sun, Zhongqing Wang, Qiaoming Zhu, and Guodong Zhou. 2018. Stance detection with hierarchical attention network. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2399–2409, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
  • Matt Thomas, Bo Pang, and Lillian Lee. 2006. Get out the vote: Determining support or opposition from congressional floor-debate transcripts. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 327–335, Sydney, Australia. Association for Computational Linguistics.
  • Orith Toledo-Ronen, Roy Bar-Haim, Alon Halfon, Charles Jochim, Amir Menczel, Ranit Aharonov, and Noam Slonim. 2018. Learning sentiment composition from sentiment lexicons. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2230–2241, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
  • Douglas Walton, Christopher Reed, and Fabrizio Macagno. 2008. Argumentation Schemes. Cambridge University Press.
  • Rui Wang, Deyu Zhou, Mingmin Jiang, Jiasheng Si, and Yang Yang. 2019. A survey on opinion mining: From stance to product aspect. IEEE Access, PP:1–1.
  • Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing (HLT-EMNLP), pages 347–354, Vancouver, British Columbia, Canada. Association for Computational Linguistics.
  • Swapna Somasundaran and Janyce Wiebe. 2010. Recognizing stances in ideological on-line debates. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pages 116–124, Los Angeles, CA. Association for Computational Linguistics.