
NULI at SemEval-2019 Task 6: Transfer Learning for Offensive Language Detection using Bidirectional Transformers.

pp. 87–91 (2019)

Abstract

Transfer learning and domain adaptive learning have been applied to various fields including computer vision (e.g., image recognition) and natural language processing (e.g., text classification). One of the benefits of transfer learning is to learn effectively and efficiently from limited labeled data with a pre-trained model. In the shared task …

Introduction
  • Anti-social online behaviors, including cyberbullying, trolling, and offensive language (Xu et al., 2012; Kwok and Wang, 2013; Cheng et al., 2017), are attracting increasing attention across social networks.
  • Intervention against such behaviors should be taken at the earliest opportunity.
  • The three sub-tasks are evaluated independently using the macro-F1 metric.
Highlights
  • Anti-social online behaviors, including cyberbullying, trolling, and offensive language (Xu et al., 2012; Kwok and Wang, 2013; Cheng et al., 2017), are attracting increasing attention across social networks.
  • The shared task has three sub-tasks: a) identifying whether a post is offensive (OFF) or not (NOT); b) identifying the offense type of an offensive post as targeted insult (TIN), targeted threat (TTH), or untargeted (UNT); c) for a post labeled TIN/TTH in sub-task B, identifying the target of the offense as an individual (IND), a group of people (GRP), an organization or entity (ORG), or other (OTH). The BERT implementation used is https://github.com/google-research/bert.
  • LSTM has been used in many natural language processing tasks, such as sentiment classification, neural machine translation, and language generation.
  • Detecting offensive language and online hostility is a crucial problem on social networks.
  • Our work shows competitive results in this shared task using preprocessing customized to the dataset, as well as the power of pre-trained models.
Methods
  • Linear model: the authors first select logistic regression as the baseline model to establish the lower bound on performance to compare against.
  • The authors adopt the pre-trained word2vec model from Google and aggregate the maximum and average values in each dimension as sentence features (a minimal sketch of this baseline follows this list).
  • Preprocessing and feature tools: emoji substitution (https://github.com/carpedm20/emoji), word segmentation (https://github.com/grantjenks/python-wordsegment), and pre-trained word2vec (https://code.google.com/archive/p/word2vec/).
  • [Tables (a) and (b) report results for sub-tasks A and B, comparing the All NOT, All OFF, Linear, LSTM, and BERT systems.]
  • The authors validate all feature combinations and report the accuracy and F1 of the best one to determine the model parameters.
  • The authors use LSTM as a second, stronger baseline model for comparison (a minimal LSTM sketch also follows this list).
  • The maximum sequence length for BERT is 140; longer sentences are truncated and shorter ones padded (see the fine-tuning sketch below).
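
A minimal sketch of the linear baseline described above, assuming gensim for loading the Google word2vec vectors and plain whitespace tokenization (the summary does not specify the authors' exact setup, so the model file name and the toy data are assumptions): each tweet is represented by the per-dimension maximum and average of its word vectors, then classified with scikit-learn's LogisticRegression.

```python
# Hedged sketch of the logistic-regression baseline: max/average pooling
# of pre-trained word2vec vectors as sentence features.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.linear_model import LogisticRegression

# Pre-trained 300-dim Google News vectors (file name is an assumption;
# download from the word2vec archive linked in the tools bullet above).
w2v = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def featurize(tweet: str) -> np.ndarray:
    """Concatenate the element-wise max and mean of the word vectors."""
    vecs = [w2v[w] for w in tweet.split() if w in w2v]
    if not vecs:  # no in-vocabulary words: fall back to zeros
        vecs = [np.zeros(w2v.vector_size, dtype=np.float32)]
    stacked = np.stack(vecs)
    return np.concatenate([stacked.max(axis=0), stacked.mean(axis=0)])

# Hypothetical examples; OLID provides the real (tweet, label) pairs.
train_tweets = ["you are awful", "have a nice day"]
train_labels = ["OFF", "NOT"]
X = np.stack([featurize(t) for t in train_tweets])
clf = LogisticRegression(max_iter=1000).fit(X, train_labels)
print(clf.predict([featurize("you are terrible")]))
```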
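The LSTM baseline is likewise only named in the summary. Below is a self-contained sketch under assumed dimensions; the vocabulary size, embedding and hidden sizes, and bidirectionality are illustrative choices, not the authors' reported configuration.

```python
# Sketch of an LSTM baseline: embedding layer, bidirectional LSTM encoder,
# and a linear classifier over the concatenated final hidden states.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=100,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)            # (batch, seq, embed)
        _, (h_n, _) = self.lstm(embedded)           # h_n: (2, batch, hidden)
        final = torch.cat([h_n[0], h_n[1]], dim=1)  # both directions
        return self.fc(final)                       # (batch, num_classes)

model = LSTMClassifier()
logits = model(torch.randint(1, 30000, (4, 140)))  # batch of 4, 140 tokens
print(logits.shape)  # torch.Size([4, 2])
```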
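For the submitted BERT model, the summary reports only the 140-token limit, and the footnoted implementation is Google's original BERT repository. The sketch below uses the Hugging Face transformers API as a stand-in for that code; the model checkpoint, label ids, and learning rate are assumptions.

```python
# Sketch of BERT fine-tuning for sub-task A with the 140-token limit.
# Uses Hugging Face transformers rather than the authors' Google BERT repo;
# hyperparameters and label ids are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # OFF vs. NOT

tweets = ["you are awful", "have a nice day"]
labels = torch.tensor([1, 0])  # 1 = OFF, 0 = NOT (hypothetical ids)

# Sentences longer than 140 tokens are truncated; shorter ones are padded.
batch = tokenizer(tweets, max_length=140, truncation=True,
                  padding="max_length", return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # one illustrative training step
outputs.loss.backward()
optimizer.step()
print(float(outputs.loss))
```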
Results
  • The evaluation metric of the task is macro-F1, the unweighted average of the per-class F1 scores.
  • An independent validation set, split from the training set, is used for model selection.
  • The results table reflects the problem of imbalanced data: higher accuracy does not guarantee a higher macro-F1 score (a small illustration follows this list).
  • Based on the validation results, the authors choose BERT as the model for the final submission.
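
As a small illustration of the accuracy-versus-macro-F1 point (with made-up labels, not the OLID distribution): a majority-class predictor like the "All NOT" system in the tables can look accurate while its macro-F1 collapses, because macro-F1 averages per-class F1 without weighting by class frequency.

```python
# Why accuracy can be high while macro-F1 is low on imbalanced data.
# The counts below are hypothetical, chosen only to make the point.
from sklearn.metrics import accuracy_score, f1_score

y_true = ["NOT"] * 8 + ["OFF"] * 2  # 8 NOT vs. 2 OFF
y_pred = ["NOT"] * 10               # "All NOT" majority-class predictor

print(accuracy_score(y_true, y_pred))                          # 0.8
print(f1_score(y_true, y_pred, average="macro",
               zero_division=0))                               # ~0.44
```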
Conclusion
  • Detecting offensive language and online hostility is a crucial problem on social networks.
  • The small proportion of the minority classes and the morphological variety of the language make high performance difficult to achieve.
  • The diversity and evolution of language across different age groups is another challenge for social media detection tasks.
  • The authors' work shows competitive results in this shared task using preprocessing customized to the dataset, as well as the power of pre-trained models.
  • Tuning the parameters is nontrivial, and there are more efficient approaches to be explored that could yield better performance.
Tables
  • Table 1: Data distribution. The first two rows give the class distribution for sub-task A, the middle two rows for sub-task B, and the last three rows for sub-task C.
  • Table 2: Results on dev data.
  • Table 3: Results on test data.
Related Work
  • Schmidt and Wiegand (2017) surveyed features widely used for hate speech detection, including simple surface features and word generalization.

Data and Methodology

    3.1 Data Description

    The Offensive Language Identification Dataset (OLID) (Zampieri et al., 2019a) was collected from the Twitter API by searching a set of keywords. The keywords include unbiased targeted phrases such as 'she is', 'he is', and 'you are', which yield a high proportion of offensive tweets. The proportion of offensive tweets was controlled at around 30% using different sampling methods. Another observation reported in the paper is that political tweets, retrieved with keywords such as 'MAGA', 'liberal', and 'conservative', tend to be more likely offensive.

    [Table 1 gives the class counts (NOT/OFF; TIN/UNT; IND/GRP/OTH) for the StartKit, Training, and Testing splits.]

    The main task of this competition is decomposed into three levels according to the hierarchical annotation: a) Offensive Language Detection; b) Categorization of Offensive Language; c) Offensive Language Target Identification. All three tasks share the same dataset, and each later sub-task's data is a subset of the previous one's.

    3.2 Preprocessing

    Emoji substitution: the authors use an open-source emoji project on GitHub (https://github.com/carpedm20/emoji) that maps emoji unicode to substitute phrases. These phrases are treated as regular English so their semantic meanings are preserved, which matters especially when the dataset size is limited. A minimal sketch of this preprocessing follows below.
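
A minimal sketch of the described preprocessing, combining the two libraries cited in the Methods tools bullet: emoji-to-phrase substitution via carpedm20/emoji and word segmentation via grantjenks/python-wordsegment. Applying segmentation to hashtags in particular, and the regex used, are assumptions; the summary only names the tools.

```python
# Sketch of the preprocessing step using the two cited libraries:
# emoji -> phrase substitution, then hashtag word segmentation.
# The exact cleaning rules are assumptions; only the libraries are
# from the source.
import re
import emoji                            # github.com/carpedm20/emoji
from wordsegment import load, segment   # grantjenks/python-wordsegment

load()  # load wordsegment's unigram/bigram frequency data once

def preprocess(tweet: str) -> str:
    # Replace each emoji with its name so the semantics survive as text,
    # e.g. a smiling-face emoji becomes "smiling_face_with_smiling_eyes".
    text = emoji.demojize(tweet, delimiters=(" ", " "))
    # Split hashtags into words: "#GoodVibes" -> "good vibes".
    text = re.sub(r"#(\w+)",
                  lambda m: " ".join(segment(m.group(1))),
                  text)
    return text

print(preprocess("I love this! 😊 #GoodVibes"))
```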
References
  • Justin Cheng, Michael Bernstein, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. 2017. Anyone can become a troll: Causes of trolling behavior in online discussions. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pages 1217–1230. ACM.
  • Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated Hate Speech Detection and the Problem of Offensive Language. In Proceedings of ICWSM.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805.
  • Darja Fišer, Tomaž Erjavec, and Nikola Ljubešič. 2017. Legal Framework, Dataset and Annotation Schema for Socially Unacceptable On-line Discourse Practices in Slovene. In Proceedings of the Workshop on Abusive Language Online (ALW), Vancouver, Canada.
  • Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Irene Kwok and Yuzhou Wang. 2013. Locate the Hate: Detecting Tweets Against Blacks. In Twenty-Seventh AAAI Conference on Artificial Intelligence.
  • Ping Liu, Joshua Guberman, Libby Hemphill, and Aron Culotta. 2018. Forecasting the presence and intensity of hostility on Instagram using linguistic and social features. In Twelfth International AAAI Conference on Web and Social Media.
  • Anna Schmidt and Michael Wiegand. 2017. A Survey on Hate Speech Detection Using Natural Language Processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pages 1–10, Valencia, Spain. Association for Computational Linguistics.
  • Huei-Po Su, Chen-Jie Huang, Hao-Tsung Chang, and Chuan-Jie Lin. 2017. Rephrasing Profanity in Chinese Text. In Proceedings of the Workshop on Abusive Language Online (ALW), Vancouver, Canada.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
  • Zeerak Waseem, Thomas Davidson, Dana Warmsley, and Ingmar Weber. 2017. Understanding Abuse: A Typology of Abusive Language Detection Subtasks. In Proceedings of the First Workshop on Abusive Language Online.
  • Jun-Ming Xu, Kwang-Sung Jun, Xiaojin Zhu, and Amy Bellmore. 2012. Learning from bullying traces in social media. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 656–666. Association for Computational Linguistics.
  • Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019a. Predicting the Type and Target of Offensive Posts in Social Media. In Proceedings of NAACL.
  • Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019b. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). In Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval).