We project predictions on comparable data in Bengali, Hindi, and Spanish, and we report results of 0.8415 macro F1 for Bengali, 0.8568 macro F1 for Hindi, and 0.7513 macro F1 for Spanish.
Multilingual Offensive Language Identification with Cross-lingual Embeddings
EMNLP 2020, pp. 5838–5844 (2020)
Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g. hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partially motivated by the availability of English resources […]
- Offensive posts on social media result in a number of undesired consequences for users.
- To the best of the authors' knowledge, state-of-the-art cross-lingual contextual embeddings such as XLM-R (Conneau et al, 2019) have not yet been applied to offensive language identification.
- To address this gap, the authors evaluate the performance of cross-lingual contextual embeddings and transfer learning (TL) methods in projecting predictions from English to other languages.
- The authors take advantage of existing English data to project predictions in three other languages: Bengali, Hindi, and Spanish
- When we adopt XLM-R for multilingual offensive language identification, we perform transfer learning in two different ways
- We save the weights of the XLM-R model as well as the softmax layer, and we use these saved weights from English to initialise the weights for a new language. To explore this transfer learning aspect, we experimented with the Hindi data released for the HASOC 2019 shared task (Mandl et al, 2019) and the Spanish data released for HatEval 2019 (Basile et al, 2019)
- We have shown that XLM-R with transfer learning outperforms all of the other methods we tested, as well as the best results obtained by participants of the three competitions
- The results obtained by our models confirm that the general hierarchical annotation model of the offensive language identification dataset (OLID) encompasses multiple types of offensive content, such as the aggression included in the Bengali dataset and the hate speech included in the Hindi and Spanish datasets, allowing us to model these different subtasks jointly using the methods described in this paper
- This opens exciting new avenues for future research considering the multitude of phenomena, annotation schemes and guidelines used in offensive language datasets
- Transformer models have been used successfully for various NLP tasks (Devlin et al, 2019).
- Several multilingual models such as BERT-m (Devlin et al, 2019) have been released, and there was much speculation about their ability to represent all of their training languages (Pires et al, 2019); although BERT-m showed some cross-lingual characteristics, it had not been trained on cross-lingual data (Karthikeyan et al, 2020)
- The motivation behind this methodology was the recent release of cross-lingual transformer models such as XLM-R (Conneau et al, 2019), which has been trained on data covering 100 languages.
- This process is known as transfer learning and is illustrated in Figure 1
- Inter-language transfer learning: the authors first trained the XLM-R classification model on the first level (level A) of the English offensive language identification dataset (OLID) (Zampieri et al, 2019a).
- The authors did not use the weights of the last softmax layer since the authors wanted to test this strategy on data that has a different number of offensive classes to predict
- The authors explored this transfer learning aspect with the Bengali dataset released for the TRAC-2 shared task (Kumar et al, 2020).
- As described in Section 3, the classifier should perform a 3-way classification between ‘Overtly Aggressive’, ‘Covertly Aggressive’ and ‘Non Aggressive’ text data
- This paper is the first study to apply cross-lingual contextual word embeddings in offensive language identification projecting predictions from English to other languages using benchmarked datasets from shared tasks on Bengali (Kumar et al, 2020), Hindi (Mandl et al, 2019), and Spanish (Basile et al, 2019).
- The authors would like to further evaluate the models using SOLID, a novel large English dataset with over 9 million tweets (Rosenthal et al, 2020), along with datasets in four other languages (Arabic, Danish, Greek, and Turkish) that were made available for the second edition of OffensEval (Zampieri et al, 2020)
- These datasets were collected using the same methodology and were annotated according to OLID’s guidelines.
- The authors would like to apply the models to languages with even fewer resources available, to help cope with the problem of offensive language in social media
- Table 1: Instances (Inst.), source (S) and labels in all datasets. F stands for Facebook and T for Twitter
- Table 2: Results ordered by macro (M) F1 for Bengali and weighted (W) F1 for Hindi and Spanish
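The two transfer-learning strategies summarised in the points above can be sketched as follows. This is an illustrative toy, not the authors' released code: the dict-of-lists "weights", the layer names, and the dimensions are all assumptions made here for clarity.

```python
# Toy sketch of the two XLM-R transfer-learning strategies.
import copy
import random

HIDDEN = 4  # assumed toy encoder output dimension


def new_softmax(num_classes):
    """Freshly initialised classification head for `num_classes` labels."""
    return [[random.gauss(0.0, 0.02) for _ in range(HIDDEN)]
            for _ in range(num_classes)]


# Stand-in for XLM-R fine-tuned on English OLID level A (OFF vs. NOT).
english_weights = {
    "encoder": [random.gauss(0.0, 0.02) for _ in range(HIDDEN)],
    "softmax": new_softmax(2),
}


def transfer_full(saved):
    """Strategy used for Hindi and Spanish: the target task is also
    binary, so encoder AND softmax weights are reused as initialisation."""
    return copy.deepcopy(saved)


def transfer_encoder_only(saved, num_new_classes):
    """Strategy used for Bengali: the target task has a different number
    of classes (3-way aggression), so the English softmax layer is
    discarded and a fresh head is initialised; only encoder weights
    are transferred."""
    return {
        "encoder": copy.deepcopy(saved["encoder"]),
        "softmax": new_softmax(num_new_classes),
    }


hindi_model = transfer_full(english_weights)
bengali_model = transfer_encoder_only(english_weights, num_new_classes=3)

assert hindi_model["softmax"] == english_weights["softmax"]
assert bengali_model["encoder"] == english_weights["encoder"]
assert len(bengali_model["softmax"]) == 3  # 3-way TRAC-2 head
```

The asymmetry between the two helpers mirrors the text: softmax weights only transfer when the label sets have the same size.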
There is a growing interest in the development of computational models to identify offensive content online. Early approaches relied heavily on feature engineering combined with traditional machine learning classifiers such as Naive Bayes and support vector machines (Xu et al, 2012; Dadvar et al, 2013). More recently, neural networks such as LSTMs, bidirectional LSTMs, and GRUs combined with word embeddings have proved to outperform traditional machine learning methods in this task (Aroyehun and Gelbukh, 2018; Majumder et al, 2018). In the last couple of years, contextualized embedding models such as ELMo (Peters et al, 2018) and transformer models such as BERT (Devlin et al, 2019) have been applied to offensive language identification, achieving competitive scores and topping the leaderboards in recent shared tasks (Liu et al, 2019; Ranasinghe et al, 2019). Most of these approaches use existing pre-trained transformer models, which can also be used as text classification models.
The clear majority of studies on this topic deal with English (Malmasi and Zampieri, 2017; Yao et al, 2019; Ridenhour et al, 2020), partially motivated by the availability of English resources (e.g. corpora, lexica, and pre-trained models). In recent years, a number of studies have been published on other languages such as Arabic (Mubarak et al, 2020), Danish (Sigurbergsson and Derczynski, 2020), Dutch (Tulkens et al, 2016), French (Chiril et al, 2019), Greek (Pitenis et al, 2020), Italian (Poletto et al, 2017), Portuguese (Fortuna et al, 2019), Slovene (Fiser et al, 2017), and Turkish (Çöltekin, 2020), creating new datasets and resources for these languages.
We acquired datasets in English and three other languages: Bengali, Hindi, and Spanish (listed in Table 1). The four datasets have been used in shared tasks in 2019 and 2020, allowing us to compare the performance of our methods to other approaches. As our English dataset, we chose the Offensive Language Identification Dataset (OLID) (Zampieri et al, 2019a), used in SemEval-2019 Task 6 (OffensEval) (Zampieri et al, 2019b).
We chose OLID due to the flexibility provided by its hierarchical annotation model, which considers multiple types of offensive content in a single taxonomy (e.g. targeted insults to a group are often hate speech, whereas targeted insults to an individual are often cyberbullying). This allows us to map OLID level A (offensive vs. non-offensive) to the labels in the other three datasets. OLID's annotation model is intended to serve as a general-purpose model for multiple abusive language detection subtasks (Waseem et al, 2017).
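A minimal sketch of this label alignment, mapping dataset-specific labels onto OLID level A so that English predictions can be projected directly. The exact label strings used for HASOC and HatEval below are assumptions, not taken from the paper.

```python
# Hedged sketch: aligning binary dataset labels with OLID level A
# (OFF vs. NOT). Label strings here are illustrative assumptions.

TO_OLID_LEVEL_A = {
    "hindi":   {"HOF": "OFF", "NOT": "NOT"},              # HASOC 2019
    "spanish": {"hateful": "OFF", "non-hateful": "NOT"},  # HatEval 2019
}


def project_labels(dataset, labels):
    """Map one dataset's labels onto OLID level A."""
    mapping = TO_OLID_LEVEL_A[dataset]
    return [mapping[label] for label in labels]


assert project_labels("hindi", ["HOF", "NOT", "HOF"]) == ["OFF", "NOT", "OFF"]
```

The Bengali three-way aggression labels are deliberately absent here: as the text notes, they do not collapse to a binary scheme, which is why that dataset requires the encoder-only transfer strategy.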
The Bengali dataset (Bhattacharya et al, 2020) was used in the TRAC-2 shared task (Kumar et al, 2020) on aggression identification. It differs from the other three datasets in terms of domain (Facebook instead of Twitter) and set of labels (three classes instead of binary), allowing us to compare the performance of cross-lingual embeddings on off-domain and off-task data.
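Table 2 reports macro F1 for Bengali and weighted F1 for Hindi and Spanish, matching each shared task's official metric. The difference between the two averages can be sketched in pure Python (toy labels; this is not the paper's evaluation code):

```python
# Macro F1 averages per-class F1 scores equally; weighted F1 weights
# each class by its support, so it favours the majority class.
from collections import Counter


def f1_per_class(y_true, y_pred, label):
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)


def macro_f1(y_true, y_pred):
    labels = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)


def weighted_f1(y_true, y_pred):
    counts, n = Counter(y_true), len(y_true)
    return sum(f1_per_class(y_true, y_pred, l) * c / n
               for l, c in counts.items())


# Toy imbalanced example: the minority class (OFF) is harder.
y_true = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "NOT"]

assert macro_f1(y_true, y_pred) < weighted_f1(y_true, y_pred)
```

On imbalanced data like most offensive-language corpora, weighted F1 is typically higher than macro F1, which is worth keeping in mind when comparing the per-language scores in Table 2.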
- Segun Taofeek Aroyehun and Alexander Gelbukh. 2018. Aggression detection in social media: Using deep neural networks, data augmentation, and pseudo labeling. In Proceedings of TRAC.
- Rienke Bannink, Suzanne Broeren, Petra M van de Looij-Jansen, Frouwkje G de Waart, and Hein Raat. 2014. Cyber and Traditional Bullying Victimization as a Risk Factor for Mental Health Problems and Suicidal Ideation in Adolescents. PloS one, 9(4).
- Md Abul Bashar and Richi Nayak. 2019. QutNocturnal@ HASOC’19: CNN for hate speech and offensive content identification in Hindi language. In Proceedings of FIRE.
- Valerio Basile, Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso, and Manuela Sanguinetti. 2019. SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. In Proceedings of SemEval.
- Shiladitya Bhattacharya, Siddharth Singh, Ritesh Kumar, Akanksha Bansal, Akash Bhagat, Yogesh Dawer, Bornini Lahiri, and Atul Kr. Ojha. 2020. Developing a multilingual annotated corpus of misogyny and aggression. In Proceedings of TRAC.
- Rina A Bonanno and Shelley Hymel. 2013. Cyber bullying and internalizing difficulties: Above and beyond the impact of traditional forms of bullying. Journal of youth and adolescence, 42(5):685–697.
- Çağrı Çöltekin. 2020. A Corpus of Turkish Offensive Language on Social Media. In Proceedings of LREC.
- Patricia Chiril, Farah Benamara Zitoune, Veronique Moriceau, Marlene Coulomb-Gully, and Abhishek Kumar. 2019. Multilingual and multitarget hate speech detection in tweets. In Proceedings of TALN.
- Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzman, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
- Maral Dadvar, Dolf Trieschnigg, Roeland Ordelman, and Franciska de Jong. 2013. Improving Cyberbullying Detection with User Context. In Proceedings of ECIR.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL.
- Darja Fiser, Tomaz Erjavec, and Nikola Ljubesic. 2017. Legal Framework, Dataset and Annotation Schema for Socially Unacceptable On-line Discourse Practices in Slovene. In Proceedings of ALW.
- Paula Fortuna, Joao Rocha da Silva, Leo Wanner, Sergio Nunes, et al. 2019. A Hierarchically-labeled Portuguese Hate Speech Dataset. In Proceedings of ALW.
- Erfan Ghadery and Marie-Francine Moens. 2020. LIIR at SemEval-2020 Task 12: A cross-lingual augmentation approach for multilingual offensive language identification. arXiv preprint arXiv:2005.03695.
- K Karthikeyan, Zihan Wang, Stephen Mayhew, and Dan Roth. 2020. Cross-lingual ability of multilingual bert: An empirical study. In Proceedings of ICLR.
- Ritesh Kumar, Atul Kr Ojha, Shervin Malmasi, and Marcos Zampieri. 2018. Benchmarking aggression identification in social media. In Proceedings of TRAC.
- Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, and Marcos Zampieri. 2020. Evaluating Aggression Identification in Social Media. In Proceedings of TRAC.
- Ping Liu, Wen Li, and Liang Zou. 2019. NULI at SemEval-2019 task 6: Transfer learning for offensive language detection using bidirectional transformers. In Proceedings of SemEval.
- Prasenjit Majumder, Thomas Mandl, et al. 2018. Filtering Aggression from the Multilingual Social Media Feed. In Proceedings of TRAC.
- Shervin Malmasi and Marcos Zampieri. 2017. Detecting Hate Speech in Social Media. In Proceedings of RANLP.
- Shervin Malmasi and Marcos Zampieri. 2018. Challenges in Discriminating Profanity from Hate Speech. Journal of Experimental & Theoretical Artificial Intelligence, 30:1 – 16.
- Thomas Mandl, Sandip Modha, Prasenjit Majumder, Daksh Patel, Mohana Dave, Chintak Mandlia, and Aditya Patel. 2019. Overview of the HASOC track at FIRE 2019: Hate speech and offensive content identification in Indo-European languages. In Proceedings of FIRE.
- Hamdy Mubarak, Kareem Darwish, and Walid Magdy. 2017. Abusive language detection on Arabic social media. In Proceedings of ALW.
- Hamdy Mubarak, Ammar Rashed, Kareem Darwish, Younes Samih, and Ahmed Abdelali. 2020. Arabic offensive language on twitter: Analysis and experiments. arXiv preprint arXiv:2004.02192.
- Endang Wahyu Pamungkas and Viviana Patti. 2019. Cross-domain and cross-lingual abusive language detection: A hybrid approach with deep learning and a multilingual lexicon. In Proceedings of ACL:SRW.
- Juan Manuel Perez and Franco M Luque. 2019. Atalaya at SemEval-2019 Task 5: Robust embeddings for tweet classification. In Proceedings of SemEval.
- Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. In Proceedings of NAACL.
- Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual BERT? In Proceedings of ACL.
- Zeses Pitenis, Marcos Zampieri, and Tharindu Ranasinghe. 2020. Offensive Language Identification in Greek. In Proceedings of LREC.
- Fabio Poletto, Marco Stranisci, Manuela Sanguinetti, Viviana Patti, and Cristina Bosco. 2017. Hate Speech Annotation: Analysis of an Italian Twitter Corpus. In Proceedings of CLiC-it.
- Tharindu Ranasinghe, Marcos Zampieri, and Hansi Hettiarachchi. 2019. BRUMS at HASOC 2019: Deep learning models for multilingual hate speech and offensive language identification. In Proceedings of FIRE.
- Michael Ridenhour, Arunkumar Bagavathi, Elaheh Raisi, and Siddharth Krishnan. 2020. Detecting Online Hate Speech: Approaches Using Weak Supervision and Network Embedding Models. arXiv preprint arXiv:2007.12724.
- Julian Risch and Ralf Krestel. 2020. Bagging bert models for robust aggression identification. In Proceedings of TRAC.
- Hugo Rosa, N Pereira, Ricardo Ribeiro, Paula Costa Ferreira, Joao Paulo Carvalho, S Oliveira, Luísa Coheur, Paula Paulino, AM Veiga Simao, and Isabel Trancoso. 2019. Automatic cyberbullying detection: A systematic review. Computers in Human Behavior, 93:333–345.
- Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Marcos Zampieri, and Preslav Nakov. 2020. A Large-Scale Weakly Supervised Dataset for Offensive Language Identification. arXiv preprint arXiv:2004.14454.
- Gudbjartur Ingi Sigurbergsson and Leon Derczynski. 2020. Offensive Language and Hate Speech Detection for Danish. In Proceedings of LREC.
- Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. 2019. How to fine-tune BERT for text classification? In Chinese Computational Linguistics, pages 194–206.
- Stephan Tulkens, Lisa Hilte, Elise Lodewyckx, Ben Verhoeven, and Walter Daelemans. 2016. A Dictionary-based Approach to Racism Detection in Dutch Social Media. In Proceedings of TA-COS.
- Luis Enrique Argota Vega, Jorge Carlos Reyes-Magana, Helena Gomez-Adorno, and Gemma Bel-Enguix. 2019. MineriaUNAM at SemEval-2019 Task 5: Detecting hate speech in Twitter using multiple features in a combinatorial framework. In Proceedings of SemEval.
- Zeerak Waseem, Thomas Davidson, Dana Warmsley, and Ingmar Weber. 2017. Understanding Abuse: A Typology of Abusive Language Detection Subtasks. In Proceedings of ALW.
- Jun-Ming Xu, Kwang-Sung Jun, Xiaojin Zhu, and Amy Bellmore. 2012. Learning from bullying traces in social media. In Proceedings of NAACL.
- Mengfan Yao, Charalampos Chelmis, and DaphneyStavroula Zois. 2019. Cyberbullying Ends Here: Towards Robust Detection of Cyberbullying in Social Media. In Proceedings of WWW.
- Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019a. Predicting the type and target of offensive posts in social media. In Proceedings of NAACL.
- Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019b. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). In Proceedings of SemEval.
- Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, and Çağrı Çöltekin. 2020. SemEval-2020 Task 12: Multilingual offensive language identification in social media (OffensEval 2020). In Proceedings of SemEval.