
Leveraging Affective Bidirectional Transformers for Offensive Language Detection

Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pp. 102-108 (2020)


Abstract

Social media are pervasive in our lives, making it necessary to ensure safe online experiences by detecting and removing offensive and hate speech. In this work, we report our submission to the Offensive Language and hate-speech Detection shared task organized with the 4th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT4).

Introduction
  • Social media are widely used on a global scale. Communication between users from different backgrounds, ideologies, preferences, and political orientations on these platforms can result in tensions and the use of offensive and hateful speech.
  • This negative content can be very harmful, sometimes with real-world consequences; wars, for example, are usually preceded by verbal hostility (Chadefaux, 2014).
  • While a number of works address offensive language in English (e.g., Agrawal and Awekar, 2018; Badjatiya et al., 2017; Nobata et al., 2016), works on many other languages are either lacking or rare.
  • This is the case for Arabic, where there have been only very few works (e.g., Alakrot et al., 2018; Albadi et al., 2018; Mubarak et al., 2017; Mubarak and Darwish, 2019).
  • The authors participated in the Offensive Language and hate-speech Detection shared task organized with the 4th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT4).
Highlights
  • Social media are widely used on a global scale
  • We participated in the Offensive Language and hate-speech Detection shared task organized with the 4th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT4)
  • We develop highly accurate deep learning models for the two tasks of offensive content and hate speech detection
  • We described our submission to the offensive language detection in Arabic shared task
  • Our best models are significantly better than a competitive baseline based on vanilla BERT
  • We show that finetuning such affective models is useful, especially in the case of offensive language detection
Methods
  • A manual way of detecting negative language can involve building a list of offensive words and filtering text based on these words (a minimal sketch of such a filter follows this list).
  • Machine learning systems are much more desirable, since they are more nuanced to the domain and usually render more accurate, context-sensitive predictions.
  • This is especially the case if there are enough data to train these systems.
  • Abozinadah (2017) applies SVMs to 31 features extracted from user profiles, in addition to social-graph centrality measures.
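
A minimal sketch of the manual, lexicon-based filter contrasted above; the seed words and tweets are invented placeholders, not the authors' lexica or data:

```python
# Minimal sketch of a manual lexicon-based filter (illustrative only;
# the seeds and tweets below are invented placeholders).
OFFENSIVE_SEEDS = {"insult_word_1", "insult_word_2"}  # placeholder seed list

def is_offensive(text: str) -> bool:
    """Flag a text if it contains any seed word (no context sensitivity)."""
    tokens = set(text.lower().split())
    return bool(tokens & OFFENSIVE_SEEDS)

tweets = ["an ordinary tweet", "a tweet with insult_word_1 in it"]
flagged = [t for t in tweets if is_offensive(t)]
print(flagged)  # -> ['a tweet with insult_word_1 in it']
```

As the bullets note, this kind of keyword matching is blind to context, which is why trained, context-sensitive models are preferred when enough data is available.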
Results
  • The authors' best models are significantly better than a vanilla BERT model, with 89.60% accuracy (82.31% macro F1) for hate speech and 95.20% accuracy (70.51% macro F1) on official TEST data.
  • As Table 4 shows, the vanilla BERT baseline obtains 87.10% accuracy and 78.38 F1 on the DEV set for offensive language classification; its TEST predictions, submitted to the shared task, obtain 87.30% accuracy and 77.70 F1.
  • The offensive model obtains 87.45% accuracy and 80.51 F1 on TEST.
  • The hate speech model achieves 93.15% accuracy and 61.57 F1 on TEST.
  • The authors' best offensive prediction on TEST is BERT-EMO-AUG, which achieves an accuracy of 89.35% and an F1 of 82.85.
Conclusion
  • The authors described their submission to the offensive language detection in Arabic shared task.
  • The authors deploy affective language models on the two sub-tasks of offensive language detection and hate speech identification.
  • The authors show that finetuning such affective models is useful, especially in the case of offensive language detection.
  • The authors will investigate other methods for improving the automatic acquisition of offensive and hateful language data.
  • The authors plan to investigate the utility of semi-supervised methods as a vehicle for improving the models.
Objectives
  • Since the goal was to develop exclusively deep learning models, the authors needed to extend the training data so as to increase the number of positive samples.
Tables
  • Table 1: Offensive (OFF) and Hate Speech (HS) label distribution in the datasets.
  • Table 2: Examples of offensive and hateful seeds in our lexica. Tweets that carry offensive seeds are labeled 'offensive' and those carrying hateful seeds are tagged 'hateful', giving us 265,413 offensive tweets and 10,489 hateful tweets. For reference, the majority (67%) of the collection extracted with our seed lexicon is assigned negative sentiment labels by AraNet, which reflects the effectiveness of the lexicon and matches our observations about the distribution of sentiment labels in the shared task TRAIN split.
  • Table 3: Examples of non-offensive/non-hateful seeds filtered out from our lexica.
  • Table 4: Offensive (OFF) and Hate Speech (HS) results on the DEV and TEST datasets. We feed the final hidden state of '[CLS]' to a softmax linear layer to get prediction probabilities across classes, set the learning rate to 2e-6, and train for 20 epochs. We save a checkpoint at the end of each epoch, report the F1-score and accuracy of the best model, and use the best checkpoint to predict the labels of the TEST set. We fine-tune the BERT model under five settings (a sketch of this recipe follows this list).
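
The training recipe in the Table 4 caption ('[CLS]' hidden state fed to a linear softmax layer, learning rate 2e-6, 20 epochs, one checkpoint per epoch) can be sketched as follows. The model checkpoint name, toy inputs, and AdamW optimizer are assumptions, not details confirmed by the paper:

```python
# Hedged sketch of the fine-tuning recipe in the Table 4 caption.
# Softmax is applied inside CrossEntropyLoss rather than as an explicit layer.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # assumed checkpoint

class ClsClassifier(nn.Module):
    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(MODEL_NAME)
        # Linear layer over the '[CLS]' hidden state.
        self.head = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        return self.head(hidden[:, 0])  # final hidden state of '[CLS]'

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
texts = ["placeholder tweet one", "placeholder tweet two"]  # toy stand-ins
labels = torch.tensor([0, 1])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

model = ClsClassifier()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-6)  # lr from the caption
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):  # 20 epochs, as in the caption
    model.train()
    optimizer.zero_grad()
    logits = model(batch["input_ids"], batch["attention_mask"])
    loss_fn(logits, labels).backward()
    optimizer.step()
    # Save a checkpoint each epoch; the best one is used to predict TEST labels.
    torch.save(model.state_dict(), f"checkpoint_epoch_{epoch}.pt")
```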
Related work
  • Accuracy of the aforementioned systems ranges between 76% and 90%.
Study subjects and analysis
tweets: 1100
Arabic Offensive Content: Very few works have targeted the Arabic language for offensive language detection. For example, Mubarak et al. (2017) develop a list of obscene words and hashtags using patterns common in offensive and rude communications to label a dataset of 1,100 tweets. Mubarak and Darwish (2019) apply a character n-gram FastText model to a large dataset (3.3M tweets) of offensive content. Our work is similar to Mubarak and Darwish (2019) in that we also automatically augment training data based on an initial seed lexicon.
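
For a sense of what a character n-gram fastText classifier of this kind involves, here is a minimal sketch; the training-file name, label scheme, and 2-5 character n-gram range are illustrative assumptions, not Mubarak and Darwish's reported settings:

```python
# Hedged sketch of a supervised character n-gram fastText classifier.
# Assumes a file "train.txt" with one example per line in fastText format:
#   __label__offensive some tweet text
import fasttext

model = fasttext.train_supervised(
    input="train.txt",
    minn=2, maxn=5,  # character n-grams of length 2 to 5 (assumed range)
    epoch=10,
)
labels, probs = model.predict("some tweet text")
print(labels, probs)
```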

tweets: 10000
In our experiments, we use two types of data: (1) data distributed by the Offensive Language Detection shared task and (2) an automatically collected dataset that we develop (Section 3.1). The shared task dataset comprises 10,000 tweets manually annotated for two sub-tasks: offensiveness (Subtask A) and hate speech (Subtask B). According to the shared task organizers, offensive tweets in the data contain explicit or implicit insults or attacks against other people, or inappropriate language.

tweets: 215365
We apply AraNet to these tweets and keep only tweets assigned a positive sentiment label (70%). We use 215,365 tweets as 'non-offensive' but only 199,291 as 'non-hateful'. Table 1 shows the size and distribution of class labels in our extended dataset. Figure 2 and Figure 1 are word clouds of unigrams in our extended training data (offensive and hateful speech, respectively) after we remove our seed lexica from the data.
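
The augmentation step just described (seed-based labeling plus a sentiment filter) reduces to a simple pipeline. A minimal sketch, in which `aranet_sentiment` is a hypothetical stand-in for the AraNet toolkit and the seed sets and tweets are placeholders:

```python
# Hedged sketch of the data-augmentation pipeline described above: tweets
# carrying offensive/hateful seeds get positive-class labels, and only tweets
# AraNet tags as positive sentiment are kept as negative-class examples.
from typing import Optional

OFFENSIVE_SEEDS = {"offensive_seed"}  # placeholder, not the authors' lexicon
HATEFUL_SEEDS = {"hateful_seed"}      # placeholder, not the authors' lexicon

def aranet_sentiment(text: str) -> str:
    """Hypothetical stand-in for AraNet's sentiment prediction."""
    return "positive"  # dummy output so the sketch runs

def weak_label(text: str) -> Optional[str]:
    tokens = set(text.split())
    if tokens & HATEFUL_SEEDS:
        return "hateful"
    if tokens & OFFENSIVE_SEEDS:
        return "offensive"
    if aranet_sentiment(text) == "positive":
        return "non-offensive"  # trusted negative-class example
    return None                 # discard tweets we cannot label confidently

tweets = ["offensive_seed example tweet", "a pleasant tweet"]
print([(t, weak_label(t)) for t in tweets])
```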

tweets: 10489
The clouds show that the data carries lexical cues likely to occur in each of the two classes (offensive and hateful). Examples of frequent words in the offensive class include 'dog'. (We decided to keep only 199,291 'non-hateful' tweets since our augmented 'hateful' class comprises only 10,489 tweets.)


References
  • Abdul-Mageed, M., Zhang, C., Elmadany, A., Rajendran, A., and Ungar, L. (2019). DiaNet: BERT and hierarchical attention multi-task learning of fine-grained dialect. arXiv preprint arXiv:1910.14243.
  • Abdul-Mageed, M., Zhang, C., Nagoudi, E. M. B., and Hashemi, A. (2020). AraNet: A deep learning toolkit for Arabic social media. In The 4th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT4), LREC.
  • Abozinadah, E. (2017). Detecting abusive Arabic language Twitter accounts using a multidimensional analysis model. Ph.D. thesis.
  • Agrawal, S. and Awekar, A. (2018). Deep learning for detecting cyberbullying across multiple social media platforms. In European Conference on Information Retrieval, pages 141–153. Springer.
  • Alakrot, A., Murray, L., and Nikolov, N. S. (2018). Towards accurate detection of offensive language in online communication in Arabic. Procedia Computer Science, 142:315–320.
  • Albadi, N., Kurdi, M., and Mishra, S. (2018). Are they our brothers? Analysis and detection of religious hate speech in the Arabic Twittersphere. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pages 69–76. IEEE.
  • Badjatiya, P., Gupta, S., Gupta, M., and Varma, V. (2017). Deep learning for hate speech detection in tweets. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 759–760.
  • Barbera, P. and Sood, G. (2015). Follow your ideology: Measuring media ideology on social networks. In Annual Meeting of the European Political Science Association, Vienna, Austria. Retrieved from http://www.gsood.com/research/papers/mediabias.pdf.
  • Chadefaux, T. (2014). Early warning signals for war in the news. Journal of Peace Research, 51(1):5–18.
  • Conover, M. D., Ratkiewicz, J., Francisco, M., Goncalves, B., Menczer, F., and Flammini, A. (2011). Political polarization on Twitter. In Fifth International AAAI Conference on Weblogs and Social Media.
  • Dadvar, M., Trieschnigg, D., Ordelman, R., and de Jong, F. (2013). Improving cyberbullying detection with user context. In European Conference on Information Retrieval, pages 693–696. Springer.
  • Darwish, K., Alexandrov, D., Nakov, P., and Mejova, Y. (2017). Seminar users in the Arabic Twitter sphere. In International Conference on Social Informatics, pages 91–108. Springer.
  • Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Fortuna, P., Ferreira, J., Pires, L., Routar, G., and Nunes, S. (2018). Merging datasets for aggressive text identification. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 128–139.
  • Georgakopoulos, S. V., Tasoulis, S. K., Vrahatis, A. G., and Plagianakos, V. P. (2018). Convolutional neural networks for toxic comment classification. In Proceedings of the 10th Hellenic Conference on Artificial Intelligence, pages 1–6.
  • Jay, T. and Janschewitz, K. (2008). The pragmatics of swearing. Journal of Politeness Research. Language, Behaviour, Culture, 4(2):267–288.
  • Kumar, R., Ojha, A. K., Malmasi, S., and Zampieri, M. (2018). Benchmarking aggression identification in social media. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 1–11.
  • Kwok, I. and Wang, Y. (2013). Locate the hate: Detecting tweets against blacks. In Twenty-Seventh AAAI Conference on Artificial Intelligence.
  • Malmasi, S. and Zampieri, M. (2017). Detecting hate speech in social media. arXiv preprint arXiv:1712.06427.
  • Modha, S., Majumder, P., and Mandl, T. (2018). Filtering aggression from the multilingual social media feed. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 199–207.
  • Mubarak, H. and Darwish, K. (2019). Arabic offensive language classification on Twitter. In International Conference on Social Informatics, pages 269–276. Springer.
  • Mubarak, H., Darwish, K., and Magdy, W. (2017). Abusive language detection on Arabic social media. In Proceedings of the First Workshop on Abusive Language Online, pages 52–56.
  • Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., and Chang, Y. (2016). Abusive language detection in online user content. In Proceedings of the 25th International Conference on World Wide Web, pages 145–153.
  • Weber, I., Garimella, V. R. K., and Batayneh, A. (2013). Secular vs. Islamist polarization in Egypt on Twitter. In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pages 290–297.
  • Wiegand, M., Siegel, M., and Ruppenhofer, J. (2018). Overview of the GermEval 2018 shared task on the identification of offensive language.
  • Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., and Kumar, R. (2019). SemEval-2019 Task 6: Identifying and categorizing offensive language in social media (OffensEval). arXiv preprint arXiv:1903.08983.
Author
Elmadany AbdelRahim
Zhang Chiyu
Abdul-Mageed Muhammad
Hashemi Azadeh