BERT Model-Based Approach for Detecting Racism and Xenophobia on Twitter Data.

José Alberto Benítez-Andrades, Álvaro González-Jiménez, Álvaro López-Brea, Carmen Benavides, Jose Aveleira-Mata, José-Manuel Alija-Pérez, María Teresa García-Ordás

International Conference on Metadata and Semantics Research (MTSR) (2021)

Abstract

The volume of data generated on social networks makes manual moderation of user-written text infeasible. Racism and xenophobia are among the most prominent problems on these platforms. Although several studies propose predictive models that use natural language processing to detect racist or xenophobic texts, few address the Spanish language. In this paper we present a solution based on deep learning models and, more specifically, on transfer learning, to detect racist and xenophobic messages in Spanish. For this purpose, a dataset was collected from the social network Twitter using data mining techniques and, after preprocessing, each message was labelled as racist or non-racist. Two BERT-based models were trained: BETO and mBERT. The results are promising: the best-performing model reached 85.14% accuracy.

Keywords

Natural language processing, BERT, Deep learning, Hate speech, Racism, Social networks
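
The abstract mentions a preprocessing step and reports accuracy on a binary racist/non-racist labelling, but does not spell out either. The following is a minimal sketch of what those two steps could look like, not the authors' implementation: the cleanup rules (dropping URLs and user mentions, lowercasing) and the function names `preprocess` and `accuracy` are assumptions made here for illustration.

```python
import re

# Hypothetical cleanup rules: the abstract does not list the actual
# preprocessing steps, so these patterns are assumptions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def preprocess(tweet: str) -> str:
    """Drop URLs and user mentions, collapse whitespace, lowercase."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.split()).lower()


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With labels encoded as 1 (racist) and 0 (non-racist), `accuracy` returns the share of messages classified correctly, which is the kind of figure the abstract reports for the best-performing model.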
