Spanish hate-speech detection in football

Esteban Montesinos-Canovas, Francisco Garcia-Sanchez, Jose Antonio Garcia-Diaz, Gema Alcaraz-Marmol, Rafael Valencia-Garcia

PROCESAMIENTO DEL LENGUAJE NATURAL (2023)

Abstract

In the last few years, Natural Language Processing (NLP) tools have been successfully applied to a number of different tasks, including author profiling, negation detection or hate speech detection, to name but a few. For the identification of hate speech from text, pre-trained language models can be leveraged to build high-performing classifiers using a transfer learning approach. In this work, we train and evaluate state-of-the-art pre-trained classifiers based on Transformers. The explored models are fine-tuned using a hate speech corpus in Spanish that has been compiled as part of this research. The corpus contains a total of 7,483 football-related tweets that have been manually annotated under four categories: aggressive, racist, misogynist, and safe. A multi-label approach is used, allowing the same tweet to be labeled with more than one class. The best results, with a macro F1-score of 88.713%, have been obtained by a combination of the models using Knowledge Integration.
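
To make the reported metric concrete, the following is a minimal sketch of how a macro F1-score can be computed in a multi-label setting, where each tweet may carry more than one of the four categories. The label names come from the abstract; the toy annotations, the helper names `f1_for_label` and `macro_f1`, and the convention of scoring a label with no positives as 0 are assumptions made for illustration, not the authors' evaluation code.

```python
# Illustrative sketch only: toy data and helper names are hypothetical.
LABELS = ["aggressive", "racist", "misogynist", "safe"]


def f1_for_label(y_true, y_pred, label):
    """F1 for one label, treating it as a binary present/absent decision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    denom = 2 * tp + fp + fn
    # Assumption: a label with no true or predicted positives scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Each tweet is represented by the set of labels assigned to it.
y_true = [{"aggressive", "racist"}, {"safe"}, {"misogynist"}, {"aggressive"}]
y_pred = [{"aggressive"}, {"safe"}, {"misogynist", "aggressive"}, {"aggressive"}]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Because the macro average weights every label equally, a rare category such as the missed `racist` label in the toy data lowers the score as much as a frequent one would.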

Keywords

Hate speech detection, Large Language Models, Linguistic features, Interpretability
