Spanish hate-speech detection in football

PROCESAMIENTO DEL LENGUAJE NATURAL(2023)

引用 0|浏览2
暂无评分
摘要
In the last few years, Natural Language Processing (NLP) tools have been successfully applied to a number of different tasks, including author profiling, negation detection or hate speech detection, to name but a few. For the identification of hate speech from text, pre-trained language models can be leveraged to build high-performing classifiers using a transfer learning approach. In this work, we train and evaluate state-of-the-art pre-trained classifiers based on Transformers. The explored models are fine-tuned using a hate speech corpus in Spanish that has been compiled as part of this research. The corpus contains a total of 7,483 football-related tweets that have been manually annotated under four categories: aggressive, racist, misogynist, and safe. A multi-label approach is used, allowing the same tweet to be labeled with more than one class. The best results, with a macro F1-score of 88.713%, have been obtained by a combination of the models using Knowledge Integration.
更多
查看译文
关键词
Hate speech detection,Large Language Models,Linguistic features,Interpretability
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要