A domain adaptation approach for offensive language detection with bidirectional transformers

Sumer Singh, Sheng Li, K. Rasheed, Frederick W. Maier, Ronald W. Walcott

semanticscholar (2020)

Abstract

Offensive language detection (OLD) has received increasing attention due to its societal impact. Recent work shows that bidirectional transformer (BERT) based methods obtain impressive performance on OLD. However, such methods usually rely on large OLD datasets for training. To address the issue of data scarcity in OLD, we propose an effective domain adaptation approach to train bidirectional transformers. Our approach introduces domain adaptation to A Lite BERT (ALBERT), such that it can effectively exploit auxiliary data from source domains to improve the OLD performance in a target domain. Two approaches to domain adaptation are taken. First, we use the auxiliary dataset in an unmodified manner. Next, we modify the auxiliary dataset labels to match the target labels. Experimental results show that the first approach, ALBERT (SA), obtains state-of-the-art performance in most cases. Particularly, our approach significantly benefits underrepresented and underperforming classes, with an improvement of about 40% over ALBERT.

INDEX WORDS: Natural Language Processing, NLP, Transfer Learning, Deep Learning, Preprocessing, Offensive Language Detection, Cyberbullying, Hate Speech, Domain Adaptation
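
The two ways of handling the auxiliary dataset described in the abstract (keeping its labels as they are, or rewriting them to match the target labels) can be illustrated with a minimal sketch. This is not the authors' code: the function name `prepare_auxiliary`, the label strings, and the choice to drop auxiliary examples that have no mapping are all assumptions made for illustration, and how the prepared data is then fed to ALBERT is not shown because the abstract does not specify it.

```python
from typing import Dict, List, Optional, Tuple

Example = Tuple[str, str]  # (text, label)


def prepare_auxiliary(
    auxiliary: List[Example],
    label_map: Optional[Dict[str, str]] = None,
) -> List[Example]:
    """Prepare an auxiliary (source-domain) dataset for domain adaptation.

    Hypothetical helper, not taken from the paper.
    - label_map is None: return the auxiliary data with its labels unmodified.
    - label_map given: rewrite each auxiliary label to a target-domain label.
      Examples whose label has no mapping are dropped (an assumption).
    """
    if label_map is None:
        return list(auxiliary)
    return [
        (text, label_map[label])
        for text, label in auxiliary
        if label in label_map
    ]
```

For instance, with a made-up mapping such as `{"hate": "OFF", "neither": "NOT"}`, an auxiliary example labeled `"hate"` would be relabeled `"OFF"`, while calling the helper without a mapping leaves the auxiliary labels untouched.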
