Identifying Abusive Comments in Hebrew Facebook

2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE), 2018

Abstract
In this study, we aim to classify comments as abusive or non-abusive. We develop a Hebrew corpus of user comments annotated for abusive language. We then investigate highly sparse n-gram representations as well as denser character n-gram representations for comment abuse classification. Because comments in social media are usually short, we also investigate four dimension reduction methods, which produce word vectors that collapse similar words into groups. We show that the character n-gram representations outperform all the other representations for the task of identifying abusive comments.
Keywords
abusive comments, dimension reduction, n-grams, n-grams characters, semantic analysis, word embedding
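
To make the contrast between the two kinds of representation in the abstract concrete, the following is a minimal sketch of how character n-grams and token-level n-grams can be extracted from a short comment. It is an illustration only, not the authors' pipeline: the function names, the n-gram length ranges, the whitespace tokenization, and the sample comment are all assumptions, since the abstract does not specify them.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=4):
    """Return every character n-gram of length n_min..n_max, shortest first.

    The default range 2..4 is an assumption made for illustration.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def word_ngrams(text, n=1):
    """Return token n-grams, using plain whitespace tokenization (an assumption)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical two-word Hebrew comment, used only as sample input.
comment = "שלום עולם"

# Bag-of-n-grams counts: the features a classifier would be trained on.
char_counts = Counter(char_ngrams(comment, 2, 3))
word_counts = Counter(word_ngrams(comment, 1))

print(len(word_counts), "distinct token unigrams")
print(len(char_counts), "distinct character 2- and 3-grams")
```

Even for a two-word comment, the character-level view yields many more features that can overlap with other comments than the token-level view does. This is one plausible reason such representations suit short, noisy social media text, though the abstract itself reports only the empirical result.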