Classification of Private Tweets Using Tweet Content

2017 IEEE 11th International Conference on Semantic Computing (ICSC) (2017)

Abstract
Online social networks (OSNs) like Twitter provide an open platform for users to easily convey their thoughts and ideas, from personal experiences to breaking news. With the increasing popularity of Twitter and the explosion of tweets, we have observed large amounts of potentially sensitive or private messages being published to OSNs, inadvertently or voluntarily. The owners of these messages may become vulnerable to online stalkers or adversaries, and they often regret posting such messages. Therefore, identifying tweets that reveal private or sensitive information is critical for both users and service providers. However, the definition of sensitive information is subjective and differs from person to person. To develop a privacy protection mechanism that can be customized to fit the needs of diverse audiences, it is essential to accurately and automatically classify potentially sensitive tweets. In this paper, we make the first attempt to classify private tweets into 14 categories, such as alcohol & drugs, family information, etc. We model tweet semantics with term distribution features as well as users' topic preferences based on personal tweet history. Experiments show that our method improves classification accuracy over the well-known Bag-of-Words and tf-idf methods.
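For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of Bag-of-Words counts and tf-idf weights. It is not the authors' method or feature set; the function names, the whitespace tokenizer, the unsmoothed tf-idf variant, and the toy tweets are all assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real tweets would need
    # handling for hashtags, mentions, URLs, and punctuation.
    return text.lower().split()


def bag_of_words(docs):
    # One term-count mapping per document.
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    # Unsmoothed variant: tf = count / doc length, idf = log(N / df).
    counts = bag_of_words(docs)
    n_docs = len(docs)
    df = Counter()
    for c in counts:
        df.update(c.keys())  # each document counts a term once
    weighted = []
    for c in counts:
        total = sum(c.values())
        weighted.append(
            {t: (n / total) * math.log(n_docs / df[t]) for t, n in c.items()}
        )
    return weighted


# Hypothetical toy tweets, invented for this sketch (not from the paper's data).
toy_tweets = [
    "had too many beers last night",
    "visiting my sister and her kids",
    "beers with my sister tonight",
]

bow = bag_of_words(toy_tweets)
weights = tf_idf(toy_tweets)
```

In this sketch, a term that appears in fewer documents (such as "night") receives a higher tf-idf weight than one that appears in several (such as "beers"), which is the property that separates tf-idf from raw Bag-of-Words counts. Either representation could then be passed to any standard classifier.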
Keywords
Social Networks, Privacy, Classification