P-Gent: Privacy-Preserving Geocoding Of Non-Geotagged Tweets
2018 17TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (IEEE TRUSTCOM) / 12TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (IEEE BIGDATASE)(2018)
Abstract
With the widespread proliferation of location-aware devices and social media applications, more and more people share information on location-based social networks such as Twitter. Such data can help to better plan and manage individuals' activities and can support other social applications, e.g., location-based advertisement or recommendation. However, only a very small proportion of tweets are geotagged, due to privacy concerns or the lack of underlying positioning infrastructure. Hence it is meaningful to estimate the geographic information of non-geotagged tweets, i.e., geocoding, which can improve the applicability and utility of social media data. Unlike existing geocoding approaches, this paper addresses the privacy risk while providing a fine-grained estimation. We propose Privacy-preserving GEocoding of Non-geotagged Tweets (P-GENT), which geocodes non-geotagged tweets with fine-grained estimation whilst protecting privacy. Our approach estimates the geographic location of a non-geotagged tweet based on the similarities between the content of the tweet and the keyword lists of local events detected from the archived geotagged tweets of the same time period. The approach implements a spatio-temporal clustering algorithm to discover local events at a fine granularity, and an important-keyword extraction mechanism to describe each detected local event. In addition, a density-seed discovery approach is used to reduce the sparseness of geotagged tweets and the time complexity of the clustering approach. The experimental evaluation with real-world data demonstrates that our approach achieves at most 92% precision for one time slot, and that 33-43% precision remains across all time slots after the privacy-preserving mechanisms are applied.
Keywords
Differential privacy, location estimation, spatio-temporal clustering, event detection
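
The matching step described in the abstract (comparing a tweet's content with the keyword lists of detected local events) can be illustrated with a minimal sketch. The abstract does not state which similarity measure P-GENT uses, so Jaccard similarity over whitespace tokens is an assumption here, and `LocalEvent`, `jaccard`, `geocode`, and `min_similarity` are hypothetical names introduced only for illustration. Event detection, keyword extraction, and the privacy-preserving mechanisms are not modeled.

```python
from dataclasses import dataclass


@dataclass
class LocalEvent:
    """Hypothetical record for one detected local event in a time slot."""
    lat: float
    lon: float
    keywords: frozenset


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def geocode(tweet_text, events, min_similarity=0.1):
    """Return (lat, lon) of the most similar event, or None if none is similar enough.

    Assumption: similarity is Jaccard overlap between the tweet's lowercase
    tokens and the event's keyword list; the paper may use another measure.
    """
    tokens = frozenset(tweet_text.lower().split())
    best = max(events, key=lambda e: jaccard(tokens, e.keywords), default=None)
    if best is None or jaccard(tokens, best.keywords) < min_similarity:
        return None
    return (best.lat, best.lon)


# Illustrative events for a single time slot (made-up coordinates and keywords).
events = [
    LocalEvent(40.7505, -73.9934, frozenset({"concert", "garden", "band"})),
    LocalEvent(40.8296, -73.9262, frozenset({"baseball", "stadium", "inning"})),
]
print(geocode("great band at the concert tonight", events))
```

Returning `None` below a threshold reflects one possible design choice: a tweet that matches no event is left without an estimate rather than assigned to an arbitrary location.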