De-anonymizing Social Networks

IEEE Symposium on Security and Privacy (2009): 173-187


Abstract

Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc. We present a framework for analyzing privacy and ano...

Introduction
  • Social networks have been studied for a century [74] and are a staple of research in disciplines such as epidemiology [8], sociology [82], [32], [12], economics [33], and many others [22], [9], [36].
  • Network owners often share this information with advertising partners and other third parties.
  • Such sharing is the foundation of the business case for many online social-network operators.
  • The networks are anonymized, i.e., names and demographic information associated with individual nodes are suppressed.
  • Such suppression is often misinterpreted as removal of “personally identifiable information” (PII), even though PII may include much more than names and identifiers.
  • The EU privacy directive defines “personal data” as “any information relating to an identified or identifiable natural person [. . . ]; an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity” [26].
Highlights
  • Social networks have been studied for a century [74] and are a staple of research in disciplines such as epidemiology [8], sociology [82], [32], [12], economics [33], and many others [22], [9], [36].
  • The networks are anonymized, i.e., names and demographic information associated with individual nodes are suppressed. Such suppression is often misinterpreted as removal of “personally identifiable information” (PII), even though PII may include much more than names and identifiers.
  • Our experiments underestimate the extent of the privacy risks of anonymized social networks.
Methods
  • The first graph is the “follow” relationships on the Twitter microblogging service, which the authors crawled in late 2007.
  • The second graph is the “contact” relationships on Flickr, a photo-sharing service, which the authors crawled in late 2007/early 2008.
  • Both services have APIs that expose a mandatory username field, and optional fields name and location.
  • The latter is represented as free-form text.
  • Further details about the crawling process can be found in Appendix F.
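As a rough illustration of the data described above, the crawled records could be held in structures like the following. This is only a sketch: the `Profile` class, the `build_graph` helper, and the sample usernames are hypothetical and do not come from the paper; only the three field names (`username`, `name`, `location`) and the fact that `location` is free-form text are taken from the description.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """One crawled account (hypothetical container, not the authors' schema)."""
    username: str                   # mandatory field exposed by the API
    name: Optional[str] = None      # optional
    location: Optional[str] = None  # optional, free-form text


def build_graph(edges):
    """Turn (source, destination) pairs into an adjacency dict.

    Every node that appears in an edge gets an entry, even if it has
    no outgoing edges of its own.
    """
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)
        out.setdefault(dst, set())
    return dict(out)


# Toy example with made-up usernames.
profiles = [Profile("alice", name="Alice A.", location="Austin, TX"),
            Profile("bob")]
graph = build_graph([("alice", "bob"), ("bob", "carol")])
```

Treating the relationships as directed pairs is an assumption of this sketch; an undirected graph would simply add each edge in both directions.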
Results
  • The authors investigated the impact that the number of seeds has on the ability of the propagation algorithm to achieve large-scale re-identification, and its robustness to perturbation.

    Figure 2 shows that the selection of seeds determines whether the propagation step dies out or not, but whenever large-scale propagation has been achieved, the re-identification rate stays remarkably constant.

    The authors find that when the algorithm dies out, it re-identifies no more than a few dozen nodes correctly.

    The authors performed a further experiment to study the phase transition better.
  • Figure 3 shows the resulting probabilities of large-scale propagation.
  • The authors caution against reading too much into the numbers.
  • What this experiment shows is that a phase transition does happen and that it is strongly dependent on the number of seeds.
  • The adversary can collect seed mappings incrementally until he has enough mappings to carry out large-scale re-identification.
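The dependence on seeds can be illustrated with a deliberately simplified sketch of seed-based propagation. It is not the authors' algorithm: the function name `propagate`, the undirected adjacency-dict representation, the `min_margin` acceptance rule, and the toy graphs are all assumptions made for illustration. The idea shown is only that an unmapped node is matched when its already-mapped neighbors point clearly at one candidate, and that with too few seeds no candidate stands out and propagation stops.

```python
from collections import defaultdict


def propagate(aux, target, seeds, min_margin=1):
    """Extend a seed mapping from auxiliary-graph nodes to target-graph nodes.

    aux, target: dict mapping node -> set of neighbor nodes (undirected).
    seeds: dict mapping some aux nodes to their known target nodes.
    A candidate is accepted only if its score beats the runner-up by at
    least min_margin (a simple stand-in for a confidence test).
    """
    mapping = dict(seeds)
    used = set(mapping.values())
    changed = True
    while changed:
        changed = False
        for u in sorted(aux):
            if u in mapping:
                continue
            # Score target candidates by how many mapped neighbors of u
            # have an image adjacent to the candidate.
            scores = defaultdict(int)
            for nbr in aux[u]:
                if nbr in mapping:
                    for v in target[mapping[nbr]]:
                        if v not in used:
                            scores[v] += 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))
            best_v, best = ranked[0]
            second = ranked[1][1] if len(ranked) > 1 else 0
            if best - second >= min_margin:
                mapping[u] = best_v
                used.add(best_v)
                changed = True
    return mapping


# Toy graphs: target is the same structure as aux with nodes renamed a..e -> 1..5.
aux = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d", "e"},
       "d": {"b", "c"}, "e": {"c"}}
target = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4, 5}, 4: {2, 3}, 5: {3}}

two_seeds = propagate(aux, target, {"a": 1, "b": 2})  # maps all five nodes
one_seed = propagate(aux, target, {"a": 1})           # stays at {"a": 1}
```

On this toy input, two seeds are enough to recover the whole mapping, while a single seed leaves the candidates tied and nothing further is matched. Real graphs are noisy and far larger, so the numbers here say nothing about the thresholds reported in the paper.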
Conclusion
  • The main lesson of this paper is that anonymity is not sufficient for privacy when dealing with social networks.
  • The authors developed a generic re-identification algorithm and showed that it can successfully de-anonymize several thousand users in the anonymous graph of a popular microblogging service (Twitter), using a completely different social network (Flickr) as the source of auxiliary information.
  • The authors' experiments underestimate the extent of the privacy risks of anonymized social networks.
  • Since human names are not unique, 5.
  • The authors consider the geographical location to be the same if it is either the same non-U.S. country, or the same U.S. state.
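The location rule in the last item can be written down as a small predicate. This is a sketch under assumptions: the `Location` tuple, the `"US"` country code, and the idea that free-form location text has already been normalized into a country and optional state are hypothetical, and the parsing step itself is not shown.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    country: str                 # hypothetical normalized code, e.g. "US", "FR"
    state: Optional[str] = None  # only meaningful for U.S. locations


def same_location(a: Location, b: Location) -> bool:
    """Same non-U.S. country, or same U.S. state."""
    if a.country != b.country:
        return False
    if a.country == "US":
        # Within the U.S., both states must be known and equal.
        return a.state is not None and a.state == b.state
    return True
```

Treating two U.S. locations with unknown states as different is a choice made for this sketch; the source does not say how missing states are handled.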
Related Work
  • Privacy properties. A social network consists of nodes, edges, and information associated with each node and edge. The existence of an edge between two nodes can be sensitive: for instance, in a sexual-relationship network with gender information attached to nodes [11] it can reveal sexual orientation. Edge privacy was considered in [44], [7]. In most online social networks, however, edges are public by default, and few users change the default settings [34].

    While the mere presence of an edge may not be sensitive, edge attributes may reveal more information (e.g., a single phone call vs. a pattern of calls indicative of a business or romantic relationship). For example, phone-call patterns of the disgraced NBA referee Tim Donaghy have been used in the investigation [91]. In online networks such as LiveJournal, there is much variability in the semantics of edge relationships [30].
Funding
  • David Molnar’s help in reviewing a draft of this paper is appreciated. This material is based upon work supported in part by the NSF grants IIS-0534198, CNS-0716158, and CNS-0746888.
References
  • http://www.cpc.unc.edu/projects/addhealth/data/dedisclosure, 2008.
  • The National Longitudinal Study of Adolescent Health. http://www.cpc.unc.edu/projects/addhealth, 2008.
  • A. Anagnostopoulos, R. Kumar, and M. Mahdian. Influence and correlation in social networks. In KDD, 2008.
  • C. Anderson. Social networking is a feature, not a destination. http://www.thelongtail.com/the long tail/2007/09/social-networki.html, 2007.
  • M. Anderson. Mining social connections. Adweek. http://tinyurl.com/6768nh, May 19 2008.
  • [25] E. Eldon. VentureBeat: MediaSixDegrees targets ads using social graph information. http://tinyurl.com/662q3o, 2008.
  • case. Techcrunch. http://tinyurl.com/6otok7, 2008.
  • L. Backstrom, C. Dwork, and J. Kleinberg. Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In WWW, 2007.
  • [26] European Parliament. http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31995L0046
  • [8] Norman T. Bailey. The Mathematical Theory of Infectious Diseases (2nd edition). Hafner Press, 1975.
  • Facebook’s privacy http://www.new.facebook.com/policy.php, 2007.
  • [9] A-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286:509–512, 1999.
  • [28] A. Felt and D. Evans. Privacy protection for social networking APIs. In W2SP, 2008.
  • [10] M. Barbaro and T. Zeller. A face is exposed for
  • http://www.nytimes.com/2006/08/09/technology/09aol.html?ex=1312776000, Aug 9 2006.
  • [29] B. Fitzpatrick and D. Recordon. Thoughts on the social graph. http://bradfitz.com/social-graph-problem/, 2007.
  • [11] P. Bearman, J. Moody, and K. Stovel. Chains of affection: The structure of adolescent romantic and sexual networks. American Journal of Sociology, 110(1):44–91, 2004.
  • [30] D. Fono and K. Raynes-Goldie. Hyperfriends and beyond: Friendship and social norms on LiveJournal. In Internet Research Annual Volume 4: Selected Papers from the Association of Internet Researchers Conference, 2007.
  • [31] K. Frikken and P. Golle. Private social network analysis: How to assemble pieces of a graph privately. In WPES, 2006.
  • American Journal of Sociology, 92(5):1170–1182, 1987.
  • [32] M. Granovetter. The strength of weak ties. American Journal of Sociology, 78:1360–1382, 1983.
  • http://info.sen.ca.gov/pub/01-02/bill/sen/sb 1351-1400/sb 1386 bill 20020926 chaptered.html, 2002.
  • [33] M. Granovetter. Economic action and social structure: The problem of embeddedness. American Journal of Sociology, 91:481–510, 1985.
  • [14] California Codes. Business and Professions Code Section 22575-22579. http://tinyurl.com/5fu9ks, 2003. Commonly known as the Online Privacy Protection Act of 2003.
  • [34] R. Gross, A. Acquisti, and H. Heinz. Information revelation and privacy in online social networks. In WPES, 2005.
  • [15] A. Campan and T. Truta. A clustering approach for data and structural anonymity in social networks. In PinKDD, 2008.
  • [35] S. Guha, K. Tang, and P. Francis. NOYB: Privacy in online social networks. In WOSN, 2008.
  • http://www.techcrunch.com/2007/11/30/will-irseek-have-a-chilling-effect-on-irc-chat/, 2007. [Note: A privacy outcry erupted over a search engine
  • [36] Peter Haggett and Richard J. Chorley. Network analysis in geography. Hodder & Stoughton, 1969.
  • [37] R. Hanneman and M. Riddle. Introduction to social network methods. Chapter 10: Centrality and power.
  • [17] M. Chew, D. Balfanz, and B. Laurie. (Under)mining privacy
  • [38] M. Hay, G. Miklau, D. Jensen, P. Weis, and S. Srivastava. [...] University of Massachusetts Amherst, 2007.
  • [19] D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and
  • [40] W. Hwang, T. Kim, M. Ramanathan, and A. Zhang. Bridging centrality: Graph mining from element level to group level. In KDD, 2008.
  • [21] The DataPortability project. http://dataportability.org, 2008.
  • [41] T. Jagatic, N. Johnson, M. Jakobsson, and F. Menczer. Social
  • in primates. Journal of Human Evolution, 22:469–493, 1992.
  • [42] Testimony of Chris Kelly before the United States Senate Committee On Commerce, Science, and Transportation, “Privacy implications of online advertising” hearing. http://commerce.senate.gov/public/files/ChrisKellyFacebookOnlinePrivacyT 2008.
  • [43] F. Kerschbaum and A. Schaad. Privacy-preserving social network analysis for criminal investigations. In WPES, 2008.
  • [44] A. Korolova, R. Motwani, S. Nabar, and Y. Xu. Link privacy in social networks. In ICDE, 2008.
  • [45] G. Kossinets, J. Kleinberg, and D. Watts. The structure of information pathways in a social communication network. In KDD, 2008.
  • [46] B. Krishnamurthy and C. Willis. Characterizing privacy in online social networks. In WOSN, 2008.
  • [47] M. Kurucz, A. Benczur, K. Csalogany, and L. Lukacs. Spectral clustering in telephone call graphs. In WebKDD/SNAKDD, 2007.
  • [48] R. Lambiotte, V. Blondel, C. de Kerchove, E. Huens, C. Prieur, Z. Smoreda, and P. Van Dooren. Geographical dispersal of mobile communication networks. http://arxiv.org/abs/0802.2178, 2008.
  • [49] K. Lewis, J. Kaufman, M. Gonzales, A. Wimmer, and N. Christakis. Tastes, ties, and time: a new social network dataset using Facebook.com. Social Networks, 30:330–342, 2008. [Note: six research assistants were paid to download friends-only information from Facebook].
  • [50] D. Liben-Nowell and J. Kleinberg. The link prediction problem for social networks. In CIKM, 2003.
  • [51] K. Liu and E. Terzi. Towards identity anonymization on graphs. In SIGMOD, 2008.
  • [52] M. Lucas and N. Borisov. flyByNight: Mitigating the privacy risks of social networking. In WPES, 2008.
  • [54] M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted graphs and disconnected components: Patterns and a generator. In KDD, 2008.
  • [55] Medical News Today. WellNet launches online social networking program for health care coordination. http://www.medicalnewstoday.com/articles/118628.php, 2008.
  • [58] A. Mislove, M. Marcon, K. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In IMC, 2007.
  • [60] A. Nanavati, S. Gurumurthy, G. Das, D. Chakraborty, K. Dasgupta, S. Mukherjea, and A. Joshi. On the structural properties of massive telecom call graphs: findings and implications. In CIKM, 2006.
  • [61] A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In S&P, 2008.
  • [63] J.-P. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski, J. Kertesz, and A.-L. Barabasi. Structure and tie strengths in mobile communication networks. http://arxiv.org/abs/physics/0610104, 2006.
  • [65] Parliament of Canada.
  • [68] B. Popescu, B. Crispo, and A. Tanenbaum. Safe and private data sharing with Turtle: Friends team-up and beat the system. In Cambridge Workshop on Security Protocols, 2004.
  • [70] M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In KDD, 2002.
  • [71] T. Rohan, T. Tunguz-Zawislak, S. Sheffer, and J. Harmsen. Network node ad targeting. U.S. Patent Application 0080162260, 2008.
  • [74] Georg Simmel. Soziologie. Duncker & Humblot, 1908. [Note: Simmel proposed a new and quantitative approach to sociology, one that would fall under Social Network Analysis in modern terms.].
  • [76] Z. Stone, T. Zickler, and T. Darrell. Autotagging Facebook: Social network context improves photo annotation. In Workshop on Internet Vision, 2008.
  • [78] G. Swamynathan, C. Wilson, B. Boe, B. Zhao, and K. Almeroth. Can social networks improve e-commerce: a study on social marketplaces. In WOSN, 2008.
  • [82] J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry, 32(4):425–443, 1969.
  • [83] United States Code. The Video Privacy Protection Act (VPPA). http://epic.org/privacy/vppa/, 2002.
  • [84] United States Code. The Privacy Act of 1974 and Amendments. http://epic.org/privacy/laws/privacy act.html, 2005.
  • [85] United States Department of Health and Human
  • [86] United States Senate.
  • [87] United States Senate.
  • [88] United States Senate. Text of the Privacy Act of 2005. http://www.govtrack.us/congress/billtext.xpd?bill=s109-116, 2005.
  • [93] H. Yu, P. Gibbons, M. Kaminsky, and F. Xiao. SybilLimit: A near-optimal social network defense against sybil attacks. In S&P, 2008.
  • [94] E. Zheleva and L. Getoor. Preserving the privacy of sensitive relationships in graph data. In PinKDD, 2007.
  • [95] B. Zhou and J. Pei. Preserving privacy in social networks against neighborhood attacks. In ICDE, 2008.