De-Anonymizing Text by Fingerprinting Language Generation

NeurIPS 2020.


Abstract:

Components of machine learning systems are not (yet) perceived as security hotspots. Secure coding practices, such as ensuring that no execution paths depend on confidential inputs, have not yet been adopted by ML developers. We initiate the study of code security of ML systems by investigating how nucleus sampling, a popular approach for text generation, unwittingly leaks the confidential inputs of the systems that use it.

Introduction
  • Machine learning (ML) models are composed of building blocks such as layer types, loss functions, and sampling methods
  • Nucleus sampling [19] is similar to top-k sampling, but instead of choosing candidates by rank, it chooses the maximal set (“nucleus”) of top-ranked words such that the sum of their probabilities is ≤ q (a sketch follows this list)
  • It produces high-quality, high-diversity text [19] and performs well on metrics, including the Human Unified with Statistical Evaluation (HUSE) score [18]
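To make the sampling rule above concrete, here is a minimal sketch in plain Python/NumPy; it is not the authors' or Hugging Face's implementation, and the helper name nucleus_sample and the default threshold q = 0.9 are illustrative assumptions.

```python
import numpy as np

def nucleus_sample(probs, q=0.9, rng=None):
    """Nucleus (top-q) sampling: draw one token id from the nucleus.

    probs: 1-D array of next-token probabilities (sums to 1).
    q:     threshold; the nucleus is the maximal set of top-ranked
           tokens whose total probability is <= q (at least one token).
    """
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token ids, most probable first
    cumulative = np.cumsum(probs[order])
    nucleus_size = max(1, int(np.sum(cumulative <= q)))
    nucleus = order[:nucleus_size]
    renormalized = probs[nucleus] / probs[nucleus].sum()
    token = int(rng.choice(nucleus, p=renormalized))
    return token, nucleus_size
```

Note that nucleus_size varies with the model's confidence at each step; this data-dependent quantity is exactly what the paper's fingerprint is built from.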
Highlights
  • Machine learning (ML) models are composed of building blocks such as layer types, loss functions, and sampling methods
  • We show that when a word sequence X and another sequence Y are sampled from a real-world English corpus and X and Y are not similar, it always holds that the distance between their nucleus-size fingerprints π(X) and π(Y) exceeds U(|X|), for a large U(|X|)
  • The victim, like any process that uses PyTorch, loads the shared object (SO) libtorch.so into their process; this SO resides in a public, world-readable directory on any machine where PyTorch is installed
  • As our main technical contribution, we demonstrated that the series of nucleus sizes associated with an English-language word sequence is a fingerprint which uniquely identifies this sequence
  • We showed how a side-channel attacker can measure these fingerprints and use them to de-anonymize anonymous text
  • We explained how to mitigate this leak by reducing input-dependent control flows in the implementations of ML systems
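A hedged sketch of the fingerprint stated above: given a fixed language model, the series of nucleus sizes for a word sequence can be recomputed offline. The GPT-2 model and the Hugging Face transformers API below follow the setup described in Methods; the helper name nucleus_size_series and the threshold q = 0.9 are our assumptions.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def nucleus_size_series(text, q=0.9):
    """Return the series of nucleus sizes (the fingerprint) for `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0]            # (seq_len, vocab_size)
    sizes = []
    for pos in range(ids.shape[1] - 1):          # distribution over token pos+1
        probs = torch.softmax(logits[pos], dim=-1)
        sorted_probs, _ = torch.sort(probs, descending=True)
        cumulative = torch.cumsum(sorted_probs, dim=-1)
        sizes.append(max(1, int((cumulative <= q).sum())))
    return sizes
```

Two dissimilar sequences yield very different series, which is what makes the series of nucleus sizes usable as a fingerprint.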
Methods
  • Our “victim” uses an auto-completion app based on Hugging Face’s PyTorch code driving a GPT-2-small language model, as in Section 3.2.
  • The victim, like any process that uses PyTorch, loads the shared object (SO) libtorch.so into their process.
  • This SO resides in a public, world-readable directory on any machine where PyTorch is installed.
  • The authors' attacker loads the same SO file into their process.
  • The attacker uses Flush+Reload to monitor the first instruction of a function called within the loop, as shown in Figure 5 (a post-processing sketch follows this list)
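Flush+Reload itself runs as native code, so only the attacker's post-processing can be sketched here. Assuming the probe yields one timestamp each time the monitored function is entered, i.e., once per loop iteration and hence once per nucleus candidate, grouping hits that are close in time recovers one nucleus size per generated token. The trace format, the function name sizes_from_trace, and the gap threshold are hypothetical, not the authors' tooling.

```python
def sizes_from_trace(hit_times_ns, step_gap_ns=2_000_000):
    """Turn probe-hit timestamps into an estimated nucleus-size series.

    hit_times_ns: sorted timestamps (ns) of Flush+Reload hits on the
                  monitored function.
    step_gap_ns:  assumed minimum quiet gap between two sampling steps;
                  hits separated by less than this belong to one step.
    """
    sizes, count, last = [], 0, None
    for t in hit_times_ns:
        if last is not None and t - last > step_gap_ns:
            sizes.append(count)        # previous sampling step ended
            count = 0
        count += 1
        last = t
    if count:
        sizes.append(count)
    return sizes
```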
Results
  • Figure 6c shows recall for different N; when N ≥ 1900, the recall of the attack is greater than 99% (a matching sketch follows).
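To connect this recall figure to a concrete procedure, a plausible matching step is sketched below: the measured series is compared against precomputed fingerprints of candidate texts and the nearest candidate is returned. The L1-style distance and the function name deanonymize are assumptions; the paper's actual matching rule may differ.

```python
def deanonymize(measured, candidates):
    """Return the candidate text whose fingerprint best matches `measured`.

    measured:   nucleus-size series recovered via the side channel.
    candidates: dict mapping candidate text -> precomputed size series
                (e.g., from nucleus_size_series above).
    """
    def distance(a, b):
        # compare overlapping positions, penalize any length mismatch
        return sum(abs(x - y) for x, y in zip(a, b)) + abs(len(a) - len(b))
    return min(candidates, key=lambda text: distance(measured, candidates[text]))
```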
Conclusion
  • The authors studied nucleus sampling, a popular approach for text generation, as a case study of ML systems that unwittingly leak their confidential inputs.
  • As the main technical contribution, the authors demonstrated that the series of nucleus sizes associated with an English-language word sequence is a fingerprint which uniquely identifies this sequence.
  • The authors showed how a sidechannel attacker can measure these fingerprints and use them to de-anonymize anonymous text.
  • The authors explained how to mitigate this leak by reducing input-dependent control flows in the implementations of ML systems (a sketch follows this list).
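One way to act on this mitigation, sketched under our own assumptions rather than as the fix adopted by any particular library, is to compute the nucleus with fixed-shape tensor operations over the entire vocabulary, so that no loop or branch count depends on the nucleus size:

```python
import torch

def nucleus_filter_fixed_flow(probs, q=0.9):
    """Zero out tokens outside the nucleus using only fixed-shape tensor ops.

    Every operation touches the full vocabulary regardless of how many
    tokens fall inside the nucleus, so the executed control flow does not
    reveal the nucleus size.
    """
    sorted_probs, order = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep_sorted = (cumulative <= q).float()
    keep_sorted[0] = 1.0                           # always keep the top token
    keep = torch.zeros_like(probs).scatter(0, order, keep_sorted)
    filtered = probs * keep
    return filtered / filtered.sum()
```

This removes the input-dependent loop; a complete defense would also have to consider data-dependent memory accesses and timing inside library kernels.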
Tables
  • Table 1: Variability and measurement error for Silk Road Forum users (N = 2700, U(N) − d(N) = 31860).
Related work
  • Prior work showed how to infer model architectures and weights, but not inputs, via model execution time [9], addresses of memory accesses leaked by GPUs [21] and trusted hardware enclaves [22], or via cache [20, 44] and GPU [29] side channels.

    The only prior work on inferring model inputs required hardware attacks, such as physically probing the power consumption of an FPGA accelerator [40], physically probing an external microcontroller executing the model [5], or inferring coarse information about the input’s class from hardware performance counters [2]. To the best of our knowledge, ours is the first work to show the feasibility of inferring neural-network inputs in a conventional, software-only setting, where the attacker is limited to executing an isolated malicious application on the victim’s machine.
Funding
  • This research was supported in part by NSF grants 1704296 and 1916717, the Blavatnik Interdisciplinary Cyber Research Center (ICRC), the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program, and a Google Faculty Research Award.
References
  • [1] AppArmor. https://gitlab.com/apparmor/apparmor/-/wikis/home, 1998. Accessed: May 2020.
  • [2] M. Alam and D. Mukhopadhyay. How secure are deep learning algorithms from side-channel based reverse engineering? In DAC, 2019.
  • [3] M. Andrysco, D. Kohlbrenner, K. Mowery, R. Jhala, S. Lerner, and H. Shacham. On subnormal floating point and abnormal timing. In S&P, 2015.
  • [4] S. Axelsson. The base-rate fallacy and the difficulty of intrusion detection. TISSEC, 3(3):186–205, 2000.
  • [5] L. Batina, S. Bhasin, D. Jap, and S. Picek. CSI neural network: Using side-channels to recover your artificial neural network information. In USENIX Security, 2019.
  • [6] E. F. Brickell. Technologies to improve platform security. In CHES, 2011.
  • [7] S. Cohney, A. Kwong, S. Paz, D. Genkin, N. Heninger, E. Ronen, and Y. Yarom. Pseudorandom black swans: Cache attacks on CTR DRBG. In S&P, 2020.
  • [8] ConvoKit. Cornell conversational analysis toolkit. https://convokit.cornell.edu/, 2020. Accessed: June 2020.
  • [9] V. Duddu, D. Samanta, D. V. Rao, and V. E. Balas. Stealing neural networks via timing side channels. arXiv:1812.11720, 2018.
  • [10] Fit distribution module/script (fitdist). https://github.com/alreich/fitdist, 2020. Accessed: June 2020.
  • [11] Q. Ge, Y. Yarom, D. Cock, and G. Heiser. A survey of microarchitectural timing attacks and countermeasures on contemporary hardware. Journal of Cryptographic Engineering, 8(1):1–27, 2018.
  • [12] D. Genkin, A. Shamir, and E. Tromer. RSA key extraction via low-bandwidth acoustic cryptanalysis. In CRYPTO, 2014.
  • [13] D. Genkin, L. Pachmanov, I. Pipman, and E. Tromer. ECDH key-extraction via low-bandwidth electromagnetic attacks on PCs. In CT-RSA, 2016.
  • [14] D. Genkin, L. Pachmanov, E. Tromer, and Y. Yarom. Drive-by key-extraction cache attacks from portable code. In ACNS, 2018.
  • [15] B. Gras, K. Razavi, H. Bos, and C. Giuffrida. Translation leak-aside buffer: Defeating cache side-channel protections with TLB attacks. In USENIX Security, 2018.
  • [16] S. Gueron. Efficient software implementations of modular exponentiation. Journal of Cryptographic Engineering, 2(1):31–43, 2012.
  • [17] D. Gullasch, E. Bangerter, and S. Krenn. Cache games – bringing access-based cache attacks on AES to practice. In S&P, 2011.
  • [18] T. B. Hashimoto, H. Zhang, and P. Liang. Unifying human and statistical evaluation for natural language generation. In NAACL, 2019.
  • [19] A. Holtzman, J. Buys, M. Forbes, and Y. Choi. The curious case of neural text degeneration. In ICLR, 2020.
  • [20] S. Hong, M. Davinroy, Y. Kaya, D. Dachman-Soled, and T. Dumitraş. How to 0wn NAS in your spare time. In ICLR, 2020.
  • [21] X. Hu, L. Liang, L. Deng, S. Li, X. Xie, Y. Ji, Y. Ding, C. Liu, T. Sherwood, and Y. Xie. Neural network model extraction attacks in edge devices by hearing architectural hints. In ASPLOS, 2020.
  • [22] W. Hua, Z. Zhang, and G. E. Suh. Reverse engineering convolutional neural networks through side-channel information leaks. In DAC, 2018.
  • [23] Hugging Face. Transformers on GitHub. https://github.com/huggingface/transformers, 2020. Accessed: June 2020.
  • [24] Hugging Face. Write with Transformer (demo). https://transformer.huggingface.co/, 2020. Accessed: June 2020.
  • [25] R. Hund, C. Willems, and T. Holz. Practical timing side channel attacks against kernel space ASLR. In S&P, 2013.
  • [26] P. C. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In CRYPTO, 1996.
  • [27] M. Lipp, D. Gruss, R. Spreitzer, C. Maurice, and S. Mangard. ARMageddon: Cache attacks on mobile devices. In USENIX Security, 2016.
  • [28] F. Liu, Y. Yarom, Q. Ge, G. Heiser, and R. B. Lee. Last-level cache side-channel attacks are practical. In S&P, 2015.
  • [29] H. Naghibijouybari, A. Neupane, Z. Qian, and N. Abu-Ghazaleh. Rendered insecure: GPU side channel attacks are practical. In CCS, 2018.
  • [30] Y. Oren, V. P. Kemerlis, S. Sethumadhavan, and A. D. Keromytis. The spy in the sandbox: Practical cache attacks in JavaScript and their implications. In CCS, 2015.
  • [31] D. A. Osvik, A. Shamir, and E. Tromer. Cache attacks and countermeasures: the case of AES. In CT-RSA, 2006.
  • [32] C. Percival. Cache missing for fun and profit. https://www.daemonology.net/papers/htt.pdf, 2005.
  • [33] PyTorch. https://github.com/pytorch/pytorch, 2020. Accessed: June 2020.
  • [34] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 2019.
  • [35] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. Hey, you, get off of my cloud: Exploring information leakage in third-party compute clouds. In CCS, 2009.
  • [36] E. Ronen, R. Gillham, D. Genkin, A. Shamir, D. Wong, and Y. Yarom. The 9 lives of Bleichenbacher's CAT: New Cache ATtacks on TLS implementations. In S&P, 2019.
  • [37] R. Schuster, V. Shmatikov, and E. Tromer. Beauty and the Burst: Remote identification of encrypted video streams. In USENIX Security, 2017.
  • [39] P. Sirinam, M. Imani, M. Juarez, and M. Wright. Deep fingerprinting: Undermining website fingerprinting defenses with deep learning. In CCS, 2018.
  • [40] L. Wei, B. Luo, Y. Li, Y. Liu, and Q. Xu. I know what you see: Power side-channel attack on convolutional neural network accelerators. In ACSAC, 2018.
  • [41] S. Welleck, I. Kulikov, J. Kim, R. Y. Pang, and K. Cho. Consistency of a recurrent language model with respect to incomplete decoding. arXiv:2002.02492, 2020.
  • [42] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, and J. Brew. HuggingFace's transformers: State-of-the-art natural language processing. arXiv:1910.03771, 2019.
  • [43] M. Yan, R. Sprabery, B. Gopireddy, C. Fletcher, R. Campbell, and J. Torrellas. Attack directories, not caches: Side channel attacks in a non-inclusive world. In S&P, 2019.
  • [44] M. Yan, C. Fletcher, and J. Torrellas. Cache telepathy: Leveraging shared resource attacks to learn DNN architectures. In USENIX Security, 2020.
  • [45] Y. Yarom and K. Falkner. FLUSH+RELOAD: A high resolution, low noise, L3 cache side-channel attack. In USENIX Security, 2014.
  • [46] Y. Yarom, D. Genkin, and N. Heninger. CacheBleed: A timing attack on OpenSSL constant-time RSA. Journal of Cryptographic Engineering, 7(2):99–112, 2017.
  • [47] Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart. Cross-VM side channels and their use to extract private keys. In CCS, 2012.
  • [48] Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart. Cross-tenant side-channel attacks in PaaS clouds. In CCS, 2014.