# Delphi: A Cryptographic Inference Service for Neural Networks

USENIX Security 2020, 2020.

Keywords:

neural network architecture, prior work, service provider, garbled circuits, neural architecture search

Abstract:

Many companies provide neural network prediction services to users for a wide
range of applications. However, current prediction systems compromise one
party's privacy: either the user has to send sensitive inputs to the service
...

Introduction

- Recent advances in machine learning have driven increasing deployment of neural network inference in popular applications like voice assistants [Bar18] and image classification [Liu+17b].
- To achieve good performance on realistic neural networks, DELPHI builds upon techniques from GAZELLE to develop new protocols for evaluating linear and non-linear layers that minimize the use of heavy cryptographic tools and reduce communication and computation costs in both the preprocessing and online phases.
- After preprocessing, during the online inference phase, the client provides its input to the specialized secure two-party computation protocol and eventually learns the inference result.

Highlights

- We present DELPHI, a cryptographic prediction system for realistic neural network architectures
- We describe the protocol for evaluating the i-th layer, which consists of linear functions and activation functions
- Our machine learning and cryptographic protocol experiments rely on the following datasets and architectures: 1
- We demonstrate the effectiveness of DELPHI’s cryptographic protocol by showing that DELPHI’s preprocessing phase and online phase offer significant savings in latency and communication cost over prior work (GAZELLE)
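
To make the linear-layer step more concrete, the following is a minimal sketch of evaluating a matrix-vector product on additively secret-shared data, with the online multiplication carried out in 64-bit floats. Everything named here is an assumption for illustration: the modulus `P`, the masks `r` and `s`, and the helper names are hypothetical, and the preprocessing step is simulated in the clear rather than with a cryptographic protocol. The sizes are chosen so that every intermediate value stays below 2^53 (11-bit weights, 31-bit shares, 64 terms per row give sums under 2^47).

```python
import random

P = 2**31 - 1  # illustrative prime modulus; an assumption, not DELPHI's parameter


def matvec_float(M, v):
    """Matrix-vector product computed purely in 64-bit floats."""
    return [sum(float(a) * float(b) for a, b in zip(row, v)) for row in M]


def shared_linear_layer(M, x, rng):
    """Return (client_share, server_share) of M @ x mod P."""
    n_out, n_in = len(M), len(x)
    r = [rng.randrange(P) for _ in range(n_in)]   # client's random mask
    s = [rng.randrange(P) for _ in range(n_out)]  # server's random mask

    # Preprocessing: the client ends up holding M r - s. Simulated in the
    # clear here; a real protocol would compute it without revealing M or r.
    client_share = [
        (sum(a * b for a, b in zip(row, r)) - s_i) % P for row, s_i in zip(M, s)
    ]

    # Online: the client sends x - r; the server computes M (x - r) + s.
    masked = [(x_i - r_i) % P for x_i, r_i in zip(x, r)]
    acc = matvec_float(M, masked)
    # Sanity check: results lie inside float64's exact-integer range.
    assert all(abs(v) < 2.0**53 for v in acc)
    server_share = [(int(v) + s_i) % P for v, s_i in zip(acc, s)]
    return client_share, server_share


def reconstruct(client_share, server_share):
    """Add the two shares to recover M @ x mod P."""
    return [(c + s) % P for c, s in zip(client_share, server_share)]
```

Adding the two shares gives (M r - s) + (M (x - r) + s) = M x modulo `P`, so neither party sees the other's input in this sketch while the floating-point product remains exact.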

Results

- On every set of model parameters M that the server holds and every input vector x of the client, the output of the client at the end of the protocol is the correct prediction M(x).
- Because the online phase of the protocol for linear layers multiplies a fixed-point matrix by a secret-shared vector, the result is a roughly 45-bit integer, which fits within the 53-bit significand of a 64-bit floating-point number and can therefore be represented with full precision.
- The planner takes as additional inputs the training data, and a constraint on the minimum acceptable prediction accuracy t, and uses NAS to discover a network configuration that maximizes the number of quadratic approximations while still achieving accuracy greater than t.
- In this second mode, the planner uses NAS to optimize the following properties of a candidate network configuration given t: (a) the number of quadratic approximations, (b) the placement of these approximations, and (c) training hyperparameters like learning rate and momentum.
- As explained in Remark 4.2, DELPHI’s choice of prime field enables DELPHI to use standard GPU libraries for evaluating convolutional layers in the online phase.
- The authors' protocol uses Beaver’s multiplication procedure [Bea95], which requires sending one field element from the server to the client and vice versa, and requires some cheap local field operations from each party.
- For the most efficient networks output by the planner, DELPHI requires 1.5–2× less preprocessing time and 6–40× less preprocessing communication than GAZELLE.
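
The Beaver-style multiplication mentioned in the results can be illustrated for the squaring case that a quadratic activation needs. This is a sketch under stated assumptions, not the authors' implementation: it assumes a trusted setup has already handed both parties additive shares of a random value `a` and of `a*a`, and the modulus `P` and all function names are hypothetical. Each party reveals only its share of `d = x - a`, which is one field element in each direction.

```python
import random

P = 2**31 - 1  # illustrative prime modulus (an assumption)


def share(v, rng):
    """Split v into two additive shares modulo P."""
    s0 = rng.randrange(P)
    return s0, (v - s0) % P


def beaver_square(x_sh, a_sh, a2_sh):
    """Given (client, server) shares of x, a and a*a, return shares of x*x."""
    # Each party sends its share of d = x - a to the other party.
    d_client = (x_sh[0] - a_sh[0]) % P
    d_server = (x_sh[1] - a_sh[1]) % P
    d = (d_client + d_server) % P  # d is now public; a hides x

    # x^2 = (d + a)^2 = d^2 + 2*d*a + a^2, computed locally on shares.
    out_client = (d * d + 2 * d * a_sh[0] + a2_sh[0]) % P  # adds public d^2 once
    out_server = (2 * d * a_sh[1] + a2_sh[1]) % P
    return out_client, out_server


def square_via_beaver(x, rng):
    """Simulate setup plus protocol and return the reconstructed x*x mod P."""
    a = rng.randrange(P)
    out = beaver_square(share(x, rng), share(a, rng), share(a * a % P, rng))
    return (out[0] + out[1]) % P
```

After the single exchange, the remaining work is a handful of local additions and multiplications modulo `P`, which matches the "cheap local field operations" described above.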

Conclusion

- For the most efficient networks output by the planner, DELPHI requires 22–100× less time to execute its online phase and 9–40× less online communication than GAZELLE.
- GAZELLE [Juv+18] is the system most similar to ours: it uses an efficient HE-based protocol for linear layers, while using garbled circuits to compute non-linear activations.
- The authors rely on NAS algorithms only for optimizing the placement of quadratic approximation layers within a network, as ReLU activations were the bottleneck in the system.
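
The planner's objective can be sketched as a constrained search: among placements of quadratic approximations, keep the one with the most approximations whose accuracy still exceeds the threshold t. The sketch below swaps in plain random search for NAS and takes a hypothetical `accuracy_of` callback in place of actually training a network, so it only illustrates the objective, not the authors' search procedure.

```python
import random


def plan(num_layers, accuracy_of, t, trials=200, seed=0):
    """Random-search stand-in for the planner.

    A candidate is a tuple of booleans, one per activation layer:
    True means "use a quadratic approximation", False means "keep ReLU".
    Returns (mask, accuracy) for the best feasible candidate, or None.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        mask = tuple(rng.random() < 0.5 for _ in range(num_layers))
        acc = accuracy_of(mask)  # hypothetical: would train and evaluate
        if acc <= t:
            continue  # violates the accuracy constraint
        if best is None or sum(mask) > sum(best[0]):
            best = (mask, acc)
    return best


def toy_accuracy(mask):
    """Toy model: each quadratic layer costs 1/16 of accuracy."""
    return 1.0 - sum(mask) / 16
```

With the toy model and `t = 0.75`, at most three layers can be replaced before the constraint fails, so `plan(6, toy_accuracy, 0.75)` settles on a mask with three quadratic layers.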

- Table 1: Running time and communication cost of ResNet-32 convolutions in DELPHI
- Table 2: Running time and communication cost of ResNet-32 convolutions in DELPHI when run on the GPU across different batch sizes b
- Table 3: Amortized running time and communication cost of individual ReLU and quadratic activations in DELPHI

Related work

- We first discuss cryptographic techniques for secure execution of machine learning algorithms in Section 8.1. Then, in Section 8.2, we discuss model inference attacks that recover information about the model from predictions, as well as countermeasures for these attacks. Finally, in Section 8.3, we discuss prior work on neural architecture search.

8.1 Secure machine learning

The problem of secure inference can be solved via generic secure computation techniques like secure two-party computation (2PC) [Yao86; Gol+87], fully homomorphic encryption (FHE) [Gen09], or homomorphic secret sharing (HSS) [Boy+16]. However, the resulting protocols would suffer from terrible communication and computation complexity. For instance, the cost of using 2PC to compute a function grows with the size of the (arithmetic or boolean) circuit for that function. In our setting, the function being computed is the neural network itself. Evaluating the network requires matrix-vector multiplication, and circuits for this operation grow quadratically with the size of the input. Thus, using a generic 2PC protocol for secure inference would result in an immediate quadratic blow-up in both computation and communication.
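
As a worked count behind the quadratic growth, assume a square weight matrix $W$ of size $n \times n$ over the field used by the circuit (the square shape and the symbols $W$, $x$, $y$ are assumptions made for illustration):

```latex
\begin{align*}
  y_i &= \sum_{j=1}^{n} W_{ij}\, x_j, \qquad i = 1, \dots, n, \\
  \#\text{multiplication gates} &= n^2, \\
  \#\text{addition gates} &= n(n-1), \\
  \text{total gates} &= 2n^2 - n = \Theta(n^2).
\end{align*}
```

For example, $n = 1024$ already gives $1024^2 = 1{,}048{,}576$ multiplication gates for a single layer.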

Funding

- This work was supported by the NSF CISE Expeditions Award CCF-1730628, as well as gifts from the Sloan Foundation, Bakar and Hellman Fellows Fund, Alibaba, Amazon Web Services, Ant Financial, Arm, Capital One, Ericsson, Facebook, Google, Intel, Microsoft, Scotiabank, Splunk and VMware

Reference

- Apple. “iOS Security”. https://www.apple.com/business/docs/site/iOS_Security_Guide.pdf.
- G. Ateniese, L. V. Mancini, A. Spognardi, A. Villani, D. Vitali, and G. Felici. “Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers”. In: IJSN (2015).
- M. Ball, B. Carmer, T. Malkin, M. Rosulek, and N. Schimanski. “Garbled Neural Networks are Practical”. ePrint Report 2019/338.
- B. Barrett. “The year Alexa grew up”. https://www.wired.com/story/amazon-alexa-2018-machine-learning/.
- A. Barak, D. Escudero, A. Dalskov, and M. Keller. “Secure Evaluation of Quantized Neural Networks”. ePrint Report 2019/131.
- Y. Bengio, P. Y. Simard, and P. Frasconi. “Learning long-term dependencies with gradient descent is difficult”. In: IEEE Trans. Neural Networks (1994).
- J. Bergstra and Y. Bengio. “Random Search for Hyper-Parameter Optimization”. In: JMLR (2012).
- A. Brutzkus, O. Elisha, and R. Gilad-Bachrach. “Low Latency Privacy Preserving Inference”. arXiv, cs.CR 1812.10659.
- N. Chandran, D. Gupta, A. Rastogi, R. Sharma, and S. Tripathi. “EzPC: Programmable, Efficient, and Scalable Secure Two-Party Computation for Machine Learning”. ePrint Report 2017/1109.
- E. Chou, J. Beal, D. Levy, S. Yeung, A. Haque, and L. Fei-Fei. “Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference”. arXiv, cs.CR 1811.09953.
- T. Elgamal. “A public key cryptosystem and a signature scheme based on discrete logarithms”. In: IEEE Trans. on Inf. Theory (1985).
- T. Elsken, J. H. Metzen, and F. Hutter. “Neural Architecture Search: A Survey”. In: JMLR (2019).
- J. Fan and F. Vercauteren. “Somewhat Practical Fully Homomorphic Encryption”. ePrint Report 2012/144.
- L. Hanzlik, Y. Zhang, K. Grosse, A. Salem, M. Augustin, M. Backes, and M. Fritz. “MLCapsule: Guarded Offline Deployment of Machine Learning as a Service”. arXiv, cs.CR 1808.00590.
- E. Hesamifard, H. Takabi, and M. Ghasemi. “CryptoDL: Deep Neural Networks over Encrypted Data”. arXiv, cs.CR 1711.05189.
- M. Jaderberg, V. Dalibard, S. Osindero, W. Czarnecki, J. Donahue, A. Razavi, et al. “Population Based Training of Neural Networks”. arXiv, cs.LG 1711.09846.
- M. Jagielski, N. Carlini, D. Berthelot, A. Kurakin, and N. Papernot. “High-Fidelity Extraction of Neural Network Models”. arXiv, cs.LG 1909.01838.
- A. Krizhevsky. “Convolutional Deep Belief Networks on CIFAR-10”. Unpublished manuscript. http://www.cs.utoronto.ca/~kriz/convcifar10-aug2010.pdf.
- Kuna. “Kuna AI”. https://getkuna.com/blogs/news/2017-05-24-introducing-kuna-ai.
- W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi. “A survey of deep neural network architectures and their applications”. In: Neurocomputing (2017).
- O. Regev. “On lattices, learning with errors, random linear codes, and cryptography”. In: JACM (2009).
- B. K. Samanthula, Y. Elmehdwi, and W. Jiang. “k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data”. In: IEEE Trans. Knowl. Data Eng. (2015).
- P. Schoppmann, A. Gascon, M. Raykova, and B. Pinkas. “Make Some ROOM for the Zeros: Data Sparsity in Secure Distributed Machine Learning”. ePrint Report 2019/281.
- S. Tople, K. Grover, S. Shinde, R. Bhagwan, and R. Ramjee. “Privado: Practical and Secure DNN Inference”. arXiv, cs.CR 1810.00602.
- S. Wagh, D. Gupta, and N. Chandran. “SecureNN: Efficient and Private Neural Network Training”. ePrint Report 2018/442.
- M. Wistuba, A. Rawat, and T. Pedapati. “A Survey on Neural Architecture Search”. arXiv, cs.LG 1905.01392.
- D. J. Wu, T. Feng, M. Naehrig, and K. E. Lauter. “Privately Evaluating Decision Trees and Random Forests”. In: PoPETs (2016).
- “Wyze: Contact and Motion Sensors for Your Home”. https://www.wyze.com/.
- X. Yao. “Evolving artificial neural networks”. In: Proceedings of the IEEE (1999).
