Triton: Software-Defined Threat Model for Secure Multi-Tenant ML Inference Accelerators

Proceedings of the 12th International Workshop on Hardware and Architectural Support for Security and Privacy, HASP 2023 (2023)

Abstract
Secure machine-learning inference is essential with the advent of multi-tenancy in machine-learning-as-a-service (MLaaS) accelerators. Model owners demand confidentiality for both model weights and architecture, while end users want to protect their personal data. Moreover, ML models used in mission-critical applications such as autonomous vehicles or disease classification need integrity protection. While hardware trusted execution environments (TEEs) [4, 41] provide data confidentiality and integrity, they face two challenges in their adoption for ML inference. First, TEEs are susceptible to numerous side channels arising from resource sharing in multi-tenant systems. Second, the performance overhead of these TEEs is often proportional to the secret data size, making them unattractive for data-intensive real-time inference. Diverse deployment threats further complicate these challenges. For instance, compared to time-shared execution, multi-tenant accelerators must assume a larger attack surface, with adversaries monitoring or tampering with on-accelerator resources. Some inference workloads process sensitive inputs, while others compute on public inputs. As a result, existing TEE designs often adopt a single, perhaps the most restrictive, threat model, which overburdens many secure ML inference deployments. To address these challenges, we introduce the Triton TEE framework. Triton tailors threat models to each deployment with low overhead while mitigating side-channel leakage. It achieves this by offering an interface to define fine-grained secrets in an ML model or input, along with the attacker's observation capabilities. The Triton framework generates code for a custom threat model for each application based on its security requirements, and the security policy of each secret is embedded in the instructions to convey the security guarantee to the hardware.
The expressive threat model and secret declaration can reduce the secure ML inference overhead from 64% to 6% across different multi-tenant deployments.
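As a concrete illustration of the kind of interface the abstract describes, the sketch below shows a hypothetical, host-side secret-declaration API in Python. All names here (`SecretPolicy`, `Observer`, `TensorDecl`, `required_mitigations`) are invented for illustration and are not the paper's actual API; Triton itself generates code and embeds per-secret policies in accelerator instructions, which this sketch does not model.

```python
# Hypothetical sketch of a Triton-style per-secret threat-model declaration.
# Names and the mitigation mapping are illustrative assumptions, not the
# paper's actual interface.

from dataclasses import dataclass
from enum import Flag, auto
from typing import Optional


class Observer(Flag):
    """Attacker observation capabilities a deployment may assume."""
    NONE = 0
    TIMING = auto()          # can observe execution timing
    MEMORY_TRAFFIC = auto()  # can monitor off-chip memory access patterns
    ON_CHIP = auto()         # can probe shared on-accelerator resources


@dataclass
class SecretPolicy:
    """Security requirements attached to one secret."""
    confidential: bool = True
    integrity: bool = True
    observers: Observer = Observer.TIMING | Observer.MEMORY_TRAFFIC


@dataclass
class TensorDecl:
    """A tensor in the model or input; policy=None marks public data."""
    name: str
    policy: Optional[SecretPolicy] = None


def required_mitigations(decl: TensorDecl) -> set:
    """Map a declaration to the hardware mitigations its policy implies."""
    if decl.policy is None:
        return set()  # public data needs no protection
    m = set()
    if decl.policy.confidential:
        m.add("encrypt")
    if decl.policy.integrity:
        m.add("mac")
    if Observer.TIMING in decl.policy.observers:
        m.add("constant-time")
    if Observer.MEMORY_TRAFFIC in decl.policy.observers:
        m.add("traffic-shaping")
    if Observer.ON_CHIP in decl.policy.observers:
        m.add("partition-sram")
    return m


# Example deployment: secret weights, a weaker threat model for user
# input, and a public tensor that incurs no protection overhead.
weights = TensorDecl("model_weights", SecretPolicy())
inputs = TensorDecl("user_input", SecretPolicy(observers=Observer.TIMING))
public = TensorDecl("label_map", None)
```

The point of such fine-grained declarations is that mitigations (and their overhead) apply only where the declared threat model demands them, rather than uniformly under a single worst-case threat model.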
Key words
Secure hardware, ML accelerator, TEE, Threat model