Triton: Software-Defined Threat Model for Secure Multi-Tenant ML Inference Accelerators

Proceedings of the 12th International Workshop on Hardware and Architectural Support for Security and Privacy, HASP 2023 (2023)

Abstract
Secure machine-learning inference is essential with the advent of multi-tenancy in machine-learning-as-a-service (MLaaS) accelerators. Model owners demand confidentiality for both model weights and architecture, while end users want to protect their personal data. Moreover, ML models used in mission-critical applications such as autonomous vehicles or disease classification need integrity protection. While hardware trusted execution environments (TEEs) [4, 41] provide data confidentiality and integrity, they face two challenges in their adoption for ML inference. First, TEEs are susceptible to numerous side channels arising from resource sharing in multi-tenant systems. Second, the performance overhead of these TEEs is often proportional to the secret data size, making them unattractive for data-intensive real-time inference. Diverse deployment threats further complicate these challenges. For instance, compared to time-shared execution, multi-tenant accelerators must assume a larger attack surface, with adversaries monitoring or tampering with on-accelerator resources. Some inference workloads process sensitive inputs, while others compute on public inputs. As a result, existing TEE designs often adopt a single, perhaps the most restrictive, threat model, which overburdens many secure ML inference deployments. To address these challenges, we introduce the Triton TEE framework. Triton tailors threat models to each deployment with low overhead while mitigating side-channel leakage. It achieves this by offering an interface to define fine-grained secrets in an ML model or input, along with the attacker's observation capabilities. The Triton framework generates code for a custom threat model for each application based on its security requirements, and the security policy of each secret is embedded in the instructions to convey the security guarantee to the hardware.
The expressive threat model and secret declaration can reduce the secure ML inference overhead from 64% to 6% across different multi-tenant deployments.
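As a concrete illustration of the kind of interface the abstract describes, the sketch below shows a hypothetical, host-side secret-declaration API in Python. All names here (`SecretPolicy`, `Observer`, `TensorDecl`, `required_mitigations`) are invented for illustration and are not the paper's actual API; Triton itself generates code and embeds per-secret policies in accelerator instructions, which this sketch does not model.

```python
# Hypothetical sketch of a Triton-style per-secret threat-model declaration.
# Names and the mitigation mapping are illustrative assumptions, not the
# paper's actual interface.

from dataclasses import dataclass
from enum import Flag, auto
from typing import Optional


class Observer(Flag):
    """Attacker observation capabilities a deployment may assume."""
    NONE = 0
    TIMING = auto()          # can observe execution timing
    MEMORY_TRAFFIC = auto()  # can monitor off-chip memory access patterns
    ON_CHIP = auto()         # can probe shared on-accelerator resources


@dataclass
class SecretPolicy:
    """Security requirements attached to one secret."""
    confidential: bool = True
    integrity: bool = True
    observers: Observer = Observer.TIMING | Observer.MEMORY_TRAFFIC


@dataclass
class TensorDecl:
    """A tensor in the model or input; policy=None marks public data."""
    name: str
    policy: Optional[SecretPolicy] = None


def required_mitigations(decl: TensorDecl) -> set:
    """Map a declaration to the hardware mitigations its policy implies."""
    if decl.policy is None:
        return set()  # public data needs no protection
    m = set()
    if decl.policy.confidential:
        m.add("encrypt")
    if decl.policy.integrity:
        m.add("mac")
    if Observer.TIMING in decl.policy.observers:
        m.add("constant-time")
    if Observer.MEMORY_TRAFFIC in decl.policy.observers:
        m.add("traffic-shaping")
    if Observer.ON_CHIP in decl.policy.observers:
        m.add("partition-sram")
    return m


# Example deployment: secret weights, a weaker threat model for user
# input, and a public tensor that incurs no protection overhead.
weights = TensorDecl("model_weights", SecretPolicy())
inputs = TensorDecl("user_input", SecretPolicy(observers=Observer.TIMING))
public = TensorDecl("label_map", None)
```

The point of such fine-grained declarations is that mitigations (and their overhead) apply only where the declared threat model demands them, rather than uniformly under a single worst-case threat model.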
Key words
Secure hardware, ML accelerator, TEE, Threat model