iPrivJoin: An ID-Private Data Join Framework for Privacy-Preserving Machine Learning.
IEEE Transactions on Information Forensics and Security (2023)
Abstract
The world has observed an increasing trend in the development of Privacy-Preserving Machine Learning (PPML) for cross-silo collaborative model training over sensitive data. As the first essential step of cross-silo PPML, it is critical that the parties can align their datasets with privacy assurance, i.e., perform a private data join. However, existing private data join methods typically leak the ID information in the dataset intersection, which often raises privacy concerns. In this work, we propose iPrivJoin: a novel framework of ID-private data join for PPML. Compared with naively using circuit-based Private Set Intersection (circuit-PSI) for data join, the proposed framework has two advantages: 1) Data volume reduction. iPrivJoin utilizes oblivious shuffle to securely trim off the redundant data that lies outside the intersection, whereas the circuit-PSI based approach must carry the entire dataset forward for further processing. 2) Efficiency improvement. iPrivJoin introduces a new private encoding technique to avoid the expensive circuit evaluation that is needed in circuit-PSI. As a result, compared with directly using circuit-PSI, PPML with iPrivJoin enjoys a speedup of approximately 3x. Moreover, we propose a new oblivious shuffle protocol, which may be of independent interest. It achieves a 1.44x speedup over the state of the art in a real-world WAN setting.
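To make the goal of an ID-private join concrete, the following is a minimal sketch of the ideal functionality only, computed in the clear by a hypothetical trusted party. It is not the iPrivJoin protocol: the function names, the ring size `MOD`, and the use of two-party additive secret sharing are assumptions made for illustration. It shows the two properties the abstract describes: rows outside the intersection are dropped, and the surviving rows are shuffled and secret-shared so that neither share list reveals which IDs matched.

```python
import random

MOD = 2**32  # assumed ring size for additive secret sharing


def share(value, rng):
    """Split an integer into two additive shares modulo MOD."""
    s0 = rng.randrange(MOD)
    s1 = (value - s0) % MOD
    return s0, s1


def ideal_id_private_join(table_a, table_b):
    """Trusted-party view of the join (illustration only, not a secure protocol).

    table_a, table_b: dict mapping an ID to a list of integer features.
    Returns one list of row shares for each party.
    """
    rng = random.SystemRandom()  # OS-backed randomness
    # Keep only rows whose ID appears in both tables (trim redundant data).
    common = [i for i in table_a if i in table_b]
    # Shuffle so that a row's position says nothing about its ID.
    rng.shuffle(common)
    shares_a, shares_b = [], []
    for i in common:
        row = table_a[i] + table_b[i]  # concatenate both parties' features
        pairs = [share(v, rng) for v in row]
        shares_a.append([p[0] for p in pairs])
        shares_b.append([p[1] for p in pairs])
    return shares_a, shares_b


def reconstruct(shares_a, shares_b):
    """Recombine the two share lists (done here only to check correctness)."""
    return [[(x + y) % MOD for x, y in zip(ra, rb)]
            for ra, rb in zip(shares_a, shares_b)]
```

In this sketch the output contains only as many rows as the intersection, which corresponds to the data volume reduction described above; a real protocol would have to achieve the same result without any party seeing the IDs or features in the clear.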
Key words
Private data join, private set intersection, oblivious shuffle