PQC Acceleration Using GPUs: FrodoKEM, NewHope, and Kyber

IEEE Transactions on Parallel and Distributed Systems (2021)

Abstract
In this article, we present the first GPU implementations of FrodoKEM-976, NewHope-1024, and Kyber-1024. These algorithms belong to three different classes of post-quantum algorithms: Learning with Errors (LWE), Ring-LWE, and Module-LWE. We show the practical applicability of the algorithms in different scenarios using two different implementation approaches. Moreover, we achieve highly efficient realizations of computationally expensive operations such as the NTT (Number Theoretic Transform), matrix multiplication, and Keccak. Since these are the most common operations in lattice-based cryptographic algorithms, the techniques presented in this article will likely benefit other similar algorithms. We undertook a detailed experimental study using an NVIDIA Quadro GV100 graphics card. For NewHope and Kyber, we were able to perform approximately 504K and 473K key exchanges per second, respectively, a speedup of about 53.1× and 51.05× over the reference C implementation. Compared to the optimized AVX2 versions, we obtain speedups of 25.7× and 14.6×, respectively. Further, our implementation of FrodoKEM achieved speedups of 50.6×, 44.2×, and 36.9× for the KeyGen, Encaps, and Decaps operations, respectively. Compared to its AVX2 counterpart, we achieved speedups of about 7.3×, 4.7×, and 4.9×, respectively. We also show that using multiple streams yields a further speedup of about 28-38 percent.
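To illustrate why the NTT is singled out as a core operation, the following is a minimal CPU-side sketch of NTT-based polynomial multiplication in the ring Z_q[x]/(x^n + 1), the kind of product that Ring-LWE and Module-LWE schemes compute. It is not the article's GPU code: the toy parameters (n = 8, q = 17, psi = 3) and all function names are assumptions chosen so the arithmetic is easy to check by hand, and the recursive transform is written for clarity rather than speed.

```python
# Illustrative sketch only: toy parameters, not those of NewHope or Kyber.

def ntt(a, omega, q):
    """Recursive radix-2 NTT of a (length a power of two) modulo prime q.

    omega must be a primitive len(a)-th root of unity modulo q.
    """
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q           # butterfly, upper half
        out[k + n // 2] = (even[k] - t) % q  # butterfly, lower half
        w = w * omega % q
    return out


def negacyclic_mul(a, b, psi, q):
    """Multiply a and b in Z_q[x]/(x^n + 1) using the NTT.

    psi must be a primitive 2n-th root of unity modulo q.
    """
    n = len(a)
    omega = psi * psi % q
    psi_inv = pow(psi, -1, q)
    n_inv = pow(n, -1, q)
    # Pre-scale by powers of psi so a cyclic transform yields a negacyclic product.
    a_t = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_t = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    # Pointwise product in the NTT domain.
    c_hat = [x * y % q for x, y in zip(ntt(a_t, omega, q), ntt(b_t, omega, q))]
    # Inverse transform: NTT with omega^-1, then scale by n^-1 and psi^-i.
    c_t = ntt(c_hat, pow(omega, -1, q), q)
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c_t)]


def schoolbook_negacyclic(a, b, q):
    """Quadratic-time reference multiplication in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                # x^n wraps around to -1 in this ring.
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


Q, PSI = 17, 3  # 3 has order 16 modulo 17, so it is a primitive 2n-th root for n = 8
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
assert negacyclic_mul(a, b, PSI, Q) == schoolbook_negacyclic(a, b, Q)
```

The butterfly updates inside each stage of the loop are independent of one another, which is one reason transforms of this shape are commonly mapped onto many GPU threads. How the article itself partitions the work is not described in the abstract.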
Keywords
Cryptography, post-quantum, key exchange, PQC, NewHope, Kyber, FrodoKEM, GPU, CUDA, accelerator, NTT, SHAKE