Folded Integer Multiplication for FPGAs

International Symposium on Field Programmable Gate Arrays (2021)

Abstract
Encryption, especially key exchange algorithms such as RSA, is an increasingly common use model for FPGAs, driven by the adoption of the FPGA as a SmartNIC in the datacenter. While bulk encryption such as AES maps well to generic FPGA features, the very large multipliers required for RSA are a much more difficult problem. Although FPGAs contain thousands of small integer multipliers in DSP Blocks, aggregating them into very large multipliers is challenging because of the large amount of soft logic required, especially in the form of long adders, and because of the high embedded multiplier count. In this paper, we describe a large multiplier architecture that operates in a multi-cycle format and has a linear area/throughput ratio. We show results for a 2048-bit multiplier that has a latency of 118 cycles, accepts new input data every 9th cycle, and closes timing at 377 MHz in an Intel Arria 10 FPGA and at over 400 MHz in a Stratix 10. The proposed multiplier uses 1/9 of the DSP resources typically used in a 2048-bit Karatsuba implementation, showing a perfectly linear throughput to DSP-count ratio. Our proposed solution outperforms recently reported results either in arithmetic complexity, by making use of Karatsuba techniques, or in scheduling efficiency, because the embedded DSP resources are fully utilized.
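The abstract refers to Karatsuba techniques for reducing the number of small multiplications in a large product. As a rough illustration only, the following Python sketch shows the general idea in software: a recursive Karatsuba multiply that counts how many base-case multiplications it performs. The function name `karatsuba`, the `counter` argument, the 32-bit base width `BASE_BITS`, and the subtractive form of the middle term are all assumptions made for this example. None of them is taken from the paper, and the sketch does not model the paper's multi-cycle folding, scheduling, or DSP Block mapping.

```python
import random

BASE_BITS = 32  # hypothetical base-case width, not taken from the paper


def karatsuba(x, y, bits, counter):
    """Multiply non-negative integers x, y < 2**bits.

    Assumes bits is BASE_BITS times a power of two.
    counter[0] is incremented once per base-case multiplication.
    """
    if bits <= BASE_BITS:
        counter[0] += 1
        return x * y

    half = bits // 2
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half

    # Two of the three sub-products: low halves and high halves.
    lo = karatsuba(x_lo, y_lo, half, counter)
    hi = karatsuba(x_hi, y_hi, half, counter)

    # Third sub-product, subtractive form, so operands stay within half bits:
    # x_hi*y_lo + x_lo*y_hi = (x_hi - x_lo)*(y_lo - y_hi) + hi + lo
    dx = x_hi - x_lo
    dy = y_lo - y_hi
    sign = -1 if (dx < 0) != (dy < 0) else 1
    cross = sign * karatsuba(abs(dx), abs(dy), half, counter)
    mid = cross + hi + lo

    return (hi << bits) + (mid << half) + lo


rng = random.Random(0)
a = rng.getrandbits(2048)
b = rng.getrandbits(2048)
count = [0]
assert karatsuba(a, b, 2048, count) == a * b
print(count[0])  # 729 = 3**6, versus 4**6 = 4096 for the schoolbook method
```

With these assumed parameters, a 2048-bit product takes six levels of recursion, and each level replaces four half-width multiplications with three. That saving in multiplier count is the "arithmetic complexity" advantage the abstract mentions. The extra additions and subtractions at each level correspond to the long-adder cost that the abstract identifies as a difficulty in hardware.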