Building blocks for redundancy-free vector integer multiplication.
Conference of the Centre for Advanced Studies on Collaborative Research (CASCON) (2021)
Abstract
Commercial applications of cryptography require arithmetic in prime fields whose primes are larger than the architected registers. Over time there will be pressure to use even larger fields, both to keep up with the increasing resources available for brute-force attacks and to counter the threat that quantum computers will reach the power required for unconventional attacks. Integer multiplication is the bottleneck for most of these computations, and most algorithmic innovations revolve around strategically composing efficient hardware multipliers for smaller integers into algorithms for larger integer multiplication. In this paper we present a novel vector instruction that would allow hardware multipliers to be used optimally for schoolbook multiplication: it groups multiplications flexibly, so that no slot in a vector instruction is left empty and no hardware capacity goes unused. We give general conditions for optimality, consider latency/throughput tradeoffs, and optionally pair the new instruction, mammma, with a novel shift-and-sum instruction.
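For readers unfamiliar with the baseline, the following is a minimal scalar sketch of schoolbook multiplication over fixed-width limbs, the algorithm the abstract refers to. It does not model the proposed mammma instruction or any vector grouping. The 64-bit limb width and the names `to_limbs`, `from_limbs`, and `schoolbook_mul` are assumptions chosen for illustration and do not come from the paper.

```python
# Illustrative only: plain schoolbook multiplication on little-endian limbs.
# LIMB_BITS = 64 is an assumed register width, not a value from the paper.
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian limbs."""
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(n)]


def from_limbs(limbs):
    """Recombine little-endian limbs into a single integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def schoolbook_mul(a, b):
    """Multiply two limb vectors using len(a) * len(b) limb products."""
    n, m = len(a), len(b)
    out = [0] * (n + m)
    for i in range(n):
        carry = 0
        for j in range(m):
            # One limb-by-limb product per (i, j) pair; each of these is a
            # job for a hardware multiplier.
            t = a[i] * b[j] + out[i + j] + carry
            out[i + j] = t & LIMB_MASK
            carry = t >> LIMB_BITS
        out[i + m] = carry
    return out
```

Two 4-limb operands require 16 limb products here. How those products are distributed across the lanes of a vector instruction is the question the paper addresses.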