Instructions and logic to perform floating-point and integer operations for machine learning

user-5d4bc4a8530c70a9b361c870(2017)

Citations: 24 | Views: 3
Abstract
One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.
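The arithmetic described in the abstract can be illustrated with a small software sketch. The example below is not the patented hardware or its instruction; it only mimics the numeric behavior on the CPU, assuming the 16-bit operands are IEEE 754 half-precision floats and the 32-bit product and sum are single-precision floats (the abstract does not fix the operand type, and the title covers integer operands as well). The helper names `to_fp16`, `to_fp32`, and `matmul_accumulate` are hypothetical and chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def matmul_accumulate(a, b, c):
    """Return D = A @ B + C for 2D lists of floats.

    Assumed behavior (a sketch, not the patent's implementation):
    operands of A and B are rounded to 16 bits, each product is held
    in 32 bits, and the running sum is kept in 32 bits.
    """
    m, k, n = len(a), len(b), len(b[0])
    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = to_fp32(c[i][j])  # 32-bit accumulator seeded from C
            for p in range(k):
                # 32-bit intermediate product of two 16-bit operands
                product = to_fp32(to_fp16(a[i][p]) * to_fp16(b[p][j]))
                # 32-bit sum based on the intermediate product
                acc = to_fp32(acc + product)
            d[i][j] = acc
    return d


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[0.5, 0.5], [0.5, 0.5]]
    print(matmul_accumulate(a, b, c))  # [[19.5, 22.5], [43.5, 50.5]]
```

Keeping the product and the sum in 32 bits matters because a product of two half-precision values can exceed the half-precision range: 256 × 256 = 65536 is larger than the largest finite half-precision value (65504), yet it is represented exactly in single precision.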
Keywords
Graphics processing unit, Operand, Matrix multiplication, Floating point, Multiprocessing, Thread (computing), AND gate, Parallel computing, Computer science