Heterogeneous systems with reconfigurable neuromorphic computing accelerators

2016 IEEE International Symposium on Circuits and Systems (ISCAS), 2016

Citations: 12 | Views: 71
Abstract
Developing heterogeneous systems with hardware accelerators is a promising solution for implementing high-performance applications where explicitly programmed, rule-based algorithms are either infeasible or inefficient. However, mapping a neural network model to a hardware representation is a complex process in which balancing computation resources and memory accesses is crucial. In this work, we present a systematic approach to optimize a heterogeneous system with an FPGA-based neuromorphic computing accelerator (NCA). For any application, the neural network topology and computation flow of the accelerator can be configured through an NCA-aware compiler. The FPGA-based NCA contains a generic multi-layer neural network composed of a set of parallel neural processing elements. Such a scheme imitates the human cognition process and follows the hierarchy of the neocortex. At the architectural level, we decrease the computing resource requirement to enhance computation efficiency. The hardware implementation primarily targets reducing the data communication load: a multi-thread computation engine is utilized to mask the long memory latency. Such a combined solution can accommodate the ever-increasing complexity and scalability of machine learning applications and improve system performance and efficiency. Across eight representative benchmarks, we observed on average a 12.1× speedup and a 45.8× energy reduction, with marginal accuracy loss compared with CPU-only computation.
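The abstract describes a generic multi-layer network whose neurons are evaluated by a pool of parallel neural processing elements. The following is a minimal software sketch of that organization, not the paper's implementation: a toy multi-layer perceptron whose per-layer neuron evaluations are dispatched to a worker pool standing in for the parallel processing elements (the network shape, weights, and `num_pes` parameter are illustrative assumptions).

```python
# Hedged sketch: per-layer neuron evaluations are farmed out to a thread
# pool, loosely imitating parallel neural processing elements. All
# weights and the network topology below are made up for illustration.
from concurrent.futures import ThreadPoolExecutor
import math

def neuron(weights, bias, inputs):
    # Weighted sum of inputs plus bias, passed through a sigmoid.
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

def layer_forward(layer, inputs, pool):
    # Each (weights, bias) pair is submitted to the pool, so neurons
    # within a layer evaluate concurrently.
    futures = [pool.submit(neuron, w, b, inputs) for w, b in layer]
    return [f.result() for f in futures]

def forward(network, inputs, num_pes=4):
    # Layers run in sequence; neurons within a layer run in parallel.
    with ThreadPoolExecutor(max_workers=num_pes) as pool:
        for layer in network:
            inputs = layer_forward(layer, inputs, pool)
    return inputs

# Toy 2-3-1 network with fixed weights.
net = [
    [([0.5, -0.2], 0.1), ([0.3, 0.8], 0.0), ([-0.6, 0.4], 0.2)],
    [([0.7, -0.1, 0.5], -0.3)],
]
out = forward(net, [1.0, 0.5])
```

In hardware, the pool of workers corresponds to fixed processing elements and the per-layer dispatch to the compiler-configured computation flow; the multi-threading here only models the concurrency, not the latency-masking memory system.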
Keywords
reconfigurable neuromorphic computing accelerators,hardware accelerator,high-performance applications,explicitly programmed rule-based algorithms,neural network model mapping,complex process,computation resource balancing,memory access,heterogeneous system optimization,FPGA-based neuromorphic computing accelerator,FPGA-based NCA,neural network topology,computation flow,NCA-aware compiler,generic multilayer neural network,parallel neural processing elements,human cognition process,neocortex hierarchy,architectural level,computing resource,computation efficiency enhancement,data communication load reduction,multithread computation engine,long-memory latency masking,machine learning applications,system performance improvement,system efficiency improvement