P2012: Building An Ecosystem For A Scalable, Modular And High-Efficiency Embedded Computing Accelerator

DATE '12: Proceedings of the Conference on Design, Automation and Test in Europe (2012)

Abstract
P2012 is an area- and power-efficient many-core computing fabric based on multiple globally asynchronous, locally synchronous (GALS) clusters supporting aggressive fine-grained power, reliability and variability management. Clusters feature up to 16 processors and one control processor with independent instruction streams sharing a multi-banked L1 data memory, a multi-channel DMA engine, and specialized hardware for synchronization and scheduling. P2012 achieves extreme area and energy efficiency by supporting domain-specific acceleration at the processor and cluster level through the addition of dedicated HW IPs. P2012 can run standard OpenCL and OpenMP parallel codes as well as proprietary Native Programming Model (NPM) SW components that provide the highest level of control over application-to-resource mapping. In Q3 2011 the P2012 SW Development Kit (SDK) was made available to a community of R&D users; it includes full OpenCL and NPM development environments. The first P2012 SoC prototype in 28 nm CMOS will sample in Q4 2012, featuring four clusters and delivering 80 GOPS (with single-precision floating-point support) in 15.2 mm² with 2 W power consumption.
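As a rough reading of the prototype figures quoted in the abstract, the short Python sketch below derives energy and area efficiency from the stated 80 GOPS, 15.2 mm² and 2 W. The per-processor figure is an assumption: the abstract only says clusters have "up to 16 processors", so the sketch assumes all four clusters are fully populated and ignores the control processors. The names `efficiency`, `metrics`, `max_pes` and `gops_per_pe` are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract for the first P2012 SoC prototype.
QUOTED_GOPS = 80.0   # throughput, giga-operations per second
AREA_MM2 = 15.2      # silicon area in mm^2
POWER_W = 2.0        # power consumption in watts
CLUSTERS = 4         # "featuring four clusters"

# ASSUMPTION: every cluster carries the maximum of 16 processors
# ("up to 16" in the abstract); control processors are not counted.
MAX_PES_PER_CLUSTER = 16


def efficiency(gops, area_mm2, power_w):
    """Return energy and area efficiency derived from the raw figures."""
    return {
        "gops_per_watt": gops / power_w,
        "gops_per_mm2": gops / area_mm2,
    }


metrics = efficiency(QUOTED_GOPS, AREA_MM2, POWER_W)
max_pes = CLUSTERS * MAX_PES_PER_CLUSTER
gops_per_pe = QUOTED_GOPS / max_pes

print(f"Energy efficiency: {metrics['gops_per_watt']:.1f} GOPS/W")
print(f"Area efficiency:   {metrics['gops_per_mm2']:.2f} GOPS/mm^2")
print(f"Per-processor share (assuming {max_pes} processors): {gops_per_pe:.2f} GOPS")
```

Under these assumptions the quoted numbers correspond to 40 GOPS/W and roughly 5.3 GOPS/mm². The per-processor share is only as good as the fully-populated-cluster assumption.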
Keywords
CMOS memory circuits, file organisation, integrated circuit reliability, parallel architectures, power aware computing, shared memory systems, synchronisation, system-on-chip, HW IPs, NPM development environments, OpenCL parallel codes, OpenMP parallel codes, P2012 SW development kit, P2012 SoC prototype, Q3 2011, Q4 2012, R&D users, SW components, aggressive fine-grained power management, application-to-resource mapping, area-efficient many-core computing fabric, control processor, domain-specific acceleration, energy efficiency, globally asynchronous locally synchronous clusters, high-efficiency embedded computing accelerator, independent instruction streams, modular embedded computing accelerator, multibanked L1 data memory sharing, multichannel DMA engine, native programming model, power consumption, power-efficient many-core computing fabric, processors scheduling, reliability management, size 28 nm, synchronization, variability management