A 0.57-GOPS/DSP Object Detection PIM Accelerator on FPGA

ASP-DAC (2021)

Abstract
This paper presents an object detection accelerator with a processing-in-memory (PIM) architecture on FPGAs. PIM architectures are well known for their energy efficiency and for avoiding the memory wall. Each PIM unit in the accelerator is built from BRAM and LUT-based counters, which also helps improve DSP performance density. The overall architecture consists of 64 PIM units and three memory buffers that store inter-layer results. A shrunk and quantized Tiny-YOLO network is mapped to the PIM accelerator, and DRAM access is fully eliminated during inference. The design achieves a throughput of 201.6 GOPS at a 100 MHz clock rate and, correspondingly, a performance density of 0.57 GOPS/DSP.
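The headline figures can be cross-checked with a little arithmetic. The sketch below is not from the paper: it only uses the four numbers quoted in the abstract, and it assumes that performance density is simply throughput divided by the number of DSP slices and that work is spread evenly over the PIM units. All variable names are illustrative.

```python
# Figures quoted in the abstract.
throughput_gops = 201.6       # reported throughput, in GOPS
clock_mhz = 100.0             # reported clock rate, in MHz
density_gops_per_dsp = 0.57   # reported performance density, in GOPS/DSP
num_pim_units = 64            # reported number of PIM units

# Operations completed per clock cycle: (GOPS * 1e9) / (MHz * 1e6).
ops_per_cycle = throughput_gops * 1e9 / (clock_mhz * 1e6)

# Average operations per cycle per PIM unit.
# Assumption: the load is spread evenly over all units.
ops_per_cycle_per_unit = ops_per_cycle / num_pim_units

# DSP count implied by the two reported figures.
# Assumption: density = throughput / DSP count. The result is approximate
# because the reported density is rounded to two decimals.
implied_dsps = throughput_gops / density_gops_per_dsp

print(f"ops per cycle:          {ops_per_cycle:.1f}")
print(f"ops per cycle per unit: {ops_per_cycle_per_unit:.2f}")
print(f"implied DSP slices:     {implied_dsps:.0f}")
```

Under these assumptions the design completes roughly 2016 operations per cycle, about 31.5 per PIM unit, and the reported density corresponds to about 354 DSP slices. The abstract does not state the DSP count, so that last value is an inference from rounded numbers rather than a reported result.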