Exploration of task-based scheduling for convolutional neural networks accelerators under memory constraints

Proceedings of the 16th ACM International Conference on Computing Frontiers (2019)

Abstract
Development of application-specific accelerators for deep convolutional neural networks (ConvNets) has mainly focused on accelerating the computationally intensive layers, that is, the convolutional layers, to improve performance and energy efficiency. Traditional approaches in this space have relied on handcrafted dataflow implementations to leverage the fine-grained parallelism and data-locality properties within these layers. However, ConvNet layers also hold untapped potential for cross-layer data locality. In our work, we explore a novel approach in the context of deep neural network accelerators by modeling the computation as a task-dependency directed acyclic graph and proposing a memory-aware heuristic based on Heterogeneous Earliest Finish Time (HEFT) for task-graph scheduling on shared memory systems. Our results show the benefits of task graphs in terms of better memory use (23.4% less) over conventional layer-by-layer processing in a simulated environment with the first three layers of LeNet-5. Certain task graphs trade off makespan (10% increase) for memory use (20% decrease). Finally, our exploration of graphs with different slicing configurations for the pooling layer, using memory-aware HEFT versus the original HEFT, reveals that regularly shaped tiles across layers offer better makespan and memory use than tiles with large dimensions along one axis.
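
For intuition, the following is a minimal Python sketch of the kind of list scheduling the abstract describes: tasks are ordered by the classic HEFT upward rank, but the choice among ready tasks is biased against growth in live buffer memory. The toy DAG, the cost and memory figures, and the ALPHA weight are hypothetical illustrations, not the paper's actual task graph or scoring function.

from collections import defaultdict

# Toy DAG: task -> successors (hypothetical three-layer slice of a ConvNet).
succ = {"conv1": ["pool1"], "pool1": ["conv2"], "conv2": []}
pred = defaultdict(list)
for t, ss in succ.items():
    for s in ss:
        pred[s].append(t)

cost = {"conv1": 8.0, "pool1": 2.0, "conv2": 6.0}  # mean compute cost per task
mem  = {"conv1": 4.0, "pool1": 1.0, "conv2": 3.0}  # output buffer size per task
NUM_PES = 2      # processing elements
ALPHA = 0.5      # hypothetical weight trading rank against memory growth

def rank(t, memo={}):
    # Classic HEFT upward rank: task cost plus the longest path to an exit.
    if t not in memo:
        memo[t] = cost[t] + max((rank(s) for s in succ[t]), default=0.0)
    return memo[t]

ready_at = [0.0] * NUM_PES   # time each PE becomes free
finish = {}                  # task -> finish time
live = 0.0                   # memory held by buffers still needed
ready = {t for t in succ if not pred[t]}

while ready:
    # Memory-aware twist: among ready tasks, discount the HEFT rank by the
    # buffer growth the task would cause (plain HEFT uses the rank alone).
    t = max(ready, key=lambda x: rank(x) - ALPHA * mem[x])
    ready.remove(t)
    est = max((finish[p] for p in pred[t]), default=0.0)
    pe = min(range(NUM_PES), key=lambda i: max(ready_at[i], est) + cost[t])
    start = max(ready_at[pe], est)
    finish[t] = ready_at[pe] = start + cost[t]
    live += mem[t]
    for p in pred[t]:                     # free buffers with no pending consumers
        if all(s in finish for s in succ[p]):
            live -= mem[p]
    for s in succ[t]:                     # release successors whose deps are met
        if all(p in finish for p in pred[s]):
            ready.add(s)
    print(f"{t}: PE{pe} [{start:.1f}, {finish[t]:.1f}], live memory {live:.1f}")

On this toy graph the scheduler frees each layer's output buffer once all of its consumers have run, which is the cross-layer reuse that strict layer-by-layer processing cannot exploit.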
Keywords
accelerator systems, convolutional neural networks, scheduling, task-based parallelism