Occamy: Memory-efficient GPU Compiler for DNN Inference.

DAC (2023)

Abstract
This work proposes Occamy, a new memory-efficient DNN compiler that reduces the memory usage of a DNN model without affecting its accuracy. For each DNN operation, Occamy analyzes the dimensions of input and output tensors, and their liveness within the operation. Across all the operations, Occamy analyzes liveness of all the tensors, generates a memory pool after calculating the maximum required memory size, and schedules when and where to place each tensor in the memory pool. Compared to PyTorch, on an integrated embedded GPU for six DNNs, Occamy reduces the memory usage by 34.6% and achieves a geometric mean speedup of 1.25x.
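The abstract describes planning tensor placement in a shared memory pool based on liveness, so that tensors whose live ranges never overlap can reuse the same bytes. The following is a minimal illustrative sketch of that general idea, not Occamy's actual algorithm; the tensor names, sizes, and live ranges are hypothetical.

```python
# Illustrative sketch of liveness-based memory pooling (not Occamy's actual
# algorithm): assign each tensor an offset in one shared pool so that tensors
# with non-overlapping live ranges share memory, then report the maximum
# required pool size.

def plan_pool(tensors):
    """tensors: list of (name, size, first_op, last_op).
    Returns (pool_size, {name: offset})."""
    placed = []   # (offset, size, first_op, last_op)
    offsets = {}
    # Place larger tensors first to reduce fragmentation.
    for name, size, first, last in sorted(tensors, key=lambda t: -t[1]):
        # Intervals occupied by tensors whose live ranges overlap this one.
        busy = sorted((off, off + sz) for off, sz, f, l in placed
                      if not (last < f or l < first))
        offset = 0
        for b_start, b_end in busy:
            if offset + size <= b_start:
                break                     # fits in the gap before this interval
            offset = max(offset, b_end)   # otherwise skip past it
        placed.append((offset, size, first, last))
        offsets[name] = offset
    pool_size = max((off + sz for off, sz, _, _ in placed), default=0)
    return pool_size, offsets

# Hypothetical example: A (ops 0-1) and C (ops 2-3) are never live at the
# same time, so they share offset 0; B overlaps both and gets its own slot.
tensors = [("A", 4, 0, 1), ("B", 2, 1, 2), ("C", 4, 2, 3)]
size, offs = plan_pool(tensors)
```

With these inputs the pool needs 6 units instead of the 10 that per-tensor allocation would require, mirroring the kind of reuse the paper exploits.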
Keywords
DNN inference, DNN model, DNN operation, integrated embedded GPU, maximum required memory size, memory pool, memory usage, memory-efficient DNN compiler, memory-efficient GPU compiler, Occamy analyzes liveness, output tensors, tensor