gShare: A centralized GPU memory management framework to enable GPU memory sharing for containers

Future Generation Computer Systems (2022)

Abstract
Owing to their low overhead and rapid deployment, containers are becoming an increasingly attractive system software platform for deep learning and high performance computing (HPC) applications that leverage GPUs. Unfortunately, existing container software does not control how each container allocates GPU memory. Consequently, if one container consumes the majority of GPU memory, other containers may be unable to run their workloads because of insufficient memory. This paper presents gShare, a centralized GPU memory management framework that enables GPU memory sharing for containers. As a modern operating system does, gShare allocates the entire GPU memory inside the framework and manages it with sophisticated memory allocators. gShare can then enforce the GPU memory limit of each container by mediating its memory allocation calls. To achieve this objective, gShare introduces API remoting components, a mediator, and a three-level memory allocator, which together enable lightweight and efficient GPU memory management. Our prototype implementation achieves near-native performance with secure isolation and little memory waste in popular deep learning and HPC workloads.
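The idea of enforcing per-container limits by mediating allocation calls can be illustrated with a minimal sketch. This is not gShare's implementation: the class name Mediator, its methods, and the integer handles are hypothetical, the three-level allocator is not modeled, and no real GPU API is called. The sketch only tracks byte counts against a pool that is assumed to have been reserved up front.

```python
from dataclasses import dataclass, field


@dataclass
class Mediator:
    """Toy stand-in for a centralized GPU memory manager (hypothetical, not gShare's code)."""
    pool_bytes: int                              # memory assumed to be reserved up front
    limits: dict = field(default_factory=dict)   # container id -> byte limit
    used: dict = field(default_factory=dict)     # container id -> bytes currently held
    owners: dict = field(default_factory=dict)   # handle -> (container id, size)
    next_handle: int = 1

    def register(self, container_id, limit_bytes):
        """Record a container and the memory limit it must stay within."""
        self.limits[container_id] = limit_bytes
        self.used[container_id] = 0

    def malloc(self, container_id, size):
        """Mediate an allocation call: return a handle, or None if it is refused."""
        if self.used[container_id] + size > self.limits[container_id]:
            return None                          # container would exceed its own limit
        if sum(self.used.values()) + size > self.pool_bytes:
            return None                          # shared pool is exhausted
        handle = self.next_handle
        self.next_handle += 1
        self.owners[handle] = (container_id, size)
        self.used[container_id] += size
        return handle

    def free(self, container_id, handle):
        """Release an allocation; a container may free only its own memory."""
        owner, size = self.owners.get(handle, (None, 0))
        if owner != container_id:
            return False
        del self.owners[handle]
        self.used[container_id] -= size
        return True
```

In this sketch, a container registered with a 60-byte limit on a 100-byte pool can allocate 50 bytes, but a further 20-byte request is refused because it would push that container past its limit, regardless of how much of the pool is still free.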
Keywords
Containers, GPU memory, Virtualization