A Swap Dominated Tensor Re-Generation Strategy for Training Deep Learning Models

2022 IEEE 36th International Parallel and Distributed Processing Symposium (IPDPS 2022)

Abstract
With the growing depth of neural networks and the increasing scale of data, training networks becomes more difficult. When GPU memory is insufficient, it is challenging to train deeper models. Recent research combines tensor swapping and recomputation techniques to optimize memory usage. However, the complex dependencies of the DNN graph limit the improvement achievable by single-GPU memory optimization. Improper swap decisions can even have negative effects, because the source of a recomputation may already have been swapped out. In this paper, we propose a novel swap dominated tensor re-generation strategy, called STR, which combines swap and recomputation techniques to find the optimal execution plan for DNN training when memory is limited. We formalize our memory optimization problem with constraints that describe the dependencies of operator calculations and the bandwidth usage of swapping. A host checkpoint mechanism is designed to make full use of the swapped tensors, which reduces the cost of recomputation. We also present an approximation method based on a recursive source tracing procedure to improve the optimization efficiency. We implement a prototype of STR as a plugin for TensorFlow. The experimental results show that STR improves throughput by up to 21.3% compared with the state-of-the-art hybrid optimization strategy.
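To illustrate the interplay between swapping and recomputation that the abstract describes, the sketch below shows one way a recursive source tracing procedure with host checkpoints could be written. This is a minimal, hypothetical sketch, not the paper's actual algorithm or API: the function and parameter names (regeneration_cost, parents, compute_cost, swap_in_cost, swapped_to_host) and the additive cost model are assumptions made for illustration. The key idea it captures is that a tensor already swapped to host memory acts as a checkpoint that can be copied back, so the recursion stops there instead of recomputing its entire source chain.

```python
# Hypothetical sketch of recursive source tracing with host checkpoints.
# Names and the cost model are illustrative assumptions, not the paper's API.

def regeneration_cost(tensor, parents, compute_cost, swap_in_cost,
                      swapped_to_host, memo=None):
    """Estimate the cost of re-generating `tensor` after it was freed from GPU memory.

    tensor          -- identifier of the tensor to re-generate
    parents         -- dict: tensor -> list of source tensors its producing op consumes
    compute_cost    -- dict: tensor -> cost of recomputing its producing op
    swap_in_cost    -- dict: tensor -> cost of copying it back from host memory
    swapped_to_host -- set of tensors currently resident in host memory
    """
    if memo is None:
        memo = {}
    if tensor in memo:
        return memo[tensor]

    # Host checkpoint: the tensor can be swapped back in, no recomputation needed,
    # so the source tracing recursion terminates here.
    if tensor in swapped_to_host:
        memo[tensor] = swap_in_cost[tensor]
        return memo[tensor]

    # Otherwise the tensor must be recomputed, which first requires re-generating
    # every source tensor its producing op consumes.
    cost = compute_cost[tensor]
    for src in parents.get(tensor, []):
        cost += regeneration_cost(src, parents, compute_cost, swap_in_cost,
                                  swapped_to_host, memo)
    memo[tensor] = cost
    return cost


if __name__ == "__main__":
    # Toy chain a -> b -> c: with b held as a host checkpoint, re-generating c
    # pays for swapping b in plus recomputing c, instead of recomputing a and b.
    parents = {"c": ["b"], "b": ["a"], "a": []}
    compute_cost = {"a": 5.0, "b": 4.0, "c": 3.0}
    swap_in_cost = {"b": 1.0}
    print(regeneration_cost("c", parents, compute_cost, swap_in_cost, {"b"}))  # 4.0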
Keywords
recomputation, swap, DNN training, GPU memory optimization