Training Deep Nets with Sublinear Memory Cost
arXiv preprint arXiv:1604.06174, 2016.
Abstract:
We propose a systematic approach to reduce the memory consumption of deep neural network training. Specifically, we design an algorithm that costs O(√n) memory to train an n-layer network, with only the computational cost of an extra forward pass per mini-batch. As many of the state-of-the-art models hit the upper bound of the GPU memory, our algorithm allows deeper and more complex models to be explored, and helps advance the innovations in deep learning research.
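The core idea is to trade computation for memory: store only the activations at roughly √n segment boundaries during the forward pass, then recompute each segment's intermediate activations on demand during backpropagation, which costs about one extra forward pass. Below is a minimal sketch of this checkpointing scheme, assuming PyTorch's torch.utils.checkpoint as a later library implementation of the same technique (the paper's own reference implementation targeted MXNet); the layer sizes, batch size, and layer count here are illustrative, not from the paper.

```python
# Sketch of sqrt(n) gradient checkpointing with torch.utils.checkpoint.
# Only ~sqrt(n) segment-boundary activations are kept alive in the forward
# pass; everything inside a segment is recomputed during backward, trading
# roughly one extra forward pass for O(sqrt(n)) activation memory.
import math

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

n = 100  # number of layers (illustrative, not from the paper)
layers = [nn.Sequential(nn.Linear(256, 256), nn.ReLU()) for _ in range(n)]
model = nn.Sequential(*layers)

segments = int(math.sqrt(n))  # ~sqrt(n) checkpoint segments
x = torch.randn(32, 256, requires_grad=True)

# Forward pass: only the segment-boundary activations are retained.
y = checkpoint_sequential(model, segments, x)

# Backward pass: each segment re-runs its forward computation to rebuild
# the dropped activations — the "extra forward pass per mini-batch".
y.sum().backward()
```

With n layers split into √n segments of √n layers each, peak activation memory is the √n stored boundaries plus the √n intermediates of the one segment being recomputed, hence O(√n) overall.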