SAND: A Storage Abstraction for Video-based Deep Learning

HotStorage 2023

Abstract
Deep learning has achieved significant success in video applications such as classification, analytics, and self-supervised learning. However, when scaling out to a large volume of videos, existing approaches suffer from a fundamental limitation: they cannot efficiently utilize GPUs for training deep neural networks (DNNs). This is because video decoding during data preparation incurs prohibitive computing overhead, leaving GPUs idle for the majority of training time. Alternatively, caching raw videos in memory or storage to bypass decoding does not scale, as the cached data can occupy tens to hundreds of terabytes. This paper proposes SAND, a system that enables deep learning frameworks to access training data directly through a storage abstraction. This abstraction effectively hides the data preprocessing delay, allowing GPUs to be fully utilized for DNN training. To accomplish this, SAND operates an in-storage cache and manages it with ahead-of-time scheduling, guaranteeing that requested training data can always be retrieved immediately from the cache. The scheduling takes the deep learning framework's future data accesses into account when making cache-replacement decisions. Compared to the existing approach, our evaluation in emulated environments shows that SAND improves GPU utilization by 6.0x and reduces training time by 75.9% on average.
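
The abstract does not detail the scheduling policy beyond its use of known future accesses, so the following is only a minimal sketch of what ahead-of-time cache replacement could look like, assuming the framework's shuffled sample order for the epoch is available before training starts. The class and function names (AheadOfTimeCache, access, load_fn) are illustrative and not SAND's actual interface.

```python
# Sketch: look-ahead (Belady/MIN-style) cache replacement, computable here
# because the data loader's access order for the epoch is known in advance.
# Names and structure are assumptions for illustration, not SAND's design.

from collections import defaultdict
from typing import Callable, Hashable, List


class AheadOfTimeCache:
    """On a miss with a full cache, evict the entry whose next access
    lies farthest in the future according to the known access order."""

    def __init__(self, capacity: int, future_order: List[Hashable]):
        self.capacity = capacity
        self.cache = {}  # key -> decoded / preprocessed sample
        # Precompute, for each key, the positions at which it will be accessed.
        self.positions = defaultdict(list)
        for pos, key in enumerate(future_order):
            self.positions[key].append(pos)

    def _next_use(self, key: Hashable, now: int) -> float:
        # Position of the next access of `key` after `now`, or infinity.
        for pos in self.positions[key]:
            if pos > now:
                return pos
        return float("inf")

    def access(self, key: Hashable, now: int, load_fn: Callable):
        """Return the sample for `key`; on a miss, decode it via `load_fn`
        and insert it, evicting the entry that is needed latest."""
        if key not in self.cache:
            if len(self.cache) >= self.capacity:
                victim = max(self.cache, key=lambda k: self._next_use(k, now))
                del self.cache[victim]
            self.cache[key] = load_fn(key)  # e.g. decode the video clip
        return self.cache[key]
```

Because the eviction victim is the sample needed latest, every request that the scheduler has already prefetched can be served from the cache immediately, which is the property the abstract attributes to SAND's ahead-of-time scheduling.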
Keywords
Storage abstraction, Video preprocessing, Computational storage