Quantitative evaluation of deep learning frameworks in heterogeneous computing environment

Zhengxian Lu, Chengkun Du, Yanfeng Jiang, Xueshuo Xie, Tao Li, Fei Yang

CCF Transactions on High Performance Computing (2024)

Abstract
Deep learning frameworks are powerful tools to support model training. They dispatch operators by mapping them onto a series of kernel functions and launching these kernel functions on specialized devices such as GPUs. However, little is known about the performance of the dispatching and mapping mechanisms in different frameworks, although these mechanisms directly affect training time. This paper presents a performance evaluation of various frameworks by examining their kernel function efficiency and operator dispatching mechanisms. We introduce two evaluation metrics, device computing time (DCT) and device occupancy ratio (DOR), based on the device’s active and idle states. To ensure comparable evaluation results, we propose a three-step verification method covering hyper-parameter, model, and updating-method equivalences. Because frameworks differ in their operator implementations, we present an equivalence adjustment method based on the number of operators. Our evaluation results demonstrate the device utilization capability of five frameworks, namely PyTorch, TensorFlow 1, TensorFlow 2, MXNet, and PaddlePaddle, and reveal the potential for further optimizing the training performance of deep learning frameworks.
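The two metrics can be illustrated with a minimal sketch. Assuming DCT is the total time the device spends in an active state (overlapping kernel intervals counted once) and DOR is DCT divided by the wall-clock training span, a computation from recorded kernel start/end timestamps might look like this; the interval-merging step and the function name are illustrative assumptions, not the paper's implementation:

```python
def device_metrics(intervals, wall_time):
    """Sketch of DCT/DOR from kernel timestamps (assumed definitions).

    intervals: list of (start, end) kernel execution times on the device.
    wall_time: total elapsed training time over the same clock.
    """
    # Merge overlapping intervals so concurrent kernels are not double-counted.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    dct = sum(end - start for start, end in merged)  # device computing time
    dor = dct / wall_time                            # device occupancy ratio
    return dct, dor
```

For example, kernels at (0, 2), (1, 3), and (5, 6) over a 10-unit span yield a DCT of 4 and a DOR of 0.4: the device was active 40% of the time, and the remaining idle time reflects dispatching overhead of the kind the paper evaluates.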
Keywords
Deep learning framework, Performance evaluation, Device computing time, Device occupancy ratio