Design of a Quantization-Based DNN Delta Compression Framework for Model Snapshots and Federated Learning

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
Deep neural networks (DNNs) have achieved remarkable success in many fields. However, large-scale DNNs also incur substantial storage costs when snapshots are saved to guard against frequent cluster failures, and significant communication overheads when models are transmitted in Federated Learning (FL). Recently, several approaches, such as Delta-DNN and LC-Checkpoint, have aimed to reduce the snapshot storage size of DNNs by compressing the difference between two neighboring versions of a DNN (a.k.a. the delta). However, we observe that existing approaches, which apply traditional global lossy quantization techniques to DNN delta compression, cannot fully exploit the data similarity because the parameters' value ranges vary across layers. To fully exploit the similarity of the delta model and improve the compression ratio, we propose a quantization-based local-sensitive delta compression approach, named QD-Compressor, built on a layer-based local-sensitive quantization scheme and an error feedback mechanism. Specifically, the quantizers and the number of quantization bits adapt across layers based on the value distribution and weighted entropy of the delta's parameters. To prevent quantization error from degrading the performance of the restored model, an error feedback mechanism dynamically corrects the quantization error during training. Experiments on multiple popular DNNs and datasets show that QD-Compressor obtains a 7x-40x higher compression ratio than state-of-the-art approaches in the model snapshot compression scenario, and achieves an 11x-15x compression ratio on the residual model in the Federated Learning scenario.
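To make the two ideas in the abstract concrete, below is a minimal NumPy sketch of (1) a per-layer uniform quantizer whose bit width adapts to each layer's delta distribution, and (2) error feedback that carries the quantization residual into the next snapshot's delta. The function names (uniform_quantize, choose_bits, compress_snapshot) and the energy-retention heuristic standing in for the paper's weighted-entropy criterion are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def uniform_quantize(delta, bits):
    # Symmetric uniform quantizer fit to THIS layer's value range,
    # rather than one global range shared by all layers.
    qmax = 2 ** (bits - 1) - 1
    scale = max(np.abs(delta).max(), 1e-12) / qmax
    q = np.round(delta / scale).astype(np.int32)
    return q, scale

def choose_bits(delta, candidates=(2, 4, 8), budget=0.95):
    # Heuristic stand-in for the paper's weighted-entropy rule: pick the
    # smallest bit width whose reconstruction retains `budget` of the
    # layer delta's energy.
    energy = np.sum(delta ** 2) + 1e-12
    for b in candidates:
        q, s = uniform_quantize(delta, b)
        err = delta - q * s
        if 1.0 - np.sum(err ** 2) / energy >= budget:
            return b
    return candidates[-1]

def compress_snapshot(curr, prev, feedback):
    """curr, prev, feedback: dicts mapping layer name -> np.ndarray."""
    payload, new_feedback = {}, {}
    for name in curr:
        # Error feedback: fold the residual left over from the previous
        # round into this round's delta before quantizing it.
        delta = curr[name] - prev[name] + feedback.get(name, 0.0)
        bits = choose_bits(delta)
        q, scale = uniform_quantize(delta, bits)
        new_feedback[name] = delta - q * scale   # residual carried forward
        payload[name] = (q, scale, bits)         # then entropy-code q
    return payload, new_feedback
```

In this sketch, restoring a snapshot replays the dequantized deltas (q * scale) on top of an earlier version; the carried-forward residual keeps lossy quantization error from accumulating across successive versions, which is the role the abstract assigns to the error feedback mechanism.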
Keywords
Quantization (signal), Neural networks, Training, Computational modeling, Data models, Data compression, Federated learning, quantization, delta compression, snapshot, distribution learning