M22: Rate-Distortion Inspired Gradient Compression
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Abstract
In federated learning (FL), the communication constraint between the remote users and the Parameter Server (PS) is a crucial bottleneck. This paper proposes M22, a rate-distortion inspired approach to model update compression for distributed training of deep neural networks (DNNs). In particular, (i) we propose a family of distortion measures referred to as "M-magnitude weighted L2" norm, and (ii) we assume that gradient updates follow an i.i.d. distribution with two degrees of freedom – generalized normal and Weibull distributions. To measure the gradient compression performance under a communication constraint, we define the per-bit accuracy as the optimal improvement in accuracy that a bit of communication brings to the centralized model over the training period. Using this performance measure, we systematically benchmark the choice of gradient distributions and the distortion measure. We provide substantial insights on the role of these choices and argue that significant performance improvements can be attained using such a rate-distortion inspired compressor.
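The abstract names the "M-magnitude weighted L2" distortion without giving its formula. As a rough illustration only, the sketch below assumes one plausible reading: each squared reconstruction error is weighted by the magnitude of the original gradient entry raised to the power M, then averaged. The function name `m_weighted_l2`, its arguments, and the averaging convention are hypothetical and not taken from the paper.

```python
def m_weighted_l2(g, g_hat, M):
    """Assumed form of an M-magnitude weighted L2 distortion.

    g      : original gradient entries (sequence of floats)
    g_hat  : reconstructed (compressed) gradient entries
    M      : non-negative weighting exponent (hypothetical parameter name)

    Returns the mean of |g_j|**M * (g_j - g_hat_j)**2 over all entries.
    This formula is an assumption for illustration, not the paper's definition.
    """
    if len(g) != len(g_hat):
        raise ValueError("g and g_hat must have the same length")
    if len(g) == 0:
        raise ValueError("g must be non-empty")
    total = 0.0
    for x, y in zip(g, g_hat):
        # Weight the squared error by the magnitude of the original entry.
        total += abs(x) ** M * (x - y) ** 2
    return total / len(g)


# With M = 0 the weights are all 1, so this reduces to the mean squared error.
plain_mse = m_weighted_l2([1.0, 2.0], [0.0, 0.0], 0)
# With M = 1, errors on larger-magnitude entries are penalized more heavily.
weighted = m_weighted_l2([1.0, 2.0], [0.0, 0.0], 1)
```

Under this assumed form, increasing M shifts the compressor's attention toward large-magnitude gradient entries, which is in the same spirit as the gradient sparsification listed among the key words.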
Key words
Federated learning, Gradient compression, Gradient sparsification, DNN gradient modeling