Gradient Statistics Aware Power Control for Over-the-Air Federated Learning in Fading Channels

ICC Workshops, pp. 1-6, 2020.

Weibo:
This work studies power control optimization for over-the-air federated learning over fading channels, taking gradient statistics into account.

Abstract:

To enable communication-efficient federated learning, fast model aggregation can be designed using over-the-air computation (AirComp). In order to implement a reliable and high-performance AirComp over fading channels, power control at edge devices is crucial. Existing works focus on the traditional data aggregation which often assumes ...
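
As a rough illustration of the AirComp aggregation the abstract refers to, the sketch below simulates devices transmitting their gradients simultaneously so that the channel itself sums them. It is a minimal sketch under stated assumptions, not the authors' implementation: channels are taken as real-valued and known, and the names `aircomp_aggregate`, `eta` (a receiver-side scaling factor) and `noise_std` are hypothetical and do not come from the page.

```python
import math
import random


def aircomp_aggregate(gradients, channel_gains, powers, eta, noise_std, seed=0):
    """Simulate one over-the-air aggregation round (illustrative only).

    gradients:     list of K gradient vectors, each of length D
    channel_gains: list of K real channel magnitudes (assumed known)
    powers:        list of K transmit powers
    eta:           receiver scaling factor (assumed name, not from the source)
    noise_std:     standard deviation of the additive receiver noise
    """
    rng = random.Random(seed)
    num_devices = len(gradients)
    dim = len(gradients[0])
    # Effective amplitude with which each device's gradient reaches the receiver.
    amplitudes = [h * math.sqrt(p) for h, p in zip(channel_gains, powers)]
    estimate = []
    for d in range(dim):
        # Simultaneous transmissions superpose in the multiple-access channel.
        superposed = sum(a * g[d] for a, g in zip(amplitudes, gradients))
        received = superposed + rng.gauss(0.0, noise_std)
        # Rescale the received sum into an estimate of the average gradient.
        estimate.append(received / (math.sqrt(eta) * num_devices))
    return estimate


# With powers chosen so that every amplitude equals sqrt(eta) and with no noise,
# the estimate is the plain average of the device gradients.
average = aircomp_aggregate([[1.0, 2.0], [3.0, 4.0]], [0.5, 2.0], [4.0, 0.25], 1.0, 0.0)
```

When a device's power budget is too small to reach the target amplitude, its gradient enters the sum with a weight below one, which is why the choice of transmit power matters for the aggregation error.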

Introduction
  • The proliferation of mobile devices such as smartphones, tablets, and wearable devices has revolutionized people’s daily lives.
  • It is increasingly desirable for edge devices to take part in the learning process, keeping the collected data local and performing training/inference either collaboratively or individually.
  • This emerging technology is known as Edge Machine Learning [2] or Edge Intelligence [3].
Highlights
  • Optimal power control in special cases: when the gradient squared multivariate coefficient of variation (SMCV) approaches infinity, which can happen as model training converges and/or when the dataset is highly non-identically distributed, there is an optimal threshold on the aggregation capabilities of the devices. Devices below the threshold transmit at full power; devices above it transmit at the power that equalizes the aggregation weights of their gradients to one.
  • We provide experimental results to validate the performance of the proposed power control for AirComp-based federated learning over fading channels.
  • This work studies power control optimization for over-the-air federated learning over fading channels, taking gradient statistics into account.
  • The optimal transmit power on each device is shown to decrease with the gradient squared multivariate coefficient of variation and to increase with the noise variance.
  • We propose an adaptive power control algorithm that dynamically adjusts the transmit power in each iteration based on the estimated gradient statistics.
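
The threshold structure described in the highlights for the special case can be sketched as follows. This is a hedged illustration, not the paper's algorithm: it assumes a device's "aggregation capability" is its power budget times its squared channel gain, and that the threshold is a given value `eta`; both the definition and the names `threshold_power_control`, `p_max` and `eta` are assumptions made here for illustration.

```python
import math


def threshold_power_control(channel_gains, p_max, eta):
    """Threshold-based power control sketch (assumed model, see lead-in).

    Returns the transmit power of each device and the weight with which
    its gradient enters the aggregated sum.
    """
    powers, weights = [], []
    for h in channel_gains:
        # Assumed definition: capability = power budget * squared channel gain.
        capability = p_max * h ** 2
        if capability < eta:
            # Below the threshold: transmit at full power.
            p = p_max
        else:
            # Above the threshold: use just enough power for a weight of one.
            p = eta / h ** 2
        powers.append(p)
        weights.append(h * math.sqrt(p / eta))
    return powers, weights


powers, weights = threshold_power_control([0.5, 1.0, 2.0], p_max=1.0, eta=1.0)
# The weak device (gain 0.5) uses full power and gets a weight below one;
# the strong device (gain 2.0) backs off to power 0.25 and gets weight one.
```

How the threshold itself is chosen is not covered by this sketch; it is treated as an input.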
Results
  • The authors provide experimental results to validate the performance of the proposed power control for AirComp-based FL over fading channels.
Conclusion
  • This work studies power control optimization for over-the-air federated learning over fading channels, taking gradient statistics into account.
  • The optimal power control policy is derived in closed form when the first- and second-order gradient statistics are known.
  • In the special cases where the gradient squared multivariate coefficient of variation β approaches infinity and zero, the optimal transmit power reduces to the threshold-based power control in [17] and to full power transmission, respectively.
  • The authors propose an adaptive power control algorithm that dynamically adjusts the transmit power in each iteration based on the estimation results.
  • Experimental results show that the proposed adaptive power control scheme outperforms the existing schemes.
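
To make the gradient statistic in the conclusion concrete, the sketch below computes a squared multivariate coefficient of variation from a set of device gradients. The page does not give the formula, so this uses one common definition as an assumption: the sum of per-dimension variances divided by the squared norm of the mean. The function name `estimate_smcv` is hypothetical, and computing the statistic from individual device gradients is for illustration only; the page does not say how the authors estimate it.

```python
def estimate_smcv(gradients):
    """Squared multivariate coefficient of variation of K gradient vectors.

    Assumed definition (not stated on the page):
        sum of per-dimension variances / squared norm of the mean vector.
    """
    num_devices = len(gradients)
    dim = len(gradients[0])
    means = [sum(g[d] for g in gradients) / num_devices for d in range(dim)]
    variances = [
        sum((g[d] - means[d]) ** 2 for g in gradients) / num_devices
        for d in range(dim)
    ]
    return sum(variances) / sum(m * m for m in means)


# Identical gradients give a value of zero (the full-power regime in the text);
# the value grows as the gradients disagree more relative to their mean.
beta_same = estimate_smcv([[1.0, 2.0], [1.0, 2.0]])
beta_spread = estimate_smcv([[1.0, 2.0], [3.0, 2.0]])
```

Under this definition the value grows without bound as the mean gradient shrinks toward zero, which is consistent with the remark that a large value can arise when training converges.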
Reference
  • N. Zhang and M. Tao, “Gradient statistics aware power control for over-the-air federated learning in fading channels,” in Proc. IEEE ICC Workshops, 2020, pp. 1–6.
  • J. Park, S. Samarakoon, M. Bennis, and M. Debbah, “Wireless network intelligence at the edge,” Proceedings of the IEEE, vol. 107, no. 11, pp. 2204–2239, 2019.
  • Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, “Edge intelligence: Paving the last mile of artificial intelligence with edge computing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762, 2019.
  • B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial Intelligence and Statistics, 2017, pp. 1273–1282.
  • K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konecny, S. Mazzocchi, H. B. McMahan et al., “Towards federated learning at scale: System design,” arXiv preprint arXiv:1902.01046, 2019.
  • Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, p. 12, 2019.
  • D. Alistarh, D. Grubic, J. Li, R. Tomioka, and M. Vojnovic, “QSGD: Communication-efficient SGD via randomized quantization and encoding,” Advances in Neural Information Processing Systems 30, vol. 3, pp. 1710–1721, 2018.
  • F. Seide, H. Fu, J. Droppo, G. Li, and D. Yu, “1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs,” in Fifteenth Annual Conference of the International Speech Communication Association, 2014.
  • A. F. Aji and K. Heafield, “Sparse communication for distributed gradient descent,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 440–445.
  • Y. Tsuzuku, H. Imachi, and T. Akiba, “Variance-based gradient compression for efficient distributed deep learning,” arXiv preprint arXiv:1802.06058, 2018.
  • S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan, “Adaptive federated learning in resource constrained edge computing systems,” IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1205–1221, 2019.
  • B. Nazer and M. Gastpar, “Computation over multiple-access channels,” IEEE Trans. Inf. Theory, vol. 53, no. 10, pp. 3498–3516, 2007.
  • K. Yang, T. Jiang, Y. Shi, and Z. Ding, “Federated learning via over-the-air computation,” IEEE Trans. Wireless Commun., vol. 19, no. 3, pp. 2022–2035, March 2020.
  • G. Zhu, Y. Wang, and K. Huang, “Broadband analog aggregation for low-latency federated edge learning,” IEEE Trans. Wireless Commun., vol. 19, no. 1, pp. 491–506, Jan. 2020.
  • M. M. Amiri and D. Gunduz, “Machine learning at the wireless edge: Distributed stochastic gradient descent over-the-air,” in Proc. IEEE ISIT, July 2019, pp. 1432–1436.
  • ——, “Federated learning over wireless fading channels,” IEEE Trans. Wireless Commun., pp. 1–1, 2020.
  • X. Cao, G. Zhu, J. Xu, and K. Huang, “Optimal power control for over-the-air computation,” in Proc. IEEE GLOBECOM, Dec. 2019, pp. 1–6.