Towards Optimal Multi-Modal Federated Learning on Non-IID Data with Hierarchical Gradient Blending
IEEE Conference on Computer Communications (INFOCOM), 2022
Abstract
Recent advances in federated learning (FL) have made it feasible to train a machine learning model across multiple clients, even with non-IID data distributions. In contrast to uni-modal models, which have been studied extensively in the literature, there are few in-depth studies on how multi-modal models can be trained effectively with federated learning. Unfortunately, we empirically observe a counter-intuitive phenomenon: compared with its uni-modal counterpart, multi-modal FL leads to a significant degradation in performance. Our in-depth analysis of this phenomenon shows that modality sub-networks and local models can overfit and generalize at different rates. To alleviate these inconsistencies in collaborative learning, we propose hierarchical gradient blending (HGB), which simultaneously computes the optimal blending of modalities and the optimal weighting of local models by adaptively measuring their overfitting and generalization behaviors. We present several important theoretical insights and convergence guarantees for HGB on convex and smooth functions, and evaluate its performance in multi-modal FL. Our experimental results on an extensive array of non-IID multi-modal datasets demonstrate that HGB not only outperforms the best uni-modal baselines but also achieves superior accuracy and convergence speed compared to state-of-the-art frameworks.
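The abstract describes weighting modalities by measuring their overfitting and generalization behaviors. The exact HGB formulas are not given here; below is a minimal, hedged sketch in Python of a gradient-blending-style weight computation based on the commonly used overfitting-to-generalization heuristic (weight proportional to generalization gain divided by the squared increase in the train/validation gap). Function names and the example losses are illustrative assumptions, not the paper's implementation.

```python
# A minimal, illustrative sketch of gradient-blending-style weight computation.
# Assumption: weights follow the overfitting-to-generalization-ratio heuristic;
# the actual HGB formulas in the paper may differ.

from typing import Dict


def blending_weights(
    losses_prev: Dict[str, Dict[str, float]],
    losses_curr: Dict[str, Dict[str, float]],
    eps: float = 1e-8,
) -> Dict[str, float]:
    """Compute per-modality blending weights from train/val losses at two checkpoints.

    losses_prev / losses_curr map a modality name to {"train": float, "val": float}.
    The weight for modality k is proportional to G_k / O_k^2, where
      G_k = drop in validation loss between checkpoints (generalization gain), and
      O_k = growth of the (val - train) gap between checkpoints (overfitting increase).
    """
    raw = {}
    for k in losses_curr:
        gain = losses_prev[k]["val"] - losses_curr[k]["val"]
        gap_prev = losses_prev[k]["val"] - losses_prev[k]["train"]
        gap_curr = losses_curr[k]["val"] - losses_curr[k]["train"]
        overfit = max(gap_curr - gap_prev, eps)
        raw[k] = max(gain, 0.0) / (overfit ** 2)
    total = sum(raw.values()) or eps
    return {k: v / total for k, v in raw.items()}  # normalize to sum to 1


if __name__ == "__main__":
    # Hypothetical losses for two modalities at consecutive checkpoints.
    prev = {"audio": {"train": 1.20, "val": 1.40}, "video": {"train": 1.10, "val": 1.30}}
    curr = {"audio": {"train": 0.90, "val": 1.35}, "video": {"train": 0.80, "val": 1.05}}
    print(blending_weights(prev, curr))  # video, which overfits less, gets the larger weight
```

In this toy example, the video modality improves its validation loss while barely widening its train/validation gap, so it receives most of the blending weight; the audio modality overfits more and is down-weighted accordingly.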
Key words
multi-modal machine learning, federated learning, gradient blending