Towards Fair and Decentralized Federated Learning System for Gradient Boosting Decision Trees


引用 2|浏览7
At present, gradient boosting decision trees (GBDTs) has become a popular machine learning algorithm and has shined in many data mining competitions and real-world applications for its salient results on classification, ranking, prediction, etc. Federated learning which aims to mitigate privacy risks and costs, enables many entities to keep data locally and train a model collaboratively under an orchestration service. However, most of the existing systems often fail to make an excellent trade-off between accuracy and communication. In addition, they overlook an important aspect: fairness such as performance gains from different parties' datasets. In this paper, we propose a novel federated GBDT scheme based on the blockchain which can achieve constant communication overhead and good model performance and quantify the contribution of each party. Specifically, we replace the tree-based communication scheme with the pure gradient-based scheme and compress the intermediate gradient information to a limit to achieve good model performance and constant communication overhead in skewed datasets. On the other hand, we introduce a novel contribution allocation scheme named split Shapley value, which can quantify the contribution of each party with a limited gradient update and provide a basis for monetary reward. Finally, we combine the quantification mechanism with blockchain organically and implement a closed-loop federated GBDT system FGBDT-Chain in a permissioned blockchain environment and conduct a comprehensive experiment on public datasets. The experimental results show that FGBDT-Chain achieves a good trade-off between accuracy, communication overhead, fairness, and security under large-scale skewed datasets.
AI 理解论文
Chat Paper