Model-sharing Games: Analyzing Federated Learning Under Voluntary Participation

AAAI 2021, pp. 5303–5311

Abstract

Federated learning is a setting where agents, each with access to their own data source, combine models learned from local data to create a global model. If agents are drawing their data from different distributions, though, federated learning might produce a biased global model that is not optimal for each agent. This means that agents...

Introduction
  • Imagine a situation as follows: a hospital is trying to evaluate the effectiveness of a certain procedure based on data it has collected from procedures done on patients in their facilities.
  • The authors show that in this game, when the number of data points n is fairly small, the only core-stable coalition structure is to have all players federating together.
  • The error values depend on the number of samples each agent has access to, with the expectation taken over the values of samples each agent draws as well as the possible different true parameters of the data each player is trying to model.
Highlights
  • Imagine a situation as follows: a hospital is trying to evaluate the effectiveness of a certain procedure based on data it has collected from procedures done on patients in their facilities
  • Note that when the number of samples nj is much larger than the dimension of the problem D, the error values in the linear regression case take on exactly the same form as those in mean estimation
  • We have shown that there always exists a stable partition of players into coalitions in the case where players come in two sizes
  • We analyzed one type of federated learning: when the global model is produced by taking the weighted average of the parameters each player calculates on their own data, θf = Σj (nj/N)·θj, where θj is the parameter player j computes on its own data and N = Σi ni (a minimal sketch of this averaging follows this list)
  • We proposed and analyzed two other variants of federated learning that incentivize the formation of larger coalitions
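
To make the averaging step concrete, here is a minimal sketch of sample-count-weighted parameter averaging. It illustrates the formula above; it is not the authors' code, and the function and variable names are invented for this example.

```python
import numpy as np

def federate(local_params, sample_counts):
    """Combine locally estimated parameters by sample-count weighting:
    theta_f = sum_j (n_j / N) * theta_j, with N = sum_i n_i.

    Works for scalar parameters (mean estimation) or parameter vectors
    (linear regression coefficients), since averaging is componentwise.
    """
    local_params = np.asarray(local_params, dtype=float)
    n = np.asarray(sample_counts, dtype=float)
    weights = n / n.sum()
    # Weighted average over the "players" axis (axis 0 of local_params).
    return np.tensordot(weights, local_params, axes=1)

# Hypothetical example: three players' local mean estimates,
# computed from 5, 5, and 25 samples respectively.
print(federate([9.2, 10.5, 10.1], [5, 5, 25]))
```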
Results
  • The authors will first assume that each player has 5 samples from their local data distribution: Table 1 gives the error each player can expect in this situation.
  • In the case where they each have 25 samples, the players minimize their error by being alone.
  • Note that when the number of samples nj is much larger than the dimension of the problem D, the error values in the linear regression case take on exactly the same form as those in mean estimation.
  • The expected MSE of federated estimation for a player with nj samples is μe/N + σ²·(Σi≠j ni² + (Σi≠j ni)²)/N², where N = Σi ni.
  • In the case that all players have the same number of samples, the authors can use ni = n to simplify the error term to μe/(M·n) + σ²·((M−1)·n² + (M−1)²·n²)/(M²·n²) = μe/(M·n) + σ²·(M−1)/M, where M is the number of players (a numeric comparison of these expressions follows this list).
  • In the case that players are indifferent between any arrangement, for any partition π and any competing coalition C, all players would be indifferent between π and C, so π is core stable.
  • The first case is when ns is fairly large: it turns out that each player minimizes their error by using local estimation, which implies that πl is in the core.
  • The authors have shown that there always exists a stable partition of players into coalitions in the case where players come in two sizes.
  • The authors analyzed one type of federated learning: when the global model is produced by taking the weighted average of the parameters each player calculates on their own data, θf = Σj (nj/N)·θj, where N = Σi ni.
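
The two error expressions above can be compared numerically. The sketch below assumes the mean-estimation setting with sample-count-weighted federation and the table parameters μe = 10, σ² = 1; the federated-error function follows the expression quoted above, so treat it as an illustrative reconstruction rather than a verbatim transcription of the paper's theorem.

```python
import numpy as np

MU_E = 10.0   # expected within-player variance (table parameter mu_e = 10)
SIGMA2 = 1.0  # variance of the true means across players (table parameter sigma^2 = 1)

def local_error(n_j):
    """Expected MSE of player j estimating its mean from its own n_j samples."""
    return MU_E / n_j

def federated_error(j, counts):
    """Expected MSE of player j under sample-count-weighted federation
    with every player listed in `counts` (a coalition of size >= 1)."""
    counts = np.asarray(counts, dtype=float)
    N = counts.sum()
    others = np.delete(counts, j)
    return MU_E / N + SIGMA2 * (np.sum(others**2) + np.sum(others)**2) / N**2

for n in (5, 25):
    counts = [n, n, n]  # three players with equal sample counts
    print(f"n = {n:2d}: local = {local_error(n):.3f}, "
          f"grand coalition = {federated_error(0, counts):.3f}")

# With mu_e = 10 and sigma^2 = 1 this prints approximately:
#   n =  5: local = 2.000, grand coalition = 1.333   (federating is better)
#   n = 25: local = 0.400, grand coalition = 0.800   (going alone is better)
```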
Conclusion
  • For mean estimation with the v-weighting federated learning method, the authors derive an exact expression for the expected MSE of a player with nj samples.
  • This contradicts the use of vj as an optimal weighting, so it cannot be the case that any player gets lower error in a different coalition.
  • It could be interesting to compute exact or approximate error values for cases beyond mean estimation and linear regression.
Tables
  • Table 1: The expected errors of players in each coalition when all three players have 5 samples each, with parameters μe = 10, σ² = 1. Each row denotes a different coalition partition: for example, {a, b}{c} indicates that players a and b are federating together while c is alone. Coalitions that are identical up to renaming of players are omitted.
  • Table 2: The expected errors of players in each coalition when players a and b have 5 samples each and player c has 25 samples, with parameters μe = 10, σ² = 1.
  • Table 3: The expected errors of players in each coalition when players a, b, and c each have 25 samples, with parameters μe = 10, σ² = 1. (A sketch that regenerates comparisons of this form follows this list.)
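
A table of this form can be regenerated by enumerating all partitions of the three players under the same assumed error model as the earlier sketch. The core-stability test below uses the common hedonic-game convention that a blocking coalition must make every one of its members strictly better off; players a, b, c are encoded as indices 0, 1, 2, and the printed numbers come from the assumed model rather than being copied from the paper's tables.

```python
from itertools import combinations

MU_E, SIGMA2 = 10.0, 1.0   # table parameters mu_e = 10, sigma^2 = 1

def err(j, coalition, counts):
    """Expected MSE of player j when federating with `coalition`
    (an iterable of player indices that includes j), under
    sample-count-weighted parameter averaging."""
    N = sum(counts[i] for i in coalition)
    others = [counts[i] for i in coalition if i != j]
    return MU_E / N + SIGMA2 * (sum(n * n for n in others) + sum(others) ** 2) / N ** 2

def partitions(players):
    """Yield every partition of `players` into disjoint coalitions."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for smaller in partitions(rest):
        for i, block in enumerate(smaller):
            yield smaller[:i] + [block + [first]] + smaller[i + 1:]
        yield [[first]] + smaller

def is_core_stable(partition, counts):
    """True if no coalition exists in which every member is strictly better off."""
    players = [p for block in partition for p in block]
    current = {j: err(j, block, counts) for block in partition for j in block}
    for size in range(1, len(players) + 1):
        for C in combinations(players, size):
            if all(err(j, C, counts) < current[j] - 1e-12 for j in C):
                return False
    return True

counts = {0: 5, 1: 5, 2: 25}   # players a, b, c with 5, 5, and 25 samples (Table 2 setup)
for part in partitions([0, 1, 2]):
    errors = {j: round(err(j, block, counts), 3) for block in part for j in block}
    print(part, errors, "core-stable" if is_core_stable(part, counts) else "")
```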
Related Work
  • Incentives and federated learning: Blum et al (2017) describes an approach to handling heterogeneous data in which more samples are iteratively gathered from each agent so that all agents are incentivized to participate in the grand coalition during federated learning. Duan et al (2021) builds a framework to schedule data augmentation and resampling. Yu, Bagdasaryan, and Shmatikov (2020) demonstrates empirically that there can be cases where individuals get lower error with local training than with federated learning, and evaluates empirical solutions. Wang et al (2020) analyzes the question of when it makes sense to split or not to split datasets drawn from different distributions. Finally, Blum et al (2020) analyzes notions of envy and efficiency with respect to sampling allocations in federated learning.

    Transfer learning: Mansour et al (2020) and Deng, Kamani, and Mahdavi (2020) both propose theoretical methods for using transfer learning to minimize error provided to agents with heterogeneous data. Li et al (2019) and Martinez, Bertran, and Sapiro (2020) both provide methods to produce a more uniform level of error rates across agents participating in federated learning.
Funding
  • This work was supported in part by a Simons Investigator Award, a Vannevar Bush Faculty Fellowship, a MURI grant, AFOSR grant FA9550-19-1-0183, grants from the ARO and the MacArthur Foundation, and NSF grant DGE-1650441
Study Subjects and Analysis
samples: 5
We will discuss these more in future sections, including how to handle cases where they may be imperfectly known, but for now we will take them to be fixed. We will first assume that each player has 5 samples from their local data distribution: Table 1 gives the error each player can expect in this situation. Note that because the players have the same number of samples, players in identical situations have identical errors

samples: 25
It is also not individually stable because of the same reason: player b could leave its coalition to join {a}, and the resulting set {a, b} leads to a reduction in both of their errors. Finally, we will assume that all three players have 25 samples: this example is shown in Table 3. As in Table 1, the players have identical preferences

samples: 5
As in Table 1, the players have identical preferences. However, in the case where they each had 5 samples, they minimized their error by being together. In the case where they each have 25 samples, the players minimize their error by being alone

samples: 25
However, in the case where they each had 5 samples, they minimized their error by being together. In the case where they each have 25 samples, the players minimize their error by being alone. In later sections we will give theoretical results that explain this example more fully, but understanding the core-stable partitions here will help to build intuition for more general results (the deviation by player b discussed above is checked numerically in the sketch below)
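
The specific deviation described above, player b abandoning c to join a, can be checked with the same assumed error model as the earlier sketches; the numbers printed below are computed from that model, not read off Table 2.

```python
MU_E, SIGMA2 = 10.0, 1.0                 # table parameters mu_e = 10, sigma^2 = 1
counts = {"a": 5, "b": 5, "c": 25}       # sample counts from the Table 2 scenario

def err(j, coalition):
    """Expected MSE of player j when federating with `coalition` (which contains j)."""
    N = sum(counts[i] for i in coalition)
    others = [counts[i] for i in coalition if i != j]
    return MU_E / N + SIGMA2 * (sum(n * n for n in others) + sum(others) ** 2) / N ** 2

# Partition {b, c}{a}: b federates with c, while a estimates alone.
print("b in {b,c}:", round(err("b", {"b", "c"}), 3), "   a in {a}:", round(err("a", {"a"}), 3))
# If b leaves c and joins a, forming {a, b}, both a's and b's errors drop,
# so the partition {b, c}{a} is neither core-stable nor individually stable.
print("b in {a,b}:", round(err("b", {"a", "b"}), 3), "   a in {a,b}:", round(err("a", {"a", "b"}), 3))
```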

References
  • Abu-Mostafa, Y.; Lin, H.; and Magdon-Ismail, M. 2012. Learning from Data: A Short Course. AMLBook.
  • Anderson, T. W. 1962. An Introduction to Multivariate Statistical Analysis. Technical report, Wiley New York.
  • Blum, A.; Haghtalab, N.; Procaccia, A. D.; and Qiao, M. 2017. Collaborative PAC Learning. In Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; and Garnett, R., eds., Advances in Neural Information Processing Systems. URL http://papers.nips.cc/paper/6833-collaborative-pac-learning.pdf.
  • Blum, A.; Haghtalab, N.; Shao, H.; and Phillips, R. L. 2020.
  • Bogomolnaia, A.; and Jackson, M. O. 2002. The Stability of Hedonic Coalition Structures. Games and Economic Behavior 38(2): 201–230. doi:10.1006/game.2001.0877. URL http://www.sciencedirect.com/science/article/pii/S0899825601908772.
  • Casella, G. 1992. Illustrating empirical Bayes methods. Chemometrics and Intelligent Laboratory Systems 16(2): 107–125.
  • Deng, Y.; Kamani, M. M.; and Mahdavi, M. 2020. Adaptive Personalized Federated Learning.
  • Duan, M.; et al. 2021. Self-Balancing Federated Learning With Global Imbalanced Data in Mobile Systems.
  • Efron, B.; and Morris, C. 1977. Stein's paradox in statistics. Scientific American 236(5): 119–127.
  • Guazzone, M.; Anglano, C.; and Sereno, M. 2014. A Game-Theoretic Approach to Coalition Formation in Green Cloud Federations. In 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. doi:10.1109/ccgrid.2014.37. URL http://dx.doi.org/10.1109/CCGrid.2014.37.
  • Kairouz, P.; McMahan, H. B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A. N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; D'Oliveira, R. G. L.; Rouayheb, S. E.; Evans, D.; Gardner, J.; Garrett, Z.; Gascon, A.; Ghazi, B.; Gibbons, P. B.; Gruteser, M.; Harchaoui, Z.; He, C.; He, L.; Huo, Z.; Hutchinson, B.; Hsu, J.; Jaggi, M.; Javidi, T.; Joshi, G.; Khodak, M.; Konecny, J.; Korolova, A.; Koushanfar, F.; Koyejo, S.; Lepoint, T.; Liu, Y.; Mittal, P.; Mohri, M.; Nock, R.; Ozgur, A.; Pagh, R.; Raykova, M.; Qi, H.; Ramage, D.; Raskar, R.; Song, D.; Song, W.; Stich, S. U.; Sun, Z.; Suresh, A. T.; Tramer, F.; Vepakomma, P.; Wang, J.; Xiong, L.; Xu, Z.; Yang, Q.; Yu, F. X.; Yu, H.; and Zhao, S. 2019. Advances and Open Problems in Federated Learning.
  • Li, T.; Sahu, A. K.; Talwalkar, A.; and Smith, V. 2020. Federated Learning: Challenges, Methods, and Future Directions. IEEE Signal Processing Magazine 37(3): 50–60. doi:10.1109/msp.2020.2975749. URL http://dx.doi.org/10.1109/MSP.2020.2975749.
  • Li, T.; Sanjabi, M.; Beirami, A.; and Smith, V. 2019. Fair Resource Allocation in Federated Learning.
  • Mansour, Y.; Mohri, M.; Ro, J.; and Suresh, A. T. 2020. Three Approaches for Personalization with Applications to Federated Learning.
  • Martinez, N.; Bertran, M.; and Sapiro, G. 2020. Minimax Pareto Fairness: A Multi Objective Perspective. In Proceedings of the 37th International Conference on Machine Learning, Proceedings of Machine Learning Research. PMLR. URL https://proceedings.icml.cc/static/paper_files/icml/2020/1084-Paper.pdf.
  • Morris, C. N. 1986. Empirical Bayes: a frequency–Bayes compromise. Lecture Notes–Monograph Series, Institute of Mathematical Statistics. doi:10.1214/lnms/1215540299. URL https://doi.org/10.1214/lnms/1215540299.
  • Paquay, P. 2018. Learning-from-data-Solutions. URL https://github.com/ppaquay/Learning-from-Data-Solutions.
  • Sattler, F.; Muller, K.-R.; and Samek, W. 2020. Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints. IEEE Transactions on Neural Networks and Learning Systems 1–13. doi:10.1109/tnnls.2020.3015958. URL http://dx.doi.org/10.1109/TNNLS.2020.3015958.
  • Sellentin, E.; and Heavens, A. F. 2016. Parameter inference with estimated covariance matrices. Monthly Notices of the Royal Astronomical Society: Letters 456(1): L132–L136. doi:10.1093/mnrasl/slv190. URL https://doi.org/10.1093/mnrasl/slv190.
  • Shlezinger, N.; Rini, S.; and Eldar, Y. C. 2020. The Communication-Aware Clustered Federated Learning Problem. In 2020 IEEE International Symposium on Information Theory (ISIT), 2610–2615.
  • Wang, H.; Hsu, H.; Diaz, M.; and Calmon, F. P. 2020. To Split or Not to Split: The Impact of Disparate Treatment in Classification.
  • Yu, T.; Bagdasaryan, E.; and Shmatikov, V. 2020. Salvaging Federated Learning by Local Adaptation.
  • Parametric empirical Bayes (Morris 1986; Casella 1992) is frequently described as an intermediate between these two viewpoints. Similar to the hierarchical Bayesian viewpoint, it assumes data is drawn Yi ∼ D(Y|θi), with parameter θi drawn θi ∼ Θi(θ|λi). However, it differs in that it estimates λi from the data, producing an estimate λ̂i. This estimate of the hyperparameter is used, along with the data, to estimate θi.
  • A related example is the James–Stein estimator (Efron and Morris 1977). The estimator assumes the following process: each of m players draws a single sample from a normal distribution with variance s². (A minimal simulation sketch follows this list.)
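
Since the James–Stein estimator is only background here, a minimal simulation may still make the shrinkage idea concrete. The sketch below is the textbook estimator that shrinks the per-player samples toward zero under a known variance s²; it is standard material rather than a construction from this paper, and the values of m, the spread of the true means, and the random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m, s2 = 50, 1.0                        # m players, known sampling variance s^2
theta = rng.normal(0.0, 2.0, size=m)   # each player's true mean (spread chosen arbitrarily)
y = rng.normal(theta, np.sqrt(s2))     # one sample per player

mle = y                                          # estimate each mean by its own sample
js = (1.0 - (m - 2) * s2 / np.sum(y ** 2)) * y   # James-Stein shrinkage toward zero

print("total squared error, per-player sample:", round(float(np.sum((mle - theta) ** 2)), 2))
print("total squared error, James-Stein:      ", round(float(np.sum((js - theta) ** 2)), 2))
# For m >= 3 the James-Stein estimator achieves lower *total* expected squared
# error than using each player's own sample, even though the means are unrelated.
```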
Authors
Kate Donahue