Shapley Value for Federated Learning: A Distributed and Fair Framework

Document Type: Research Article

Authors

1 Information Systems and Security Lab. (ISSL), Sharif University of Technology, Tehran, Iran

2 Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran

Abstract
In a federated learning system, the objective is to train a global model over distributed datasets without centralizing the data on a single unit. Each data owner trains a local model on its own dataset, and the local models are then aggregated, so the raw datasets remain private. To incentivize clients to participate actively in the learning process, fairness-aware federated learning techniques can be employed. One such approach uses the Shapley value (SV), evaluated on an additional dataset, to quantify how much each locally trained model contributes to the global model, and rewards clients according to their contributions. However, computing the Shapley value is challenging because of its high computational complexity. To address this, we present a contribution-based federated learning method that efficiently computes the contribution of each locally trained model by distributing the additional dataset among processing nodes in a privacy-preserving manner and calculating the Shapley value across those nodes.
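To make the cost of the Shapley value concrete, the following is a minimal sketch of the exact computation for three clients. It is not the method proposed in the article: the client names, the `shapley_values` helper, and the `TOY_ACCURACY` table (standing in for the accuracy on the additional dataset of the model aggregated from each coalition) are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(clients, utility):
    """Exact Shapley value of each client for a coalition utility function."""
    n = len(clients)
    values = {}
    for c in clients:
        others = [x for x in clients if x != c]
        total = 0.0
        for size in range(n):
            # Weight given to coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal gain from adding client c to this coalition.
                total += weight * (utility(coalition | {c}) - utility(coalition))
        values[c] = total
    return values


# Hypothetical accuracies (assumed, not from the article) of the model
# aggregated from each coalition, measured on the additional dataset.
TOY_ACCURACY = {
    frozenset(): 0.0,
    frozenset({"A"}): 0.5,
    frozenset({"B"}): 0.5,
    frozenset({"C"}): 0.25,
    frozenset({"A", "B"}): 0.75,
    frozenset({"A", "C"}): 0.625,
    frozenset({"B", "C"}): 0.625,
    frozenset({"A", "B", "C"}): 0.875,
}


def toy_utility(coalition):
    return TOY_ACCURACY[frozenset(coalition)]


contributions = shapley_values(["A", "B", "C"], toy_utility)
for client, value in contributions.items():
    print(client, round(value, 4))
```

In this toy table the contributions add up to the accuracy of the full coalition, and clients A and B, which are interchangeable, receive equal shares. The nested loops evaluate the utility on every subset of the other clients, so the number of model evaluations grows exponentially with the number of clients, which is the computational burden the abstract refers to.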

Keywords

