Federated learning is a novel framework that enables resource-constrained edge devices to jointly learn a model, addressing the problems of data protection and data islands. However, standard federated learning is vulnerable to Byzantine attacks, which can cause the global model to be manipulated by an attacker or fail to converge. On non-IID data, current methods are not effective in defending against Byzantine attacks. In this paper, we propose a Byzantine-robust framework for federated learning via credibility assessment on non-IID data (BRCA). Credibility assessment detects Byzantine attacks by combining an adaptive anomaly detection model with data verification. Specifically, an adaptive mechanism is incorporated into the anomaly detection model for its training and prediction. Simultaneously, a unified update algorithm is given to guarantee that the global model has a consistent direction. Our experiments demonstrate that, on non-IID data, BRCA is more robust to Byzantine attacks than conventional methods.
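The idea of credibility assessment can be illustrated with a minimal sketch: each client update receives a credibility score that combines an anomaly score (here, distance from the coordinate-wise median of all updates) with a data-verification score (here, accuracy on a small server-held validation set), and aggregation is weighted by credibility. The function name, the exponential scoring rule, and the fallback behavior below are illustrative assumptions, not the paper's exact BRCA algorithm.

```python
import numpy as np

def credibility_weighted_aggregate(updates, val_accuracies, tau=2.0):
    """Aggregate client updates, down-weighting suspected Byzantine ones.

    updates        : list of flattened model updates, shape (clients, params)
    val_accuracies : per-client accuracy on a server-held validation set
    tau            : temperature controlling how sharply anomalies are penalized
    """
    updates = np.asarray(updates, dtype=float)
    median = np.median(updates, axis=0)                # robust reference point
    dists = np.linalg.norm(updates - median, axis=1)   # anomaly score per client
    scale = np.median(dists) + 1e-12                   # typical honest deviation
    anomaly_cred = np.exp(-dists / (tau * scale))      # far-away updates -> low credibility
    cred = anomaly_cred * np.asarray(val_accuracies)   # combine with data verification
    if cred.sum() == 0:
        return median                                  # fall back to the robust median
    weights = cred / cred.sum()
    return weights @ updates                           # credibility-weighted average
```

For example, with three honest clients sending updates near `[1, 1]` and one Byzantine client sending `[10, -10]` with poor validation accuracy, the aggregate stays close to `[1, 1]` because the outlier's credibility is driven toward zero.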
Citation: Kun Zhai, Qiang Ren, Junli Wang, Chungang Yan. Byzantine-robust federated learning via credibility assessment on non-IID data[J]. Mathematical Biosciences and Engineering, 2022, 19(2): 1659-1676. doi: 10.3934/mbe.2022078