The deep integration of edge computing and artificial intelligence (AI) in Internet of Things (IoT)-enabled smart cities has given rise to new edge AI paradigms that are more vulnerable to attacks such as data poisoning, model poisoning, and evasion. This work proposes an online poisoning attack framework for the edge AI environment of IoT-enabled smart cities. To account for limited storage space, the framework uses a rehearsal-based buffer mechanism that manipulates the model by incrementally polluting the sample data stream as it arrives at an appropriately sized cache. A maximum-gradient-based sample selection strategy is presented, which converts the traversal of historical sample gradients into an online iterative computation and thereby overcomes the periodic overwriting of the sample data cache after training. Additionally, a maximum-loss-based sample pollution strategy is proposed to address the limitation that each poisoning sample is updated only once in basic online attacks, transforming the bi-level optimization problem from offline mode to online mode. Finally, the proposed online gray-box poisoning attack algorithms are implemented and evaluated on edge devices of IoT-enabled smart cities using an online data stream simulated from offline open-grid datasets. The results show that the proposed method outperforms the existing baseline methods in both attack effectiveness and overhead.
Citation: Yanxu Zhu, Hong Wen, Jinsong Wu, Runhui Zhao. Online data poisoning attack against edge AI paradigm for IoT-enabled smart city[J]. Mathematical Biosciences and Engineering, 2023, 20(10): 17726-17746. doi: 10.3934/mbe.2023788
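The three ingredients named in the abstract (a fixed-size rehearsal buffer, maximum-gradient sample selection, and maximum-loss sample pollution) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a linear regression victim trained by SGD with squared loss, features scaled to [0, 1], and a single signed gradient-ascent step as the pollution rule. All function names and parameters (`poison_stream`, `pollute`, `eps`, `buffer_size`, and so on) are hypothetical and chosen only for illustration.

```python
from collections import deque


def predict(w, x):
    """Linear model output w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(w, x, y):
    """Squared loss 0.5 * (w . x - y)^2 for one sample."""
    r = predict(w, x) - y
    return 0.5 * r * r


def grad_norm(w, x, y):
    """Euclidean norm of the per-sample weight gradient (w . x - y) * x."""
    r = predict(w, x) - y
    return abs(r) * sum(xi * xi for xi in x) ** 0.5


def sign(v):
    return (v > 0) - (v < 0)


def pollute(w, x, y, eps=0.1, lo=0.0, hi=1.0):
    """One signed gradient-ascent step on the loss w.r.t. the features.

    The loss gradient w.r.t. x is (w . x - y) * w; each feature moves by
    eps in that direction and is clipped to the assumed range [lo, hi].
    """
    r = predict(w, x) - y
    return [min(hi, max(lo, xi + eps * sign(r * wi))) for wi, xi in zip(w, x)]


def sgd_step(w, x, y, lr=0.05):
    """One SGD update of the victim model on a single sample."""
    r = predict(w, x) - y
    return [wi - lr * r * xi for wi, xi in zip(w, x)]


def poison_stream(stream, w, buffer_size=4):
    """Assumed attack loop: buffer, select, pollute, then let the victim train."""
    buf = deque(maxlen=buffer_size)  # oldest samples are overwritten when full
    for x, y in stream:
        buf.append((list(x), y))
        # Selection: the buffered sample with the largest gradient norm.
        k = max(range(len(buf)), key=lambda i: grad_norm(w, *buf[i]))
        xk, yk = buf[k]
        # Pollution: push the selected sample toward higher loss.
        buf[k] = (pollute(w, xk, yk), yk)
        # The victim trains on the (now partly polluted) buffer.
        for xb, yb in buf:
            w = sgd_step(w, xb, yb)
    return w, list(buf)
```

Because a sample stays in the buffer until it is overwritten, it can be selected and polluted again in later rounds, which loosely mirrors the abstract's point about samples being updated more than once. The gradient norms here are recomputed over the buffer each round; the online iterative computation described in the abstract is not reproduced.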