Enhancing facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms

Muhammad Ahmad Nawaz Ul Ghani; Kun She; Muhammad Usman Saeed; Naila Latif

doi:10.3934/era.2024103

Electronic Research Archive

2024, Volume 32, Issue 4: 2267-2285.

Research article

1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
2. School of Computer Science and Engineering, Central South University, Changsha 410083, China
3. School of Telecommunications Engineering, Xidian University, Xi'an 710071, China

Received: 15 February 2024; Revised: 02 March 2024; Accepted: 14 March 2024; Published: 21 March 2024

Advances in facial recognition technology call for robust solutions to real-world challenges such as lighting variations and differences in facial position. We introduce a deep neural network framework that improves facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms. Building on techniques from FaceNet and incorporating atrous spatial pyramid pooling and squeeze-excitation modules, our approach achieves accuracy above 99% even under challenging conditions. Through experiments and ablation studies, we demonstrate the contribution of each component, with notable improvements in noise resilience and recall rates. We also introduce the Feature Generative Spatial Attention Adversarial Network (FFSSA-GAN) model, which performs well across various domains and datasets. Looking forward, our research emphasizes the importance of ethical considerations and transparent methodologies in facial recognition technology, supporting responsible deployment in the security, healthcare, and retail industries.
Keywords:
- facial recognition
- feature fusion
- spatial attention networks
- multi-scale feature extraction
- GAN
- spoof detection
Citation: Muhammad Ahmad Nawaz Ul Ghani, Kun She, Muhammad Usman Saeed, Naila Latif. Enhancing facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms[J]. Electronic Research Archive, 2024, 32(4): 2267-2285. doi: 10.3934/era.2024103
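
The abstract names squeeze-excitation modules as one building block of the framework. As a rough illustration of what such a module computes, the sketch below follows the generic squeeze-and-excitation recipe (global average pooling per channel, a small two-layer gating network, then channel-wise rescaling) in plain Python. It is not the authors' implementation: the function names, the toy feature map, and the weights `w_reduce` and `w_expand` are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def squeeze(feature_map):
    """Squeeze step: global average pooling, one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(weights, vector):
    """Bias-free fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def excite(descriptor, w_reduce, w_expand):
    """Excitation step: reduce -> ReLU -> expand -> sigmoid gate per channel."""
    hidden = [max(0.0, h) for h in dense(w_reduce, descriptor)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(w_expand, hidden)]


def se_block(feature_map, w_reduce, w_expand):
    """Rescale every channel of the feature map by its learned gate."""
    gates = excite(squeeze(feature_map), w_reduce, w_expand)
    return [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]


# Toy input (hypothetical): 2 channels, each a 2x2 spatial map.
feature_map = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0
    [[2.0, 4.0], [6.0, 8.0]],  # channel 1
]
# Hypothetical weights; in a trained network these would be learned.
w_reduce = [[0.5, 0.5]]     # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]  # 1 hidden unit -> 2 channel gates

descriptor = squeeze(feature_map)
gates = excite(descriptor, w_reduce, w_expand)
recalibrated = se_block(feature_map, w_reduce, w_expand)
```

With these toy weights the first channel receives a gate close to 1 and the second a gate close to 0, so the block emphasizes one channel and suppresses the other while leaving the spatial layout unchanged. In a real network the gating weights are trained jointly with the rest of the model.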

© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)