Research article Special Issues

Enhancing facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms

  • Received: 15 February 2024; Revised: 02 March 2024; Accepted: 14 March 2024; Published: 21 March 2024
  • Facial recognition in real-world scenarios requires solutions that remain robust to challenges such as lighting variations and differences in facial position. We introduce a novel deep neural network framework that enhances facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms. Building on techniques from FaceNet and incorporating atrous spatial pyramid pooling and squeeze-excitation modules, our approach achieves accuracy above 99%, even under challenging conditions. Through experiments and ablation studies, we demonstrate the contribution of each component, with notable improvements in noise resilience and recall rates. Moreover, the proposed Feature Generative Spatial Attention Adversarial Network (FFSSA-GAN) model performs strongly across various domains and datasets. Looking forward, our research emphasizes the importance of ethical considerations and transparent methodologies in facial recognition technology, supporting responsible deployment in the security, healthcare, and retail industries.

    Citation: Muhammad Ahmad Nawaz Ul Ghani, Kun She, Muhammad Usman Saeed, Naila Latif. Enhancing facial recognition accuracy through multi-scale feature fusion and spatial attention mechanisms[J]. Electronic Research Archive, 2024, 32(4): 2267-2285. doi: 10.3934/era.2024103
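The abstract names two building blocks, atrous spatial pyramid pooling and squeeze-excitation modules, without showing how they combine. The following is a minimal sketch of the general idea only, not the authors' implementation: it works on a 1-D signal instead of face images, uses hand-picked weights instead of trained ones, and every function name, rate, and weight here (`dilated_conv1d`, `aspp_1d`, `squeeze_excite`, `w1`, `w2`) is an assumption made for illustration.

```python
import math

def dilated_conv1d(signal, kernel, rate):
    """Zero-padded 1-D atrous convolution (cross-correlation form, as is
    conventional in deep learning): kernel taps are spaced `rate` apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out

def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Multi-scale fusion: one dilated response per rate, stacked as channels.
    The rates are illustrative, not taken from the paper."""
    return [dilated_conv1d(signal, kernel, r) for r in rates]

def squeeze_excite(channels, w1, w2):
    """Squeeze-and-excitation gating over a list of channels.
    w1 has shape hidden x C and w2 has shape C x hidden (hypothetical weights)."""
    # Squeeze: global average pooling, one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every channel by its gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return scaled, gates

# Toy usage: three receptive-field sizes fused, then reweighted per channel.
signal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
features = aspp_1d(signal, kernel=[1.0, 1.0, 1.0])
w1 = [[0.5, 0.5, 0.5]]        # C = 3 channels -> 1 hidden unit
w2 = [[1.0], [0.0], [-1.0]]   # 1 hidden unit -> 3 channel gates
fused, gates = squeeze_excite(features, w1, w2)
```

In this toy setup each dilation rate sees the signal at a different scale, and the gate values decide how much each scale contributes to the fused output. In a trained network the gate weights would be learned rather than fixed by hand.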

  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)