Research article

Lightweight high-performance pose recognition network: HR-LiteNet

  • Received: 16 October 2023 Revised: 27 December 2023 Accepted: 11 January 2024 Published: 29 January 2024
  • To address the limited resources of mobile devices and embedded platforms, we propose a lightweight pose recognition network named HR-LiteNet. Built upon a high-resolution architecture, the network incorporates depthwise separable convolutions, Ghost modules, and the Convolutional Block Attention Module to construct L_block and L_basic modules, reducing network parameters and computational complexity while maintaining high accuracy. Experimental results demonstrate that on the MPII validation dataset, HR-LiteNet achieves an accuracy of 83.643% while reducing the parameter count by approximately 26.58 M and lowering computational complexity by 8.04 GFLOPs compared to the HRNet network. Moreover, HR-LiteNet outperforms other lightweight models in parameter count and computational requirements while maintaining high accuracy. This design provides a novel solution for pose recognition in resource-constrained environments, striking a balance between accuracy and model compactness.

    Citation: Zhiming Cai, Liping Zhuang, Jin Chen, Jinhua Jiang. Lightweight high-performance pose recognition network: HR-LiteNet[J]. Electronic Research Archive, 2024, 32(2): 1145-1159. doi: 10.3934/era.2024055
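The abstract attributes the parameter savings to depthwise separable convolutions and Ghost modules. The arithmetic behind those savings can be sketched as below; the layer sizes and Ghost hyperparameters are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

def ghost_module_params(k: int, c_in: int, c_out: int,
                        s: int = 2, d: int = 3) -> int:
    """GhostNet-style module: a primary conv produces c_out // s
    intrinsic feature maps; cheap d x d depthwise ops generate the rest."""
    intrinsic = c_out // s
    primary = k * k * c_in * intrinsic
    cheap = d * d * intrinsic * (s - 1)
    return primary + cheap

if __name__ == "__main__":
    # Hypothetical layer size, chosen only to show the ratio.
    k, c_in, c_out = 3, 256, 256
    std = standard_conv_params(k, c_in, c_out)        # 589824
    dws = depthwise_separable_params(k, c_in, c_out)  # 67840
    ghost = ghost_module_params(k, c_in, c_out)
    print(std, dws, ghost, round(std / dws, 1))
```

For this hypothetical 3 x 3, 256-channel layer, the depthwise separable variant needs roughly an eighth of the weights of a standard convolution, which is the kind of per-layer saving that accumulates into the reported ~26.58 M parameter reduction.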







  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
