Research article Special Issues

Multi-scale feature fusion for pavement crack detection based on Transformer


  • Received: 10 June 2023 Revised: 03 July 2023 Accepted: 04 July 2023 Published: 11 July 2023
  • Automated pavement crack image segmentation presents a significant challenge due to the difficulty in detecting slender cracks on complex pavement backgrounds, as well as the significant impact of lighting conditions. In this paper, we propose a novel approach for automated pavement crack detection using a multi-scale feature fusion network based on the Transformer architecture, leveraging an encoding-decoding structure. In the encoding phase, the Transformer is leveraged as a substitute for the convolution operation, which utilizes global modeling to enhance feature extraction capabilities and address long-distance dependence. Then, dilated convolution is employed to increase the receptive field of the feature map while maintaining resolution, thereby further improving context information acquisition. In the decoding phase, the linear layer is employed to adjust the length of feature sequence output by different encoder block, and the multi-scale feature map is obtained after dimension conversion. Detailed information of cracks can be restored by fusing multi-scale features, thereby improving the accuracy of crack detection. Our proposed method achieves an F1 score of 70.84% on the Crack500 dataset and 84.50% on the DeepCrack dataset, which are improvements of 1.42% and 2.07% over the state-of-the-art method, respectively. The experimental results show that the proposed method has higher detection accuracy, better generalization and better crack detection results can be obtained under both high and low brightness conditions.

    Citation: Yalong Yang, Zhen Niu, Liangliang Su, Wenjing Xu, Yuanhang Wang. Multi-scale feature fusion for pavement crack detection based on Transformer[J]. Mathematical Biosciences and Engineering, 2023, 20(8): 14920-14937. doi: 10.3934/mbe.2023668

    Related Papers:

  • Automated pavement crack image segmentation presents a significant challenge due to the difficulty in detecting slender cracks on complex pavement backgrounds, as well as the significant impact of lighting conditions. In this paper, we propose a novel approach for automated pavement crack detection using a multi-scale feature fusion network based on the Transformer architecture, leveraging an encoding-decoding structure. In the encoding phase, the Transformer is leveraged as a substitute for the convolution operation, which utilizes global modeling to enhance feature extraction capabilities and address long-distance dependence. Then, dilated convolution is employed to increase the receptive field of the feature map while maintaining resolution, thereby further improving context information acquisition. In the decoding phase, the linear layer is employed to adjust the length of feature sequence output by different encoder block, and the multi-scale feature map is obtained after dimension conversion. Detailed information of cracks can be restored by fusing multi-scale features, thereby improving the accuracy of crack detection. Our proposed method achieves an F1 score of 70.84% on the Crack500 dataset and 84.50% on the DeepCrack dataset, which are improvements of 1.42% and 2.07% over the state-of-the-art method, respectively. The experimental results show that the proposed method has higher detection accuracy, better generalization and better crack detection results can be obtained under both high and low brightness conditions.



    加载中


    [1] H. Li, D. Song, Y. Liu, Automatic pavement crack detection by multi-scale image fusion, IEEE Trans. Intell. Transp. Syst., 20 (2018), 2025–2036. https://doi.org/10.1109/TITS.2018.2856928 doi: 10.1109/TITS.2018.2856928
    [2] L. Hong, L. Mu, Survey on new progresses of deep learning based computer vision, J. Data Acquis. Process., 37 (2022), 247–278.
    [3] J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 39 (2015), 640–651. https://doi.org/10.1109/TPAMI.2016.2572683 doi: 10.1109/TPAMI.2016.2572683
    [4] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, (2015), 234–241. https://doi.org/10.1007/978-3-319-24574-4_28
    [5] A. Vaswani, N. Shazeer, N. Parmar, Attention is all you need, Adv. Neural Inf. Process. Syst., 30 (2017), 5998–6008.
    [6] S. Zheng, J. Lu, H. Zhao, Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers, in IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 6877–6886. https://doi.org/10.1109/CVPR46437.2021.00681
    [7] J. W. Wang, C. Li, X. Zhang, Surface crack detection of rubber insulator based on machine vision, in 14th International Conference on Intelligent Robotics and Applications (ICIRA), (2021), 175–185. https://doi.org/10.1007/978-3-030-89098-8_17
    [8] H. F. Li, Z. L. Wu, J. J. Nie, An automatic fine crack recognition algorithm for airport pavement under significant noises, Comput. Eng. Sci., 42 (2020), 2020–2029.
    [9] C. Peng, M. Q. Yang, Q. H. Zheng, A triple-thresholds pavement crack detection method leveraging random structured forest, Constr. Build. Mater., 263 (2020), 120080. https://doi.org/10.1016/j.conbuildmat.2020.120080 doi: 10.1016/j.conbuildmat.2020.120080
    [10] N. Kheradmandi, V. Mehranfar, A critical review and comparative study on image segmentation-based techniques for pavement crack detection, Constr. Build. Mater., 321 (2022), 126162. https://doi.org/10.1016/j.conbuildmat.2021.126162 doi: 10.1016/j.conbuildmat.2021.126162
    [11] J. L. Chen, Concrete detection method based on image processing, Chin. J. Liq. Cryst. Disp., 35 (2020), 395–401. https://doi.org/10.3788/YJYXS20203504.0395 doi: 10.3788/YJYXS20203504.0395
    [12] Z. Othman, S. Zukfily, S. S. S. Ahmad, Road crack detection using modification of threshold values in Canny algorithm, in 7th Mechanical Engineering Research Day (MERD), (2020), 184–186.
    [13] J. Luo, H. Z. Lin, X. X. Wei, Adaptive canny and semantic segmentation networks based on feature fusion for road crack detection, IEEE Access, 11 (2023), 51740–51753. https://doi.org/10.1109/ACCESS.2023.3279888 doi: 10.1109/ACCESS.2023.3279888
    [14] S. Konishi, A. Yuille, J. Coughlan, Statistical edge detection: learning and evaluating edge cues, IEEE Trans. Pattern Anal. Mach. Intell., 25 (2003), 57–74. https://doi.org/10.1109/TPAMI.2003.1159946 doi: 10.1109/TPAMI.2003.1159946
    [15] Z. Liu, Y. Cao, Y. Wang, Computer vision-based concrete crack detection using U-net fully convolutional networks, Autom. Constr., 104 (2019), 129–139. https://doi.org/10.1016/j.autcon.2019.04.005 doi: 10.1016/j.autcon.2019.04.005
    [16] X. N. Cui, Q. C. Wang, J. P. Dai, Intelligent crack detection based on attention mechanism in convolution neural network, Adv. Struct. Eng., 24 (2021), 1859–1868. https://doi.org/10.1177/1369433220986638 doi: 10.1177/1369433220986638
    [17] F. Liu, J. F. Wang, Z. Y. Chen, Parallel attention based UNet for crack detection, J. Comput. Res. Dev., 58 (2021), 1718–1726.
    [18] X. W. Gao, B. R. Tong, MRA-UNet: Balancing speed and accuracy in road crack segmentation network, Signal Image Video Process., 17 (2023), 2093–2100. https://doi.org/10.1007/s11760-022-02423-9 doi: 10.1007/s11760-022-02423-9
    [19] T. Chen, Z. Cai, X. Zhao, Pavement crack detection and recognition using the architecture of segNet, J. Ind. Inf. Integr., 18 (2020), 100144. https://doi.org/10.1016/j.jii.2020.100144 doi: 10.1016/j.jii.2020.100144
    [20] V. Badrinarayanan, A. Kendall, R. Cipolla, Segnet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Trans. Pattern Anal. Mach. Intell., 39 (2017), 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615 doi: 10.1109/TPAMI.2016.2644615
    [21] Y. Liu, J. Yao, X. Lu, DeepCrack: A deep hierarchical feature learning architecture for crack segmentation, Neurocomputing, 338 (2019), 139–153. https://doi.org/10.1016/j.neucom.2019.01.036 doi: 10.1016/j.neucom.2019.01.036
    [22] Y. Ren, J. Huang, Z. Hong, Image-based concrete crack detection in tunnels using deep fully convolutional networks, Constr. Build. Mater., 234 (2020), 117367. https://doi.org/10.1016/j.conbuildmat.2019.117367 doi: 10.1016/j.conbuildmat.2019.117367
    [23] H. Zhao, J. Shi, X. Qi, Pyramid scene parsing network, in Proceedings of the IEEE conference on computer vision and pattern recognition, (2017), 2881–2890.
    [24] A. Dosovitskiy, L. Beyer, A. Kolesnikov, An image is worth 16 x 16 words: Transformers for image recognition at scale, preprint, arXiv: 2010.11929. https://doi.org/10.48550/arXiv.2010.11929
    [25] H. Liu, X. Miao, C. Mertz, Crackformer: Transformer network for fine-grained crack detection, in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2021), 3783–3792. https://doi.org/10.1109/ICCV48922.2021.00376
    [26] E. Xie, W. Wang, Z. Yu, SegFormer: Simple and efficient design for semantic segmentation with transformers, Adv. Neural Inf. Process. Syst., 34 (2021), 12077–12090.
    [27] W. Wang, C. Su, Automatic concrete crack segmentation model based on transformer, Autom. Constr., 139 (2022), 104275. https://doi.org/10.1016/j.autcon.2022.104275 doi: 10.1016/j.autcon.2022.104275
    [28] F. Guo, Y. Qian, J. Liu, Pavement crack detection based on transformer network, Autom. Constr., 145 (2023), 104646. https://doi.org/10.1016/j.autcon.2022.104646 doi: 10.1016/j.autcon.2022.104646
    [29] Z. Liu, Y. Lin, Y. Cao, Swin transformer: Hierarchical vision transformer using shifted windows, in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2021), 10012–10022.
    [30] X. Ju, X. Zhao, S. Qian, TransMF: Transformer-based multi-scale fusion model for crack detection, Mathematics, 10 (2022), 2354. https://doi.org/10.3390/math10132354 doi: 10.3390/math10132354
    [31] S. Khan, M. Naseer, M. Hayat, Transformers in vision: A survey, ACM Comput. Surv., 54 (2022), 1–41. https://doi.org/10.1145/3505244 doi: 10.1145/3505244
    [32] R. Xiong, Y. Yang, D. He, On layer normalization in the transformer architecture, in International Conference on Machine Learning, (2020), 10524–10533.
    [33] Q. Zhong, C. Wen, Concrete pavement crack detection based on dilated convolution and multi-features fusion, Comput. Sci., 49 (2022), 192–196.
    [34] F. Yang, L. Zhang, S. Yu, Feature pyramid and hierarchical boosting network for pavement crack detection, IEEE Trans. Intell. Transp. Syst., 21 (2019), 1525–1535. https://doi.org/10.1109/TITS.2019.2910595 doi: 10.1109/TITS.2019.2910595
    [35] R. Shamir, Y. Duchin, J. Kim, Continuous dice coefficient: A method for evaluating probabilistic segmentations, preprint, arXiv: 1906.11031. https://doi.org/10.1101/306977
    [36] G. Shu, L. Xiao, W. Xue, Crack detection based on enhanced semantic information and multi-channel feature fusion, Comput. Eng. Appl., 57 (2021), 204–210.
    [37] C. Hui, R. Zhi, L. Yu, Research on tunnel crack segmentation algorithm based on improved U-Net network, Comput. Eng. Appl., 57 (2021), 215–222.
    [38] R. K. Meleppat, K. E. Ronning, S. J. Karlen, In vivo multimodal retinal imaging of disease-related pigmentary changes in retinal pigment epithelium, Sci. Rep., 11 (2021). https://doi.org/10.1038/s41598-021-95320-z doi: 10.1038/s41598-021-95320-z
    [39] R. K. Meleppat, C. R. Fortenbach, Y. F. Jian, In vivo imaging of retinal and choroidal morphology and vascular plexuses of vertebrates using swept-source optical coherence tomography, Translational Vision Sci. Technol., 11 (2022). https://doi.org/10.1167/tvst.11.8.11 doi: 10.1167/tvst.11.8.11
  • Reader Comments
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Metrics

Article views(1821) PDF downloads(123) Cited by(2)

Article outline

Figures and Tables

Figures(9)  /  Tables(6)

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return

Catalog