Research article Special Issues

Research and optimization of YOLO-based method for automatic pavement defect detection

  • Received: 06 November 2023 Revised: 15 January 2024 Accepted: 22 January 2024 Published: 27 February 2024
  • According to statistics from the end of 2022, the total length of highways in China has reached 5.3548 million kilometers, of which 5.3503 million kilometers are under maintenance, a maintenance coverage of 99.9%. Inefficient manual pavement inspection cannot meet the needs of detection at this scale. To tackle this issue, experiments were conducted to explore deep learning-based intelligent identification models built on pavement distress data. The dataset includes pavement micro-cracks, which are particularly significant for preventive pavement maintenance. The two-stage model Faster R-CNN achieved a mean average precision (mAP) of 0.938, surpassing the one-stage object detection algorithms YOLOv5 (mAP: 0.91) and YOLOv7 (mAP: 0.932). To balance model weight and detection performance, this study proposes an optimization method based on YOLOv5. This method achieves detection performance (mAP: 0.93) comparable to that of two-stage detectors, with only a minimal increase in the number of parameters. Overall, the two-stage model demonstrated excellent detection performance when using a residual network (ResNet) as the backbone, whereas the one-stage YOLO algorithm proved more suitable for practical engineering applications.

    Citation: Hui Yao, Yaning Fan, Xinyue Wei, Yanhao Liu, Dandan Cao, Zhanping You. Research and optimization of YOLO-based method for automatic pavement defect detection[J]. Electronic Research Archive, 2024, 32(3): 1708-1730. doi: 10.3934/era.2024078

  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)