Semantic segmentation of substation tools using an improved ICNet network

Guozhong Liu; Qiongping Tang; Changnian Lin; An Xu; Chonglong Lin; Hao Meng; Mengyu Ruan; Wei Jin

doi:10.3934/era.2024246

Electronic Research Archive

2024, Volume 32, Issue 9: 5321-5340.


Research article


1. School of Instrument Science and Opto-Electronics Engineering, Beijing Information Science and Technology University, Beijing 100096, China
2. Beijing Kedong Electric Control System Co., Ltd., Haidian District, Beijing 100192, China
3. NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing 211106, China

Academic Editor: Zhenglin Wang

Received: 19 July 2024; Revised: 03 September 2024; Accepted: 11 September 2024; Published: 18 September 2024

Abstract

In substation operation and maintenance, real-time detection and precise segmentation of tools play an important role in maintaining the safe operation of the power grid and guiding operators to work safely. To improve the accuracy and real-time performance of semantic segmentation of substation operation and maintenance tools, we propose an improved, lightweight, real-time semantic segmentation network based on the efficient image cascade network (ICNet) architecture. The network uses multiscale branches and cascaded feature fusion units to extract rich multilevel features. We designed a semantic segmentation and purification module to deal with redundant and conflicting information in multiscale feature fusion. A lightweight backbone network was used for feature extraction at the different resolutions, and a recursive gated convolution was used in the upsampling stage to achieve high-order spatial interactions, thereby improving segmentation accuracy. Because no semantic segmentation data set of substation tools was available, we constructed one. Training and testing on this data set showed that the proposed model improved the accuracy of tool detection while maintaining real-time performance. Compared with currently popular semantic segmentation networks, it performed better in both speed and accuracy, and it provides a new semantic segmentation method for embedded platforms.
Keywords: ICNet; lightweight; semantic segmentation; tools and instruments; substation operation and maintenance
Citation: Guozhong Liu, Qiongping Tang, Changnian Lin, An Xu, Chonglong Lin, Hao Meng, Mengyu Ruan, Wei Jin. Semantic segmentation of substation tools using an improved ICNet network[J]. Electronic Research Archive, 2024, 32(9): 5321-5340. doi: 10.3934/era.2024246
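
The abstract mentions multiscale branches joined by cascaded feature fusion units. The sketch below illustrates only that coarse-to-fine data flow, under strong simplifying assumptions: it is not the authors' network. Each "feature map" is a plain 2-D list of floats taken directly from the input, the convolutions and normalization layers of a real fusion unit are omitted, and the helper names (`downsample2x`, `upsample2x`, `fuse`, `cascade`) are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def downsample2x(x: Grid) -> Grid:
    """Halve the resolution with 2x2 average pooling (even height and width assumed)."""
    return [
        [
            (x[r][c] + x[r][c + 1] + x[r + 1][c] + x[r + 1][c + 1]) / 4.0
            for c in range(0, len(x[0]), 2)
        ]
        for r in range(0, len(x), 2)
    ]


def upsample2x(x: Grid) -> Grid:
    """Double the resolution by nearest-neighbour repetition."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(coarse: Grid, fine: Grid) -> Grid:
    """Toy fusion step: upsample the coarse map, add the fine map, apply ReLU.

    Assumption: a real fusion unit would also apply learned convolutions
    to both inputs before the sum; those are left out here.
    """
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [[max(0.0, a + b) for a, b in zip(ru, rf)] for ru, rf in zip(up, fine)]


def cascade(image: Grid) -> Grid:
    """Build full, 1/2 and 1/4 resolution branches and fuse them coarse to fine."""
    full = image
    half = downsample2x(full)
    quarter = downsample2x(half)
    mid = fuse(quarter, half)   # 1/4 branch merged into the 1/2 branch
    return fuse(mid, full)      # result merged into the full-resolution branch
```

For a 4x4 input filled with 1.0, `cascade` returns a 4x4 map filled with 3.0, because each fusion step adds one more branch's contribution. Doing most of the computation on the low-resolution branches and refining with the lighter high-resolution ones is what makes this kind of cascade design attractive for real-time use.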


© 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)