Objectives: We aim to develop a dual-modal, dynamic contour-based instance segmentation method that operates on carotid artery and jugular vein ultrasound images and their optical flow images, and to evaluate its performance against classic single-modal deep learning networks. Methods: We collected 2432 carotid artery and jugular vein ultrasound images and divided them into training, validation, and test sets at a ratio of 8:1:1. We then used these ultrasound images to generate optical flow images with clearly defined contours. We also proposed a dual-stream information fusion module that fuses complementary features extracted at different levels from the ultrasound and optical flow images. In addition, we proposed a learnable contour initialization method that eliminates the need for manually designed initial contours and accelerates the regression of contour nodes toward the ground-truth points. Results: We validated our method on a self-built dataset of carotid artery and jugular vein ultrasound images, achieving a bounding box detection mean average precision of 0.814 and a mask segmentation mean average precision of 0.842. Qualitative analysis showed that our method produces smoother segmentation boundaries for blood vessels. Conclusions: The proposed dual-modal network effectively exploits the complementary features of ultrasound and optical flow images. Compared with traditional single-modal instance segmentation methods, our approach segments the carotid artery and jugular vein in ultrasound images more accurately, demonstrating its potential for reliable and precise medical image analysis.
Citation: Chenkai Chang, Fei Qi, Chang Xu, Yiwei Shen, Qingwu Li. A dual-modal dynamic contour-based method for cervical vascular ultrasound image instance segmentation[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 1038-1057. doi: 10.3934/mbe.2024043
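The abstract describes generating optical flow images from consecutive ultrasound frames. The exact flow estimator is not specified in this excerpt, so the following is a minimal sketch using OpenCV's classical Farneback method and the standard HSV flow-visualization recipe; the file paths and function name are illustrative.

```python
# Minimal sketch: turning two consecutive ultrasound frames into an
# optical flow image. This uses OpenCV's classical Farneback estimator
# as a stand-in; the paper's actual flow estimator is not given here.
import cv2
import numpy as np

def flow_to_image(prev_path: str, curr_path: str) -> np.ndarray:
    """Compute dense optical flow between two frames and render it as a
    color image (hue = flow direction, brightness = flow magnitude)."""
    prev = cv2.imread(prev_path, cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread(curr_path, cv2.IMREAD_GRAYSCALE)

    flow = cv2.calcOpticalFlowFarneback(
        prev, curr, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((*prev.shape, 3), dtype=np.uint8)
    hsv[..., 0] = ang * 180 / np.pi / 2  # direction -> hue
    hsv[..., 1] = 255                    # full saturation
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)  # magnitude -> brightness
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
```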
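The dual-stream information fusion module fuses complementary features between levels of the two branches. Its internal design is not detailed in this excerpt, so the sketch below shows a generic per-level fusion (channel concatenation followed by a 1x1 convolution) as one plausible realization, not the authors' exact module.

```python
# Minimal sketch of a dual-stream fusion step, assuming same-shape
# per-level feature maps from an ultrasound branch and an optical flow
# branch. Concatenation + 1x1 convolution fusion, for illustration only.
import torch
import torch.nn as nn

class DualStreamFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Reduce the concatenated features back to the original width.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, feat_us: torch.Tensor, feat_flow: torch.Tensor) -> torch.Tensor:
        # feat_us, feat_flow: (N, C, H, W) features from one pyramid level.
        return self.fuse(torch.cat([feat_us, feat_flow], dim=1))

# Usage: one fusion block per pyramid level, feeding the fused maps
# to the downstream detection/contour head.
fusion = DualStreamFusion(channels=256)
fused = fusion(torch.randn(1, 256, 64, 64), torch.randn(1, 256, 64, 64))
```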
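Finally, the learnable contour initialization replaces a hand-designed initial contour with predicted vertices that the dynamic contour model then refines toward the ground truth. The head architecture is not given in this excerpt; the sketch below assumes a pooled per-instance feature vector and a small MLP that regresses normalized vertex offsets from the box center.

```python
# Minimal sketch of a learnable contour initialization head. It regresses
# N initial contour vertices as (dx, dy) offsets from the detected box
# center; the authors' actual head design is not specified here.
import torch
import torch.nn as nn

class ContourInitHead(nn.Module):
    def __init__(self, feat_dim: int = 256, num_points: int = 128):
        super().__init__()
        self.num_points = num_points
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, 2 * num_points),  # (dx, dy) per vertex
        )

    def forward(self, inst_feat: torch.Tensor) -> torch.Tensor:
        # inst_feat: (N, feat_dim) pooled features, one row per instance.
        offsets = self.mlp(inst_feat).view(-1, self.num_points, 2)
        return torch.tanh(offsets)  # keep initial vertices inside the box
```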