Personalized heart models are widely used to study the mechanisms of cardiac arrhythmias and, in recent years, have been used to guide clinical ablation of several types of arrhythmia. These models are now built mostly from MRI images. In cardiac modeling studies, the quality of the heart image segmentation determines the success of the subsequent 3D reconstruction, so a fully automated segmentation method is needed. In this paper, we combine U-Net and Transformer as an alternative approach to robust, fully automated segmentation of medical images. On the one hand, we use convolutional neural networks for feature extraction and spatial encoding of the inputs, exploiting the strength of convolution in capturing fine detail; on the other hand, we use a Transformer to add long-range dependencies to high-level features and to model features at different scales. The results show that the average Dice coefficients on the ACDC and Synapse datasets are 91.72% and 85.46%, respectively; compared with Swin-Unet, segmentation accuracy improves by 1.72 percentage points on ACDC and 6.33 percentage points on Synapse.
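For readers unfamiliar with the metric, the Dice coefficient reported above measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code: the function names, the flat-list mask representation, and the choice to score a class absent from both masks as 1.0 are assumptions made for illustration, and the abstract does not state how scores were aggregated across slices or volumes.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice score for one class over two flattened label masks (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")
    # Count pixels assigned to this class in each mask, and in both at once.
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denominator = pred_count + target_count
    if denominator == 0:
        # Assumed convention: a class absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * overlap / denominator


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted average of per-class Dice scores (hypothetical helper)."""
    return sum(dice_coefficient(pred, target, c) for c in labels) / len(labels)


# Toy example: six pixels, background = 0, two foreground classes 1 and 2.
predicted = [0, 1, 1, 2, 2, 0]
reference = [0, 1, 2, 2, 2, 0]
print(dice_coefficient(predicted, reference, 1))
print(dice_coefficient(predicted, reference, 2))
print(mean_dice(predicted, reference, [1, 2]))
```

In this toy case class 1 overlaps on one pixel out of 2 + 1 labelled pixels and class 2 on two pixels out of 2 + 3, so the per-class scores are 2/3 and 4/5. Multiplying a mean score by 100 gives a percentage of the kind quoted in the abstract.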
Citation: Zhenyin Fu, Jin Zhang, Ruyi Luo, Yutong Sun, Dongdong Deng, Ling Xia. TF-Unet: An automatic cardiac MRI image segmentation method[J]. Mathematical Biosciences and Engineering, 2022, 19(5): 5207-5222. doi: 10.3934/mbe.2022244