Research article

DST-Net: Dual self-integrated transformer network for semi-supervised segmentation of optic disc and optic cup in fundus image

  • Published: 16 April 2025
  • Abstract: Accurate and efficient optic disc (OD) and optic cup (OC) segmentation from fundus images is significant for glaucoma screening. However, current neural-network-based OD and OC segmentation methods tend to prioritize an image's local edge features, which limits their capacity to model long-range dependencies and leads to errors in delineating the boundaries. To address this issue, we propose a semi-supervised dual self-integrated transformer network (DST-Net) for joint segmentation of the OD and OC. First, we introduce a dual-view co-training mechanism that builds the encoder and decoder of the self-integrated network from mutually enhanced Vision Transformer (ViT) and convolutional neural network (CNN) feature-learning modules, which are co-trained on dual views to adaptively learn the global and local features of the image. Moreover, we employ a dual self-integrated teacher-student framework that exploits large amounts of unlabeled fundus images through semi-supervised learning, thereby refining the OD and OC segmentation results. Finally, we use a boundary difference over union loss (BDoU-loss) to further optimize boundary prediction. We conducted comparative experiments on the publicly available RIGA+ dataset. The OD and OC Dice values of the proposed DST-Net reached 95.12 ± 0.14 and 85.69 ± 0.27, respectively, outperforming other state-of-the-art (SOTA) methods. In addition, DST-Net generalizes well to the DRISHTI-GS1 and RIM-ONE-v3 datasets, demonstrating its promise for OD and OC segmentation. Minimal illustrative sketches of the dual-branch encoder block, the mean-teacher update, and a BDoU-style loss appear below.
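The encoder described above pairs a convolutional view, which captures local edge detail, with a ViT-style self-attention view, which captures long-range context. Below is a minimal PyTorch sketch of one possible dual-branch block; the class name DualBranchEncoderBlock, the additive fusion, and all layer sizes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DualBranchEncoderBlock(nn.Module):
    """Toy dual-view block: a CNN branch for local edges and a
    ViT-style self-attention branch for long-range context.
    Fusion by addition is an illustrative assumption."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local view: standard 3x3 convolution
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global view: multi-head self-attention over flattened positions
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.cnn_branch(x)
        # Flatten the spatial grid into a token sequence for attention
        tokens = self.norm(x.flatten(2).transpose(1, 2))      # (B, H*W, C)
        glob, _ = self.attn(tokens, tokens, tokens)           # (B, H*W, C)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return local + glob                                   # additive fusion

# Quick shape check
block = DualBranchEncoderBlock(channels=64)
out = block(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

A real encoder would stack several such blocks with downsampling between stages; the shape check only confirms that the fusion preserves the spatial grid.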

    Citation: Yanxia Sun, Tianze Xu, Jing Wang, Jinke Wang. DST-Net: Dual self-integrated transformer network for semi-supervised segmentation of optic disc and optic cup in fundus image. Electronic Research Archive, 2025, 33(4): 2216–2245. https://doi.org/10.3934/era.2025097
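The dual self-integrated teacher-student framework named in the abstract follows the familiar mean-teacher recipe: the teacher's weights track an exponential moving average (EMA) of the student's, and a consistency term penalizes student-teacher disagreement on unlabeled images. A minimal sketch under those standard assumptions (the function names, sigmoid activation, and MSE consistency term are illustrative choices, not taken from the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.99) -> None:
    """Teacher weights become a slowly moving average of student weights."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1.0 - alpha)

def consistency_loss(student: nn.Module, teacher: nn.Module,
                     unlabeled: torch.Tensor) -> torch.Tensor:
    """Penalize student-teacher disagreement on unlabeled fundus images."""
    with torch.no_grad():
        teacher_prob = torch.sigmoid(teacher(unlabeled))  # soft pseudo-targets
    student_prob = torch.sigmoid(student(unlabeled))
    return F.mse_loss(student_prob, teacher_prob)

# Typical training step on a mixed batch (hypothetical names):
#   loss = supervised_loss + lambda_u * consistency_loss(student, teacher, batch)
#   loss.backward(); optimizer.step(); ema_update(teacher, student)
```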

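BDoU-loss penalizes the boundary difference over union so that residual errors concentrated along the OD/OC rims cost more than a plain IoU loss would charge. The sketch below uses a simplified fixed-weight form, (union − intersection) / (union − α · intersection); the published BDoU-loss (Sun et al., MICCAI 2023) chooses α adaptively from the boundary-to-area ratio, so treat this as an approximation, not the exact loss.

```python
import torch

def boundary_dou_loss(pred: torch.Tensor, target: torch.Tensor,
                      alpha: float = 0.8, eps: float = 1e-6) -> torch.Tensor:
    """Simplified BDoU-style loss.

    Shrinking the denominator by a fraction of the intersection makes
    boundary-region mistakes relatively more expensive than in plain IoU.
    The fixed alpha here is an assumption; the published loss adapts it.
    pred: probabilities in [0, 1]; target: binary mask of the same shape.
    """
    inter = (pred * target).sum(dim=(-2, -1))
    union = (pred + target - pred * target).sum(dim=(-2, -1))
    return ((union - inter) / (union - alpha * inter + eps)).mean()
```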





  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)