Glaucoma, a leading cause of irreversible blindness, requires early detection to prevent progressive vision loss. Color fundus photography is a non-invasive and widely accessible modality for glaucoma screening; however, traditional manual interpretation is subjective, time-consuming, and prone to inter-observer variability. This study proposes the optic disc (OD)/optic cup (OC) semantic feature pyramid network (OD/OC-Semantic FPN), a joint OD and OC segmentation model for glaucoma screening. The model extends the Semantic FPN architecture through three key enhancements: (1) a MaxViT backbone that incorporates multi-axis attention to reinforce local-global feature interaction and preserve boundary information during downsampling; (2) inception depthwise convolution modules embedded within MBConv blocks, which enable multi-scale convolution to expand receptive fields without compromising fine-grained details; and (3) an optimized Semantic FPN structure to improve cross-scale feature alignment and multi-scale fusion. The proposed OD/OC-Semantic FPN was evaluated on five publicly available fundus image datasets (Drishti-GS, ORIGA, RIM-ONE DL, RIM-ONE-R3, and REFUGE), and its performance was compared against several state-of-the-art segmentation models (U-Net, DeepLabV3+, PSPNet, APCNet, Semantic FPN-PoolFormer, and Attention U-Net). The results show that the OD/OC-Semantic FPN surpasses these models across several metrics: Dice coefficient, mean intersection over union (mIoU), mean pixel accuracy (MPA), and classification accuracy, thus demonstrating superior structural precision for fundus analysis. Collectively, these results indicate that the OD/OC-Semantic FPN is a robust and generalizable tool for intelligent early detection of glaucoma.
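To make the segmentation metrics named above concrete, the following is a minimal sketch of how the Dice coefficient, mIoU, and MPA are commonly computed from a confusion matrix. It is not taken from the paper: the label encoding (0 = background, 1 = optic disc, 2 = optic cup), the function names, and the choice to average over classes present in either mask are all assumptions made for illustration, and the authors' exact definitions may differ.

```python
from typing import Dict, List, Sequence

Mask = Sequence[Sequence[int]]  # 2D grid of integer class labels


def confusion_matrix(pred: Mask, truth: Mask, num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the true class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            cm[t][p] += 1
    return cm


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else float("nan")


def segmentation_metrics(pred: Mask, truth: Mask, num_classes: int = 3) -> Dict[str, float]:
    """Class-averaged Dice, IoU (mIoU), and pixel accuracy (MPA).

    Assumed (hypothetical) encoding: 0 = background, 1 = optic disc, 2 = optic cup.
    Classes absent from both masks are skipped rather than scored.
    """
    cm = confusion_matrix(pred, truth, num_classes)
    dice, iou, acc = [], [], []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                   # true c, predicted other
        fp = sum(cm[r][c] for r in range(num_classes)) - tp    # predicted c, true other
        if tp + fp + fn == 0:
            continue  # class does not occur in either mask
        dice.append(2 * tp / (2 * tp + fp + fn))
        iou.append(tp / (tp + fp + fn))
        if tp + fn > 0:
            acc.append(tp / (tp + fn))  # per-class pixel accuracy (recall)
    return {"dice": _mean(dice), "miou": _mean(iou), "mpa": _mean(acc)}


# Toy 4x4 example: the prediction finds the disc but misses the single cup pixel.
truth = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 2, 0],
         [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]

print(segmentation_metrics(pred, truth))
```

In this toy case the missed cup pixel drives the cup's Dice and IoU to zero, which illustrates why class-averaged metrics are sensitive to small structures such as the optic cup.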
Citation: Xuan Liu, Qian Ma, Jiajia Wang, Xiaohu Liu, Qiuyang Zhang, Jin Yao, Biao Yan, Zhenhua Wang. OD/OC-semantic FPN: an enhanced optic cup and disc segmentation model in color fundus images using improved MaxViT and semantic feature pyramid network[J]. Electronic Research Archive, 2025, 33(9): 5496-5517. doi: 10.3934/era.2025246