Microscopic examination of visible components in micrographs is the gold standard for testing in biomedical research and clinical diagnosis. Applying object detection to bioimages not only improves analysts' efficiency but also provides decision support that helps ensure objective and consistent diagnoses. However, the lack of large annotated datasets is a significant impediment to rapidly deploying object detection models for microscopic formed element detection. Standard augmentation methods used in object detection are not appropriate because they tend to destroy the original micro-morphological information and produce counterintuitive micrographs, which undermines analysts' trust in the intelligent system. Here, we propose a feature activation map-guided boosting mechanism dedicated to microscopic object detection to improve data efficiency. Our results show that the boosting mechanism provides solid gains for object detection models deployed for microscopic formed element detection. After image augmentation, the mean Average Precision (mAP) of the baseline and strong baseline on the Chinese herbal medicine micrograph dataset increased by 16.3% and 5.8%, respectively. Similarly, on the urine sediment dataset, the boosting mechanism improved the mAP of the baseline and strong baseline by 8.0% and 2.6%, respectively. Moreover, the method generalizes well and can be easily integrated into any mainstream object detection model. The performance enhancement is interpretable, making the method more suitable for microscopic biomedical applications.
Citation: Haixu Yang, Yunqi Zhu, Jiahui Yu, Luhong Jin, Zengxi Guo, Cheng Zheng, Junfen Fu, Yingke Xu. Boosting microscopic object detection via feature activation map guided poisson blending[J]. Mathematical Biosciences and Engineering, 2023, 20(10): 18301-18317. doi: 10.3934/mbe.2023813
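The title names Poisson blending as the compositing step. As a rough illustration of that general technique only (not the authors' implementation), the sketch below solves the discrete Poisson equation with Gauss-Seidel iteration on small grayscale grids. The function name `poisson_blend`, the toy images, and the rectangular mask are all hypothetical; the abstract does not say how the feature activation map selects the pasted region, so the mask here is simply given as input.

```python
def poisson_blend(source, target, mask, iterations=500):
    """Gradient-domain blend of `source` into `target` inside `mask`.

    All three arguments are equally sized 2-D lists (grayscale floats,
    booleans for `mask`). Assumes the mask never touches the image border.
    """
    height, width = len(target), len(target[0])
    result = [list(map(float, row)) for row in target]  # start from the target
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    for _ in range(iterations):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                if not mask[y][x]:
                    continue  # pixels outside the mask keep the target value
                total = 0.0
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    # Neighbour value: current estimate inside the mask,
                    # fixed target value just outside it.
                    total += result[ny][nx]
                    # Guidance term: source gradient towards this neighbour.
                    total += source[y][x] - source[ny][nx]
                result[y][x] = total / 4.0
    return result


# Toy example (hypothetical data): paste a bright 2x2 "object" from a flat
# source image onto a background whose brightness ramps from left to right.
size = 6
target = [[10.0 * x for x in range(size)] for _ in range(size)]
source = [[100.0] * size for _ in range(size)]
for y in (2, 3):
    for x in (2, 3):
        source[y][x] = 140.0
mask = [[1 <= y <= 4 and 1 <= x <= 4 for x in range(size)] for y in range(size)]

blended = poisson_blend(source, target, mask)
for row in blended:
    print(" ".join(f"{value:6.1f}" for value in row))
```

In this toy case the pasted object keeps its contrast of 40 over its surroundings while the background ramp of the target is preserved, which is the property that makes gradient-domain blending attractive when pasted objects must not look out of place. A practical pipeline would more likely rely on an optimized library routine than on this pure-Python loop.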