Liver rupture repair surgery is one treatment for liver rupture and is especially beneficial in cases of mild liver rupture hemorrhage. Liver rupture can lead to critical conditions such as hemorrhage and shock. Surgical workflow recognition in liver rupture repair surgery videos is an important task aimed at reducing surgical errors and improving the quality of surgeries performed by surgeons. This paper proposes a liver rupture repair simulation surgery dataset consisting of 45 videos collaboratively completed by nine surgeons. It also introduces SA-RLNet, an end-to-end self-attention-based recurrent convolutional neural network. The self-attention mechanism automatically weighs the importance of input features across instances and models the relationships between those features. SA-RLNet achieves a surgical phase classification accuracy of 90.6%. The study shows that SA-RLNet generalizes well on the dataset and is advantageous in capturing subtle variations between surgical phases. These results suggest that surgical workflow recognition is feasible for liver rupture repair surgery.
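To illustrate the general idea of self-attention over a sequence of per-frame features, the following is a minimal sketch of scaled dot-product self-attention. It is not the SA-RLNet implementation: the function names, the use of identity projections in place of learned query, key, and value matrices, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over T feature vectors of size d.

    Assumption: queries, keys, and values are the raw features themselves
    (identity projections); a trained model would learn these projections.
    Returns (outputs, weights), where weights[i][j] is how strongly
    position i attends to position j.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in features:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in features]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted sum of all value vectors.
        outputs.append(
            [sum(w[j] * features[j][c] for j in range(len(features))) for c in range(d)]
        )
    return outputs, weights


# Hypothetical toy features for three video frames (values are made up).
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(frames)
```

In this toy input, the first frame assigns more weight to the similar second frame than to the dissimilar third one, which is the sense in which attention weights relate input features to each other.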
Citation: Yutao Men, Zixian Zhao, Wei Chen, Hang Wu, Guang Zhang, Feng Luo, Ming Yu. Research on workflow recognition for liver rupture repair surgery[J]. Mathematical Biosciences and Engineering, 2024, 21(2): 1844-1856. doi: 10.3934/mbe.2024080