Abstract
Purpose
Surgeons’ skill in the operating room is a major determinant of patient outcomes. Assessment of surgeons’ skill is necessary to improve patient outcomes and quality of care through surgical training and coaching. Methods for video-based assessment of surgical skill can provide objective and efficient tools for surgeons. Our work introduces a new method based on attention mechanisms and provides a comprehensive comparative analysis of state-of-the-art methods for video-based assessment of surgical skill in the operating room.
Methods
Using a dataset of 99 videos of capsulorhexis, a critical step in cataract surgery, we evaluated image feature-based methods and two deep learning methods to assess skill using RGB videos. In the first method, we predict instrument tips as keypoints and predict surgical skill using temporal convolutional neural networks. In the second method, we propose a frame-wise encoder (2D convolutional neural network) followed by a temporal model (recurrent neural network), both of which are augmented by visual attention mechanisms. We computed the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and predictive values through fivefold cross-validation.
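As a rough illustration of how an attention mechanism can weight frames in a temporal model, the sketch below pools per-frame feature vectors with softmax weights. It is not the authors' network: the dot-product scoring, the `query` vector (a learned parameter in a real model), and the toy features are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, query):
    """Combine per-frame features into one vector using attention weights.

    frame_features: list of T feature vectors, one per video frame.
    query: hypothetical vector of the same length as each feature vector;
           in a trained network this would be learned, here it is fixed.
    """
    # Relevance score of each frame: dot product with the query (assumed form).
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in frame_features]
    weights = softmax(scores)
    dim = len(frame_features[0])
    # Weighted sum of the frame features, dimension by dimension.
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_features)) for d in range(dim)]
    return pooled, weights


# Toy example with three frames and two-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(features, query=[0.5, 0.5])
```

The weights sum to one, so frames with higher scores contribute more to the pooled vector that a downstream classifier would see.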
Results
For classification of a binary skill label (expert vs. novice), AUC estimates for image feature-based methods ranged from 0.49 (95% confidence interval [CI] = 0.37 to 0.60) to 0.76 (95% CI = 0.66 to 0.85). None of the methods achieved consistently high sensitivity and specificity. For the deep learning methods, the AUC was 0.79 (95% CI = 0.70 to 0.88) using keypoints alone, and 0.78 (95% CI = 0.69 to 0.88) and 0.75 (95% CI = 0.65 to 0.85) with and without attention mechanisms, respectively.
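For readers less familiar with these measures, the following minimal sketch computes AUC, sensitivity, specificity, and predictive values from predicted scores. The labels and scores are made-up toy values, not data from the study, and both the choice of 1 as the positive class and the 0.5 threshold are assumptions for illustration.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, and predictive values at a fixed threshold.

    Assumes every denominator is non-zero, which holds for the toy data below.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy data: 1 = positive class, 0 = negative class (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc(labels, scores)
toy_metrics = confusion_metrics(labels, scores)
```

In a cross-validation setting, these functions would be applied to the held-out predictions from each fold or to the pooled held-out predictions.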
Conclusion
Deep learning methods are necessary for video-based assessment of surgical skill in the operating room. Attention mechanisms improved the discrimination ability of the network. Our findings should be evaluated for external validity in other datasets.
Acknowledgements
Dr. Austin Reiter mentored this work in its early stages.
Funding
Drs. Vedula, Sikder, and Hager are supported by a grant from the National Institutes of Health, USA; NIH 1R01EY033065. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Cite this article
Hira, S., Singh, D., Kim, T.S. et al. Video-based assessment of intraoperative surgical skill. Int J CARS 17, 1801–1811 (2022). https://doi.org/10.1007/s11548-022-02681-5
DOI: https://doi.org/10.1007/s11548-022-02681-5