Self-supervised Sim-to-Real Kinematics Reconstruction for Video-Based Assessment of Intraoperative Suturing Skills

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Abstract

Suturing technical skill scores are strong predictors of patient functional recovery following robot-assisted radical prostatectomy (RARP), but manual assessment of these skills is a time- and resource-intensive process. By automating suturing skill scoring through computer vision methods, we can significantly reduce the burden on healthcare professionals and enhance the quality and quantity of educational feedback. Although automated skill assessment in simulated virtual reality (VR) environments has been promising, applying vision methods to live (‘real’) surgical videos has been challenging due to: 1) the lack of kinematic data from the da Vinci® surgical system, a key source of information for determining the movement and trajectory of robotic manipulators and suturing needles, and 2) the lack of training data due to the labor-intensive task of segmenting and scoring individual stitches from live videos. To address these challenges, we developed a self-supervised pre-training paradigm whereby sim-to-real generalizable representations are learned without requiring any live kinematic annotations. Our model, termed LiveMAE, is based on a masked autoencoder (MAE). We augment live stitches with VR images during pre-training and require LiveMAE to reconstruct images from both domains while also predicting the corresponding kinematics. This process learns a visual-to-kinematic mapping that seeks to locate the positions and orientations of surgical manipulators and needles, deriving “kinematics” from live videos without requiring supervision. With an additional skill-specific finetuning step, LiveMAE surpasses supervised learning approaches across 6 technical skill assessments, with AUC ranging from 0.56 to 0.84 (AUPRC from 0.70 to 0.91) and particular improvements of 35.78% in AUC for wrist rotation skills and 8.7% for needle driving skills. Mean-squared error for test VR kinematics was as low as 0.045 for each element of the instrument poses. Our contributions provide the foundation to deliver personalized feedback to surgeons training in VR and performing live prostatectomy procedures.
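
The pre-training objective described in the abstract (reconstructing masked images from both domains while also predicting kinematics) can be illustrated with a small numerical sketch. This is not the authors' implementation: the function names, the 75% masking ratio, the weighting term kin_weight, and the choice to apply the kinematics loss only to VR samples (on the assumption that only VR frames carry kinematic labels) are all hypothetical choices made for illustration.

```python
import numpy as np


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a boolean vector; True marks patches hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def joint_loss(pred_patches, true_patches, mask, pred_kin, true_kin, is_vr,
               kin_weight=1.0):
    """Masked reconstruction MSE plus kinematics MSE on VR samples only.

    pred_patches, true_patches: (B, N, D) patch pixels
    mask: (B, N) bool, True where a patch was masked
    pred_kin, true_kin: (B, K) pose elements
    is_vr: (B,) bool, True for samples with kinematic labels (assumed VR only)
    """
    per_patch = ((pred_patches - true_patches) ** 2).mean(axis=-1)  # (B, N)
    recon = per_patch[mask].mean()  # score masked patches only, as in MAE
    if is_vr.any():
        kin = ((pred_kin[is_vr] - true_kin[is_vr]) ** 2).mean()
    else:
        kin = 0.0  # a batch of live frames contributes no kinematics loss
    return recon + kin_weight * kin, recon, kin


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B, N, D, K = 4, 16, 8, 6  # toy batch: 2 VR frames followed by 2 live frames
    true_p = rng.normal(size=(B, N, D))
    pred_p = true_p + rng.normal(scale=0.1, size=(B, N, D))
    mask = np.stack([random_patch_mask(N, 0.75, rng) for _ in range(B)])
    true_k = rng.normal(size=(B, K))
    pred_k = true_k + rng.normal(scale=0.2, size=(B, K))
    is_vr = np.array([True, True, False, False])
    total, recon, kin = joint_loss(pred_p, true_p, mask, pred_k, true_k, is_vr)
    print(f"reconstruction={recon:.4f} kinematics={kin:.4f} total={total:.4f}")
```

In this sketch the live frames influence training only through the reconstruction term, which is one plausible way a shared encoder could be shaped by both domains without live kinematic annotations.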

Acknowledgements

This study is supported in part by the National Cancer Institute under Award Number 1RO1CA251579-01A1.

Author information

Corresponding author

Correspondence to Andrew Hung.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Trinh, L. et al. (2023). Self-supervised Sim-to-Real Kinematics Reconstruction for Video-Based Assessment of Intraoperative Suturing Skills. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_68

  • DOI: https://doi.org/10.1007/978-3-031-43996-4_68

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43995-7

  • Online ISBN: 978-3-031-43996-4

  • eBook Packages: Computer Science, Computer Science (R0)
