Self-supervised Sim-to-Real Kinematics Reconstruction for Video-Based Assessment of Intraoperative Suturing Skills

Trinh, Loc; Chu, Tim; Cui, Zijun; Malpani, Anand; Yang, Cherine; Dalieh, Istabraq; Hui, Alvin; Gomez, Oscar; Liu, Yan; Hung, Andrew

doi:10.1007/978-3-031-43996-4_68

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14228)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention

Abstract

Suturing technical skill scores are strong predictors of patient functional recovery following robot-assisted radical prostatectomy (RARP), but manual assessment of these skills is a time- and resource-intensive process. By automating suturing skill scoring through computer vision methods, we can significantly reduce the burden on healthcare professionals and enhance the quality and quantity of educational feedback. Although automated skill assessment in simulated virtual reality (VR) environments has been promising, applying vision methods to live ('real') surgical videos has been challenging due to: 1) the lack of kinematic data from the da Vinci® surgical system, a key source of information for determining the movement and trajectory of robotic manipulators and suturing needles, and 2) the lack of training data due to the labor-intensive task of segmenting and scoring individual stitches from live videos. To address these challenges, we developed a self-supervised pre-training paradigm whereby sim-to-real generalizable representations are learned without requiring any live kinematic annotations. Our model, termed LiveMAE, is based on a masked autoencoder (MAE). We augment live stitches with VR images during pre-training and require LiveMAE to reconstruct images from both domains while also predicting the corresponding kinematics. This process learns a visual-to-kinematic mapping that seeks to locate the positions and orientations of surgical manipulators and needles, deriving "kinematics" from live videos without requiring supervision. With an additional skill-specific finetuning step, LiveMAE surpasses supervised learning approaches across 6 technical skill assessments, with AUC ranging from 0.56 to 0.84 (AUPRC from 0.70 to 0.91), and with particular improvements of 35.78% in AUC for wrist rotation skills and 8.7% for needle driving skills. Mean-squared error for test VR kinematics was as low as 0.045 for each element of the instrument poses. Our contributions provide the foundation to deliver personalized feedback to surgeons training in VR and performing live prostatectomy procedures.
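
The pre-training objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`random_mask`, `mse`, `pretraining_loss`), the sample dictionary layout, the 0.75 mask ratio, and the `kin_weight` parameter are all assumptions made for illustration. The sketch also assumes that the kinematics term is computed only for VR samples, since the abstract states that no live kinematic annotations are used, while the reconstruction term covers masked patches from both domains.

```python
import random


def random_mask(num_patches, mask_ratio, rng):
    """Pick which patch indices are hidden from the encoder (hypothetical helper)."""
    num_masked = int(num_patches * mask_ratio)
    return sorted(rng.sample(range(num_patches), num_masked))


def mse(pred, target):
    """Mean-squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def pretraining_loss(batch, kin_weight=1.0):
    """Joint loss: masked-patch reconstruction plus kinematics regression.

    Each sample is a dict (assumed layout) with keys:
      "true_patches", "pred_patches": lists of patches, each a list of floats
      "masked": indices of the patches that were hidden from the encoder
      "pred_kin": predicted kinematics vector
      "true_kin": simulator kinematics for VR samples, None for live samples
    Returns (total, reconstruction, kinematics).
    """
    recon_terms, kin_terms = [], []
    for sample in batch:
        # Reconstruction is scored on masked patches only, for both domains.
        for i in sample["masked"]:
            recon_terms.append(mse(sample["pred_patches"][i], sample["true_patches"][i]))
        # Kinematics supervision exists only where the simulator provides it.
        if sample["true_kin"] is not None:
            kin_terms.append(mse(sample["pred_kin"], sample["true_kin"]))
    recon = sum(recon_terms) / len(recon_terms) if recon_terms else 0.0
    kin = sum(kin_terms) / len(kin_terms) if kin_terms else 0.0
    return recon + kin_weight * kin, recon, kin


# Toy batch: one VR sample (with kinematics) and one live sample (without).
vr_sample = {
    "true_patches": [[0.0, 0.0], [1.0, 1.0]],
    "pred_patches": [[0.0, 0.0], [0.0, 1.0]],
    "masked": [1],
    "pred_kin": [0.5, 0.5],
    "true_kin": [0.0, 0.5],
}
live_sample = {
    "true_patches": [[1.0, 1.0], [2.0, 2.0]],
    "pred_patches": [[1.0, 1.0], [2.0, 2.0]],
    "masked": [0],
    "pred_kin": [0.3, 0.3],
    "true_kin": None,
}
total, recon, kin = pretraining_loss([vr_sample, live_sample])
masked_indices = random_mask(16, 0.75, random.Random(0))
```

In this toy batch the live sample contributes only to the reconstruction term, so the kinematics head receives gradient signal from VR data alone while the shared encoder still sees both domains. How the two terms are actually weighted in LiveMAE is not stated in the abstract; `kin_weight` is a placeholder.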

Acknowledgements

This study is supported in part by the National Cancer Institute under Award Number 1RO1CA251579-01A1.

Author information

Authors and Affiliations

University of Southern California, Los Angeles, USA
Loc Trinh, Tim Chu, Zijun Cui, Oscar Gomez & Yan Liu
Mimic Technologies Inc., Seattle, USA
Anand Malpani
Department of Urology, Cedars-Sinai Medical Center, Los Angeles, USA
Cherine Yang & Andrew Hung
Boston University Henry M. Goldman School of Dental Medicine, Boston, USA
Istabraq Dalieh
Western University of Health Sciences, Pomona, USA
Alvin Hui

Corresponding author

Correspondence to Andrew Hung.

Editor information

Editors and Affiliations

Icahn School of Medicine, Mount Sinai, NYC, NY, USA, Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Emory University, Atlanta, GA, USA
Anant Madabhushi
Queen’s University, Kingston, ON, Canada
Parvin Mousavi
The University of British Columbia, Vancouver, BC, Canada
Septimiu Salcudean
Yale University, New Haven, CT, USA
James Duncan
IBM Research, San Jose, CA, USA
Tanveer Syeda-Mahmood
Johns Hopkins University, Baltimore, MD, USA
Russell Taylor

Electronic supplementary material

Supplementary material 1 (pdf 5572 KB)

About this paper

Cite this paper

Trinh, L. et al. (2023). Self-supervised Sim-to-Real Kinematics Reconstruction for Video-Based Assessment of Intraoperative Suturing Skills. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_68

DOI: https://doi.org/10.1007/978-3-031-43996-4_68
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4
eBook Packages: Computer Science (R0)
