
Journal of Surgical Education

Volume 77, Issue 6, November–December 2020, Pages e214-e219

2020 APDS SPRING MEETING
Crowd-Sourced and Attending Assessment of General Surgery Resident Operative Performance Using Global Ratings Scales

https://doi.org/10.1016/j.jsurg.2020.07.011

Objective

We sought to assess the extent to which both crowd and intraoperative attending ratings using the objective structured assessment of technical skill (OSATS) or the global objective assessment of laparoscopic skills (GOALS) would correlate with the system for improving and measuring procedural learning (SIMPL) Zwisch and Performance scales.

Design

Comparison of directly observed versus crowd sourced review of operative video.

Setting

Operative video captured at 2 institutions.

Participants

Six core general surgery procedures, 3 open and 3 laparoscopic, were selected from the American Board of Surgery's Resident Assessments list. Thirty-two cases performed by general surgery residents across all training levels at 2 institutions were filmed. Videos were condensed using a standardized protocol to include the critical portion of each procedure. Condensed videos were then submitted to crowd-sourced assessment of technical skills (C-SATS), an online crowd-sourced assessment service, for rating with the appropriate resident assessment form (GOALS or OSATS) as well as with the SIMPL Zwisch and Performance scales. Crowd workers watched an educational tutorial on how to use the Zwisch and SIMPL Performance rating scales before participating. Attendings scored residents using the same tools immediately after the shared operative experience. Statistical analysis was performed using Pearson's correlation coefficient.
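As a point of reference for readers, the Pearson correlation coefficients and 95% confidence intervals reported in the Results are typically computed with a Fisher z-transform. The sketch below is a minimal, self-contained illustration of that standard construction, not the authors' actual analysis code; the sample values and the sample size are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson r via the Fisher z-transform.

    z = atanh(r) is approximately normal with standard error
    1/sqrt(n - 3); the interval is back-transformed with tanh.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Illustrative use: r = 0.5 observed on n = 28 paired ratings.
lo, hi = fisher_ci(0.5, 28)
```

Note how quickly the interval widens at small n; with roughly 26 to 32 paired observations per comparison, intervals as wide as those reported for the crowd ratings are expected.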

Results

Crowd raters evaluated 32 procedures using GOALS/OSATS, Zwisch and Performance (35-50 ratings per video). Attendings also evaluated all 32 procedures using GOALS/OSATS and 26 of the procedures using SIMPL Zwisch and Performance. Pearson correlation coefficients with 95% confidence intervals for crowd ratings were: GOALS and Zwisch −0.40 [−0.73 to 0.10], OSATS and Zwisch 0.11 [−0.41 to 0.57], GOALS and Performance −0.06 [−0.44 to 0.35], and OSATS and Performance 0.22 [−0.46 to 0.20]. Pearson correlation coefficients for attendings were: GOALS and Zwisch (0.77), OSATS and Zwisch (0.65), GOALS and Performance (0.93), and OSATS and Performance (0.59).

Conclusions

Overall, correlations between crowd-sourced ratings using GOALS/OSATS and the SIMPL global operative performance rating tools were weak, whereas for attendings they were strong. Direct attending assessment may be required for evaluation of global performance, while crowd sourcing may be more suitable for technical assessment. Further studies are needed to determine whether more extensive crowd training would improve crowd raters' ability to evaluate global performance.

Introduction

While it is widely accepted that rapid, accurate assessment of intraoperative performance is essential to guiding resident feedback and self-improvement of technical skills, the provision of frequent, timely, and objective evaluation remains a challenge.1 Crowd sourcing may be one way to operationalize frequent objective feedback and has been shown to provide accurate assessment when raters use instruments focused on technical skills.2, 3, 4, 5 Meanwhile, there is emerging interest in using global assessments of performance.6

To address this issue, the system for improving and measuring procedural learning (SIMPL) application was developed. It is a smart-phone based mobile application for the evaluation of operative performance and autonomy. The application captures 3 metrics: an autonomy metric, or the Zwisch Scale, which has previously been validated, a difficulty scale, and a performance metric.7 While the SIMPL performance scale is designed to be intuitive, the performance metric has not yet been validated against existing, longer-form tools.

The correlation between crowd ratings from more detailed multi-item instruments versus global ratings scales is unknown. We sought to assess the extent to which both crowd and intraoperative attending ratings using objective structured assessment of technical skill (OSATS) or global objective assessment of laparoscopic skills (GOALS) would correlate with the SIMPL Zwisch and Performance scales.


Audio & Video Capture

Six core general surgery procedures, 3 open and 3 laparoscopic, were selected from the American Board of Surgery's Resident Assessments list, including laparoscopic cholecystectomy, laparoscopic colectomy, laparoscopic inguinal hernia repair, open inguinal hernia repair, open ventral hernia repair, and thyroidectomy (Table 1).8 Intraoperative audio and video of 32 general surgery procedures were recorded. For laparoscopic procedures, video was captured using a single mounted GoPro Hero4 camera

Results

Crowd raters evaluated 32 procedures using GOALS/OSATS, Zwisch, and Performance (35-50 ratings per video). Attendings also evaluated all 32 procedures using GOALS/OSATS and 26 of the procedures using SIMPL Zwisch and Performance. Six SIMPL requests were not completed by the attending surgeon within the requisite 72 hours after the observed performance. Pearson correlation coefficients with 95% confidence intervals for crowd ratings were: GOALS and Zwisch −0.40 [−0.73 to 0.10], OSATS and Zwisch 0.11

Discussion

This study demonstrates that attending rater evaluation of global performance using the SIMPL performance scale correlates well with both GOALS and OSATS technical evaluation assessments. This provides additional evidence that when taken in aggregate, the technical performance of a resident tends to positively correlate with the observer's impression of their overall performance or readiness for graduated autonomy. We also sought to determine if crowd workers could be utilized to provide a

Conclusions

Overall, correlations between crowd-sourced ratings using GOALS or OSATS and the SIMPL global operative performance rating tools were weak, whereas for attending raters they were strong. Further studies are needed to determine whether more extensive crowd training would improve crowd raters' ability to evaluate global performance.

Conflict of Interest

The authors declare no conflicts of interest with respect to the authorship and/or publication of this article.

Funding Source

The project was supported by a grant from the Association of Surgical Education (ASE) and Association of Program Directors in Surgery (APDS). No grant number was provided.

