Minimally invasive surgery (MIS) has significant advantages for patients compared to open surgery [1], including shorter postoperative patient recovery, decreased perioperative complications, smaller incisions, and less blood loss [2, 3]. However, the surgical skills required to perform image-guided surgery differ from those of traditional open surgery, requiring different training methods [4,5,6,7,8]. This might represent a challenge for novice surgeons whose learning curve is influenced by factors such as a limited 2-dimensional view, challenging hand–eye coordination, difficult instrument handling, and inhibited haptic response [9].

Many MIS training modalities have proven to positively affect the development of MIS skills in a patient-safe environment [6, 7, 9,10,11,12,13,14,15,16]. Simulation is a complementary training modality that helps trainees acquire surgical abilities and accelerates the learning curve in a controlled, safe, and standardized environment [10, 17,18,19]. The main goal of simulation training is to acquire and improve surgical skills and then transfer these skills into the operating room (OR) [6, 20]. One aspect of MIS simulation training is providing structured feedback to the trainees after an MIS training session [21]. Feedback is an educational technique and a social interaction between tutor and trainee in a respectful and trusting relationship [22]. In MIS training, individual feedback has proven to positively impact the performance and confidence of the trainees [21, 23, 24]. Video debriefing of the performed surgical procedure as a form of feedback improves surgical training and the confidence of the trainees [23, 25,26,27]. Video debriefing has been used in a randomized-controlled setting for laparoscopic cholecystectomy (LC) on a Virtual Reality (VR) trainer in laparoscopic novices and showed significant improvement of laparoscopic skills [27]. However, the video debriefing in this study was performed only after VR LCs, and it did not include the annotation of the critical view of safety (CVS) or the identification of the safe (Go) and dangerous (No-Go) zones of dissection during LC. Taking a time-out to assess the CVS during LC has been shown to increase the rate of intraoperative CVS achievement, which may lead to a reduction of complications [28]. Identifying Go and No-Go zones of dissection during LC using artificial intelligence (AI) has been used as a form of intraoperative guidance and can potentially reduce intraoperative mistakes [29]. These modalities thus have the potential to be used in MIS training as part of structured feedback and video debriefing after LC to improve LC performance.

This study aimed to assess the effects of structured feedback with video debriefing on training success in trainees undergoing a predefined multi-modal MIS training. The MIS training curriculum was completed in teams of two trainees and included repetitive ex-vivo porcine LCs.

Materials and methods

Study design

According to the CONSORT guidelines for randomized-controlled trials, this study was designed as a prospective, single-center, two-arm, randomized-controlled study [30]. The primary aim of the study was to investigate whether there is a benefit of structured feedback and video debriefing with CVS annotation after LCs in teams of two trainees compared to training without structured feedback and video debriefing.

Study setting and participants

The study was conducted between September 2021 and December 2022 at the MIS training center at the Department of General, Visceral, and Transplantation Surgery at the University Hospital Heidelberg, Germany. The MIS training center offers voluntary MIS training courses to medical students at Heidelberg University during their clinical years. The local ethics committee gave its approval (S-436/2018).

Inclusion and exclusion criteria

Medical students enrolled at Heidelberg University Medical School during their clinical years were eligible for inclusion. Exclusion criteria were participation in previous MIS training courses or prior MIS experience.

Study flow

Both groups underwent a structured MIS basic training consisting of 2 hours of e-learning about the basic MIS skills and LC (http://www.webop.de, http://www.websurg.com) under the supervision of a trained tutor [31, 32] (Fig. 1). Afterward, the trainees performed 6 hours of practical basic MIS training on a Szabo–Berci–Sackier Box Trainer and a standard laparoscopy tower (KARL STORZ GmbH & Co. KG, Tuttlingen, Germany). The exercises included: laparoscopic camera guidance, clamping six rubber bands in a device with six screws, pulling a rubber band into a device made of eyelets and hooks, cutting a predefined circle on a paper sheet, needle guidance, and continuous and interrupted suturing on a 3D-printed wound model. The basic MIS training also encompassed a 2-hour basic module and LC on a VR trainer (Simbionix LAP Mentor) (Fig. 2, on the right).

Fig. 1

Flowchart of the study. LC laparoscopic cholecystectomy; OSATS Objective Structured Assessment of Technical Skills; GOALS Global Operative Assessment of Laparoscopic Skills; VAS visual assessment score for LC difficulty; CALC complication assessment score of LC

Fig. 2

Training setting for laparoscopic cholecystectomy on an ex-vivo porcine liver (on the left) and on the Virtual Reality trainer (on the right)

After the MIS basic training, the two groups performed 4 LCs in teams of two (Fig. 2, on the left). A trained tutor provided both groups with assistance and verbal guidance during each LC on demand. The tutor was previously trained by an experienced board-certified surgeon in performing and teaching LC on ex-vivo porcine livers.

The students in the control group performed the four LCs without structured feedback, video debriefing, or CVS annotation, whereas the students in the intervention group received structured feedback and video debriefing with CVS annotation as the intervention. All ex-vivo porcine LCs were captured on video and evaluated using standardized, validated assessment systems (OSATS and GOALS scores) on-site for structured feedback and by a blinded rater for outcome assessment [33, 34].

Intervention

After each LC, the feedback group received structured feedback and video debriefing with CVS annotation from a trained tutor. The post-LC structured feedback and video debriefing consisted of a joint video assessment of the performed LC reflecting on important steps, complications, and potential improvement of the performance of the procedure (exposure, instrument holding, dissection, clip positioning, cutting). Video debriefing consisted of annotating four predefined video frames on an open-source platform using the annotation tool LabelMe [35]. The four predefined video frames were: starting point of the LC after instrument insertion, first cauterization, identification of the cystic duct and artery, and 10 s before clipping. The main goal of CVS annotations was to provide visual guidelines to perform a safe LC. The tutor and the trainee annotated the Go- and No-Go zones in the first two video frames. The Go- and No-Go zones were defined as safe (Go) and dangerous (No-Go) zones of dissection (Fig. 3a) [29].

Fig. 3

a Annotation of Go- (marked in green) and No-Go zones (marked in red) using open-access software LabelMe; b Annotations of the important anatomical structures and achievement of the CVS: Gallbladder (yellow), cystic artery (red), cystic duct (green), instrument (blue)

Video frames 3 and 4 were annotated by the tutor and the trainee regarding the identification of the CVS (Fig. 3b) [36].
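To make the annotation step more concrete, the following is a minimal sketch of what a single annotated video frame could look like in LabelMe's JSON layout, together with a small helper that measures how much of the frame each zone covers. The label names (`go`, `no_go`), the file name, the version string, and all coordinates are hypothetical, and the area-fraction measure is an illustration only; it is not a quantity used in the study.

```python
import json


def polygon_area(points):
    """Area of a simple polygon [[x, y], ...] via the shoelace formula."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


# Hypothetical annotation of one frame, following LabelMe's JSON keys.
frame = {
    "version": "5.0.1",            # assumed version string
    "flags": {},
    "shapes": [
        {"label": "go",            # assumed label for the safe zone
         "points": [[100, 100], [300, 100], [300, 200], [100, 200]],
         "group_id": None, "shape_type": "polygon", "flags": {}},
        {"label": "no_go",         # assumed label for the dangerous zone
         "points": [[350, 250], [550, 250], [350, 400]],
         "group_id": None, "shape_type": "polygon", "flags": {}},
    ],
    "imagePath": "lc_frame_02.png",  # hypothetical file name
    "imageData": None,
    "imageHeight": 480,
    "imageWidth": 640,
}


def zone_fractions(annotation):
    """Fraction of the image covered by each polygon label."""
    total = annotation["imageHeight"] * annotation["imageWidth"]
    fractions = {}
    for shape in annotation["shapes"]:
        if shape["shape_type"] != "polygon":
            continue  # skip rectangles, points, and other shape types
        share = polygon_area(shape["points"]) / total
        fractions[shape["label"]] = fractions.get(shape["label"], 0.0) + share
    return fractions


# Round-trip through JSON, as would happen when the file is saved and reloaded.
reloaded = json.loads(json.dumps(frame))
fractions = zone_fractions(reloaded)
```

Keeping the annotations in this plain JSON form means the same files can later be reused for segmentation research without conversion.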

Primary outcomes

The study's primary outcome was the comparison of LC technical skills between the two groups using global and task-specific OSATS and GOALS assessment scores after each LC performed [34, 37].

OSATS is a standardized, validated assessment tool implemented in many academic centers to measure operative performance [34]. The global OSATS score evaluates respect for tissue, efficiency, use and knowledge of the instruments, camera assistance, and workflow (35 attainable points in total). The task-specific OSATS score assesses the following LC aspects: (a) retraction of the gallbladder, (b) preparation of Calot’s triangle, (c) preparation of the cystic duct, (d) preparation of the cystic artery, (e) preparation of the gallbladder, (f) clipping and cutting, (g) knowledge of the procedure, and (h) quality of the end product (70 attainable points). GOALS can be used both as a general performance assessment tool for any MIS procedure and for procedure-specific assessment, such as video assessment of LC in the learning stage observed in medical students [37].

Secondary outcomes

The total time spent performing the LC, the complication assessment, CVS achievement, and difficulty of the LC were recorded as well. Complications such as perforations, injuries to the cystic artery and/or cystic duct, and misplacement of the clips were assessed using a 3-point Likert scale [38]. The difficulty of each LC was evaluated by the trained tutor of the MIS section supervising the trainees using a visual analog scale (VAS) [37].

Data regarding the subjective effects of the predefined training were recorded for each trainee using predefined questionnaires. The questions were related to the subjective assessment of personal improvement in performing the LC and acquiring MIS skills. The feedback group received additional questions associated with the feedback and video debriefing.

Sample size

The sample size calculation was based on a previous study in the same setting [39]. With a two-sided α = 0.05, the sample size gives 80% power to detect a standardized effect of d = 0.64. This effect represents approximately 2.5 points for the general skills scale and about 3.5 points for the specific skills scale. As the general skills scale parameters range from 1 to 5 and those for the specific skills from 2 to 10, the effect would reflect an improvement of precisely one scale unit, which is reasonably small. The sample size determination for the total scores of general and specific scales can only be estimated as the correlation between the two scales is unknown. Assuming a positive correlation of ρ = 0.5, the standard deviation for the total scores of the scales would be 7.86 for both groups. With the sample size of 40 participants per group and α = 0.05 two-tailed, a difference of 5 points would be detected (for example, 3 points for the general skills area and 2 points for the specific skills area) with a power of 80%.
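The figures above can be checked with a short calculation. The sketch below uses the common normal approximation for a two-sample comparison of means, n per group = 2((z₁₋α/₂ + z_power)/d)²; the function names are illustrative, and the approximation is an assumption, since the text does not state which formula or software was used for the original calculation.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Participants per group needed to detect effect size d (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def approx_power(d, n, alpha=0.05):
    """Approximate power of a two-sided test with n participants per group."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(d * sqrt(n / 2) - z_alpha)


# Effect size stated in the text.
n_for_d = n_per_group(0.64)

# Effect size implied by a 5-point difference with a standard deviation of 7.86.
d_total = 5 / 7.86
n_for_total = n_per_group(d_total)

# Power reached with the 40 participants per group that were enrolled.
power_at_40 = approx_power(0.64, 40)
```

Under this approximation both effect sizes lead to slightly fewer than 40 participants per group. An exact calculation based on the t distribution typically adds about one participant per group, which is consistent with the 40 per group reported here.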

Statistical analysis

Statistical analysis and descriptive statistics were performed with the SPSS software (version 25.0, IBM SPSS Inc., Chicago, Illinois, USA), and data were given as absolute frequency and mean ± standard deviation. Differences between the groups were assessed using the t-test for independent samples for parametric data and the Mann–Whitney U test for independent samples for non-parametric data. For binary endpoints, group differences were calculated using the Chi-square test. A p-value < 0.05 was considered statistically significant.
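For readers without access to SPSS, two of the tests named above can be sketched with the Python standard library alone. This is an illustration under simplifying assumptions, not a reproduction of the SPSS procedures: the Chi-square test is the uncorrected Pearson version for a 2 × 2 table, the Mann–Whitney p-value uses the normal approximation without tie correction, and all input numbers are made up.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom
    return chi2, p


def mann_whitney_u(x, y):
    """U statistic of x versus y and a two-sided normal-approximation p-value."""
    # Count pairs where x exceeds y; ties count as one half.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))
    return u, p


# Made-up binary endpoint: achieved / not achieved in two groups of 40.
chi2, p_chi2 = chi_square_2x2(30, 10, 10, 30)

# Made-up ordinal scores for two small groups.
u, p_u = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

The t-test is left out because its p-value needs the t distribution, which the standard library does not provide. In practice, `scipy.stats.ttest_ind`, `scipy.stats.mannwhitneyu`, and `scipy.stats.chi2_contingency` cover all three tests.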

Results

Eighty medical students were included in the study. The medical students were randomized into the feedback group (n = 40) and the control group (n = 40). All 80 trainees completed the study at the Department of General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Germany.

Primary outcomes

The total global and task-specific GOALS performance scores were significantly higher in the feedback group than in the control group (17.5 ± 4.4 vs. 16.0 ± 3.8, p < 0.001; 6.6 ± 2.3 vs. 5.9 ± 2.1, p = 0.001, respectively). When comparing total global (26.5 ± 7.3 vs. 23.4 ± 5.1, p = 0.003) and task-specific (47.6 ± 12.9 vs. 36.0 ± 12.8, p < 0.001) OSATS performance scores for all 4 LCs, the feedback group performed better than the control group (Table 1).
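To relate these differences to the standardized effect used in the sample size calculation, the reported means and standard deviations can be converted into Cohen's d. The sketch below assumes equal group sizes, so the pooled standard deviation is the root mean of the two variances; this is an illustrative calculation and was not part of the reported analysis.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference, assuming equal group sizes."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# (feedback mean, feedback SD, control mean, control SD) as reported in the text.
reported = {
    "global GOALS": (17.5, 4.4, 16.0, 3.8),
    "task-specific GOALS": (6.6, 2.3, 5.9, 2.1),
    "global OSATS": (26.5, 7.3, 23.4, 5.1),
    "task-specific OSATS": (47.6, 12.9, 36.0, 12.8),
}

effect_sizes = {name: cohens_d(*values) for name, values in reported.items()}

for name, d in effect_sizes.items():
    print(f"{name}: d = {d:.2f}")
```

The resulting values can be compared with the planning value of d = 0.64 from the sample size section.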

Table 1 Outcome parameters for all laparoscopic cholecystectomies

Other than task-specific OSATS, all performance scores of the first LC were comparable between the two groups (Table 2). The performance improvement according to the global and task-specific GOALS and OSATS scores was observed after the second LC (Table 2).

Table 2 Comparison of performance scores for individual laparoscopic cholecystectomies

Secondary outcomes

The feedback group achieved CVS more often than the control group (75.2% vs. 24.8%, p < 0.001) (Table 1) (Fig. 4). Despite having to perform more difficult LCs compared to the control group (33.7 ± 9.2 vs. 29.7 ± 8.4, p < 0.001), the feedback group had fewer complications than the control group (2.9 ± 2.1 vs. 3.7 ± 2.3, p = 0.001) (Table 1) (Fig. 4).

The total time of the performed LCs was significantly shorter in the control than in the feedback group (68.2 ± 32.7 vs. 84.1 ± 35.9 min, p < 0.001) (Table 1).

Fig. 4

Comparison of complication rate and CVS achievement of the performed LCs between the feedback and the control group

When comparing global and task-specific GOALS and OSATS performance scores for each LC individually, the feedback group showed continuous performance improvement with comparable baseline scores after the first LC (Fig. 5).

Fig. 5

Comparison of the global and task-specific GOALS and OSATS performance scores for each LC between the feedback and the control group

After completing the training, the feedback group was asked to fill out a predefined questionnaire regarding the subjective effects of the training. The annotations of the Go- and No-Go zones were perceived by 75% of the students to have helped improve their technical skills in LC. Fifty-two percent of the trainees found the annotation of the CVS very helpful for their overall operative performance. The structured feedback and video debriefing with CVS annotation made 65% of the trainees feel more confident performing an LC than before (Fig. 6).

Fig. 6

Questionnaire about the subjective effects of the annotations of Go- and No-Go zones and CVS, and structured feedback and video debriefing after each LC on improving performance, confidence, and technical skills during LC in the feedback group

Discussion

This randomized-controlled single-center study showed that structured feedback and video debriefing with CVS annotations in teams of two after each LC (n = 4) contributed to the performance improvement according to global and task-specific GOALS and OSATS performance scores. The initial scores after the first LC, except for task-specific OSATS scores, were comparable between the two groups. There is no objective reason why the task-specific OSATS score would differ between the groups since the groups consisted of trainees of comparable previous training but with no previous MIS experience. Nonetheless, the tendency for performance improvement over time was significantly more visible in the feedback than in the control group, which shows the positive continuity of learning effects of structured feedback and video debriefing with CVS annotations on LC performance.

To ensure safe and successful outcomes in MIS, surgeons should participate in structured, comprehensive MIS training programs to develop skills in precise instrument handling, hand–eye coordination, and effective teamwork [40,41,42,43,44]. Individual feedback and video debriefing have positively affected MIS training [21,22,23,24,25,26, 45]. Several online platforms now allow surgeons to upload their surgical procedures and analyze them or receive feedback or video debriefing from their peers to improve their performance (https://www.csats.com/, https://medtube.net/) [46]. However, neither the CVS annotation, an important safety step in LC, nor the segmentation of the safe (Go) and dangerous (No-Go) zones of dissection has so far been actively incorporated into video debriefing and structured feedback [28, 47,48,49,50]. These intervention aspects make this study unique in its design. Furthermore, the intervention was performed on ex-vivo porcine LC, the training form closest to real LC, enabling a realistic and safe space for practicing and improving surgical skills. The study emphasized that the time invested in the annotation/segmentation aspect of the intervention is worthwhile to improve surgical skills and reduce complications in a safe training setting.

A study by O'Connell et al. evaluated targeted video feedback on LC performed by individual trainees [23]. The video feedback positively impacted the performance with an improved demonstration of CVS and the dissection of Calot’s triangle. Similarly, the feedback group in the present study had significantly higher global and task-specific GOALS and OSATS scores in total and after each individual postinterventional LC compared to the control group. Implementing structured feedback and video debriefing containing CVS annotation in the surgical training of young surgical residents could shorten the LC learning curve and improve performance in LC, a standard training procedure in most hospitals [51].

The reason for choosing the training in teams of two was a study by Kowalewski et al., which evaluated the performance improvement of trainees in teams of two compared to individual trainees undergoing laparoscopic training courses [39]. It showed reduced operation time in the group where trainees were in teams of two and presented the training model as a promising alternative when training time and resources are limited. However, although both groups were trained in teams of two, performance improvement, according to the objective assessment through global and task-specific GOALS and OSATS scores, was observed more in the feedback than in the control group.

Video debriefing after surgery also reduces technical errors in MIS [45]. Hamad et al. reported a significantly reduced technical error rate in laparoscopic jejunojejunal anastomosis in surgical residents who received postoperative video debriefing compared to those who did not receive the postoperative video debriefing [45]. Similarly, the present study's feedback group experienced performance improvement and had fewer complications after structured feedback and video debriefing with CVS annotation. A reduction in complication rates was reached despite the greater difficulty of LCs in the feedback group. This indicates that structured feedback and video debriefing with CVS annotation can be a valuable training modality for LC training.

Achieving CVS is essential to perform a safe LC [49, 50, 52, 53]. Inadequate completion of CVS during LC has been identified as one of the contributory factors of common bile duct injury [53]. An intraoperative 5-second time-out to verify CVS has been shown to improve the CVS achievement rate and, therefore, can assist in avoiding typical intraoperative pitfalls [28]. The feedback group in the present study performed better and achieved CVS more often than the control group, which is essential in performing a safe LC in a real OR setting [49, 52, 54]. However, the feedback group had a higher initial CVS achievement than the control group even before the intervention, which could potentially influence the CVS achievement discrepancy. Nevertheless, CVS achievement in the feedback group tended to improve over time, whereas the CVS achievement curve in the control group remained flat. The low rate of CVS achievement in the control group could be attributed to a lack of active and structured feedback on individual steps of LC that emphasizes errors from the previous LCs. Without postprocedural error analysis and reflection, the control group failed to improve its initially low CVS achievement rate. This further emphasizes that novice surgeons need postprocedural feedback reflecting on the potential improvement of the technical steps at the beginning of their training.

In this study, we also observed that the achievement rate of the CVS remained low in both groups. A possible reason for this could be the fact that the trainees had no previous surgical experience. The trainees were at the beginning of their MIS learning curve, and further improvements are expected as their experience increases over time [55]. The second reason could be the thinner porcine anatomical CVS structures, which make CVS recognition and achievement more difficult.

It was interesting to observe that the students from the control group were faster in total operative time than those from the feedback group. The prolonged operative time in the feedback group may result from careful attention to the important anatomical structures and safety of the procedure since the feedback group had significantly fewer complications and achieved CVS more often than the control group. Although prolonged operative time can be associated with an increased risk of postoperative complications such as wound infections and pneumonia, it should not be the only factor for surgical quality assessment and clinical outcome prediction. Operative time has been criticized as a performance measure because it does not necessarily reflect the quality of the performance [56,57,58,59,60].

Madani et al. described the potential of AI-based intraoperative guidance through semantic segmentation with the identification of safe (Go) and dangerous (No-Go) zones of dissection during LC [29] in order to achieve safer LC performance and reduce intraoperative complications [61]. In the present study, a similar concept of annotating the Go and No-Go zones of dissection has been used as part of the structured feedback and video debriefing with a CVS annotation after each LC, which led to a significant performance improvement and reduced intraoperative complication rate. As a byproduct, these annotations can be used for Surgical Data Science (SDS) research and development projects, and can thus create synergies between training, research, and development in surgery [62].

Most trainees reported that structured feedback and video debriefing with Go- and No-Go zone and CVS annotations helped them improve their technical skills and confidence in performing LC. This aspect of the training is valuable since confidence and performance are interrelated and can reduce subjective workload in MIS [63, 64].

MIS training offers many modalities that are often combined, such as blended learning [7], box-trainer, VR trainer [6, 9], augmented reality [14], telementoring [65], and telestration [17, 19]. Combining these advantageous modalities with other beneficial factors, such as training in teams of two and postprocedural feedback and video debriefing with CVS annotation, could improve MIS training [39, 66, 67].

There are some limitations to be addressed regarding the present study. The participants of the study were medical students without previous MIS experience. This limits the transfer of the study results into a clinical setting because surgical residents would have different surgical predispositions. Despite randomizing trainees without previous surgical experience, the feedback group had increased initial CVS achievement and task-specific OSATS score after the first LC, even before structured feedback and video debriefing with CVS annotation took place. However, the progressive continuity of the observed learning curve regarding performance and CVS achievement demonstrated positive effects of the intervention over time and repetitions. Other scores, such as global and task-specific GOALS and global OSATS, were comparable after the first LC. The preclinical setting of the study could be seen as a potential limitation, and the concept needs to be tested in a clinical setting to verify the results. Lastly, the intervention contained three potentially independent factors (structured feedback, video debriefing, and annotation), which could have been applied individually. This makes it impossible to discern the impact of each factor individually.

The study assessed the effects of structured feedback and video debriefing with CVS annotation on ex-vivo porcine LC performance compared to verbal instructions on demand only. That is why there is no comparison between the effects of structured feedback and video debriefing and CVS annotation with other training modalities, such as image-guided surgery or telestration with augmented reality. This should be a subject of future randomized-controlled studies to optimize MIS training.

The present randomized-controlled study showed a positive impact of structured feedback and video debriefing with CVS annotation after ex-vivo porcine LC in novice trainees. Achievement of CVS was improved, and complication rates were reduced. The structured feedback and video debriefing also contributed to increased self-confidence and subjective safety of the procedure. For these reasons, postoperative structured feedback and video debriefing with CVS annotation could be integrated into preclinical and clinical training. In the near future, automated intraoperative feedback with artificial intelligence models, such as assessment of CVS achievement [36] and intraoperative image guidance like the definition of Go- and No-Go zones [29], will become available in daily routine. A reflection on the performed surgery and aspiration for continuous improvement through structured feedback and video debriefing with an experienced and skilled mentor should always form the basis for continuous improvement of surgical quality and skills.