Introduction

The education methods for training in gynaecological surgery are being challenged by different forces and influences, such as the boundaries of the traditional apprentice–tutor model, the ethical objective to limit patient morbidity and error rate during surgery and the continuous pressure on cost effectiveness. Against such challenges, it is mandatory that objective measurable levels of practical skills should be established and validated prior to gynaecological surgery.

Some aspects of this complex problem of training, education and certification in laparoscopic surgery were recently addressed in a very striking way by the Dutch Ministry of Health (http://www.igz.nl/actueel/nieuwsberichten/mic) because the health inspection found an unacceptable number of serious complications in common laparoscopic procedures. The inspection assessed the manner in which patient safety is assured and the quality of the procedures in terms of practitioner skills and training. The report concluded that training in laparoscopic techniques was inadequately structured and that it is a matter of concern that the standards which a future laparoscopist must meet in order to operate, either independently or under supervision, have not been adequately established.

It seems obvious, although it has not yet been implemented, that a future laparoscopist should possess objectively measurable theoretical and practical skills before entering a one-to-one teaching process in the operating room (OR). Different models have been proposed for this aim. In vitro models (e.g. trainer box, virtual reality) allow the monitoring of skills acquisition in a relaxed and controlled environment [1–3]. Trainer boxes are relatively cheap and accessible [4] and are as effective as virtual reality models [5], but unfortunately the scientific validation of most of them remains insufficient and underreported.

A new in vitro model, called the Laparoscopic Skills Testing and Training (LASTT) model, aimed to train and measure essential laparoscopic psychomotor skills (LPS; i.e. camera navigation, hands–eyes coordination, bimanual coordination), has recently been developed by the European Academy of Gynaecological Surgery (EAGS) [6]. It has been suggested that this model can be a cost-effective tool for continuous training and evaluation of LPS in all surgical disciplines that perform laparoscopic procedures because it is tutor independent, relatively cheap and suitable for any trainer box. Its feasibility and construct validity (i.e. its capacity to distinguish between experienced and inexperienced surgeons) have been well demonstrated [6].

This study was designed to evaluate the face validity (the realism of the method) and to confirm the construct validity of the LASTT model in a large population of residents and specialists in OB&GYN attending test sessions organised by the EAGS. The participants were classified according to their level of exposure to laparoscopic procedures, aiming to evaluate the correlation between the clinical experience and the proficiency in the essential LPS.

Methods

Participants and venue

The study enrolled residents and gynaecologists with different levels of experience in laparoscopic surgery (n = 199) who voluntarily attended workshops organised by the EAGS during the 20th European Congress on Obstetrics and Gynaecology (EBCOG) in Lisbon, Portugal, in 2008 (n = 56), the 24th Annual Meeting of the European Society of Human Reproduction and Embryology (ESHRE) in Barcelona, Spain, in 2008 (n = 58), the Workshop on Laparoscopic Hysterectomy in Tübingen, Germany, in 2008 (n = 24), the Laparoscopic Suturing Course in Leuven, Belgium, in 2009 (n = 7), the 19th European Meeting of the European Network for Trainees in Obstetrics and Gynaecology (ENTOG) in Budapest, Hungary, in 2009 (n = 12) and the 30th Congress of the Spanish Society of Gynaecology and Obstetrics (SEGO) in Barcelona, Spain, in 2009 (n = 42).

Instruments and materials

The LASTT model, suitable for performing three standardised exercises, as described elsewhere [6], was used. The insert, with the relevant materials for the different exercises, was placed into the Szabo trainer box (Karl Storz, Tuttlingen, Germany). The exercises were performed with standard instruments (10-mm 0°/30° optic, 5-mm Kelly forceps and Matkowitz forceps) and an all-in-one (monitor, light source and video camera) laparoscopic tower (Telepack, Karl Storz, Tuttlingen, Germany).

Laparoscopic exercises (E)

E1—camera navigation

This exercise was used to evaluate the participant’s ability to navigate a laparoscopic camera with a 30° optic. This was done by measuring their ability to identify 14 different targets placed at different sites in the LASTT model [6]. Each target included a large symbol identifiable from a panoramic viewpoint and a small symbol only identifiable from a close-up viewpoint (Fig. 1a). The exercise started by identifying the large symbol on the first target (i.e. 1) and then the small symbol situated next to it, which had to be shown in the centre of the screen. This small symbol indicated the next large symbol to be identified. Following this order, the participant continued until the identification of the small symbol on the last target (i.e. end).

Fig. 1

The LASTT model. a Setup for E1 (camera navigation). b Setup for E2 (hands–eyes coordination). c Setup for E3 (bimanual coordination)

E2—hands–eyes coordination

This exercise was used to evaluate the participant’s ability to navigate a laparoscopic camera with a 0° optic with the non-dominant hand (NDH) and to handle a laparoscopic forceps with the dominant hand (DH). This was done by measuring their ability to grasp and transport six pre-defined objects to six pre-defined targets in the LASTT model, which was fitted with coloured objects (5 × 4-mm open cylinders) and coloured targets (10 × 1-mm nails) [6]. The matched targets and objects were identifiable by colour (Fig. 1b). The exercise started by identifying a target and an object of the same colour. Then, the object was grasped, transported and placed onto the relevant nail. Only when the participant succeeded in placing the open cylinder onto the matched nail was he/she allowed to continue with the next object of a different colour.

E3—bimanual coordination

This exercise was used to evaluate the participant’s ability to handle laparoscopic forceps simultaneously with the DH and the NDH. This was done by measuring the participant’s ability to grasp six pre-defined objects with the DH and re-grasp and transport them with the NDH to six pre-defined targets on the LASTT model, which was fitted with coloured objects (10 × 5-mm push pins with a tail of 10 mm) and coloured targets (20-mm holes) [6]. The matched targets and objects were identifiable by colour (Fig. 1c). The exercise started by identifying a target and an object of the same colour. Then, the push pin was grasped by the head with the Matkowitz forceps (DH) and exposed to the Kelly forceps (NDH), with which it was re-grasped by the tail, transported and introduced into its target. Only when the participant succeeded with the introduction of an object into its target was he/she allowed to continue with the others.

Experimental design, scoring and statistics

Standardised sessions with simultaneous working stations were organised during the above-mentioned meetings. Each station had a tutor and two participants at a time. At the beginning of the session, participants completed a survey about their demographics (Table 1) and previous exposure to gynaecological laparoscopy using the classification of the European Society for Gynaecological Endoscopy. This classification establishes four levels of procedures: first level (basic), second level (intermediate), third level (advanced) and fourth level (special procedures) [6]. For each level, the number of procedures performed was recorded and then scored in the following categories: no procedures (score 0), 1–30 procedures (score 1), 31–50 procedures (score 2) and more than 50 procedures (score 3). The scores obtained in each level were added, giving a final score ranging from 0 to 12. This score, which can result from many possible combinations, represents the number of laparoscopic procedures to which an individual was exposed.
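The exposure score described above can be sketched in a few lines of Python. This is only an illustration of the stated categories; the function names and the example participant are hypothetical and are not taken from the study.

```python
from typing import Sequence


def level_score(n_procedures: int) -> int:
    """Map the number of procedures performed at one level to a score of 0-3."""
    if n_procedures < 0:
        raise ValueError("number of procedures cannot be negative")
    if n_procedures == 0:
        return 0  # no procedures
    if n_procedures <= 30:
        return 1  # 1-30 procedures
    if n_procedures <= 50:
        return 2  # 31-50 procedures
    return 3      # more than 50 procedures


def exposure_score(procedures_per_level: Sequence[int]) -> int:
    """Add the four per-level scores into a final exposure score (0-12)."""
    if len(procedures_per_level) != 4:
        raise ValueError("expected counts for exactly four levels")
    return sum(level_score(n) for n in procedures_per_level)


# Hypothetical participant: 40 basic, 12 intermediate, 0 advanced, 0 special
print(exposure_score([40, 12, 0, 0]))  # 2 + 1 + 0 + 0 = 3
```

Because the per-level scores are summed, different procedure profiles can lead to the same final score, which is why the score is best read as an overall measure of exposure rather than of case mix.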

Table 1 Participants’ demographics

At the time of data analysis, participants were classified in three groups (G). G1 comprises those with no or very little exposure to laparoscopy (final score 0 or 1). G2 comprises those with limited exposure to laparoscopy (final score 2 or 3). G3 comprises those with important exposure to laparoscopy (final score equal to or more than 4).
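As a minimal sketch, the group assignment can be written as a single threshold function on the final exposure score (0 to 12). The function name is hypothetical; the cut-offs are the ones stated above.

```python
def exposure_group(final_score: int) -> str:
    """Assign a participant to G1, G2 or G3 from the final exposure score."""
    if not 0 <= final_score <= 12:
        raise ValueError("final score must lie between 0 and 12")
    if final_score <= 1:
        return "G1"  # no or very little exposure (score 0 or 1)
    if final_score <= 3:
        return "G2"  # limited exposure (score 2 or 3)
    return "G3"      # important exposure (score 4 or more)


print([exposure_group(s) for s in (0, 1, 2, 3, 4, 12)])
```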

The exercises were performed in a fixed order: E1, E2 and finally E3. A full explanation and demonstration were given at the beginning of each exercise. For each exercise, each participant performed three repetitions, alternating with his/her partner. This number was determined to avoid the major learning effect observed after three repetitions and to be consistent with previous studies [6]. For each repetition, the time was limited to 120 s for E1 and to 180 s for E2 and E3. This limit was based on previous observations that a large number of participants could finalise the task within this period and on the obvious time restrictions encountered during large medical meetings.

In each repetition, the number of objectives actually achieved was recorded (i.e. targets identified for E1 and objects transported for E2 and E3). When no objective was accomplished, a value of 0.5 was assigned. The measurement of the exercises was based on the time to correctly perform the exercise (TCPE), which implicitly reflects errors and economy of movement and thereby safeguards accuracy. Since some participants might not finalise the task in the assigned time, the final score was obtained by dividing the actual time used by the number of objectives effectively accomplished. The average values of the triplicate observations were used for statistical analysis. The scores were not normally distributed, and therefore median values (inter-quartile range) are presented. The correlation between the self-reported level of exposure to laparoscopy and the LPS was evaluated with the Spearman test. Inter-group differences were evaluated by the Kruskal–Wallis test (with Dunn’s correction for multiple-comparison tests).
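The scoring rule can be illustrated with a short Python sketch. It assumes that the value of 0.5 stands in for the number of objectives when none was accomplished (which avoids a division by zero); that reading, the function names and the example repetitions are assumptions made here for illustration.

```python
from statistics import mean

# Assumed stand-in for the objective count when nothing was accomplished
NO_OBJECTIVE_VALUE = 0.5


def repetition_score(time_used_s: float, objectives_done: int) -> float:
    """Seconds per accomplished objective for one repetition (lower is better)."""
    if time_used_s <= 0 or objectives_done < 0:
        raise ValueError("time must be positive and objectives non-negative")
    denominator = objectives_done if objectives_done > 0 else NO_OBJECTIVE_VALUE
    return time_used_s / denominator


def exercise_score(repetitions):
    """Average the scores of the repetitions, given as (time, objectives) pairs."""
    return mean(repetition_score(t, n) for t, n in repetitions)


# Hypothetical E2 results: time limit reached once (180 s, 4 of 6 objects),
# then all 6 objects transported in 150 s and in 120 s.
print(exercise_score([(180, 4), (150, 6), (120, 6)]))  # (45 + 25 + 20) / 3 = 30.0
```

Dividing by the number of objectives puts complete and incomplete repetitions on the same scale, so a participant who runs out of time still contributes a comparable score.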

To assess the face validity of the LASTT model, participants were asked to respond to a questionnaire with 11 questions (Q; Table 2) using a 10-cm visual analogue scale (VAS; 0: not realistic/good/useful; 10: very realistic/good/useful). Q1–Q8 examined the usefulness of the different exercises and of the overall model in terms of their testing and training capacities for LPS. These questions were answered at the end of each exercise (Q1–Q6) and at the end of the session (Q7–Q8). Q9–Q11 examined the usefulness of the model in terms of its realism and relevance to laparoscopic surgery and were answered at the end of the session. The scores were not normally distributed, and therefore median values (inter-quartile range) are presented. Inter-group differences were evaluated by the Kruskal–Wallis test (with Dunn’s correction for multiple-comparison tests).
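For readers who wish to reproduce the "median (inter-quartile range)" summaries, the following sketch uses Python's standard library. The quartile convention applied in the original analysis is not stated, so the "inclusive" method used here is an assumption, and the VAS values are invented for illustration.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of scores."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    # Assumed quartile convention: linear interpolation including both extremes
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical VAS answers (cm) to one question
vas_scores = [9.5, 7.0, 8.5, 9.0, 8.0]
med, q1, q3 = median_iqr(vas_scores)
print(f"{med} ({q1}-{q3})")  # 8.5 (8.0-9.0)
```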

Table 2 Questionnaire to test the face validity of the model and the exercise

All statistical comparisons were performed with the GraphPad Prism Software, and two-tailed P values <0.05 were considered significant.
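The Spearman correlation between exposure and performance was computed in GraphPad Prism; the sketch below only illustrates the underlying idea (a Pearson correlation of average ranks, with ties sharing their mean rank) in plain Python. It does not compute P values, and the function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: higher exposure scores paired with lower (better) scores
exposure = [0, 1, 2, 3]
performance = [40.0, 30.0, 20.0, 10.0]
print(spearman_rho(exposure, performance))  # -1.0
```

A negative coefficient is the expected direction in this setting, since lower performance scores (fewer seconds per objective) indicate better skills.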

Results

Seven participants did not perform all the assigned tests and were therefore excluded from the study. The remaining 192 participants were classified according to their level of exposure to laparoscopic surgery: G1 (n = 85), G2 (n = 44) and G3 (n = 63). Their demographics are reported in Table 1.

Face validity

The response rate to the questionnaire was 97.7% (97.3% in G1, 97.5% in G2 and 98.3% in G3). The total population (Table 2), as well as the three groups separately (Fig. 2), gave a favourable response to this model.

Fig. 2

Median (inter-quartile range) scores of a questionnaire assessing the face validity of the LASTT model using a 10-cm VAS (0: not realistic/good/useful; 10: very realistic/good/useful). Participants were grouped according to their level of exposure to laparoscopic surgery in three groups: G1 (no or little exposure, green bars), G2 (intermediate exposure, yellow bars) and G3 (important exposure, orange bars)

The first six questions (Q1–Q6), in which the individual exercises were evaluated in terms of their testing and training capacities, were well validated by all participants, all receiving a score above 8, without inter-group differences. This evaluation of the individual exercises was reflected in the overall assessment of the model. Indeed, all participants considered that the model was good for testing purposes (Q7), with a score of 8.8 (8.0–9.4), and for training purposes (Q8), with a score of 9.0 (8.0–9.8), without inter-group differences.

In the last three questions, the relevance of the model for actual laparoscopic surgery (Q9) and its realism in simulating the female pelvis (Q10) and the movements required to perform laparoscopic surgery in the pelvis (Q11) were evaluated. Q9 and Q11 were positively judged, with scores of 8.4 (7.2–9.3) and 8.4 (7.0–9.5), respectively. Q10, however, received a score of only 7.3 (5.3–9.0), making it the question with the lowest rating. For none of the three questions were inter-group differences detected.

Construct validity

In all exercises, participants previously exposed to a larger number of laparoscopic procedures achieved better results than participants exposed to fewer procedures.

For camera navigation (E1, Fig. 3), a negative correlation (r = −0.39; P < 0.0001) between the scores and the level of exposure to laparoscopy was found. The scores were 14 (9–21) for G1, 11 (8–18) for G2 and 8 (7–10) for G3. G3 scored better than G1 (P < 0.001) and G2 (P < 0.01). The differences between G2 and G1 were not significant.

Fig. 3

Exercise 1 (camera navigation). The ability to identify 14 different targets placed at different sites in the LASTT model was evaluated. The left graph shows the scores of the participants as a function of their exposure to laparoscopic surgery (range 0 to 12). The right graph shows the median (inter-quartile range) scores of the three groups (G1: no or little exposure to laparoscopy, G2: intermediate exposure to laparoscopy, G3: important exposure to laparoscopy). ***P < 0.001 (G1 vs. G3); °°P < 0.01 (G2 vs. G3)

For hands–eyes coordination (E2, Fig. 4), a negative correlation (r = −0.42; P < 0.0001) between the scores and the level of exposure to laparoscopy was found. The scores were 45 (31–91) for G1, 33 (21–48) for G2 and 29 (20–40) for G3. G3 scored better than G1 (P < 0.001) but not G2 (P = NS). G2 scored better than G1 (P < 0.01).

Fig. 4

Exercise 2 (hands–eyes coordination). The ability to grasp and transport six pre-defined objects to six pre-defined targets with the DH in the LASTT model was evaluated. The left graph shows the scores of the participants as a function of their exposure to laparoscopic surgery (range 0 to 12). The right graph shows the median (inter-quartile range) scores of the three groups (G1: no or little exposure to laparoscopy, G2: intermediate exposure to laparoscopy, G3: important exposure to laparoscopy). ##P < 0.01 (G1 vs. G2); ***P < 0.001 (G1 vs. G3)

For bimanual coordination (E3, Fig. 5), a negative correlation (r = −0.44; P < 0.0001) between the scores and the level of exposure to laparoscopy was found. The scores were 40 (30–64) for G1, 31 (26–41) for G2 and 25 (16–34) for G3. G3 scored better than G1 (P < 0.001) and G2 (P < 0.01). G2 scored better than G1 (P < 0.05).

Fig. 5

Exercise 3 (bimanual coordination). The ability to grasp six pre-defined objects with the DH and re-grasp and transport them with the NDH to pre-defined targets in the LASTT model was evaluated. The left graph shows the scores of the participants as a function of their exposure to laparoscopic surgery (range 0 to 12). The right graph shows the median (inter-quartile range) scores of the three groups (G1: no or little exposure to laparoscopy, G2: intermediate exposure to laparoscopy, G3: important exposure to laparoscopy). ***P < 0.001 (G1 vs. G3); °°P < 0.01 (G2 vs. G3)

Discussion

In the classic apprenticeship system of “see one, do one, teach one”, feedback is directly provided during surgery in the OR, and surgery is learnt by the student through simple observation and later imitation of the actions of a skilled mentor. However, there are critical factors for the current use of this model, such as the necessity of a high volume of surgical procedures, the availability of a sufficient number of skilled mentors, the limitation of resident working hours and some ethical and financial constraints.

Training in laparoscopic surgery compounds these problems because, besides the typical surgical skills, the trainee has to acquire the LPS (i.e. hand–eye coordination, camera navigation, remote handling of instruments without tactile feedback and fine motor skills to deal with the fulcrum effect and the lever forces of the long instruments). The acquisition of these skills through the classic apprenticeship system seems not only impossible but also ethically unacceptable, as it might increase the complication rate of laparoscopic procedures. To shorten learning curves and to reduce accidents and complications, specific LPS and some typical surgical skills, such as suturing or knot tying, must be learnt outside the OR.

Although several training devices and methods for laparoscopic skills acquisition have been reported [7–14], most studies focused on models that recreate operative conditions and very few on the specific LPS. Virtual reality models have been proposed in this regard but, as they are still very expensive, a simple and broad implementation (not only at specialised centres) is not feasible today. The EAGS has developed the LASTT model, which focuses on the acquisition and measurement of three essential LPS (i.e. camera navigation, hand–eye coordination and bimanual coordination) [6]. The experience gathered so far demonstrates that this model can be used as an insert in a conventional trainer box and is feasible for large-scale adoption. Furthermore, its capacity to distinguish between novices and experts (i.e. construct validity) has been recently demonstrated [6].

The present study extends these observations by demonstrating the face validity of the LASTT model, which indicates the appropriateness of the method on “its face value” (the resemblance of a test task to the actual clinical task). Since the LASTT model neither simulates nor represents any specific laparoscopic procedure (there is no specific clinical task involved), its face validity should be assessed very cautiously. The model aims to train and measure LPS only, which is a concept not easily understood by physicians without experience in laparoscopic surgery who want to rapidly acquire proficiency to perform specific laparoscopic procedures and not only to navigate a camera or to handle an instrument. The total population, as well as the three study groups separately, gave a favourable opinion about the face validity of the model for testing and training LPS and also about its face validity for actual laparoscopic surgery. The realism of the model in simulating the female pelvis was, however, rated lowest, which is not surprising given that the LASTT model is not intended to resemble the anatomy of a female pelvis but only the spatial distribution and orientation of its different planes and angles.

To assess the face validity of the model, it is important to be aware that the opinions of both types of participants might be influenced by several factors [15]. To what extent the less-experienced laparoscopists are just polite or feel obliged to fill in a questionnaire in exchange for a chance to “play” with the model is difficult to measure. Another effect that must be considered is the “Pygmalion effect”, named after Pygmalion, a king figure from ancient Greek mythology, who carved a sculpture out of stone so skilfully that he fell in love with it. In this setting, it might be that the answers of the experienced, but especially of the less-experienced laparoscopists, were influenced by the mere enthusiasm of the LASTT developers, who took part in the workshops and gave the demonstration. The less-experienced laparoscopists may be particularly inclined to give favourable responses because they have spent less time in the field of surgery and because they are relatively unprotected against the tempting display of the model and instruments by industry. On the other hand, the more-experienced laparoscopists, trained by the classical apprentice–tutor model, may be more conservative in their opinion of surgical training novelties. Nevertheless, even if these phenomena had an influence, they were not reflected in the outcomes of the study, as shown by the strong uniformity of opinion amongst groups.

In addition to demonstrating the face validity of the LASTT model, this study confirms its construct validity in another population and demonstrates the construct validity of an adapted scoring system. In the previous studies, each exercise was evaluated with different scoring systems in which the time and/or the number of objectives to accomplish were limited [6]. This resulted in data with very large variability and in a poorly reproducible scoring system. The scoring was therefore modified to be more easily and universally applied. In this study, the maximum time allowed for each repetition was limited for practical organisational reasons, and quantifiable objectives were defined for each task. The main parameter measured was TCPE (i.e. the time required to successfully finalise the task). It was likely, however, that some participants would not be able to finalise the exercise within the time frame. To be able to include the data of those participants, the TCPE score was adapted, and the final score was obtained by dividing the time used by the number of objectives effectively accomplished.

It must be admitted that other factors that could influence the performance of the tasks, such as errors and economy of movement, were not directly measured. It was assumed that, with some limitations, they would nevertheless affect the TCPE and thus be reflected in the final score. In spite of these limitations, this scoring system is the same for the three tasks and allows for the comparison of scores obtained at different locations, in which the time dedicated to the task could be different. Furthermore, it has the advantage of being objective, tutor independent and useful for self-assessment.

Besides the face and construct validity of the LASTT model, this study also demonstrates a significant correlation between the self-reported clinical exposure to laparoscopic surgery and the three LPS evaluated. Indeed, the very poor LPS (with large inter-individual variability) observed in most participants with little exposure to laparoscopy become progressively better (with small inter-individual variability) as a function of the number of laparoscopic procedures performed. Although the study was not intended to detect the proficiency level of the LPS (participants performed three repetitions only), the data indicate that, if novices were to reach the experts’ LPS levels through clinical exposure in the OR alone, an enormous number of laparoscopic procedures would be necessary. This would be not only unethical but also virtually impossible in the current residency programmes and confirms the necessity of validated pre-graduate (outside the OR) training programmes.

The data obtained in this and in the previous study [6] suggest that the LASTT model can be a useful tool for training and assessing the LPS of residents in their own hospitals/universities. It provides tutors with an objective method for evaluating basic laparoscopic skills, which, together with theoretical knowledge of anatomy, laparoscopic principles, instrumentation and OR functioning, are the essential pre-requisites before an in vivo OR training programme with the apprentice–tutor model can be started.

Furthermore, the LASTT model can be used as a research tool for evaluating the different parameters of the learning curve (e.g. length, shape, slope and plateau) in order to optimise the acquisition and retention of laparoscopic skills in a specific group and/or an individual trainee and, more importantly, for the establishment of performance standards [16–18].