The development and psychometric validation of a low-cost anthropomorphic 3D-printout simulator for training basic skills applicable to office-based hysteroscopy

Hysteroscopy training requires the development of specific psychomotor skills. Few validated low-cost models exist in hysteroscopy. The main objective of this study was to determine the face, content, and construct validity of a simulator designed for training basic hysteroscopy skills applied to office-based hysteroscopy. Twenty-five hysteroscopy experts and 30 gynecology residents participated in this prospective observational study. The simulator consisted of three color-textured, silicone-coated anthropomorphic 3-dimensional (3-D) printout uterine models inside a box. Each uterine model in the simulator was designed to develop one of the following basic hysteroscopic skills: hysteroscopic navigation, direct biopsy, and foreign body removal. Participants performed five video-recorded simulation attempts on each model. Procedure-specific checklists were used to rate performance. Median scores (25th–75th percentiles; p-value) for the realism of the internal cavity of the simulator, 4 (3–4; p < 0.001), and for the surgical experience associated with the simulated procedures, 4 (3–4; p < 0.001), indicated positive expert perceptions. Median scores of 4 (3–4; p < 0.001) were assigned to the realism and utility of the tasks performed in the simulator for enhancing novice training in hysteroscopy. Expert performance scores were significantly higher and task completion times were significantly lower than those of novices in the navigation exercise (F(1,53) = 56.66; p < 0.001), the directed biopsy exercise (F(1,53) = 22.45; p < 0.001), and the foreign body removal exercise (F(1,53) = 58.51; p < 0.001). Novices’ performance improved on all three exercises: navigation exercise (F(1,53) = 182.44; p < 0.001), directed endometrial biopsy (F(1,53) = 110.53; p < 0.001), and foreign body removal (F(1,53) = 58.62; p < 0.001).
Experts’ task completion times were significantly lower than those of novices across the five attempts (p < 0.001) of the exercises: navigation (F(1,48) = 25.46; p < 0.001), directed biopsy (F(1,46) = 31.20; p < 0.001), and foreign body removal (F(1,50) = 69.8; p < 0.001). Novices’ task completion times diminished significantly throughout the sequence of exercises. The low-cost simulator designed for the acquisition of basic skills in hysteroscopy demonstrated face, content, and construct validity.


Background
Hysteroscopy is a minimally invasive procedure considered the gold standard for the evaluation and treatment of intrauterine diseases [1]. Currently available small-diameter scopes have permitted the performance of in-office hysteroscopic procedures, allowing non-traumatic insertion into the cervix by the vaginoscopy technique [2,3]. Therefore, hysteroscopic proficiency requires specific training for the development of new psychomotor skills. The importance of a simulator as a training tool has been extensively emphasized [4,5].
Simulators allow training in a safe environment, real-time expert feedback, and repeated practice of each step of the procedure [6-8]. Recommendations have been made for institutions to provide dry labs for training endoscopic surgical skills [9,10].
The successful integration of a new simulation system into a training curriculum demands rigorous evaluation of its validity [11]. The first step is to demonstrate that the new simulator resembles real-life situations, that is, face validity. The extent to which the new simulator allows acquisition of the abilities required for proficient performance in real patients, that is, content validity, must also be assured. Validation procedures then aim to demonstrate that the new simulator can detect distinct levels of operator proficiency, for example, by distinguishing the performance of novices from that of experts [12]. It is also desirable to demonstrate that the simulator can detect changes in performance during repeated attempts, that is, the learning curve [13].
We hypothesized that a novel, inorganic, anthropomorphic simulator incorporating 3-D printing would demonstrate high levels of face, content, and construct validity. The present study aimed to test these hypotheses by applying specific procedures for face, content, and construct validation.

Materials and methods
Data were collected at two general hospitals. The institutional review boards (IRBs) of the participating institutions approved the study. Written informed consent was obtained from participants before inclusion in the study. From October 2018 to April 2019, a convenience sample consisting of 30 gynecology residents with no previous experience in hysteroscopy and 25 experts participated in the study. Experts were gynecologic surgeons who had received formal training in hysteroscopic procedures and had been practicing diagnostic and operative hysteroscopy for more than 3 years at the time they were enrolled in the study.

The simulator
The anthropomorphic anatomical models used in the simulator were manufactured in acrylonitrile butadiene styrene (ABS) on a fused deposition modeling (FDM) 3-D printer (3-D Printer Prusa I3 Rework, ITCTERM, Brazil). The 3-D model was scanned from a 9 × 7 × 6 cm pyriform mold simulating a human uterus. The models contained two hemicavities internally lined with textured, rose-colored acetic silicone to mimic the appearance of the human endometrium. A 2-cm diameter opening was created to allow the atraumatic passage of the hysteroscope.
Three uterine models were embedded in latex foam inside an opaque 30-cm-wide, 28-cm-long, 18-cm-high cardboard box. Three 3-cm circular apertures were made on the longest box wall. The three apertures were framed with ABS ring-shaped connectors to which red-colored male condoms were attached; the condoms, which simulated vaginas, were connected to the openings of the uterine models inside the box (Fig. 1). Each uterine model in the simulator was designed to develop one of the basic hysteroscopic skills.
The first model was designed to develop intrauterine navigation skills. It was equipped with nine distinct adhesive images positioned on the anterior, posterior, left lateral, and right lateral walls; on the uterine fundus; and on the anterior and posterior tubal ostia, represented by two red markings inside the model. The second uterine model was designed to develop direct biopsy skills. Five electronic touch-sensors (Micro Switch, KW-1 3pins, Dongnan, China) covered by non-slip adhesive tape were connected to a microcontroller board, Arduino Uno (Smarts Projects, Ivrea, Italy), which was connected to a microcomputer via a USB (universal serial bus) port. The sensors were positioned through openings in the model's wall in the uterine fundus and on the anterior, posterior, left, and right walls of the inner surface of the model. A visual sign on the computer screen indicated which sensor had been pressed at each attempt. The third uterine model was designed for developing foreign body removal skills. The model was equipped with five 8-mm-diameter apertures on the model wall through which 8-mm-diameter, 4-cm-long mini-balloons were inserted 5 mm into the model cavity.

Face and content validity
Face and content validities were based on expert perceptions of the realism of tasks performed on the simulator compared to procedures performed on real patients. Face validity was evaluated by assessing the degree of expert concordance with the following statements: (1) the appearance of the simulator's internal cavity resembles that of the human uterine body cavity, and (2) the procedures performed on the simulator resemble the office-based hysteroscopy procedures performed on human patients. Responses were measured on four-point unidirectional, forced-response scales, where the extreme points 1 and 4 indicated full disagreement and full agreement, respectively. Also, the realism of the global experience in performing the tasks on the simulator was assessed on a 10-point rating scale (1 = absolutely nonrealistic; 10 = absolutely realistic).
Content validity was based on experts' agreement with statements addressing the similarity between the tasks performed on the simulator and the hysteroscopic maneuvers performed in real patients, for example, hand-eye coordination, accurate visualization through a 30°-angle scope, and the use of grasping forceps. Also, the utility of learning experiences in the simulator for training novices was used to assess the content validity of the simulator. Responses were measured on four-point unidirectional scales where the extreme points 1 and 4 indicated full disagreement and full agreement, respectively.

Construct validity
The ability to discriminate between expert and resident performance determined the construct validity of the simulator. Participants' performance was measured on procedure-specific checklists used for rating performance on the video-recorded simulation sessions.
To develop the procedure-specific checklists (Table 2 in Appendix), three authors (A.R.P., L.B.G., and L.K.V.) independently searched the literature to create technical performance checklists addressing the main steps of the simulation tasks: hysteroscopic navigation, direct biopsy, and foreign body removal [14,15]. Items in the checklists addressed ergonomics, image visualization, safe navigation, and handling of the grasping forceps [12]. Items of the checklists represented the elements or steps identified in the three tasks that could be measured and could potentially differentiate among levels of technical competence. Three-point scales were added to the scoring rubrics, where zero indicated an unskilled performance, as expected from a novice; one indicated a somewhat skilled performance, as expected from an in-training novice; and two indicated a skilled performance, as expected from an experienced surgeon [16].
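The three-point rubric described above can be illustrated with a minimal Python sketch. The item names below are only the four domains named in the text, used here as illustrative keys; the actual checklist items are those in Table 2, and the ratings are hypothetical. Averaging summative scores across raters is an assumption made for this example, not a statement of how the authors aggregated their data.

```python
from statistics import mean

# Scoring levels taken from the text:
# 0 = unskilled, 1 = somewhat skilled, 2 = skilled.
VALID_LEVELS = {0, 1, 2}


def summative_score(item_scores):
    """Sum the 0-2 ratings one rater gave to each checklist item of one video."""
    for item, level in item_scores.items():
        if level not in VALID_LEVELS:
            raise ValueError(f"{item}: rating {level} is outside the 0-2 rubric")
    return sum(item_scores.values())


def mean_score_across_raters(ratings_by_rater):
    """Average the summative scores several raters gave to the same video."""
    return mean(summative_score(r) for r in ratings_by_rater)


# Hypothetical ratings of one video by three raters (illustrative item names).
video_ratings = [
    {"ergonomics": 1, "image_visualization": 2, "safe_navigation": 1, "forceps_handling": 2},
    {"ergonomics": 1, "image_visualization": 1, "safe_navigation": 1, "forceps_handling": 2},
    {"ergonomics": 2, "image_visualization": 2, "safe_navigation": 1, "forceps_handling": 2},
]

print(mean_score_across_raters(video_ratings))
```

Rejecting out-of-range ratings at scoring time keeps data-entry errors from silently inflating or deflating a summative score.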

Simulation sessions
Simulation sessions took place in private rooms equipped with video endoscopy equipment consisting of a high-definition monitor (Karl Storz, Germany), a cold light source XENON 300 (Karl Storz, Germany), and a video-camera IMAGE 1 HUB™ (Karl Storz, Germany). Also available in the simulation room were a HOPKINS® Forward-Oblique 30°, 2.9-mm Telescope (Karl Storz, Germany), BETTOCCHI® inner and outer sheaths (Karl Storz, Germany), and semirigid 5-Fr grasping forceps (Karl Storz, Germany), which were used to complete the tasks in the simulator. The simulator box was positioned on a 90-cm-high table for the training sessions. The study participants were offered the choice of performing the tasks in the standing or the sitting position. Most trainees preferred the sitting position with the video equipment positioned anteriorly and to their left. An instructor conducted all simulation sessions, which involved only one participant at a time. Formative feedback on participant performance was allowed [13].
Simulation sessions consisted of a preparatory and a hands-on phase. The preparatory phase aimed to provide the participant with relevant theoretical and practical guidance about the tasks to be simulated. The preparatory phase started soon after the arrival of the participant at the simulation room. The instructor started by providing a structured 20-min presentation addressing the definition, the indications, the equipment, the technique, and the complications of office-based hysteroscopy. Next, participants were informed about the technical aspects of the tasks to be simulated, their metrics, and goals. Participants were further instructed about how to handle the scope and the grasping forceps. The preparatory phase of the experiment finished after the instructor presented the simulator, demonstrated the tasks, and allowed the participants to manipulate the simulator, the video, and the hysteroscopy equipment for approximately 5 min before starting the hands-on phase of the simulation session.
Participants performed three exercises (tasks) during the hands-on phase of the simulation session. The navigation exercise (task 1) consisted of identifying, visualizing, and centering the nine targets and simulated tubal ostia inside the first model, followed by obtaining a panoramic view of the interior of the uterine model before withdrawing the hysteroscope. The aim of the direct biopsy simulation exercise (task 2) was to press the sensors positioned inside model 2 with the tip of a grasping forceps. The foreign body removal simulation exercise (task 3) consisted of using grasping forceps to clamp the mini-balloons inserted through the wall of the third model and pull them at least 3 cm into the simulator cavity. Participants repeated each exercise five times. Task completion times were counted from the moment of insertion to the moment of withdrawal of the hysteroscope from the simulator. Participant performance was digitally recorded (Canon EOS 60D 18 MP, Canon U.S.A., Inc., Huntington, NY, USA) for further analyses. Three authors (A.R.P., L.B.G., and L.K.V.) rated the videos independently [17].
At the end of the simulation session, participants answered the demographic data questionnaire. Experts also completed the eight-item face and content validity questionnaire.

Statistical analysis
Statistical analysis was performed using SPSS v 26.0 (SPSS Inc., Chicago, IL, USA). Basic descriptive statistics were calculated for demographic data. All continuous variables were presented as mean (standard deviation) or median (range), and nominal variables were presented as frequency (percentage). The Kolmogorov-Smirnov test was used to assess normality. Generalized linear mixed-model (GLM mixed-model) ANOVA was used to compare scores and task completion times within and between groups. Median scores of the items used to assess the face and content validities were tested against the mathematical center of the scales by using one-sample Wilcoxon signed-rank tests.
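As an illustration of the one-sample Wilcoxon signed-rank test against the mathematical center of a scale, the following Python sketch uses the normal approximation with a tie correction and no continuity correction. It is not the SPSS routine used in the study, whose output may differ slightly. The ratings are hypothetical, and the center of 2.5 for a four-point scale is assumed to be the midpoint of the scale's end points.

```python
import math
from collections import Counter


def wilcoxon_one_sample(values, center):
    """Return (W_plus, z, two_sided_p) for H0: the median equals `center`.

    Normal approximation with tie correction; zero differences are dropped.
    """
    diffs = [v - center for v in values if v != center]
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations equal the reference value")

    # Assign average ranks to |d| so that tied ratings share one rank.
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    tie_sizes = Counter(abs_sorted).values()
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - sum(t**3 - t for t in tie_sizes) / 48
    z = (w_plus - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal tail
    return w_plus, z, p


# Hypothetical ratings from 25 experts on a four-point scale.
ratings = [4] * 15 + [3] * 8 + [2] * 2
w_plus, z, p = wilcoxon_one_sample(ratings, center=2.5)
print(f"W+ = {w_plus}, z = {z:.2f}, p = {p:.2g}")
```

Because a four-point scale yields many tied ratings, the tie correction in the variance term matters here; an exact test would be preferable for very small samples.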
Psychometric analyses of the technical performance checklists included exploratory factor analyses by the principal-component extraction method and varimax rotation with Kaiser normalization to determine the factorial structure of the instruments and the estimation of Cronbach's alpha coefficients to assess the internal consistency of the technical performance checklists.
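Cronbach's alpha, the internal-consistency coefficient mentioned above, can be sketched in a few lines of Python. The function below applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), to a matrix with one row per rated video and one column per checklist item. The ratings are hypothetical and the sketch does not reproduce the SPSS output of the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows (one row per video, one column per item)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two rated videos")
    if any(len(row) != k for row in rows):
        raise ValueError("every row must rate the same number of items")

    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 0-2 ratings of six videos on a four-item checklist.
ratings = [
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 2, 1, 1],
    [2, 2, 2, 1],
    [2, 2, 2, 2],
    [0, 1, 0, 0],
]
print(round(cronbach_alpha(ratings), 3))
```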
Inter-rater agreement was evaluated by estimating intraclass correlation coefficients and their 95% confidence intervals among the scores attributed by raters 1, 2, and 3 to the items in the checklists and among the average scores, under the following assumptions: the same raters assessed all videotaped procedures, the selected raters were representative of a larger population of raters (experts in ambulatory hysteroscopy), the reliability of the mean value of multiple raters was the measure of interest, and consistency of ratings was sought. Based on these assumptions, two-way random-effects models were used [18].
The estimated sample size was 44 participants for five attempts with a power of 80% at an α level of 0.05. Calculations were based on an effect size (ε²) of 0.057 observed in a previous study [12]. Fifty-five subjects were enrolled in the study to account for potential losses.
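The assumptions listed above (same raters for all videos, mean of multiple raters, consistency) correspond to the average-measures consistency form of the intraclass correlation, which can be computed from the two-way ANOVA mean squares as (MS_rows − MS_error) / MS_rows. The Python sketch below implements only this point estimate on hypothetical ratings; the 95% confidence intervals reported in the study are not reproduced here.

```python
def icc_consistency_average(ratings):
    """Average-measures, consistency ICC from a two-way layout.

    `ratings` holds one row per video and one column per rater.
    """
    n = len(ratings)          # number of rated videos (rows)
    k = len(ratings[0])       # number of raters (columns)
    if n < 2 or k < 2:
        raise ValueError("need at least two videos and two raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    if ms_rows == 0:
        raise ValueError("videos do not differ; the ICC is undefined")
    return (ms_rows - ms_error) / ms_rows


# Hypothetical summative scores of five videos given by three raters.
scores = [
    [10, 11, 10],
    [14, 15, 13],
    [20, 21, 21],
    [25, 27, 26],
    [30, 29, 30],
]
print(round(icc_consistency_average(scores), 3))
```

A consistency coefficient ignores systematic differences between raters, so three raters who differ only by a constant offset still yield a value of 1.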

Results
Twenty-five experts and thirty residents completed the study. Demographic data are shown in Table 1. A total of 825 video records of participants' performance were obtained and randomized (www.ramdom.org).

Face and content validity
Median scores (25th-75th percentiles) assigned by experts to the statements that assessed face validity were significantly higher than the mathematical center of the scale (p < .001), indicating positive perceptions about the realism of the internal cavity of the simulator, 4 (3-4; p < .001), and of the surgical experience associated with the simulated procedures, 4 (3-4; p < .001). Also, the median score of the global experience on the simulator was 9 (25th-75th percentiles = 9-10), differing significantly from the mathematical center of the scale (p < .001). Median scores of 4 (3-4; p < .001) were assigned to the realism and utility of the tasks performed in the simulator as applicable to the training of novices in hysteroscopy.

Psychometric analyses of the technical performance procedure-specific checklists
A total of 825 video recordings of participants' performance were obtained, of which 27 (3.2%) were excluded due to low image quality. Three authors rated the remaining 798 videos. Ratings were used for psychometric analyses of the technical performance checklists.
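The recording counts follow directly from the study design reported in the Methods (25 experts and 30 residents, three tasks, five attempts per task). A short Python check, using only figures stated in the text:

```python
EXPERTS, RESIDENTS = 25, 30
TASKS = 3
ATTEMPTS_PER_TASK = 5
EXCLUDED_LOW_QUALITY = 27


def expected_recordings(participants, tasks, attempts):
    """One recording per participant, task, and attempt."""
    return participants * tasks * attempts


total = expected_recordings(EXPERTS + RESIDENTS, TASKS, ATTEMPTS_PER_TASK)
rated = total - EXCLUDED_LOW_QUALITY
print(total, rated)  # 825 798
```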
The factorial structure of the checklist used to assess participants' performance in task 1 comprised two factors, which together explained 59.89% of score variance: factor 1 (items related to image visualization ability) (Eigenvalue = 7.42; explained variance = 49.48%) and factor 2 (ergonomics-related items) (Eigenvalue = 1.56; explained variance = 10.41%). The factorial structure of the scale used to assess participants' performance in task 2 comprised a single factor that explained 51.62% of score variance (Eigenvalue = 3.097). The factorial structure of the scale used to assess participants' performance in task 3 comprised a single factor that explained 52.92% of score variance (Eigenvalue = 3.176).
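The reported eigenvalues and percentages of explained variance can be cross-checked against each other. Assuming the principal-component extraction was run on the correlation matrix, so that each item contributes one unit of variance, the explained proportion equals the eigenvalue divided by the number of items. The Python sketch below inverts that relation. The item counts it prints are inferred from the reported figures under this assumption and are not stated in the text.

```python
# (label, eigenvalue, explained variance in %) as reported in the text.
REPORTED = [
    ("task 1, factor 1", 7.42, 49.48),
    ("task 1, factor 2", 1.56, 10.41),
    ("task 2", 3.097, 51.62),
    ("task 3", 3.176, 52.92),
]


def implied_item_count(eigenvalue, explained_pct):
    """Number of items implied by eigenvalue / (explained proportion).

    Valid only under the assumption of a PCA on the correlation matrix.
    """
    return eigenvalue / (explained_pct / 100.0)


for label, eigenvalue, pct in REPORTED:
    print(f"{label}: about {implied_item_count(eigenvalue, pct):.2f} items")
```

Both task 1 factors point to the same checklist length, and tasks 2 and 3 point to a shorter common length, which suggests the reported eigenvalues and percentages are internally consistent.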

Construct validity
Expert technical performance scores were constant during the five attempts and significantly higher than those of novices for the navigation (F(1,53) = 56.66; p < .001), direct biopsy (F(1,53) = 22.45; p < .001), and foreign body removal tasks (F(1,53) = 58.51; p < .001). Residents' performance at the navigation tasks improved from the third to the fifth attempt (F(1,53) = 182.44; p < .001; Fig. 2a). Significant improvement was observed at the fourth attempt at the direct endometrial biopsy (F(1,53) = 110.53; p < .001; Fig. 3a). Residents' performance improved significantly from the second through the fifth attempts at the foreign body removal exercise (F(1,53) = 58.62; p < .001; Fig. 4a).

Discussion
The main results of this study were that the new simulator exhibited high face, content, and construct validity. Also, highly reliable instruments were used to produce the measures of technical performance.
Face validity of anthropomorphic simulators requires close resemblance between what the operator sees in the simulator and in real patients (realism). For this reason, the face validity of anthropomorphic simulators must rely on expert perception of the realism of the simulator's internal appearance [19,20]. To meet this requirement, the simulator developed for the present study was lined with colored, textured silicone to give it a realistic aspect, mimicking the endometrial cavity. The silicone lining also aimed to protect the hysteroscope lens during the training sessions. The tubal ostia were marked in red, standing out from the rest of the cavity to direct navigation and differentiating this simulator from other low-fidelity alternatives, such as the simulator developed by the European Academy of Gynaecological Surgery [21] and the hysteroscopic component of the Essentials in Minimally Invasive Gynecology Hysteroscopy Simulation System (EMIG) [22]. Like the hysteroscopic component of the EMIG system, which was developed while this research was under way, the present simulator showed high face validity [22]. Some fruits and vegetables can also be used as inexpensive simulators that resemble the human uterus and allow training in surgical procedures [5], such as the butternut pumpkin model, which was deemed an excellent choice and used by the Royal Australian and New Zealand College of Obstetrics and Gynecology [23]. However, such models are perishable and may not be suitable for use in hospital settings.
High content validity indicates that the simulator allows the acquisition of the basic skills necessary to safely and proficiently complete the addressed procedures. Content validity grants the simulator the property of serving as a useful training tool. Because experience in office-based hysteroscopy is necessary to assess content validity, only experts' ratings were used in this study, as recommended elsewhere [20,24]. Because the simulator includes three procedure-specific uterine models, users can acquire skills in the most frequent and challenging maneuvers performed during office-based hysteroscopy. Navigation training includes the manipulation of the hysteroscope within the uterine cavity to obtain a clear view through the 30°-angle optics; novices can also practice reaching small targets inside the uterine cavity, as required for the successful performance of hysteroscopic endometrial biopsy. The simulator also permits the acquisition of skills in manipulating grasping forceps, which are crucial for foreign body removal. Furthermore, unlike virtual simulators, our simulator provides haptic feedback that increases the realism of the task [25]. To the authors' knowledge, no other study has addressed the content validity of simulators designed for the acquisition of skills for office-based hysteroscopy [5,26].
High construct validity of a simulator indicates the ability to differentiate the performance of experts and novices practicing on it. In addition, the demonstration that training in the simulator improves skills over time is also an indicator of construct validity, which is a necessary feature of simulators designed to follow the learning curve of novices during their path to proficiency [27]. By strategically positioning targets on every uterine wall, the simulator posed distinct levels of difficulty in reaching the targets, thus allowing for differentiation between expert and novice performance. Furthermore, success at progressively more difficult targets progressively increased the summative score, allowing for the detection of technical performance improvement among novices. Similarly, a progressive decrease in the time needed to touch the targets might also be evidence of performance improvement, as demonstrated in a previous study [28]. Despite significant improvement in technical ability and decreases in task completion time over five attempts, residents did not reach the experts' performance level, indicating that the learning curves of the simulated tasks are longer and deserve further study. Nevertheless, this finding corroborates the need for hysteroscopic simulation training before procedures are performed on real patients [5,29,30].
The strengths of the current simulator are the high face, content and construct validity, and its affordable production cost (approximately US$150.00), easy reproducibility, and portability. These characteristics make the simulator an attractive tool for training and diffusing office-based hysteroscopy technique in developing countries where scarce resources challenge the acquisition of expensive simulators [31].
Low cost and easy reproducibility were the main concerns in the design of the simulator. Incorporating 3-D technology into the manufacturing of the uterine models fulfilled both requirements. However, equipping the model with resources for training vaginoscopy, insertion of the hysteroscope through the uterine cervix, and management of irrigation and complications, as demonstrated in virtual simulators, would have substantially increased the cost of the simulator. These limitations have also been found in other low-fidelity simulation studies, such as those of HYSTT and the hysteroscopic component of EMIG [21,22,32]. Still, low-fidelity simulators demonstrate training capacity for basic skills [32].

Conclusion
Before its implementation as a training tool, a newly developed simulator requires independent validation of its face, content, and construct validity. This study described a new simulator designed for the acquisition of skills relevant to the safe performance of office-based hysteroscopic procedures. Training on the simulator produced realistic experiences for the user, an indication of high face validity; experts also highly rated the usefulness of the exercises performed on the simulator for the training of novices, an indication of content validity. Performance on the simulator also differentiated levels of surgical experience, an indication of construct validity.

Table 2 (Appendix, excerpt). Scoring levels of the procedure-specific checklists

Score 1: Did not have a logical sequence but did not touch the walls and performed panoramic viewing. Score 2: Executed a good sequence, did not touch the walls, and performed panoramic viewing.

Handling of instruments
Score 0: Difficulty depressing the sensors and pulling the mini-balloons; large distance between the grasping forceps and the hysteroscope. Score 1: Depressed the sensors and pulled the mini-balloons but maintained a large distance between the forceps and the hysteroscope. Score 2: Depressed the sensors, pulled the mini-balloons, and maintained a good distance between the forceps and the hysteroscope.