Development and Validation of a Performance Assessment Scale for Chest Tube Insertion in Traumatic Pneumothorax

Aiham Ghazali1,2*, Alexandre Léger3, Franck Petitpas2,4, Youcef Guéchi5, Amélie Boureau-Voultoury2,6 and Denis Oriot2,6
1Department of Emergency, Pitié-Salpêtrière University Hospital, Paris, France
2Faculty of Medicine, Simulation Laboratory, Poitiers, France
3Department of Pediatric, Basse Terre Medical Center, Guadeloupe, France
4Surgical Intensive Care Unit, University Hospital, Poitiers, France
5Department of Emergency, University Hospital, Poitiers, France
6Department of Pediatric Emergency, University Hospital, Poitiers, France


Background
Surgical insertion of a chest tube is a mandatory procedure in traumatic pneumothorax cases [1,2]. However, this procedure remains stressful for the operator, especially when that person is inexperienced [3,4]. Moreover, poorly performed thoracic drainage can lead to severe, potentially lethal complications [5][6][7]. Different chest tube insertion techniques have been described. In a trauma setting, the recommended surgical approach includes dissection of the muscular layers until penetration of the pleural membranes, followed by insertion of a gloved finger into the chest cavity to confirm pleural placement and strip any adhesions prior to insertion of the chest tube. This technique is likely to diminish the number of complications [8], and is routinely taught in Advanced Trauma Life Support (ATLS) courses [8]. Simulation-based chest tube insertion training helps participants to build confidence in their ability to perform a high-stakes procedure in a realistic and secure environment [9][10][11][12]. An objective checklist of essential items (11 items) has already been used (but not described) for chest tube insertion in a model of trauma (TraumaMan*, Simulab®) [13]. Nevertheless, to our knowledge there exists no published validated performance assessment scale pertaining to this procedure. We therefore set out to develop and validate a performance assessment scale for chest tube insertion in traumatic pneumothorax, at once applying simulation-based methods as a proxy for the applied environment, and covering all the steps of the procedure performed in adults and children. Psychometric validation requires a consistent assessment platform through which potentially confounding variables specific to individual cases can be controlled. Simulation-based practice has shown its transferability to applied practice [9][10][11][12] and therefore provides an opportunity to collect validity evidence.
Our performance assessment scale was conceived as a clinical tool that could be used to assist feedback, teaching, and assessment during simulation sessions, and for research purposes.
The creation of the instrument and its psychometric testing followed the five-step framework of Downing [14]: content, response process, internal structure, relationship to other variables, and consequences.

Creation of the instrument
Content: A trauma surgeon (MS), an emergency physician (AG), and a pediatric intensivist (DO), all of them ATLS-certified experts with 10 to 20 years of experience in trauma management, listed assessment items for surgical chest tube insertion in traumatic pneumothorax in children or adults based on guidelines [8,15] and literature [1][16][17][18]. Items were chosen to reflect the current safest approach to chest tube insertion, i.e., through use of a tube without chuck or handle, inserted in the 4th or 5th intercostal space on the mid-axillary line [1,8,16,19]. Items that could not be assessed by observation (because they depend on the operator's tactile perception of pressure) were not included in the scale: loss of resistance while penetrating the pleural membranes, and connection of the chest tube to the bi-conic tip without pulling. Asepsis was recorded during the procedure in two ways: 1) through the asepsis items collected directly on the scale (use of sterile gloves, sterile drape, and antiseptic solution); 2) by recording subsequent breaches of asepsis during the procedure. This second assessment was made independently of the scale and was used during the debriefing phase.
The scale's sections were determined by the experts according to the relative clinical importance they attributed to each specific step of the procedure. Because emphasis was given to the technical approach, half of the items dealt with incision, dissection, and introduction of the chest tube. Each item was rated either "1" (correctly performed) or "0" (incorrectly performed or not performed). This phase led to the pre-scale, which included 8 practical steps and a total score ranging from 0 to 20.
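The scoring rule described above (binary items summed to a total out of 20) can be sketched in a few lines of Python. This is only an illustration: the function name `total_score` and the example ratings are hypothetical and do not come from the study or its appendix.

```python
def total_score(item_ratings):
    """Sum binary item ratings into a total score out of 20.

    item_ratings: 20 values, each 1 (correctly performed) or
    0 (incorrectly performed or not performed).
    """
    if len(item_ratings) != 20:
        raise ValueError("expected exactly 20 item ratings")
    if any(r not in (0, 1) for r in item_ratings):
        raise ValueError("each rating must be 0 or 1")
    return sum(item_ratings)


# Hypothetical rating sheet: 14 items performed correctly, 6 not.
example = [1] * 14 + [0] * 6
print(total_score(example))
```

Because every item carries one point, the weight of a section is simply its share of the items.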

Response process:
The pre-scale was tested and modified during several simulation-based training sessions performed by nine senior physicians experienced in surgical chest tube insertion in adults and children (7 emergency physicians, 1 surgical intensivist, and 1 pediatric intensivist), with a view to reducing the sources of error associated with the assessment process itself. The head research investigator chose a relevant scenario (left traumatic pneumothorax in an adolescent/young adult), run on a previously published model [20]. By means of a step-by-step approach, some of the items were deleted, combined, or made more specific for the sake of greater objectivity, while the description of the procedure was maintained. This phase enabled us to simplify the items taken from 4 steps: incision and dissection, verification of location, insertion of chest tube, and connection of chest tube. A pediatric-specific item (tunnelization of the chest tube through the chest wall) was added so that the scale could be applied whatever the circumstances. Items assessing preparation and connection to the water seal device were not included, because the commercial models of tubing used were so variable as to defy comparison. This factor did not confound performance assessment, since performance could be recorded as high with different commercial tubings. Landmarks for the puncture site were assessed after insertion to avoid interfering with the procedure. The final performance assessment scale included 20 items from 8 sections (Appendix 1), of which the most important was "incision and dissection" (6 items, 30% of total score). Success of chest tube insertion was assessed by observation of the intra-thoracic steps of the procedure, which was made possible through use of a webcam connected to the model [20], and was recorded separately from the rating of the items of the scale.

Psychometric testing
Participants and simulation setting: One hundred and four emergency physicians and emergency or pediatric residents who had registered for a medical university course on emergency procedures were invited to participate in the study. Because the technical approach to the surgical procedure is rarely taught in France (and consequently rarely practiced), all participants (including senior physicians) received a one-hour academic lesson, prior to the simulation session, on surgical chest tube insertion for traumatic pneumothorax. The model employed was a surgical chest tube insertion simulator we had previously developed and tested; it was constructed with a lamb half-chest tightly fixed on a box, with a webcam behind it enabling analysis of the intrathoracic steps of the procedure. Realistic representation of the pleural membranes was ensured by a double plastic film between the chest and the box cover [20]. All simulated cases were identical: an emergency chest tube insertion in a 17-year-old patient presenting with a left traumatic pneumothorax.
We compared the performance score (process) to the functional aspect (success). Both were recorded on the same assessment form for reasons of convenience. An initial sample of 39 participants (sample A: 16 senior emergency physicians and 23 residents) was included in 2013-2014. Besides measuring process and success, the goal was to calculate the threshold value of a cut-off score above which 100% of chest tubes would be functional. This value would be of great interest for future pass/fail threshold determination. Secondly, to determine whether the effect of training was paralleled by an increase in performance score, another sample of 65 participants (sample B: 18 senior emergency physicians and 47 residents) was included in 2015, of whom 35 were randomly assigned to deliberate practice with feedback prior to assessment, and 31 to assessment on the simulator without prior simulation practice. All of the sessions were videotaped to allow the participants to view their performances and to identify their errors during debriefing.
Observers: Each chest tube insertion was assessed by two independent observers among 6 physicians (who were not teamed consistently as set pairs). All of them had previously repeatedly practiced chest tube insertion in traumatic cases; moreover, 4 of them were ATLS-certified, and 4 were course instructors. All of the observers were trained (for 1 hour) by two of us (AG and DO) to rank each item on the basis of what was observed de visu or on a laptop screen (webcam). Observers did not communicate scoring to each other, and were not allowed to discuss ratings. They were neither instructors nor research investigators. After each assessment, the research director verified that all the items on the assessment scale had been filled out, but he did not modify any of the rankings given by the observers.

Data analysis
Analysis was carried out with Statview version 4.5 software (SAS Institute Inc., Cary, NC). Descriptive analysis included the percentage, mean, and standard deviation (SD) of every variable. Comparative analysis used the paired Student t-test. Internal consistency of the scale was analyzed by the Cronbach alpha coefficient established on 39 scenarios. Interobserver reproducibility was analyzed by intraclass correlation coefficient (ICC), comparison of means, number of incorrect items between two independent observers, and linear regression analysis. The F-test was used to compare the variance of the scores obtained by observer 1, by observer 2, and the mean of the scores of the two observers. Scores were compared between successful and unsuccessful attempts, and the threshold value of the score with the best positive and negative predictive values for the functional aspect was determined (receiver operating characteristic curve). Correlation between training and performance or success rate was assessed with the Spearman test. A p value of <0.05 was considered significant.
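For readers who wish to reproduce the internal-consistency calculation outside Statview, the Cronbach alpha coefficient can be sketched as follows. This is a minimal illustration using the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores); the function name `cronbach_alpha` and the toy ratings are hypothetical and are not the study data.

```python
from statistics import pvariance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a participants-by-items table of scores.

    ratings: list of rows, one per participant, each row holding
    the k item scores of that participant.
    """
    k = len(ratings[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across participants.
    item_vars = [pvariance([row[i] for row in ratings]) for i in range(k)]
    # Variance of the participants' total scores.
    total_var = pvariance([sum(row) for row in ratings])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): 4 participants rated on 3 binary items.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(cronbach_alpha(toy))
```

Population variance is used for both the items and the totals; using sample variance for both would give the same coefficient, since only their ratio enters the formula.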

General findings
Three experts selected the 20 items of the scale, and 6 observers assessed a total of 104 simulation sessions performed by 104 individuals, each with a double assessment (208 assessment files). Mean score was 13.51 ± 3.36 out of 20 for the whole population (n=104), 12.78 ± 2.70 for sample A (n=39), and 13.95 ± 3.76 for sample B (n=65).

Validity analysis
Mean and standard deviation are given for each section in Table 1. Internal consistency of the scale was analyzed on sample A by the global Cronbach alpha coefficient of the scale, which was 0.747. As could be expected, successful insertions had higher performance scores than failed ones, which allowed for almost absolute discrimination between the two populations (Figure 1). This finding reflected a very strong correlation between 'success' and 'process' of insertion, as shown for a threshold set at a score ≥14/20 (Table 2 and Figure 2). In fact, a score ≥14/20 was predictive of a 100% success rate during chest tube insertion (Figures 1 and 2). The cut-off score (nearest point on the receiver operating characteristic curve) with the best positive predictive value of success (95.2%) and the best negative predictive value (88.9%) was 13/20 (Table 2).
The two randomized populations of sample B (n=65) did not differ in terms of status or experience (Table 3). Both performance scores and success rates were found to correlate with level of training (Spearman Rho=0.76 and 0.66 respectively, p<0.0001 for both), but not with status or experience (Table 4).
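Spearman's rho, used here to relate training to performance, is the Pearson correlation of the ranks, with tied values sharing their average rank. Ties matter in this setting because the training variable takes only two values. The sketch below is a generic illustration; the function names and toy data are hypothetical and are not the study data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sxx * syy) ** 0.5


# Toy data (hypothetical): 0 = no prior practice, 1 = deliberate practice.
trained = [0, 0, 1, 1]
score = [10, 11, 15, 16]
print(spearman_rho(trained, score))
```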

Reliability analysis
Interobserver reproducibility was tested on sample A. There was 100% concordance among observers as regards their assessments of success rate. There was no difference between the observers' mean score (12.91 ± 2.82 vs. 12.98 ± 2.74, p=0.91), and mean number of incorrect items (7.26 ± 2.74 vs. 7.18 ± 2.71, p=0.90). The mean number of discordant items was 0.43 ± 0.59. There was a very strong correlation (R=0.963) between the scores of the two observers (Y=1.0038x; R²=0.9253, p<0.0001) (Figure 3). Global ICC was 0.966, which represented particularly high interobserver reproducibility. Details of the ICC for each step are reported in Table 1. The mean of the scores of the two observers was not different from the means of each observer's scores; this was also the case for comparison of the respective standard deviations: 12.78 ± 2.70 (mean) vs. 12.74 ± 2.74 (observer 1), p=0.9426 or vs. 12.82 ± 2.71 (observer 2), p=0.9424.
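Two of the agreement measures reported above are simple enough to sketch directly: the number of discordant items between two observers' rating sheets, and the slope of a regression line relating their total scores. The sketch assumes a line fitted through the origin, which is what the reported form Y=1.0038x suggests but is not stated explicitly in the text. The function names and toy data are hypothetical.

```python
def discordant_items(obs1, obs2):
    """Count the items on which two observers gave different ratings."""
    if len(obs1) != len(obs2):
        raise ValueError("both observers must rate the same items")
    return sum(a != b for a, b in zip(obs1, obs2))


def slope_through_origin(x, y):
    """Least-squares slope b of y = b * x (no intercept, an assumption)."""
    sxx = sum(a * a for a in x)
    if sxx == 0:
        raise ValueError("x must contain a non-zero value")
    return sum(a * b for a, b in zip(x, y)) / sxx


# Toy data (hypothetical): one session rated item by item,
# then total scores for three sessions.
print(discordant_items([1, 0, 1, 1], [1, 1, 1, 0]))
print(slope_through_origin([10, 12, 14], [10, 12, 14]))
```

A slope close to 1 indicates that neither observer scores systematically higher than the other; it does not by itself measure agreement, which is why it is reported alongside the ICC.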

Main results
We designed an 8-step performance assessment scale for surgical chest tube insertion in traumatic pneumothorax, consisting of 20 items with a total score out of 20 points. This scale showed good internal consistency and excellent interobserver reproducibility. Furthermore, performance score was highly correlated with the success rate of the procedure, which confers clinical pertinence. To our knowledge, no other performance assessment scale for chest tube insertion has been published to date.

Limitations
This scale presents some limitations due to the fact that it was designed for teaching the surgical approach to chest tube insertion to beginners. First, the observer should be fully informed of the current recommendations on surgical chest tube insertion. The second limitation concerns the other components of patient care when a chest tube is inserted. Since we chose a specific task trainer for surgical chest tube insertion and not a high-fidelity mannequin in which a chest tube could also be inserted but with less realism [19], we did not include in the scale the assessment of informed consent, the general analgesia of the simulated patient, or the chest x-ray order. In other words, this scale is well suited to simulation-based training in the surgical procedure itself, but would not be appropriate for, say, an immersive multidisciplinary team simulation scenario of a trauma patient.

Development of the instrument and its psychometric properties
This scale had good internal consistency allowing for global assessment of the key recommended steps in surgical chest tube insertion [8]. Furthermore, it attributed more points to the essential steps of the procedure; this aspect of the scoring system was related to the risk of severe complications, in the event of failure or an unperformed step [5][6][7][21][22][23][24][25]. The excellent interobserver reproducibility reflected the objectivity of the scale, and its potential for further use by a single observer. A score ≥14/20 was associated with 100% success rate, but between 10 and 13/20 it was not always associated with success, although the procedure may have appeared somewhat acceptable. In fact, in some cases, even though the overall procedure appears to have been correctly applied and the chest tube appropriately located, a chest tube can have an aberrant intra- or extra-pleural trajectory. On another point, necessary technical steps during insertion may have been omitted or inadequately performed, thereby provoking chest tube insertion failure by dysfunction [26]. Most complications are due to poor technical procedure and have not been diagnosed at the moment of insertion: vascular lesions, exclusion of chest tube, and intra-parietal trajectory of chest tube [19][23][24][25][26][27][28].

Figure 1: Study of the performance assessment scale for chest tube insertion in traumatic pneumothorax. Comparison between process performance and success rate (n=39). Number of chest tube insertions according to the performance score, in the success group and in the failure group. (Axes: Score, Participants.)

Specificity
Importantly, training improved performance scores and success rates. The participants randomized to deliberate practice on the simulator performed better than those who had not undergone simulation training, and this difference was not due to status or previous experience. In fact, performance was not related to status or previous experience. This result, which could seem paradoxical at first glance, was largely expected, since the surgical approach is rarely practiced and taught in France. Previous experience of chest tube insertion in traumatic pneumothorax therefore does not imply that the insertion is performed according to the surgical approach. This finding emphasizes the need for teaching among emergency physicians. The observed gain in performance after training may be partially attributed to the individualized debriefing following each attempt [29,30], but, as previously reported [9,12], simulation-based training can improve performance of chest tube insertion. Importantly, the performance assessment scale reflected this gain in performance.

Use of the Instrument
Consequently, this performance assessment scale seems applicable to training programs for participants at different levels: residents, young physicians, and even experienced physicians, in either initial or continuing medical education. It can be used in models for adults and children alike. In small children, it is often worthwhile to tunnel the chest tube inside the chest wall to secure it more firmly [15]; this step is included in the scale. Furthermore, with a small modification of the "orientation of chest tube" item, the scale could also be used for hemothorax or hemo-pneumothorax [18]. Finally, its use is not limited, as in our model, to a task trainer. In fact, in our Anatomy Biomechanics Simulation Laboratory we routinely use the scale for assessment of chest tube insertion on cadaver models. It could also be used in a clinical setting, either de visu or during video replay, with a view to enhancing objective assessment of procedures in an emergency room [7,30].

Conclusion
We have presented the design and testing of a coherent and reproducible scale for chest tube insertion in traumatic pneumothorax, allowing for objective assessment of the procedure during simulation-based training. We suggest using this assessment tool during simulation sessions, with the aim of obtaining a score ≥14/20, a performance level that was associated with a 100% success rate in our sample, prior to clinical practice. Future studies should focus on the possible gain in performance and timing to be achieved with simulation-based education for chest tube insertion.