An Evaluation of Implicit Bias Training in Graduate Medical Education

Objective: Mounting evidence reveals that health care disparities stem from a combined effect of structural bias within our health system and the unconscious bias of well-intentioned health care professionals. The authors designed and evaluated a novel educational intervention to introduce the concept of unconscious bias to front-line providers. Methods: The authors designed and implemented an educational curriculum for providers from three different programs at a single large, urban, tertiary-care academic institution. The intervention consisted of participants taking the Implicit Association Test (IAT), which was followed by a facilitated discussion. The discussion was audio recorded, transcribed, and coded for emerging themes. An online survey assessed participant awareness of these topics before and after the intervention and was analyzed using paired t-tests. Results: The authors analyzed the results by focus group. There were 19 participants in Focus Group 1 (FG1), 6 in Focus Group 2 (FG2), and 42 in Focus Group 3 (FG3). The majority of participants were white, between the ages of 26 and 35, and female. When analyzed in aggregate, the authors found a statistically significant improvement in self-reported domains on whether the intervention changed participant understanding of healthcare disparities and implicit bias. While the authors’ qualitative results indicated varying acceptance of the implicit


Introduction
Mounting evidence reveals that health care disparities stem from a combined effect of structural bias within our health system and the unconscious bias of well-intentioned health care professionals (Smedley et al., 2003) (Hall et al., 2015). A study by Green et al. found that among internal medicine and emergency medicine residents, significant pro-White bias exists despite no explicitly reported preference for Whites over Blacks (Green et al., 2007). Another study by Sabin et al. examined data from Harvard's Project Implicit website and found that the 2,535 website participants reporting an MD degree demonstrated significant pro-White bias (Sabin, Rivara, and Greenwald, 2008).
Social scientists assert that such unconscious attitudes and stereotypes acquired through socialization can be unlearned, or inhibited, by countervailing influences (Dovidio et al., 2008). Yet few studies to date have evaluated the use of countervailing influences in medical education. One study demonstrates that dedicated implicit bias curricula for undergraduate, medical, and physician assistant (PA) students raised learners' awareness of their implicit biases (Archambault et al., 2008). However, formal training in implicit bias is still lacking in graduate medical education (GME) for residents, fellows, and advanced practice providers (APPs). In addition, we are unaware of prior assessments of residents' or APPs' perceptions of whether implicit bias affects their own patient care.
To address this critical gap, we designed and evaluated an educational intervention with three objectives. First, to empower residents and APPs to develop and implement implicit bias curricula for their peers. Second, to utilize the Implicit Association Test (IAT) to introduce the concept of unconscious bias and encourage self-reflection. Lastly, to engage residents and APPs in a facilitated discussion of implicit bias, how these biases affect medical care, and what can be done to combat bias.

Intervention Development and Design
The study team consisted of two Emergency Medicine residents, one Internal Medicine resident, one Radiation Oncology resident, two Advanced Practice Providers (APPs), and three senior advisors, all of whom were members of the University of Pennsylvania Graduate Medical Education Housestaff and APP (HAP) Quality Council, an organizing body of residents and APPs focused on patient safety and quality. The team engaged key stakeholders in an iterative process to develop an educational intervention for residents and APPs that focused on understanding health care disparities and the role of implicit bias. The study team engaged residents and APPs across all specialties to determine areas of interest and gaps in knowledge. Ideas were generated, presented to residents/APPs and revised based on continuous feedback from all stakeholders at the HAP Quality Council monthly meeting.
The study population included three health professional training programs based at a large, urban, academic tertiary care hospital serving a diverse patient population. Program participation was in part by self-selection, as interested members of the HAP Quality Council sought approval from their respective program leadership. The three programs were the Internal Medicine Residency Program, the Radiation Oncology Residency Program, and APPs. There were 19 participants in the Internal Medicine Residency session, 6 in the Radiation Oncology session, and 42 in the APP session. Facilitators were selected faculty mentors who had expertise in implicit bias training and healthcare disparities. The facilitator for each session was distinct from the study team and was considered a participant.
The session structure consisted of a one-hour facilitated discussion. The session started with participants completing the IAT on Race on their cellular phone or laptop computer. Instructions provided to participants at the start of the session detailed the link to access the IAT. After completion of the IAT, groups participated in a 45-minute facilitated discussion on implicit bias and its potential effects on clinical care. The study team adapted a previously operationalized facilitator guide (Baylor College of Medicine Facilitator Guide) and met with facilitators ahead of time to review the guide and study protocol and revise as needed. The facilitator guide (see Supplementary File 1) included ground rules for participation, questions for discussion, and the specific learning objectives below:
1. Discuss what implicit bias is and identify how it may affect clinical care, specifically in their specialty
2. Understand the difference between health care disparities and health disparities and how implicit bias relates to one or both of these terms
3. Discuss potential ways of mitigating provider implicit bias in the clinical setting
Khatri U, Zeidan A, LaRiviere M, Sanchez S, Weaver L, Lynn J, Shofer F, Todd B, Aysola J. MedEdPublish. https://doi.org/10.15694/mep.2019.000109.1
At the start of the session, participants were required to review a statement of research and then sign in to opt into taking the IAT, participating in the discussion, and completing the post-session survey. The statement of research described that discussions would be audio-recorded, observed, and transcribed by the University of Pennsylvania Mixed Methods Research Laboratory (MMRL). Two trained experts from the MMRL observed each session to evaluate for findings that would be difficult to interpret from recorded comments. Participants were made aware of this prior to the sessions. Given the potential for sensitive data collection and the need to ensure confidentiality during each session, written documentation was not obtained. In addition to trainee participants, all facilitators provided consent to participate in this study, in which facilitated discussions were audiotaped, transcribed, and analyzed for emerging themes.

Data Sources
Quantitative Data Collection
After the session, we provided participants a link to a voluntary anonymous post-session survey via the University of Pennsylvania secure REDCap™ database. The study team developed the survey to assess the participants' interest in the topic, how the session affected their knowledge/perceptions of implicit bias, and feedback on session logistics and implementation. The survey consisted of five Likert scale questions (1=never, 2=rarely, 3=sometimes, 4=often, 5=always; or 1=strongly disagree to 5=strongly agree) that assessed respondent perceptions before and after the session. It also included eight evaluation questions, including two open-ended questions, to assess participant interest in and feedback on the session. In addition, the survey included demographic questions. When possible, the study team adapted questions from previously operationalized surveys or prior work in this area.

Qualitative Data Collection
Trained experts from the Penn Mixed Methods Research Lab (MMRL) observed and took notes on the facilitated discussion sessions. In addition, they audio recorded the sessions and facilitated the subsequent transcription of those recordings.

Statistical Analysis
Quantitative Data Analysis
We first analyzed frequencies and percentages for categorical variables and means with standard deviations for continuous variables. Second, to determine differences in respondent perceptions before and after the session, we conducted a 2-factor analysis of variance by focus group and time (pre/post). We performed post-hoc Tukey-Kramer tests on all pairwise comparisons. All analyses were performed using SAS statistical software (Version 9.4, SAS Institute, Cary, NC).
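As an illustration of the first, descriptive step only, the following minimal Python sketch tabulates frequencies with percentages and computes means with standard deviations for each focus group and time point. It is not the SAS analysis the study used, and it does not reproduce the two-factor analysis of variance or the Tukey-Kramer comparisons. The records, the function names `frequency_table` and `cell_summaries`, and the data layout are all assumptions made for illustration; the scores are invented and are not study data.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical records: (focus group, time point, Likert score 1-5).
# These values are invented for illustration and are NOT study data.
responses = [
    ("FG1", "pre", 3), ("FG1", "post", 4),
    ("FG1", "pre", 2), ("FG1", "post", 4),
    ("FG2", "pre", 3), ("FG2", "post", 3),
    ("FG2", "pre", 4), ("FG2", "post", 5),
]


def frequency_table(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {key: (n, round(100 * n / total, 1)) for key, n in counts.items()}


def cell_summaries(records):
    """Return {(group, time): (n, mean, sd)} for each group-by-time cell."""
    cells = {}
    for group, time, score in records:
        cells.setdefault((group, time), []).append(score)
    return {
        key: (
            len(scores),
            mean(scores),
            # The sample SD needs at least two scores; report 0.0 otherwise.
            stdev(scores) if len(scores) > 1 else 0.0,
        )
        for key, scores in cells.items()
    }


group_counts = frequency_table(group for group, _, _ in responses)
summaries = cell_summaries(responses)

for (group, time), (n, avg, sd) in sorted(summaries.items()):
    print(f"{group} {time}: n={n}, mean={avg:.2f}, sd={sd:.2f}")
```

Summaries of this kind describe each group-by-time cell; the inferential comparison across focus group and time would still require the analysis of variance described above.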

Qualitative Data Analysis
The Penn Mixed Methods Research Lab (MMRL) observed and audio recorded the three facilitated discussions. Recordings were transcribed, coded, and analyzed using a content analysis framework: codes were derived from a thorough reading and understanding of the data, and a summative content analysis approach was applied to the key themes identified. All transcripts were coded by two coders with a final inter-rater reliability of 0.9962. Inter-rater reliability was derived from the two coders double coding one transcript, which constituted 25% of the data source. Due to inter-group differences noted in this analysis, each group's data was analyzed independently. The top themes from each group were summarized, along with limitations of the analysis and potential directions for future discussions on implicit bias.
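The text does not state which agreement statistic produced the reported inter-rater reliability. As a hedged illustration of how agreement on a double-coded transcript can be quantified, the sketch below computes simple percent agreement and Cohen's kappa for two coders. The choice of Cohen's kappa, the function names, and the example codes are assumptions for illustration and are not taken from the study.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of segments to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same code independently.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical code throughout; kappa is undefined,
        # so this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned to six transcript segments (not study data).
coder_a = ["skepticism", "skepticism", "care", "care", "mitigation", "care"]
coder_b = ["skepticism", "care", "care", "care", "mitigation", "care"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement={agreement:.3f}, kappa={kappa:.3f}")
```

Percent agreement is easier to interpret, while kappa discounts the agreement that two coders would reach by chance given how often each uses each code.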

Demographics of Survey Participants
There were 19 participants in Focus Group 1 (FG1), 6 in Focus Group 2 (FG2), and 42 in Focus Group 3 (FG3). The majority of participants, regardless of group, were between the ages of 26 and 35 and female. Of the FG2 and FG3 participants, 50.0% and 81.0%, respectively, self-identified as non-Hispanic White, in contrast to FG1 participants, of whom only 21.0% self-identified as non-Hispanic White (Table 1).

Prior knowledge of health care disparities and the IAT
Participants who completed the survey were asked a series of questions to assess their prior knowledge of health care disparities and the amount of healthcare disparities training they had received. In this section we describe the results of the surveys that were administered to the three groups. In all three groups, the majority of participants characterized their prior knowledge of healthcare disparities as "somewhat knowledgeable" (FG1 57.9%, FG2 83%, FG3 78.6%) (Table 2).
Prior to this session, 68.4% of FG1 and 71.4% of FG3 respondents reported receiving formal education on healthcare disparities, while 66.6% of FG2 respondents reported no prior formal education on healthcare disparities. Of the respondents who reported receiving formal training, 82% responded with examples in a free text question. Formal education ranged from training during college, to medical/NP/PA/graduate school, to graduate medical training. Prior training often occurred as part of a lecture, an elective selected based on personal interest, or a special session during a seminar or conference focused on diversity.
Prior to the intervention, most respondents had not taken an IAT (FG1 68.4%, FG2 66.6%, FG3 85.7%). Of those who had taken an IAT, most had taken the "race" IAT. After taking the IAT, nearly all participants expressed being slightly or very interested in their results (Table 2).

Knowledge and Attitudes Before and After the Educational Intervention
There was heterogeneity among the groups with respect to themes that arose during discussions. FG1 and FG3 shared skepticism of the validity of the testing format and results. Specifically, FG3 participants expressed confusion as well as angst about the outcome of the test. They suggested the test could be manipulated to obtain the desired result but also were not surprised by their test results. One participant suggested that, as a result of their skepticism, the test result wouldn't change their behavior.
All focus groups suggested that implicit bias exists and may affect patient care. FG1 participants saw a complicated relationship between bias, pattern recognition, and treatment offered to patients. Pattern recognition can be important and is often reinforced. However, some patterns may strengthen biases with negative outcomes if not challenged. FG2 challenged the idea that implicit biases must always have an impact on treatment or care, but did agree that such biases, left unchecked, could result in worse care. FG2 suggested that implicit bias could apply to other factors in addition to race, such as income, class, and age. FG3 participants commented on the potential negative impacts of implicit bias on patient care, especially if they felt dissimilar to patients or couldn't identify with the patient and/or situation.
Unique themes arose during FG1 and FG2 group discussions. FG1 participants struggled with their own internal biases, feeling frustrated or embarrassed when realizing there is dissonance between an individual's bias and reality. FG2 suggested that implicit bias exists on a personal and systemic level and that there are possibly ways to mitigate bias at both levels.

Implementation Suggestions
From open-ended survey questions and from observational and qualitative data, we gathered valuable feedback from participants on implementation strategies. Participants were asked to provide suggestions on the logistics and content of the sessions as well as recommendations for subsequent sessions. We have detailed a summary of those findings below and in Table 3 with representative quotes.

Logistics
Participants from FG1 and FG3 recommended smaller group sizes to facilitate an environment for more candid discussion on a sensitive topic. Other recommendations centered on choosing an appropriate size room and seating configuration to facilitate interactive discussions. Participants felt that AV equipment during the sessions made the discussion unnatural and advised we remove it in the future. Additionally, participants recommended more detailed, step-by-step instructions for taking the IAT.
Facilitators recommended that residents have protected time away from patient care duties during the session to allow them to fully engage. Most participants thought the session length was just right (64%). A small portion of participants thought the session was too long (14%), although notably, 19% of participants did not answer this question.

Content
Overall, the majority of participants rated the session as good to excellent (79%), and participants largely felt that the content covered was appropriate. One suggestion was to expand the content to include strategies to mitigate bias. Table 4 outlines key feedback we received on implementation. Participants in the focus groups recommended including clinical or case-based scenarios to provide realistic examples of implicit bias. Additional recommendations included reviewing journal articles on disparities, role playing during sessions, and implementing hospital-wide group discussions around implicit bias. Facilitators recommended a priming activity prior to the session to improve participant engagement and contributions.

Discussion
Our mixed methods evaluation revealed that the novel educational intervention was successful in exposing front-line clinicians in training to the existence of their personal implicit bias and encouraging them to explore how it affects the delivery of patient care. The novelty of this intervention stems from the fact that it was developed and implemented for clinicians in training by clinicians in training, with feedback from their peers on how to best incorporate this sensitive topic into clinician education.
A key finding was a participant-reported increase in self-awareness of biases; the intervention therefore met a key recommendation that it raise awareness of healthcare disparities among trainees. While our qualitative results indicated varying acceptance of the IAT, most participants acknowledged that implicit bias exists. Although studies have demonstrated pro-White biases among healthcare providers, recent work questions whether implicit bias affects clinical care (Dehon et al., 2017). Interestingly, in our study, participants did describe clinical situations in which they felt their implicit bias affected the care they provided, both positively and negatively. Further, the majority of participants expressed interest in receiving additional training on implicit bias, suggesting the educational intervention successfully engaged frontline clinicians on the topic of implicit bias and health care disparities.
Over the last several decades, there has been growing evidence that implicit bias in clinician decision-making may perpetuate health care disparities. Developing interventions to reduce the impact of provider biases is therefore paramount (Chapman, Katz, and Carnes, 2013). While other studies have measured rates of implicit bias among residents and have used the IAT as an educational intervention among medical students (van Ryn et al., 2015), to our knowledge, no study has measured the effectiveness of the IAT as an educational intervention to raise awareness of implicit bias among medical residents and APPs.
Prior studies have demonstrated the value of utilizing the IAT as a method of introducing medical students to the concept of unconscious bias. While beginning such work early in medical training has great potential value, one could argue that engaging "graduate-level" clinicians, such as residents and APPs (those who may be relying on their biases while they diagnose and treat patients), is of even greater value. In fact, in a study by Haider et al. that looked at unconscious race and social bias with vignette-based clinical assessments by medical students, the authors found that while the majority of first-year medical students consistently demonstrated an implicit preference for White patients, there was no association between the students' implicit preferences and their decisions in vignette-based clinical assessments (Haider et al., 2011). This is contrary to the many studies that have revealed a strong association between provider bias and poorer patient outcomes. Many factors, including the relative lack of fatigue, time pressure, and cognitive load, may shield medical students from being influenced by their biases when compared with resident and attending physicians. The relaxing of work-hour restrictions and a return to overnight in-house call, with their accompanying fatigue, may further uncover underlying biases. As Haider et al. concluded, while initiating implicit bias training during medical school has great value, continuing such training at the physician and APP trainee level could be an important intervention point to reduce disparities in health care. Our study sought to target the clinicians who are likely to be most susceptible to acting on their biases.
As Teal et al. describe, the process of becoming aware of one's implicit or unconscious bias is fluid, and not every learner has the same baseline insight (Teal et al., 2010). Thus, it is reasonable to conclude that interventions such as ours should be continued throughout the training of a resident or APP, rather than delivered as a one-time intervention. Additionally, because ours was a one-time intervention, we cannot comment on the long-term effects these sessions have on patient care and provider interest regarding issues related to implicit bias. This study also neither describes nor tracks the department-specific initiatives or further sessions this intervention may have sparked.

Limitations
Our study should be interpreted with the recognition of a number of limitations. First, each session had a different facilitator who guided the conversations to varying degrees. To reduce this variation, the evaluation team provided facilitators with verbal instruction on the goals of the session and a written facilitator guide. Additionally, there was inherent heterogeneity in the structure of the sessions due to differences in group size, room size, and participant seating, all of which may have contributed to varying degrees of participant comfort and participation.
Participants were explicitly instructed not to reveal their individual IAT results. This raises the possibility that participants had information on their biases that may have led them to either over- or under-participate in the discussion based on their acceptance of their results. Similarly, because participation in the discussion was optional, the themes that emerged may not be representative of the sentiments held by all participants. An additional option to express reactions in written form may have better captured the sentiment of the entire group.
Our findings drew exclusively from an educational intervention at a single academic institution and may not be generalizable. We acknowledge that there may be an institutional culture that uniquely contributed to our results. However, we posit that our implementation lessons and thematic observations provide broadly applicable knowledge. Additionally, our participants were from non-surgical specialties and predominantly self-identified as non-Hispanic white and female, which may limit our ability to extrapolate results to a more diverse group. Future studies should explore similar interventions with more demographically diverse groups and across other specialties, notably surgical specialties.

Educational Implications: Incorporation of implicit bias training in graduate education
The ACGME, through its Clinical Learning Environment Review (CLER) assessments, aims to measure how institutions engage residents on the reduction of health care disparities. It establishes the expectation that trainees and faculty receive education on identifying and reducing health care disparities and recommends that trainees be engaged in quality improvement activities that address health care disparities for vulnerable populations (ACGME, 2017). However, despite this expectation, specific educational models through which to achieve these goals are limited. While APP programs are not required to include such training, APPs, as frontline clinicians, would also benefit. This study provides a model through which other residency and APP programs can pilot and develop an intervention to create awareness of implicit bias on the individual level as a starting point to address health care disparities within the institution. Both the qualitative and quantitative results demonstrate that this intervention successfully met the goals of the study: to empower the development of an end-user-driven curriculum, to utilize the IAT to introduce individuals to their personal biases, and to engage residents and APPs in a discussion regarding the impact of implicit bias.

Research Implications: Strategies to Address and Mitigate Implicit Bias
The recognition that unconscious bias affects opportunities and outcomes for individuals is not limited to the field of medicine. The last two decades have resulted in implicit bias research in the areas of criminal justice, education, employment, and even professional sports (Sen, 2014) (Clark and Zygmunt, 2014) (Reeves, 2014) (Ingraham, 2014).
In fact, in line with many police departments across the country, the U.S. Justice Department has gone as far as mandating implicit bias training for its 33,000 federal agents and prosecutors. Despite these advances, strategies to address and mitigate implicit bias are limited. can use their explicit processes to change and control their implicit responses. While the specifics of those described strategies are beyond the scope of this paper, it is important to note that existing research suggests that future interventions on raising awareness of implicit bias should include instruction on strategies to reduce the effect of bias on patient care. In one study, authors developed a 12-week longitudinal habit-breaking intervention that showed dramatic reductions in implicit race bias among participants as well as increases in concern about discrimination and personal awareness of bias over the study duration (Devine et al., 2012). Chapman et al. describe in their perspective piece, "As with any behavioral change, individuals need to become aware of their habitual engagement in an undesirable behavior and be provided with strategies to increase self-efficacy to engage in a new desirable behavior." While our intervention has proven successful at increasing awareness, it was not designed to provide clinicians with instruction on how to combat the effects of their bias. Further research assessing implementation of and long-term effects of current strategies to mitigate bias is necessary. Defining metrics of assessment and implementation is challenging but will assist in successful incorporation of implicit bias training into residency education.

Conclusion
Our educational intervention was successful at engaging front-line clinicians on the role of implicit bias in the development of health care disparities. However, prior to scaling our intervention to other institutions, several improvements should be considered. First, participants and facilitators described logistical improvements to the session structure. These include appropriate room size, smaller group sizes, and limited technical equipment. Second, choice of facilitator is important and challenging. Participants recommended facilitators who provided sufficient instruction without dominating the discussion. Finally, modifications to the curriculum should include instruction on how to combat the effect biases play in clinical practice.

Take Home Messages
Front-line providers can be effectively engaged in training on the effect of implicit bias on health care disparities.
The validated IAT (Implicit Association Test) is a useful tool for introducing the concept of implicit bias to resident physicians and advanced practice providers.
Future interventions should involve teaching providers how to mitigate the effect of biases in order to provide just and equitable care.

Notes On Contributors
measurement and analysis; spearheaded the acquisition of data from the Radiation Oncology group and helped in the analysis and interpretation of the collected data; revised the drafted article critically for important intellectual content; provided final approval of the version to be published. Dr. LaRiviere is a Radiation Oncology Resident at the Hospital of the University of Pennsylvania.
Sarimer Sanchez: Assisted with implementation of the intervention in the Internal Medicine focus group; helped to identify and coordinate with the session's facilitator; assisted in the acquisition of data from the APP group; reviewed the manuscript and provided feedback during the revision phase; provided final approval of the version to be published. Dr. Sanchez is an Infectious Disease Fellow at MGH/Harvard.
Lauren Weaver: Assisted with implementation of the session to the APP group; coordinated with the session facilitator; participated in the review of the manuscript and provided valuable feedback on both the qualitative and quantitative analysis of the data; provided final approval of the version to be published. Lauren is a critical care advanced practice provider at the Hospital of the University of Pennsylvania.
Jenny Lynn: Assisted with study structure, recruited APP team members, evaluated, and ultimately helped to review data that was collected as well as editing drafts of various dissemination information; reviewed and edited the manuscript; provided final approval of the version to be published. Jenny is a surgical advanced practice provider at the Hospital of the University of Pennsylvania.
Frances Shofer: Assisted with survey development as well as the collection and analysis of the data; provided all of the statistical analysis and interpretation of the quantitative results; created the figures that depict our results and reviewed the tables; reviewed the manuscript; provided final approval of the version to be published. Dr. Shofer is the Director of Epidemiology and Biostatistics and an Adjunct Professor of Emergency Medicine at the Hospital of the University of Pennsylvania.
Barbara Todd: Assisted with implementation of the program in the APP group; provided financial and programmatic support to the project at the inception phase; reviewed the facilitator guide as well as the survey; assisted with data interpretation and analysis; reviewed and edited the manuscript; provided final approval of the version to be published. Barbara Todd is the Director of the CMS Graduate Nurse Education (GNE) Demonstration Project, Senior Fellow at the Center for Health Outcomes & Policy and Adjunct Assistant Professor of Nursing, University of Pennsylvania School of Nursing.