Fairly Assessing U.S.- and Foreign-Trained Physician Performance Using 360-Degree Surveys


Abstract

Background
With a growing number of foreign-trained physicians joining the United States workforce, there is a need to fairly assess their job performance. The purpose of this study was to explore the fairness of a 360-degree competency assessment on U.S.- and foreign-trained physicians.

Methods
We conducted a non-experimental retrospective analysis on physicians working in the United States (n = 258) who participated in a physician assessment and education program between 2007 and 2017.

Results
There were no significant differences in performance outcomes of teamwork, motivating or discouraging behaviors, technical practice, and patient interactions based on demographic differences.

Conclusions
The PULSE 360 is a powerful tool that can be used to evaluate physician performance without bias related to demographic differences, including gender, country of physician medical training, physician native language, and age.

Background
In 2017, foreign-trained physicians made up over 25% of practicing physicians in the United States. 1 Given their substantial contributions to the U.S. healthcare system, 2,3 fairly assessing their performance is critical. Policy suggestions have been developed recently to promote the benefits of employing international health care workers, such as treating them transparently and fairly. 4 However, how the performance of U.S.-trained physicians (USTPs) compares with that of foreign-trained physicians (FTPs) is not well understood. 5 Exploring soft skills like professionalism, interpersonal and communication skills, teamwork, and patient interactions is critical, and physician competency assessments should be fair regardless of demographic differences like age, sex, and nationality. The purpose of this paper is to explore the fairness of a 360-degree competency assessment program when evaluating U.S.- and foreign-trained physicians working in the United States.

Assessing Physician Competence
Fair assessment of physician competencies is a key step in improving job performance. Reasons for evaluating physician performance range from appraisal to recertification, identifying high-risk physicians, and remediating those with a previous history of poor performance. 6 Maintaining quality patient care is important because 6-12% of physicians are referred to remediation for poor clinical skills. 7 The Institute of Medicine estimates that physician dyscompetency is one contributor to preventable medical errors at an estimated cost of $17 billion. 7,8 To determine which physicians should be referred for dyscompetency, one model for performance remediation starts with an assessment of the physician's competence. 9 A common framework for maintaining physician competency is the Maintenance of Certification program developed by the American Board of Medical Specialties (ABMS MOC). This four-part framework includes maintaining licensure, lifelong learning, cognitive expertise, and quality improvement. 10

Multisource Feedback
Multisource, or 360-degree, feedback is the use of physicians' team members (other physicians, nurses, and staff) to evaluate job performance. 11,12 The scope and depth of multisource feedback is valuable given the argument that patient evaluations of physician performance are subjective at best. 13 For example, patient evaluations have been found to be influenced by the race and gender of the physician such that only physicians who were white and male benefited from a customer satisfaction judgment, even after controlling for objective measures of performance. 14 Beyond clinical skills, physician performance is based on a combination of individual differences including specialty area, gender, and age. 15,16 Evidence suggests that biases against international medical graduates (IMGs) may lead to more complaints against physicians and disciplinary outcomes, 17 but findings on biased physician performance evaluations are mixed. 18 Given the inconclusive evidence, having two examiners appears to mitigate potential sex or ethnic biases against physicians who are being evaluated based on their clinical performance. 19 Some research has explored the use of multirater assessments on international medical graduates and found them reliable, 20 but little research has examined bias in physician assessment as a function of training country (i.e., USTPs versus FTPs). Of the research on assessing physician performance, one experiment found that after holding education, experience, and personality constant, foreign-born physicians were rated more poorly than those who had been born in the prospective patients' home country. However, physicians who had been trained in an industrialized and high-income country benefited on their evaluations. 21 There are no significant differences in mortality rates for international versus national practitioners, but differences may exist in regard to the soft skills of communication, teamwork, and ethical issues. 22-24 Part of this bias may be a function of the examiners themselves. 25 In one study, IMGs had lower mortality rates than USMGs. 26 Further, there is evidence that in Canada, international medical graduates are disciplined for misconduct more frequently than North American medical graduates. 27 In Australia, IMGs receive more complaints and adverse disciplinary findings. 17 Thus, there is a critical need for fair tools to evaluate USTPs and FTPs on their job performance.

Hypothesis 1: There will be no significant differences in PULSE 360 physician performance based on: a) gender; b) country where training occurred; c) first language spoken; or d) age.

Methods

Design
A non-experimental retrospective analysis of data was conducted for two hundred and fifty-eight physicians (n = 258) who participated in a physician education program between 2007 and 2017.

Statistical Analyses
Independent samples t-tests and an analysis of variance (ANOVA) were conducted to evaluate potential biases in PULSE 360 scale scores due to demographic differences, including gender, country in which training occurred, cultural background (first language spoken), and age; see Tables 1-5.
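As an illustration of the two tests named above, the following is a minimal sketch using only the Python standard library. It computes the pooled-variance (Student's) independent-samples t statistic and the one-way ANOVA F statistic with their degrees of freedom. The function names `pooled_t` and `one_way_anova` and all scores are invented for illustration and are not study data. The study does not report which software was used or whether equal variances were assumed, so the pooled-variance form is an assumption. P-values are omitted because they require a t or F distribution function from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's independent-samples t statistic (equal variances assumed).

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance across the two groups
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, df


def one_way_anova(groups):
    """One-way ANOVA F statistic.

    Returns (F, df_between, df_within).
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical composite scores (invented, NOT study data)
ustp_scores = [70.0, 65.0, 72.0, 68.0, 71.0]
ftp_scores = [69.0, 66.0, 73.0, 67.0, 70.0]
t_stat, t_df = pooled_t(ustp_scores, ftp_scores)
print(f"t({t_df}) = {t_stat:.3f}")

# Hypothetical scores for three age ranges (invented, NOT study data)
age_groups = [[68.0, 70.0, 66.0], [71.0, 67.0, 69.0], [65.0, 72.0, 70.0]]
f_stat, df_b, df_w = one_way_anova(age_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

With exactly two groups, the ANOVA F statistic equals the square of the pooled t statistic, which is why the t-test suffices for the two-level comparisons and ANOVA is reserved for the multi-level age comparison.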

PULSE 360 Survey
The PULSE 360 Survey is an assessment of leadership, teamwork, communication, professionalism, and other physician behaviors based on multisource feedback from other physicians, advanced practice providers, clinical staff, and administrative staff members who interact with a physician. Variations of the survey are based on 96 behavioral items and 5 performance domains, along with a total composite performance score known as the Teamwork Index (TI) Score. Internal consistency reliability estimates range from α = .77 to .85 across dimensions. TI scores typically range from 0 to 100, with a national mean score of 68.9 for physicians. Prior research has demonstrated both the internal and external validity of PULSE 360 scores in relation to important physician outcomes such as malpractice risk and patient satisfaction. 11,28-32
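For readers unfamiliar with internal consistency estimates, the sketch below shows how an α coefficient can be computed, assuming the reported α is Cronbach's alpha (the text does not name the coefficient explicitly). The function name `cronbach_alpha` and the ratings are invented for illustration; they are not PULSE 360 items or data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores: list of items, each a list with one rating per respondent
    (all items must cover the same respondents in the same order).
    """
    k = len(item_scores)
    # Each respondent's total score across all items
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    sum_item_variances = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical ratings: 3 items rated for 5 respondents (invented data)
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when items within a dimension rise and fall together across respondents, which is the sense in which values of .77 to .85 indicate that the items in each dimension measure a common construct.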

Results
In support of Hypothesis 1a, there were no significant differences in the mean PULSE 360 scores for male vs. female physicians (see Tables 2-4 for independent samples t-test results). In support of Hypothesis 1b, there were no significant differences in the mean PULSE 360 scores for U.S.-trained vs. foreign-trained physicians. In support of Hypothesis 1c, there were no significant differences in the mean PULSE 360 scores for native English speakers vs. non-native English speakers. In support of Hypothesis 1d, there were no significant differences in the mean PULSE 360 scores between age ranges (see Table 5 for ANOVA results). Additionally, all post hoc comparisons among age ranges yielded non-significant differences in mean scores on all PULSE 360 scale scores.

Discussion
Given the growing demand for FTPs in the United States, there is a need to fairly select, train, and support this diverse group of international physicians. 33 Physicians in this study were evaluated on their performance using the PULSE 360 Survey and were compared across gender, country of training, native language, age, and board certification status. There were no significant differences in their reported performance on professionalism, teamwork, motivating behaviors, discouraging behaviors, technical practice style, or patient interactions. These findings suggest that valid and reliable tools exist to fairly evaluate the performance of both U.S.-trained physicians and foreign-trained physicians. This is valuable given that previous research has found that some measures discriminate against some protected classes. 17 The physicians in our current sample were recruited to the physician assessment and education program for a variety of reasons and may not be representative of practicing physicians in the United States. However, the use of 360-degree data allows us to move beyond self-reported performance to a more comprehensive view of physicians' performance on the job. The important takeaway is that within our sample, there were no significant variations in scoring patterns attributable to protected class membership.

Conclusions
The use of 360-degree feedback can provide a fair and unbiased assessment of others' perceptions of physician behavior and performance within the healthcare team.

Ethics approval and consent to participate:
Because this study used de-identified archival data, it was deemed unnecessary to ask participants for informed consent. This study was approved by the UNK IRB #041320-2.

Consent for publication:
Not applicable.
Availability of data and materials:
The dataset used and analyzed during the current study is available from the corresponding author on reasonable request.

Tables

Table 1. Distribution of Physician Participants by Specialty/Sub-Specialty.