Patient participation in clinical trials conducted by principal investigators who speak one or more language(s) beyond English: Exploring ethnicity as proxy for language

Background To explore the association between ethnicity, as a proxy for language, and participation in clinical trials (CTs) conducted by Principal Investigators (PIs) who speak one or more languages in addition to English. Methods This retrospective, descriptive study utilized CT participant demographic data extracted from the largest Midwestern non-profit healthcare system between January 1, 2019 and December 31, 2021. The CT participant sample (N = 4308) was divided for comparison: CT Participants of Hispanic or Latino Origin (N = 254; 5.90 %) and CT Participants of Non-Hispanic or Latino Origin (N = 4054; 94.10 %). Logistic regressions were performed to generate the crude and adjusted odds of patients of Hispanic or Latino origin participating in CTs conducted by PIs who speak another language in addition to English. Results Crude analysis revealed that patients of Hispanic or Latino ethnicity had 2.04 (1.58, 2.64) times greater odds of participating in CTs conducted by PIs who speak another language beyond English (p < 0.0001), which increased to 2.67 (1.97, 3.62) times greater odds after adjusting for sex, race, age and insurance (p < 0.0001). Conclusions Overall findings indicate that patients of Hispanic or Latino ethnicity, who are more likely to speak Spanish, have greater odds of participating in CTs conducted by PIs who speak another language beyond English. This may imply that cultural sensitivity at the top of a CT study team, as likely to be demonstrated by PIs who speak another language beyond English, may be an important contributor to reducing ethnicity- and language-based barriers to diversity in CTs and a relationship worth exploring further.


Introduction
There is a need for adequate representation of minority populations in clinical trials (CTs) to ensure generalization of results, learn about potential differences among groups, and improve health outcomes for all individuals [1,2]. However, there are persistent structural impediments to enrolling representative samples of racially and ethnically minoritized individuals into CTs [3][4][5]. Poor recruitment of minoritized populations in research is complex but can be attributed to system factors such as insufficient recruitment strategies and planning [5,6], lack of cultural competency among researchers [5,7], and poor engagement and linkages with communities and individuals [8,9], as well as patient factors like historically poor experience of participating in research and communication issues [7]. Communication issues, specifically low literacy and language differences, have been cited as common barriers to enrollment of minority populations into CTs [7][10][11][12].
Language barriers present a unique challenge to diverse clinical practices [7,13,14]. Studies suggest that the language spoken by patients and research professionals may be a more important factor in enrollment than race [15][16][17]: not only did a patient's primary language influence enrollment success, but research professionals also had more success enrolling subjects who shared their primary language ("language concordance") [15]. Further, a majority of research professionals indicated that the language barrier and the time spent arranging for interpreters had prevented them from offering a study to potential candidates [15,16]. While it might seem intuitive that a CT investigator being the same race and/or ethnicity as a potential study subject ("racial concordance") would motivate minorities' participation, studies show that research professionals were not always more successful in enrolling subjects of their own race, for various reasons [15][18][19][20][21]. While it is highly unlikely that research professionals could speak every patient's language without interpreter services, multicultural research staff may improve representation in CT enrollment by potentially having more positive attitudes toward cultural differences among potential participants. Given that language barriers tend to be among the less surmountable challenges to enrollment [15], this paper explores the hypothesis that CT principal investigators (PIs) who speak one or more languages beyond English (a proxy for multiculturalism) may have greater participation of eligible patients who speak a language other than English, even though there may not be PI-patient language concordance [22].
There is abundant research documenting the under-representation of patients identifying as Hispanic or Latino ethnicity, specifically [23][24][25].
What seems less clear are the potential differences in characteristics of patients of Hispanic or Latino origin participating in CTs as compared to patients not of Hispanic or Latino origin and, further, as compared to patients of Hispanic or Latino origin being treated by CT PIs overall, who represent potentially eligible (but non-participating) participants. We suspect language is a significant contributing factor to enrollment of patients of Hispanic or Latino origin, given the additional time and effort required for a language interpreter.
The objective of this paper is to compare the proportion of participants of Hispanic or Latino origin in CTs conducted by PIs who speak two or more languages (a proxy for multiculturalism) to the proportion of participants of Hispanic or Latino origin in CTs conducted by PIs who speak English only. Capturing language preferences in a diverse healthcare system is not without challenges, the primary one being that a substantial number of patients will have their preferred language documented as English even if it is not, which makes it difficult for researchers to quantify patient-provider language concordance. Given this constraint, ethnicity is used as a proxy for Spanish speaking, given that 75 % of people identified as Hispanic speak Spanish, rising to 93 % among those who are foreign born [26]. In the Chicago metropolitan area, where these data originate, nearly 2 million people speak Spanish, a figure greater than the national average for urban areas. Additionally, 25 % of the Chicagoland area speaks a language other than English, but unfortunately 'language' in our clinical trials data management system is not accurate enough to use in this study [27].

Materials and methods
This retrospective, descriptive study, focused on differences by ethnicity, utilizes patient demographic data from two groups extracted from the Clinical Trials Management System (CTMS) and electronic medical record (EMR) within the largest Midwestern non-profit healthcare system. The dataset includes data from a sample of 4308 unique healthcare system patients who provided informed consent to participate in at least one active CT within the system between January 1, 2019 and December 31, 2021 ("CT Participants"). The original dataset included 4321 unique CT participants but was limited to the 4308 patients who had ethnicity and PI data in the timeframe, reflecting 0.30 % missing participant data, which the research team deemed small enough to proceed with a valid analysis.

Data
Patient characteristics were collected from CTMS and EMR on the CT Participants and the CT Patients of Hispanic or Latino Origin, including: Ethnicity (categorical), Hispanic or Latino, Non-Hispanic or Latino; Sex (categorical), Male or Female; Race (categorical), White, Black, Asian, American Indian or Alaskan Native, Native Hawaiian and Other Pacific Islander; Insurance type (categorical), Private, Medicare, Medicaid, Self-pay, Worker's Comp; and date of birth to calculate age (continuous) as of January 1, 2019 for both groups and categorize into groups. Patient characteristics were captured from CTMS at clinical trial enrollment in the timeframe (January 1, 2019-December 31, 2021). Gender and languages spoken were collected on all PIs of CTs active between January 1, 2019 and December 31, 2021 from the publicly available healthcare website.

Statistical methods
Data management and analysis of the sample were conducted with SAS statistical software (Version 9.4; SAS Institute, Cary, NC). All analyses of the sample data were performed by research personnel employed by the health system. Proportions with binomial exact 95 % confidence intervals were calculated across all demographic variable levels across both participant groups and the prospective participants of Hispanic or Latino origin. Pearson chi-square tests were performed to evaluate differences by ethnicity, overall and stratified by patient characteristics, in participation in studies conducted by PIs who spoke at least one other language than English. Fisher's Exact Test p-values were interpreted for variables with at least one cell with a value less than 5 (as indicated by ^ below). Crude logistic regressions were conducted to determine associations between patient characteristic levels and the outcome of participation in CTs conducted by PIs who speak at least one other language beyond English. Finally, a logistic regression was conducted to determine the adjusted association between ethnicity and participation in CTs conducted by PIs who speak at least one other language beyond English, adjusting for all patient characteristics, specifically sex, race, age and insurance.
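As an illustration of the binomial exact (Clopper-Pearson) interval named above, the following is a minimal sketch in Python rather than the SAS code the authors used, which is not shown in this paper. The function names `binom_cdf` and `clopper_pearson` are hypothetical; the only data used are the reported counts of 254 participants of Hispanic or Latino origin among 4308 CT Participants.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(x + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""

    def solve(f, target):
        # f is decreasing in p, so bisect on [0, 1] for f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= x | p) = alpha/2, i.e. P(X <= x - 1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Reported counts: 254 of 4308 CT Participants were of Hispanic or Latino origin.
lower, upper = clopper_pearson(254, 4308)
print(f"proportion = {254 / 4308:.4f}, 95% exact CI = ({lower:.4f}, {upper:.4f})")
```

The bisection approach is chosen only to keep the sketch free of third-party dependencies; statistical packages typically obtain the same limits from beta distribution quantiles.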

Results
CT Participants (N = 4308) were divided into two groups: CT Participants of Hispanic or Latino Origin (N = 254; 5.90 %) and CT Participants of Non-Hispanic or Latino Origin (N = 4054; 94.10 %). Characteristics overall and by ethnicity, specifically CT Participants of Hispanic or Latino Origin and Non-Hispanic or Latino Origin, are described in detail in Table 1. Generally, fewer females of Hispanic or Latino origin participated relative to females of Non-Hispanic or Latino origin while, conversely, more males of Hispanic or Latino origin participated relative to males of Non-Hispanic or Latino origin. Generally, more patients of Hispanic or Latino origin who were White, American Indian/Alaskan Native and Native Hawaiian/Other Pacific Islander participated relative to patients of Non-Hispanic or Latino origin in these racial groups. Fewer patients of Hispanic or Latino origin who were Black and Asian participated relative to patients of Non-Hispanic or Latino origin in these racial groups. Generally, more younger patients who were of Hispanic or Latino origin participated relative to those who were of Non-Hispanic or Latino origin. Further, while proportions of patients of Hispanic or Latino origin participating in CTs increased with age, the gap in participation relative to patients of Non-Hispanic or Latino origin shifted, with more younger patients of Hispanic or Latino origin and more older patients of Non-Hispanic or Latino origin participating. Generally, more patients who were of Hispanic or Latino origin with Medicaid insurance participated in CTs relative to those who were of Non-Hispanic or Latino origin, while fewer patients of Hispanic or Latino origin with Medicare insurance participated relative to those of Non-Hispanic or Latino origin. Patients with Private, Self-Pay and Worker's Compensation insurance participated fairly equally regardless of ethnicity. See Fig. 1 for a visual display of these characteristics.
In the study timeframe, there were 130 total healthcare system providers serving as PIs of active CTs. Gender was missing for 21 (16.15 %) of the 130 PIs; of the 109 PIs with documented gender, 86 (78.90 %) were male and 23 (21.10 %) were female. Among all PIs, 88 (67.69 %) spoke only English and 42 (32.31 %) spoke at least one additional language; specifically, 24 (18.46 %) spoke two languages (one in addition to English), 17 (13.08 %) spoke three languages (two in addition to English), and 1 (0.77 %) spoke four languages (three in addition to English). Overall, the 42 PIs spoke 23 different languages beyond English. See Table 2 for available provider details and Table 3 for a breakdown of all languages spoken by the 130 PIs included in this study.
Pearson chi-square tests were conducted to explore associations between ethnicity and odds of participating in CTs conducted by PIs who speak another language beyond English, overall and stratified by each available patient characteristic. Relative to those who were of Non-Hispanic or Latino ethnicity, patients of Hispanic or Latino ethnicity overall and by most characteristics (female and male sex, Black and White race, all age categories, and all insurance types) were associated with greater odds of being participants in CTs conducted by PIs who spoke at least one other language beyond English, by statistically and/or clinically significant odds ranging between 1.20 (0.59, 2.42) and 5.24 (2.62, 10.47).
Relative to those who were not of Hispanic or Latino ethnicity, patients of Hispanic or Latino ethnicity who were American Indian or Alaskan Native (AI/AN) race or Native Hawaiian or Other Pacific Islander (NH/PI) were associated with lesser odds of being participants in CTs conducted by PIs who spoke another language beyond English, with clinically significant odds of 0.60 (0.05, 6.79; p = 1.0000^) and 0.83 (0.05, 13.63; p = 1.0000^), respectively. Odds could not be generated for patients of Asian race due to a cell count of zero. See Fig. 2 for forest plots visualizing all relative odds.
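To make the stratified odds and the zero-cell limitation concrete, here is a minimal sketch of an odds ratio with a Wald 95 % confidence interval and a two-sided Fisher's Exact Test for a single 2x2 table. The cell counts are hypothetical and chosen only for illustration (the per-stratum counts are not given in the text), the function names are assumptions, and the paper does not state which interval method SAS produced.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Odds ratio and Wald 95% CI for the 2x2 table [[a, b], [c, d]].

    Rows: Hispanic or Latino / Non-Hispanic or Latino.
    Columns: PI speaks another language / PI speaks English only.
    """
    if 0 in (a, b, c, d):
        # With a zero cell the ratio is 0 or undefined and the standard
        # error is infinite, so no usable estimate can be produced.
        raise ValueError("odds ratio cannot be estimated with a zero cell")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: total probability of all tables
    (with the same margins) that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(x_min, x_max + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical small-cell stratum (NOT the study's data).
odds, ci_low, ci_high = odds_ratio_wald(1, 2, 3, 4)
p_value = fisher_exact_two_sided(1, 2, 3, 4)
print(f"OR = {odds:.2f} ({ci_low:.2f}, {ci_high:.2f}), Fisher p = {p_value:.4f}")
```

With cells this small the interval is very wide and the exact p-value is large, which is the same pattern as the AI/AN and NH/PI strata reported above; a table with any zero cell raises an error instead of returning an estimate.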
Next, logistic regression analyses were performed to examine the crude and adjusted associations between ethnicity and participation in a CT run by a PI who spoke more than just English. Crude analysis revealed that patients of Hispanic or Latino ethnicity had 2.04 (1.58, 2.64) times greater odds of participating in CTs conducted by PIs who speak another language beyond English (p < 0.0001). After adjusting for sex, race, age and insurance, adjusted analysis revealed that patients of Hispanic or Latino ethnicity had 2.67 (1.97, 3.62) times greater odds of participating in CTs conducted by PIs who speak another language beyond English (p < 0.0001). This increase from 2.04 times greater odds in the crude model to 2.67 times greater odds in the adjusted model indicates that controlling for other patient characteristics strengthens the observed association between ethnicity and participation in CTs conducted by PIs who speak more than English. See Table 4 for a complete report of crude and adjusted parameter estimates and p-values.
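The reported odds ratios can be related back to logistic regression coefficients with a short calculation. The sketch below assumes the reported intervals are Wald intervals, symmetric on the log-odds scale; the paper does not state this, so it is an assumption. The helper name `wald_summary` is hypothetical, and the only inputs are the published odds ratios and confidence limits.

```python
import math
from statistics import NormalDist


def wald_summary(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover the log-odds coefficient, its standard error and a two-sided
    Wald p-value from a reported odds ratio and confidence interval.

    Assumes a Wald interval: exp(beta +/- z * SE).
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    z_stat = beta / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return beta, se, p_value


# Reported crude and adjusted results for Hispanic or Latino ethnicity.
for label, reported in (("crude", (2.04, 1.58, 2.64)),
                        ("adjusted", (2.67, 1.97, 3.62))):
    beta, se, p_value = wald_summary(*reported)
    print(f"{label}: beta = {beta:.3f}, SE = {se:.3f}, p < 0.0001: {p_value < 0.0001}")
```

Under this assumption both models give p-values below 0.0001, which agrees with the reported results; the values in Table 4 remain the authoritative parameter estimates.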

Discussion
This study was performed using two EMR data sources within a large Midwestern U.S. health system, one including all patients who participated in an active CT between January 1, 2019 and December 31, 2021 and another including only patients of Hispanic or Latino ethnicity treated by the PIs of all active CTs between January 1, 2019 and December 31, 2021. In this paper, ethnicity was a key variable of interest to represent diversity on a broader scale, but also to serve as a proxy for primary language. Primary language or perceived primary language may be an important factor associated with poor participation in CTs overall but, given true primary language may be inaccurately documented in medical records [28,29], using ethnicity as a proxy is a first, albeit imperfect, step.
Further, the outcome of participation in CTs conducted by PIs who speak one or more languages beyond English is intended in this study as a proxy for multiculturalism, including cultural sensitivity and respect for others' culture. This research team pursued the hypothesis that PIs and/or research teams led by PIs who were more diverse and more aware of and sensitive to different cultures, as defined by speaking one or more languages beyond English, are associated with recruitment of and/or participation among patients who were Hispanic or Latino.
One objective of this paper is to explore associations between ethnicity and an outcome of participation in CTs conducted by PIs who speak language(s) beyond English, overall and stratified by patient characteristics, for a close look at potential differences by ethnicity within each patient characteristic. Patients of Hispanic or Latino ethnicity had two times greater odds of participating in CTs conducted by PIs who speak another language beyond English, which introduces a novel area for future research. This study also found more Hispanic or Latino patient participation, clinically if not statistically, in CTs conducted by PIs who spoke more than just English across most characteristics: female, male, Black, White, all age categories, and all insurance categories. It is important to point out that only 6.92 % of PIs spoke Spanish, so there is likely very little language concordance between Hispanic or Latino participants and their PIs. These initial findings show the association between ethnicity and participation in CTs conducted by more culturally sensitive PIs was worth exploring.
Another objective of this paper is to determine the association between ethnicity and participation in CTs conducted by PIs who speak language(s) beyond English through a logistic model adjusting for all patient characteristics. After adjusting for all patient characteristics, patients of Hispanic or Latino origin had close to three times greater odds (2.67) of participating in CTs conducted by PIs who spoke another language beyond English. After controlling for other demographics that may contribute to greater participation in CTs conducted by PIs who speak more than just English, patients identifying as Hispanic or Latino show much greater odds of being participants.
It should be noted that this study, with the currently available and approved data, does not and cannot assess all invitations to participate or all patients eligible to participate in CTs. To this end, it is possible that this relationship could be explained by patients of Hispanic or Latino ethnicity more often seeing providers who speak more than just English for medical care and, thus, being more likely to be invited to, and to participate in, CTs conducted by such PIs; however, only 9 (6.92 %) of all 130 PIs represented in this study spoke Spanish as an additional language, and any other reasoning for this potential bias is unknown and unsuspected.
Overall findings indicate that patients of Hispanic or Latino ethnicity, who are more likely to speak Spanish, have greater odds of participating in CTs conducted by PIs who speak another language beyond English. This may imply that cultural sensitivity at the top of a CT study team, as likely to be demonstrated by PIs who speak another language beyond English, may be an important contributor to more diverse participation in CTs. While the findings of this study cannot conclude this with certainty, our results suggest this is a possibility worth exploring further. These findings do suggest that concordance related to some element of diversity or trained cultural sensitivity may be a valuable factor in reducing language-based barriers to diversity in CT participation. Even so, further evidence of this relationship will not solve the logistical barriers, like the need for interpreters, that may contribute to a lack of diverse participation in CTs.

Future research
While this study chose to focus on patients of Hispanic or Latino ethnicity who are most likely to speak the Spanish language, future research with appropriate patient samples should broaden this to other patient languages such as any of the Asian origin languages. Further, it should be noted that numbers of patients across Non-White racial categories are curiously low. It is generally accepted that documentation in healthcare of sensitive data like race is poor and often based on external appearance assumptions, but it would be of great value to explore the race-stratified association between ethnicity and participation in CTs conducted by PIs who speak language(s) beyond English with larger numbers of patients in different racial groups.

Fig. 1. Patient characteristics of Hispanic or Latino vs. Non-Hispanic or Latino participants.

Fig. 2. Overall and stratified findings reveal patients of Hispanic or Latino ethnicity have greater odds of participating in clinical trials conducted by principal investigators who speak another language beyond English.

Table 1
Demographic distributions of CT participant sample, overall and by ethnicity, representing a timeframe of January 1, 2019 to December 31, 2021.

Table 2
Demographics of clinical trial principal investigators serving CT participants and CT patients of Hispanic or Latino origin from January 1, 2019 to December 31, 2021.

Table 3
Languages spoken by PIs.

Table 4
Crude and adjusted parameter estimates of patient characteristics associated with participation in clinical trials conducted by principal investigators who speak a language beyond English.