Characterising biological mechanisms underlying ethnicity-associated outcomes in COVID-19 through biomarker trajectories: a multicentre registry analysis

Background Differences in routinely collected biomarkers between ethnic groups could reflect dysregulated host responses to disease and to treatments, and be associated with excess morbidity and mortality in COVID-19. Methods Data from a multicentre registry of patients aged ≥16 yr with SARS-CoV-2 infection and emergency admission to Barts Health NHS Trust hospitals during January 1, 2020 to May 13, 2020 (wave 1) and September 1, 2020 to February 17, 2021 (wave 2) were analysed using unsupervised longitudinal clustering techniques to identify distinct phenotypic patient clusters based on trajectories of routine blood results over the first 15 days of hospital admission. Distribution of trajectory clusters across ethnic categories was determined, and associations between ethnicity, trajectory clusters, and 30-day survival were assessed using multivariable Cox proportional hazards modelling. Secondary outcomes were ICU admission, survival to hospital discharge, and long-term survival to 640 days. Results We included 3237 patients with hospital length of stay ≥7 days. In patients who died, there was greater representation of Black and Asian ethnicity in trajectory clusters for C-reactive protein and urea-to-creatinine ratio associated with increased risk of death. Inclusion of trajectory clusters in survival analyses attenuated or abrogated the higher risk of death in Asian and Black patients. With inclusion of C-reactive protein trajectory clusters, the hazard ratio (HR) in Asian patients went from 1.36 [0.95–1.94] to 0.97 [0.59–1.59] (wave 1), and from 1.42 [1.15–1.75] to 1.04 [0.78–1.39] (wave 2). Trajectory clusters associated with reduced 30-day survival were similarly associated with worse secondary outcomes. Conclusions Clinical biochemical monitoring of progression and treatment response in SARS-CoV-2 infection should be interpreted in the context of ethnic background.

Based on ethnic differences in biomarker expression and disease outcomes, the authors hypothesised that different outcomes from COVID-19 despite similar profiles of baseline risk factors and comorbidities are associated with blood biomarkers.
In this registry analysis of the first two waves of COVID-19 in the UK, differences in ethnicity-associated outcomes were compared with trajectories of routine biomarkers in the context of underlying comorbidities and acute response to COVID-19. Of 3237 patients ≥16 yr old with hospital length of stay ≥7 days analysed, there was greater representation of Black and Asian ethnicity in trajectory clusters for C-reactive protein and urea-to-creatinine ratio associated with patients who died, and inclusion of these factors in survival analyses attenuated or abrogated the higher risk of death. Routinely collected biomarker trajectories during hospital admission are associated with adverse outcomes after COVID-19, which could reflect dysregulated host responses to disease and treatments, and could mediate ethnic imbalances in COVID-19 outcomes.
In the UK and across the Global North, people with Black and South Asian ethnic background were disproportionately affected by COVID-19 with increased hospitalisation, ICU admission, organ failure, and premature mortality. 1–4 Multiple pathways leading to differential health outcomes have been proposed, but they remain poorly understood. 5 We previously showed that initial values of routinely collected biomarkers including C-reactive protein (CRP), D-dimer, and ferritin on hospital admission with COVID-19 were increased in Black and South Asian patients, potentially reflecting increased systemic inflammation, coagulopathy, and immune dysregulation. 6,7 In the UK National Health Service (NHS) categorisation used to define our study cohort, Black ethnicity predominantly describes individuals from Caribbean and African backgrounds and Asian ethnicity is predominantly South Asian (including Indian, Bangladeshi, Pakistani), whereas people of a Chinese background are placed in the 'Other Ethnic Groups' category. This differs from the USA, where definitions are based on continent of origin. 8,9 Similarly, other investigators have shown high CRP and D-dimer, as important inflammatory markers, and thrombocytopenia to strongly correlate with COVID-19 disease severity and prognosis. 10–13 These findings suggest that potential biological differences in host response to COVID-19 occur between ethnic groups, identifiable in routinely collected biochemical data. Such differences are likely to reflect a summation of population imbalances in baseline health, comorbidity, and deprivation. 14,15 Current studies often use regression-based methodologies adjusting for static measures of baseline health status and single timepoint clinical features such as disease severity scores on hospital admission. 5 These can fail to capture and account for variations in disease response and development of critical illness in patients within similar profiles of baseline risk factors.
Use of longitudinal data, such as blood results, throughout hospitalisation could help to characterise these differences. 16 As studies begin to map long-term COVID-19 outcomes, characterising differences in disease response will help define underlying biological mechanisms and determine population-specific interventions. Based on prior evidence for ethnic differences in biomarkers, we hypothesise that different phenotypes within similar profiles of baseline risk factors and comorbid disease exist in patients admitted to hospital with COVID-19. In this study, we aimed to identify phenotypes with different underlying biological features driving ethnicity-associated outcomes using trajectories of routine biomarkers, and assess these within the context of underlying comorbidities and acute response to COVID-19.

Methods
We included all adults (age ≥16 yr) with confirmed SARS-CoV-2 infection admitted as emergencies to the four acute care hospitals within Barts Health NHS Trust. We considered admissions between January 1, 2020 and May 13, 2020 as the first study period (wave 1), and those between September 1, 2020 and February 17, 2021 as the second (wave 2). For patients with more than one hospital admission, we included the first as the index admission. Full methods of cohort collation, data collection, and baseline patient characteristics have been described. 6,7 For this analysis, we excluded patients with unknown or undisclosed ethnicity status. To allow sufficient data to explore temporal trends in blood results and outcomes, we also excluded patients with hospital length of stay <7 days.

Data sources
Clinical and patient characteristic data including ethnicity, blood results, and coding data from the clinical encounter, defined as date of hospital admission (start) to date of discharge or death (end), whichever occurred sooner, were collated from the Barts Health Cerner Millennium Electronic Medical Record data warehouse by members of the direct clinical care team. Mortality data, updated to the data warehouse routinely against the NHS Spine, which captures death registration in primary care and NHS institutions, were available to November 24, 2022, enabling a minimum follow-up of 640 days and a maximum of 1058 days.

Definition of key variables
Ethnicity information captured in electronic health records is a combination of self-reporting on hospital admission and clinician assignment based on primary care records or clinical judgement in the case of emergency hospital admissions. Ethnicity was defined using NHS ethnic category codes and based on five high-level groups: White, Asian or Asian British, Black or Black British, Mixed, and Other. In the NHS categorisation, the Asian group is predominantly South Asian (Indian, Bangladeshi, Pakistani). Because of small numbers, the Mixed and Other categories were merged in multivariable modelling to preserve statistical power. We examined results from all haematological and clinical biochemistry tests measured as part of routine daily panels. We included the last result if multiple were taken on the same day. The urea-to-creatinine ratio (UCR) was calculated from blood samples with the same date and time and expressed as urea (mM):creatinine (mM).
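The two data-handling rules above (keeping the last result when several were taken on the same day, and calculating UCR only from urea and creatinine samples with the same date and time) can be illustrated with a minimal sketch. The record layout of (timestamp, analyte, value in mM), the function names, and the example values are hypothetical and are not taken from the study's data warehouse or analysis code.

```python
from datetime import datetime


def last_result_per_day(results):
    """Keep only the latest result for each (calendar day, analyte) pair.

    `results` is an iterable of (timestamp, analyte, value) tuples.
    This layout is an assumption made for illustration only.
    """
    latest = {}
    for ts, analyte, value in results:
        key = (ts.date(), analyte)
        # Replace the stored result only if this one was taken later that day.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, value)
    return [(ts, analyte, value)
            for (_, analyte), (ts, value) in sorted(latest.items())]


def urea_to_creatinine_ratio(results):
    """Return {timestamp: UCR} where urea and creatinine share a timestamp.

    Both analytes are assumed to be in mM, as stated in the text.
    Unpaired samples are skipped rather than matched to a nearby time.
    """
    urea = {ts: v for ts, a, v in results if a == "urea"}
    creat = {ts: v for ts, a, v in results if a == "creatinine"}
    shared = sorted(urea.keys() & creat.keys())
    return {ts: urea[ts] / creat[ts] for ts in shared if creat[ts] > 0}


# Invented example values, not patient data.
results = [
    (datetime(2020, 4, 1, 8, 0), "urea", 6.0),
    (datetime(2020, 4, 1, 8, 0), "creatinine", 0.080),
    (datetime(2020, 4, 1, 18, 30), "urea", 9.0),
    (datetime(2020, 4, 1, 18, 30), "creatinine", 0.090),
    (datetime(2020, 4, 2, 7, 45), "urea", 7.2),  # no paired creatinine
]

daily = last_result_per_day(results)
ucr = urea_to_creatinine_ratio(daily)
```

In this sketch the morning pair on the first day is discarded in favour of the evening pair, and the second day yields no UCR because no creatinine result shares the urea timestamp.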
Multivariable logistic regression was used to assess ICU admission using the same covariates. Results are presented as n (%) and adjusted hazard ratios (HR) or odds ratios (OR) with 95% confidence intervals. All analyses were performed using R software version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria) and the kml package for clustering. 20
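The kml package implements k-means clustering for longitudinal data. The sketch below illustrates the general idea on complete, equal-length trajectories. It is written in Python for illustration rather than R, and it is not the study's analysis code. The deterministic choice of starting centres, the lack of missing-value handling, the number of clusters, and the example CRP values are all assumptions.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def mean_trajectory(trajs):
    """Pointwise mean of a non-empty list of trajectories."""
    n = len(trajs)
    return [sum(col) / n for col in zip(*trajs)]


def kmeans_trajectories(trajs, k, max_iter=100):
    """Plain k-means on trajectories; returns (labels, centres).

    Starting centres are picked deterministically (evenly spaced after
    sorting by trajectory mean) so the example is reproducible. This is
    a simplification, not how kml initialises its runs.
    """
    n = len(trajs)
    order = sorted(range(n), key=lambda i: sum(trajs[i]) / len(trajs[i]))
    centres = [list(trajs[order[(2 * j + 1) * n // (2 * k)]]) for j in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each trajectory joins its nearest centre.
        new_labels = [min(range(k), key=lambda c: euclidean(t, centres[c]))
                      for t in trajs]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [t for t, lab in zip(trajs, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = mean_trajectory(members)
    return labels, centres


# Invented CRP-like trajectories at five time points (not patient data).
trajs = [
    [120, 90, 60, 40, 20],
    [110, 85, 55, 35, 25],
    [130, 95, 65, 45, 15],
    [100, 140, 180, 220, 260],
    [90, 150, 190, 210, 250],
    [110, 145, 175, 230, 255],
]

labels, centres = kmeans_trajectories(trajs, k=2)
```

Here the three falling trajectories and the three rising trajectories end up in separate clusters, each summarised by its mean trajectory.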

Ethics approval and regulations
This is a secondary analysis of the EthICAL study, which was approved by NHS England Health Research Authority and Yorkshire & The Humber Bradford Leeds Research Ethics Committee as anonymised analysis of routinely collected patient data without need for direct consent (Ethics reference 20/YH/0159). All methods were performed in accordance with the relevant guidelines and regulations.

Results
A total of 3237 patients with hospital length of stay ≥7 days were included: 917 in wave 1 and 2320 in wave 2 (Table 1).
Overall mortality at day 30 was 28.2% (n=259) in wave 1 and 22.8% (n=528) in wave 2. There were differences in measures across routine blood tests between ethnic groups at hospital admission and on days 7 and 15 in both waves (Supplementary Table S1). In wave 2, there was a generalised attenuation in biochemical abnormalities. Numbers of patients excluded at each stage, median follow-up time, and numbers of measures contributing to each examined biomarker are detailed in the supplementary information (Supplementary Table S2). There were no significant differences between ethnic groups.

Phenotypic patient clusters
We considered trajectories of markers of inflammation (white cell count [WCC], CRP), coagulation (platelet count), muscle wasting (UCR) and haemoglobin, red cell distribution width (RCDW), sodium, and albumin, and identified trajectory-based

Cluster membership and baseline characteristics
We confirmed differences in age, baseline comorbidity, and frailty across ethnic groups (Supplementary Fig. S3, S4–S11). Patients in the highest risk platelet cluster were older, more comorbid, and frailer compared with those in the medium and lower risk clusters (Supplementary Table S6). Conversely, patients in the highest risk WCC, CRP, and UCR clusters were younger, less comorbid, and less frail compared with the respective medium and lower risk clusters for each marker (Supplementary Tables S7, S8, and S10).

Trajectory clusters and 30-day survival
We classified trajectory clusters according to high, medium, and lower risk defined by survival from day 7 to day 30. Across both waves, reduced 30-day survival was associated with lower platelet count, elevated WCC, elevated CRP, elevated UCR, lower haemoglobin, increased RCDW, higher sodium, and lower albumin. In the high-risk clusters, peak levels of CRP occurred during the first week of hospital admission in wave 1 compared with a secondary increase during the second week in wave 2 (Figs 1 and 2).
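One simple way to assign risk labels of this kind is to rank clusters by the crude proportion of deaths between day 7 and day 30. The sketch below does this under stated assumptions: exactly three clusters per biomarker, ranking by crude mortality, and invented cluster identifiers and counts. The text does not specify the exact rule that was used.

```python
from collections import defaultdict


def label_clusters_by_risk(patients):
    """Map each cluster to (risk label, crude day 7 to day 30 mortality).

    `patients` is an iterable of (cluster_id, died) pairs, where `died`
    is True if the patient died between day 7 and day 30. Three clusters
    are assumed; this is an illustrative rule, not the authors' method.
    """
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for cluster, died in patients:
        totals[cluster] += 1
        deaths[cluster] += int(died)
    mortality = {c: deaths[c] / totals[c] for c in totals}
    if len(mortality) != 3:
        raise ValueError("this sketch assumes exactly three clusters")
    ranked = sorted(mortality, key=mortality.get)  # lowest mortality first
    names = ["lower", "medium", "high"]
    return {c: (names[i], mortality[c]) for i, c in enumerate(ranked)}


# Invented counts for three hypothetical clusters A, B, and C.
patients = (
    [("A", False)] * 9 + [("A", True)] * 1
    + [("B", False)] * 7 + [("B", True)] * 3
    + [("C", False)] * 4 + [("C", True)] * 6
)

risk = label_clusters_by_risk(patients)
```

A crude ranking like this ignores censoring and covariates; it only fixes the labels that are later carried into adjusted models.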

Ethnicity distribution across trajectory clusters
In patients who died by day 30, proportions of high-risk trajectory clusters varied between ethnic groups (Figs 3 and 4).

Adjusted 30-day survival analyses
After adjustment for predefined baseline risk factors, higher risk clusters were consistently associated with increased risk of death (Supplementary Figs S5 and S6; Supplementary Fig. S4). In this analysis, no association with increased risk of death was observed for Black or Mixed and Other ethnicity patients, consistent with multivariable analyses in the larger datasets. 6,7 Importantly, deaths in Black ethnicity patients occurred early during hospitalisation with median days to death being 5 days, meaning a larger proportion of Black patients who died will have been excluded from this analysis. 6
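For orientation, the general form of a Cox proportional hazards model with an ethnicity term, trajectory cluster terms, and baseline covariates can be written as below. The symbols are illustrative assumptions rather than the authors' notation: x_eth is an indicator for an ethnic group relative to the reference group, x_clu encodes cluster membership, and z stands for the predefined baseline risk factors, which are not listed in this section.

```latex
h(t \mid x) = h_0(t)\,
  \exp\!\left(\beta_{\mathrm{eth}}\,x_{\mathrm{eth}}
  + \beta_{\mathrm{clu}}^{\top} x_{\mathrm{clu}}
  + \gamma^{\top} z\right),
\qquad
\mathrm{HR}_{\mathrm{eth}} = \exp(\beta_{\mathrm{eth}}),
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat\beta_{\mathrm{eth}}
  \pm 1.96\,\mathrm{SE}(\hat\beta_{\mathrm{eth}})\right)
```

Under this form, attenuation of an ethnicity-associated risk corresponds to the estimated HR for ethnicity moving towards 1 once the cluster terms are added to the model.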

Secondary outcomes
We assessed association between clusters and predefined secondary outcomes adjusted for baseline risk factors above (Supplementary Figs S7 and S8: ICU admission; Supplementary Figs S9 and S10: survival to hospital discharge; and Supplementary Figs S11 and S12: 640-day survival). Consistent directions of effect were seen for higher risk clusters associated with 30-day survival. Survival curves to 640 days are shown in Supplementary Figures S13 and S14. Higher WCC and CRP clusters had more influence in early mortality whereas lower platelet count, higher UCR, lower albumin, lower haemoglobin, and higher RCDW clusters appeared to have more sustained impact, later impact, or both on survival.

Discussion
In this multicentre study, we found that phenotypes based on routine biomarker trajectories during hospital admission are associated with adverse outcomes after COVID-19, and that ethnicity affects the interpretation of these associations. We showed that longitudinal trajectories revealed much greater differences than single timepoint values. In particular, there was strong evidence for reduced survival associated with a lower platelet count, elevated WCC, higher CRP, and elevated UCR. Increased representation of high-risk trajectory clusters associated with inflammation and catabolism in Black and Asian patients who died and attenuation of ethnicity-associated risk of death when accounting for biomarker trajectory clusters suggest ethnic differences in phenotypes of COVID-19. In contrast, White patients who died were older, frailer, and less often admitted to ICU.

Comparison with other studies
Several observational studies have described associations between different biomarkers with severe outcomes in COVID-19, including a meta-analysis of 32 studies reporting similar magnitudes of effect ranging from pooled OR 2.36 (thrombocytopenia) to 4.27 (elevated CRP). 21 A small number of studies assessing longitudinal changes in biomarkers have been limited by small sample sizes, 22–24 lack of generalisability, 25 use of selected laboratory, immune and proteomic panels, 22,25,26 restricted time points such as differences between admission to discharge, 23,27 cut-offs or aggregate measures such as means, 24,28 and retrospective assessment based on survival or disease severity. 24,26,27 The majority of these studies were also conducted using primarily first wave data. 23,24,27 From these analyses, markers of inflammation, particularly CRP, have been most frequently associated with increased disease severity and death. Only one UK-based study assessed and did not find evidence for ethnicity-related variation in measured biomarkers. 25

Phenotypes defined by biomarker trajectories
Multiple mechanisms behind COVID-19-associated thrombocytopenia have been proposed. 29 In our analysis, although large proportions of patients have a degree of thrombocytopenia, a failure to recover was most associated with poor outcomes, whereas a protective response to COVID-19 appears to involve a degree of thrombocytosis. This might reflect bone marrow and megakaryocyte suppression from an ongoing inflammatory response, inability to reduce viral load, and liver and kidney failure. 30 Increased release of inflammatory cells and acute phase proteins resulting in elevated WCC and raised CRP are associated with increased disease severity. 31,32 Compared with prior studies reporting high static measures, we found changes throughout hospitalisation. Trajectories in CRP differed between waves. Overall levels of CRP were lower in the second wave and patients with a higher level early during hospitalisation had increased hospital length of stay, whereas patients with a secondary increase had increased risk of ICU admission and death. This may reflect routine early use of corticosteroids during the second wave reducing initial inflammation, but perhaps predisposing to later nosocomial infection. 33 These differences could also relate to treatment changes both in and before hospitalisation or a change in disease profile over time. 34,35 UCR is a well-established measure of catabolism correlated with both muscle loss and development of persistent critical illness. 36–38 In COVID-19, elevated levels associated with increased mortality and requirement for prolonged organ support are likely to be driven by multiple pathways including multisystem inflammation, organ dysfunction, 39 and prolonged critical illness. 40

Ethnicity-associated biological mechanisms
Studies examining drivers of ethnic inequalities in COVID-19 remain sparse.
In particular, little is known about biological mechanisms underlying greater disease severity, differential rates of organ failure, and variations in treatment response. Proposed hypotheses include impaired glucocorticoid sensitivity attributable to factors such as chronic social stress leading to an increased inflammatory response. 41 This might help explain potential differential responses to dexamethasone with trends in greater improvement in outcomes in non-White patients. 33 Similarly, acute inflammation arising from COVID-19 might augment existing chronic inflammation secondary to increased medical comorbidity in Black and Asian patients. 14 There is considerable evidence that incidence of clinically important disease is higher in many minority ethnic groups. 42 These findings could be explained by attenuation in the higher risk of death associated with Black and Asian ethnicity in our study when adjusting for biomarker trajectory clusters.
Overall, these data suggest that ethnic imbalances in severity of COVID-19 disease could be driven by differences in baseline health status, reflecting longstanding health inequalities, and leading to adverse responses to disease in terms of inflammation and catabolism. This highlights the importance and urgency of addressing disparities in general health and wellbeing and across acute and routine care throughout the life course. Patients of different backgrounds can respond differently to disease, and better integration of these findings in routine clinical data could identify differing healthcare needs. COVID-19 has allowed examination of underlying biochemical features in a less heterogeneous disease presentation; however, future work should assess whether these differences are also seen in non-COVID disease states. This is a research priority, but the greater heterogeneity will require significantly larger data sources.

Strengths and limitations
Strengths of our study include the granularity of our dataset, a >50% ethnically diverse patient cohort, the longitudinal analysis, and relatively large sample size in both pandemic waves that had significant impact on hospitalisation in the UK. Limitations are as follows. To ensure inclusion of sufficient numbers of data points and adequate representation of blood tests across the total duration of hospital admission and ethnic groups, we were not able to fully explore correlations between different biomarker trajectory clusters and examine certain results such as D-dimer. We were unable to compare the effect of differing viral strains between waves and examine and adjust for potential differences in treatments and pathways and vaccination status. Based on clinical experience, all patient groups across hospital sites received management according to centralised treatment protocols. As a result, any observed treatment differences are more likely to reflect changes between waves rather than differences across ethnic groups. There could be systematic biases in data collection with more unwell patients receiving more frequent laboratory testing. However, data collected alongside routine clinical management allows for more directly translatable findings. In addition, current ethnic categorisations used in healthcare do not reflect the vast heterogeneity within each aggregated ethnic category. Furthermore, misclassification of patients into both unknown and other ethnic groups might occur more frequently in emergency admissions and therefore lead to over-representation of these ethnic categories and exclusion in our analysis compared with census data. Given the above limitations, our data remain exploratory in nature, and findings need to be interpreted with caution. Importantly, we make no assumptions regarding the underlying mechanisms behind phenotypes that represent complex interactions between biological, social, economic, and behavioural factors.

Conclusions
Phenotypes based on routinely collected biomarker trajectories during hospital admission are associated with adverse outcomes after COVID-19. These potentially reflect dysregulated host responses to disease and to treatments, and could be a mechanism mediating ethnic imbalances in outcomes for COVID-19. Increased representation of high-risk phenotypes associated with inflammation and catabolism in Black and Asian patients could be driven by baseline differences in health status, reflecting longstanding health inequalities.