Explaining the deprivation gap in COVID-19 mortality rates: A decomposition analysis of geographical inequalities in England

One of the most consistent and worrying features of the COVID-19 pandemic globally has been the disproportionate burden of the epidemic in the most deprived areas. Most of the literature so far, though, has focused on estimating the extent of these inequalities, and much less attention has been paid to exploring the main pathways underpinning them. In this study, we employ the syndemic pandemic theoretical framework and apply novel decomposition methods to investigate what proportion of the COVID-19 mortality gap by area-level deprivation in England during the first wave of the pandemic (January to July 2020) was accounted for by pre-existing inequalities in the compositional and contextual characteristics of place. We use a decomposition approach to explicitly quantify the independent contribution of four inequality pathways (vulnerability, susceptibility, exposure and transmission) in explaining the more severe COVID-19 outcomes in the most deprived local authorities compared to the rest. We find that inequalities in transmission (73%) and in vulnerability (49%) factors explained the highest proportion of the mortality gap by deprivation. Our results suggest that public health agencies need to develop short- and long-term strategies to address these underlying inequalities in order to alleviate the more severe impacts on the most vulnerable communities.


Introduction
There is an accumulating body of international evidence which shows that COVID-19 incidence and mortality rates are higher in more deprived areas and populations (Desmet and Wacziarg, 2020; Jung et al., 2021; Kim and Bostwick, 2020). For example, in the USA, COVID-19 mortality rates are more than twice as high in low-income counties as in the wealthiest counties (Chen and Krieger, 2021). Similarly, in England, the cumulative death rate in the most deprived 20% of local authorities was 54% higher than the rate in the 20% least deprived areas at the start of the pandemic, with inequalities persisting even during, and after, the first national lockdown (Welsh et al., 2021a). A similar picture has been reported for COVID-19 cases in England (Morrissey et al., 2021; Welsh et al., 2021b).
The observed gradients in COVID-19 mortality by area deprivation reflect an independent impact on excess mortality from the COVID-19 pandemic (Brandily et al., 2021; Decoster et al., 2021). Higher risks at the individual level interact with local area characteristics that increase and compound these risks (Brandily et al., 2021; Decoster et al., 2021). Because of these higher risk factors in more deprived areas, it has been argued that the COVID-19 pandemic is being experienced as a 'syndemic pandemic' (Bambra et al., 2020, 2021; Islam et al., 2021). A syndemic describes 'a set of closely intertwined and mutual enhancing health problems that significantly affect the overall health status of a population within the context of a perpetuating configuration of noxious social conditions' (Singer, 2000, 13).
In the case of COVID-19, the concurrent inequalities ('the syndemic') that affect more deprived areas include: greater prevalence and coexistence of non-communicable diseases such as hypertension, diabetes, asthma, COPD and obesity (Aghili et al., 2021; Bermano et al., 2021; Gao et al., 2021; Menon, 2021; Shah et al., 2021); greater exposure to poor working conditions (such as those found in low-paid, low-skilled jobs); overcrowded living spaces and poor quality, insecure housing; greater barriers to accessing healthcare, even in societies with universal health coverage; and greater likelihood of suffering from chronic stress arising from everyday material deprivation and its psychological effects (Bambra et al., 2020; Gibson et al., 2011; Guo et al., 2019; McNamara et al., 2017; Segerstrom and Miller, 2004). These pathways also reflect the wider literature on geographical inequalities in health, which has focused on the interacting influences of population (composition) and local area (context) factors (Bambra et al., 2019; Cummins et al., 2007). Bambra et al. (2020) provide a conceptual framework to help understand the four different pathways whereby these factors have shaped inequalities in the COVID-19 pandemic: unequal vulnerability, unequal susceptibility, unequal exposure and unequal transmission. Unequal vulnerability refers to the increased risk of mortality and severity of disease from the higher burden of non-communicable diseases in socio-economically disadvantaged areas. Unequal susceptibility refers to the increased risk of more severe disease for people from disadvantaged backgrounds, even in the absence of a pre-existing health problem, because their immune system is weakened from chronic exposure to stressors typical of living under conditions of material deprivation and low social status (Marmot, 2004; Whitehead et al., 2016). Unequal exposure refers to differences in the ability to shield from infection.
Typically, more deprived groups work in so-called key sectors and are required to go to the workplace, often in conditions with frequent and close interpersonal contact. People on insecure job contracts may also not be able to stop working even if sick, increasing exposure to the virus in their social networks. Finally, unequal transmission relates to the increased risk of contagion for people living in more deprived areas. This arises from living conditions that limit the possibility to self-isolate if infected, and which increase the chance of more severe disease in a household, for example because of overcrowding, population density or intergenerational households (Mikolai et al., 2020).
In this study we use data on area-level inequalities by deprivation in COVID-19 mortality from the first wave of the pandemic in England and employ decomposition techniques to estimate the relative importance of these theorized pathways in explaining observed differences at the local authority level. This is a novel contribution to the literature: whilst there are many studies, from many global regions (Calderón-Larrañaga et al., 2020; Chen and Krieger, 2021; Mena et al., 2021), estimating the extent of area-level inequalities in COVID-19 mortality by deprivation, we instead seek to identify the key factors explaining the observed differences. There have been some studies to date which have examined the influence of one or two different factors. For example, Harlem (2020) and Maroko et al. (2020) investigated the area-level socioeconomic and demographic characteristics of high- versus low-COVID-19 incidence urban areas. Almagro et al. (2021) explored the effect of commuting and overcrowding on area-level racial disparities in COVID-19 hospitalizations, and Francetic and Munford (2021) the association between commuting and COVID-19 mortality. Decoster et al. (2021) identified the role of municipal-level household composition and age profile in the income gradient in COVID-19 deaths. More wide-ranging examinations have been provided by Karaye and Horney (2020), who employed an area-level Social Vulnerability Index (comprising socioeconomic position, household composition, disability, minority status, housing and transportation) to explore how the different factors predicted COVID-19 cases in US counties. Similarly, Brandily et al. (2021) used French data to estimate the effect of labor-market exposures and housing conditions on the income gradient in COVID-19 mortality. However, these studies did not investigate other important factors, such as the share of comorbidities in an area, or environmental factors such as levels of air pollution.
The existing literature has therefore not employed a comprehensive theoretical framework of the kind we propose. Further, it has not used decomposition methods to explore the relative contribution of different factors. Our study therefore makes both an original empirical and a novel methodological contribution to the COVID-19 and health inequalities literature.

Data and variables
All data for this study are publicly available. All variables were measured at the English 2020 Local Authority (LA) geography level (Office for National Statistics). After excluding the City of London, the Isles of Scilly and Cornwall due to well-known mortality data quality issues and low population counts, 311 LAs were included in the final dataset. Data on local area deprivation came from the 2019 English Index of Multiple Deprivation (IMD) (McLennan et al., 2019).
A binary variable was derived identifying the LAs in the most deprived quintile of IMD (the first quintile, i.e. the 20% most deprived LAs compared to the rest of the country). We chose to focus on the bottom 20% because in previous work (Welsh et al., 2021a), we found that the 20% most deprived local authorities had the highest COVID-19 age-standardised mortality rates during the first wave and that these bottom two deciles had significantly higher death rates than the other eight deciles. The relationship between IMD and COVID-19 mortality rates was also not linear in the first wave (e.g. some of the more affluent IMD deciles [e.g. 9 and 10] had higher mortality rates than some of those lower down the IMD ranking [e.g. 8]).
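As an illustration only (this is not the code used in the study), the derivation of a most-deprived-quintile indicator can be sketched as follows. The function name, the toy scores and the tie-breaking rule are hypothetical, and the sketch assumes that a higher IMD score means greater deprivation.

```python
import numpy as np

def most_deprived_indicator(imd_score, share=0.2):
    """Return 1 for areas in the most deprived `share` of the ranking, else 0.

    Assumption: a higher score means greater deprivation.
    """
    imd_score = np.asarray(imd_score, dtype=float)
    # Sort from most to least deprived; ties are broken by original position.
    order = np.argsort(-imd_score, kind="stable")
    n_top = int(round(share * imd_score.size))
    indicator = np.zeros(imd_score.size, dtype=int)
    indicator[order[:n_top]] = 1
    return indicator

# Toy scores for ten hypothetical areas (not real IMD values).
scores = np.array([35.2, 12.1, 28.7, 9.4, 41.0, 17.3, 22.8, 14.9, 30.5, 11.0])
deprived = most_deprived_indicator(scores)
```

With ten areas and a 20% share, the two highest-scoring areas are flagged and the remaining eight form the comparison group.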
Weekly counts of COVID-19 deaths (based on any mention of Coronavirus in the death certificate) for the study sample of LAs (n = 311) were obtained from the UK Office for National Statistics (ONS) by date of registration (Timson, 2021) for the period 1st January 2020 to 4th July 2020, the date at which many service and hospitality establishments were allowed to re-open to the public (Department for Business, 2020; Prime Minister's Office, 2020). Age-standardised weekly COVID-19 death counts were approximated using monthly age-standardised rates available from the ONS (Office for National Statistics, 2021b; Welsh et al., 2021a).
Table 1 lists the data sources for the multiple explanatory variables included in the PCA analysis. We also summarise them and the reasons for their inclusion here:
• For the increased vulnerability pathway: the prevalence of diabetes, hypertension, coronary heart disease and obesity were chosen because these conditions were key clinical risk factors for adverse outcomes from COVID-19 infection (Shah et al., 2021). For example, studies have found that obesity was associated with increased mortality at the individual (Aghili et al., 2021; Bermano et al., 2021; Gao et al., 2021) and area levels (Menon, 2021). Likewise, cohort studies have noted a positive association between COPD and COVID-19 mortality (Higham et al., 2020). Shielding was advised by the National Health Service for people considered to be 'clinically extremely vulnerable to COVID-19'. This included people with COPD, congenital or acquired conditions affecting immunity, and cancers (NHS - National Health Service, 2020).
• For the increased susceptibility pathway: Poverty (and income inequality) rates were included as low-income households were less likely to be able to work from home and were less able to self-isolate (due to financial constraints such as no/low sick pay) (Fletcher et al., 2022). Homelessness (and rough sleeping) rates for each LA were included as people experiencing homelessness are at higher risk for chronic health conditions (clinical risk factors for COVID-19 deaths) and had a higher mortality rate during the pandemic (Cawley et al., 2022). Low quality housing (and no central heating) measures were included as research has noted an association between poor housing conditions, such as damp and mould, and respiratory infections and severity of disease (Ingham et al., 2019).
• For the increased exposure pathway: the percentage of workers in an LA employed within industries which research has shown to have an elevated risk of COVID-19, and whose workers were more likely to be designated as key workers requiring continued travel to, and attendance at, the workplace, was included (Almagro et al., 2021; Asher et al., 2021; Contreras et al., 2021; De Negri et al., 2021; Decoster et al., 2021; Office for National Statistics, 2021a). These industries included meat processing, prepared meals, food and pharma retail, passenger transport, justice and public order, and health.
• For the increased transmission pathway: Overcrowding (and high occupancy) rates were included as overcrowded living spaces limit the possibility to self-isolate if suffering from COVID-19 infection. There is evidence of higher COVID-19 rates in larger households and multigenerational households in wave 1 (Almagro et al., 2021; Karaye and Horney, 2020; UK Scientific Advisory Group for Emergencies, 2020; 2021). Percentage of households with dependent children was included as they may have experienced higher transmission (Larosa et al., 2020). Air quality was included because links have been found between air quality and COVID-19 (Cole et al., 2021; Persico and Johnson, 2021). Crime rates were included as they may have impacted on transmission, as people living in high crime areas are less likely to leave home (e.g. to engage in outside physical activity, Rees-Punia et al., 2018). Rurality (percentage of population living in a rural area) was included as research shows that mortality from, and transmission of, previous respiratory pandemics (e.g. 1918 Spanish flu or 2009 H1N1) was lower in more rural areas (Bambra et al., 2022; Rutter et al., 2012). Number of care home beds in an LA was included as the proportion of the population in an area living in nursing homes was associated with differences in COVID-19 death rates (Bach-Mortensen and Degli Esposti, 2021; Desmet and Wacziarg, 2020).

Statistical analyses
To ensure that we estimated a parsimonious model, Principal Component Analysis (PCA) was employed. We estimated four separate PCAs, one for each of the groups of factors in the theoretical framework. Components were extracted using orthogonal varimax rotation. The number of factors to retain was chosen based on the plot of eigenvalues (scree plot), excluding factors with an eigenvalue below 1, as well as on the interpretability of the factors.
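The extraction steps described above can be illustrated with a minimal sketch. It is not the software used in the study: the function names and the synthetic data are hypothetical, the PCA is assumed to be run on the correlation matrix, and retention is reduced to the eigenvalue rule alone (the scree plot and interpretability checks are judgement calls that code cannot reproduce).

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD-based update that increases the varimax criterion.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() - criterion < tol:
            break
        criterion = s.sum()
    return loadings @ rotation

def pca_eigen_rule_varimax(X):
    """PCA on the correlation matrix, dropping components with eigenvalue below 1."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep].sum() / eigvals.sum()
    return eigvals, varimax(loadings), explained

# Synthetic data: six variables driven by two independent latent factors.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 300))
noise = 0.3 * rng.normal(size=(300, 6))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + noise
eigvals, rotated, explained = pca_eigen_rule_varimax(X)
```

Because the rotation is orthogonal, the share of variance explained by the retained components is unchanged by rotation; only the distribution of loadings across components changes, which is what aids interpretation.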
We estimated the contribution of the different groups of factors (vulnerability, susceptibility, exposure and transmission) to differences in mortality by area-level deprivation through a regression decomposition approach. The method estimates the share of a total effect (the reduced model: an unadjusted relationship between an explanatory variable and an outcome) that is accounted for by a set of confounders (the indirect effect of the explanatory variable on the outcome after adjusting for confounders in an adjusted or 'full' model) (Kohler et al., 2011) and allows for a truncated distribution. We used linear regression to estimate the unadjusted and adjusted models. In the adjusted models we used the principal components of the PCA as the relevant confounders. Standard errors were clustered by 2011 ONS area (Office for National Statistics, 2015), where 2011 codes were assigned to 2020 LA codes using LA population weights. We estimated six separate models: the unadjusted or reduced model, one model adjusting for each of the 'vulnerability', 'susceptibility', 'exposure' and 'transmission' PCAs, and one last model including all four pathways simultaneously. Model fit is presented using the Bayesian Information Criterion (BIC). Confidence intervals for the decomposition estimates were calculated using 1000 bootstrap samples with replacement and strata for 2011 ONS area.
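The logic of the decomposition can be sketched as follows, under simplifying assumptions that differ from the analysis reported here: the share explained is taken as the relative drop in the deprivation coefficient between the reduced and full linear models, the bootstrap is a plain (unstratified) percentile bootstrap, and standard errors are not clustered. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def ols_coefs(y, X):
    """OLS coefficients, with an intercept prepended to X."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def percent_explained(y, deprived, components):
    """Share (%) of the reduced-model deprivation coefficient removed by adjustment."""
    b_reduced = ols_coefs(y, deprived.reshape(-1, 1))[1]
    b_full = ols_coefs(y, np.column_stack([deprived, components]))[1]
    return 100.0 * (b_reduced - b_full) / b_reduced

def bootstrap_ci(y, deprived, components, n_boot=1000, seed=0):
    """Percentile 95% interval from resampling areas with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = percent_explained(y[idx], deprived[idx], components[idx])
    return np.percentile(draws, [2.5, 97.5])

# Synthetic example: one component carries most of the deprivation effect.
rng = np.random.default_rng(1)
n = 500
deprived = (np.arange(n) % 5 == 0).astype(float)      # 20% of areas flagged
component = 2.0 * deprived + rng.normal(size=n)       # differs by deprivation
y = 1.0 * deprived + 3.0 * component + rng.normal(size=n)

pct = percent_explained(y, deprived, component.reshape(-1, 1))
lo, hi = bootstrap_ci(y, deprived, component.reshape(-1, 1), n_boot=200)
```

In this synthetic setup the reduced coefficient is about 7 (a direct effect of 1 plus 3 × 2 through the component) and the full coefficient about 1, so roughly 86% of the gap is attributed to the component.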

Results
In total, 63 LAs out of 311 in our sample were classed as the most deprived 20% LAs in the country. These LAs were more densely populated than the rest of the country (Table 2). These areas also had on average a slightly lower proportion of residents aged between 65 and 84 years (15.6%) compared to the rest of the country (17.6%), but similar proportions of people over the age of 85 years (2.3% and 2.8% in deprived and non-deprived areas, respectively). Deprived areas also had on average 11% of residents from BAME groups. In terms of vulnerability, susceptibility, exposure and transmission factors, deprived areas had higher prevalence of: comorbidities (especially COPD), households living in poverty, crime, overcrowding, and people employed in health, transport, and food and pharmaceutical retail. The full list of variables can be found in Table 3.
The average cumulative weekly age-standardised COVID-19 mortality rate was 1135.9 (SD 2704.5) per 100,000 population in the 20% most deprived areas and 853.7 (SD 1742.3) in the rest of England (Welsh et al., 2021a).

Principal components analysis
The different factors and the associated rotated factor loadings resulting from the PCAs are summarized in Table 4, and the corresponding scree plots can be found in Supplementary Fig. 1. For the vulnerability group of variables, two factors were extracted, explaining 76% of the variance in the original variables. The first factor was loaded mainly by the hypertension, CVD and shielding variables and was interpreted as "General poor health and disability, with a focus on cardiovascular conditions". The second factor for this group was loaded mainly by COPD. The susceptibility variables loaded on two factors explaining 60% of the variance in the original variables, the first factor interpreted as "Low household income and income inequality" and the second as "Poor quality home". Three factors were extracted for the exposure group of variables, explaining 66% of the variance in the original data. These were interpreted based on the loadings as "Close contact food processing", "Key service sector workers" and "Key retail sector workers". Finally, variables related to transmission loaded on three factors explaining 77% of the total variance in this group of variables. The first factor was loaded by overcrowding, high occupancy, and urban location and was interpreted as "Overcrowded urban environments", the second factor was loaded mainly by the density of care homes in an area, and the third factor was loaded mainly by households with dependent children and was used as a proxy for multigenerational households.

Decomposition analyses
Results from the decomposition analyses are based on the set of linear regressions of COVID-19 mortality on the 'vulnerability', 'susceptibility', 'exposure' and 'transmission' PCAs (Table 5). We found that COVID-19 cumulative weekly deaths were significantly higher in deprived areas in the first wave (282.2 per 100,000 population [95% CI: 106.8, 457.6]). The coefficient for the cumulative death rate per 100,000 population on the area deprivation variable was reduced to 35.1 (95% CI: −112.3, 182.4) after adjustment for the pathways. The association between IMD and COVID-19 mortality was also attenuated by vulnerability, susceptibility and transmission individually. The BIC for each model including PCAs was lower than that for the model with area deprivation alone, with the transmission and all-factors models having similar BIC and R-squared values. Table 6 shows the results of the decomposition analyses of the difference in the reduced and adjusted coefficients. Each factor on its own explained between 30.8% (exposure) and 84.2% (transmission) of the relationship between deprivation and COVID-19 mortality. When all pathways were considered together, 87.6% of the effect was explained; transmission (73.3%) remained the most important, followed by vulnerability (48.6%). Neither exposure nor susceptibility factors were significantly related to mortality after adjustment for vulnerability or transmission (indicated by small proportions with confidence intervals that include 0).
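As a quick arithmetic check on the figures reported above (a sketch using only numbers stated in the text; the variable names are ours), the unadjusted coefficient equals the difference between the two group means, and the overall share explained follows from the drop in the coefficient after adjustment:

```python
# Figures taken from the text: group means and regression coefficients
# (all per 100,000 population).
mean_deprived = 1135.9   # 20% most deprived LAs
mean_rest = 853.7        # rest of England
b_reduced = 282.2        # unadjusted deprivation coefficient
b_full = 35.1            # coefficient after adjusting for all four pathways

# With a single binary regressor, the OLS slope is the difference in group means.
gap = round(mean_deprived - mean_rest, 1)

# Relative drop in the deprivation coefficient, as a percentage.
share_explained = round(100 * (b_reduced - b_full) / b_reduced, 1)
```

Here `gap` evaluates to 282.2 and `share_explained` to 87.6, matching the reported unadjusted coefficient and the proportion explained when all pathways are considered together.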

Discussion
This study explored area-level inequalities behind the large disparities in COVID-19 mortality between more and less deprived local authority areas in England during the first wave of the pandemic. Specifically, it explored the contribution of four theorized pathways: unequal vulnerability, unequal susceptibility, unequal exposure and unequal transmission (Bambra et al., 2020, 2021). We found that from January 2020 to July 2020, transmission factors explained the largest share of the deprivation gap in mortality, followed by vulnerability factors. Transmission factors were also found to be important in previous research conducted in France, where Brandily et al. (2021) examined the gradient in COVID-19 mortality by area-level income in urban municipalities. They found that the share of over-crowded housing in an area (transmission) was an important explanation of the mortality gap (Brandily et al., 2021). Similarly, a US study of COVID-19 incidence found that COVID-19 spread faster in more urban areas with higher population density (Clouston et al., 2021). Like Brandily et al. (2021) and Clouston et al. (2021), our study also found that when accounting for all factors simultaneously in our final model, exposure factors were the least important in explaining the deprivation gap in mortality. Whilst exposure is the first avenue through which a virus will spread into a community, differences in mortality levels in an area will be influenced by what happens after that exposure (Murti et al., 2021). So, inequalities in other risks such as overcrowding, population density (transmission) and comorbidities (vulnerability) become more important once the infectious disease has been seeded into a community.
We found higher vulnerability to adverse COVID-19 outcomes in more deprived areas arising from a higher prevalence of NCDs. This suggests that the virus was more lethal in more deprived areas, characterized by populations with higher rates of key comorbidities (Almagro et al., 2021; Brandily et al., 2021; Chang et al., 2021). However, whilst the proportion of comorbidities in an area has been previously associated with greater overall vulnerability to worse local COVID-19 outcomes in England (Daras et al., 2021), associations with inequalities by area-level deprivation have been less clear (Davenport et al., 2021). In our study, we included key clinical risk factors (such as COPD, which exhibits a clear deprivation gradient in England [Collins et al., 2018]), and so our different choice of NCD measures compared to these earlier studies may be responsible for our different findings. Whilst our research has examined the place-based factors that matter for the increased mortality from COVID-19 in deprived areas, our findings might also have relevance and insights for other respiratory illnesses including influenza. As with COVID-19, inequalities were documented in the 2009 H1N1 influenza pandemic, and there are well-documented inequalities by area-level deprivation every year in seasonal flu deaths. For example, the mortality rate from H1N1 in the most deprived areas of England was three times higher than in the least deprived areas (Rutter et al., 2012). Similarly, in Canada hospitalization rates for H1N1 were associated with neighbourhood deprivation (Lowcock et al., 2012). Various studies into cyclical winter influenza have also found associations between mortality, morbidity and symptom severity and socio-economic status amongst both adults and children (Crighton et al., 2007; Tam et al., 2014). The pathways we have detected for COVID-19 might also be important for inequalities in flu.
By examining the influence of place characteristics on inequalities in COVID-19 mortality, our study also makes a contribution to our understanding of the geographical nature of health inequalities. This longstanding sub-section of the medical geographies literature (Elliott, 2018) examines the well-documented health inequalities that exist within and between places at different scales: from the life expectancy gap of 9 years between men living in the most and least deprived neighbourhoods of England, to the excess mortality in the West of Scotland, or the US health disadvantage (Bambra et al., 2019). There has long been a tension in the geographical analysis of these health inequalities about the relative importance of compositional and contextual factors, and their interrelationship (Bambra, 2016; Cummins et al., 2007). Compositional factors explain the relationship between place and the health of a community by focusing on the demographic (age, sex, ethnicity), health-related practices (smoking, alcohol, physical activity, diet, drugs, gambling), and socio-economic characteristics (income, education, occupation) of the people living within the area (neighbourhood, town, city, region). In contrast, contextual analyses explore how the characteristics of a local place - economic (e.g. area-level: poverty rates, unemployment rates, wages, types and quality of work, and job availability), social ('opportunity structures' [e.g. services such as childcare, food availability, healthcare, housing or schools] and 'collective social functioning' [e.g. social cohesion, community control, social capital, stigma]) and physical (e.g. air pollution, access to green space, built environment) - also matter for the health of the community living in that place (Macintyre et al., 2002; Pouliou and Elliott, 2010; Bambra et al., 2014).
Our study engages with this debate by suggesting that, on the one hand, compositional factors, specifically underlying rates of NCDs (vulnerability), matter for unequal COVID-19 outcomes but, on the other hand, the contextual characteristics of places also matter, particularly housing conditions and population density rates (transmission). Our results also suggest that these compositional and contextual factors act relationally both within and between pathways.
An important strength of this study is the use of a theory-guided framework to investigate the observed health inequalities in COVID-19 in England. In contrast to previous studies that have explored only a couple of potential mediators, we concurrently test for a wide range of socio-spatial inequalities that could explain the area-level deprivation gap in COVID-19. Our results show that indeed area-level inequalities were explained by this comprehensive set of factors.

Table 6. Percentage (and 95% confidence interval) of the total effect of deprivation on COVID-19 mortality rates accounted for by the different 'vulnerability', 'susceptibility', 'exposure' and 'transmission' factors between January 2020 and July 2020.

Our study is also strengthened by using age-standardised mortality rates to account for the impact on COVID-19 mortality arising from different age profiles across local authorities. This was done as an approximation, pro-rating monthly age-standardised rates to weekly ones based on weekly death counts, owing to a lack of official weekly age-standardised mortality rates at the local authority level (Welsh et al., 2021a).
A point of note in our analyses is that we did not adjust for the ethnic composition of neighbourhoods. Mainly in the United States, but also in other parts of the world, some of the areas worst affected by the pandemic within a country have been those with high concentrations of non-white ethnic groups (Karaye and Horney, 2020;Ruck et al., 2021). In England, many of the social inequalities associated with greater deprivation are closely interlinked with ethnicity (e.g. living in multi-generational homes and more crowded housing, holding key sector and low-paying occupations, greater prevalence of comorbidities associated with greater COVID-19 death risk, etc.) (Marmot and Allen, 2020;Nazareth et al., 2020; UK Scientific Advisory Group for Emergencies). Including a variable with such strong correlations to the exposures of interest would have very likely subsumed their predictive effects and left us with an uninformative model about the reasons why local area inequalities account for mortality disparities (Office for National Statistics, 2020). However, the important association between ethnicity and adverse COVID-19 outcomes should also be explored further (Katikireddi et al., 2021).
Our study is also subject to some important limitations. Firstly, using COVID-19-specific mortality, as opposed to a measure of excess mortality, could have biased (underestimated) our estimates of the effect of area deprivation on deaths if true COVID-19 deaths were underreported, which may well have been the case at the start of the pandemic. Similarly, our estimates from the decomposition analyses could also be conservative if underreporting of true COVID-19 deaths was associated with deprivation. Given findings from other studies (see e.g. Brandily et al., 2021), however, we would expect our main conclusions to remain the same even when using excess mortality. Secondly, we used mortality data from local authorities in England. This was partly because local authority COVID-19 mortality data was available publicly at the time we conducted our analysis, but also because most of our explanatory variables were only available at the local authority level, not at a smaller neighbourhood level (e.g. lower super output areas [LSOAs]). Analysis of smaller-level geographies would allow a more precise estimation of the extent of area-level inequalities in COVID-19 mortality, but data for our predictors is not available at this scale. Thirdly, we chose to focus on the bottom 20% compared to the other 80% because previous work (Welsh et al., 2021a) found that the 20% most deprived local authorities had the highest COVID-19 age-standardised mortality rates and that the relationship between IMD and COVID-19 mortality rates was not linear. However, we acknowledge that the other 80% of local authorities are likely to be heterogeneous themselves. Further work could explore differences in the pathways between these other four quintiles.
Fourthly, some of the variables used in each of the four pathways are collinear with deprivation, and some are in fact taken directly from the calculation of the underlying IMD score domains; however, the pathways fit better than deprivation alone. Finally, whilst we have identified relationships at the area level we cannot, of course, assume that our findings hold true at the individual level. To do so would be to risk the ecological fallacy, since relationships identified for areas cannot be assumed to apply to individuals (Fieldhouse and Tye, 1996). Further cohort-based work would need to be conducted to explore the factors that influence individual-level socio-economic inequalities in COVID-19 (Office for National Statistics, 2021a).

Conclusion
This study provides evidence that the higher rates of COVID-19 mortality in more deprived areas arose from pre-existing inequalities in vulnerabilities (i.e. a greater burden of NCDs) and inequalities during the pandemic in transmission factors, specifically housing conditions and population density. This is consistent with the theorization of the pandemic as a syndemic for more deprived communities (Bambra et al., 2020). This suggests that public health agencies need to understand how these factors are unequally distributed across and clustered within communities (Daras et al., 2021). Efforts to reduce the impact of COVID-19, and any future pandemics (including influenza), on more deprived communities must focus on addressing these priority underpinning pathways by intervening to reduce health inequalities through action on the social determinants of health (Bambra et al., 2010).