Quantifying the impact of air pollution on Covid-19 hospitalisation and death rates in Scotland

Better understanding the risk factors that exacerbate Covid-19 symptoms and lead to worse health outcomes is vitally important in the public health fight against the virus. One such risk factor that is currently under investigation is air pollution concentrations, with some studies finding statistically significant effects while other studies have found no consistent associations. The aim of this paper is to add to this global evidence base on the potential association between air pollution concentrations and Covid-19 hospitalisations and deaths, by presenting the first study on this topic at the small-area scale in Scotland, United Kingdom. Our study is one of the most comprehensive to date in terms of its temporal coverage, as it includes all hospitalisations and deaths in Scotland between 1st March 2020 and 31st July 2021. We quantify the effects of air pollution on Covid-19 outcomes using a small-area spatial ecological study design, with inference using Bayesian hierarchical models that allow for the residual spatial correlation present in the data. A key advantage of our study is its extensive sensitivity analyses, which examines the robustness of the results to our modelling assumptions. We find clear evidence that PM2.5 concentrations are associated with hospital admissions, with a 1 μgm−3 increase in concentrations being associated with between a 7.4% and a 9.3% increase in hospitalisations. In addition, we find some evidence that PM2.5 concentrations are associated with deaths, with a 1 μgm−3 increase in concentrations being associated with between a 2.9% and a 10.3% increase in deaths.


Introduction
Covid-19 has caused worldwide health and economic devastation, and was declared a global pandemic by the World Health Organisation (WHO) on 11 ℎ March 2020. The disease originated in Wuhan in the People's Republic of China in December 2019, and subsequently spread across the world in the following months. As of 1 March 2022 there have been over 437 million cases worldwide, and over 5.9 million of those individuals have sadly died from the virus (Johns Hopkins Coronavirus Resource Centre, https://coronavirus.jhu.edu/map.html). There is thus a rapidly growing research literature focusing on the Covid-19 pandemic, including: (i) modelling the spread of the pandemic and its impacts on healthcare (Remuzzi andRemuzzi, 2020 andLee et al., 2021); (ii) identifying the factors that make people at a higher risk of displaying severe symptoms (Conticini et al., 2020 andWu et al., 2020); (iii) identifying the wider health impacts of the pandemic (Douglas et al., 2020); and (iv) developing real-time surveillance systems to track disease incidence (Dong et al., 2020). effects of PM 2.5 on Covid-19 incidence in the USA but not in Canada, England or Italy. This paper adds to this limited global evidence base, by presenting a new study quantifying the effects of air pollution on Covid-19 outcomes in Scotland, United Kingdom, which as of 1 March 2022 has seen over 1,390,000 cases and over 10,670 deaths (Johns Hopkins Coronavirus Resource Centre) due to . The impact of air pollution on health more generally is a continuing priority for the Scottish government, because it recently published its second Cleaner Air for Scotland (CAFS) strategy in July 2021 (Scottish Government, 2021). Thus understanding the role that air pollution may play on adverse health outcomes due to Covid-19 is an important topic for the Scottish Government, and as no study has yet examined this at the small-area scale, this is the evidence gap that this paper fills.
Our study is based on small-area count data summarising Covid-19 hospitalisations and deaths between March 2020 and July 2021, which thus makes it one of the most up-to-date and comprehensive studies in terms of its temporal duration (e.g. Wu et al., 2020 used data up to 18 ℎ June 2020 while Konstantinoudis et al., 2021used data up to 30 ℎ June 2020. These small-area count data are modelled using Poisson log-linear models, where the spatial variation in Covid-19 risk is explained by air pollution concentrations, a range of other important confounding factors, and random effects allowing for residual spatial autocorrelation. When presenting the study results we conduct a wide-ranging sensitivity analysis to our modelling assumptions, thus quantifying the robustness of our results to model misspecification. The study region and data are presented in Section 2, while the statistical models we use are presented in Section 3. The results of the study are presented in Section 4, while Section 5 concludes the paper highlighting our key findings.

Data
The study region is Scotland, which for the purposes of this study has been partitioned into = 1279 small areal units called Intermediate Zones (IZ) that have an average population of around 4000 people. IZs are a Scottish Government developed geography for the distribution of small-area statistics, and further details can be found at https://www.spatialdata.gov.scot/geonetwork/srv/api/records/389 787c0-697d-4824-9ca9-9ce8cb79d6f5. The time period for our study is from 1 March 2020 until 31 July 2021, a duration of 17 months. We use a spatial rather than a spatio-temporal ecological design for our study, where the data are aggregated over time to yield one summary measure for each variable in each IZ representing the entire study duration. We adopt this design because having a, say monthly, spatiotemporal design would result in around 1 hospitalisation and less than 1 death on average in each spatio-temporal unit (see Table 1), which would thus make regression effects hard to estimate due to the very small numbers of disease events in each spatio-temporal unit.

Covid-19 hospitalisation and death data
There are two outcome variables in this study, namely the numbers of Covid-19 hospitalisations and deaths that occurred from the population living in each IZ between 1 March 2020 and 31 July 2021. The hospitalisation data were obtained directly from Public Health Scotland, while the death data were obtained from the National Records for Scotland (https://www.nrscotland.gov.uk/covid19stats).
The numbers of Covid-19 hospitalisations and deaths in an IZ will depend on the size and age-sex demographics of its population, which we adjust for using indirect standardisation. Specifically, we compute the expected numbers of Covid-19 hospitalisation and deaths in each IZ, based on the assumption that Scotland-wide age-sex specific hospitalisation/death rates apply in each IZ. National age-sex specific Covid-19 hospitalisation/death rates for the study period by five year age group were obtained from the sources listed above, while age-sex specific population totals for 2019 (the closest year of data available) were obtained from the Scottish Government. The expected number of hospitalisations/deaths in each IZ is computed by multiplying the number of people in each age-sex group by the national Covid-19 hospitalisation or death rate for that group, before summing over groups to obtain the final expected count separately for hospitalisations and deaths.
An exploratory measure of Covid-19 risk in each IZ is the standardised morbidity/mortality (death) ratio (SMR), which is the observed number of hospitalisations/deaths divided by the expected numbers of hospitalisations/deaths. An SMR of 1 corresponds to an average risk area relative to the national hospitalisation/death rates, values less than one correspond to low risk areas, while values greater than one correspond to high risk areas. Specifically, a value of 1.2 corresponds to a 20% increased risk compared to the national average. Maps of the spatial distributions of the SMR are displayed in panels A (hospitalisation) and B (death) of Fig. 1, while a numerical summary of their quantiles across the IZs is given in Table 1. Both these displays highlight the large amount of noise in the SMR, with values ranging between (0, 3.66) for hospitalisations and (0, 5.81) for deaths. The increased D. Lee et al. amount of noise in the death SMRs is due to them being based on smaller numbers of events compared to the hospitalisation SMRs, with median sizes of 21 hospitalisations and 6 deaths in an IZ over the study period. The SMRs also show little in the way of a clear spatial trend across Scotland, which is also likely to be caused by the large amount of noise.

Air pollution data
The network of air pollution monitors in Scotland is spatially sparse (see http://www.scottishairquality.co.uk), and thus does not provide a measurement for each of the = 1279 IZs in this study. Therefore, in common with Haining et al. (2010) and Lee et al. (2019) we utilise modelled concentrations instead, specifically annual averages for 2019 from the Pollution Climate Mapping (PCM) model (https:// uk-air.defra.gov.uk/data/pcm-data) developed for the Department for the Environment, Food and Rural Affairs (DEFRA). We use concentrations for 2019 rather than for 2020, because the latter are artificially lower than normal concentration levels. This is because lockdowns and other mobility restrictions were introduced in Scotland during 2020, which reduced transport usage and hence pollution concentrations for a sizeable proportion of that year. A full lockdown was implemented between 23 March-29 ℎ May, while a range of other local restrictions occurred across Scotland between June and December. However, to assess the effect of using 2019 rather than 2020 concentrations we also fit models based on the 2020 concentrations as a sensitivity analysis.
The PCM model produces estimated concentrations on a 1 km 2 grid, which is spatially misaligned with the irregularly shaped IZs. The problem of spatial misalignment between multiple data sets is known as the change of support problem (Gelfand et al., 2001), and is a common challenge in spatial epidemiology. For example, Berrocal et al. (2010) and Forlani et al. (2020) proposed approaches for dealing with spatial misalignment between multiple air pollution data sets, specifically between point-level measurements and grid-based modelled estimates. In contrast, a range of approaches have been proposed for spatially re-aligning air pollution measurements to an irregular areal unit geography, which is the spatial footprint that population-level disease data are typically available at. When the pollution data used in a study are point-level measurements then geostatistical Kriging is often used (e.g. Huang et al., 2021), while if modelled data with complete spatial coverage are used then simple averaging (e.g. Haining et al., 2010 andLee et al., 2019) or area-based interpolation (Berg et al., 2021) are most commonly adopted.
In this study as the pollution estimates from the PCM model have complete spatial coverage of the study region, we use the simple averaging approach in common with the authors outlined above. Specifically, each 1 km 2 gridded concentration has an associated centroid (central point), and we estimate the pollution concentration in an IZ as the mean of the grid square concentrations whose centroids lie within the IZ. Any IZ that does not contain a grid square centroid is assigned the pollution concentration from the nearest grid square.
In this study we consider concentrations of nitrogen dioxide (NO 2 ) and both coarse (PM 10 ) and fine (PM 2.5 ) particulate matter, all of which are measured in μg m −3 . We choose these pollutants because they are the ones responsible for 35 out of the 36 air quality management areas in Scotland, for details see http://www.scottishairquality.scot/ laqm/aqma. The spatial distributions of these pollutants are presented in Fig. 1 and Table 1, with the former displaying NO 2 (panel C) and PM 2.5 (panel D) concentrations. The figure and table show that high NO 2 concentrations are almost exclusively observed in the main urban centres of Aberdeen, Dundee, Edinburgh and Glasgow, with the remaining rural areas exhibiting very low concentrations. In contrast, PM 2.5 concentrations are more evenly distributed spatially, albeit with a clear east-west trend, with higher concentrations in the east being caused by transboundary pollution coming from continental Europe.

Covariate data
We have also collected data on a number of confounding factors that will likely influence Covid-19 hospitalisation and death rates, which are summarised in Table 1. The main confounding factor when modelling Covid-19 death rates is the numbers of Covid-19 cases, and data are available from Public Health Scotland (https://www.opendata. nhs.scot/dataset/covid-19-in-scotland) on the rate of cases per 100,000 people over the 17 month duration of the study. Additionally, we have data on Covid-19 vaccination rates, which will likely affect hospitalisation and death rates during the latter part of the study period. The first vaccinations were given in Scotland in December 2020, so while they could not affect the Covid-19 outcomes that occurred between March 2020 and December 2020, they may have had an affect after that time. However, it is not clear what an appropriate cutoff date should be for measuring vaccination rates in this study, because if one chooses an early date, such as 1st January 2021, then most areas will have a very small proportion of their population vaccinated by then and hence the variable will likely have little effect. In contrast, if one chooses a cutoff date of 31st July 2021 (the end of the study), then vaccination rates on this date are unlikely to have affected the hospitalisations/deaths that occurred during the study. Therefore we compute the percentage of the adult population (defined as people aged 20 or over because population counts are only available for 5-year age bands) who were either single vaccinated or double vaccinated by the end of February 2021 and April 2021, and determine which is the best predictor via a model building exercise.
Another important confounding factor in spatial epidemiological studies is socio-economic deprivation, with poorer populations tending to exhibit a greater burden of adverse health compared to more affluent areas. In Scotland in 2020 socio-economic deprivation is measured by the Scottish index of multiple deprivation (SIMD, https://simd.scot/), which is a composite index comprising indicators in the domains of geographical access to services, crime, education, employment, health, housing and income. However, as our outcome variables are health related we do not consider indicators from the health domain. Full details of the individual indicators in the SIMD are given in Section 1 of the supplementary material accompanying this paper. The index and its indicators are constructed at the Data Zone (DZ) scale, which is a smaller small-area geography that nests exactly within the IZs used here. Thus each individual indicator is aggregated to the IZ scale by population weighted averaging. However, there are 9 incidences of missing values across these SIMD indicators, which are estimated by a spatially weighted average of the values in neighbouring areas (those that they share a common border with).
We also consider a number of other covariates for possible inclusion in the model, because existing studies have shown that they may affect rates of Covid-19 ill health or death. The first is population density computed as the number of people per hectare in each IZ, and areas with higher population densities are likely to have higher Covid-19 hospitalisation/death rates due to the likely increased mixing of the population as they live closer together. Secondly, care homes for elderly people are known to have been substantially impacted by Covid-19 ill health throughout the pandemic. Therefore we obtained data for each IZ on the number of registered places in care homes for older adults in November 2020 (the middle of the study period) from the Care Inspectorate (https://www.careinspectorate.com/). Additionally, we have data on ethnicity from the 2011 Scottish census, with more recent data not available because the census is only performed every ten years and the 2021 census was pushed back to 2022 due to Covid-19. For each IZ we have data on the percentage of people who are from the following ethnic groups taken from the census: (i) black; (ii) Chinese descent; (iii) Indian/Pakistani/Bangladeshi descent; and (iv) white.

Exploratory analysis
We first built simple Poisson log-linear regression models separately for the hospitalisation and death count data sets without allowing for spatial autocorrelation, as this enables us to determine whether the covariates capture all of the spatial autocorrelation present in the data. We began by including the expected number of hospitalisations/deaths as an offset term in each model, as well as adding in the covariates relating to Covid-19 case rates, population density and the number of carehome places. All of these covariates were significantly related to both the hospitalisation and death counts, and were hence retained in the model.
Next we considered how to account for socio-economic deprivation in the model, and after ignoring the health domain the SIMD comprises 20 indicators of socio-economic deprivation from six domains, which have pairwise correlations ranging between −0.83 and 0.98. Therefore we included variables relating to each of the six remaining domains in separate models, and represent socio-economic deprivation by the domain that minimises the AIC. For the Crime, Employment and Income domains that are measured by only one indicator, we use that indicator as the domain specific measure of socio-economic deprivation. For the remaining domains we apply principal component analysis (PCA) separately to the indicators in that domain, and for simplicity use the first principal component (PC) as the domain specific measure of socioeconomic deprivation. These first principal components account for 75% (Access), 99% (Education) and 95% (Housing) of the variation in the domain specific indicators, suggesting that large proportions of the variation in each domain can be captured by just 1 PC. The employment domain produced the best fitting model for both outcomes as measured by AIC, and is the one that is retained in both cases.
Then we added in the four vaccination rate variables in separate models due to their high collinearity, and overall the percentage of the adult population who were double vaccinated by the end of April 2021 best fitted the data and was hence used in the final models for both disease outcomes. Finally we added in the ethnicity covariates, specifically the percentages of each IZ's population who were: (i) black; (ii) from Indian/Pakistani / Bangladeshi descent; and (iii) from Chinese descent. Note, the percentage of each IZ's population who were white was highly correlated with these three variables and was hence not included. A single air pollutant was also added to the model at this stage, and multiple pollutants are not considered in a single model in the main set of results due to their high pairwise correlations (between 0.66 and 0.91).
The residuals from these final covariate only models were then assessed for the presence of overdispersion and spatial autocorrelation. The residuals exhibited substantial overdispersion for both outcomes, with overdispersion parameters of̂= 2.13 (hospitalisations) and̂= 1.96 (deaths) respectively, where Var[ ] = E[ ]. They also exhibited significant spatial autocorrelation, which was assessed via a Moran's I test (Moran, 1950) based on a neighbourhood matrix (see next section for details) defined by the commonly used border sharing rule. The residuals had observed Moran's I statistics of 0.26 for hospitalisations and 0.10 for deaths, with corresponding p-values of 0.0001 against a null hypothesis of independence (based on a permutation test with 10,000 random permutations) in both cases, suggesting that the data contain residual spatial autocorrelation after covariate adjustment.

Statistical models
Let ( , ) respectively denote the observed and expected numbers of hospitalisations or deaths from the population living in the th IZ, where = 1, … , (= 1279). We model hospitalisations and deaths separately because this ensures that correlations are not induced between their risk estimates, as would be the case if a multivariate CAR type model (e.g. Gelfand and Vounatsou, 2003) was used. These between outcome correlations may affect the estimated covariate effects and their standard errors, and we do not want the death rates, for example, to influence the relationship between air pollution and hospitalisation rates. We fit the same univariate spatial models outlined below to the hospitalisation and death outcomes separately, and inference is based in a Bayesian paradigm using Integrated Nested Laplace Approximations (INLA) via the R package INLA (Rue et al., 2009).

Level 1 -Data likelihood model
As the response variable = ( 1 , … , ) is a vector of spatially aggregated counts for the IZs, the first level of our Bayesian hierarchical model is the Poisson log-linear specification given by Here denotes the risk (rate) of hospitalisation or death in the th IZ relative to the expected count , and is on the same scale as the SMR described above for interpretation purposes. The spatial variation in is modelled by a × 1 vector of covariates, including air pollution concentrations, { }, with regression parameters , and a spatial random effect . The air pollution concentrations are taken from the spatially re-aligned PCM model output described in the previous section, and for consistency with the existing air pollution and Covid-19 studies (e.g. Wu et al., 2020, Huang et al., 2021and Konstantinoudis et al., 2021 we assume they are known concentrations. Additionally, these concentrations are used by the UK government advisory Committee on the Medical Effects of Air Pollutants (COMEAP) in their reports (see for example COMEAP, 2010), which validates our use of them in this study as known concentrations. We note that an alternative modelling approach would be to allow for uncertainty in the estimated pollution concentrations, using approaches similar to those proposed by Blangiardo et al. (2016), Lee et al. (2016), Huang et al. (2018) and Cameletti et al. (2019). The next section describes the models we fit for the spatial random effects = ( 1 , … , ), while the following section discusses prior specification.

Level 2 -Spatial random effects model
The random effects allow for any residual spatial autocorrelation present in the data after covariate adjustment, which is caused by factors such as unmeasured confounding. The spatial autocorrelation structure in the random effects is controlled by a × binary neighbourhood matrix , which specifies which pairs of areas are spatially close together, i.e. are defined to be neighbours. If two areas ( , ) are defined to be neighbours then their random effects ( , ) are modelled as partially autocorrelated, while if they are defined not to be neighbours then ( , ) are modelled as conditionally independent. In this paper we consider the two commonly used specifications of described below, which allows us to assess the sensitivity of our results to this modelling choice.
Border sharing -The border sharing rule is the most commonly used method for specifying , and = 1 if areal units ( , ) share a common border and = 0 otherwise, with = 0 ∀ . However, in our data there are 6 island IZs that have no neighbours under this definition, which we rectify by assigning the mainland IZ that is closest to each of these island IZs as a neighbour. The matrix is further adjusted by adding in 5 additional neighbour relations linking island groups to the mainland, which ensures the neighbourhood graph has only one connected component.

Nearest neighbour -
The border sharing rule results in vastly different numbers of neighbours across the set of IZs (between 1 and 26), so we compare this to the −nearest neighbour rule where every IZ has a similar number of neighbours. Specifically, area is a neighbour of area , and hence = 1, if it is one of the nearest to it in terms of inter-centroid distances, with = 0 ∀ as before. This matrix is asymmetric, and is made symmetric by if = 1 and = 0 then setting = 1. Here we choose = 4 because it has an almost identical mean number of neighbours for each IZ as the border sharing specification (5.162 compared to 5.163). However, these specifications only agree on around 60% of their neighbours, meaning they are somewhat different specifications of spatial closeness. Finally, the matrix is adjusted by adding in 2 additional neighbour relations to ensure the neighbourhood graph has only one connected component.
The adjustments described above to ensure the neighbourhood graphs each have one connected component ensures that multiple identifiability constraints (one for each connected component) are not required when fitting the model. A number of conditional autoregressive (CAR) priors have been proposed for modelling spatial autocorrelation in areal unit data using a neighbourhood matrix, and here we consider the following three specifications to see how robust our results are to this modelling choice. Leroux et al. (2000) for globally smooth spatial autocorrelation.

Leroux -The CAR prior proposed by
Modified BYM -The CAR prior proposed by Simpson et al. (2017) for globally smooth spatial autocorrelation whose scale is not affected by .

LCAR -
The CAR prior proposed by Lee and Mitchell (2013) for locally smooth spatial autocorrelation.
A full mathematical specification of each of these random effects priors is given in Section 2 of the supplementary material accompanying this paper.

Level 3 -Prior distributions
The regression parameters = ( 1 , … , ) are assigned independent weakly informative zero-mean Gaussian prior distributions with a large variance, i.e.
∼ N(0, 100000) for = 1, … , , to ensure the data play the dominant role in estimating their values. This choice of prior distribution for the regression parameters is commonly used in the literature, but the choice of prior distributions for the precision and spatial dependence parameters ( , ) described in Section 2 of the supplementary material is less straightforward. Therefore we fit the three hierarchical spatial models described above to the data with two different choices of prior distributions for ( , ), so that its impact on the results can be assessed. These two sets of prior distributions are given by:

Study results
The results of our study are presented below, and in all cases inference is based on integrated nested Laplace approximations using the full Laplace approximation. To illustrate the sensitivity of our results to model specification, we compare models with all possible combinations of the following three factors: • Spatial random effects priors -the three CAR priors for the spatial random effects listed in Section 3.2 denoted by Leroux, Modified BYM, and LCAR. • Neighbourhood matrix -the two specifications of the neighbourhood matrix described in Section 3.2, namely the border sharing rule denoted Border and the 4 nearest neighbour rule denoted 4NN. • Priors -the two sets of prior distributions for ( , ) outlined in Section 3.3 and denoted by Main and Alternative.
In all cases the models include the following covariates, which were chosen following the model building exercise outlined in Section 2.4: Covid-19 case rates per 100,000 people over the study period; number of carehome places; population density; the percentage of working age people defined to be employment deprived; the percentage of the adult population who were double vaccinated by the end of April 2021; and the percentages of people who are black, Chinese, or from Indian/Pakistani/Bangladeshi descent. Finally, each model includes one of the three air pollutants (NO 2 , PM 2.5 , PM 10 ). Note, multiple pollutants are not included in the same model in this main analysis due to their high collinearity. The next sub-section describes the overall fit of each model (4.1), while subsequent sub-sections present the results of the air pollution effects (4.2), the other covariate effects (4.3), and outlines the further sensitivity analyses presented in the supplementary material accompanying this paper (4.4).

Overall model fit
The overall fit to the data of each model specification is summarised by the deviance information criterion (DIC, Spiegelhalter et al., 2002) in Table 2  outcomes changing the model has relatively little impact on the DIC, with maximum reductions of 55 for hospitalisations and 49 for deaths between the best and worst fitting models. The LCAR prior fits the hospitalisation data best while the modified BYM CAR prior fits the death data best, although as described above the differences in DICs are not large. The effect of changing the specification of is smaller for the death outcome and slightly larger for the hospitalisation outcome, and for the latter the border sharing specification consistently best fits the data. Finally, the choice of priors for ( , ) have almost no impact on model fit for either disease outcome as measured by the DIC.

Pollution-health effects
The estimated relative risks between each pollutant and disease outcome for all model specifications are displayed in Tables 3 and 4, which respectively relate to the hospitalisation and death outcomes. In all cases these relative risks are estimated from single pollutant models, and relate to a 5 μg m −3 increase in NO 2 and a 1 μg m −3 increase in both PM 2.5 and PM 10 as these are realistic increases in the concentrations that might occur in Scotland (see Table 1).
For the hospitalisation outcome the estimated relative risks are fairly consistent across the different models, suggesting the results are relatively robust to model misspecification. The main finding is that elevated PM 2.5 concentrations are consistently statistically significantly associated with Covid-19 hospitalisations, with a 1 μg m −3 increase in concentrations being associated with between a 7.4% and a 9.3% increased risk. To further examine the consistency of these estimated associations between the models we summarise their entire posterior distributions via density strips in Section 3 of the supplementary material accompanying this paper.
Increasing PM 10 concentrations by 1 μg m −3 is estimated to increase the risk of hospitalisation by between 1.9% and 2.9%, with half of these estimated associations (those based on the border sharing ) being just statistically significant at the 5% level while the other half (those based on the 4 nearest neighbour ) are nearly statistically significant. Finally, the estimated relative risks for NO 2 correspond to between a 1.6% and a 2.8% increased risk for a 5 μg m −3 increase in concentrations for all models considered, but these effects are not statistically significant as all the 95% credible intervals contain the null risk of one.
For the death outcome the estimated relative risks are more variable across the different model specifications for NO 2 and PM 2.5 compared to the corresponding results for hospitalisations. This is most likely to be because the death outcome has many fewer events in each IZ compared to the hospitalisation outcome (6 vs. 21 on average), leading to less precise and consistent inference and a stronger impact of the prior distributions. For PM 2.5 a 1 μg m −3 increase in concentrations is associated with between a 2.9% and a 10.3% increased risk, and Table 3 Estimated relative risks and 95% credible intervals for the effects of each pollutant on Covid-19 hospitalisation rates using 2019 pollution data. The relative risks relate to a 5 μg m −3 increase in NO 2 and a 1 μg m −3 increase in both PM 2.5 and PM 10 . 4 of these 12 associations are statistically significant at the 5% level. The corresponding risks for NO 2 range between 1.7% and 5.9% for a 5 μg m −3 increase in concentrations, although in all cases these effects are not statistically significant. Finally, the effects of PM 10 are consistent across all 12 models, with non-significant relative risks close to one in all cases.

Non-pollution covariate effects
The estimated relative risks and 95% credible intervals for the nonpollution covariates are presented in Tables 5 and 6, which respectively relate to hospitalisations and deaths. In all cases the relative risks relate to realistic increases in each covariate, which are given in brackets in column 1 of the tables. The tables present relative risks from models with the three types of spatial CAR prior assumed for the random effects that allow for spatial autocorrelation. However, for brevity all results relate to the border sharing neighbourhood matrix with the main priors for ( , ) and PM 2.5 being the pollutant included in the model. The tables show that in all cases the estimated relative risks are almost identical across the three spatial correlation structures, suggesting that the results are robust to this choice.
The tables show that unsurprisingly increased Covid-19 case rates are significantly associated with increased hospitalisations and deaths, as increasing the case rate per 100,000 people over the entire study duration by 1000 is estimated to increase hospitalisations by around 12% and deaths by around 13.5%. Covid-19 vaccination rates, as measured by the percentage of adults who were double vaccinated by the end of April 2021, are significantly associated with reduced deaths, with a 5% increase in vaccination rates estimated to lead to a 4% reduction in the risk of death. This relatively small effect size occurs because a sizeable proportion of the deaths occurred before the vaccination programme began, thus attenuating its effect size. In contrast, vaccinations appear to have no effect on the risk of hospitalisation after adjusting for the other covariates, as the estimated relative risks in Table 5 are close to the null risk of one and are not significant at the 5% level. This may be because the cohort of people who were vaccinated by April 2021 were the mostly vulnerable individuals in health terms (either in the highest age groups or people with underlying health conditions), and hence vaccination reduced their likelihood of death but not of being hospitalised. Alternatively, the lack of an effect may be due to a sizeable proportion of the study duration occurring before vaccination began, which means we should treat the results of this variable with extreme caution.
In addition to Covid-19 case rates, the other main driver of elevated risks of hospitalisation and death is socio-economic deprivation. A 5% increase in the percentage of the working age population who are defined to be employment deprived is significantly associated with a 16% increased risk of hospitalisation and a 8.5% increased risk of death. The number of care home places is also significantly associated with Covid-19 deaths, because if an area increases its care home places by 10 the estimated risk of death increases by around 4%. In contrast, care homes appear to exhibit no association with hospitalisation risk, as the relative risks are all close to one. Population density appears to have no significant associations with either hospitalisation or death once the other covariate effects are accounted for, as all the estimated relative risks are close to one. Finally, ethnicity does not appear overall to have a strong association with Covid-19 hospitalisations or deaths, as 14 of the 18 estimated associations are not significant at the 5% level. An exception to this is for the risk of hospitalisation for people from Chinese descent, as for every 1% increase in an areas' Chinese population, the relative risk reduces by between 2.3% and 2.6%. Additionally, people who are black appear to have a small increased risk of both D. Lee et al.

Table 5
Estimated relative risks and 95% credible intervals for the effects of the non-pollution covariates on Covid-19 hospitalisation. The relative risks relate to the increases given in brackets in the first column of the

Further sensitivity analyses
We have conducted a number of further sensitivity analyses to assess the robustness of our results, and they are presented in Sections 4 to 7 of the supplementary material accompanying this paper. The first of these relates to how we estimate pollution concentrations, and in the main study we have used annual average pollution concentrations for 2019 to represent population exposure, because as previously discussed it is likely to be more representative of normal concentrations than annual average concentrations in 2020. However, in Section 4 of the supplementary material we present estimated relative risks and 95% credible intervals based on the 2020 annual average pollution concentrations as a sensitivity analysis, and the results show broadly similar messages to those presented in Tables 3 and 4. Section 5 of the supplementary material investigates the potential for spatially varying effects across Scotland, specifically estimating separate effects for the health boards containing the four largest cities in Scotland (Grampian-Aberdeen; Tayside-Dundee; Lothian-Edinburgh; and Greater Glasgow and Clyde-Glasgow). Additionally, the models fitted so far in this study consider each pollutant separately in single pollutant models, because this is the norm in the existing air pollution and Covid-19 literature. However, we extend this in Section 6 of the supplementary material by estimating the joint effects of NO 2 and PM 2.5 / PM 10 , because individuals are simultaneously exposed to multiple different pollutants. Finally, as the estimated PM 2.5 association with hospitalisations is the strongest of all those considered, we examine how this association varies by age group and period of the study in Section 7 of the supplementary material.

Discussion
This paper presents the first study examining the potential effects that air pollution may have on worsening Covid-19 outcomes at the small-area scale in Scotland, and adds to the relatively small international evidence base on this topic that includes (Coker et al., 2020;Wu et al., 2020;Berg et al., 2021;Konstantinoudis et al., 2021;Sun et al., 2021). The major strength of our study is the extensive sensitivity analysis we have performed, which allows us to quantify the robustness of our results to model misspecification. This sensitivity analysis includes changing the residual spatial autocorrelation structure (both and the random effects model), the prior distributions assumed, the estimation of air pollution concentrations, the potential for regional variation in effect sizes, the simultaneous effects of multiple pollutants, and how the effects vary with age and vaccination.
The major limitation of our study is its ecological (population-level) design, as the unit of inference is the group of individuals who live in each IZ rather than having data for each individual. Ecologicallevel studies are prone to ecological bias and the ecological fallacy (Wakefield and Salway, 2001), the latter of which is where one incorrectly assumes that the estimated group-level association holds at the individual-level. This estimated group-level association does not in any way equate to individual-level cause-and-effect, because it may be affected by factors such as within-area variation in either the exposure or the confounders. We used this ecological study design because the data required for an individual-level study were not available for confidentiality reasons, and this ecological approach has been predominant in the air pollution and Covid-19 literature to date (see for example Konstantinoudis et al., 2021and Berg et al., 2021, while Mendy et al., 2021 is a notable exception). Thus our findings outlined below should be treated as indicative associations, rather than evidence of individual-level cause-and-effect.
Our main finding is that fine particulate matter air pollution measured as PM 2.5 is significantly associated with increased risks of hospitalisation due to Covid-19, as all 12 estimated relative risks for this pollutant-outcome pair are statistically significant at the 5% level. The size of the estimated relative risks are also fairly consistent, ranging between a 7.4% and a 9.3% increase in risk for each 1μg m −3 increase in concentrations. This compares to estimates of around 18% in Ohio from Mendy et al. (2021) in an individual-level study, and 26% in Colorado from Berg et al. (2021) in a spatial ecological study similar to this one.
Similarly, when considering death as the outcome measure, Coker et al. (2020) and Wu et al. (2020) report significant increased risks of 9% and 11% for each 1 μg m −3 increase in PM 2.5 concentrations in northern Italy and the USA respectively. In contrast, Konstantinoudis et al. (2021) report insignificant results in England, but this may be due to their use of estimated (using a downscaling model) rather than known Covid-19 death counts at the small-area scale. It may also be due to the small number of deaths they had in each areal unit, which numbered just over 1 on average. Our present study has an average of only 6 deaths in each IZ, and this could also be the reason why 8 of the 12 estimated relative risks between PM 2.5 and Covid-19 deaths were not statistically significant. Our estimated relative risks ranged between around a 2.9% and a 10.3% increased risk, but the small numbers of deaths caused these estimates to have relatively wide and hence non-significant 95% credible intervals in the majority of cases.
Overall then this study corroborates the majority of the earlier research findings situated in different countries, which adds weight to the international evidence that fine particulate matter air pollution worsens Covid-19 outcomes. The sizes of these estimates across the different studies is however quite variable, which could be due to numerous factors including the varying concentrations of air pollution and the small numbers of Covid-19 events in each areal unit making the estimates slightly unstable. Thus an important area of future work will be to conduct a meta-analysis of these studies, which will allow a more precise global effect estimate to be produced that borrows strength across the individual studies. A further area of future work will be to investigate the impact of allowing for pollution uncertainty in the disease models, for example using an approach similar to that proposed by Cameletti et al. (2019).
Our overall conclusions around the effects of the other two pollutants on Covid-19 ill health are less clear cut. For example, PM 10 appears to have a borderline significant association with hospitalisations (6 out of 12 associations were significant), with increased risks of between 2% and 3% for a 1 μg m −3 increase in concentrations. In contrast, no effects are observed with Covid-19 death, as all 12 estimated relative risks are very close to one and non-significant. For NO 2 the increased risks are estimated to be around 1%-3% for hospitalisations and between 2% and 6% for deaths, but in all cases they are not significant at the 5% level. The effects of NO 2 and PM 10 on Covid-19 outcomes appear to have been examined less frequently than those of PM 2.5 , and of the studies referenced here only (Konstantinoudis et al., 2021) has considered NO 2 while none considered PM 10 . This suggests that future studies are needed to examine the effects that both these pollutants may have on Covid-19 outcomes, to provide more conclusive evidence on their effects than have been possible to draw here.

Declaration of competing interest
This study was funded by the Scottish Government's Environment and Forestry Directorate.