Estimating multiple greenspace exposure types and their associations with neighbourhood premature mortality: a socioecological study

Background: Greenspace exposures are often measured using single exposure metrics, which can lead to conflicting results. Existing methodologies are limited in their ability to estimate greenspace exposure comprehensively. We demonstrate new methods for estimating single and combined greenspace exposure metrics, representing multiple exposure types that combine impacts at various scales. We also investigate the association between those greenspace exposure types and premature mortality. Methods: We used geospatial data and spatial analytics to model and map greenspace availability, accessibility and eye-level visibility exposure metrics. These were harmonised and standardised to create a novel composite greenspace exposure index (CGEI). Using these metrics, we investigated associations between greenspace exposures and years of potential life lost (YPLL) for 1673 neighbourhoods by applying spatial autoregressive models. We also investigated the variations in these associations in conjunction with levels of socioeconomic deprivation based on the index of multiple deprivation. Results: Our new CGEI metric provides the opportunity to estimate spatially explicit total greenspace exposure. We found that a 1-unit increase in neighbourhood CGEI was associated with approximately a 10-year reduction in YPLL, meaning that a 0.1 (10%) increase in the CGEI is associated with approximately one year lower premature mortality. A 1-unit increase in greenspace availability was associated with a YPLL reduction of 9.8 years, whereas greenness visibility was related to a reduction of 6.14 years. We found no significant association between greenspace accessibility and YPLL. Our results further identified divergent trends in the relations between greenspace exposure types (e.g. availability vs. accessibility) and levels of socioeconomic deprivation (e.g. least vs. most).
Conclusion: Our methods and metrics provide a novel approach to the assessment of multiple greenspace exposure types, and can be linked to the broader exposome framework. Our results showed that a higher composite greenspace exposure is associated with lower premature mortality.


Introduction
The connection between human health and the natural environment is widely recognised (Frumkin et al., 2017;Díaz et al., 2018;Bratman et al., 2019). Several recent reviews have demonstrated that human interactions with nature, particularly exposure to green and blue spaces (commonly referred to jointly as 'greenspaces') provide multiple pathways enabling positive health benefits (Hartig et al., 2014;Markevych et al., 2017). The importance of these pathways is also recognised in the increasing emphasis on urban green infrastructure as a core part of spatial planning (Lindley et al., 2019;Dennis et al., 2020). Existing literature indicates that exposure to greenspace reduces health risk factors associated with all-cause mortality, adverse pregnancy outcomes, elevated blood pressure, type 2 diabetes, among others via improved physical health (Kondo et al., 2018;Rojas-Rueda et al., 2019;Jimenez et al., 2021). Studies have also reported positive effects related to greenspace exposure on mental health. Positive effects on mental health include the reduction of anxiety levels and stress levels as well as alleviating depression, improving sleep, enhancing cognitive development, and improving overall life satisfaction (Dadvand et al., 2015;Fong, Hart and James, 2018;Kondo et al., 2018). Nonetheless, a few studies have also found null, mixed, or adverse effects related to greenspace exposure (Picavet et al., 2016;Dzhambov et al., 2020). Noting this, some scholars have argued that the possible reasons for these inconsistent findings might be attributed to differences in measuring exposures in terms of both spatial and temporal variations as well as in the context of specific health and medical conditions (e.g., allergies) that may be negatively impacted by some types of green space exposure (Markevych et al., 2017;Davis et al., 2021;Stas et al., 2021).
The conceptualisation of greenspace exposure used to frame previous studies involving objective measurement can be broadly classified into three categories: greenspace availability, accessibility, and visibility (Dadvand and Nieuwenhuijsen, 2019). These three categories represent the most commonly used spatial measures of greenspace exposure (Ekkel and de Vries, 2017;Labib, Lindley and Huck, 2020b). Each of these three greenspace exposure categories is linked to different, though often overlapping, mechanistic pathways influencing health (Hartig et al., 2014;James et al., 2015;Markevych et al., 2017;Bratman et al., 2019). For example, availability of greenspace has been associated with the reduction of environmental stressors (e.g., air and noise pollution, heat) (Frumkin et al., 2017;Lindley et al., 2019). Access to greenspaces may encourage physical activities and increase social cohesion, linked to the capacity building pathway (Hartig et al., 2014;Markevych et al., 2017;Nieuwenhuijsen et al., 2017). Visibility of greenspace or 'greenness' visibility (Labib, Huck and Lindley, 2021) may be associated with stress recovery and attention restoration (Ulrich, 1984;Kaplan and Kaplan, 1989). Additionally, although spatial measures of greenspace exposure may act as proxies for the "cumulative opportunity" of exposure to nature (Frumkin et al., 2017), the actual exposure will depend on the usage and time spent in contact with greenspaces (Bratman et al., 2019;Holland et al., 2021). It should be noted that, unlike some other external environmental exposure types (Turner et al., 2017), the precise impact and value of greenspace exposure influences are, as yet, unclear.
This is due at least in part to the multiple mechanistic pathways involved in health benefits, the heterogeneous exposure assessment methods applied in existing studies, and the difficulty in accounting for actual exposure in terms of amount of time spent in nature (Turner et al., 2017;Labib, Lindley and Huck, 2020b;Holland et al., 2021).
Existing approaches to the objective measurement of greenspace exposure typically only consider one of these three types of exposure (availability, accessibility, and visibility) and often associate them individually with health indicators. Accordingly, the aggregate value of the impact of greenspace on human health is not yet well understood (Frumkin et al., 2017;Bratman et al., 2019;Labib, Lindley and Huck, 2020b). Understanding the influence of aggregate exposure is crucial because many of the non-communicable disease and health indicators considered in relation to greenspace exposure can be attributed to the aggregate effects of multiple exposure types (Frumkin et al., 2017;Silva, Rogers and Buckley, 2018;Labib, Lindley and Huck, 2020b). Additionally, individuals are exposed to varying amounts and types of greenspace at different locations (e.g., home, office, streets), and this variability may result in differing health effects both on any given individual and from individual to individual over time. Such variations in exposure over different spatial and temporal extents also relate to the uncertain geographic context problem (UGCoP) noted by Kwan (2012). For greenspace exposure, the UGCoP indicates that there is spatial uncertainty in the actual areas over which the greenspace exposure can impact health outcomes along multiple pathways and contexts (Pearce, 2018;Labib, Lindley and Huck, 2020a). The pre-existing methodologies applied to greenspace exposure assessments are usually considered at a fixed spatial scale. As a result, there are uncertainties in estimating the aggregate effects of greenspace exposure at multiple locations related to varying temporal and spatial scales (Kwan, 2012;Labib, Lindley and Huck, 2020b). To better understand exposure duration, exposure sequences, and exposure accumulation, measurement of aggregate greenspace exposures at multiple scales therefore requires careful attention.
In this regard, new methodological solutions are required that allow the measurement of composite greenspace exposure based on hierarchical spatial scale values. Turner et al. (Turner et al., 2017) indicated that conducting assessments of greenspace exposures that represent the totality of all such measurements in aggregate may have only moderate feasibility for the study of larger populations.
Turner et al. also noted that it would be difficult to infer this exposure at the individual-level in such populations. However, the rapid advancement in spatial science disciplines (e.g., Geographical Information Science-GIS) and related location-based technologies, such as remote sensing, Global Navigation Satellite Systems, and environmental and personal sensors, in conjunction with increased computing capabilities for big data analysis could help overcome at least some of the aforementioned challenges (Jia, 2019;Labib, Lindley and Huck, 2020b).
Considering the above arguments, the primary aim of the present study was to develop a new index that allows estimation of aggregate greenspace exposures at hierarchical spatial scales by applying tools and techniques related to GIS, satellite images, and other spatial data. We present a new composite greenspace exposure index (CGEI) that integrates three spatially explicit objective greenspace exposure types (i.e., availability, accessibility, and visibility) for the provision of city- or region-wide measurements of greenspace exposure at fine spatial resolutions. We also investigate statistical associations between single and composite greenspace exposure estimates at the neighbourhood scale in relation to years of potential life lost (YPLL) as a measure of premature mortality. Previous studies indicated that the availability of greenspace reduced mortality and morbidity for diverse populations (Coutts and Horner, 2016;James et al., 2016;Fong, Hart and James, 2018;Rojas-Rueda et al., 2019), and that greenspace availability varies with the level of social deprivation and inequality in any given neighbourhood (Mitchell and Popham, 2008;Labib, Lindley and Huck, 2020a). We therefore hypothesise that greenspace exposure characteristics (and by extension YPLL) for socially deprived neighbourhoods will differ depending on whether composite or single greenspace exposure measurements are used. We also expect that higher overall greenspace exposure in such neighbourhoods will be associated with lower YPLL in those neighbourhoods.

Study settings
We undertook this study in the Greater Manchester (GM) area of the United Kingdom. Greater Manchester is a post-industrial city-region with an area of 1,276 km² and an estimated population of 2.8 million (Dennis et al., 2018). The city-region has a diverse landscape pattern, including flat plain areas surrounded by hills (the Pennine Chain) that rise to 500 m in the north and east. It has varying greenspaces of different sizes and types such as urban parks, pastoral areas, river corridors, and multiple areas of urban to rural transition (Dennis et al., 2018;Labib, Lindley and Huck, 2020a). In the context of greenspace exposure, the differing characteristics of the area's natural environment are of particular interest for understanding the variability and importance of different types of greenspace exposure. From the perspective of both human health and socioeconomic inequalities, Greater Manchester is also an important area of study in a UK context, because neighbourhoods in the city-region have a range of social deprivation characteristics, from the particularly affluent through to extremely deprived. These levels, in turn, are associated with both varied and elevated levels of illness and disability (GMCA, 2019;Dennis et al., 2020).

Single greenspace exposure modelling
The composite greenspace exposure index we created in this study represents a combined measurement of three single greenspace exposure metrics (i.e., availability, accessibility, and visibility).
We developed each of these individual metrics as distinct measures of greenspace exposure, with each following a different methodological approach using multiple sources of spatial data, tools, and analytical frameworks. The following sections summarise and discuss the methods applied in developing the single exposure metrics.

Greenspace availability exposure
We created a greenspace availability exposure index (GAVI) as a multi-scale, multi-metric map combining three commonly used greenspace metrics, i.e., Normalised Difference Vegetation Index (NDVI), Leaf Area Index (LAI), and Land Use-Land Cover (LULC), at five spatial scales (i.e., 100, 200, 300, 400, and 500 m spatial resolutions, which can be thought of as equivalent to differing buffer distances commonly used to estimate exposure) following the methodology described in Labib et al. (Labib, Lindley and Huck, 2020a). These thresholds were carefully selected considering the spatial scale and resolution of the input data, and are thus less vulnerable to scaling effects (Labib, Lindley and Huck, 2020a). We considered these three metrics to account for different characteristics of greenspace representing photosynthetically active vegetation. NDVI characterises greenspace density, LAI measures greenspace volume, and LULC accounts for the overall presence or absence of greenspace (James et al., 2015;Engemann et al., 2019). NDVI and LAI were obtained from 10 m resolution Sentinel-2 satellite images (collected on 4th July 2018, with cloud cover < 2%). Sentinel-2 satellite images were selected due to their higher overall accuracy in identifying vegetation compared to moderate resolution Landsat images (Markevych et al., 2017;Labib and Harris, 2018). The LULC data included five land cover types (i.e., urban, water, tree canopy, forbs-shrubs, and grass), with a spatial resolution of 10 m.
For each greenspace metric, we calculated mean values in each grid cell of the map at five spatial scales (i.e., resolution) and combined these five scale maps using a weighted average method to produce multi-scale greenspace availability maps based upon NDVI, LAI and LULC ( Figure S1, Supplementary document). The multi-scale weighting, based upon lacunarity analysis, accounts for scale sensitivity effects observed at varying buffer distances, a detailed description of which is available in Labib et al. (Labib, Lindley and Huck, 2020a). Finally, we took the mean of the three maps to produce our multi-metric, multi-scale availability index using equation 1.
In (eq1), GAVIj is the greenspace availability index value for cell j, and GSNDVIj, GSLAIj, and GSLULCj are the multi-scale greenspace metric 'exposure' values for the corresponding cell j. T is the number of metrics (in our case, T = 3). GAVI values ranged between 0 and 1, where 1 indicated the highest availability of greenspace, and 0 denoted the lowest level of, or no available, greenspace. It should be noted that we excluded negative NDVI values from our computation by setting them to zero, as these values represented bodies of water or other non-green features.
LAI values were normalised using minimum-maximum normalisation to obtain values consistent with other metrics. Details of LAI transformation and calculation can be found in Labib et al. (Labib, Lindley and Huck, 2020a). For this research, we re-sampled the GAVI map to 5 m resolution using nearest neighbour technique in ArcGIS Pro (v 2.4) to achieve consistency with other exposure layers.
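As a minimal sketch of the final combination step described above, assuming the three multi-scale layers are already available as aligned arrays, the availability index can be computed as the cell-wise mean of the T = 3 metrics. The function names and toy values below are hypothetical, and the lacunarity-based multi-scale weighting is not reproduced here.

```python
import numpy as np

def clamp_negative_ndvi(ndvi):
    """Set negative NDVI values (water or other non-green features) to zero."""
    ndvi = np.asarray(ndvi, dtype=float)
    return np.where(ndvi < 0, 0.0, ndvi)

def min_max_normalise(layer):
    """Rescale a layer to the range [0, 1] using min-max normalisation."""
    layer = np.asarray(layer, dtype=float)
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    if hi == lo:
        # A constant layer carries no contrast; return zeros (an assumption).
        return np.zeros_like(layer)
    return (layer - lo) / (hi - lo)

def gavi(gs_ndvi, gs_lai, gs_lulc):
    """Cell-wise mean of the three metric layers (T = 3)."""
    layers = np.stack([gs_ndvi, gs_lai, gs_lulc])
    return layers.sum(axis=0) / layers.shape[0]

# Toy 2 x 2 layers (hypothetical values, not study data).
ndvi = clamp_negative_ndvi([[-0.2, 0.4], [0.6, 0.8]])
lai = min_max_normalise([[0.0, 1.0], [2.0, 4.0]])
lulc = np.array([[0.0, 0.5], [1.0, 1.0]])

gavi_map = gavi(ndvi, lai, lulc)
print(gavi_map)
```

Because each input layer lies between 0 and 1, the resulting index also stays within that range.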

Greenspace accessibility exposure
We computed the greenspace accessibility exposure index following a four-step method. First, we selected publicly accessible greenspaces within the study area. Second, we identified the access points for these greenspaces. Third, we ran a network analysis to measure and map the shortest network distance from the access points to any location within the study area. Fourth, we normalised the access distance to produce an accessibility raster map for the study area. The details of these steps follow.
Step 1: Selecting accessible greenspaces
To determine accessible greenspaces, we used two GIS datasets: UK Ordnance Survey (OS) Open Greenspace (scale: 1:2500), and the OS Master Map Accessible Natural Greenspace layer (scale: 1:2500). We selected greenspaces that were publicly accessible to any users within or surrounding the GM area (up to 10 km). This distance was selected following Natural England's 'Accessible Natural Greenspace Standards' (ANGSt) model, which details standard access distances for varying greenspace sizes in the UK (Pauleit et al., 2003;Comber, Brunsdon and Green, 2008). The size of our selected greenspaces ranged from 0.04 ha (400 m²) to more than 500 ha (5 km²), in order to ensure that we accounted for many forms of greenspace, from small 'pocket parks' through to large country parks.
Step 2: Access point identification
In urban areas like GM, greenspaces are usually entered through specific access points (Comber, Brunsdon and Green, 2008). To account for this, we identified the access points for greenspaces using two approaches. Firstly, using the OS Open Greenspace access points layer, we selected the access points that intersect the boundary of the greenspaces. However, this data layer did not contain access points for many natural greenspaces (e.g., country parks). Secondly, to identify the missing access points, we performed a geometric intersection between the greenspace polygon layer (Step 1) and a road and path network layer (OS MasterMap ITN Layer Urban Paths, scale 1:1250) using ArcGIS Pro (v2.5). This intersection operation identified the points where streets and paths intersected the greenspace boundaries and, thus, could be access points. The two datasets were combined for use in this analysis.
Step 3: Network distance analysis
To determine the accessibility of a greenspace from a given location, we calculated the network distance, as this recognises that individuals generally need to follow designated road or path networks in order to reach greenspaces in urban areas (Labib, Lindley and Huck, 2020b). This was achieved by undertaking a 'service area' analysis in QGIS (v 3.14) with the Isochrone Area (Iso-Area) algorithm of the QNEAT3 toolbox, which measures and maps the network distance from greenspace access points to all other locations within the study area. In Iso-Area, the starting points were represented by greenspace access points (identified in Step 2). Using the road and path network layer (OS ITN Layer Urban Paths; details: https://bit.ly/31GaYlu), the Iso-Area algorithm produced a network distance surface (raster) from the access points by applying the TIN-interpolation method (toolbox details: https://root676.github.io/IsoAreaAlgs.html) to a maximum distance of 10 km based on the upper limit in the ANGSt model (Pauleit et al., 2003;Comber, Brunsdon and Green, 2008). The output raster layer mapped the network distances at a spatial resolution of 10 m for computational efficiency. We masked the raster layer to exclude the extra cells generated outside the Greater Manchester boundary and resampled it to 5 m resolution to achieve consistency with other layers and datasets.
Step 4: Accessibility index calculation
Utilising the raster grid layer produced in Step 3 (Figure S2, Supplementary document), we computed the greenspace accessibility index for each raster grid cell within the study area using equation 2. In (eq2), GACIj is the greenspace accessibility index expressed as a continuous value for cell j, ranging between 0 and 1, where 0 indicates the lowest accessibility and 1 indicates the highest. NDmax is the maximum network distance from the nearest access point within the entire study area (in our case the maximum distance was 5932.7 m); NDj is the network distance of cell j from the nearest greenspace access point.
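The normalisation in Step 4 can be illustrated with a short sketch. Equation 2 itself is not reproduced in this text, so the linear form used here (one minus the ratio of NDj to NDmax) is an assumption that is merely consistent with the stated definitions, under which a distance of zero maps to 1 and the maximum distance maps to 0. The function name and the sample distances are hypothetical.

```python
import numpy as np

ND_MAX = 5932.7  # maximum network distance (m) reported for the study area

def gaci(network_distance, nd_max=ND_MAX):
    """Assumed linear normalisation of network distance to a 0-1 index.

    0 m to the nearest access point gives 1 (highest accessibility);
    nd_max gives 0 (lowest accessibility).
    """
    nd = np.asarray(network_distance, dtype=float)
    return 1.0 - nd / nd_max

# Hypothetical network distances in metres for four raster cells.
distances = np.array([0.0, 250.0, 2966.35, 5932.7])
gaci_values = gaci(distances)
print(np.round(gaci_values, 3))
```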

Greenness visibility exposure
We measured and mapped eye-level greenness visibility exposure for an observer located at ground level by applying viewshed analysis using new software we developed as part of our analysis (available at: https://github.com/jonnyhuck/green-visibility-index). We combined a binary greenness layer produced using LULC data from Dennis et al. (Dennis et al., 2018) with LiDAR-based digital surface and digital terrain model data (Environmental Agency, 2016) to perform viewshed analysis at an 800 m viewing distance at 5 m intervals for >86 million observer locations covering the whole study area. We used a line of sight (LOS) algorithm, in conjunction with an empirical distance decay model, in order to calculate a distance-weighted ratio of visible greenness to visible area from each given observation point. Based on the LOS and decay weight values, we used equation 3 to compute the viewshed greenness visibility index for the study area. The details of the data, decay model, LOS algorithm, and modelling process can be found in Labib et al. (Labib, Huck and Lindley, 2021).
Where, VGVIj is the 'viewshed greenness visibility index' value for the observer cell j; Gi is the visible green cell, Vi is the visible non-green cell, and di is distance decay weight corresponding to visible cell i. The estimated VGVI values ranged between 0 and 1, where 0 = no visible green, and 1 = all visible green (Labib, Huck and Lindley, 2021).
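The following sketch illustrates the distance-weighted ratio described above for a single observer cell. Equation 3 is not reproduced in this text, so the form used here (decay-weighted visible green cells divided by decay-weighted visible cells of either kind) is an assumption chosen to match the stated 0 to 1 range. The decay weights are supplied as plain inputs because the empirical decay model is not given here; all names and values are hypothetical.

```python
import numpy as np

def vgvi(is_green, decay_weight):
    """Assumed distance-weighted share of visible cells that are green.

    is_green     : boolean flag per visible cell (True = green, False = non-green)
    decay_weight : distance decay weight per visible cell
    """
    is_green = np.asarray(is_green, dtype=bool)
    weights = np.asarray(decay_weight, dtype=float)
    visible_total = weights.sum()
    if visible_total == 0:
        # No visible cells: treat as no visible green (an assumption).
        return 0.0
    return float(weights[is_green].sum() / visible_total)

# Four visible cells from one observer location (hypothetical values).
flags = [True, False, False, True]
weights = [1.0, 0.5, 0.25, 0.25]
print(vgvi(flags, weights))
```

Nearby cells carry larger weights, so a green cell close to the observer raises the index more than a distant one.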

Composite Greenspace Exposure Index (CGEI)
We combined the three single exposure indices into a composite index value for greenspace exposure. The resulting composite greenspace exposure index was produced by calculating the mean of the GAVI, GACI, and VGVI datasets. We used equation 4 to produce the CGEI map.
In (eq4), CGEIj is the composite greenspace exposure index value for cell j; GAVIj, GACIj, and VGVIj represent the respective availability, accessibility, and visibility index values for cell j obtained from single exposure maps; w represents the weighting value for each metric. The resulting operation produced the composite greenspace exposure index map at 5 m spatial resolution for the study area.
Note that in this study we have used an equal weighting (w = 1) for each of the three single exposure metrics. Equal weighting was used in this case as currently there is no empirical evidence supporting alternative weightings for these single greenspace metrics in relation to their potential effects on premature mortality. However, the CGEI can also be estimated using differing weights for each individual metric. To illustrate the CGEI estimation process using different weightings for individual metrics, we experimented with several hypothetical weight values (see Supplementary note-1 for further details).
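The combination step can be sketched as a weighted mean. The normalisation by the sum of the weights is an assumption made so that the index stays between 0 and 1 for any non-negative weights; with equal weights (w = 1) it reduces to the simple mean described above. The function name is hypothetical. The inputs are the single-index values reported in the results for the example residence (GAVI = 0.225, GACI = 0.957, VGVI = 0.065).

```python
def cgei(gavi, gaci, vgvi, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of the availability, accessibility and visibility indices.

    Works on scalars or on aligned NumPy arrays (one value per raster cell).
    """
    w_av, w_ac, w_vi = weights
    total = w_av + w_ac + w_vi
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_av * gavi + w_ac * gaci + w_vi * vgvi) / total

# Example residence with equal weights.
example = cgei(0.225, 0.957, 0.065)
print(round(example, 3))  # 0.416

# Hypothetical alternative weighting that doubles the weight of visibility.
print(round(cgei(0.225, 0.957, 0.065, weights=(1.0, 1.0, 2.0)), 3))
```

With equal weights the sketch reproduces the composite value of 0.416 reported for the example residence.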

Assessment of CGEI and single exposure indices
The potential implications of the composite greenspace exposure metric reported in the present study were evaluated in two ways. Firstly, we investigated how the composite greenspace exposure measurement and its individual components each related to neighbourhood socioeconomic deprivation.
Secondly, we compared each of the metrics to neighbourhood YPLL. We used English Lower Super Output Area (LSOA) Census units due to the availability of population-level socioeconomic deprivation and YPLL data, as well as several confounding variables. LSOAs contain a mean population of around 1500 and range from 1000 to 3000 and are widely used to represent neighbourhood units in national analyses (Mitchell and Popham, 2008;Daras et al., 2019). In our analysis, we examined 1673 LSOAs with a total population of 2,682,528 and a mean population of 1603 (standard deviation 394).

Exposure variations at different deprivation levels
We obtained socioeconomic and health data from the English Indices of Multiple Deprivation (IMD) 2015 (DCLG, 2015). The IMD dataset includes social deprivation scores for the LSOAs across seven sub-domains (i.e., income, health, employment, crime, education, living environment, and barriers to housing and services). In this study, we used IMD deciles to investigate variations of greenspace exposure in LSOAs with differing deprivation levels. An IMD decile value of 1 indicated the highest level of deprivation and 10 indicated the lowest level of deprivation. We used a box plot to explore the differences in greenspace exposures visually and then conducted an ANOVA with post hoc Tukey Honestly Significant Difference analysis to identify whether the differences in exposure values among the IMD deciles were statistically significant or not.
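A minimal sketch of the group comparison is shown below, using SciPy's one-way ANOVA. The data are synthetic placeholders generated for illustration only, with three hypothetical deprivation groups rather than the ten IMD deciles used in the study, and the post hoc Tukey step is omitted.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# Synthetic exposure values per group (NOT study data): group means are
# chosen arbitrarily so that the groups differ.
group_means = {"most_deprived": 0.30, "middle": 0.40, "least_deprived": 0.50}
samples = {
    name: rng.normal(loc=mean, scale=0.05, size=50)
    for name, mean in group_means.items()
}

# One-way ANOVA: tests whether at least one group mean differs.
f_stat, p_value = f_oneway(*samples.values())
print(f"F = {f_stat:.1f}, p = {p_value:.3g}")
```

A significant F statistic only indicates that some difference exists among the groups; a pairwise post hoc test is still needed to identify which groups differ.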

Associations of greenspace exposures with YPLL
The years of potential life lost outcome variable was extracted from the IMD 2015 health and disability sub-domain. YPLL measures 'premature death,' defined as death before the age of 75 from any cause, including death due to disease or other external causes. This indicator was estimated based on mortality data for the period 2008 to 2012 as the numerator, and the denominator was the 2008-2012 population estimate. The details of YPLL estimation can be found in the IMD technical report (DCLG, 2015). YPLL is age and sex standardised, therefore it is not vulnerable to biases such as disproportionate representation by a certain demographic section, or double counting (DCLG, 2015;Dennis et al., 2020).
Additionally, it is also weighted by the age of the individual who has died, reflecting the higher impact of an unexpected death of a younger person, than an older one. It should be noted that YPLL is a population-level inference of premature mortality, which has been aggregated to the neighbourhood level (LSOA) for the purposes of this analysis. The associations that we observe are therefore at the population-level and the effects are not transferable to individuals or to different spatial scales of analysis.
Pearson correlation coefficients (r) were calculated to assess correlations among greenspace exposure metrics, YPLL, and neighbourhood socioeconomic variables. Additionally, a hierarchical cluster-based 'heat map' visualisation was produced based on the correlation coefficients to identify clustering among correlated variables (method details in Zhang et al., 2017). Most previous studies have used non-spatial models to explain the associations between exposure and health outcomes (Mitchell and Popham, 2008;Dadvand et al., 2015;Dennis et al., 2020). In this study, we have applied both non-spatial and spatial models to investigate the associations between greenspace exposures and YPLL based on frequentist statistical inferences. Results from non-spatial regression models indicating associations between greenspace exposures and YPLL are presented in Note-2, Supplementary document. In our non-spatial models, we applied the generalised linear model (GLM) with a gamma distribution and an identity link function to account for positive values of YPLL that could skew distributions. We have also performed multiple linear regressions to aid comparison between the nonspatial and spatial models. While the GLM allows non-normal error distribution of the residuals, the residuals may have spatial autocorrelation, which might violate the assumption of independence of error distribution. We therefore computed Moran's I statistic (Moran, 1950) for the model residuals in order to identify the presence of spatial autocorrelation in non-spatial models. All the non-spatial models indicated significant spatial autocorrelation in the regression residuals (Note-2, Supplementary document), thus demonstrating spatial dependency and violating the model assumption of independence of observations and inflating the model parameter estimates. As a result, these results are not reported here.
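The residual check can be illustrated with a small NumPy sketch of Moran's I. For simplicity it assumes the residuals lie on a regular grid with binary first-order queen contiguity weights, whereas the study units are irregular LSOA polygons; the function names and the toy residuals are hypothetical.

```python
import numpy as np

def queen_weights(n_rows, n_cols):
    """Binary first-order queen contiguity matrix for a regular grid."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbour
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i, rr * n_cols + cc] = 1.0
    return w

def morans_i(values, w):
    """Moran's I: (n / sum of weights) * (z' W z) / (z' z), z = mean-centred values."""
    x = np.asarray(values, dtype=float).ravel()
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)

w = queen_weights(4, 4)

# Toy residuals with a north-south gradient: similar values sit next to
# each other, so Moran's I should be positive.
gradient = np.repeat(np.arange(4.0), 4)
print(morans_i(gradient, w))
```

Values of Moran's I clearly above zero indicate that neighbouring units have similar residuals, which is the pattern that motivates moving to a spatial model.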
We fitted spatially explicit models to account for potential spatial autocorrelations and spatial dependencies among the variables in the models. As such, we used a 'spatial autoregressive with autoregressive error' (SARAR) model, which accounts for spatial dependence in both independent and dependent variables (Bivand, Pebesma and Gómez-Rubio, 2013). As both our dependent and independent variables were spatially explicit, SARAR provides a better fit for the model than other spatial regression models such as spatial lag and spatial error models. The details of SARAR modelling can be found in Note-3, Supplementary document and Kelejian and Prucha. In our modelling we used a first order queen contiguity spatial weight matrix; this decision was informed by a previous study (see Anselin and Rey, 2014) and our experimental observations of model fit.
We formulated several regression models sequentially for both the spatial approaches, in order to explore the associations between YPLL and greenspace exposure for all of the LSOAs (all population model). In this regard we considered unadjusted and adjusted models. Model 1 was unadjusted, with each greenspace exposure metric entered in a separate model; Model 2 included the variables from Model 1 and was adjusted for confounders. The selection of confounders was informed by a previous systematic review of the relevant literature (Labib, Lindley and Huck, 2020b) and reference to several original studies (Richardson and Mitchell, 2010;Wheeler et al., 2015;Dennis et al., 2020). In the fully adjusted models, we included income scores, crime deprivation scores, barriers to housing and services scores, distance to nearest general medical practice (GP), the Shannon diversity score, and annual average PM10 concentration as confounders. Income, crime, and barriers to housing and services scores acted as proxy indicators for income levels, crime rates, and the physical and financial accessibility of housing and local services, respectively. Distance to GP represents access to health services (Ensor and Cooper, 2004;Daras et al., 2019). The Shannon diversity score indicated the overall land cover mix within the study area as a proxy for vegetation quality (Dennis et al., 2020). Finally, the PM10 concentration indicated one element of the ambient air quality of these neighbourhoods since there may be an additional health burden from air quality which is not accounted for by greenspace benefits (Medina et al., 2004;Jeanjean, Monks and Leigh, 2016;Khomenko et al., 2021). We also tested sensitivity to other confounders such as the employment and education deprivation scores. Both of these demonstrated high multicollinearity with income deprivation whilst not improving the models, and so were not included.
We assessed each model's explanatory power (pseudo R²), Akaike Information Criterion (AIC), and likelihood ratio test (LR) in evaluating the model's goodness of fit, prior to determining the optimum selection of confounders and most robust models.
In addition to population analysis for all the neighbourhoods (LSOAs) in the study area, we explored whether the association between aggregate greenspace exposure and YPLL varied by subgroups at different levels of deprivation. We used the IMD score to classify each LSOA, and hence its population, into multiple deprivation quintile groups (

Greenspace exposure distribution and variations
The greenspace exposure index (single and composite) maps are presented in Fig rather different spatial pattern to the other two measures. We found relatively higher accessibility exposure values in and around dense urban cores, likely because these areas typically have many small to medium-size public parks and gardens (e.g., sized 2-20 ha), and good transport links. However, these urban parks and gardens did not result in higher availability or visibility exposure values due to their fragmented pattern and low connectivity. Fig. 1, therefore, demonstrates the differences in exposure values depending on how exposure is conceptualised and measured. Fig. 1(D) illustrates the composite greenspace exposure index map. This map is a combined output of the three single indices, and each cell of this map indicates the mean exposure value of the three exposure surfaces as our measure of aggregate greenspace exposure. The relationship between the three input indices and the resulting composite index is also illustrated for a terraced house located near to a park (Fig. 2). Due to the close proximity of the park (only a few streets away), the house has a very high accessibility exposure value (0.957). Conversely it has a much lower visibility value (0.065) as the park cannot be seen due to the local terrain and intervening buildings. Finally, the house has a relatively low value for greenspace availability (0.225) because of the densely built-up nature of the surrounding area. The composite greenspace exposure index value (CGEI = 0.416) provides an aggregate overall moderate score for greenspace exposure for this particular house. Therefore, the composite greenspace exposure metric represents a comprehensive measurement that accounts for all three objective exposure types together.

Fig. 2:
Differences in greenspace exposure values for an example residence. GAVI shows availability exposure value, GACI shows accessibility exposure value, VGVI presents visibility exposure and CGEI shows the composite greenspace exposure value for the example residence.
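The relationship between the three single indices and the composite value for the example residence can be checked with a short calculation. The sketch below is illustrative only: it assumes that each index is already standardised to the 0-1 range and that the composite is the simple arithmetic mean described above. The function name `composite_index` is hypothetical and is not taken from the authors' code.

```python
from statistics import mean


def composite_index(gavi: float, gaci: float, vgvi: float) -> float:
    """Equal-weight composite: arithmetic mean of the three single indices.

    Assumes each index has already been standardised to the range [0, 1].
    """
    for name, value in (("GAVI", gavi), ("GACI", gaci), ("VGVI", vgvi)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return mean([gavi, gaci, vgvi])


# Values reported for the example residence (Fig. 2)
cgei = composite_index(gavi=0.225, gaci=0.957, vgvi=0.065)
print(round(cgei, 3))  # 0.416
```

The high accessibility value is offset by the low availability and visibility values, which is why the composite lands in the moderate range.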

Deprivation and greenspace exposures
The analysis of different greenspace exposure metrics and social deprivation deciles at the neighbourhood (LSOA) scale indicates variations in how greenspace exposures co-vary with social deprivation (Fig. 3). Figure 3 illustrates that the least deprived areas have greater greenspace exposure compared to the most deprived areas in the domains of greenspace availability, visibility, and composite greenspace exposures. In the case of accessibility, however, we observed higher exposure values for more deprived areas, although with a smaller level of variation across the IMD deciles (summary descriptive statistics Table S3, Supplementary document). This result illustrates accessibility exposure in our case study area has a different spatial distribution than availability and visibility based on existing greenspaces, parks and gardens. In the case of Greater Manchester, the most deprived neighbourhoods often have public parks and gardens in close proximity, but these neighbourhoods have low greenspace visibility and availability exposure because they are more densely built-up areas with low provision of private gardens and streets with low levels of canopy coverage. Conversely, the least deprived neighbourhoods usually exhibit lower building density, greater provision of private gardens, and streets with greater levels of canopy coverage, but with less provision of public parks and gardens in proximity.
These results demonstrate that apparent inter-group variations in individual greenspace exposure measures can be misleading, which illustrates the benefit of the composite measure. The ANOVA demonstrates that the differences in greenspace exposures among deprivation deciles are statistically significant for all greenspace exposure metrics (Table S4, Supplementary document). The post-hoc analysis identifies an interesting pattern in which differences are usually significant only between the most and least deprived deciles.
However, among similar deprivation deciles (e.g., 1-3 or 7-9) the differences in greenspace exposures are not statistically significant (Table S4, Supplementary document). This result implies that neighbourhoods that share similar deprivation levels usually demonstrate similar characteristics in terms of greenspace exposure.

Associations among greenspace exposure types and YPLL
Correlation analyses showed that there were very strong positive correlations among the composite, availability and visibility metrics for the Greater Manchester neighbourhoods. By contrast, greenspace accessibility had only a small correlation with the composite metric, and a weak negative correlation with the availability and visibility exposure metrics (Fig. 4A). The 'heat map' also illustrates a similar pattern, as availability, visibility and the composite metric are clustered in the same group, indicating high similarity among these metrics, whereas the accessibility metric was not included in this cluster with the other greenspace metrics (Fig. 4B). Such divergent patterns for the greenspace accessibility measure were also observed in the spatial distributions shown in the greenspace exposure maps (Fig. 1).
The correlation analyses and the correlation 'heat maps' further indicated that the availability, visibility, and composite greenspace exposure measures have moderate to weak negative correlations with YPLL and are in contrasting cluster groups with it as well ( Fig. 4B). Additionally, these metrics have negative correlations with deprivation scores and the air quality variable, but they have positive correlations with distance to GP (Fig. 4A). In contrast, the accessibility greenspace exposure metric has a positive correlation with YPLL, deprivation score, and air quality variables, all of which are clustered in the same group (Fig. 4B). analysis. By contrast, for greenspace accessibility exposure, we found increasing accessibility was associated with higher YPLL values. In the unadjusted model, the effect is significant, but following the adjustment, the effect attenuates and becomes insignificant (Table 1). This result implies that, at the population level, accessibility exposures do not significantly relate to YPLL in the case study area.
Overall, these findings show that the association between greenspace exposure and YPLL, or premature mortality, differed across different types of greenspace exposure metrics. Therefore, measuring exposure using a single metric might not present a complete picture of the influence of greenspace exposure on health (e.g., premature mortality). The adjusted CGEI exposure model indicates that a single unit increase (from CGEI 0 to 1, or 100%) in composite greenspace exposure is associated with a reduction of approximately ten years in the YPLL value. Thus, a 10% or 0.1 increase in the composite greenspace exposure value is associated with an approximately one year lower premature mortality value. It should be noted that both the availability and the composite exposure models have nearly identical effect sizes and R² values (~0.702, Table 1), meaning that they both explain similar levels of variability in YPLL after controlling for confounders. This is partly a result of similarities between the availability and visibility metrics caused by the fact that 'visible' greenspace (a measure of line of sight) is also likely to be 'available', though the inverse is not necessarily true. As a result, the effects of the availability and visibility surfaces upon the composite are quite similar, meaning that (in the case of equal weighting of the three inputs in Eq. 4) the composite surface resembles both of them quite closely (Fig. 4A). In addition, the exposure metrics and health indicator were aggregated at neighbourhood scale. The aggregated information may have reduced variance, which in turn may have affected the explanatory power of the models.
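The scaling from a 1-unit change to a 0.1 change in CGEI is simple linear arithmetic, sketched below. This is an illustration under stated assumptions rather than a reproduction of the fitted model: the coefficient of -10 is the approximate adjusted effect quoted above, the function name `expected_ypll_change` is hypothetical, and the calculation treats the coefficient as a constant marginal association, ignoring any spatial spillover terms of the autoregressive model.

```python
def expected_ypll_change(coefficient: float, delta_exposure: float) -> float:
    """Change in YPLL (years) associated with a change in exposure,
    assuming a constant linear association."""
    return coefficient * delta_exposure


# Approximate adjusted CGEI coefficient quoted in the text (years per unit CGEI)
CGEI_COEFFICIENT = -10.0

full_range = expected_ypll_change(CGEI_COEFFICIENT, 1.0)  # CGEI from 0 to 1
ten_percent = expected_ypll_change(CGEI_COEFFICIENT, 0.1)  # CGEI up by 0.1

print(full_range, ten_percent)
```

Because neighbourhood CGEI values rarely span the full 0-1 range, the 0.1 increment is the more practical unit for interpretation.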
Based on these results, we therefore cannot fully confirm which metric is the better predictor of neighbourhood level YPLL for our study area. However, despite having very similar R² values, the composite metric, when compared with all the individual exposure metrics (Fig. 5A), has a slightly greater effect size (Table 1). In addition, conceptually the CGEI measures holistic greenspace exposure.
Considering the preceding we used the CGEI metric for subgroup analysis in the case of each socioeconomic deprivation group. is significantly inversely correlated with premature mortality among people living in areas of greater deprivation (groups 3-4) compared to those in the least deprived areas (groups 1-2). It should be noted that there is no significant composite greenspace exposure effect on YPLL for people living in areas of relatively low or no deprivation (groups 2, 1). Additionally, for people living in the least deprived areas (group 1), the effect of composite greenspace exposure on YPLL is the opposite of the hypothesised direction, implying that in the least deprived areas, increased greenspace may be associated with increased YPLL, though this effect is not statistically significant. For group 5 greenspace exposure is not significant in the adjusted model. It is possible that in the most deprived areas, the higher premature mortality rate may be attributed to other reasons that we did not consider in the models (e.g., unhealthy food habits, race/ethnicity) (Cecchini et al., 2010). Nevertheless, overall we have observed a deprivation-related gradient in the effect of greenspace exposure on premature mortality.

Multiple spatially explicit measures of greenspace exposures
To our knowledge, this is the first study that estimates greenspace exposure comprehensively by combining multiple spatially explicit greenspace exposure measures into a composite metric.
Previously, Rugel et al. (2017) developed a composite natural space index at a fixed spatial scale (i.e., postcode) combining multiple measures of greenness. The methodological approach that we have developed in this study improves upon this fixed scale approach by standardising and harmonising different greenspace exposure measures in order to produce a composite metric that estimates aggregate greenspace exposure at hierarchical spatial scales and fine spatial resolutions. We named this the composite greenspace exposure index (CGEI). We argue that the greenspace exposure modelling and mapping methods we developed in this study offer the opportunity to provide comprehensive estimates at the population level and provide the basis to infer individual-level aggregate greenspace exposures at meaningful temporal and spatial scales.
CGEI differs from traditional single exposure metrics in that it can capture objective greenspace exposure holistically and combine multiple inter-twined greenspace exposure types. Such a holistic exposure measure allows for a better understanding of the overall impact of greenspace exposure upon health outcomes such as mortality, morbidity, and non-communicable diseases that may be linked with multiple greenspace exposures. However, it should be noted that the composite is an aggregated expression of the single greenspace indices. Although it provides generalised exposure information it obviously cannot determine the specific cause-effect relations of each exposure type to specific health outcomes. For example, visibility may be more important in understanding the impact on a specific mental health outcome such as stress recovery, (Ulrich, 1984;Kaplan, 1995), than total greenspace exposure as represented in the composite measure. We therefore argue that the use of the CGEI does not diminish the importance of the use of individual exposure maps, as they provide the opportunity to understand specific types of exposure and, hence, which pathways correspond to which health indicators. As a result, these metrics are complementary to one another when evaluating the health impact of greenspace. Greenspace exposure maps (both composite and single metrics) can be used to assess greenspace exposures for places where people live (e.g., home), work (e.g., office), or go about their daily activities (e.g., school, shops) (Fig. 6). In addition to these fixed locations, the results allow the estimation of individual exposure profiles by taking account of people's everyday movement (e.g., using GPS tracks) and the amount of time they spend in different locations (details in Note-4, Supplementary document). In Fig. 6, we provide the example of a hypothetical household of three people (two adults, one child) with a CGEI value of 0.60 for the home. 
However, each household member spends differing amounts of time in various locations, and each member has different travel routes. The child travels to school (CGEI 0.42) along streets exhibiting moderate levels of green cover.
The child has an overall CGEI of 0.55 based on time spent at home, school, and during periods of movement. Similarly, the first adult (adult1) travels to the office (office-1) (CGEI 0.34), taking a route that exhibits relatively low greenspace exposures for an overall CGEI of 0.501. Finally, the second adult (adult2) travels to a different office (office-2) (CGEI 0.31) along a route with low greenspace exposure, providing an overall CGEI of 0.48.
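One simple way to combine location-specific CGEI values into an individual profile is a time-weighted average. The sketch below is an assumption-laden illustration, not the procedure documented in Note-4 of the Supplementary document: the function name `time_weighted_exposure`, the hours spent in each setting, and the CGEI value for the travel route are all hypothetical. Only the home (0.60) and school (0.42) values come from the example above.

```python
def time_weighted_exposure(segments: list[tuple[float, float]]) -> float:
    """Average CGEI weighted by the time spent in each setting.

    segments: list of (cgei_value, hours) pairs.
    """
    total_hours = sum(hours for _, hours in segments)
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return sum(cgei * hours for cgei, hours in segments) / total_hours


# Hypothetical weekday for the child: hours and route CGEI are assumptions
child_day = [
    (0.60, 16.0),  # home (CGEI from the text)
    (0.42, 7.0),   # school (CGEI from the text)
    (0.50, 1.0),   # travel route (assumed CGEI for "moderate" green cover)
]

child_cgei = time_weighted_exposure(child_day)
print(round(child_cgei, 2))
```

With GPS tracks, each segment could instead be a short interval along the route, with the CGEI value sampled from the exposure map at that location.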
In the illustrative example (Fig. 6), we demonstrate how the mapped outputs could be readily used to generate aggregate greenspace exposure estimates for individuals in households with different exposure profiles. In turn these individualised exposure profiles could be linked with individualised health data. While the composite provides an estimation of aggregate greenspace exposures, the single exposure maps can nevertheless provide crucial insights into the different types of greenspace exposure that an individual may experience at different spatiotemporal scales. This type of space-time exposure profiling has been utilised in other exposure studies relying on activity-based exposure measurements (Briggs, 2005;Zhang et al., 2018), but multiple exposure pathways have not been considered. Our new approach reduces existing methodological challenges in objective greenspace exposure assessments that rely on single exposure metrics, including measurement of exposure at fixed spatial scales, the uncertain geographic context problem, and issues with heterogeneous methods of exposure assessments across studies (Kwan, 2012;Turner et al., 2017;Labib, Lindley and Huck, 2020b).

Impact of greenspace exposure types on premature mortality and deprivation
We tested how greenspace exposure relates to neighbourhood-level socioeconomic deprivation and to years of potential life lost (YPLL), a measure of premature mortality. We found that areas with higher availability and visibility exposures are usually the least deprived, and the most deprived areas have comparatively low greenspace exposures according to the same metrics.
When we considered accessibility exposure, the results demonstrated the opposite trend, indicating that the most deprived areas have greater accessibility to public greenspaces than the least deprived (Dennis et al., 2020), due to the frequent proximity of small or medium size (e.g., 2-20 ha) public greenspaces. We speculate that such patterns could be attributed to the nature of the development of post-industrial towns in England, where parks were historically built by mill owners amongst dense terraced housing in order to support the factory workers, following the Open Spaces Act of 1877 and Public Health Act 1875 (Clark, 1973;Jordan, 1994). In such towns and cities, these dense terraced housing areas typically present with higher deprivation values. Our accessibility result is therefore consistent with previous work suggesting that greenspace accessibility is greater in areas of greater socioeconomic deprivation in the UK (Barbosa et al., 2007;Jones, Hillsdon and Coombes, 2009).
When comparing the mean composite greenspace exposure values for neighbourhoods at differing socioeconomic levels, we observed a deprivation-related gradient in greenspace exposure, in which the least deprived areas exhibited higher composite greenspace exposures than the most deprived areas. These results align with previous research showing that greener neighbourhoods are also less deprived communities (Mitchell and Popham, 2008;Rigolon et al., 2021). However, we show that the composite measure presented here can estimate and differentiate greenspace exposures to a greater extent than the individual single exposure metrics which are typically used to evaluate the relationships between inequality and greenspace exposure. The results of the present study also suggest that the composite index has the potential to reveal new or refined health associations in previously reported studies which have so far only used traditional (e.g., percentage of neighbourhood greenspace) or individual metrics (e.g., accessibility). As we have demonstrated, single exposure metrics measuring differing factors such as accessibility or availability may provide results that appear different or contradictory (Fig. 3). As a result, single metric-based results can potentially misrepresent the true level of differences in greenspace exposure experienced by neighbourhoods at different levels of socioeconomic deprivation.
When we evaluated the associations between greenspace exposure and premature mortality (i.e., YPLL), the results showed that individual greenspace exposure metrics (i.e., availability, accessibility, and visibility) provided considerable variances in effect sizes (e.g., effect size for availability: -9.80, visibility: -6.06) and directions regarding the associations between exposures and outcomes (Table 1). Both the availability and visibility greenspace exposure metrics indicate significant negative associations and imply that higher levels of greenspace availability or visibility are associated with lower YPLL (reduced premature mortality). These findings are consistent with previous studies indicating greater availability or visibility of greenspace can have positive health benefits, including reductions in mortality (Fong, Hart and James, 2018;Kondo et al., 2018;Rojas-Rueda et al., 2019).
Our results, which show no significant associations between greenspace accessibility and YPLL, are consistent with several previous studies reporting associations between greenspace exposure and health outcomes (Klompmaker et al., 2018;Jarvis et al., 2020;Labib et al., 2020) but not with others (Coutts, Horner and Chapin, 2010;Wilker et al., 2014;Dennis et al., 2020). These latter studies reported positive health effects related to increased greenspace accessibility.
Such inconsistent results might be linked to different approaches to conceptualising and measuring accessibility (e.g., shortest distance vs. fixed distance) between the present study and previous studies (Jarvis et al., 2020;Labib, Lindley and Huck, 2020b). It must also be noted that several other studies have argued that the positive effect of greenspace accessibility may be context-specific (Jones, Hillsdon and Coombes, 2009;Richardson et al., 2010), for instance where the poor perception of greenspace may deter its use (Jones, Hillsdon and Coombes, 2009).
Therefore, even if greenspace accessibility is high, if that greenspace attracts few visits then the associated proactive health benefits might not be significant.
Despite these inconsistent associations when using single metrics, in line with our hypothesis, we found increasing overall community greenspace (i.e., measured as composite greenspace exposure) was associated with a lower level of premature mortality. Our overall findings are consistent with previous reports of the positive impact of greenspace exposure associated with reductions in different types of mortality among people living in greener communities (Mitchell and Popham, 2008;Coutts and Horner, 2016;James et al., 2016;Rojas-Rueda et al., 2019).
Our research suggested that variations in the effects of different greenspace exposure measurements might link with the underlying mechanistic pathways associated with each exposure metric (Marselle et al., 2021;Zhang et al., 2021). For example, greenspace availability (i.e., the overall amount of vegetation in an area) might reduce the effect of harmful environmental stressors such as air pollution or heat (Markevych et al., 2017;Lindley et al., 2019) and reduce premature death. Greenspace accessibility (i.e., the relative proximity of public green spaces) may be associated with building capacities by encouraging physical activity and social cohesion (Hartig et al., 2014;Markevych et al., 2017). However, in our case study area at the population level, such mechanisms might not be as effective as other pathways. This conclusion was supported by the insignificant association between accessibility and YPLL in the adjusted model. Visibility of greenspace may be linked with restoring capacities related to psychological mechanisms such as stress recovery that reduce morbidity (Ulrich, 1984;Frumkin et al., 2017;Markevych et al., 2017). In the present study, the effect size of visibility was even smaller than that for availability, indicating the positive effect of visible greenspace might have an even lower relative impact than availability in lowering premature mortality. Overall, the composite greenspace exposure showed a slightly larger effect size than any individual metric in the adjusted model (Table 1). As the composite greenspace exposure accounts for all three well-known important mechanistic pathways through which nature influences health (Markevych et al., 2017), it is reasonable to argue that composite greenspace exposure might have a stronger relationship with premature mortality than has previously been considered when based only on individual metrics (e.g., visibility, accessibility). 
This finding is critical because it implies that many health-related indicators such as mortality, morbidity, and other non-communicable diseases and health syndromes might be influenced by the aggregate impact of multiple greenspace exposure types due to the multiple associated exposure benefit pathways.
We also found significant differences in the relationship between composite greenspace exposure and YPLL among groups at different socioeconomic deprivation levels. Our results indicate that increasing greenspace exposure in areas of moderate to extreme deprivation corresponds to a lower YPLL (reduced premature death) than in the least deprived areas. These findings are consistent with published studies that suggest that the health benefits of increased greenspace exposure are more pronounced among low-income and socioeconomically deprived populations (Marmot, 2020;Rigolon et al., 2021;Wu and Kim, 2021).

Methodological approach in multiple spatially explicit exposure modelling
In this study, we developed a new methodological solution for measuring aggregate greenspace exposures at hierarchical scales and fine resolutions. The complete measurement of a set of environmental exposures can be linked with the broader framework of the 'Exposome,' which considers "the totality of human environmental exposures from conception onward" (Wild, 2012). The exposome concept includes all external and internal exposures throughout a human lifespan (Wild, 2012;Vrijheid et al., 2020). Analysis of a subset of external greenspace exposures might provide a more useful framework for studying the associations between aggregate greenspace exposure and human health, as opposed to using a single type of greenspace exposure. The methodological process we introduce in this study illustrates an approach that allows a measurement of the summation of the multiple spatially explicit exposures and allows capturing high spatial-temporal exposure variability at multiple scales.
Therefore, a spatially explicit composite measurement could be considered a step toward a "spatial exposome" and results from such measurements can be utilised to support the broader exposome framework. A spatial exposome allows for spatial hierarchies in aggregate exposure assessments for any given set of external environmental exposures, which can then, in turn, allow the inference of comprehensive exposure profiles for individuals as well as for populations of varying sizes. While our composite exposure metric provides an introductory example of the spatial exposome concept, it will require further development in terms of other sets of external exposure variables (e.g., air, noise pollution) and time points.

Limitations and future developments
Although we have evaluated the robustness of our methodology and elucidated the implications for understanding premature mortality and socioeconomic inequality, our study does have some limitations. First, our exposure metrics are spatially explicit objective measurements of greenspace exposure based on the different vegetated areas alone (i.e., they do not include blue spaces).
Furthermore, the measurements primarily represent the quantity of greenspace rather than its perceived quality (which we have used as a confounder). As a result, consideration should be given to integrating objective (e.g., vegetation diversity) or subjective (e.g., attractiveness) measurements of quality with the composite greenspace exposure index in future studies. Such subjective perceptions of greenspace exposure could be collected using individual surveys and integrated with our objective assessment to provide an overall understanding of the effect of both objective and perceived greenspace exposure on health.
Second, to evaluate our new metric, we investigated the direct association between greenspace exposure and premature mortality. We did not conduct an explicit mediation analysis to understand the mediating effect of variables such as air or noise pollution or heat, although some previous studies have indicated mixed results for, and relatively lower effects from, these variables in terms of their mediation of the associations between greenspace exposure and health (James et al., 2016;Vienneau et al., 2017).
In future studies, these variables should be critically evaluated for their potential effects on results generated using our composite greenspace measurement approach, applying more explicit mediation and moderation approaches such as structural equation modelling (Dzhambov et al., 2020).
Third, we used equal weights for each of the single greenspace exposure metrics in the composite greenspace calculations. No pre-existing evidence indicated the precise level of health benefits associated with each type of greenspace exposure based on their differing effect pathways, the individualistic nature of the impacts, and the spatiotemporal variability. However, it would be possible to construct a weighted average for the composite exposure estimation if such information were available. In supplementary Note-1, we provided a worked example of weighted composite greenspace exposure index estimation based on a range of hypothetical weights. To generate actual weights for different input layers of the composite in future applications, a meta-analysis or an analytical hierarchy approach could be adopted to identify the relative importance of availability, accessibility, and visibility exposures considering different health outcomes.
Finally, our population-level inference of greenspace exposure and premature mortality was made using cross-sectional data, and we had no means of knowing the extent to which individuals experienced variances in their levels of greenspace exposure throughout their lives. Therefore, we cannot confirm that the relations we observed can be inferred as a causal effect of greenspace exposure on premature mortality. Additionally, the population and exposure information were aggregated at the neighbourhood administrative boundary scale (LSOA). As a result, the correlations observed in aggregate cannot be transferred to individuals or to different spatial scales of analysis, due to the potential presence of zoning effects associated with the modifiable areal unit problem and ecological fallacy (Openshaw, 1984;Goodchild, 2011). Although Annerstedt Van Den Bosch et al. (2016) noted that spatially aggregated population data is an acceptable alternative to using individual data, we recognise that to fully understand the effect of comprehensive greenspace exposure on individual health, longitudinal and life-course assessments of greenspace exposure need to be accounted for (Turner et al., 2017;Jia, 2019).

Conclusion
In this study, we provided a new methodology for estimating multiple greenspace exposure metrics. Our approach has the potential to reduce the heterogeneity inherent in current approaches to greenspace measurements, as it provides an analytical technique that allows multiple quantitative greenspace exposure measurement practices to be harmonised and standardised. This enables a shift from the traditional approach of measuring single greenspace exposure towards a more holistic approach to greenspace exposure estimation at multiple spatial scales. Our analytical approach also allows more spatially explicit aggregate exposure assessment and can be partially linked with the exposome framework. When applying the new index to population-level analyses, we found significant negative associations between community greenspace exposure and premature mortality. The association between increased greenspace exposure and lower premature mortality was particularly marked in areas with higher aggregate greenspace exposure levels. However, such associations differed significantly based on any given community's level of socioeconomic inequality. Overall, our results suggest that composite greenspace exposure measurements can reveal new insights into the public health benefits of exposure to the natural environment. Furthermore, the variations in the level of benefit derived by individuals from given levels of greenspace exposure may depend on the population of which they are members and their socioeconomic status.

Declaration of interest
We declare that we have no conflicts of interest.

Note 1: Experiment with different weights for composite
In the main text, we estimated the composite greenspace exposure index (CGEI) as the mean of the GAVI, GACI, and VGVI. We took the average of the single exposure metrics, giving equal weight to each exposure type. As noted in the main text, there is no pre-existing evidence available that identifies the relative weights of these exposure metrics in relation to premature mortality; therefore, we assumed each greenspace exposure might have an equal effect on the premature mortality rate. However, it can be argued that certain exposure types may have relatively higher weights than others, depending on the health outcomes considered. For example, access to greenspace may be more important for physical activity-related outcome variables. Considering the potential for different weights for different greenspace exposure types, we also developed a weighted approach for the CGEI estimation.
We developed the following equation to estimate the weighted CGEI value for varying weighting of the single exposure indices.
Here, CGEI_Wx,j is the composite greenspace exposure index value for cell j under weighting scenario Wx (x = 1 to n); GAVI_j, GACI_j, and VGVI_j represent the respective availability, accessibility, and visibility index values for cell j obtained from the single exposure maps; Wx_GAVI, Wx_GACI, and Wx_VGVI indicate the weight value for the corresponding exposure layer.
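A weighted composite consistent with these definitions can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the weighted CGEI is a weighted mean of the three indices, normalised by the sum of the weights so that the equal-weight case reduces to the simple average. The function name `weighted_cgei` and the scenario weights (0.5, 0.25, 0.25) are hypothetical; the index values are those of the example residence in Fig. 2 of the main text.

```python
def weighted_cgei(gavi: float, gaci: float, vgvi: float,
                  w_gavi: float, w_gaci: float, w_vgvi: float) -> float:
    """Weighted mean of the three indices for a single cell.

    Dividing by the sum of the weights is an assumption; it keeps the
    result on the same 0-1 scale even if the weights do not sum to one.
    """
    weights = (w_gavi, w_gaci, w_vgvi)
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    weighted_sum = w_gavi * gavi + w_gaci * gaci + w_vgvi * vgvi
    return weighted_sum / sum(weights)


# Index values for the example residence in the main text
house = {"gavi": 0.225, "gaci": 0.957, "vgvi": 0.065}

equal = weighted_cgei(**house, w_gavi=1, w_gaci=1, w_vgvi=1)
# Hypothetical scenario: availability counts twice as much as the others
avail_heavy = weighted_cgei(**house, w_gavi=0.5, w_gaci=0.25, w_vgvi=0.25)

print(round(equal, 3), round(avail_heavy, 3))  # 0.416 0.368
```

For this residence, shifting weight towards availability lowers the composite because its availability value is well below its accessibility value.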
To test the sensitivity of the weighted CGEI, we generated three hypothetical weighting scenarios for the single exposure layers (Table NS1-1). Each weighting scenario considers one metric to be twice as important as the other two metrics. We used these three weighting scenarios to produce three weighted CGEI maps (Figure NS1-1). Finally, we compared the weighted CGEI maps with the non-weighted (average) CGEI map.

Figure NS1-1: CGEI maps with varying weighting for the input exposure metrics. CGEI-equal is the map where the input exposure maps have equal weights; CGEI-W1, CGEI-W2, and CGEI-W3 reflect the different weights for the GAVI, GACI and VGVI metrics.

As illustrated in Figure NS1-1, the weighted CGEI maps showed a similar spatial pattern in exposure to the equal-weighted CGEI map. However, the CGEI-W2 map has a slightly different spatial pattern in its exposure gradient. It gives more weight to the accessibility metric (i.e., GACI), which has a different exposure pattern from the availability and visibility metrics. As the maps look very similar, we further explored the correlations among them. The correlation results are presented in Table NS1-2. It is clear that the weighted maps have very high correlations (coefficients are near 0.9) with the equal-weighted map. Such relations are reasonable as they have similar spatial patterns. Furthermore, the weights are not considerably different from the equal weights (e.g., the GAVI weight is 0.5 in CGEI-W1, whereas in CGEI-equal the GAVI weight was 0.333). We recognise that if specific input layers have very high weights (e.g., 0.9 for one layer and 0.5 for the other two layers), the weighted CGEI map may have a considerably lower correlation with the equal-weighted CGEI map. To further evaluate the sensitivity, additional weighting combinations need to be checked with other evaluation approaches such as simulation (e.g., Monte Carlo) and the analytical hierarchy process, which are beyond the scope of this current study.
Nonetheless, to explore the sensitivity of the different weighted CGEI maps in relation to our outcome variable (YPLL), we tested their correlations at the neighbourhood scale (Table NS1-2). The results showed that CGEI maps with higher weights for availability and visibility have slightly stronger correlations with YPLL than the equal-weighted CGEI. In contrast, the CGEI with a higher accessibility weight (W2) has a relatively lower correlation with YPLL than the equal-weighted CGEI. These examples indicate that varying the weights of the composite input layers might identify stronger or weaker correlations with the health outcome variable. Such differences in associations need to be considered in modelling the effects of CGEI on health outcomes, as for different health outcomes certain weightings may be more relevant than the equal-weighted CGEI.

Note 2: Non-spatial modelling and outputs
Non-spatial models do not consider the spatial patterns (e.g., clustering) and dependence observed among variables; therefore, they do not account for spatial autocorrelation, lags, and covariance among, and between, dependent and independent variables in the models. However, if autocorrelation and spatial lag are ignored in regression models, those models become vulnerable to violations of the assumption of independence of residuals and to Type-I errors (Waller and Gotway, 2004). As a consequence, non-spatial statistical models can produce more apparently significant results than are justified by the data. Such issues can also inflate parameter estimates (Labib et al., 2020a; Waller and Gotway, 2004). Nonetheless, in existing studies of health and greenspace, non-spatial models are the most commonly applied.
Considering this, we applied the same variables and model structure used in adjusted spatial models to such non-spatial models.

Generalised linear models (GLM)
In the present study, we used non-spatial generalised linear models (GLM) with a gamma distribution and an identity link function to account for the strictly positive YPLL values, whose distribution could be skewed. GLM is a robust modelling approach that allows variables to have non-normal distributions and does not require the assumptions usually associated with linear regression models to be fulfilled. The model results for greenspace exposure as related to YPLL are presented in Table NS2-1. Nonetheless, it should be noted that this trend is specific to areas such as the Greater Manchester area and similar post-industrial UK cities, as discussed in Section 3.3 and Section 4 of the main manuscript.
The pseudo R² (i.e., Nagelkerke) and AIC values are also presented in Table NS2-1. We estimated the Moran's I value by applying a Monte Carlo simulation (a permutation bootstrap test) to the regression residuals of each model. The Moran's I value for each model was positive and statistically significant (p < 0.001). This suggests that the model residuals are clustered, and that there is high spatial autocorrelation among the variables used in the models (Waller and Gotway, 2004; Bivand et al., 2008).
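A permutation test of Moran's I on residuals can be sketched as below. This is an illustrative NumPy implementation under stated assumptions, not the routine used in the study: the residuals are synthetic, the neighbour structure is a simple hypothetical chain of 100 units, and the function names are introduced here.

```python
import numpy as np

def morans_i(values, w):
    """Global Moran's I for `values` given an (n x n) spatial weight matrix `w`."""
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)

def moran_permutation_test(values, w, n_perm=999, seed=0):
    """Compare the observed Moran's I with values from random relabellings."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, w)
    simulated = np.array(
        [morans_i(rng.permutation(values), w) for _ in range(n_perm)]
    )
    # One-sided pseudo p-value for positive spatial autocorrelation.
    p_value = (np.sum(simulated >= observed) + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical neighbour structure: each unit touches the previous and next one.
n = 100
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = w[idx + 1, idx] = 1.0

# Synthetic residuals with a smooth spatial trend plus noise (NOT study data).
rng = np.random.default_rng(1)
residuals = np.sin(np.linspace(0, 3 * np.pi, n)) + rng.normal(0, 0.3, n)

observed_i, p_value = moran_permutation_test(residuals, w)
print(f"Moran's I = {observed_i:.3f}, pseudo p-value = {p_value:.3f}")
```

A positive Moran's I with a small pseudo p-value indicates that similar residuals sit next to each other more often than random relabelling would produce, which is the pattern reported for the non-spatial models.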

Multiple linear regression models
While the GLM models are more flexible with respect to model assumptions and allow varying distributions, they are not directly comparable with the spatial models. Therefore, to provide easier comparisons with the spatial models, we also ran multiple linear regressions with the same model structure.
The results of the multiple linear regression are presented in Table NS2-2. In the table, *** indicates statistical significance at p < 0.05; R² is the Nagelkerke pseudo-R-square; and † marks models adjusted for: income score, crime score, barriers to housing score, Shannon diversity score, distance to nearest general medical practice, and yearly average air pollution (PM10).
The linear regression models indicate associations similar to those of the GLM models; however, the effect sizes are slightly larger for all the exposure variables. The AIC values are relatively higher than those of the GLM models, and the presence of spatial autocorrelation in the residuals is confirmed for these linear regression models. These results indicate that, while the linear regression models show a similar pattern in the associations between greenspace exposure and YPLL, their estimates are inflated due to both violations of model assumptions and spatial autocorrelation.
As the presence of spatial autocorrelation in the models has been validated, it can be argued that the model estimations cannot be considered accurate, and that the models might inflate the parameter estimates (Waller and Gotway, 2004; Bivand et al., 2008; Griffith, 2005). It is noteworthy that, for the same exposure variables, the non-spatial model parameter estimates are considerably higher than the spatial regression model parameter estimates (Note 3).
This further indicates that the use of non-spatial models with spatially explicit exposure and health data is likely to result in the estimation of greater effect sizes, which may be attributed to spatially autocorrelated observations and spatial dependence (Veloz, 2009; Waller and Gotway, 2004). Overall, experiments with non-spatial models indicated that greenspace exposure is significantly associated with health outcomes (i.e., YPLL). However, the model residuals are spatially autocorrelated, implying that the parameter estimates may not be accurate and, hence, that the statistical significance may also be vulnerable to Type-I errors. Considering the above, we adopted spatial regression as the primary modelling approach in this study.

Note 3: Spatial modelling and outputs

Spatial regression modelling
Non-spatial modelling confirmed the presence of significant spatial autocorrelation in the regression residuals (Note-2). Therefore, the coefficients and p-values of the non-spatial models may not be reliable enough to support inferential reasoning about the modelled relationships (Bivand et al., 2008; Waller and Gotway, 2004). In this regard, spatial regression models such as spatial autoregressive models or geographically weighted regressions (GWR) are the alternative solutions. For the purposes of the present study, we selected spatial autoregressive models to test the relationships between greenspace exposure and YPLL, given their robustness in explaining the overall pattern of relationships between the dependent and independent variables while accounting for spatial dependence and autocorrelation. While GWR provides estimates of coefficients using local regression, GWR models are unable to make inferences about overall relationships between dependent and independent variables (Fotheringham et al., 2003). As a result, GWR was not used in the present study.
Before running the spatial models, we applied the Lagrange multiplier (LM) test to investigate spatial dependence and spatial heterogeneity among the model variables and errors (details in Anselin, 1988a). The LM test was used to verify whether conducting spatial modelling could improve model reliability and resolve issues with spatial autocorrelation and dependence (Anselin, 1988a,b). In our case, we ran the LM test for each non-spatial model (Note-S1). The LM test results indicated that both the spatial error model and the spatial auto-regressive moving-average (SARMA) model could significantly improve the reliability of the modelled relationships (Table NS3-1). Therefore, the use of spatial models rather than non-spatial models was indicated. In Table NS3-1, *** indicates statistical significance at p < 0.05.
As illustrated in Table NS3-1, the LMerr and SARMA dependence tests were statistically significant, implying that the spatial error and moving-average models could account for spatial dependence in the relationship between greenspace exposure and YPLL. In this case, the values for the lag tests did not reach statistical significance, implying that the independent spatial lag variable may not have a significant impact on the models. However, the SARMA tests did provide statistically significant values, which indicates that the dependent variable of the model was affected by spatial lag, and that the error terms of the models demonstrated spatial autocorrelation (Anselin, 2013). This was also true in our case, in which YPLL (the dependent variable) showed spatial lag, and the error terms of the models were spatially autocorrelated. As a result, a combined nested model would be needed to account for both a spatially lagged dependent variable and a moving average for the error terms. On the basis of these observations, in the present study, we used a spatial autoregressive model with autoregressive disturbances (SARAR), also known as the "spatial simultaneous autoregressive" (SAC) model.
As the name suggests, the SAC/SARAR model combines two types of spatial models: one to deal with spatial dependence in the lagged variables and another to account for spatial autocorrelation in the error distribution. Simply put, SAC models allow higher-order spatial dependence in the dependent variable, independent variables, and spatial errors (LeSage, 2008). For n spatial units, the model can be expressed with the following equations (details of the derivation in : 1998). Here, yn denotes the dependent variable (in our case YPLL), Xn denotes the independent variables (greenspace exposure and other model covariates), Wn and Mn are the spatial weight matrices, un denotes the regression disturbances, and εn indicates the non-spatial component of the errors. λn (the simultaneous autoregressive lag coefficient) and ρn (the simultaneous autoregressive error coefficient) are the spatial autoregressive parameters. Lambda accounts for spatial lag in the dependent variable and rho accounts for spatial autocorrelation among the regression errors. The model allows the spatial matrices to be the same (Wn = Mn), which will frequently be the case when the model is in use. The spatial weight matrix represents the spatial structure of the data and defines the spatial relations among locations and, therefore, determines the spatial autocorrelation statistics (Getis, 2009; Zhou and Lin, 2007).
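For reference, the standard SARAR specification that matches the symbol definitions above can be written as follows. This is a reconstruction from those definitions rather than a copy of the original display: the coefficient vector β is a symbol introduced here, and the subscript n on λ and ρ is dropped for readability.

```latex
\[
\begin{aligned}
y_n &= \lambda W_n y_n + X_n \beta + u_n, \\
u_n &= \rho M_n u_n + \varepsilon_n .
\end{aligned}
\]
```

Setting λ = 0 reduces this form to a spatial error model, and setting ρ = 0 reduces it to a spatial lag model, which is why the combined model is described as nesting both.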
There are several approaches that can be applied in developing a spatial weight matrix, such as the use of a distance-based matrix, or weights based on boundaries between spatial units (Tsai et al., 2009).
Commonly, however, spatial regression models use spatial weights computed with boundary-based approaches such as contiguity-based weights (e.g., the queen criterion of contiguity) due to their robust and objective estimation of spatial weights among spatial units that share a common border of non-zero length (Tsai et al., 2009; Bivand et al., 2019). In our case, we selected a first-order queen contiguity-based weight matrix for the estimation of spatial weights for the regression models.
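First-order contiguity weights can be illustrated on a small regular grid. The sketch below is an assumption-laden toy: the study's neighbourhoods are irregular polygons, whereas this uses a hypothetical 3 x 3 grid, and the row standardisation step is a common convention added here rather than something stated in the text.

```python
import numpy as np

ROOK = [(-1, 0), (1, 0), (0, -1), (0, 1)]        # neighbours sharing an edge
BISHOP = [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # neighbours sharing only a corner
QUEEN = ROOK + BISHOP                            # edge or corner

def contiguity_matrix(n_rows, n_cols, offsets):
    """Binary first-order contiguity matrix for a regular grid of cells."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[r * n_cols + c, rr * n_cols + cc] = 1.0
    return w

def row_standardise(w):
    """Scale each row to sum to 1 (rows without neighbours stay all zero)."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

w_rook = contiguity_matrix(3, 3, ROOK)
w_bishop = contiguity_matrix(3, 3, BISHOP)
w_queen = contiguity_matrix(3, 3, QUEEN)
w_queen_std = row_standardise(w_queen)

# Cell 4 is the centre of the 3 x 3 grid; cell 0 is a corner.
print("centre cell neighbours (rook, queen):", w_rook[4].sum(), w_queen[4].sum())
print("corner cell neighbours (queen):", w_queen[0].sum())
```

On the grid, the queen matrix is exactly the sum of the rook and bishop matrices, which mirrors the description of queen contiguity as combining the two relationships.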
Queen contiguity combines the spatial contiguity of both the Rook and Bishop relationships into a single measure, which provides a better characterisation of spatial weights among features, especially when health-related data are used in spatial models (Tsai et al., 2009; Li et al., 2017). Based on the model structure described in the main manuscript, the results of both unadjusted and adjusted spatial regressions of exposure variables and YPLL are presented in Table NS3-2. The spatial regression models also produced estimates for rho, lambda, and likelihood ratio (LR) test results. As presented in Table NS3-2, the rho and lambda parameters in all the models are statistically significant. This implies that the simultaneous autoregressive parameters accounting for spatial lag and errors influence the modelled variables and results. Adding rho and lambda as parameters allowed us to adjust for spatial lag and autocorrelation and thus provided a robust estimation of the observed associations. This also implies that, compared to the non-spatial models, the results of the spatial models provide greater confidence for making statistical inferences about the modelled relationships. In this regard, the LR test results for all the models are also significant, indicating that the SAC models have better goodness of fit than the non-spatial linear models.
Additionally, there are no significant spatial autocorrelations in the model residuals. In short, these results indicate that spatial lag and errors are critical components in modelling the relationships between greenspace exposure and YPLL. Spatial models account for both spatial lag and autocorrelation and therefore provide a better estimation of the associations between greenspace exposure and YPLL. Comparison of the estimated coefficients of the spatial (Table NS3-2) and non-spatial models indicates that the parameter estimates provided by the spatial models are considerably smaller than those provided by the non-spatial models (Tables NS2-1, NS2-2). This observation provides an interesting insight: the larger parameter estimates in the non-spatial models may be attributable to the autocorrelated errors and spatial lags, which are not considered by non-spatial models (Veloz, 2009; Waller and Gotway, 2004). Therefore, in non-spatial models the parameter estimates might have been inflated by these correlated observations (Veloz, 2009), which in turn exhibited apparently larger effects of greenspace exposure on YPLL than actually exist. These are critical observations because many existing studies do not consider spatial lag and autocorrelation when modelling the relationship between greenspace exposure and health outcomes and, as a result, these models might have inflated the effect sizes of the estimates. Therefore, the inconsistent results provided by previous studies that used non-spatial models might be due to varying spatial lag and autocorrelation values in their models. Future studies should consider these effects carefully to provide better, more accurate, and more robust estimations of the associations between greenspace and health.

Note 4: Dynamic exposure profiling for hypothetical household members
Greenspace exposure can vary for each individual depending on the spatial and temporal variations in their daily activities (Zhang et al., 2018). Existing studies usually ignore such spatial and temporal variations in the exposure of individuals and measure exposure only in the context of a fixed geographic location. However, such an approach to exposure estimation is insufficient to support inferences about levels of individual exposure because it does not consider the temporal dimension and the movement of people engaged in differing activities. Both of these dimensions can have an impact on individual levels of exposure. In the present study we developed a greenspace exposure index map that can be used to measure individual greenspace exposures while taking account of both spatiotemporal variations and individuals' daily movements. Using GPS tracks, geocoded addresses, and the duration of exposure at different activity locations or along paths of movement, individual daily greenspace exposure can be measured with greater accuracy than before. In this study we provide an example of such exposure estimation using a hypothetical household with three members (i.e., child, mother, and father). Each individual in the household had both different activity locations and different movement patterns for each typical workday in the week. The method applied to individual daily exposure estimation followed the steps listed below:
Step-1: Geocode the addresses of their activities. To simplify this hypothetical example, we listed home, office, school, and park. Other places such as shops, gyms, and restaurants can also be considered.
Step-2: Record the GPS tracks of the paths travelled from place to place. We used hypothetical GPS track records created by the lead author.
Step-3: From GPS tracks or travel diary entries, estimate the amount of time each individual spent at each location as well as travelling to and from different locations. Time of travel depended on the mode used to travel. In this regard we simplified matters by using standard travel speeds for different modes of travel (Table NS4-1).
Step-4: Extract the greenspace exposure value for each activity location and the mean exposure value along the paths of travel. We used 5 m intervals along the GPS tracks to extract composite greenspace exposure values, and then took an average of all exposure values to estimate the mean greenspace exposure along each path of travel.
Step-5: Estimate the time-weighted greenspace exposure value for different locations and travel paths.
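Steps 3 and 4 above can be sketched as follows. This is a minimal illustration under stated assumptions: coordinates are taken to be in a projected system with metre units, the exposure surface is a toy 10 x 10 raster with 10 m cells (not the CGEI map), the 5 km/h walking speed is a placeholder rather than a value from Table NS4-1, and all function names are introduced here.

```python
import numpy as np

def points_along_path(vertices, spacing=5.0):
    """Points every `spacing` metres along a polyline given in metre coordinates."""
    vertices = np.asarray(vertices, dtype=float)
    seg_lengths = np.hypot(*np.diff(vertices, axis=0).T)
    cumulative = np.concatenate([[0.0], np.cumsum(seg_lengths)])
    distances = np.arange(0.0, cumulative[-1] + 1e-9, spacing)
    x = np.interp(distances, cumulative, vertices[:, 0])
    y = np.interp(distances, cumulative, vertices[:, 1])
    return np.column_stack([x, y]), cumulative[-1]

def sample_raster(raster, points, cell_size, origin=(0.0, 0.0)):
    """Look up the raster cell value under each point (row index follows y)."""
    cols = ((points[:, 0] - origin[0]) // cell_size).astype(int)
    rows = ((points[:, 1] - origin[1]) // cell_size).astype(int)
    return raster[rows, cols]

def travel_minutes(length_m, speed_kmh):
    """Travel time in minutes for a path length and an assumed mode speed."""
    return length_m / 1000.0 / speed_kmh * 60.0

# Toy exposure surface: low exposure in the western half, high in the eastern half.
raster = np.full((10, 10), 0.2)
raster[:, 5:] = 0.8

# Hypothetical GPS track running west to east through the toy surface.
track = [(0.0, 5.0), (90.0, 5.0)]
points, length_m = points_along_path(track, spacing=5.0)
path_exposure = float(sample_raster(raster, points, cell_size=10.0).mean())
walk_minutes = travel_minutes(length_m, speed_kmh=5.0)  # placeholder speed

print(f"{len(points)} sample points, mean exposure {path_exposure:.3f}, "
      f"{walk_minutes:.2f} min on foot")
```

With real data, the raster lookup would be replaced by whatever sampling routine the GIS or raster library in use provides; the averaging step stays the same.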
We used the following equation to estimate the 24 hr greenspace exposure for each individual. Additionally, temporal profiles were provided that indicated the amount of time each individual spent at different locations or along travel routes.
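A time-weighted daily exposure is commonly computed as the sum of each exposure value multiplied by the time spent there, divided by the total time, i.e. sum(E_i x t_i) / sum(t_i). The sketch below assumes that form; it is not confirmed to be the exact equation used in the study, and the locations, durations, and exposure values are invented for illustration (they are not the values in Table NS4-2).

```python
def daily_exposure(segments, expected_hours=24.0):
    """Time-weighted mean exposure over a day.

    `segments` is a list of (label, hours, exposure) tuples covering the day.
    """
    total_hours = sum(hours for _, hours, _ in segments)
    if abs(total_hours - expected_hours) > 1e-6:
        raise ValueError(f"segments cover {total_hours} h, expected {expected_hours} h")
    return sum(hours * exposure for _, hours, exposure in segments) / total_hours

def temporal_profile(segments):
    """Share of the day spent in each segment."""
    total_hours = sum(hours for _, hours, _ in segments)
    return {label: hours / total_hours for label, hours, _ in segments}

# Hypothetical day for one household member (illustrative values only).
child_day = [
    ("home", 14.0, 0.60),
    ("school", 7.0, 0.40),
    ("park", 1.0, 0.90),
    ("walking and cycling", 2.0, 0.50),
]

child_exposure = daily_exposure(child_day)
child_profile = temporal_profile(child_day)
print(f"time-weighted exposure: {child_exposure:.3f}")
for label, share in child_profile.items():
    print(f"  {label}: {share:.1%} of the day")
```

Because the weights are durations, a long stay at a low-exposure location (such as an office) pulls the daily value down more than a short trip through a greener route can raise it.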
The results indicated that the child spent more time at home than the parents and that home had a higher CGEI value than most of the other locations (e.g., school) that were considered. Additionally, the child also visited the park. The child used cycling and walking as modes of travel, which resulted in longer durations spent on travel compared to motorised modes (e.g., car), and the paths of travel used by the child had higher CGEI values than the paths of travel used by the parents. In aggregate, the child's daily activities resulted in an individual daily greenspace exposure (IDGE) value of 0.557, a higher value than for either parent. The mother spent considerable time at the office (with a low CGEI); the routes of her travels also had lower exposure values and included the use of a car. Thus, the mother had a lower IDGE (0.501) than her child (Table NS4-2). Finally, the father had an even lower IDGE (0.489) than the mother, due to the low CGEI values obtained at the father's office and the combination of modes of travel and long travel routes the father experienced. We argue that our greenspace exposure mapping technique has the potential to improve individual daily greenspace exposure estimations by integrating additional data sets such as GPS tracks. Furthermore, such detailed exposure assessments would allow the creation of complete greenspace exposure profiles for any given individual, which would provide the opportunity to estimate greenspace exposure over long periods of time and at multiple locations. Such processes could be used to support the wider framework of the exposome by allowing better characterisations of how greenspace exposure is associated with human health on an ongoing basis.

Fig. S1. Combined multiscale NDVI (a), LAI (b), and LULC (c) exposure values.

SI Tables

Table S1. Greenspaces considered in accessibility exposure measurements and rationale for selecting these greenspaces.