Greenspace spatial characteristics and human health in an urban environment: An epidemiological study using landscape metrics in Sheffield, UK

Cross-sectional research linking exposure to greenspace with human health rarely describes greenspace characteristics in detail, but a few studies do find that some types of greenspace have greater health benefits than others. We review literature linking landscape metrics to multiple mechanisms by which greenspace exposure is posited to benefit health. Using metrics identified in this process to describe the composition and configuration of urban greenspace, we conduct a small-area epidemiological analysis of self-reported general health for the city of Sheffield, UK. A relatively high proportion of water cover and a high diversity of tree planting are associated with lower levels of poor health; while a high proportion of grass cover, which may be indicative of low quality greenspaces, is associated with higher levels. The presence of large greenspace patches that are well interspersed with the built environment is also associated with lower levels of poor health. We demonstrate a successful methodology for identifying useful landscape metrics even where effect sizes are small, and explore the challenges of translating results of landscape metric studies into policy guidance.


Introduction
It is now widely accepted that exposure to greenspace, including urban greenspace, has health-promoting effects for humans (World Health Organization, 2016). These benefits derive from several processes, including reducing stress and improving psychological restoration; promoting physical activity and neighbourhood social cohesion; ameliorating air and noise pollution and the heat island effect; and modulating immune functioning (reviewed in World Health Organization, 2016). Moreover, the benefits have been shown to be strongest in more deprived groups, such that greenspace can reduce health inequalities associated with socioeconomic deprivation (Maas et al., 2009; Mitchell and Popham, 2008).
Cross-sectional research linking greenspace with physical and psychological health outcomes has often used remotely sensed data to assess greenspace exposure. Greenspace exposure is most often measured simply as total area within, e.g., a census area or a buffer around postcode areas, or as distance to the nearest greenspace, without attempting to describe that greenspace (Jorgensen and Gobster, 2010; Wheeler et al., 2015). Not all greenspace is equal in its health benefits, however: studies that split greenspace into even relatively broad typologies find that some types affect health to varying degrees, but others do not. For example, Wheeler et al. (2015) found that UK census areas with a large proportion of 'broadleaf woodland', 'arable and horticulture', 'improved grassland' and 'coastal' land covers had lower rates of self-reported poor health. Other land covers ('coniferous woodland', 'semi-natural grassland', 'mountain, heath, bog', 'saltwater' and 'freshwater') had no significant effect.
One approach to describing finer details of the structure of landscapes, without resorting to resource-intensive, difficult-to-scale methods such as site surveys, photographs or simulations (e.g. Dramstad et al., 2006; Fuller et al., 2007; Hoyle et al., 2017; Palmer, 2004), is to calculate landscape metrics from remotely sensed data. Landscape metrics quantify spatial heterogeneity in terms of composition (i.e. what exists: quantity and diversity of patches) and configuration (spatial arrangement, e.g. patch shape and aggregation). They have been used extensively in ecological and environmental sciences to relate spatial patterns to processes such as biodiversity, water quality, and aesthetic preference (Uuemaa et al., 2013). In recent years, a number of studies have found relationships between landscape metrics and several of the processes from which the health benefits of greenspace are thought to derive. These processes include aesthetic preferences, which are linked to psychological restoration (Dramstad et al., 2006; Palmer, 2004; Staats et al., 2003; Van den Berg et al., 2003); neighbourhood walkability, which promotes physical activity (Hajna et al., 2014; Manaugh and Kreider, 2013); and noise and air pollution levels (Han et al., 2018; Liu and Shen, 2014; Weber et al., 2014).
These studies, however, rarely link landscape metrics directly to measures of human health. There are two notable exceptions, which have used an epidemiological approach to look at specific aspects of health. Shen and Lung (2017) used pollution and mortality data from administrative districts in Taiwan to show that landscape metrics are associated with air pollution concentrations, which in turn are related to mortality from respiratory diseases. Müller-Riemenschneider et al. (2013) investigated the relationship between metrics indicating neighbourhood walkability in buffers around addresses and cardiometabolic risk factors determined from health survey data, finding a negative relationship between walkability and prevalence of obesity and diabetes.
The relationship between the composition and configuration of urban greenspace and more general measures of health does not appear to have been studied previously. There are many dozens of landscape metrics, with little consensus on which may be most useful in particular situations (Cushman et al., 2008; Uuemaa et al., 2009). In order to guide the choice of metrics used in our analysis, we therefore undertake a literature search for studies that have used landscape metrics to describe the urban environment and relate them to the processes through which health benefits are thought to derive.
The aim of this study is therefore to test the utility of landscape metrics as indicators of aspects of urban greenspace that contribute to human health benefits, where health is measured directly instead of using processes such as physical activity promotion or air pollution reduction as a proxy. Specifically, we use self-reported general health, a subjective composite measure that is associated with a range of physical, mental and social factors as well as all-cause mortality (Kyffin et al., 2004; Mavaddat et al., 2011). Our aims are:
• To review landscape metrics that have previously been found to have utility as indicators in studies linking landscape patterns (greenspace composition and configuration) to processes that drive benefits to human health from urban greenspace.
• To use these metrics to explore the relationship between landscape patterns and self-reported general health in an urban environment, using Sheffield, UK as a case study.
• To evaluate the usefulness of the landscape metric indicator approach to planning and designing cities to minimise health inequalities.

Literature review
The purpose of our literature review was to identify studies that have found statistically significant associations between landscape metrics and mechanisms through which benefits to human health from greenspace arise. We identified mechanisms from a World Health Organization (2016) review. Eight mechanisms were considered: aesthetic preference (related to restoration and relaxation), physical activity promotion, social value, air pollution reduction, noise pollution reduction, immune function regulation, exposure to sunlight and promotion of pro-environmental behaviour (another mechanism identified in the review, heat island mitigation, was not considered because our study area has a temperate climate and excessive heat is rarely a problem).
We used Scopus to search for papers including either "landscape metrics" or "fragstats" (a widely used landscape metric software package), plus a term relating to one of the above mechanisms, in the title, abstract or keywords. The full list of search strings is shown in Supplementary Material 1.1, along with the criteria for inclusion. In general, papers and individual metrics were rejected if they did not link neighbourhood or intra-urban landscape metrics calculated from remotely sensed data to a mechanism of benefit (or harm) to human health, or if they focused exclusively on measuring the built (rather than green) environment. The results from Scopus were supplemented using references contained within relevant studies, and by using the same keywords with Google Scholar. A different approach was taken for a ninth mechanism, presence of biodiversity, which was identified by the World Health Organization (2016) as a co-benefit of greenspace but has been shown to be associated with psychological benefits from greenspace (Fuller et al., 2007). Beninde et al. (2015) provide a recent synthesis of factors influencing intra-urban biodiversity. We selected metrics that measure the significant factors identified in this meta-analysis.
M. Mears, et al. Ecological Indicators 106 (2019) 105464

Study area and units
The city of Sheffield, UK (53°23′N, 1°28′W; Fig. 1) is an inland city covering an area of 368 km², with a population in 2011 of 552,000 (Office for National Statistics, 2016). Sheffield lies over a wide altitudinal range of nearly 600 m, and includes a large expanse of moorland in the west. The eastern part of the city was a centre of industry until the mid-twentieth century. Consequently, there remains a strong west-east gradient in deprivation, with ex-industrial areas suffering income and health deprivation relative to the historically wealthier and cleaner west (Department for Communities and Local Government, 2015). Sheffield is similar to other ex-industrial northern English cities in having a higher level of socioeconomic deprivation than the national average (Department for Communities and Local Government, 2015), and approximately two thirds of households living in semi-detached or terraced housing according to the 2011 census.
This study uses Lower-layer Super Output Areas (LSOAs) as the units of analysis. LSOAs are an English census geography used for reporting small area statistics. They contain an average population of approximately 1500 and have been used in previous research into relationships between greenspace and health (Brindley et al., 2018; Mitchell and Popham, 2008; Wheeler et al., 2015). Many Office for National Statistics data are available at LSOA scale and not at the smaller Output Area scale, because the smaller headcounts involved mean fluctuations due to chance are more likely and, in some cases, identification of individuals becomes possible. Larger geographies, such as Middle-layer Super Output Areas, however, are more likely to average out genuine patterns at the intra-urban scales that we are interested in.
Sheffield contains 345 LSOAs, although due to LIDAR data availability, the more rural areas with lower population density have been excluded from analysis (n = 307 LSOAs included; Fig. 1). This study therefore focuses on the more urbanised parts of the city, containing 89% of Sheffield's resident population (according to the 2011 Census). Excluding rural areas also has the benefit that rural greenspace (predominantly large expanses of agricultural and extensively managed natural/semi-natural land) is not conflated with urban greenspace (mostly much smaller, planned, intensively managed sites) in analysis. Two LSOAs that are discontiguous with the rest are included from the Stocksbridge suburb. These LSOAs are unremarkable other than one having a high proportion of water cover, and in sensitivity analysis (not shown here) their inclusion or exclusion made no qualitative difference to analytical results. Generalised LSOA boundaries were used (generalised to 20 m), as this reduced fragmentation of small, thin sections of LSOAs following conversion of vector polygons to a raster surface.

GIS data
Landscape metrics were calculated using three GIS raster layers: land cover, land use, and a combined vegetation heights/types map. The land cover map identifies what is located on the land surface, with classes describing different types of woodland, grassland, other vegetation, and water, as well as buildings and artificial surfaces. The vegetation heights/types map combines the green land covers from the land cover map with heights derived from LIDAR data, in order to differentiate e.g. short and tall coniferous and broadleaved trees. In contrast, the land use map describes what the land is used for, e.g. residences, commerce, agriculture, leisure and recreation. These maps were used as appropriate for each landscape metric (e.g. land use patch density with the land use map; vegetation Shannon diversity on the combined vegetation heights/types map).
The land cover and use maps were produced by Ersoy (2015), based on the National Land Use Database classification schemes and created using data sources dating between 2007 and 2012. These maps are best efforts using data available at the time of creation; it is possible to create maps of similar resolution and typology using available national and local datasets. Briefly, maps were based primarily on Ordnance Survey (OS) MasterMap topography area polygons and attributes, with additional land cover detail provided by Land Cover Map 2007, Forestry Commission National Inventory of Woodland and Trees, and OS 1:10,000 scale raster; and land use detail provided by OS AddressBase Plus, Sheffield City Council Green and Open Spaces, and OS 1:10,000 scale raster.
MasterMap captures individual features that are considered important in a landscape, such that in complex urban landscapes many more small features are mapped than in rural landscapes. For example, tree rows and some individual trees are mapped in urban parks, while only larger areas of woodland are mapped in the countryside. Thus the distribution of parcel areas is heavily right-skewed, for example with broadleaf tree land cover parcels in the study area having a mean area of 7568 m² but a median area of 576 m².
Vegetation heights were calculated from the difference between 50 cm resolution LIDAR Digital Surface and Terrain Models, and categorised to represent broad vegetation types (< 0.5 m = short grass; 0.5-2 m = long grass/shrubs; 2-10 m = small trees; 10-15 m = medium trees; 15-20 m = tall trees; > 20 m = very tall trees; height categories following Grafius et al. (2017)). These were then combined with vegetation types from the land cover map to create the combined vegetation height/type map.
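As an illustration only, the height calculation and categorisation described above could be sketched as follows. The function names are hypothetical, and the handling of heights that fall exactly on a class boundary is an assumption, since the text does not state which class a boundary value belongs to.

```python
from bisect import bisect_right

# Class boundaries in metres, taken from the categories listed in the text.
HEIGHT_BREAKS = [0.5, 2.0, 10.0, 15.0, 20.0]
HEIGHT_CLASSES = [
    "short grass",
    "long grass/shrubs",
    "small trees",
    "medium trees",
    "tall trees",
    "very tall trees",
]


def vegetation_height(dsm_m, dtm_m):
    """Height above ground: surface model value minus terrain model value."""
    return dsm_m - dtm_m


def classify_height(height_m):
    """Map a height in metres to one of the six broad vegetation classes.

    Assumption: a height exactly on a boundary is placed in the taller class.
    """
    return HEIGHT_CLASSES[bisect_right(HEIGHT_BREAKS, height_m)]
```

For example, a cell with a surface elevation of 112.0 m and a terrain elevation of 100.0 m has a height of 12.0 m and would be classed as medium trees.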
Details of the map typologies, and of the composition of the study area, can be found in Supplementary Material 1.2.

Population data
Self-reported general health was obtained from 2011 UK census data, which asked of every individual the question "how is your health in general?", with the possible answers: very good; good; fair; bad; very bad. The measure used in this study, which we term poor health, combines the 'bad' and 'very bad' categories. The measure was aggregated to LSOA scale. We used indirect standardisation (Naing, 2000) to calculate expected rates of poor health given the age and sex distribution of the LSOA population, for use as an offset term in the statistical model. This health measure has been used in previous epidemiological studies of greenspace and health (Brindley et al., 2018; Mitchell and Popham, 2008; Wheeler et al., 2015). The geographical distribution of poor health is shown in Fig. 2.
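A minimal sketch of indirect standardisation is given below, to make the expected-count calculation concrete. It assumes that the reference rates are pooled over all areas in the study; the text does not state which reference population was used. The function names, stratum labels and numbers are invented for illustration.

```python
def reference_rates(areas):
    """Stratum-specific poor-health rates pooled over all areas.

    `areas` maps an area id to {stratum: (population, poor_health_count)},
    where a stratum is an age-sex group.
    """
    pop, cases = {}, {}
    for strata in areas.values():
        for stratum, (n, k) in strata.items():
            pop[stratum] = pop.get(stratum, 0) + n
            cases[stratum] = cases.get(stratum, 0) + k
    return {s: cases[s] / pop[s] for s in pop if pop[s] > 0}


def expected_count(strata, rates):
    """Expected poor-health count for one area given its age-sex structure."""
    return sum(n * rates.get(stratum, 0.0) for stratum, (n, _) in strata.items())


# Hypothetical two-area, two-stratum example (all numbers are invented).
example_areas = {
    "A": {"young": (800, 16), "old": (200, 40)},
    "B": {"young": (200, 4), "old": (800, 160)},
}
example_rates = reference_rates(example_areas)
expected_a = expected_count(example_areas["A"], example_rates)
expected_b = expected_count(example_areas["B"], example_rates)
```

In a count regression, the logarithm of the expected count would typically enter as the offset, so that coefficients describe departures of observed from expected poor health.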

Controlling variables
To minimise confounding with socioeconomic factors known to influence health that might reasonably correlate with aspects of urban greenspace, we included three controlling variables as covariates in the statistical model (stratification was not a suitable approach to controlling for confounding due to inclusion of multiple confounders; McNamee, 2005; Pourhoseingholi et al., 2012). To control for deprivation, we used the income deprivation domain of the English Indices of Deprivation 2015, which is calculated from the proportion of individuals in receipt of various forms of state financial support (data relating mostly to 2012-13). Air pollution was controlled for using estimates of PM10 concentrations for 2010, generated from 1 km modelled data from the UK's Department for Environment, Food and Rural Affairs and assigned to LSOAs using population weighted averages, where the population represented the census headcounts at postcode unit level. Smoking rates were controlled for using lung cancer hospital admissions from 1st April 2002 to 31st March 2014 as a proxy. The ratio of observed to expected admission counts was calculated for each LSOA, with expected counts adjusted for age and sex. These three variables were selected as they are used as confounders in other analyses exploring the health effects of greenspace (Brindley et al., 2018, 2019; Mitchell and Popham, 2008; Richardson et al., 2010).
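The population-weighted assignment of modelled PM10 values to LSOAs can be illustrated with a short sketch. The function name and the example values are hypothetical; the sketch assumes each postcode unit has already been given the PM10 value of the 1 km grid cell it falls in.

```python
def population_weighted_mean(values, populations):
    """Average of `values` weighted by the matching `populations`."""
    total = sum(populations)
    if total == 0:
        raise ValueError("total population must be greater than zero")
    return sum(v * p for v, p in zip(values, populations)) / total


# Hypothetical LSOA with two postcode units (invented PM10 values and headcounts).
pm10_by_postcode = [16.0, 18.0]
headcount_by_postcode = [300, 100]
lsoa_pm10 = population_weighted_mean(pm10_by_postcode, headcount_by_postcode)
```

Weighting by headcount means the LSOA value reflects concentrations where residents live rather than a simple areal average.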

Landscape metrics
We used Fragstats v4.2.1 (McGarigal et al., 2012) to calculate the metrics identified in the literature search for each LSOA. For compatibility with Fragstats, vector input data were rasterised with a 2 m cell size, and the vegetation height raster surface was aggregated to the same size.
We aimed to match our metrics as closely as possible to those identified in the literature review in terms of the focal land cover/use types (e.g. tree land covers, recreation land uses), including land cover/use differentiation or aggregation (e.g. broadleaved and coniferous trees considered separately or as one category). Prior to metric calculation, we therefore reclassified the land cover and use maps accordingly. Studies rarely indicated whether they counted water as a green land cover; we have included it as such. Some additional modifications were made to improve relevance of metrics to the present study, e.g. conversion of area- and count-based metrics to percentage- or density-based (due to variation in the size of LSOAs), and aggregating grey land covers to keep the focus on greenspace. Details of the reclassifications and modifications can be found in Supplementary Material 1.2. All GIS manipulations were performed in ESRI ArcGIS 10.1.

Statistical model
Negative binomial regression was used to test for associations between self-reported poor health counts and landscape metrics, after standardising by expected poor health counts given LSOA age and sex composition and controlling for potential confounding factors as described. Inclusion of a quadratic term for income deprivation was indicated from visual inspection and confirmed using AIC corrected for small sample size (AICc). No other polynomial and no interaction terms were included due to lack of a priori expectations or indication from visual inspection. We transformed some variables to reduce skew in the distribution: several landscape metrics were log-transformed and income deprivation was square root-transformed. Models were run in R v4.3.2 using package MASS v7.3 (R Core Team, 2017; Venables and Ripley, 2002).
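One way to write down a model of this kind is sketched below. The notation is ours rather than the paper's: $O_i$ is the observed poor health count in LSOA $i$, $E_i$ the expected count from indirect standardisation (entering as an offset), $d_i$ the square root-transformed income deprivation score, $p_i$ the PM10 estimate, $s_i$ the lung cancer admission ratio, and $m_{ij}$ the landscape metrics (log-transformed where applicable). Placing the quadratic term on the transformed deprivation scale is an assumption, and the variance function shown is the usual one for the MASS negative binomial family.

```latex
\begin{align}
O_i &\sim \mathrm{NegBin}(\mu_i, \theta), \qquad
\operatorname{Var}(O_i) = \mu_i + \frac{\mu_i^{2}}{\theta}, \\
\log \mu_i &= \log E_i + \beta_0 + \beta_1 d_i + \beta_2 d_i^{2}
  + \beta_3 p_i + \beta_4 s_i + \sum_{j} \gamma_j m_{ij}.
\end{align}
```

Under this form, $\exp(\gamma_j)$ can be read as the multiplicative change in the ratio of observed to expected poor health for a one-unit increase in metric $j$, holding the other terms fixed.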

Multi-model inference
Due to the large number of landscape metrics (eighteen) with no a priori expectations of which might impact on poor health, we used an information theoretic approach, following the methods of Symonds and Moussalli (2011). All possible subsets of landscape metrics were tested (but with the controlling variables and offset included in all models), and the AICc value calculated for each. As there were a large number of models within a few AICc units of the model with lowest AICc, we used multi-model inference and averaging to gain insight into the importance of the metrics as determinants of health, and to create the final inferential model.
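To give a sense of the scale of an all-subsets search, the sketch below enumerates every subset of eighteen candidate metrics and defines the standard AICc formula. The function names are hypothetical, and the model fitting itself is omitted.

```python
from itertools import combinations


def aicc(log_likelihood, k, n):
    """AIC corrected for small sample size.

    k is the number of estimated parameters and n the sample size;
    requires n > k + 1.
    """
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def all_subsets(metrics):
    """Yield every subset of the candidate metrics, including the empty set."""
    for size in range(len(metrics) + 1):
        for subset in combinations(metrics, size):
            yield subset


# Eighteen candidate metrics give 2**18 candidate models.
n_models = sum(1 for _ in all_subsets(range(18)))
```

With eighteen metrics there are 262,144 candidate models, which helps explain why individual Akaike weights can be very small.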
A measure of the probability that each landscape metric would appear in the true best model was obtained by summing over the Akaike weights of the models in which the metric was included. This measure is often termed relative importance, but does not indicate effect size or indeed the probability of statistical effect (Galipaud et al., 2014), so is instead referred to here as probability of appearing in the best model, or appearance probability. Given the expected distributions of appearance probabilities for weakly or uncorrelated predictor variables (Galipaud et al., 2014), we consider there to be good support for variables with appearance probability ≥ 0.75 and tentative support for variables with appearance probability ≥ 0.5 and < 0.75.
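The appearance probability can be sketched in a few lines. The Akaike weight formula is the standard one; the function names, model sets and AICc values below are invented for illustration.

```python
import math


def akaike_weights(aicc_values):
    """Convert AICc values to Akaike weights that sum to one."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


def appearance_probability(models, weights, metric):
    """Sum the Akaike weights of the models that include `metric`."""
    return sum(w for terms, w in zip(models, weights) if metric in terms)


def support_level(probability):
    """Apply the thresholds stated in the text (0.75 and 0.5)."""
    if probability >= 0.75:
        return "good"
    if probability >= 0.5:
        return "tentative"
    return "low"


# Hypothetical candidate models and AICc values.
example_models = [{"water"}, {"water", "grass"}, set()]
example_weights = akaike_weights([100.0, 102.0, 104.0])
p_water = appearance_probability(example_models, example_weights, "water")
p_grass = appearance_probability(example_models, example_weights, "grass")
```

In this toy example the water metric appears in the two best-supported models, so its appearance probability is high, while the grass metric appears only in the second model and receives that model's weight alone.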
A 'plausible set' of models was created by taking all models within six AICc units of the lowest AICc and removing those that were more complex versions of a simpler model with lower AICc (Richards et al., 2011). The plausible set was then model-averaged using full averaging, i.e. the average of coefficients weighted by the Akaike weight (derived from AICc) of each model, with the coefficients for terms not appearing in a model set to zero to prevent inflation of coefficients for unimportant variables.
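The two steps described above, building the plausible set and then full averaging, can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, the six-unit threshold is treated as inclusive, and Akaike weights are recomputed within the plausible set, both of which are assumptions.

```python
import math


def plausible_set(models, delta_max=6.0):
    """Filter (terms, aicc) pairs, where terms is a frozenset of metric names.

    Keeps models within delta_max AICc units of the best model, then drops
    any model that has a simpler nested model (a strict subset of its terms)
    with a lower AICc.
    """
    best = min(a for _, a in models)
    close = [(t, a) for t, a in models if a - best <= delta_max]
    return [
        (t, a) for t, a in close
        if not any(s < t and b < a for s, b in close)
    ]


def full_average(models, coefficients):
    """Akaike-weighted mean of each coefficient; absent terms count as zero."""
    best = min(a for _, a in models)
    relative = [math.exp(-0.5 * (a - best)) for _, a in models]
    total = sum(relative)
    terms = set().union(*(t for t, _ in models))
    return {
        term: sum(r / total * c.get(term, 0.0)
                  for r, c in zip(relative, coefficients))
        for term in sorted(terms)
    }


# Hypothetical candidate models (terms, AICc); all values are invented.
candidates = [
    (frozenset({"split"}), 102.0),
    (frozenset({"split", "water"}), 100.0),
    (frozenset({"split", "water", "grass"}), 101.5),  # nested, higher AICc
    (frozenset({"grass"}), 108.0),                    # more than 6 units away
]
kept = plausible_set(candidates)
averaged = full_average(kept, [{"split": 0.10}, {"split": 0.08, "water": -0.20}])
```

Here the three-metric model is dropped because the simpler two-metric model nested within it has a lower AICc, and the water coefficient is shrunk towards zero because it is absent from one of the two retained models.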

Imputation of missing values and sensitivity test
Amongst the selected landscape metrics were the Shannon diversity indices of tree, shrub and grass habitats, which were calculated from the combined vegetation heights/types map (described in Section 2.3.1). Some LSOAs did not contain any of one or more of these land covers. In these cases, a value of zero was imputed for the Shannon diversity index (a Shannon diversity of zero otherwise results from a monoculture).
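The imputation rule can be made concrete with a short sketch of the Shannon diversity index, computed from the areas of each habitat class within an LSOA. The function name is hypothetical; the formula is the standard index using natural logarithms.

```python
import math


def shannon_diversity(class_areas):
    """Shannon diversity of the classes with non-zero area.

    Returns 0.0 when no class is present (the imputed value described in the
    text) and also when exactly one class is present (a monoculture).
    """
    present = [a for a in class_areas if a > 0]
    total = sum(present)
    if total == 0:
        return 0.0
    proportions = [a / total for a in present]
    return sum(-p * math.log(p) for p in proportions)
```

The sketch shows why the imputed zeros are not distinguishable from monocultures in the data: an LSOA with no tree cover and an LSOA with a single tree habitat class both score zero, which is what the sensitivity test is designed to check.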
As the Shannon diversity of tree habitats was found to be relatively likely to appear in the best model, in order to test the sensitivity of the analysis to this imputation we repeated the analysis using only LSOAs with each of these land covers present, i.e. those without imputed values (n = 196).
The results of the sensitivity test are shown in full in Supplementary Material 3.1. In general, the effect of using this subset of LSOAs was that AICc values were lowest for smaller models as compared to using the full dataset. This is likely due to a loss of statistical power arising from the reduction of sample size and also, in some cases, due to reduction of the numerical range of landscape metric values (variable distributions for both full data and subset are shown in Supplementary Material 2). Consequently, fewer landscape metrics appear in the plausible set, and the probability of metrics appearing in the true best model is uniformly lower. However, tree habitat Shannon diversity remained amongst the metrics most likely to appear in the best model, and had a similar coefficient in both averaged models. The grass and shrub Shannon diversity metrics also had a low appearance probability in both the full dataset and subset. We therefore do not consider imputation to have biased these metrics.

Literature review
The literature search identified eleven original studies that found significant relationships between landscape metrics and aesthetic values (Dramstad et al., 2006; Franco et al., 2003; Palmer, 2004), physical activity (Kim et al., 2016; Manaugh and Kreider, 2013; Müller-Riemenschneider et al., 2013; Su et al., 2013), air pollution (Shen and Lung, 2017; Wu et al., 2015) and noise pollution (Han et al., 2018; Sakieh et al., 2017). No studies were found that looked at social values, immune functioning, exposure to sunlight or promotion of pro-environmental behaviour. Two review papers of aesthetic preferences (Nassauer, 1995) and physical activity levels (Lee and Moudon, 2004), and a meta-analysis of urban biodiversity (Beninde et al., 2015), were also identified. A total of 63 metrics were identified from these papers, although some are duplicates both within and between mechanisms. The studies and metrics are summarised in Table 1.
There was considerable diversity in the typologies of the maps that the landscape metrics were calculated on. Some calculated metrics on a two-class scheme of green versus built land covers (Shen and Lung, 2017; Sakieh et al., 2017), while others differentiated up to 100 land covers (e.g. Dramstad et al., 2006; Palmer, 2004). Some included built land covers in calculations of e.g. diversity (Han et al., 2018; Palmer, 2004; Wu et al., 2015), while others treated these as background (Kim et al., 2016; Shen and Lung, 2017; Sakieh et al., 2017). There was often conflation between land cover and land use (e.g. Manaugh and Kreider, 2013; Palmer, 2004), although on balance land cover seemed to be more central. The exception to this was studies of physical activity levels, where land use was the focus.
The studies analysed a diverse range of response variables, and there was little consistency in metric choice between studies. There was also variation in the scales at which landscape metrics and response variables were measured. For aesthetic preference studies, the viewshed was the unit of analysis (Dramstad et al., 2006; Franco et al., 2003; Palmer, 2004). For physical activity studies, buffers of between 400 m and 1600 m around homes were commonly used, in efforts to capture the distance most people are prepared to walk (Kim et al., 2016; Müller-Riemenschneider et al., 2013; Su et al., 2013). The studies of noise pollution and one of air pollution used buffers centred around monitoring stations, at scales relevant to noise attenuation (300-1000 m) and air pollutant dispersion (100-5000 m) respectively (Han et al., 2018; Sakieh et al., 2017; Wu et al., 2015). One study of physical activity and one of air pollution analysed administrative geographies (Manaugh and Kreider, 2013; Shen and Lung, 2017).
Most of the metrics identified from the review of intra-urban biodiversity related to habitat structural heterogeneity (Beninde et al., 2015). This is posited to be the mechanism by which humans perceive biodiversity (Fuller et al., 2007).
It was not computationally feasible to include all the identified metrics in our statistical analysis; in addition, many metrics were theoretically similar (e.g. different diversity indices calculated on the same data) or were empirically highly correlated. We therefore selected fifteen metrics to carry forward for statistical analysis by theoretical and empirical comparison, with the aim of minimising redundancy. We also calculated an additional three metrics to relate air pollution removal to tree land covers where original studies had used a single aggregated 'green' category, as trees are known to be especially valuable for air pollution removal in the study area (Mears, 2010). The selected metrics are described in Table 2. Full details of their calculation and behaviour

Multi-model inference
Multi-model analysis yielded a large number of models with low Akaike weights (Akaike weight for best model = 0.001). The plausible set included 70 models (plausible sets shown in Supplementary Material 3.2). The final, averaged model explains a high proportion of variation in poor health (Pearson's r of observed poor health versus fitted values = 0.949). Fig. 3 shows the probability of landscape metrics appearing in the best model, and Table 3 shows model coefficients for the plausible set averaged model. Two metrics have a high probability of appearing in the best model (≥ 0.75). The greenspace splitting index (Fig. 4a, appearance probability = 0.99), which is also a significant term in the averaged model, shows an association between subdivision of greenspace within an LSOA and higher levels of poor health. Less water cover is also associated with increased poor health (Fig. 4b, appearance probability = 0.79).
Five further metrics have a moderate probability of appearing in the best model (≥ 0.5 and < 0.75). Lower tree habitat Shannon diversity, greater grass cover, greater recreation land use patch density, greater greenspace mean contiguity index (i.e. larger patches with less complex shape), and lower green/grey landscape shape index (i.e. less complex patch shape) are all associated with higher levels of poor health (Fig. 4c-g). The remaining metrics have low probability of appearing in the best model (< 0.5), and three do not appear at all in the plausible set: tree cover, shrub habitat Shannon diversity index, and green/grey mean shape index.

Selecting useful and parsimonious landscape metrics
Landscape metrics have been widely used in studies of ecosystem services, which benefit humans directly or indirectly (Uuemaa et al., 2013). This study, which appears to be the first to link landscape metrics to a measure of small-area population general health, adds to the evidence (Müller-Riemenschneider et al., 2013; Shen and Lung, 2017) indicating that they also have utility in linking landscape patterns directly to measures of human health.
The process of selecting a suitable and adequate suite of metrics to describe landscapes for a particular purpose is challenging, due to issues of redundancy, scale dependence, and interpretation (Cushman et al., 2008; Lustig et al., 2015; Uuemaa et al., 2009). Moreover, attempts to identify a parsimonious suite of metrics do not produce consistent results (Cushman et al., 2008; Lustig et al., 2015). Some of the studies identified in our literature search reported using "common" metrics (Dramstad et al., 2006; Han et al., 2018; Palmer, 2004), yet did not always use the same ones, while others did not report any rationale; there was also little consistency in their chosen metrics, scales of analysis, or greenspace typologies. Given the large number of metrics that exist, in combination with the range of possible typologies and resolutions at which landscapes can be mapped, it can be difficult to deduce from theory which metrics will describe the landscape for the subject under study most effectively. However, there were a few cases in which choices seem to have been driven by theoretical expectations. Studies of aesthetic preferences, for example, used metrics corresponding to the universal preference for savannah-type landscapes (Nassauer, 1995). Metrics indicating land use mixture were important to physical activity levels, consistent with the 'walkability' concept (Manaugh and Kreider, 2013). Overall, however, the small number of studies limits the extent to which patterns in the usage and analytical relevance of metrics can be observed.
In this study, although we used a literature search to select metrics previously found to have statistically significant relationships with processes that drive benefits to human health from greenspace, the majority of included metrics did not show strong associations with our measure of health. This may result from the fact that most previous studies did not look at health directly, but rather at mechanisms from which health benefits may result, so their response variables were a level of abstraction away from ours. Interestingly, one of the two previous epidemiological studies found that greenspace patch density was significantly positively related to mortality from respiratory disease, while the proportion of area occupied by the single largest greenspace patch showed a significant negative relationship (Shen and Lung, 2017). These measures are similar respectively to the recreation land use patch density and greenspace splitting index metrics found to be important in our study (the splitting index is affected by patch size distribution and patch number; we did not include largest patch index in our statistical analysis due to its high correlation with the splitting index). These similarities may hint at the generalisable importance of these aspects of greenspace patterns.

Table 2 (partial)
| No. | Metric | Land covers | Description |
|---|---|---|---|
|  |  | Land covers (individual) | Gives a picture of the composition of the greenspace. |
| 5 | Green land cover Shannon diversity index | Green land covers (individual) | A measure of land cover diversity. Increases with more land cover types and more even land cover distributions. ** |
| 6 | Land cover contagion index | Green land covers (individual) and grey land covers (aggregated) | A measure of patch 'clumpiness' (the probability that two random adjacent cells belong to classes i and j, summed over all i and j). Increases with more land covers, fewer individual patches, increasing dominance of individual patches, lower patch shape complexity, and lower dispersion and interspersion (intermixing) of land covers. ** |
| 7 | Greenspace patch density | Green land covers (aggregated) | A simple measure of greenspace subdivision, i.e. degree of fragmentation: the number of greenspace patches, standardised to landscape area; although individual patches may be small. |
| 8 | Greenspace splitting index | Green land covers (aggregated) | A measure of subdivision derived from landscape coherence, or the probability that two animals placed at random in an area will be on the same patch. Increases with more individual patches, more even land cover distribution, and increasing subdivision of land covers. ** |
| 9 | Greenspace mean contiguity index | Green land covers (aggregated) | A measure of patch spatial connectedness and shape based on the average degree of contiguity of pixels in a raster map. Increases with patch area and more strongly with lower shape complexity (i.e. increasing contiguity). ** |
| 10 | Greenspace mean shape index | Green land covers (aggregated) | A measure of patch shape complexity. Increases with greater diversion of patch shape from the simplest square. ** |
| 11 | Green/grey mean shape index* | Green land covers (aggregated) and grey land covers (aggregated) | As per metric 10. |
| 12 | Green/grey landscape shape index | Green land covers (aggregated) and grey land covers (aggregated) | A measure of dispersion of land covers. Measures shape complexity as per metric 10, adjusted for the size of the landscape. ** |
| 13 | Tree land cover patch density | Tree land covers ( |  |
Alternative approaches to selecting metrics are to use ordination or clustering techniques (e.g. principal components analysis (PCA), self-organising maps) to define dimensions to the data (Cushman et al., 2008; Lustig et al., 2015); and machine learning techniques such as random forests, which produce a simple measure of variable importance (Marston et al., 2014). Our approach using theoretical similarities and pairwise correlations seems to have produced broadly similar results to Cushman et al. (2008), who used PCA to find a parsimonious suite of metrics, with our selected metrics representing many of the groupings identified in that study; although our approach to reduction would have been challenging had a much larger number of metrics been considered. However, the results of any approach to selecting a subset of metrics will depend on the composition of the suite initially tested. Furthermore, the interpretation of output from clustering and machine learning techniques is not always obvious (Cushman et al., 2008; Cutler et al., 2007; Lustig et al., 2015). This limits their usefulness for producing planning and policy guidance.

Fig. 3. Probability of landscape metrics appearing in the true best regression model, calculated as the sum of Akaike weights of models in which the metric appears.
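As an illustration of the appearance probability described in the Fig. 3 caption, the following minimal sketch converts AICc scores into Akaike weights and sums them over the models containing each metric. The candidate models, metric names and AICc values are invented for the example and are not taken from the study; the function names are likewise hypothetical.

```python
import math


def akaike_weights(aicc_values):
    """Convert a list of AICc scores into Akaike weights that sum to 1."""
    best = min(aicc_values)
    # Relative likelihood of each model given its AICc difference from the best
    relative = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


def appearance_probability(models):
    """Sum Akaike weights over every model in which each term appears.

    models: list of (set_of_terms, aicc) pairs.
    Returns a dict mapping term -> summed weight.
    """
    weights = akaike_weights([aicc for _, aicc in models])
    prob = {}
    for (terms, _), w in zip(models, weights):
        for term in terms:
            prob[term] = prob.get(term, 0.0) + w
    return prob


# Hypothetical candidate set (values invented for illustration only)
candidate_models = [
    ({"water_cover", "splitting_index"}, 100.0),
    ({"water_cover"}, 101.2),
    ({"splitting_index", "grass_cover"}, 102.5),
    (set(), 108.0),
]
probs = appearance_probability(candidate_models)
for term, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {p:.3f}")
```

A metric that appears only in poorly supported models receives a low summed weight, whereas one that appears in most of the well-supported models approaches 1.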

Table 3
Results of plausible set averaged negative binomial regression models. Terms significant at p < 0.05 are shown in bold. NB: tree cover, green/grey mean shape index and shrub habitat Shannon diversity index were not included in the plausible set.

Fig. 4. Geographic distribution of landscape metrics found to be likely to be important to self-reported poor health at Lower-layer Super Output Area scale. Quintiles, except in (b), where the two lowest quintiles are aggregated due to the frequency of zeroes (indicated by *). Only LSOAs included in statistical analysis are shown. (a) Greenspace splitting index (no units, positive association with poor health); (b) water cover (%, negative association); (c) tree habitat Shannon diversity index (no units, negative association, zero imputed where data not available); (d) grass cover (%, positive association); (e) recreation land use patch density (patches per ha, positive association); (f) greenspace mean contiguity index (no units, positive association); (g) green/grey landscape shape index (no units, negative association).

M. Mears, et al. Ecological Indicators 106 (2019) 105464

multi-model inference approach with a plausible set constructed using both ΔAICc and nested models. We found the approach to be useful in identifying important metrics, while reducing the likelihood of overfitting compared with a non-averaged model with all variables included. The plausible set has increased inferential power over the all-combinations averaged model due to slightly reduced variance for terms with a relatively large effect size (compare Table 3 with Supplementary Material 3.2.2). This is consistent with Richards et al.'s (2011) suggestion that this approach to building a plausible set of models for averaging can improve the accuracy of effect size estimation. In general, the probabilities of landscape metrics appearing in the best model correspond well to the results of the plausible set averaged model, with strong correlation between metric z-values and appearance probability (Spearman's rho = 0.92). This correlation lends some support to the notion that metrics with a high or moderate appearance probability, but which are not statistically significant, also have some predictive value. However, results from simulation studies find that predictors that are correlated with a response variable only weakly or not at all can still have a high appearance probability (Galipaud et al., 2014). It is therefore not possible to draw firm conclusions from the appearance probability alone. An additional benefit of using an inferential modelling approach is that the averaged model can function as a composite indicator, by combining metric values to predict the prevalence of poor health. One notable drawback of the multi-model inference approach in general is that computational requirements increase exponentially with the number of predictor variables. While it is a more robust approach to variable selection than stepwise model building (Hegyi and Garamszegi, 2011), given the vast number of landscape metrics that exist it would not be feasible to test all of them in the framework used here. It is therefore essential that a considered approach to metric selection or reduction is used (Section 4.1).
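The construction of a plausible set from ΔAICc and nesting can be sketched as follows. This is an illustrative reading of the approach rather than the code used in the study: the ΔAICc threshold of 6, the rule of discarding a model when a simpler nested model has an equal or lower AICc, and all model terms and scores are assumptions made for the example. The last lines show why fitting all combinations becomes infeasible: k optional predictors give 2^k candidate models.

```python
def plausible_set(models, delta_threshold=6.0):
    """Filter candidate models to a plausible set.

    models: list of (frozenset_of_terms, aicc) pairs.
    Step 1 keeps models within delta_threshold AICc units of the best model
    (the threshold value is an assumption for this sketch).
    Step 2 drops any model that is a more complex version of a retained
    model with an equal or lower AICc (the nesting rule, as assumed here).
    """
    best = min(aicc for _, aicc in models)
    within = [(t, a) for t, a in models if a - best <= delta_threshold]
    kept = []
    for terms, aicc in within:
        # `other < terms` is True when `other` is a proper subset of `terms`
        dominated = any(
            other < terms and other_aicc <= aicc for other, other_aicc in within
        )
        if not dominated:
            kept.append((terms, aicc))
    return kept


# Hypothetical candidates (terms and AICc values invented for illustration)
candidates = [
    (frozenset({"water", "split"}), 100.0),
    (frozenset({"water"}), 101.2),
    (frozenset({"water", "split", "grass"}), 101.5),  # nested rule removes this
    (frozenset({"split"}), 103.0),
    (frozenset(), 108.0),  # outside the delta threshold
]
kept = plausible_set(candidates)

# Number of all-subsets models for k optional predictors
for k in (5, 10, 20):
    print(k, "predictors ->", 2 ** k, "candidate models")
```

Under these assumptions, the three-term model is dropped because the two-term model nested within it already has a lower AICc, so the extra term adds complexity without improving support.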

Landscape metrics as indicators of general health
We found that landscape metrics contribute to inferential models of small-area population general health, even when confounding variables with effect sizes orders of magnitude larger are included, and were able to identify particular metrics of importance (Table 3, Fig. 3). This is despite a small sample size with relatively high intra-area variation driven by demographic factors that are difficult to capture in cross-sectional data.
Of the landscape metrics included in this study, our results indicate strongest support for the importance of the greenspace splitting index and the proportion of water cover. A large greenspace splitting index, which results from green land covers being split into many patches with an even size distribution, is associated with higher levels of poor health. The splitting index is high along the river corridors to the north-west and north-east of the city centre, where greenspace was largely replaced by heavy industry in the past. It is also high in the city centre and areas to its immediate west, where population densities are highest, leaving little residual greenspace between residential developments. A large splitting index has previously been reported to be related to higher levels of urban noise (Han et al., 2018; Sakieh et al., 2017), but has not been tested in relation to any other mechanisms of benefit to human health. Fig. 4a shows the distribution of this metric across the study area. A low proportion of water cover (Fig. 4b) is associated with greater levels of self-reported bad health in an LSOA. The spatial distribution of this metric is partially dependent on topography, with natural rivers and ponds/lakes contributing, but it is notably lower on average in the city centre, where culverting, covering and filling of water bodies to make space for development is more common. Previous research has found positive relationships between water in landscapes and emotional, restoration and recreational benefits, and the presence of water plays a significant role in landscape preferences (Völker and Kistemann, 2011). Water cover was also positively associated with aesthetic preferences and biodiversity in previous landscape metric studies (Beninde et al., 2015; Franco et al., 2003; Palmer, 2004).
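To make the behaviour of the splitting index concrete, the sketch below uses a common formulation in which the index is the reciprocal of landscape coherence, i.e. of the probability that two randomly chosen points fall in the same patch. The exact formula and software used in the study are not given in this section, so the function and the patch areas below are illustrative assumptions only.

```python
def splitting_index(patch_areas, landscape_area):
    """Splitting index as the reciprocal of landscape coherence.

    Coherence is the sum over patches of (patch area / landscape area)^2,
    i.e. the chance that two random points lie in the same patch.
    This formulation is assumed for illustration.
    """
    coherence = sum((a / landscape_area) ** 2 for a in patch_areas)
    return 1.0 / coherence


landscape_area = 100.0  # hypothetical landscape of 100 ha, 40 ha of it green

one_large = splitting_index([40.0], landscape_area)                # 6.25
dominant = splitting_index([37.0, 1.0, 1.0, 1.0], landscape_area)  # about 7.29
even_small = splitting_index([10.0] * 4, landscape_area)           # 25.0

print(one_large, dominant, even_small)
```

With the same total green area, the index stays low when one patch dominates and rises sharply when the area is divided evenly among several patches, which matches the description of a large index arising from many patches with an even size distribution.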
There is moderate support for an additional five landscape metrics. A lower Shannon diversity index of tree habitats (Fig. 4c), and a greater proportion of grass cover (Fig. 4d), are associated with higher levels of poor health. These metrics show broadly opposite spatial distributions, with high tree diversity and low grass cover in the more affluent west and along the ex-industrial river corridors. These metrics were included due to their positive influence on biodiversity (Beninde et al., 2015). Grass cover was also strongly correlated with several metrics that were identified in the literature search but not included in statistical analysis: notably the percentage cover of vegetation types permitting open views, which is usually positively associated with aesthetic preferences (Dramstad et al., 2006; Palmer, 2004); and total green cover, which in previous studies has shown a negative relationship with air and noise pollution (Han et al., 2018; Sakieh et al., 2017; Wu et al., 2015). The present result of more grass cover being associated with worse health, which is contrary to these previous studies, may arise from grassed greenspaces in the study area often being relatively low quality amenity greenspaces of utilitarian design, compared to those with more shrub, tree or water cover. This is supported by the opposite spatial distributions of these two metrics, since higher tree diversity is more likely in areas with greater overall tree cover. In particular, a diversity of tree planting might indicate that a greenspace has been designed for aesthetic impact; and is likely also to correlate with greater biodiversity in other taxa (Beninde et al., 2015). These aspects of planting design may therefore impact on health via the psychological benefits of aesthetic and biodiversity values. The particularly high tree diversity along the north-east river corridor may be explained by the presence of green corridors running along the river banks, and by small areas of decorative planting outside of the large commercial properties now in this area.
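The Shannon diversity index referred to above can be illustrated with the standard formula H = -Σ p_i ln p_i, where p_i is the proportion of area in class i. The sketch below assumes that proportions are computed from the area of each tree habitat type within an area unit; the habitat areas are invented for the example, and the treatment of empty input as zero mirrors the imputation noted in the Fig. 4 caption only loosely.

```python
import math


def shannon_diversity(class_areas):
    """Shannon diversity index H = -sum(p * ln p) over classes with area > 0.

    class_areas: areas (any consistent unit) of each habitat class.
    Returns 0.0 when there is no area at all (an assumption of this sketch).
    """
    total = sum(class_areas)
    if total == 0:
        return 0.0
    proportions = [a / total for a in class_areas if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical tree habitat areas (ha) within two area units
single_type = shannon_diversity([12.0])                # one habitat type only
even_mix = shannon_diversity([3.0, 3.0, 3.0, 3.0])     # four types, equal shares
uneven_mix = shannon_diversity([9.0, 1.0, 1.0, 1.0])   # four types, one dominant

print(single_type, even_mix, uneven_mix)
```

The index is zero for a single habitat type, reaches ln(4) for four equally represented types, and falls between the two when one type dominates, so it responds to both the number of types and the evenness of their areas.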
The only land use metric that is supported is recreation land use patch density (Fig. 4e), which is positively associated with poor health, although the value of this metric is high in some of the more affluent areas of the city, such as in the west. This is again contrary to a previous study in which patch density is positively associated with physical activity (Kim et al., 2016). Our result may suggest that it is better to have fewer, but larger patches, rather than a high density of small patches. This idea is supported by the greenspace splitting index metric, which indicates an association between the presence of at least some large patches and less poor health.
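The contrast between fewer, larger patches and many small ones can be shown with a toy raster. The sketch below counts patches by flood fill and divides by landscape area; the 4-neighbour rule for deciding which cells belong to the same patch, the 1 ha cell size and both grids are assumptions made for illustration and do not reflect the data or software used in the study.

```python
def count_patches(grid):
    """Count patches of truthy cells using 4-neighbour connectivity (assumed)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                patches += 1
                seen[r][c] = True
                stack = [(r, c)]
                # Flood fill: mark every cell connected to this starting cell
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return patches


def patch_density(grid, cell_area_ha=1.0):
    """Number of patches per hectare of landscape."""
    landscape_area = len(grid) * len(grid[0]) * cell_area_ha
    return count_patches(grid) / landscape_area


# Two hypothetical 4 x 4 landscapes, each with four recreation cells (1 = recreation)
few_large = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
many_small = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
print(patch_density(few_large), patch_density(many_small))
```

Both landscapes contain the same recreation area, but the second has four times the patch density, which is the kind of difference the metric is intended to capture.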
The final two metrics supported by the analysis are greenspace mean contiguity index (Fig. 4f) and green/grey landscape shape index (Fig. 4g), both of which describe aspects of patch shape. The contiguity index is affected by patch size (with larger patches having higher values), but is more strongly influenced by patch shape complexity: patches with a more complex shape have lower contiguity. Shape index assesses shape complexity by focusing on the length of edge between different classes (in this case, green versus grey land covers), with more complex shapes having higher values. In both cases, compact, square-like shapes have least complexity, while complexly shaped patches that are interspersed amongst other land covers have high complexity. Poor health is associated with higher greenspace mean contiguity index and lower landscape shape index, i.e. simple patch shape with low interspersion. As would be expected, these two metrics also show broadly opposite spatial distributions. The city centre has greenspace with a simple shape, likely due to most of the greenspace in this area comprising parks, whereas other areas have more incidental greenspace e.g. small amenity areas and street greenery. In other areas, the spatial patterning of these metrics is complex and not easily explained in terms of density, deprivation or local history. Previous studies of noise pollution have found associations in the same directions for these metrics (Han et al., 2018; Sakieh et al., 2017).
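The dependence of shape index on edge length can be illustrated with a commonly used raster formulation, 0.25 × perimeter / √area, which equals 1 for a square and increases as shapes become more elongated or convoluted. The same expression applied to total edge length and total landscape area gives a landscape-level version. Whether the study used exactly this formulation is not stated in this section, so the functions and dimensions below are illustrative assumptions.

```python
import math


def shape_index(perimeter, area):
    """Patch shape index: 0.25 * perimeter / sqrt(area); 1.0 for a square."""
    return 0.25 * perimeter / math.sqrt(area)


def landscape_shape_index(total_edge, landscape_area):
    """Landscape-level analogue using total edge length and landscape area."""
    return 0.25 * total_edge / math.sqrt(landscape_area)


# Two hypothetical patches with the same area (100 square units)
compact = shape_index(perimeter=40.0, area=100.0)     # 10 x 10 square
elongated = shape_index(perimeter=104.0, area=100.0)  # 50 x 2 strip

print(compact, elongated)
```

The square patch scores 1.0 and the strip 2.6 despite identical areas, showing how a long green/grey boundary, and hence greater interspersion, raises the index.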
It is interesting to note that both composition metrics and configuration metrics have been highlighted as important. Both cover and diversity aspects of landscape composition are represented. Patch density and splitting index are, respectively, a simple and a more sophisticated metric of the aggregation aspect of configuration, while the landscape shape index and contiguity index describe the shape. This confirms the importance of selecting metrics that indicate a diversity of aspects of landscape pattern (Cushman et al., 2008).
Taken in combination, the configurational metrics indicate that having greenspace well interspersed with grey land covers, and large patches of greenspace, is associated with reduced rates of poor health at the LSOA scale. A high level of interspersion means that more people are likely to have easy access to a greenspace. This is important as greenspace use falls drastically with distance to greenspace, and physical use of greenspaces (as opposed to passively experiencing nearby greenspaces) is likely to provide the majority of health benefits (Lee et al., 2015; Schipperijn et al., 2010). Large greenspaces also tend to be associated with greater benefits, possibly again mediated by how they are used (Lee et al., 2015). The compositional metrics additionally indicate that the land covers within greenspaces are also important, with the presence of water, a diversity of tree planting, and a smaller area of grass cover being associated with less poor health. As discussed above, exposure to "bluespace" is known to have health benefits via opportunities for recreation and psychological restoration, and is also aesthetically valuable in a range of contexts (Franco et al., 2003; Palmer, 2004; Völker and Kistemann, 2011). Urban trees are also widely accepted to be important to aesthetics and biodiversity, and to contribute to air and (perceived and actual) noise pollution mitigation (Beninde et al., 2015; Forestry Commission England, 2010; World Health Organization, 2016). The finding that more grass cover is associated with more poor health may simply reflect poor quality of greenspaces in these areas: greenspace quality, which includes quality of planting design and vegetation management, is at least as important as greenspace quantity with regards to the likelihood of its use (Lee et al., 2015).

Future directions
The results of this study highlight that there is value in using landscape metrics in studies of benefits to human health from urban greenspace. There are several lines of research that would strengthen the utility of this approach. First, although LSOAs enable analysis of health geographies at a relatively fine scale, and are drawn to capture homogeneous areas (Department for Communities and Local Government, 2015), associations identified at population level may not hold at the individual level. The effects of greenspace exposure further from home (e.g. at work, or while commuting) and cumulative exposure from previous homes or temporal changes to local land cover/use cannot be captured. There may also be residual confounding that we have not captured in our controlling variables. Using larger or smaller census geographies may find different results; our decision to use LSOAs reflects a balance between averaging out statistical fluctuations at smaller scales, and avoiding averaging out genuine patterns at larger scales.
Moreover, it is also not possible to infer causation from cross-sectional studies; indeed, establishing causal links between greenspace and health is an on-going challenge, as the associations are complex (Lee and Maheswaran, 2011). This is important in light of the results that suggest that the presence of water, diverse tree planting, and large greenspace patches are associated with low levels of poor health. Previous studies of different areas have found that nearby water landscape features, trees, parks and other greenspaces are associated with more expensive housing (Conway et al., 2010; Escobedo et al., 2015; GLA Economics, 2003; Luttik, 2000), and income is strongly associated with health (Mitchell and Popham, 2008). Moreover, attempts to reduce socioeconomic health inequalities by improving greenspace infrastructure may prove counter-productive if housing becomes unaffordable (Anguelovski et al., 2018).
If causation could be established, one might synthesise compositional and configurational recommendations from these results as follows. In terms of landscape composition, reduced poor health might be achieved through increasing water cover, increasing the diversity of tree planting, and reducing grass cover. While this is a simple planning recommendation in the sense that it is easily interpretable, it is not necessarily straightforward to create water features in a landscape where none exist. Nevertheless, our finding supports the well-known value of water and trees in urban landscapes (Forestry Commission England, 2010; Völker and Kistemann, 2011). Further, this combination of recommendations might enable reduction of poor health without an overall increase in the amount of greenspace.
Recommendations for the configurational aspect of greenspace centre around the greenspace splitting index, recreation land use patch density, greenspace contiguity index and green/grey landscape shape index. These metrics are more difficult to translate into clear guidance, as there are multiple ways of achieving the same metric value (e.g. splitting index can be reduced by having fewer patches or greater patch dominance), not all of which are consistent with the general findings that more greenspace is better (Maas et al., 2009; Mitchell and Popham, 2008). Further exploration would be required to establish threshold values that may contribute to health, and whether particular subtypes of land cover are especially important. Nevertheless, the literal recommendation from our analysis would be that having fewer, larger patches (of greenspace generally, and of recreation land uses) instead of many small ones, and designing a high level of interspersion of green and grey land covers, would promote reduction of poor health.

Conclusions
Landscape metrics have potential for describing and analysing the aspects of urban greenspace that have benefits to human health. One of the key challenges in landscape metric studies is the selection of a parsimonious suite of metrics. We had success in identifying a set through a literature review followed by removal of theoretically or empirically redundant metrics, yielding similar results to more sophisticated approaches used in other studies, such as ordination or clustering.
Despite the small effect sizes of landscape metrics compared to other demographic and environmental variables, we were able to use multi-model inference to identify which of our selected suite of metrics were associated with self-reported general health at the LSOA scale. Although the nature of our method does not allow demonstration of causation, our results support the well-established findings that water cover and trees (specifically diversity of trees) are important for the well-being of urban residents, and also indicate that large patches of greenspace that are interspersed with the surrounding matrix of built infrastructure are associated with lower levels of poor health. Nevertheless, while landscape metrics are a simple method for capturing details of landscape composition and configuration, care must be taken in interpretation and explanation.