Demystifying Normalized Difference Vegetation Index (NDVI) for Greenness Exposure Assessments and Policy Interventions in Urban Greening

Most nature and health research use the normalized difference vegetation index (NDVI) for measuring greenness exposure. However, little is known about what NDVI measures in terms of vegetation types (e.g., grass, canopy coverage) within certain analysis zones (e.g., 500m buffer). Additionally, more exploration is needed to understand how to interpret changes in NDVI (e.g., per 0.1 increments) for policy intervention in urban greening. This study aims to address such gaps in the literature. We, therefore, measured mean values of NDVI and other vegetation metrics for different buffer zones (100, 300, and 500 m) at sample locations within the Greater Manchester study area by applying focal statistics. For each scale, we fitted linear and nonlinear (i.e., GAM) regression models to explore (1) what vegetation types and amounts best predict the NDVI values in a multivariate model and (2) how changes in NDVI explain changes in different vegetation coverages in univariate models. We found that NDVI is most sensitive to tree canopy at 300 meters scale and that changes at different levels of vegetation values and types predict dissimilar changes in NDVI. The models with three vegetation types can explain approximately 80% variance in NDVI values (i.e., Pseudo R²) for buffer zones of 300 and 500 meters. We observed that forb and shrub density is most sensitive (12.21% change) to mid-range increments in mean NDVI (0.45 to 0.55) within 300 meters. These sensitivities usually follow nonlinear patterns for all the vegetation types in multivariate and univariate models. Our results indicate that NDVI values are more sensitive to certain types and amounts of vegetation within various buffer zones. Overall, interpreting changes in NDVI values for urban greening interventions would require careful evaluation of the relative changes in types and quantities of vegetation for different buffer zones.


Data processing
Step 1: focal statistics analysis Step 2: development of greenness exposure maps Step 3: random location sampling

Introduction
New research into healthy lifestyles has led to numerous studies investigating possible links between the physical landscape and human health, especially in urban contexts.Multiple authors researching these links indicate that natural environments are salutogenic and benefit both physical and mental health (Bratman et al., 2019;James et al., 2015;Markevych et al., 2017;Twohig-Bennett & Jones, 2018).Furthermore, the quality of the available, accessible, and visible natural environment in urban areas (composed of green and blue spaces) is linked to overall human health (Labib, 2019;Zhang et al., 2022).Here, availability refers to the amount of physical greenspace.Accessibility is defined as the spatial proximity of greenspace to locations of interest.Finally, visibility refers to the amount of greenness that can be visually seen from a particular area of interest.Each of these categories is related to one or more different 'pathways' through which greenspace influences health, such as physical activity, stress reduction, heat mitigation, and social cohesion (Labib et al., 2020a(Labib et al., , 2021;;Markevych et al., 2017;Zhang et al., 2022).
Many environmental health studies rely on greenness data (e.g., NDVI), which measures relative abundance and spatial distribution of vegetation, to define greenness exposure.These measures of greenness are usually assessed with remote sensing-based spectral indices that identify vegetation using reflectance measurements from instruments on satellites and planes (Jimenez et al., 2022).The Normalized Difference Vegetation Index (NDVI) is the simplest and most commonly used objective measure of vegetation density and is often used in measuring greenness exposure in urban settings for environmental health studies (Jimenez et al., 2022;C. Reid et al., 2018;Rhew et al., 2011;Rugel et al., 2017).NDVI applications have increased due to the availability of satellite data over long periods and geographic extents at increasingly higher spatial and temporal resolution (Jimenez et al., 2022).Spatial scale is a fundamental concept in geographical analysis to assess greenness exposure that refers to the selection of units used to represent the Area of Interest (AOI).The spatial scales commonly used in the literature are the body/personal scale, neighborhood scale, and city/district scale (Labib et al., 2020a).The body scale captures the immediate surrounding of the human body (e.g., 10-100 m).The neighborhood scale defines where most people spend large amounts of their time (e.g., 500 m).Finally, the city or district scale captures a wider geographic area, comprising several neighborhoods.Out of these scales, the urban neighborhood scale is the most commonly used spatial scale (Labib et al., 2020a).Most research investigating the health-greenness relationship uses a point of interest (e.g., a residential building) and tests the sensitivity of greenness exposure at multiple buffer distances (Browning & Lee, 2017;Su et al., 2019).The most frequent length used for analysis is 301-600 m (Labib et al., 2020a).Additionally, greenness exposure assessment is sensitive to the spatial resolution of NDVI (Jiang et al., 2005;Jimenez et al., 2022;Su et al., 2019).In the literature, the spatial resolution of the remotely sensed satellite images used to study the health-greenspace relationship varies from higher resolutions, like the National Agricultural Imagery Program (1 m²), Sentinel-2 (10 m²), or Landsat (30 m²), to coarser ones like MODIS (250 m²) or AVHRR (1 km²) (Jimenez et al., 2022;C. Reid et al., 2018;Rhew et al., 2011;Rugel et al., 2017).
Although existing literature mainly assesses greenness exposure using NDVI, it investigates the health-greenspace relations at multiple spatial scales, with different buffer distances and spatial resolutions.However, to the best of our knowledge, there is a lack of research aiming to understand what NDVI measures on the ground in terms of varying vegetation types and amounts of vegetation at different spatial scales.NDVI is calculated based on land surface reflectance of visible red and near-infrared wavelengths, as vegetation absorbs more energy in the red wavelength and reflects more infrared radiation than non-vegetated surfaces (Huang et al., 2021;C. Reid et al., 2018;Rouse, 1974).
While NDVI is very useful in understanding vegetation presence and health, one crucial problem that still remains is revealing what type of vegetated and non-vegetated surfaces NDVI values measure.
For instance, NDVI values do not differentiate the reflectance between trees and grass, which results in an incomplete understanding of what type of vegetation within the AOI is responsible for higher or lower values of NDVI and the effects on human health.Therefore, scale effects might disguise relationships between greenspace metrics and human health.Other studies have shown that greenspace metrics exhibit spatial scale sensitivities, and NDVI values may be driven by different types of vegetated or non-vegetated surfaces at various scales (Jiang et al., 2005;Labib et al., 2020b).
The effect of these metrics on human health is also expected to change in strength and significance as the spatial scale changes (Labib et al., 2020b).Therefore, it is critical to research what types of vegetation (e.g., trees, grass) and which spatial scale of analysis mainly constitute the mean or median NDVI values observed within the buffer distances or AOIs.
Additionally, a large amount of NDVI and health research often indicates how health is influenced by specific unit changes in NDVI (Barboza et al., 2021;P. James et al., 2016;Twohig-Bennett & Jones, 2018).Rojas-Rueda et al. (2019) noted that an increment of 0.1 NDVI within a buffer of 500 m or less is associated with lower all-cause mortality.However, little has been explored about how to interpret changes in NDVI (e.g., per 0.1 increments) for policy intervention in urban greening.In particular, what type and amount of vegetation can contribute to such changes and whether these changes follow a linear or nonlinear trend over different NDVI value ranges and buffer distances.Such limited understanding has caused inadequate translation of NDVI-based health metrics into usable greening policies and practices.Subsequently, NDVI has remained a mysterious metric of greenness exposure assessment for health-related studies both for researchers and practitioners.
Considering such gaps in the literature, this study aims to demystify NDVI values using multiple greenness metrics at multiple spatial scales.The objectives and aim of this study are: (1) to investigate the sensitivity of NDVI values to various vegetation types at multiple scales and (2) to explore how specific value increments of NDVI reflect changes in vegetation types and total greenness percentage.The rest of the paper is structured in 5 parts: literature review in section 2, data and methods in section 3, results in section 4, discussion in section 5, and conclusion in section 6.

Literature Review
An increasing body of literature investigates the relationships between green space exposure and human health.The natural environment is associated with better self-perceived general and mental health, increased well-being, and lower mental distress (Astell-Burt et al., 2013;P. James et al., 2015;Nieuwenhuijsen et al., 2017;Richardson et al., 2013;Stigsdotter et al., 2010;Triguero-Mas et al., 2015;van den Berg et al., 2015van den Berg et al., , 2016;;White et al., 2013).Similarly, green space is associated with higher levels of physical activity and vitality, a less sedentary lifestyle, and lower body weight and all-cause and cardiovascular mortality levels (Astell-Burt et al., 2013;James et al., 2015;Nieuwenhuijsen et al., 2017;van den Berg et al., 2016).Nevertheless, some studies found weak or no evidence between physical health (e.g., body weight), mental health, and greenness exposure in urban areas (Lee & Maheswaran, 2011;Richardson et al., 2013).The potential pathways that link greenness to health include promoting physical activity, social contact, and attention restoration, as well as mitigating air pollution and reducing stress, noise, and heat exposure (Bowler et al., 2010;James et al., 2015;Markevych et al., 2017;Nieuwenhuijsen et al., 2017;Richardson et al., 2013).Moreover, there is growing support in the environmental health field for further epidemiological research to develop policy recommendations for urban greening that consider the overall human health benefits of greenness exposure (Bowler et al., 2010;Lee & Maheswaran, 2011;Markevych et al., 2017;Nieuwenhuijsen et al., 2017;White et al., 2013).
The present study is interested in understanding how other researchers assess greenness exposure in environmental health literature.The key components (e.g., spatial dimensions) found to assess greenness exposure in the literature are: the spatial scale (i.e., buffer distance) of the analysis, the greenness metrics used to represent greenspace surfaces (e.g., NDVI), and the spatial resolution of the remote-sensed satellite images (e.g., 30 m²) used to derive such metrics (Labib et al., 2020a;Su et al., 2019).Therefore, analyzing greenness exposure within buffer distances between 30 and 5000 meters is the standard approach used in the most recent literature (Browning & Lee, 2017;Labib et al., 2020a;Su et al., 2019).The preferred unit of analysis or AOI is the neighborhood scale which is often described as an ego-centered neighborhood (i.e., defined around individuals' residences) (Chaix et al., 2009;Labib et al., 2020a;Su et al., 2019).Mainly land use and land cover maps that include vegetation types (e.g., tree canopy and grass) and vegetation indices (e.g., NDVI) derived from satellite images are used to represent greenspace surfaces (Cusack et al., 2017;Kardan et al., 2015;Labib et al., 2020a;C. E. Reid et al., 2017).Finally, the spatial resolution of the remotely sensed images usually varies between higher and coarser ones (e.g., 1 m² vs 250 m²).But higher spatial resolutions, if available, are preferred since they improve greenness exposure assessment (Jimenez et al., 2022;C. Reid et al., 2018;Su et al., 2019).
When assessing greenness exposure in environmental health research, the associations between the natural environment and health are sensitive to the spatial resolution and the spatial scale used in the analysis (Jimenez et al., 2022;Labib et al., 2020b).The spatial resolution of satellite images (e.g., 5 m², 30 m², and 250 m²) directly affects the identification of green spaces (Jimenez et al., 2022).Higher resolutions can better identify the spatial heterogeneity of urban areas by capturing smaller vegetated surfaces (e.g., private gardens), whereas coarser ones might fail to classify such small urban green spaces (Jimenez et al., 2022).This phenomenon is known as greenness exposure misclassification (Jiang et al., 2005;Jimenez et al., 2022;Su et al., 2019), which impacts the relationships between greenness and health (Tsai et al., 2019).Furthermore, greenness exposure estimates tend to be greater for lower resolutions, while vegetation distributions tend to be similar across spatial resolutions (Jimenez et al., 2022;Su et al., 2019).On the other hand, the spatial scale or buffer distance used to assess greenness exposure within the AOI influences the strength and significance of the associations between green spaces and health, where greater buffer distances are associated with better health outcomes (Browning & Lee, 2017;Helbich, 2019;Su et al., 2019).This influence is also known as scale effects (Labib et al., 2020b).Additionally, Jiang et al. (2005) suggest that NDVI at different scales may not be comparable.
Furthermore, there is growing literature trying to understand NDVI as a measure of greenness exposure assessment for policy intervention (Gascon et al., 2016;Helbich, 2019;Labib et al., 2020b;Leslie et al., 2010;Rhew et al., 2011;Rugel et al., 2017).Rugel et al. (2017) proposed a novel approach (Natural Space Index) that includes the presence, form, accessibility, and quality of green spaces, and criticized that purely objective vegetation measures like NDVI might misclassify greenness exposure.Labib et al. (2020b) developed a combined multi-scale NDVI 'exposure index' mapping technique that is less vulnerable to scale effects than traditional approaches.Helbich (2019) performed a time series analysis of NDVI and suggested that NDVI and epidemiological data should be temporally aligned to diminish contextual uncertainties in environmental exposure assessments.
Another study suggested a lack of agreement between the perceived and objective measures of greenness (Leslie et al., 2010).On the other hand, Rhew et al. (2011) found that psychologists' ratings of greenness and NDVI were highly correlated within a buffer distance of 100 meters and promoted NDVI as a reasonable measure of neighborhood greenness.Moreover, Gascon et al. (2016) analyzed the relationship between built environment characteristics and NDVI and suggested NDVI as a useful greenness metric depending on the study area.
However, this research again introduced new composite measures (Labib et al., 2021;Rugel et al., 2017), which are difficult to translate into policy and practice action for urban greening.These metrics still fail to indicate what type of vegetation and what amount should be needed at which spatial scale.The existing environmental literature recommends assessing greenness exposure at multiple spatial scales using remotely sensed satellite images with higher spatial resolutions (e.g., 10 m²) to derive greenness metrics (e.g., NDVI) and exploring modeling relationships between greenness metrics beyond linear regression models.These suggestions will be further critically investigated in this study.

Case study area
In the present study, we explored the different vegetation types and the amounts which explain the normalized difference vegetation index (NDVI).We, therefore, applied a novel approach to the Greater Manchester area.Our area of interest is northwest of England, which covers a surface of about 1276 km² and has a population of approximately 2.8 million people.The metropolitan county was created after the industrial revolution in 1974 and became a city region in 2011 (Dennis et al., 2018).The Greater Manchester area interests researchers studying the relationship between greenness exposure and overall human health outcomes in urban areas given its spatial heterogeneity, accounting for the diversity of built environment and green and blue spaces over the whole county (Dennis et al., 2018;Labib et al., 2021).Moreover, the variety of green spaces and vegetation types (e.g., urban parks, private gardens, pastoral farmland, tree canopy, forbs and shrubs, and grass) present in the region promotes more specific research questions enhancing the quality of greenness and health studies (Labib et al., 2021).

Data details
The current study considers five greenspace metrics: NDVI, overall greenspace, tree canopy, forbs and shrubs, and grass densities (Table 1).The continuous variable NDVI used in this analysis was collected from Labib et al. (2020b).The index was derived from remote-sensed satellite images from Sentinel-2A at 10 m² resolution obtained in 2018 with cloud cover below 2%.NDVI is calculated from reflectance measurements in the red and the near-infrared bands and ranges from -1 to 1, where lower values indicate the absence of vegetation (e.g., water) and higher ones the abundance of healthy vegetation (C.Reid et al., 2018).The remaining LULC-based metrics (i.e., greenspace, tree canopy, shrubs, and grass) were obtained from the study by Dennis et al. (2018).These measures were defined from a supervised learning exercise classifying urban, water, tree canopy, forbs and shrubs, and grass categories at 10 m² resolution.Furthermore, greenspace (binary variable) was extracted from the same land use and land cover map, where 1 indicates a pixel belonging to a 'green' category (e.g., tree canopy, forbs and shrubs, and grass) and 0 otherwise.

Table 1
Name, definition, spatial resolution, and variable type for the five vegetation metrics considered in demystifying NDVI.Greenspace (all together) Aggregates the presence of tree canopy, forbs and shrubs, and grass as green surfaces.

m² Categorical and binary variable taking 1 if an image
pixel belongs to a green category (i.e., tree canopy, forbs and shrubs, and grass), and 0 otherwise.

Tree Canopy
Measures image pixels with tree cover.Trees are large woody plants having secondary branches supported on the main trunk.
10 m² Categorical and binary variable taking 1 if an image pixel is mostly covered by trees, and 0 otherwise.

Forbs and Shrubs
Measures image pixels with forbs and shrubs cover.
Forbs are broad-leafed plants with showy flowers.
Shrubs are small woody plants with multiple steams.
10 m² Categorical and binary variable taking 1 if an image pixel is mostly covered by forbs and shrubs, and 0 otherwise.

Grass
Measures image pixels with grass cover.The grass is a plant with slender leaves and inconspicuous flowers.
10 m² Categorical and binary variable taking 1 if an image pixel is mostly covered by grass, and 0 otherwise.

Data processing
This study presents a new approach to demystify the normalized difference vegetation index for greenness exposure assessments and policy interventions in urban greening.The chosen methodology starts by collecting remote-sensed satellite imagery (Fig. 1a) for five greenness metrics (Table 1), then assesses greenness exposure at multiple spatial scales (100 m, 300 m, and 500 m) and produces exposure maps for each scale (Fig. 1b, c, d).We then randomly sample point locations on the map within the Greater Manchester area boundaries and extract raster cell values at those locations (Fig. 2).Additionally, we scaled all features performing min-max normalization.These tasks were performed using the following steps: Step 1: focal statistics analysis A "gliding box" algorithm was used to perform focal statistics at multiple spatial scales over remote-sensed satellite images to produce exposure maps (Labib et al., 2020b).The algorithm starts by fitting a square box of size 'r x r' (i.e., filter) on the top-left corner of the input image (e.g., a remote-sensed image such as NDVI).The size 'r' of the filter is adjusted regarding the three spatial scales used in this study and accounts for the number of pixels.In our case, r = 21, r = 61, and r = 101 were used to represent neighborhoods of 100 m, 300 m, and 500 m, respectively (i.e., for 10 m² input images) (Fig. 2).Focal statistics is about the operating sum or range on a smaller area (i.e., filter) within an input image.In this case, we average the pixel values covered by the size of the filter in the input map and output the result into a single pixel of a newly created image.The same operation is performed by moving one pixel to the right in the input image, which acts as the central pixel of the square box or neighborhood of size 'r x r' until every pixel at the output image is filled.The fast Fourier transform convolution (i.e., fftconvolve) from the SciPy Python library was used to convolve the neighborhood filter over the whole input image while assuming zero-padded boundaries.

3.3.2.
Step 2: development of greenness exposure maps After performing the "gliding box" operation over the whole input image, a new output image is produced that includes averaged values for each spatial scale and greenness metric (Fig. 1b, c, d).A total of 15 output images (exposure maps) were produced for later use.Each output layer consisting of vegetation mean scores represents the greenness exposure at multiple buffer or neighborhood distances.Neighborhood or exposure distances are selected based on known pathways between vegetation and mental or physical health (Labib et al., 2020a;Markevych et al., 2017).

Step 3: random location sampling
Once we developed the exposure maps, we wanted to get the data distribution of the mean vegetation values to discover the relationships between the different metrics.At first, the dimensions of the output layers were 7443 x 5987 pixels and accounted for 44,562,241 pixels.If we considered all existing data points (i.e., the centroid of each pixel), we would have too much data to perform subsequent analysis.So, to avoid an excess of computational load, we decided to sample the original data at random locations within the boundaries of the study area.We performed random sampling guaranteeing a minimum Euclidean distance between each location equal to the buffer distance of the spatial scale (e.g., 300 m), ensuring a complete data representation (Fig. 2).Given the minimum distance constraint, we randomly sampled 10,000 data points for spatial scales of 100 m and 300 m and 5,000 data points for 500 m.

4: extraction of raster values
After randomly generating data points within the specified boundaries, we extracted the vegetation values from each raster image/exposure map at those locations using a simple sampling operation.First, we performed a merge operation between the randomly generated points (i.e., locations) and the boundaries of the Greater Manchester area to ensure we kept only the vegetation values within our area of interest.After merging both data frames, 6,000 data points remained for buffer distances of 100 m and 300 m and almost 3,000 locations at 500 m.Then, we sampled the raster images (i.e., exposure maps) at every point location within the Greater Manchester area and stored them for further modeling (Fig. 2).The same method was applied for every spatial scale and vegetation metric (Fig. 2).After completing all required steps to process the five greenness metrics, we obtained three data sets (i.e., one per spatial scale), including the averaged values for NDVI, percent of green spaces, tree canopy, forbs and shrubs, and grass densities, and the exact point location coordinates (Table 2).These data are used in modeling the relationships between varying metrics and NDVI (Fig. 2).

Table 2
Data frame (i.e., first 5 rows) describing averaged values for NDVI, percent of green spaces, tree canopy, forbs and shrubs, and grass densities and location coordinates for a buffer zone of 100 meters.

Model setup
We used Generalized Additive Models (GAMs) to discover relationships between vegetation types and NDVI.GAMs were preferred over simpler linear regression models because they allow specifying predictors as nonlinear terms (e.g., spline), as existing research does not indicate if the relationships between NDVI and other metrics are linear or nonlinear.GAMs are flexible and can identify both linear and nonlinear trends in relationships between the dependent and independent variables.GAMs are semi-parametric models adding linearly smooth functions and take the following form (T. J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006): Where X₁, X₂, …, X are the predictors, y is the response variable, and g is the link function that relates our dependent variables to the expected value of the independent variable.The smoothing functions ₁, …,  allow the user to model nonlinear relationships efficiently without fitting all possible transformations.GAMs benefit from the additive characteristic of the generalized linear models while adding nonlinear functions (T.J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006).The model has good interpretability of individual predictors on the response, is flexible, and avoids overfitting (Wood, 2006).
In the present study, the GAMs are operationalized using Python's pyGAM library.GAMs are built upon three main components: the distribution of the response variable, link function, and functional form (T. J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006).The normal distribution of the response variable is used since it best fits NDVI data.The link function selected was the identity.Other link functions (e.g., logit, inverse) were allowed in the model, but the identity suited linear regression purposes and dependent variables following a normal distribution.Finally, the functional form specifies whether the predictors are linear, spline, factor, or tensor (T.J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006).To discover the appropriate form of the dependent variables, we modeled a set of possible combinations: 1) including only linear terms, 2) model with splines, and 3) models with linear and spline terms (Table 1 and 2, Appendix).All the corresponding analyses and reproducible workflow were done in Python and can be found here: https://github.com/alexdelai/NDVI-dimist.
Furthermore, GAMs have two additional hyperparameters: regularization penalty and smoothness (Table 1 and 2, Appendix).Lambda controls the strength of the regularization penalty or rate with larger values enforcing higher smoothing (T.J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006).The ideal lambda will not be too high to avoid underfitting, nor too low as it might overfit the data.Smoothness is determined by the number of splines used to fit each predictor.The model allows choosing between p-splines, default, or cyclic p-splines.Here, the term p-spline stands for penalized basis splines where coefficients are partly determined by the data to be fitted and partly by an additional penalty function reducing overfitting.A spline function of order n is a piecewise polynomial function of degree n-1 in a variable x (T.J. Hastie, 2017; T. Hastie & Tibshirani, 1987;Wood, 2006).

3.5.
Model calibration and validation

Generalized Cross-Validation
Generalized Cross-Validation (GCV) was evaluated to select the best model among all possible GAMs combinations (Table 1 and 2, Appendix).GCV is a model validation technique for assessing how well the results of a statistical analysis will generalize to an independent data set, which means how well a model will predict unseen data (Bottegal & Pillonetto, 2018;Golub et al., 1979;Wahba, 1985).The validation provides an unbiased estimate of the mean square prediction error (MSE) and is considered an approximation to the leave-one-out cross-validation (Bottegal & Pillonetto, 2018;Golub et al., 1979;Wahba, 1985).Furthermore, this is one of the main approaches used to estimate parameters in the context of regularization techniques, determining the smoothness parameters in splines (Bottegal & Pillonetto, 2018;Golub et al., 1979;Wahba, 1985).The chosen tuning parameters were the ones giving the lowest Generalized Cross-Validation score.

Effective Degrees of Freedom
Effective Degrees of Freedom (DoF) were assessed for each model to obtain the simplest model (Table 1 and 2, Appendix).DoF shows the number of values in the final calculation of a statistic that are free to vary.The main concern is that the higher degrees of freedom a model has, the higher probability a model will overfit the training dataset, which can be overcome using regularization techniques (T.Hastie et al., 2009;G. James et al., 2021).The DoF statistic accounts for how many parameters are estimated by the model measuring its complexity.For two models having the same GCV score, we chose the one with lower DoF as the best (i.e., simple over complex models) because it would better generalize future data.

Other statistics and model tuning
Additionally, Akaike Information Criteria (AIC) and Pseudo R-Squared were assessed to select the best models (Table 1 and 2, Appendix).AIC is an estimator of the prediction error and relative quality of statistical models for a given dataset.The criteria assess the amount of information lost by a given model, where the preferred model is the one with the minimum AIC value.Pseudo R-squared is a measure to evaluate the goodness of fit of nonlinear regression models.A pseudo-R² only has meaning when compared to another pseudo-R² of the same type, on the same data, and predicting the same outcome.Higher Pseudo R-Squared scores were preferred over lower ones.
Finally, we tuned the model parameters (e.g., lambda and number of splines) by performing a grid search over the space of estimated parameters using the corresponding function in the pyGAM library (servén et al., 2018).We need to tune the model parameters to test different combinations of parameters and obtain the best possible model in terms of GCV.Grid search produced better GCV and DoF scores compared to more basic techniques (e.g., fit).

Descriptive statistics
In Table 3, averaged NDVI, percent of green spaces, tree canopy, forbs and shrubs, and grass densities for different buffer zones are described in terms of mean, median, standard deviation, skewness, and kurtosis.From Table 3, we observe that mean NDVI is higher at 100 meters compared to other buffer distances and that forbs and shrubs density is highly skewed at 100 and 300 meters.

Multivariate analysis
The main goal of this study was to investigate what type (e.g., tree canopy, grass) and amount of vegetation can explain the mean NDVI values estimated within certain analysis zones (e.g., 300 m).
To explore these, we fitted linear and nonlinear (i.e., GAM) models ( dependence plots (Fig. 3) highlight the effect of each vegetation type on the relative changes in mean NDVI while holding other variables constant.Furthermore, we found that grass density has the highest predictive power on NDVI at this spatial scale (Fig. 3a).Tree canopy higher than 0.4 to 0.5 positively correlates with values higher than mean NDVI, while lower densities correlate with values lower than mean NDVI (Fig. 3a).Similarly, forbs and shrubs percentages higher than 0.3 positively correlate with values higher than mean NDVI (Fig. 3a).
For buffer zones of 300 meters, the best model explaining mean NDVI was a completely nonlinear function of the mean tree canopy, forbs and shrubs, and grass densities (Table 1, Appendix).
The chosen model scored the lowest GCV among all possible models (  1, Appendix) that was statistically significant, where higher predictor values were correlated with higher NDVI values (Fig. 3b).In Fig. 3b, the partial dependence plots showed that all metrics covaried with above-mean NDVI values for scores higher than 0.3 to 0.4, approximately.Tree canopy showed the highest predictive power, predicting NDVI values almost 0.5 times above the mean for tree canopy scores close to 1.0 (Fig. 3b).The predictive power of shrubs and grass percentages increases steadily until 0.7, where the curve flattens (Fig. 3b).These changes in observed values of NDVI based on different vegetation types indicate that at the 300 m scale of analysis NDVI is more sensitive to canopy coverage than to grass and shrubs densities (Fig. 3b).
Again at 500 meters buffer distance, the best function explaining NDVI was a nonlinear model, which scored the lowest GCV statistic among the tested relationships (Table 1, Appendix).
Here, tree canopy, forbs and shrubs, and grass densities were defined as additive spline terms and were found statistically significant (Table 1, Appendix).Therefore, the three vegetation types explain NDVI nonlinearly at a spatial scale of 500 meters (Table 1, Appendix).Again, the tree canopy showed the highest explanatory power and peaked slightly above 0.4 times higher than the mean NDVI (Fig. 3c).Forbs and shrubs' density of around 0.7 explained about 20% of the above-mean NDVI values (Fig. 3c).Higher values of grass percentage explained almost 20% higher than the mean NDVI (Fig. 3c).The model's Pseudo R² was equal to 0.8039 (Table 1, Appendix).All metrics explained above mean NDVI values for scores higher than 0.3 (Fig. 3c).Finally, the models analyzing smaller buffer zones showed lower prediction errors on new data than those measuring large distances (Table 1, Appendix).

Univariate analysis
Additionally, this study explores how to interpret changes in NDVI (e.g., per 0.1 increments) for policy intervention in urban greening.We fitted linear and nonlinear regression models to investigate how changes in NDVI predict changes in vegetation coverage in univariate models.Within 100 meter buffer zones, the best models were nonlinear for greenspace and vegetation types (Table 2, Appendix).NDVI correlated with the above mean values of greenspace at 0.5, reaching a maximum of 0.6 and a minimum of -0.4 relative to the mean greenspace percentage (Fig. 4a).Our independent variable (NDVI) predicted the above mean values of tree canopy at 0.4, and tree canopy sensitivity peaked at around 0.8 of NDVI (Fig. 4b).NDVI values higher than 0.5 explained densities above the mean for forbs and shrubs, where shrubs density reached a maximum of 0.3 times above its median for 0.9 NDVI (Fig. 4c).Finally, the above mean values of grass percentage were predicted for NDVI higher than 0.6, peaking at NDVI values close to 1 (Fig. 4d).To explain univariate greenness exposure at a buffer distance of 300 meters, we found that the best functions modeling the relationship between every vegetation type and NDVI were statistically significant and nonlinear (Table 2, Appendix).NDVI predicted the above mean values for all green metrics starting from the interval 0.3 to 0.5 (Fig. 4).Besides greenspace, the strongest correlations were found for forbs and shrubs and grass percentages at around 0.3 for high NDVI scores (Fig. 4c, d).Our independent variable explained 0.1 times above the mean of tree canopy density around 0.7 NDVI to decline later (Fig. 4b).To explain univariate greenness exposure at a spatial scale of 500 meters, the preferred models for greenspace, tree canopy and shrubs demonstrated a nonlinear relationship, whereas the best model explaining grass followed a linear pattern (Table 2, Appendix).Statistically significant coefficients were obtained for every relation.NDVI values between 0.3 and 0.5 predicted scores above the mean for all vegetation types (Fig. 4).Besides overall greenspace, the highest sensitivity of a vegetation type to NDVI was found for grass percentage, where NDVI values close to 1 predicted grass percentage 0.5 times above its mean (Fig. 4d).NDVI explained 0.5 above the mean value for greenspace, 0.15 for tree canopy, and 0.25 for forbs and shrubs percent (Fig. 4a, b, c).The predictive power of the tree canopy declines after 0.8 NDVI (Fig. 4b).The overall univariate analyses illustrated in Fig. 4 show that changes in NDVI values in different ranges can have both nonlinear and linear changes in vegetation type and amount of vegetation for different spatial scales.In particular, within the range of lowest (i.e., 0 to 0.1) and highest (0.9 to 1) NDVI values there are more widespread changes in values of individual vegetation metrics relative to changes in NDVI (Fig. 4).Additionally, canopy coverage and shrubs showed higher variation in nonlinear changes than overall greenspace and grass percentages (Fig. 4).
Based on the observed modeled relationships in the univariate models, we summarized how specific value increments of NDVI (e.g., low, mid, and high) reflect changes in vegetation types and total greenspace percentage (Table 4).In order to obtain the prediction sensitivity of different vegetation coverage, we designed 0.1 increments at low (i.e., from 0.25 to 0.35), mid (i.e., 0.45 to 0.55), and high (i.e., 0.65 to 0.75) values of NDVI.From Table 4, we observe that forbs and shrubs density consistently showed the greatest change (11.1%, 12.21%, and 10.37% changes, respectively) related to mid-value increments of NDVI at 100, 300, and 500 m buffer distances.Note that the best function predicting grass at 500 meters was linear, so the percentage change in grass remains constant for low, mid, and high increments of NDVI (Table 4).

Table 4
Prediction sensitivity analysis: sensitivity of green spaces, tree canopy, forbs and shrubs, and grass densities to increments in mean NDVI for different buffer zones.

Sensitivity of NDVI to vegetation types
Until now, NDVI has been mainly used to assess greenness exposure in environmental health research and estimates the presence of vegetated and non-vegetated surfaces within a study area.
However, the vegetation types and amounts that NDVI measures at multiple spatial scales are less evident.For instance, NDVI does not differentiate the reflectance between shrubs and grass, and scale effects might cause different vegetation types to drive NDVI values within different exposure distances (Labib et al., 2020b).Therefore, we aimed to understand what vegetation types and quantities explain NDVI values at different buffer distances in multivariate models and whether these relations were linear or nonlinear.Our results suggested that NDVI values are sensitive to vegetation types and quantities of vegetation at different spatial scales and that these sensitivities usually followed nonlinear patterns (Table 1, Appendix).NDVI was most sensitive to tree percent within a buffer distance of 300 meters (Fig. 3).From (Fig. 3a, c), we identified that grass and tree densities explain higher above-mean NDVI values within buffer distances of 100 and 500 meters, respectively.
NDVI was least sensitive to grass density within an exposure of 500 meters (Fig. 3).Moreover, the multivariate models with three vegetation types explained about 80% variance in mean NDVI values at 300 and 500 meters (Table 1, Appendix).Finally, the spatial scale influenced the predictive power of vegetation types on NDVI, as the models analyzing smaller buffer zones showed lower prediction errors on new data than those measuring large distances (Table 1, Appendix).This methodology helps us understand the factors that affect NDVI measurements which is crucial for using greenness exposure assessments to study greenspace-health relationships.

Sensitivity of vegetation types to increments in mean NDVI
Currently, several studies in environmental health research use NDVI for greenness exposure assessments to investigate the health outcomes of certain unit changes in NDVI (Barboza et al., 2021;P. James et al., 2016).For instance, Rojas-Rueda et al. ( 2019) showed that 0.1 increments in NDVI within a buffer distance of 500 meters or less is associated with lower mortality in humans from all causes in the study.However, more exploration is needed to understand how changes in NDVI (e.g., per 0.1 increments) should be interpreted for policy intervention in urban greening.In particular, the ways in which vegetation types and quantities can induce such changes and whether these changes follow linear or nonlinear trends across different ranges of NDVI values and buffer distances are poorly understood.Our results suggested that vegetation coverages are sensitive to different increments in mean NDVI and spatial scales and that these sensitivities usually follow nonlinear patterns (Table 2, Appendix).From Table 4, we identified that forbs and shrubs were most sensitive to a mid-range increment in mean NDVI (i.e., 0.45 to 0.55) within every AOI (11.10%, 12.21%, and 10.37% changes, respectively).Percent of green spaces, which aggregates other vegetation types, increases by 21.57% after a mid-range increment in mean NDVI within a buffer distance of 100 meters (Table 4).The tree canopy was the least sensitive (0.28% change) to an increment in mean NDVI (i.e., 0.65 to 0.75) at 300 meters (Table 4).Therefore, these results suggest what vegetation types and amounts contribute to inducing increments in mean NDVI across multiple spatial scales.
This helps to address a lack of common understanding of NDVI and might contribute to an adequate translation of NDVI-based health relations into usable greening policies and practices.

Strengths and relevance
There is growing support to further consider epidemiological research in developing policy recommendations for urban greening (Bowler et al., 2010;Lee & Maheswaran, 2011;Markevych et al., 2017;Nieuwenhuijsen et al., 2017;White et al., 2013).However, the methodologies and composite measures developed recently are challenging to translate into policy and practice for urban greening (Gascon et al., 2016;Helbich, 2019;Labib et al., 2020b;Rugel et al., 2017).For instance, Gascon et al. (2016) assessed the relationships between built environment characteristics and NDVI and found that measures of urban greening explained between 32% and 53% of mean NDVI values.
Therefore, in this study, we proposed a novel approach to demystify NDVI for greenness exposure assessment and policy intervention, indicating what type of vegetation (e.g., trees) and what amount should be needed at which spatial scale (e.g., 100 meters) to achieve potential NDVI benchmarks.

Gaps and limitations
In conducting the present study, we identified some issues that might limit the interpretability of our results.First, demystifying NDVI only in the Greater Manchester area constrains our results as other cities or regions may have a unique spatial heterogeneity, including different types of the built environment and green and blue spaces.Second, we used satellite images to derive NDVI, greenspace, and vegetation types with a spatial resolution of 10 m².Higher spatial resolutions are preferred in environmental and health research as they improve the greenness exposure assessment within the AOI (e.g., better classification of urban green spaces like private gardens smaller than 10 m²).However, satellite images with a higher resolution were not available for the case study area and vegetation types.Third, as we aimed to understand what NDVI measures in terms of vegetation coverages, we removed built environment or control variables from our analysis, which also belong to the spatial heterogeneity of urban areas and might contribute to explaining the normalized difference vegetation index.And fourth, NDVI might be affected by the vegetation phenology and when the NDVI image was captured (Huang et al., 2021).In this study, as we used the summer period, we can assume that vegetation was in good health.But, vegetation phenology may also drive increments in NDVI, which requires additional future studies.

Recommendations for future research
As mentioned previously, future research should address some of the issues found in the present study.We encourage researchers to apply the methodology to other cities (e.g., Vancouver or Utrecht) as those may have a different distribution of green spaces and compare the results to this project to make the analysis more robust.Also, the authors of this paper suggest using a better spatial resolution of remote-sensed satellite images (e.g., 2 m² or 5 m²), when available, to improve the assessment of greenness exposure and modeling results.We advise researchers to include other built environment (e.g., urban green) or control (e.g., low and high-density vegetation areas) variables to explain NDVI.Adding built environment characteristics might enhance the representation of spatial heterogeneity of cities, while incorporating control variables might limit the influence of confounding or external factors on the relationship between greenspace, vegetation types, and NDVI.Additionally, we encourage future research to study vegetation phenology using time series analysis as it might influence increments in NDVI.Overall, we urge further epidemiological research to consider the presented recommendations and compare their results to this study to make this methodology more robust and increase our understanding of NDVI for greenness exposure assessment and policy intervention in urban greening.

Conclusions
The Calculated from reflectance measurements in the red (R) and the near-infrared (NIR) bands using the following equation: (NIR−R)/(NIR+R).10m²Continuous variable ranging from -1 to 1.A score of -1 indicates the absence of vegetation and 1 the abundance of healthy vegetation.

Fig. 1 .
Fig. 1.(a) Input image measuring NDVI for the case study area.(b) NDVI map capturing exposure within 100 meters buffer distance.(c) NDVI map capturing exposure within 300 meters.(d) NDVI map capturing exposure within 500 meters.

Fig. 2 .
Fig. 2. Flow diagram describing the required steps to process greenness metrics for different buffer zones and explore univariate and multivariate relationships.

Fig. 3 .
Fig. 3. Multivariate analysis: partial dependence plots.(a) Sensitivity of NDVI to vegetation types for buffer zones of 100 meters.(b) Sensitivity of NDVI to vegetation types for buffer zones of 300 meters.(c) Sensitivity of NDVI to vegetation types for buffer zones of 500 meters.

Fig. 4 .
Fig. 4. Univariate analysis: partial dependence plots.(a) Sensitivity of greenspace to NDVI for different buffer zones (100, 300, and 500 meters).(b) Sensitivity of tree canopy to NDVI for different buffer zones.(c) Sensitivity of shrubs density to NDVI for different buffer zones.(d) Sensitivity of grass density to NDVI for different buffer zones.
main focus of this study was to explore what NDVI values measure in terms of vegetation types and understand how different increments in NDVI values translated into changes in vegetation coverage values within multiple buffer distances.The results of our analysis indicated that NDVI is sensitive to vegetation types, amounts of vegetation, and spatial scales and that different increments in mean NDVI predict dissimilar changes in vegetation values depending on greenness type and buffer distance.We identified that NDVI is most sensitive to the tree canopy and that forbs and shrubs density is most sensitive to mid-range increments of NDVI values, both within 300 meters of exposure.These sensitivities usually follow nonlinear patterns.Furthermore, we developed multivariate models with three vegetation types that explain about 80% variance in mean NDVI values at 300 and 500 buffer distances.While NDVI is very useful in understanding vegetation presence and health, one crucial problem that remains is revealing what type of vegetated and non-vegetated surfaces NDVI values measure within the AOI.For instance, NDVI values do not differentiate the reflectance between trees and grass.Such limited understanding has caused inadequate translation of NDVI-based health metrics into usable greening policies and practices.Therefore, this study proposed a novel methodology that successfully identifies the type and quantity of vegetation that explains NDVI values within different exposure distances.The analyzed AOIs are relevant to the pathways linking greenness exposure and health outcomes.Finally, our results address the mystery behind NDVI for greenness exposure assessment and might be translated into actionable policy interventions in urban greening.Policymakers and practitioners should carefully evaluate how increments in mean NDVI values predict dissimilar changes in vegetation coverage values that are sensitive to the type and amount of vegetation and spatial scale.This approach might be applied to other cities to encourage evidence-based policy recommendations that make urban areas greener and healthier.canopy) + linear(forbs and shrubs) + spline(grass); Model 5: NDVI ~spline(tree canopy) + spline(forbs and shrubs) + linear(grass); Model 6: NDVI spline(tree canopy) + linear(forbs and shrubs) + spline(grass); Model 7: NDVI ~linear(tree canopy) + spline(forbs and shrubs) + spline(grass); and Model 8: NDVI ~ spline(tree canopy) + spline(forbs and shrubs) + spline(grass).

Table 3
Scale, metric, mean, median, standard deviation, skewness, and kurtosis for the five greenness metrics and the three buffer zones considered to demystify NDVI.

Table 1 ,
Appendix).For 100 m buffer zones, the best model to explain NDVI values was a mixed linear-nonlinear model in which tree canopy and shrubs were presented as spline/nonlinear terms and grass as a linear term (Table1,Appendix).We selected the mixed model among all possible alternatives as it had the lowest GVC score (Table1, Appendix).The modeling alternatives are described in Table1(in Appendix) with their corresponding scores for GCV, Effective DoF, AIC, and Pseudo R-Squared statistics.Here, the best model has a pseudo R-squared of 0.7475 (Table1, Appendix), indicating that the three vegetation types can explain 74.7% variance in mean NDVI value for 100 m buffer zones.The partial

Table 1
, Appendix).The Pseudo R² was equal to 0.7983, indicating the model can explain about 80% variance in observed NDVI values (Table 1, Appendix).Each vegetation type showed a nonlinear relationship with NDVI (Table

Table 2 ,
AppendixUnivariate modeling: scale, model type, model name, GCV, DoF, AIC, and Pseudo R² of the relationships between greenness metrics for different buffer zones in untivariate models.