Keep it real: selecting realistic sets of urban green space indicators

With increasing urbanisation, urban green spaces are expected to be crucial for urban resilience and sustainability, through the delivery of ecological, economic and social benefits. In practice, however, planning, management and evaluation of urban green spaces are rarely structured and evidence-based. This represents a missed opportunity to account for, track and foster the multiple benefits that green spaces are expected to deliver. To gain insight into this gap, this study assesses the availability and uptake of relevant evidence by city governments. Interviews, focus groups and quantitative surveys were applied in four medium-sized European cities: Coimbra (Portugal), Genk (Belgium), Leipzig (Germany), and Vilnius (Lithuania), covering the main governance and climatic gradients in Europe. Using straightforward data exploration and regression, we analyse which ecological, economic and social indicators are typically chosen by cities and why. Together with the city stakeholders, we derived a common set of benefit categories and key performance indicators which can be adapted to diverse local contexts. We conclude that cities tend to make pragmatic decisions when composing their indicator sets, but nevertheless cover multiple urban green space dimensions. Finally, we explore how indicator choice could be optimised towards a complementary and credible indicator set, taking into account a realistically feasible monitoring effort undertaken by the cities.


Introduction
By 2050, 68% of the world's population will live in urban areas (United Nations, Department of Economic and Social Affairs and Population Division 2018), the global population will increase from 7.6 to 9.7 billion (United Nations, Department of Economic and Social Affairs and Population Division 2018), and heatwaves, floods, water scarcity, wildfires and storms will intensify (European Environment Agency 2016). Realising a physically liveable environment is one of the biggest challenges for today's cities.
Urban green spaces, or 'urban green-blue infrastructure', comprise all natural and semi-natural landscape elements in the urban environment (European Environment Agency 2017). While these spaces are under increasing pressure from urban development, they have been shown to mediate detrimental effects of demographic and environmental changes in cities: providing air and water purification, reducing heat-island effects, UV exposure and noise (Tzoulas et al 2007, Jansson 2014); increasing drought and flood resistance (Farrugia et al 2013, Liu and Jensen 2018); reducing stress, and increasing social contact and physical activity of the population (Lee and Maheswaran 2011, Konijnendijk et al 2013, Gascon et al 2016). Despite the evident importance of urban green spaces for urban quality of life, it remains unclear how cities measure, track and compare the quality of their urban green spaces or the multiple benefits these actually deliver. The main challenge is to develop, at the city level, a set of indicators that balances feasibility (data availability, resources, time and technical capacity) and quality (relevance for multiple benefits, scientific credibility and clarity) (van Oudenhoven et al 2018, Demolder et al 2018). Numerous academic and grey literature papers propose indicator sets for urban green spaces (Baycan-Levent et al 2009, Azadi et al 2011, De La Barrera et al 2016), but these rarely fully consider the local context (i.e. local biotic and abiotic conditions, as well as the socio-political and governance situation), which determines the choice of data and information used by the cities.
Within the local context of four European cities, this study assesses (i) which data and information are available at the city level, (ii) how this evidence covers multiple indicators related to urban green spaces, (iii) how cities evaluate feasibility, relevance, credibility and clarity for these indicators, and (iv) which indicators are currently measured and used. We conclude that cities tend to make pragmatic decisions when composing their indicator set, but nevertheless cover multiple urban green space dimensions. For scientists, we provide a framework as a common checklist across cities that accommodates context-specific definitions and diverse evidence types. We support practice with guidance on which indicators to add to the indicator portfolio, taking into account city-specific feasibility.

Analytical framework
Our analytical framework is based on two main concepts. First, a hierarchical indicator typology was co-developed based on the plural valuation framework of human-nature relationships as applied in IPBES (Díaz et al 2015b), and adapted for an urban context. Second, we applied evaluation criteria for human-nature indicator quality as developed by van Oudenhoven et al (2018).
The indicator hierarchy aims to compare diverse sets of indicators across four European cities (table 1). At the bottom, most granular level are the city indicator sets. These are (potentially) measured by available datasets in the specific city context. At the next level, these city indicators are associated with urban green space key performance indicators (KPI). These KPI are concepts, not necessarily linked to concrete datasets, whose interpretation can vary between cities. For instance, the KPI 'regulation of hazards and extreme events' has a different meaning (and different city indicators) in a wildfire-prone city like Coimbra compared to a flooding-prone city such as Genk. These KPI are in turn organised into benefit categories, which relate to the diversity of societal benefits (and policy goals) of urban green infrastructure, and cover the three main dimensions of human-nature relationships: the physical dimension (nature itself, ecological and intrinsic values); the contributions to people (ecosystem services, economic and instrumental values); and the social dimension (diverse values concerning quality of life) (table 1 and supplementary data (available online at stacks.iop.org/ERL/15/095001/mmedia), Carmen et al 2020) (Díaz et al 2015a, Jacobs et al 2016, 2020). The final framework consists of three value dimensions, 9 urban green space benefit categories and 41 urban green space KPI (table 1). All city indicators are associated with the KPI. Their association varies between specialized indicators, which fully and uniquely fit a single KPI (score 2), and general indicators, which provide (partial) information on several KPI (score 1, see supplementary data, Carmen et al 2020).
To assess indicator quality, we rely on the framework of van Oudenhoven et al (2018) since it summarizes the relevant current literature. Van Oudenhoven et al (2018) synthesize 16 criteria for the selection of appropriate indicators for decision making, and organise these in four main categories: credibility, salience, legitimacy and an additional feasibility criterion. In our study, we adopt credibility and feasibility but choose relevance and clarity instead of legitimacy and salience respectively. Relevance and clarity were found to be more familiar, clearer terms than legitimacy and salience, although they cover about the same quality aspects (table 2 and the supplementary data, Carmen et al 2020).

Data collection
The data collection was performed in the context of the EU BiodivERsA funded UrbanGaia project on ecosystem services for resilient, greener and healthier cities. This study reports on the project's activities concerning data and information on ecological, socio-economic and governance aspects of urban green spaces in all four European cities that participated in the UrbanGaia project: Coimbra (Portugal), Genk (Belgium), Leipzig (Germany), and Vilnius (Lithuania). The cities vary in size, climate, socio-political and environmental context (table 3). Urban green spaces in Genk, for instance, have been shaped in recent (20th century) history by enclosing urban sprawl in a coal-producing industrial area, while in Vilnius, Verkiai regional park is a historical landmark with a prominent presence in the city's history. Nonetheless, each of the four cities is motivated to improve the quality of its green spaces and encounters similar challenges in doing so.
First, a list of possible indicators was inventoried for each city. This inventory was assembled by reviewing policy documents for each country and city, interviewing local experts and city administrators, and organising focus groups. Association with the KPI set was first performed by confronting the IPBES framework (Díaz et al 2015b) with the city indicator sets. KPI and benefit categories were adapted/amended to create a common hierarchical framework to organise information on multiple benefits of urban green spaces (table 1). Once this common structure was agreed upon, the city indicators were associated with KPI in a cross-table scoring exercise (a score of 0/1/2 if the indicator is not/partly/perfectly associated with the KPI) by one or more local experts in each city (Carmen et al 2020).
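The cross-table scoring can be turned into a coverage score per benefit category. The figure captions describe coverage as a weighted sum of the association scores over city indicators and the underlying KPI, with weights related to the number of KPI in the category. The sketch below is one possible reading of that description, not the authors' implementation: dividing by the number of KPI in a category is an assumption, and all indicator and KPI names are hypothetical placeholders rather than entries from table 1.

```python
from collections import defaultdict

# Hypothetical KPI -> benefit category mapping (illustrative names only).
KPI_CATEGORY = {
    "habitat quality": "Nature",
    "species diversity": "Nature",
    "hazard regulation": "Regulation contributions",
}

# Hypothetical cross-table: indicator -> {KPI: association score}.
# 0 = not associated (omitted), 1 = partly associated, 2 = perfectly associated.
ASSOCIATION = {
    "tree species inventory": {"habitat quality": 1, "species diversity": 2},
    "flood retention area": {"hazard regulation": 2, "habitat quality": 1},
}


def coverage_by_category(association, kpi_category):
    """Sum association scores per benefit category and divide by the
    number of KPI in that category (assumed weighting)."""
    kpi_count = defaultdict(int)
    for category in kpi_category.values():
        kpi_count[category] += 1

    totals = defaultdict(float)
    for scores in association.values():
        for kpi, score in scores.items():
            totals[kpi_category[kpi]] += score

    return {cat: totals[cat] / n for cat, n in kpi_count.items()}


coverage = coverage_by_category(ASSOCIATION, KPI_CATEGORY)
for category, value in coverage.items():
    print(f"{category}: {value:.2f}")
```

Normalising by the number of KPI keeps categories with many KPI from dominating the comparison; other weightings are equally compatible with the description given in the text.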
Indicator qualities (relevance, feasibility, clarity, credibility) were scored in follow-up interviews with city officials using a 5-point Likert scale. Additional information was recorded on the level or ease of implementation, and on how often and by which institution the indicator is (or would be) implemented (tables 1 and 2, and supplementary data (Carmen et al 2020)).
Contrary to similar studies in the field of urban green performance (e.g. Van Herzele and Wiedemann 2003, Baycan-Levent et al 2009), we did not collect or compare data on green infrastructures, or propose new indicators. Our data focuses on how existing indicators themselves perform. This is an essential point as these indicators are the 'agents' which are meant to effectively transfer relevant knowledge to decision making.

Data analysis
Data exploration and analysis were performed in R, using common packages such as dplyr for data wrangling and ggplot2 for graphics. Central to our analysis is the actual use of certain indicators by city officials, or the (implicit) choice not to use them. From a broad and diverse set of potential indicators, a limited set is chosen and implemented to guide decision making. Among other factors such as accessibility and knowledge about the indicators, this choice is assumed to be related to the indicators' relevance, feasibility, clarity and credibility. This was verified using a simple generalised linear model (glm function in R), which extends linear regression models to allow for a non-normal dependent variable. In our case, the dependent variable is a binary variable that equals 1 if the indicator is used and 0 otherwise. The model includes the four measures of indicator quality as numeric variables (1-5 Likert scale) and the city as a categorical control variable. The glm function then applies Fisher scoring to maximize the likelihood function and thus find the best-fitting coefficients.
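To make the fitting procedure concrete, the following is a minimal sketch of Fisher scoring for a logistic regression, written with the Python standard library rather than R's glm. It is not the authors' code: the data are invented, only two of the four quality measures are included, and the city control variable is omitted for brevity.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(X, y, iterations=25, tol=1e-10):
    """Fisher scoring for logistic regression; each row of X starts with 1 (intercept)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        score = [0.0] * p                    # gradient of the log-likelihood
        info = [[0.0] * p for _ in range(p)]  # Fisher information matrix
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                score[j] += xi[j] * (yi - mu)
                for k in range(p):
                    info[j][k] += w * xi[j] * xi[k]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Invented toy data: (relevance, feasibility) on a 1-5 Likert scale, and whether
# the indicator is used (1) or not (0).
rows = [(1, 2, 0), (2, 4, 0), (2, 1, 1), (3, 3, 0), (3, 5, 1),
        (4, 2, 1), (4, 4, 0), (5, 5, 1), (5, 3, 1), (1, 4, 0)]
X = [[1.0, float(rel), float(feas)] for rel, feas, _ in rows]
y = [used for _, _, used in rows]

beta = fit_logistic(X, y)
print("intercept, relevance, feasibility:", [round(b, 3) for b in beta])
```

For the canonical logit link, Fisher scoring coincides with Newton-Raphson, so each iteration solves a weighted least-squares-type system. A categorical control such as the city would enter as dummy-coded columns of `X`.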
To estimate indicator efficiency, recent literature points to the need to compare performance with feasibility (van Oudenhoven et al 2018). We therefore apply data envelopment analysis (DEA) (Cooper et al 2011), which compares multiple benefits (performance of the indicator regarding relevance, clarity and credibility, as well as coverage of the diverse dimensions of urban green space benefits) with the 'cost' (feasibility/implementation) of each indicator. An indicator is more efficient if it provides higher quality information on multiple benefits, and is easier to implement (i.e. feasibility is high and/or it is easy to implement in practice). To allow for a

Which urban green space benefits can be covered by the indicators?
We first analyse the complete list of indicators inventoried for each of the four cities (Carmen et al 2020). Figure 1(a) shows the overall coverage of each of the benefit categories by all available indicators. It shows that, among the urban green space benefit categories, Non-material contributions and Health and wellbeing are well covered, while Material contributions, Regulation contributions, and Governance and justice are underrepresented. Figure 1(b) shows the distribution of indicators over the three dimensions of interest. Vilnius stands out because of its focus on the Nature dimension, while the other three cities are quite similar. Figure 2 shows the total coverage score of indicators that are measured and used to track the urban green space's performance. Figure 2(a) shows that Nature and Cultural aspects are covered the best, while Material contributions lack support. Coimbra and especially Vilnius focus on the physical dimension, while Genk and Leipzig are quite balanced in the benefit categories that are covered (figure 2(b)).

How do cities make their choice of indicators?
As expected, cities select indicators that have, on average, high indicator quality (figure 3). This intuitive observation is confirmed by a simple logistic regression model (table 4). It appears that relevance and feasibility are the most important factors to explain indicator implementation.

Can the indicator set be improved?
Each city listed between 48 and 133 potential indicators (figure 1); identifying the best can be difficult since they cover the three dimensions to a varying extent and differ in indicator quality. Since many aspects determine whether an indicator is 'good', we use DEA to rank indicators from most to least efficient. An indicator is efficient if it requires little effort to implement, has high indicator quality, and covers multiple benefit categories. Overall, cities chose highly efficient indicators. However, several 'low-hanging fruits' were identified that could usefully complement each city's indicator set.
As an illustration (details for all cities can be found in the appendix), for Leipzig, ten efficient indicators are selected by the model, among which six indicators are currently used by the city authorities (table 5). If Leipzig wants to expand its indicator set in the future, indicators 16, 26, 1, and 12 are good candidates (figure 4). Accounting for the coverage of benefit categories (figure 2), indicators 16 and 26 are the most interesting as they cover the 'material contributions' category, which is lacking in Leipzig.
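The efficiency ranking used in this section can be written as a small optimisation problem per indicator. The text does not state which DEA variant was applied, so the formulation below is only an assumed illustration using the classical input-oriented CCR multiplier model. Here $x_{ij}$ denotes input $i$ of indicator $j$ (for example the effort to implement it), $y_{rj}$ denotes output $r$ of indicator $j$ (for example relevance, clarity, credibility and the coverage scores of the three dimensions), and $o$ is the indicator being evaluated; these symbols are introduced here and do not appear in the original.

```latex
% Input-oriented CCR multiplier model for indicator o (assumed variant).
% u_r and v_i are the output and input weights chosen most favourably for o.
\begin{align}
\max_{u,v}\quad & \theta_o = \sum_{r} u_r\, y_{ro} \\
\text{s.t.}\quad & \sum_{i} v_i\, x_{io} = 1, \\
& \sum_{r} u_r\, y_{rj} - \sum_{i} v_i\, x_{ij} \le 0 \quad \text{for all indicators } j, \\
& u_r \ge 0,\quad v_i \ge 0 .
\end{align}
```

Under this formulation an indicator with $\theta_o = 1$ lies on the efficient frontier, while $\theta_o < 1$ means that some combination of other indicators delivers at least the same outputs for less input.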

Discussion
This study is based on four local contexts. Although the cities cover diverse biotic, abiotic, socio-political, and governance situations, our results should still be interpreted taking into account their European specificity. These are contexts without major environmental disasters, socio-political conflicts or widespread and extreme poverty. All four are medium-sized cities with a relatively stable and accountable democratic governance. Applying these findings to cities or municipalities in different geopolitical settings and scales should be done with caution.
Figure caption (fragment): Association scores: 1 = partly associated, 2 = perfectly associated. Coverage of a benefit category is the weighted sum of these scores over city indicators and its underlying KPI. The weights are related to the number of KPI in this benefit category.
On a theoretical level, our study advances the study of urban green space by providing a comprehensive yet adaptable evaluation framework.
The framework applied to evaluate green space impact is inspired by the IPBES framework, which describes plural values involved in human-nature relationships (Díaz et al 2015a, Davies et al 2018). By confronting this with the urban context and the four city-specific indicator sets (which articulate specific priorities), we have obtained a hierarchical classification of key performance indicators, validated in real-life practice. The framework is flexible enough to include different indicator sets in each city, while ensuring a minimum acceptable degree of comparability at higher levels. However, as the indicator sets are not aligned across cities, we could not analyse which indicators typically score well or poorly over all cities. The appendix provides a detailed analysis for each city to show that this approach generates applicable results (to guide the choice of indicators) while being based on a common framework. This urban green space checklist can be used to assess impacts on plural values in other urban contexts.
Our results confirm that feasibility is one of the main criteria for indicator selection (van Oudenhoven et al 2018). While researchers long relied on credibility, salience and legitimacy (Cash et al 2003) to evaluate indicators, our study shows large differences in the ease of implementation depending on local context and support from local partners. Applied research that aims to improve evidence-based decision making on urban green spaces should therefore inventory the resource and capacity limitations to measure and interpret these indicators. While increased resources and capacities are certainly needed on the municipalities' side, we argue that researchers should avoid compiling idealized, exhaustive and perfect indicator sets, and implying that these should be measured for each green space project and repeatedly over time. This is unrealistic, demotivating, and does not advance evidence-based decision making.
Indeed, based on this study, cities tend to focus on 6 to 12 high quality indicators that cover all three dimensions (physical-contributions-social). The stakeholder focus groups and interviews confirm that indicators are selected in a rather pragmatic manner: they are publicly available, freely provided through other organisations or very easy to implement. This could explain why a fair amount of low-hanging fruit is overlooked: several good quality indicators that are measured or easy to implement are not always used (e.g. tree species, sports facilities in the Leipzig case; see table 5). The physical dimension is covered the best (figure 2), as cities mostly rely on the expertise of park managers and natural scientists. While the physical quality is a necessary basis, the actual societal impact of green spaces is measured through their final contributions to people and impacts on quality of life. Contributions (ecosystem services) are somewhat less covered, probably because they sit 'in the middle' between the material processes and the social benefits and are often harder to measure. Cities could benefit from bringing more social science perspectives and skills into day-to-day green space management.

Conclusion
To evaluate the impact of urban green infrastructures, cities pragmatically use indicator sets, covering several benefit categories for urban green space within physical, contributions to people and social dimensions. The used city indicator sets are thus effective and efficient, but can be easily complemented with a few efficient (feasible) indicators to patch specific blind spots. Decision making on the city level could benefit from devoting some resources and in-house capacity building specifically to the development, implementation and interpretation of realistic urban green space monitoring with an optimal set of indicators, and by integrating these in decision processes. Especially in an urban context, it is key to involve natural as well as social science perspectives to avoid biased results and unfair decisions.
Figure caption (fragment): Each axis shows the coverage score for one of the dimensions (physical, contributions to people, and social), divided by the efficiency score. Indicators that are located further away from the origin have high outputs (physical, contributions to people, or social score) yet need low inputs (high feasibility, meaning a low feasibility score). Note that the DEA model also includes the indicator quality scores, not visualized here.
The study field of urban green spaces, and how to effectively produce socio-economic benefits, would profit from cross-project and cross-city comparison of outcomes. Our urban green space framework can be applied as a common checklist across cities, while accommodating context-specific definitions and diverse evidence types.
Finally, environmental scientists who aim at improving decision making would be more effective in helping cities when feasibility, resources and capacity are taken as the boundary conditions for the applied research. Unlike in product engineering, this consideration is often forgotten and recommendations are therefore often disregarded as academic and unrealistic. A case-based, participatory approach is therefore also commendable when aiming for relevant and actionable research.

Acknowledgments
... BiodivERsA 3: 2015 call under grants BRAINbe BR/175/A1/URBANGAIA-BE (Belgium); 01LC1616A (Germany); S-BIODIVERSA-17-17-1 (Lithuania), and BIODIVERSA/0008/2015 (Portugal). The research is performed in cooperation with CERNAS (Coimbra, Portugal), INBO (Brussels, Belgium), UFZ (Leipzig, Germany), Universidad de Malaga (Malaga, Spain), Mykolas Romeris University (Vilnius, Lithuania). We also thank the cities' local governments, field workers, and contacts for their time and effort invested in the project.

Data availability statement
The data supporting the findings of this study are openly available within the article and its supplementary materials (Carmen et al 2020).