Improving city-scale measures of livable sustainability: A study of urban measurement and assessment through application to the city of Birmingham, UK

Abstract
Despite the attention sustainability-related urban measurement and assessment methods have received, it is still not well understood how accurate (or not) the various methods are; their limitations in holistic city performance assessment; or, how they can be effectively used to better the design of the urban environment, city services and policies. Necessarily, urban measurement and assessment methods focus upon what is known. However, reflecting upon the unknowns and their impacts has the potential to deliver crucial insights into the assessment of city performance and governance. To this end, this study applies and critiques the city performance measurement and assessment method UK City LIFE 1 in order to explore the challenges of, and prospects for, filling these gaps. UK City LIFE 1 is designed to measure ‘livable sustainability’ at the city scale for the purpose of aiding UK policy makers and urban design decision-makers. Results suggest that definitional uncertainties, the availability and viability of data, and the design of the method introduce inaccuracy, uncertainty and bias into data interpretation. This, combined with the complexity of city systems and the nascent ‘science of cities’, prevents causal effects from being fully described, potentially rendering decision-makers impotent. However, the language of ‘realizing the multiple benefits of interventions’ and ‘coupling and uncoupling relationships’, alongside making the unknown explicit, has the potential to empower decision-makers in the face of absent and disconnected data and interpretational challenges.


Introduction
Cities are now the dominant form for human habitation (United Nations, Department of Economic and Social Affairs, Population Division, 2014), the dominant depleters of planetary resources (Bai, 2007) and the dominant producers of CO2 emissions (Nordbo et al., 2012). As such, cities have become the simultaneous saviors and destroyers of human coexistence with the planet. As centers of innovation and culture (Athey et al., 2007;Vojnovic, 2014), cities are the means by which humans can live sustainable lifestyles, balancing environmental, economic and social priorities, taking advantage of economies of scale and their concomitant efficiencies (Dodman, 2009;Klopp & Petretta, 2017). As wielders of large demand footprints (Folke, Jansson, Larsson, & Costanza, 1997), cities are the means by which humans can live unsustainable lifestyles, ignorant of food, energy and waste processes (amongst others) with the potential to compromise a healthy, sustained and livable future.
Despite the attention urban measurement and assessment has received and the sophistication of the available methods, it is still not well understood how accurate (or not) the various methods are; their limitations in holistic city performance assessment; or, how they can be effectively used to better the design of the urban environment, city services and policies (Kitchin et al., 2015; Leach et al., 2016). In part this is because of definitional challenges: neither sustainability nor livability has an agreed definition (Gough, 2015; Bell & Morse, 2008; Connelly, 2007; Kidd, 1992; Leach et al., 2016; Vojnovic, 2014; Tanguay, Rajaonson, Lefebvre, & Lanoie, 2010), and our understanding of these concepts is not fixed but changes over time and across contexts (Connelly, 2007; Portney, 2013; Astleithner & Hamedinger, 2003). It is also because our understanding of how cities operate and how they are best governed is incomplete (Vojnovic, 2014; Batty, 2012). In truth, there is no 'one' way a city behaves. For example, cities change along multiple temporal scales: centuries, decades, political cycles, yearly seasons, day to night, hour to hour. They also change along multiple geographic scales: city quarters, neighborhoods, streets and households. Urban measurement and assessment methods are one attempt at developing this understanding: they capture what is known about a city and assess its value (e.g., via benchmarking against the performance of other cities or extrapolating performance into the future).
Necessarily, urban measurement and assessment methods focus upon what is known. However, reflecting upon the unknowns and their impacts has the potential to deliver crucial insights into the assessment of city performance and governance. Many critiques of urban measurement and assessment methods exist (see, for example, Morse, 2004; Lyytimäki, Gudmundsson, & Sørensen, 2011; Lynch & Mosbah, 2017; Klopp & Petretta, 2017; Pinfield, 2007; Ghosh, 2006; Mori & Christodoulou, 2012; Mayer, 2008; Pires et al., 2014; Dawodu, Akinwolemiwa, & Cheshmehzangi, 2017). Less common are critiques that cover design as well as application and interpretation. Yet in understanding the complete process of designing, applying and interpreting a method, a richer picture inevitably emerges of the (unavoidably) compromised view of the city afforded by the method. Such insights can be used to improve measurement and assessment methods, inform decision-making and influence governance practices.
Within this context, this paper describes the design, application and interpretation of UK City LIFE 1 (UK City Livable-sustainability Indicator Framework Edition 1), a city performance measurement and assessment method designed for the comprehensive and holistic measurement of livable sustainability (the delivery of livability alongside sustainability (Gough, 2015)). The method is applied to the city of Birmingham, UK, in order to address the study's three primary research questions: (1) Is it possible, in practice, to holistically and at the city scale measure a UK city's livable sustainability? (2) What challenges and gaps arise and can these be overcome? and (3) How can the outcomes be interpreted to aid local authority decision-making in the UK?
The study triangulates information from three strands of analysis, described in the subsequent sections of this paper: a review of the sustainability and livability measurement and assessment literature (Section 2), a critical assessment of the design and application of UK City LIFE 1 to the city of Birmingham, UK (Section 3) and a critical reflection upon the interpretation of UK City LIFE 1, undertaken with UK local authorities, urban design decision-makers and urban professionals (Section 4). Section 5 discusses the outcomes and sets out the conclusions of the study.
This research was conducted as part of the development of the Livable Cities' decision-making method, a decision-support tool designed to assist urban design professionals in thinking holistically, complexly and long-term when making interventions in an urban environment, which can range from the delivery of city services to urban regeneration to policy formulation. The work forms part of the Livable Cities Program Grant.

A review of the sustainability and livability measurement and assessment literature
Current city performance measurement and assessment methods are many and vary in criteria, measurement methodology, robustness, transparency and applicability to specific urban contexts, with no single method dominating (Kitchin et al., 2015;Leach et al., 2016). This presents a challenge to policymakers and others wishing to use these methods to inform policy (Mayer, 2008).
City performance data for urban design decision-making and policymaking are almost universally expressed in the form of indicators (Kitchin et al., 2015). As such, this paper will focus upon indicator-based city performance measurement and assessment methods. Indicators provide information about the object of the data or are used as a representation of an associated factor (Business Dictionary, 2016; Oxford Dictionaries, 2016). For example, GDP (gross domestic product) and employment rate are both indicators of economic performance. Used wisely, indicators can effectively measure city performance (Bell & Morse, 2008; Kitchin et al., 2015) whilst guarding against the tendency for them to compromise creativity (Leach et al., 2015).
This section describes the principles, challenges and gaps for the design, application and interpretation of indicator-based urban sustainability and livability performance measurement and assessment suggested by the literature. An exploratory literature survey was undertaken to collect and analyze written sources that contributed to understanding the principles, challenges and gaps. Literature from 1992 to 2017 was considered, the start date coinciding with the Rio Earth Summit and the establishment of Agenda 21 (the local implementation strategy for global sustainability and climate change mitigation), which gave rise to a proliferation of sustainability- and livability-focused measurement and assessment methods (Kitchin et al., 2015).

Designing urban measurement and assessment methods: a review of the literature
The criteria for designing successful indicator-based measurement and assessment methods are diverse and contested (Mayer, 2008). Inevitably, indicator choice is, at some stage in the process, based upon arbitrary decisions (Niemeijer & de Groot, 2008). This said, there are principles that should be followed where possible and these are described here.
Consideration should be given to how interpretation is influenced by the conceptualization and measurement of the performance criteria (Marans & Stimson, 2011). In order for performance data to be useful, a determination of what constitutes 'good' performance must be made. Who does this, and how, materially influences the conclusions drawn (Kitchin et al., 2015;Mackenzie, 2008). As such, the transparency of this information is crucial for policymaking (Kitchin et al., 2015).
Performance measurement should be accompanied by a performance assessment framework that allows for the accurate and clear interpretation of the data (Mayer, 2008). The framework should have a clearly defined area of focus (e.g., livable sustainability), be designed for the context in which the measurements are to be taken (e.g., cities in the UK), evidence a clear causal chain, make explicit interdependencies and extend across disciplinary and professional siloes (e.g., architecture, engineering, planning and governance) (Leach et al., 2016).
Selecting or designing a performance assessment framework and indicators useful for policymaking requires careful consideration. Any given framework should be holistic with minimal overlap, be simple (without compromising any underlying complexity), include subjective and objective perspectives as well as quantitative and qualitative data, be usefully organized and be relevant to decision-making (Leach et al., 2016). The individual indicators should be simple, elegant, effective, sensitive to change, measurable and verifiable (preferably in a standardized way), conceptually sound, understandable, unambiguous, objective (value-free) and draw upon data that either exist or are relatively easy to obtain (Leach et al., 2016;Kitchin et al., 2015).
Perhaps most important, however, is designing a city performance measurement and assessment method that is fit for purpose. This requires a clear understanding of the intended use of the outcomes and the best-fit mechanism for achieving this. Parris and Kates (2003) identify four purposes for sustainability assessments: (1) decision making and management, (2) advocacy, (3) participation and consensus building and (4) research and analysis. Kitchin et al. (2015) identify two broad mechanisms for achieving these purposes: (1) indicators for description and providing context; and, (2) indicators for diagnostics, determining performance and progress towards targets.

Applying urban measurement and assessment methods: a review of the literature
The application of urban measurement and assessment methods is carried out by various stakeholders for various reasons. These include academic initiatives, such as Biophilic Cities, the Designing Resilient Cities Method and SuBSelec (Lombardi, Leach, Rogers, et al., 2012; The Biophilic Cities Project, n.d.; Tanguay et al., 2010); public-sector initiatives, such as the many bespoke city dashboards that exist and the Urban Audit (European Commission, n.d.; Kitchin et al., 2015; Gough, 2015); third/other sector initiatives, such as the Urban Sustainable Development Goal, ICLEI's Local Authorities Self-Assessment of Local Agenda 21 (LASALA) tool and the City Biodiversity Index (United Nations, n.d.; ICLEI, n.d.; Convention on Biological Diversity, 2013); and, private-sector initiatives, such as 100 Resilient Cities, Arcadis' Sustainable Cities Index and the Economist's Global Livability Ranking (S, C, & D, 2014; Arcadis, 2015; Rockefeller Foundation, n.d.).
The motivations for these initiatives vary. As generalizations, academic and public-sector initiatives seek to add to the knowledge base through the creation of new knowledge in order to increase understanding and inform decisions. Third and other sector initiatives usually have philanthropic goals. Private-sector initiatives are used to enhance the organization's offering. Motivations must be understood in order to put the arising outcomes into context. For example, if a measurement and assessment method has been designed to enhance a private organization's economic offering, this may have affected the balance of the method, skewing it towards the organization's strengths and away from its weaknesses.
The choice of urban measurement and assessment method is important. Different results can be obtained from the same data if different methods of assessment are used (Mayer, 2008). Bespoke methods benefit from being designed to be fit for purpose (e.g., specific geographic or thematic contexts). The tradeoff is that understanding issues that go beyond the chosen geographic boundary (such as sustainability, which operates beyond the city scale at regional, national and global scales) or theme (e.g., water, which is closely tied to energy) is compromised, making meaningful benchmarking impossible to achieve. A challenge for any sustainability-related urban measurement and assessment method is to generate holistic understanding within and outwith the city.
It is also important to be cognizant of the capacity and capabilities of those applying the method. For example, in the UK there are particular challenges faced by the public sector in applying, interpreting and maintaining a city performance measurement regime. In recent years the capability and capacity of Local Authorities to undertake these (and many other) tasks has been severely compromised by austerity and concomitant budget cuts (The Economist, 2017;Lowndes, 2013). Without appropriate resources, UK Local Authorities will struggle to decide what data to collect and to acquire the specialized knowledge and skills for their collection and analysis.
For any initiative, data availability and viability are a potential problem. Kitchin et al. (2015) describe a number of potential data pitfalls. Do those applying the method have access to the necessary data? That is, do the data exist and are they made available? How current are the data to which they have access? Are the data in a form that is usable? Are they unbiased, reliable, accurate and at an appropriate scale? No matter how advanced the urban measurement and assessment method may be, if the data it uses are compromised in some way then so will be the data interpretation.
Interpreting urban measurement and assessment methods: a review of the literature
Some issues related to the interpretation of urban measurements have already been discussed. These include how interpretation is affected by the choice of measurement method (its fit for purpose), its performance criteria, how it defines its subject area (e.g., sustainability, livability), and who carries out the application and interpretation. Kitchin et al. (2015) describe a number of interpretational considerations. The authors argue that on the surface the data arising from urban measurement and assessment methods can appear to represent an 'absolute truth', but in fact many factors influence their ability to do this, and that the very existence of an absolute truth is a myth. They go on to say that data themselves cannot be separated from their contexts and the ideas that gave rise to them; they are not the manifestation of a static situation that can only be measured in one way, if the situation can be directly measured at all; and, data are generated from normative, value-laden processes. As such, it is the responsibility of interpreters of the data to fully understand the measurement data and assessment frameworks being used and to account for biases and unknowns.
Much of what has been written thus far in this paper relates to interpretation for the purpose of accurately describing and understanding an urban situation in as much detail as possible. The dashboards described by Kitchin et al. (2015) are one manifestation of this approach. Batty (2016) presents an alternative, or perhaps an evolution: understanding an urban situation in as little detail as necessary. What Batty advocates is urban theory building: "… abstract[ing] from an agreed reality, throwing away that which appears irrelevant to the purpose in hand and only keeping what appears to be essential to good explanation" (Batty, 2016, p. 797). One unifying theory of cities is unlikely to be achievable given the individual and collective complexities of cities, but theories of aspects of cities are achievable and can be combined and made manifest via urban measurement and assessment methods.

A critical assessment of the process of designing and applying UK City LIFE 1 to the city of Birmingham, UK
This section describes the processes of designing and applying UK City LIFE 1 to the city of Birmingham, UK, for the purpose of addressing two of the study's three research questions: (1) is it possible, in practice, to holistically and at the city scale measure a UK city's livable sustainability? and (2) what challenges and gaps arise and can these be overcome? The arising dataset, including all data sources and dates, is published in Leach et al. (forthcoming).
Gough (2015) defines livable sustainability as the delivery of livability alongside sustainability: "…sustainable outcomes result over time through a series of livability outcomes. Together, the conceptual linkages between livability and sustainability reveal tensions, but also complementarities that can assist with the other's implementation" (Gough, 2015, p. 147). Livable sustainability arises from an inherent tension in measuring the sustainability performance of cities: that of scale. Sustainability is a long-term challenge with greatest relevance at global, national, regional, mega city and city region scales (Lynch & Mosbah, 2017; Banai, 2012). Livability makes sustainability relevant at the city and sub-city scales (Lynch & Mosbah, 2017). It achieves this largely by ignoring sustainability's long-term viewpoint, giving rise to the criticism that short-term interventions risk derailing long-term sustainability (Gough, 2015). In short, livable sustainability attempts to combine the long-term views of sustainability with the short-term impetus of livability (Gough, 2015).
UK City LIFE 1 has been developed by the authors as part of the Livable Cities research program (see www.livablecities.org.uk), which is devising interventions, and their means of implementation, to transform engineering to deliver societal and planetary wellbeing. A cornerstone of the program's methodology is determining how UK cities are currently performing with regard to livable sustainability. This gave rise to the following research question: does there exist an effective way of assessing the livable sustainability performance of UK cities? A comprehensive review of existing livable sustainability-related measurement and assessment methods was undertaken and the results tested against the literature to determine their suitability for measuring UK urban livable sustainability. As none proved to be suitable, primarily due to the lack of comprehensive coverage or applicability to the UK context, a bespoke city analysis methodology was designed: one that applied the best available knowledge via effective indicators delivering comprehensive coverage within a city performance assessment framework. The city analysis methodology draws together a collective knowledge on city assessment that has been developing over recent decades with the most effective ways of assessing the livable sustainability performance of UK cities. UK City LIFE 1 is the embodiment of the outcome and, we contend, makes a unique contribution. Moreover, it enables a critique of the processes of its design, application and interpretation in a way that would not be possible if a preexisting method had been chosen.
At the start of the design process the intended purpose and mechanism for UK City LIFE 1 were determined. It was imperative that these were returned to throughout the design, application and interpretation process in order to avoid scope creep.
• Purpose: decision-making and management, specifically aiding UK local authority decision-makers. This focus led to compromises being made that might otherwise have been overcome. For example, data that are available to researchers but not local authorities were excluded, although the indicator was retained to make explicit what is missing. The aim is to prompt local authorities to consider collecting these data, or data services to consider allowing local authorities access to them.
• Mechanism: indicators for description and providing context, which could be used as a baseline from which to backcast from a desired future scenario to elucidate barriers to its realization (as part of the Livable Cities research program).
In addition, an early assessment was made as to what constitutes 'good' performance. The performance being measured was scoped (i.e., livable sustainability) and initially it was assumed that there was a common understanding of what good performance entails. It quickly became apparent that this was not the case. A number of discussions, workshops and focus groups were organized between the authors, with the wider Livable Cities' academic team and with practitioner experts in order to converge upon a common understanding. At one point it seemed reasonable to deconstruct the problem by determining parameters of good performance for each of the method's indicators. However, this quickly proved impossible because for many of the indicators the literature does not support the allocation of absolute values and it is not possible to quantify the interactions between all the indicators. Instead, it was decided to set out an overriding principle of good performance: moving towards sustainability without compromising the livability of those who live in, work in and visit the city.
Following the literature review of existing sustainability-related measurement and assessment methods and consultation with practitioner experts it was decided that an indicator-based approach to measuring city performance was appropriate. This gave rise to an immediate tension between the usable number of indicators and the ambition to comprehensively and holistically measure livable sustainability at the city scale. The literature advocates a limited number of indicators for any given method (Hardi & Zdan, 1997;Tanguay et al., 2010), a view which was supported by practitioner experts, but this was determined to be insufficient to provide an in-depth understanding of city performance. Instead, an assessment framework was devised to assist with interpreting the indicators.
The framework incorporates an 'intelligent reductionist' approach to urban policymaking. It comprises four tiers (see Fig. 1), derived from the classic strategic planning hierarchy (Leach et al., 2014). The following summary is taken from a full description of the framework available in Leach et al. (2016). The least granular of the four tiers is that of the 'lens'. There are four lenses, aligned to the four commonly accepted pillars of sustainability: society, environment, economy and governance. This ensures a holistic approach. 'Goals' constitute the second tier, where each pillar of sustainability is broken down into a set of desired outcomes, for example to 'enhance community and individual wellbeing'. 'Actions' form the third tier and break down each goal into 'actionable activities' that, crucially, can be measured. An action connected to 'enhancing community and individual wellbeing' is to 'promote healthy living and healthy long lives'. Indicators form the fourth and final tier of the framework, with groups of indicators aligned to individual actions. Aligned to 'promoting healthy living and healthy long lives' are the indicators 'healthy life expectancy', 'satisfaction with health' and 'avoidable mortality', to name only a few. In this way a causal chain and linkages (established from the literature) between indicators are mapped. In principle, the chosen framework complied with the best-practice principles for such a framework described in Subsection 2.1. The framework has a clearly defined area of focus (livable sustainability), is designed for the context in which the measurements are to be taken (UK cities), and evidences a clear causal chain and makes explicit interdependencies (via the four-tier structure).
In addition, it attempts to be holistic (extending across disciplinary and professional siloes) with minimal overlap, be simple (without compromising any underlying complexity), include subjective and objective perspectives as well as quantitative and qualitative data, be usefully organized and be relevant to decision-making. However, as the indicators were incorporated into the framework compromises had to be made.
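The four-tier structure can be sketched as a nested data model. The following is a minimal illustration only, not the authors' implementation: the class names, the `causal_chain` helper and the placement of the example goal under the society lens are assumptions made for this sketch. The goal, action and indicator names are those quoted above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Indicator:
    name: str
    # None marks an indicator that is retained in the framework without data.
    value: Optional[float] = None


@dataclass
class Action:
    name: str
    indicators: List[Indicator] = field(default_factory=list)


@dataclass
class Goal:
    name: str
    actions: List[Action] = field(default_factory=list)


@dataclass
class Lens:
    name: str
    goals: List[Goal] = field(default_factory=list)


def causal_chain(lens: Lens) -> List[Tuple[str, str, str, str]]:
    """Flatten one lens into (lens, goal, action, indicator) paths."""
    return [
        (lens.name, goal.name, action.name, indicator.name)
        for goal in lens.goals
        for action in goal.actions
        for indicator in action.indicators
    ]


# Assumption: the example goal is filed under the society lens.
society = Lens("society", [
    Goal("enhance community and individual wellbeing", [
        Action("promote healthy living and healthy long lives", [
            Indicator("healthy life expectancy"),
            Indicator("satisfaction with health"),
            Indicator("avoidable mortality"),
        ]),
    ]),
])

for path in causal_chain(society):
    print(" > ".join(path))
```

Because each indicator sits under exactly one action, every indicator has a single path back to a lens, which is one way of reading the 'causal chain' the framework is intended to evidence.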
The framework retained its holistic design, but data for approximately one third of the indicators either did not exist, were proprietary or were not viable at the city scale (e.g., the sample sizes were too small to be meaningful). The authors decided to retain indicators that had null values so it is evident what is not included. The authors also made some compromises on data viability to ensure some data representation was made where possible.
Aligning indicators to actions meant that overlap between indicators was minimized but created another problem: some actions shared indicators in order to provide the fullest picture of the action in question. Sharing indicators was trialed, but proved to be too confusing for users. A stripped-back interpretation of the actions was developed to eradicate indicator sharing but this inevitably compromised some of the underlying complexity. Subjective and objective indicators were explicitly included and existing indicators were preferenced as this meant data were more likely to exist. Even so, many subjective indicators simply did not exist and in these cases the authors created the indicator in order not to lose the desired subjective element. For example, to ensure a subjective perspective on food the following indicator was included despite there not being data collected for it: percentage of people who strongly or somewhat agree that they don't pay much attention to the amount of food they use at home.
The framework was designed to incorporate quantitative and qualitative data types, but because the authors preferenced existing indicators over creating new ones, and because quantitative measures are dominant in indicator assessments (Kitchin et al., 2015), UK City LIFE 1 is predominantly quantitative.
The assessment framework was designed to be usefully organized for local authority decision making. However, in the Birmingham application the cross-disciplinary approach (meaning it also crossed departmental siloes at the Council) and the time needed to engage with the framework proved difficult to overcome. It might be argued that governance challenges faced by Birmingham City Council (Kerslake, 2014) and its ongoing restructuring exacerbated the problem.
The authors selected 346 indicators based upon their relevance to livable sustainability and the criteria presented in Subsection 2.1. The specifics of the selection process are beyond the scope of this paper, but briefly this entailed a thorough examination and cross-referencing of the livability and sustainability literature and existing livability- and sustainability-related assessment methods and their indicators to identify rigorous indicators, whether they existed or not, that comprehensively measured livable sustainability at the city scale and were relevant to the UK context. Despite these efforts, compromises had to be made during application of the indicators to the city of Birmingham.
As expected, data availability proved to be a defining challenge in applying UK City LIFE 1 to Birmingham. As previously discussed, during the design phase some data were known not to be available, such as where new indicators were created. During the application phase further data gaps were discovered. Various reasons existed for the unavailability of data. The most common was that data existed at the national, regional or neighborhood scales, but not the city scale, and the existing data could not be disaggregated or aggregated. Some of these data are available through special license access from the UK Data Service, but this option is unavailable to local authorities and so was not pursued. In a small number of cases it was possible for the Livable Cities team to collect missing data using primary data collection techniques (e.g., surveys). These are restricted to those indicators whose data are collected by other organizations but not made available, and so fill the data gap only imperfectly. For Birmingham, 92 of 346 total indicators are null (27%).
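The reported share of null indicators can be checked with a short calculation. This is a sketch under stated assumptions: the `null_share` helper and the placeholder indicator names and values are hypothetical, and only the two counts (92 null out of 346) come from the text.

```python
from typing import Mapping, Optional


def null_share(values: Mapping[str, Optional[float]]) -> float:
    """Return the percentage of indicators that have no value (None)."""
    if not values:
        raise ValueError("at least one indicator is required")
    missing = sum(1 for v in values.values() if v is None)
    return 100 * missing / len(values)


# Placeholder indicator set sized to match the reported Birmingham counts:
# 346 indicators in total, of which 92 are null. Names and values are dummies.
birmingham = {f"indicator_{i}": (None if i < 92 else 0.0) for i in range(346)}

share = null_share(birmingham)
print(f"{share:.1f}% of indicators are null")  # prints "26.6% of indicators are null"
```

The unrounded share is about 26.6%, which rounds to the 27% reported.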
Using indicators that already exist over creating new indicators helped ensure that the indicators were measurable and verifiable in a standardized way. This strategy also increased the likelihood of data being collected for the indicator by third-party sources. Data were collected for 2011 as a first preference (given the prevalence of Census data, with the last UK Census conducted in 2011). Data for the earliest available year after 2011 were selected as a second preference, with data for the most recent year prior to 2011 being the third preference. The disadvantage with this approach is that the Census data reflect a past and fixed point in time. Despite efforts to collect data for 2011 in order to provide as complete a snapshot in time as possible, this was not always possible and data for different indicators are taken from different years.
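The year-preference rule described above can be expressed as a small selection function. This is a minimal sketch, assuming the rule is applied per indicator to the set of years for which data exist; the function name `preferred_year` and its signature are illustrative and not taken from the study.

```python
from typing import Iterable, Optional


def preferred_year(available: Iterable[int], target: int = 2011) -> Optional[int]:
    """Pick the data year for one indicator using the three-step preference.

    1. The target (Census) year itself, if available.
    2. Otherwise the earliest available year after the target.
    3. Otherwise the most recent available year before the target.
    Returns None when no data exist for the indicator.
    """
    years = set(available)
    if target in years:
        return target
    later = [y for y in years if y > target]
    if later:
        return min(later)
    earlier = [y for y in years if y < target]
    return max(earlier) if earlier else None


print(preferred_year([2009, 2011, 2013]))  # 2011
print(preferred_year([2008, 2013, 2015]))  # 2013
print(preferred_year([2005, 2009]))        # 2009
print(preferred_year([]))                  # None
```

Under this rule a year after 2011 is always preferred over a year before it, even when the earlier year is closer to 2011, which is one reason the resulting dataset mixes years across indicators.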
Perhaps the most defining trade-off that arose from the application of UK City LIFE 1 to Birmingham was that between data rigor and providing enough information for decision-making. Subsection 2.2 describes a number of data availability and viability issues, such as determining how biased, reliable and accurate the data are. Ideally, each data point would be assessed for compliance with all the criteria, but in reality this simply is not possible. Some compliance issues take precedence, such as whether the data are in a usable format, because continuing without knowing this would prevent the data from being used at all. In many cases the information is not easily available and tracking it down would unrealistically delay performance measurement. As a partial remedy, by selecting data from trusted sources (such as the Office for National Statistics in the UK) the occurrences of noncompliance can be minimized. In the end, a decision must be taken as to whether any compliance failure is severe enough to exclude the data and, in doing so, potentially compromise interpretation of the dataset.

A critical reflection of the process of interpreting UK City LIFE 1
The data collected from Birmingham were interpreted through the UK City LIFE 1 framework in collaboration with local authorities, urban design decisionmakers and urban professionals as part of an iterative process that included a series of workshops and meetings. This process sought to address the last of the study's three research questions: (3) how can the outcomes be interpreted to aid local authority decisionmaking in the UK?
It was apparent from the outset that the large number of indicators would require grouping and consolidation in order to be used effectively. The UK City LIFE 1 framework provides one such grouping, that of the four tiers. This arrangement allows for potential unintended consequences to be made explicit and for the determination of multiple benefits to be realized, these being two sides of the same coin (see Leach et al. (2016) for an illustration). The underpinning linkages were established from the literature, but it is fair to say that the science of cities is still being developed and some of the linkages are more certain than others. Being able to determine potential multiple benefits and unintended consequences arising from decisions had traction with those consulted, but there was a nervousness about the large amount of information contained within the framework. Birmingham viewed through each of the four UK City LIFE 1 lenses showed, not surprisingly, a complex picture. The large number of indicators allowed for performance highlights to be robustly identified and for a considerable degree of depth of understanding; however, a simplification was encouraged by those consulted for the purpose of revealing "the bigger picture".
A second, thematic grouping of indicators was undertaken that aligned more closely with how other indicator sets are grouped and with user expectations (e.g., transport, water, energy, governance, wellbeing, health, etc.). Although this grouping was well received and allowed for easy interpretation of the themes, it had the disadvantage of reinforcing disciplinary and departmental siloes and did not provide an overarching picture of performance.
The task of distilling and communicating an overall picture of Birmingham's livable sustainability performance was challenging. The nature of the data contained within UK City LIFE 1 is such that it is not possible to conduct statistical tests upon them. This is partly because of the use of different sources and data types (i.e., objective, subjective, quantitative, qualitative), meaning that the data have different sample sizes and cohorts and cannot be analyzed together. Some of the data are available at the scale of the lower super output area (LSOA), but not all, making statistical analyses impossible when applied to a single city (i.e., there are not enough data points). Conducting statistical analyses beyond the existing descriptive statistics is therefore impossible and another way forward had to be found.
Visualizations of the data offer an opportunity to summarize the data in meaningful ways (Kitchin et al., 2015). In the first instance, a force diagram was used to illustrate desirable and undesirable relationships. Livable sustainability brings together wellbeing, resources (security and efficiency) and carbon emissions (as a proxy for damage to the planet). Their desired relationships can be expressed as follows and visualized in Fig. 2. This was well-received by those consulted and it was clear it could be used to inform decisions, but it was too abstracted from the data to provide detailed guidance.
• Uncoupling carbon emissions and wellbeing
• Uncoupling carbon emissions and resource security
• Uncoupling carbon emissions and resource efficiency
• Coupling wellbeing and resource efficiency
• Coupling wellbeing and resource security
• Coupling resource efficiency and resource security

An attempt was made to determine the degree of the relationships described in Fig. 2 using a mathematical approach, which can be found in the supplementary information (Appendix A: Calculations for Birmingham's Livability Scores and coupling measures). Scores (termed 'coupling scores') were calculated for each of the four elements of livable sustainability: wellbeing, resource security, resource efficiency and carbon emissions. This involved a drastic distillation of the full indicator set into a representative (if crudely so) and manageable subset. These scores were then used to create 'coupling measures' that determined the distance between the four elements of livable sustainability and that mirrored the relationships visualized in Fig. 2. The outcomes could then be visualized in a 3D space and Venn diagram combination, effectively locating a city's performance in a 3D 'livable sustainability space' (inspired by Graedel et al.'s (2012) paper: Methodology of Metal Criticality Determination). Fig. 3 illustrates Birmingham's performance within this livable sustainability space. Each plane of the box represents the degree of uncoupling and the Venn diagram represents the degree of coupling. The desired (most livable sustainable) position is the nearest, top most corner of the box with all three spheres completely overlapping.
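The calculations themselves are given in Appendix A and are not reproduced here. Purely to illustrate the idea of a pairwise 'distance' between elements, the sketch below assumes that each of the four elements has been reduced to a single score normalized to the range 0 to 1 and that distance is the absolute difference between two scores. Both are assumptions rather than the published method, and the score values are hypothetical placeholders, not Birmingham's results.

```python
from itertools import combinations
from typing import Dict, Tuple

ELEMENTS = ("wellbeing", "resource_security",
            "resource_efficiency", "carbon_emissions")


def desired_relationship(a: str, b: str) -> str:
    """Pairs involving carbon emissions should uncouple; all others couple."""
    return "uncouple" if "carbon_emissions" in (a, b) else "couple"


def coupling_measures(scores: Dict[str, float]) -> Dict[Tuple[str, str], Tuple[float, str]]:
    """Return {(a, b): (distance, desired relationship)} for all six pairs.

    Distance is assumed to be the absolute difference of two normalized
    scores; this is an illustrative stand-in for the Appendix A method.
    """
    return {
        (a, b): (abs(scores[a] - scores[b]), desired_relationship(a, b))
        for a, b in combinations(ELEMENTS, 2)
    }


# Hypothetical placeholder scores (not real data).
example_scores = {
    "wellbeing": 0.6,
    "resource_security": 0.5,
    "resource_efficiency": 0.4,
    "carbon_emissions": 0.7,
}

measures = coupling_measures(example_scores)
for (a, b), (distance, goal) in measures.items():
    print(f"{a} / {b}: distance={distance:.2f}, desired={goal}")
```

In this reading, a small distance is desirable for the three 'couple' pairs and a large distance for the three 'uncouple' pairs, mirroring the six relationships listed above.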
This visualization of Birmingham's livable sustainability is a compromise between the detailed and in-depth picture of Birmingham afforded by UK City LIFE 1 and the force diagram. The tradeoffs with visualizing and simplifying the dataset, however, are not to be ignored. A visualization can make disorder appear organized (Kitchin et al., 2015) and although efforts were made to ensure that the subset of indicators used for the livable sustainability space calculations are as representative as possible, it is the case that such simplifications can lead to poor decision-making (Keeney, 1992). However, it is also the case that simplifications can discard the irrelevant and focus decisionmakers upon the essential (Batty, 2016). The key is using the appropriate degree of detail for the decision in question, and this perspective was reinforced by those consulted. This led to the authors recommending that the headline view presented in the 'livable sustainability space' is used to inform strategic decisions, complemented with the detailed view presented by UK City LIFE 1 as operational decisions are made.

Discussion and conclusions
This paper describes the design, application and interpretation of UK City LIFE 1 to the city of Birmingham, UK, in order to address the study's three research questions: (1) Is it possible, in practice, to holistically and at the city scale measure a UK city's livable sustainability? (2) What challenges and gaps arise and can these be overcome? (3) How can the outcomes be interpreted to aid local authority decisionmaking in the UK? To achieve this, the study triangulated information from three strands of analysis: (1) a review of the livable sustainability-related measurement and assessment literature, (2) a critical assessment of the design and application of UK City LIFE 1 to the city of Birmingham, UK, and (3) a critical reflection upon the interpretation of UK City LIFE 1, undertaken with local authorities, urban design decisionmakers and urban professionals.
It is not possible to comprehensively measure something that does not have clear definitional boundaries (Bell & Morse, 2008), and so the contested definitions of livability, sustainability and livable sustainability are a barrier to their measurement. In practice, this is a barrier that must be overcome if we are not to discard the valuable insights afforded by their measurement. Developing clear definitions is one way forward, but one that has proven difficult to achieve despite decades of effort (Portney, 2013). As a result, there is no agreed way of measuring these concepts and multiple methods for doing so have arisen (Mori & Christodoulou, 2012; Tanguay et al., 2010). This has fragmented efforts to address the challenges and opportunities afforded by livability and sustainability and has stymied useful comparisons across geographic and thematic contexts. To overcome this difficulty, urban measurement and assessment methods can clearly define their area of focus (essentially manufacturing the needed boundaries) and within this set out to holistically measure performance. In practice, there are a number of barriers to achieving holistic measurement. Many of these are obvious and are related to the data used to populate the indicators: availability, viability, scale, reliability, format, bias, accuracy and causality. Data challenges are not easy to overcome: in some cases they are impossible to overcome. Missing data (either because the indicator has a null value or because the indicator was excluded altogether) skews the emphasis of the urban measurement and assessment method. This, in turn, skews the emphasis of arising interpretations and decisions.
Data challenges risk obscuring the impacts of less obvious barriers: defining purpose and operational mechanisms, defining 'good' performance and developing a robust assessment framework. Together, these elements impact upon how the data can be interpreted and used for decision-making. Beyond the scope of this paper, there are additional considerations with regard to using indicator-based urban measurement and assessment methods for decision-making. Kitchin et al. (2015) provide a comprehensive summary. Worth mentioning are the misconceptions that a measurement and assessment method provides information about a 'one true' external reality, that the data themselves are objective and independent from the situation they are describing, that the very act of measurement is not a normative one and that those interpreting the data do so in rational and objective ways.
All of this introduces inaccuracy, uncertainty and bias into any given urban measurement and assessment method and potentially renders decisionmakers impotent. However, in the case of UK City LIFE 1 the language of 'realizing the multiple benefits of interventions' and 'coupling and uncoupling relationships' alongside making the unknown explicit (i.e., including null value indicators) had the potential to empower decisionmakers in the face of absent and disconnected data and interpretational challenges.
The livable sustainability space visualization gives urban decisionmakers and policymakers the city's 'big picture' performance, identifying areas of concern and of success. UK City LIFE 1 provides the detailed information to elucidate the reasons for the 'big picture' performance and to inform operational activities. Together they provide the evidence needed for policymakers to commission targeted studies to uncover the underpinning reasons. In the UK, this latter aspiration (for policymakers to commission targeted studies) is at serious risk. UK city councils currently face a perfect storm of constrained budgets, reduced staff capacity and capability and increased responsibilities on the national and world stages to engage with strategic agendas (such as sustainability). Birmingham appears to be acutely suffering and council capacity was repeatedly raised as a barrier during the consultation exercises carried out as part of this study. To paraphrase one member of the Council: Birmingham simply doesn't have the money or the staff to commission its own studies. And yet at the same time there was recognition that it is crucial to have an evidence-base for policymaking and urban decision-making.
Reduced council capacity can leave councils reactive to (rather than proactive about) strategic agendas, potentially leading to inefficient policymaking and use of resources. A number of organizations offer cities bespoke solutions to addressing strategic challenges. These include 100 Resilient Cities (100 Resilient Cities, 2016); Arup's City Resilience Framework and Smart Cities initiative (Arup, 2016); IBM's Smarter Cities (IBM, n.d.); Siemens Intelligent Infrastructure (Siemens AG, 2016); and, CH2M HILL Cities (CH2M HILL, 2015). When considering such offerings, councils should satisfy themselves that the arising performance measurements and assessments meet the good practice criteria described herein. In addition, councils must take cognizance of vested interests. Organizations may be predisposed to promote their in-house approaches and solutions even if another approach or solution would be better suited.
Arising from the Birmingham case study, a potentially transformative step in delivering a city's aspirations has been identified. This is based upon a two-tier model where one tier provides detailed measurements of city performance and elucidates multiple benefits and the other tier provides an overview that couples desirable, and uncouples desirable from detrimental, measures of city performance. This model (UK City LIFE 1 and the livable sustainability space), founded on measurements and the identification of key relationships, could be translated to any UK city and even any global city with consideration paid to its national context. This new model not only provides an overarching picture of a city's livable sustainability, but by synthesizing the outputs with backcasting techniques, which make explicit potential barriers to achieving a desired future performance, it can provide the necessary evidence base to engender bold and assured policymaking and, crucially, make explicit how cities can advance towards their common goals of sustainability and livability. As one member of Birmingham's council explained: we must change how we think about making decisions so that we do so in an evidence-based way; this is very different to how things are currently done.

Data
UK City LIFE 1, including all data sources and dates, has been published in Leach et al. (forthcoming).
Calculations for the livable sustainability space have been published in the supplementary information (Appendix A: Calculations for Birmingham's Livability Scores and coupling measures).

Funding
The authors gratefully acknowledge the financial support of the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/J017698/1: Transforming the Engineering of Cities to Deliver Societal and Planetary Wellbeing.