Characterising the vertical structure of buildings in cities for use in atmospheric models



Introduction
High population densities in cities can expose large numbers of people to extreme weather events such as heatwaves, whose effects can be exacerbated by poor air quality and the impacts of climate change. With urban areas being home to 54% of the world's population and rising (> 60% by 2030) (United Nations, 2018), these extreme events threaten public health and may cause severe economic loss. Because of this, there is a need for increasing accuracy and spatial resolution in weather forecasting and climate projections for urban areas (Grimmond et al., 2020).
Cities contain a heterogeneous mix of buildings and trees with varying heights and densities. The material properties vary between and within the impervious (e.g., buildings, roads, sidewalks) and pervious (vegetated) areas. These characteristics influence the absorption and loss of shortwave and longwave radiation, while also impacting ventilation within the urban canopy (Guo et al., 2016; Kent et al., 2019). These effects cause trapping of heat (impacting local temperatures), influence the surface energy balance, and can affect precipitation and thunderstorm intensities (Collier, 2006; Liang, 2018; Shepherd, 2005).
Therefore, the 3D structure of cities should be treated with appropriate assumptions in order to understand the impacts of urban areas on the shortwave and longwave radiation (Arnfield, 1982; Masson, 2000; Martilli et al., 2003). Within numerical weather prediction (NWP) models, surface obstacles (e.g., buildings, vegetation) and the interactions between them need to be parameterised, due to computational and data constraints (among others) that limit surface schemes. NWP urban morphology often assumes an 'urban canyon', consisting of flat-roofed buildings of the same height with constant width, characterised by the building height to canyon width (H/W) ratio (e.g., Arnfield and Grimmond, 1998; Kusaka et al., 2001; Porson et al., 2010). Morphometric parameters are key for accurate modelling of urban weather and climate, and include: building heights (Masson, 2020; Zhu et al., 2019) and their probability distribution (e.g., Temperatures of Urban Facets (TUF), Krayenhoff et al. (2014)), building plan area, wall area, and H/W (e.g., Town Energy Balance (TEB), Masson (2000); Lemonsu et al. (2012)), with the vertical structure of cities also vital (Wentz et al., 2018). One reason why urban NWP schemes remain simple is a lack of global data to describe this vertical structure of cities. If global data for all buildings were available, the required parameters for modelling (e.g., building fraction, wall area) could be derived at any NWP grid resolution without parameterisations.
An additional reason for the simplicity of NWP urban schemes is a lack of accurate but computationally efficient methods to represent vertically resolved energy exchanges within urban canopies. Model developments, such as the SPARTACUS-Urban radiation scheme (Hogan, 2019a), the Building Effect Parameterization scheme for turbulent fluxes (Martilli et al., 2002; Salamanca et al., 2011; Schubert et al., 2012), Seoul National University Urban Canopy Model (Ryu and Baik, 2012; Ryu et al., 2013), and NJUC-UM-M (Kondo et al., 2005), have potential use in NWP. These take a multi-layer approach, resolving fluxes in multiple layers within the urban canyon and taking into account varying building heights and often vegetation.
Often building footprints, sometimes with height information, are available from municipal sources. Airborne stereophotography, photogrammetry or LiDAR data are used to develop digital elevation models (DEMs) and digital surface models (DSMs) (Gamba and Houshmand, 2002; Xu et al., 2017). These allow high resolution (< 1 m) characterisation of individual cities (Gage and Cooper, 2017; Goodwin et al., 2009; Holland et al., 2008; Lindberg et al., 2011; Lindberg and Grimmond, 2011). However, these data are unavailable for all cities worldwide due to the expense of data collection, storage and processing (Kent et al., 2019). Community volunteer projects (e.g., OpenStreetMap) collect building footprints, which may also have building height attributes provided (e.g., Microsoft providing U.S. cities, Heris et al. (2020)), or derived from other new techniques (e.g., using surrounding building information (Bernard et al., 2022)).
Building heights can be derived from remote sensing observations (Rao, 1972; Champeaux et al., 2005) but few satellite missions have both the coverage and sufficient resolution to provide individual building heights for large areas (Frantz et al., 2021). Datasets have also been derived from combining multiple satellite sources (e.g., Frantz et al. (2021)), and additionally with local building footprint data (e.g., Milojevic-Dupont et al., 2020), but studies such as these may only cover a single continent, or a few cities, and/or rely on the availability of open-source building footprints.
Across the range of current morphology dataset creation studies, methodologies are inconsistent and need to be standardised before building information (e.g., height) can be provided across large areas. Hence, providing parameterisations that use publicly available data to estimate the vertical structure of cities at resolutions suitable for NWP would be beneficial, until better datasets become available globally. Such relations have been developed for urban areas (e.g., plan area fraction of buildings, mean building height, frontal area index) and are used operationally, e.g., the UK Met Office uses the Bohnenstengel et al. (2011) relations within the UKV (Hertwig et al., 2020). These parameterisations often require inputs that are not widely available (e.g., urban land cover), are derived from a single city but applied globally, and/or assume very limited variability, e.g., one mean building height per built ('urban') fraction with saturation (i.e., tall dense city centres are the same as other areas). Intra-city urban form variability has been included using the Stewart and Oke (2012) local climate zones (LCZ), which provide representative ranges of values of parameters (e.g., plan area fraction of buildings or paved surfaces, roughness element height, and frontal area index) for each class. Datasets are available to use as urban model inputs (Demuzere et al., 2022), and LCZ have been integrated with models, e.g., within the Weather Research and Forecasting model with a single layer urban canopy model (Brousse et al., 2016; Molnár et al., 2019), with analysis of urban air temperatures.
Given the importance of radiation to surface energy exchange, this work, and multi-layer modelling generally, is motivated by the need to reduce the sources of error in urban radiation calculations, which arise from: (1) the radiation scheme, even if the urban morphology is known exactly; (2) approximating the morphology from a few parameters (e.g., plan area or building cover fraction, mean building height and total building wall area); and (3) incomplete knowledge of the parameters in any given city. Stretton et al. (2022) address the first of these, evaluating the SPARTACUS-Urban shortwave radiation scheme. They demonstrate that SPARTACUS-Urban accurately predicts profiles of absorption into urban facets (mean absolute error < 16%), and the effective albedo at the top of the canopy (normalised bias error < 6%) for real-world scenes. Overall, errors for all variables are largest when the sun is low in the sky, when the impact of the underlying assumptions on building geometry is largest.
The specific objective of this paper is to address (2) by identifying and parameterising key vertical profiles of the urban form, using methods and coefficients that can be used globally, while retaining some of the realistic intra-city variability. Given the range of urban forms (within and between cities) and the sparsity of data, the relations developed here ideally need to be both simple and universal. We focus on input parameters including building: height, plan area fraction (with height), and wall area (with height), as they define the area of roof and walls exposed for energy exchange with height. Problem (3) is partially addressed here by giving a range of typical values of parameters for six cities. However, the parameterisations developed in this paper will become more applicable once datasets of building cover and mean building height are available globally.

M.A. Stretton et al.
The parameters to describe the urban form are selected and defined (section 2) and used within this study's various methods (section 3). The proposed parameterisations are assessed with 'true' urban morphology (section 4), combined (section 4.4), and used with SPARTACUS-Urban (Hogan, 2019b; Stretton et al., 2022) to simulate absorbed shortwave radiation into the three facets (walls, roof, and ground) (section 5). The conclusions drawn are given in section 6.

Plan area fraction (λ p )
The plan area fraction of buildings (λ p ) is the ratio of the total area covered by buildings to the 'grid-cell' or total horizontal area of interest. Typically, λ p (z) is assumed to be constant with height and equal to its surface value (λ p (z = 0)), although real buildings rarely have equal heights within an area or an individual model grid-cell. Arrays of regular cubes are often modelled (Kanda et al., 2005a; Kanda et al., 2005b; Morrison et al., 2018; Stretton et al., 2022). In real cities, however, λ p (z) varies (Fig. 1a), resulting in variations in sunlit and shaded surfaces (e.g., roofs) at any given height. This allows interception of radiation reflected from higher surfaces, increasing the radiation trapping within the canopy.
λ p (z) can be related to λ p (z = 0) if the area-weighted mean building height (H) is known (e.g., determined from building data in real cities), assuming:
$\lambda_p(z) = \lambda_p(z = 0)\, y(z/\bar{H})$ (Eq. 1)
where y(z/H) is a universal function with the properties: y(0) = 1, y(∞) = 0, and the integral of y over z/H from zero to infinity must equal 1, so that the vertical integral of λ p (z) equals λ p (z = 0) H. A functional form for y is proposed in section 3.2.
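The three properties required of y can be checked numerically. The sketch below is only an illustration: it assumes the relation λ p (z) = λ p (z = 0) y(z/H) implied by the stated properties, and it uses a plain exponential, y(x) = exp(−x), as a stand-in shape. The exponential is not the functional form proposed in section 3.2; it is chosen here only because it satisfies y(0) = 1, y(∞) = 0 and a unit integral. The function names and input values are hypothetical.

```python
import numpy as np


def y_exponential(x):
    """Stand-in profile shape (an assumption, NOT the paper's form): y(x) = exp(-x)."""
    return np.exp(-np.asarray(x, dtype=float))


def plan_area_profile(z, lambda_p0, mean_height, y=y_exponential):
    """Assumed relation: lambda_p(z) = lambda_p(z=0) * y(z / H)."""
    return lambda_p0 * y(np.asarray(z, dtype=float) / mean_height)


# Hypothetical grid-cell: 30% building cover at the surface, mean height 10 m.
lambda_p0, mean_height = 0.3, 10.0
z = np.arange(0.0, 400.0, 0.5)  # heights (m) at 0.5 m spacing
profile = plan_area_profile(z, lambda_p0, mean_height)

# Trapezoidal vertical integral of lambda_p(z): should recover the
# normalised building volume lambda_p(z=0) * H because y integrates to 1.
volume = float(np.sum(0.5 * (profile[1:] + profile[:-1]) * np.diff(z)))
print(f"surface fraction = {profile[0]:.3f}, integrated volume = {volume:.3f} m")
```

Any other candidate y can be passed through the `y` argument and tested in the same way.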

Wall area: building perimeter length (L)
The wall area relates to the normalised building perimeter length, L, at any given height (L(z)) (Stretton et al., 2022). L is the total building perimeter (m) normalised by the total area of the grid-cell (m²). Thus, if the canopy is divided into n layers and layer i has thickness Δz i , then:
$\lambda_w = \sum_{i=1}^{n} L_i \, \Delta z_i$ (Eq. 2)
with λ w the total wall area divided by the grid-cell area (Masson et al., 2020). If all wall orientations are assumed to be equally probable within a grid-cell, then λ w = λ f π (Hogan and Shonk, 2013), where λ f is the frontal area index: the total projected wall area for a particular wind (or azimuth) direction, relative to the grid-cell area. λ f is often used to parameterise aerodynamic drag of roughness elements (Raupach, 1992; Grimmond and Oke, 1999; Sützl et al., 2020).
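As a small worked illustration of the layer sum and of the λ w = λ f π relation, the sketch below assumes the wall area index is the sum over layers of L i Δz i , as the text describes. The function names and the example profile are hypothetical.

```python
import numpy as np


def wall_area_index(L, dz):
    """lambda_w = sum_i L_i * dz_i (L in m^-1, dz in m)."""
    L = np.asarray(L, dtype=float)
    dz = np.broadcast_to(np.asarray(dz, dtype=float), L.shape)
    return float(np.sum(L * dz))


def frontal_area_index(lambda_w):
    """lambda_f = lambda_w / pi, valid if all wall orientations are equally probable."""
    return lambda_w / np.pi


# Hypothetical canopy: L = 0.04 m^-1 in each of 20 layers, each 0.5 m thick.
L_profile = np.full(20, 0.04)
lambda_w = wall_area_index(L_profile, 0.5)
lambda_f = frontal_area_index(lambda_w)
print(f"lambda_w = {lambda_w:.3f}, lambda_f = {lambda_f:.4f}")
```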

Effective building diameter (D)
Assuming both narrow and wide buildings have an equal probability of extending to any height (Fig. 1a), L is proportional to λ p . We use this to define an effective building diameter (D) that is independent of height:
$D = 4 \lambda_p(z) / L(z)$ (Eq. 3)
This is analogous to the relations used between perimeter length and area of clouds and trees (Jensen et al., 2008; Hogan et al., 2018).
The D parameter can be thought of as the diameter (or width) of buildings in an equivalent idealized city, with the same properties (λ p and L) as a real city (Fig. 1b), where all buildings are identical cylinders (or cubes). Note that we neither assume buildings have a particular shape, nor that they are all the same size in a real city. Rather, D quantifies the assumed proportionality between L and λ p in Eq. 3, while having a simple physical interpretation.
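The physical interpretation of D can be checked with an idealised city. The sketch below assumes the proportionality takes the form D = 4 λ p / L, which is the form that returns the building width for identical cubes and the diameter for identical cylinders. The function names and the numbers are illustrative only.

```python
import math


def effective_diameter(lambda_p, L):
    """Assumed form of the proportionality: D = 4 * lambda_p / L."""
    return 4.0 * lambda_p / L


def idealised_city(n_buildings, width, cell_area, shape="cube"):
    """Return (lambda_p, L) for identical buildings of a given width or diameter."""
    if shape == "cube":
        footprint, perimeter = width ** 2, 4.0 * width
    elif shape == "cylinder":
        footprint, perimeter = math.pi * width ** 2 / 4.0, math.pi * width
    else:
        raise ValueError("shape must be 'cube' or 'cylinder'")
    return n_buildings * footprint / cell_area, n_buildings * perimeter / cell_area


# Hypothetical 2 km x 2 km grid-cell containing 3000 buildings, each 20 m wide.
cell_area = 2000.0 * 2000.0
d_cube = effective_diameter(*idealised_city(3000, 20.0, cell_area, "cube"))
d_cylinder = effective_diameter(*idealised_city(3000, 20.0, cell_area, "cylinder"))
print(d_cube, d_cylinder)
```

Both shapes recover the 20 m width, which is why D can be read as a building size without assuming a particular building shape.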

Deriving λ p (z) and L(z) from high resolution reference data
Profiles of λ p and L are derived from high resolution reference data from six cities with differing characteristics (e.g., morphology, city layout) (Table 1). The cities chosen are Auckland, New Zealand; Berlin, Germany; Birmingham, UK; London, UK; New York City (NYC), US; and Sao Paulo, Brazil. Multiple global cities are used to capture the variation in building and land-use types exhibited across urban areas.
For London, Berlin, and Birmingham, building footprints (Umweltatlas Berlin, 2010; EMU Analytics, 2018) are used to create a 1 m × 1 m raster with mean height assigned to each building. Therefore, buildings have flat roofs and non-tapering vertical walls without building features (e.g., pitched roofs) above the mean height. During rasterization, "false walls" between adjoining (e.g., terraced) and overlapping buildings are removed. Building information for Auckland, NYC, and Sao Paulo is from LiDAR datasets provided as 4 m × 4 m rasters (Kent et al., 2019). Although some differences in building morphology may arise from the raster resolution chosen (Fig. SM 1, Table SM 1), using the highest resolution available (i.e., not coarsening to the 4 m × 4 m resolution of some datasets) allows the parameterisations to be closer to 'true' data. Buildings/pixels with height values below 2.5 m (mean storey height in the UK (OPDC, 2018)) or without height information are discarded for consistency across all datasets. For all grid-cells in all cities, the topography is assumed to be flat.
Each city dataset is split into 2 km × 2 km grid-cells, to have a similar resolution to operational limited-area NWP models (e.g., Met Office 1.5 km UKV model (Tang et al., 2013), DWD 2.8 km COSMO-DE model (Baldauf et al., 2011)). At this scale, most grid-cells will have many buildings with different heights, but are less likely to contain multiple neighbourhoods with very different characteristics. Grid-cells with incomplete (i.e., missing) building data (determined from visual inspection of aerial imagery) and/or any areas with extremely small building coverage (λ p (z = 0) < 0.001) are removed. Profiles are calculated using a height interval (Δz) of 0.5 m.
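The derivation of profiles from a building-height raster can be sketched as below. This is a minimal illustration and not the processing chain used for the reference data: the function name, the pixel-edge counting used to estimate perimeter, and the toy raster are all assumptions. Counting edges between built and open pixels means adjoining pixels of the same building (or of touching buildings) contribute no wall, similar in spirit to the removal of "false walls", although a raster estimate will overstate the perimeter of walls that run diagonally to the grid.

```python
import numpy as np


def morphology_profiles(heights, dx, dz=0.5, min_height=2.5):
    """Return z, lambda_p(z) and L(z) for one grid-cell of a flat-roofed height raster.

    heights: 2D array of building height (m); 0 or NaN where there is no building.
    dx: pixel size (m). dz: vertical interval (m). min_height: discard threshold (m).
    """
    h = np.nan_to_num(np.asarray(heights, dtype=float), nan=0.0)
    h[h < min_height] = 0.0                  # discard low or missing pixels
    area = h.size * dx * dx                  # grid-cell area (m2)
    z = np.arange(0.0, h.max() + dz, dz)     # evaluation heights (m)
    lambda_p = np.empty_like(z)
    L = np.empty_like(z)
    for k, zk in enumerate(z):
        # True where a building extends above zk; pad so cell borders count as open.
        built = np.pad(h > zk, 1, constant_values=False)
        # Each built/open pixel boundary is one wall segment of length dx.
        edges = (np.sum(built[1:, :] != built[:-1, :])
                 + np.sum(built[:, 1:] != built[:, :-1]))
        lambda_p[k] = built.sum() * dx * dx / area
        L[k] = edges * dx / area
    return z, lambda_p, L


# Toy 100 m x 100 m cell at 1 m resolution (hypothetical buildings).
demo = np.zeros((100, 100))
demo[10:30, 10:20] = 12.0   # 20 m x 10 m building, 12 m tall
demo[50:60, 50:60] = 6.0    # 10 m x 10 m building, 6 m tall
demo[80:82, 80:82] = 2.0    # shed below the 2.5 m threshold, discarded
z, lambda_p, L = morphology_profiles(demo, dx=1.0)
print(lambda_p[0], L[0])
```

In this toy cell the plan area fraction and perimeter length both step down at 6 m, where the shorter building ends, and reach zero at 12 m.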

Deriving globally obtainable parameterisations for λ p (z) and L(z)
Table 1
Cities and high-resolution reference data used to evaluate urban form parameterisations. City area (A city ) from Demographia World Urban Areas (2020), US Gazetteer (2021) as basis for the whole city area; an area can be >100% if it contains areas outside of the city, or areas of water. The 2 km × 2 km parameters are derived from either raster (rDSM) or vector (vDSM) digital surface models with the horizontal resolution (Δx) indicated.

The profiles of λ p and L are parameterised for the six cities (Table 1). The parameterisations developed for λ p (z) and L(z) are used in five combinations of increasing complexity, with differing input data requirements. These are evaluated against the high-resolution reference datasets (P0). The parameterisations (P#) of λ p (z) and L(z) are given increasing numbers (#, 1 → 5) as the input requirements decrease (i.e., with the intention they are more globally applicable) (Table 2):

P1, the most data-demanding parameterisation (e.g., from high-resolution raster data, section 3.1), uses λ w and λ p (z), which are not (yet) globally available. An effective building diameter, D, is used to predict L(z) to ensure the latter satisfies Eq. 2 (Table 2). This can be achieved if λ w is known, by:
$D = 4V / \lambda_w$
where the normalised building volume, V, is the ratio of building volume to grid-cell area, related to building plan area and mean building height (H) by V = λ p (z = 0) H (parameterisation: wall area conserved (CWA) D, Table 2).

P2 uses λ w as in P1 but parameterises λ p (z) from λ p (z = 0) and mean building height (H) using Eq. 1, with:
where x = z/H, and a is a function of b that ensures the integral of y from zero to infinity is one. Fig. SM 2 demonstrates that this functional form is a good fit to the median building-fraction profile from real cities worldwide. The best-fit parameter, b, derived from the high-resolution reference data is smaller when there is larger variation in building heights within a grid-cell. To determine b, we take λ p (z), and normalise each axis by H and λ p (z = 0), such that the resultant curve represents the y(x) function from Eq. 4 (Fig. SM 2). The normalised curves are interpolated using a common vertical normalised height interval of 0.05, prior to determining the median profile (Fig. SM 2a, b). For the λ p (z) variable b parameterisation (Table 2), grid-cells are categorised in H intervals. Values of b are derived using median normalised profiles for each H interval calculated from all cities.

P3 is P2 but for λ p (z) the fixed b parameterisation (Table 2) is used with b = 4.7 (Eq. 4) across all H for all grid-cells. This value is derived from the multi-city median y(x). Using one b value allows assessment of the similarity between cities, and of whether parameterisations perform better (less error in radiation fluxes) when more data are included.

P4 requires only grid-cell values of λ p (z = 0) and H. λ p (z) is obtained as in P2, and L is derived by calculating an effective building diameter (D) from:
where the constants p, q, and r (p = 0.847, q = 5.17, r = 11.96) are derived by fitting D to λ p (z = 0) and H across all cities (Table 1, Table 2). This is referred to as linear-fit D (Table 2).

P5, the simplest case, uses only λ p (z = 0) and H, so requires the least data. It parameterises λ p (z) as in P3. To obtain L, D is set to 20.93 m for all cities (fixed D, Table 2) and λ w is assumed proportional to V (Eq. 4).
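The wall-area-conserving choice of D in P1 can be illustrated with a short sketch. It assumes D = 4V/λ w and L(z) = 4 λ p (z)/D, which follow from combining the layer sum for λ w with the assumed proportionality between L and λ p ; these forms are a reconstruction for illustration and the function names and example profile are hypothetical.

```python
import numpy as np


def cwa_diameter(lambda_p0, mean_height, lambda_w):
    """Wall-area-conserving D (assumed form): D = 4 V / lambda_w, V = lambda_p0 * H."""
    return 4.0 * lambda_p0 * mean_height / lambda_w


def perimeter_profile(lambda_p, D):
    """Assumed proportionality: L(z) = 4 * lambda_p(z) / D."""
    return 4.0 * np.asarray(lambda_p, dtype=float) / D


# Hypothetical grid-cell with two building classes, 0.5 m layers:
# 30% cover up to 6 m, then 10% cover from 6 m to 12 m.
dz = 0.5
lambda_p = np.concatenate([np.full(12, 0.3), np.full(12, 0.1)])
volume = float(np.sum(lambda_p) * dz)        # V = 2.4 m
mean_height = volume / lambda_p[0]           # area-weighted H = 8 m
lambda_w_true = 0.5                          # 'known' wall area index

D = cwa_diameter(lambda_p[0], mean_height, lambda_w_true)
L = perimeter_profile(lambda_p, D)
lambda_w_check = float(np.sum(L) * dz)       # layer sum of L * dz
print(D, lambda_w_check)
```

By construction the layer sum of the predicted L(z) returns the input λ w , which is the property P1 to P3 rely on.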
All the 2 km × 2 km grid-cells available are used to derive parameters for the L(z) and fixed b parameterisations. For the variable b parameterisations (P2, P4), for each height interval a bootstrapped random sample (Padiyedath Gopalan et al., 2019) of 1000 is used to ensure all intervals have the same sample size.
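One way to draw an equal-sized bootstrap sample for each H interval is sketched below. The text does not specify exactly what is resampled, so this assumes normalised grid-cell profiles (already interpolated onto a common x grid) are drawn with replacement and the median is taken across the sample. The function name, seed handling and synthetic profiles are hypothetical.

```python
import numpy as np


def bootstrap_median_profile(profiles, n_samples=1000, seed=0):
    """Median of a bootstrap sample of normalised profiles.

    profiles: array (n_cells, n_levels) of y(x) curves on a common x grid.
    Rows are drawn with replacement so every H interval contributes the
    same sample size, however many grid-cells it contains.
    """
    profiles = np.asarray(profiles, dtype=float)
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, profiles.shape[0], size=n_samples)
    return np.median(profiles[rows], axis=0)


# Synthetic stand-in for the grid-cells of one H interval (not real data):
# normalised height x from 0 to 5 in steps of 0.05, exponential-like curves.
x = np.linspace(0.0, 5.0, 101)
rates = np.array([0.5, 1.0, 2.0, 4.0])
cell_profiles = np.exp(-np.outer(rates, x))
median_profile = bootstrap_median_profile(cell_profiles)
print(median_profile[:3])
```

A best-fit b for each interval would then be obtained by fitting the chosen functional form to `median_profile`.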

SPARTACUS-Urban radiative transfer model
The original SPARTACUS (Speedy Algorithm for Radiative Transfer through Cloud Sides) 3D radiative exchange model for complex cloud fields simulates lateral radiative exchange between clear and cloudy regions in proportion to the cloud edge length per unit area of an NWP grid-cell, using cloud fraction and cloud edge length (Hogan et al., 2016). A similar approach has been applied to forest vegetation (Hogan et al., 2018) and city buildings (Hogan, 2019b). All three assume any obstacles to radiation are randomly distributed within a horizontal plane, allowing the mean radiation field to be modelled as a function of height. The open-source software SPARTACUS-Surface combines SPARTACUS-Urban and SPARTACUS-Vegetation. As we focus only on buildings, we refer to it as SPARTACUS-Urban.
SPARTACUS-Urban is underpinned by the 1D discrete-ordinate method, solving coupled ordinary differential equations for 2N streams of radiation, with N streams per hemisphere. The radiation field is described more accurately as N increases, but with added computational cost. In this work, 16 streams are used (i.e., N = 8). A scene (any combination of building geometry, solar zenith angle, and albedo) is split into n layers to calculate radiative interactions per level. Each layer has regions of clear-air and buildings. SPARTACUS-Urban computes the radiative interactions between the three facets (wall, roof, and ground) using the vertical profiles of L and λ p to characterise the urban form with z within a grid-cell. If vegetation is included, it also needs to be accounted for in each layer.

Table 2
Inputs required for parameterisations (P1 → P5) to determine building fraction (λ p ) and normalised building edge length (L) from D (Fig. 1b) (assumed constant with height). These are evaluated using the actual 'true' profiles of λ p and L (P0, section 3.1). Mean building height, H, and building fraction at the surface, λ p (z = 0), are required inputs for all parameterisations.

Metrics to evaluate parameterisations against high resolution reference datasets
The parameterisations (P1 → P5, Table 2) are evaluated using the bias error (BE i = P# i − P0 i ) between the 'true' (P0) and parameterised profiles for building fraction and building edge length. The BE of individual λ p and L vertical profiles across each city are analysed using the median, mean, and 5th-95th percentiles.
Similarly, the λ w for L parameterisations is evaluated using the mean BE (MBE). The mean λ w is also evaluated using a normalised MBE (nMBE), computed by dividing the MBE by the 'true' mean λ w computed from the P0 data, and multiplying by 100 to give a percentage. The mean absolute error (MAE = Σ(|P# i − P0 i |)/n) for λ w for each parameterisation is calculated for the number of grid-cells per city (n), to examine the performance across all data.
The impact of the parameterisations on the vertically integrated shortwave absorption into the walls (a Wall ), roof (a Roof ) and ground (a Ground ) per unit area of the entire horizontal grid-cell (W m⁻²) is assessed using the MBE, nMBE, and MAE as above. The total absorption is also assessed for each grid-cell using the normalised BE (nBE). Similarly, we examine the impact on the shortwave bulk albedo for each grid-cell as above.
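The evaluation metrics defined in this section can be written compactly as below. This is a sketch of the definitions as stated in the text (BE as parameterised minus 'true', nMBE as the MBE divided by the 'true' mean, in percent); the function names and the example values are hypothetical and are not results from the paper.

```python
import numpy as np


def bias_error(param, truth):
    """BE_i = P#_i - P0_i, element-wise."""
    return np.asarray(param, dtype=float) - np.asarray(truth, dtype=float)


def mean_bias_error(param, truth):
    """MBE: mean of the bias errors."""
    return float(np.mean(bias_error(param, truth)))


def normalised_mbe(param, truth):
    """nMBE (%): MBE divided by the mean 'true' (P0) value, times 100."""
    return 100.0 * mean_bias_error(param, truth) / float(np.mean(truth))


def mean_absolute_error(param, truth):
    """MAE: sum of |P#_i - P0_i| over the n grid-cells, divided by n."""
    return float(np.mean(np.abs(bias_error(param, truth))))


# Hypothetical wall area index for three grid-cells (illustrative values only).
truth = np.array([0.40, 0.50, 0.60])   # P0
param = np.array([0.36, 0.50, 0.58])   # a parameterisation, P#
mbe = mean_bias_error(param, truth)
nmbe = normalised_mbe(param, truth)
mae = mean_absolute_error(param, truth)
print(f"MBE = {mbe:.3f}, nMBE = {nmbe:.1f}%, MAE = {mae:.3f}")
```

A negative MBE or nMBE indicates the parameterisation underestimates the quantity relative to P0.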

Spatial variation of morphology in high resolution reference datasets
For all six cities, the largest values of both plan area fraction at the surface (λ p (z = 0)) and mean height (H) are found near the city centre or central business district (CBD) (Fig. 2). The largest H values occur in NYC (e.g., Manhattan) (~30 m for 2 km × 2 km grid-cells) and Berlin. The lower values in other cities may be due to large variations in building types within a grid-cell, e.g., in Sao Paulo very tall buildings are often surrounded by areas of smaller buildings within the same grid-cells. Values of λ p (z = 0) are higher in NYC, Sao Paulo, and Auckland. Similar values of L(z = 0) are found across each individual city, lowest in Berlin, Birmingham, and London (< 0.05 m⁻¹) and largest in Auckland, NYC, and Sao Paulo (0.05-0.1 m⁻¹). These could be impacted by the fraction of the whole city analysed and the definitions used (Table 1), as this varies between 45% and > 100%. For example, in Auckland, a smaller fraction (0.45) of the city is analysed, so it is likely that less variability in land-use and building type is characterised. However, each city has grid-cells that contain parks, residential areas, and parts of the CBD. Although only six cities are analysed, they cover a wide range of λ p and H combinations, city layouts, locations, employment types/industries, and demographic characteristics. However, as other cities are analysed the result details will vary.

Fig. 2. Parameters derived from building data (Table 1) at 2 km × 2 km resolution for (columns) six cities: (row 1) building fraction at the surface (λ p (z = 0)), (row 2) mean building height (H), and (row 3) normalised building perimeter length at the surface (L(z = 0)) (Eq. 3); with city boundaries (black) and water bodies (blue) shown. Data sources are given in Table 1. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Across all cities the 2 km × 2 km grid-cells generally have a H of <30 m (colour bar, Fig. 3). The few grid-cells with H > 40 m often have larger λ p (z = 0) values. This suggests that grid-cells are not dominated by a large number of tall buildings, although this depends on the exact grid-cell location. Effective building diameter (D) values are always >8 m but mostly <40 m (Fig. 3). Values of D tend to increase with λ p (z = 0) (between 10 and 20 m, when surface building fractions are 0.0 to 0.5) in grid-cells with low values of H (< 10 m) (Fig. 3). Grid-cells with large values of D (up to 100 m) tend to have smaller λ p (z = 0).

Evaluation of parameterisations for normalised perimeter length (L)
We assess if the assumption that the effective building diameter (D) is constant with height (z) (Section 3.2) is appropriate. Within and between cities, D varies with z (Fig. 4). The mean and median D values across each city appear to follow similar vertical relations, with the mean D at each height larger than the median, especially with increasing height. In Birmingham, London, and NYC, D is approximately constant with height between 7 and 10 m, and at ~25 m (Birmingham and London), and ~35 m (NYC). Similar behaviour is seen in Berlin, although the mean and median D values are much closer in magnitude. In Sao Paulo, D is roughly constant with height above 20 m. In Auckland, D decreases with height, with D > 20 m until z > 50 m; but Auckland has both the fewest grid-cells (only 59) and smallest fraction of the city analysed (Table 1).
As the vertical variation of D is much smaller than the vertical variation of plan area fraction of buildings (λ p , Fig. 4), we test the consequence of assuming D is constant with height. In four of the six cities, wider buildings tend to extend higher into the urban canopy (not Auckland or Sao Paulo). Auckland appears to be an outlier (cf. other cities), possibly because of the smaller fraction of the city assessed (0.45, Table 1). In grid-cells with taller buildings (larger mean H), the assumption of a constant D is appropriate, but could be replaced with values obtained from two (or more) height intervals to better characterise the vertical profile; however, this additional complexity is not introduced here.
Examining the normalised perimeter length at each height level (L(z), Eq. 3, Table 2) parameterisations derived using the high-resolution reference dataset λ p profile for each grid-cell, an increase in L(z) bias error (BE) between the linear-fit D and fixed D parameterisations is evident (Fig. 5). In most cities there is a tendency to overestimate L(z) higher in the canopy (Fig. 4, Fig. 5), indicating an under-estimation in effective building diameter (i.e., from Eq. 3). Hence, buildings with a larger horizontal size are more likely to be taller.

Fig. 3. Relation between the three parameters used in the 'linear-fit D' parameterisation (Table 2) for all grid-cells (N = 1429): building fraction at the surface (λ p (z = 0)), effective building diameter (D) calculated using normalised wall area (λ w ) and mean building height (H).
The λ w nMBE for the linear-fit D parameterisation is lowest in NYC (0.01%), and largest in Auckland (−15%) (Table 3). The fixed D parameterisation of λ w has larger nMBE than the linear-fit D parameterisation, with values between −6.4% (Berlin) and −26% (Auckland). Generally, all nMBE indicate λ w is underestimated when using both linear-fit D and the fixed D parameterisations for L(z) (Table 3). For Auckland, London, and Birmingham, the linear-fit D parameterisation performance is always better than the fixed D parameterisation. The distribution of BE in L(z) (Fig. 5) for both the linear-fit D and fixed D parameterisations indicates the median is <0.005 m⁻¹ (mean L(z = 0) spans 0.02-0.05 m⁻¹). Exceptions to this include the fixed D parameterisation in Auckland. The 5th-95th percentiles of profiles (shading, Fig. 5) for Berlin, London, and Birmingham do not exceed 0.04 m⁻¹ at any height but are slightly larger for the remaining cities. Both UK cities (London and Birmingham) have similar shaped BE profiles, suggesting a similar urban structure. In London, the L skill changes with height from an underestimation (z < ~7 m) to overestimation (z > ~9 m) (Fig. 4). This could result from the larger H variation across the city (Fig. 2), cf. Birmingham which has lower H with less variation, as the CBD covers fewer grid-cells. NYC has the largest L(z) differences of the six cities, with the fixed D parameterisation median BE and MBE generally smaller than the linear-fit D parameterisation for z ≤ 10 m. For all cities the largest L(z) errors occur for z < 10 m, with most overestimating L(z) for both parameterisations below 5 m.

Evaluation of parameterisations for plan area fraction of buildings (λ p )
Parameterisation methods for λ p (variable b and fixed b, Table 2) are evaluated using the λ p (z) vertical profiles from the high-resolution reference datasets (P0, Table 2) for each grid-cell (Fig. 2). For the variable b parameterisation, b varies with H intervals (b = 2.1-6.5, Fig. 6), whereas fixed b uses one multi-city value (b = 4.7) independent of city. As both parameterisations assume λ p (z = 0) is known, they agree with the 'truth' at the surface.
Generally, b decreases when mean building heights exceed 7 m (Fig. 6). The multi-city fixed b is approximately equivalent to the variable b parameterisation value of b at ~7 m. Grid-cells with the lowest H have the smallest variation in building heights and are associated with the largest b values. Areas with low H are often found in suburbs, dominated by two-storey buildings. This contrasts with the CBD, where a wider variation of building heights occurs, with lower values of b.
The variability around the median multi-city curve (black, Fig. 6) shows the variability in b is higher when H is larger (whiskers, …

Table 4
As Table 3, wall area (λ w ) for six cities calculated using P2-P5 (Table 2) assessed using P0 with bias and normalised bias error (MBE and nMBE, section 3.4).

… cities (Asian megacities) may sit with Sao Paulo, or more with NYC.
Using the variable b and fixed b parameterisations (Table 2) gives absolute BE for λ p of <0.025 for all heights in all cities (black lines, Fig. 7), except Sao Paulo. There is little difference between the two parameterisations. Generally, 90% of the data (5th-95th percentile, shading Fig. 7) in each city are within 0.03 of the 'true' values at each height. These results suggest Eq. 1 is applicable to real-world cities.
Focusing on individual cities, the UK cities have similar vertical profiles (Fig. 7), as expected given their similar b values (Fig. 6). For Auckland and Sao Paulo, λ p is underestimated below 10 m and overestimated above, the opposite to London and Birmingham. For NYC, large errors extend to higher building heights, especially for the fixed b parameterisation. In Sao Paulo the largest BE (> 0.05) occur when z < 10 m. Berlin, up to 20 m, has the second largest maximum magnitude of BE (~0.06), but low mean and median absolute BE (< 0.02).
Overall, the fixed b parameterisation (Eq. 4) gives a reasonable fit in all cities. Comparing both parameterisations (variable b, fixed b) shows only small improvements when b values depend on mean building height.

Evaluation of combined parameterisations
The combined L(z) and λ p parameterisations (P1 → P5, Table 2) are evaluated using the P0 high resolution reference profiles for each city. This evaluation also assesses λ w for parameterisations of L.
MBE for λ w is smallest if more city-specific information is used (Table 2). For mean λ w , the nMBE are <2.5% for P2 and P3, but up to −17% for P4 (Table 4). The largest nMBE occur in Auckland and Berlin, and smallest in Birmingham and London. Using fixed coefficients (P5) has the largest nMBE in all cities, with Auckland the poorest (−27%). For P2 and P3, profiles of L(z) have similar BE in all cities (Fig. 8). In NYC, P3 tends to underestimate L(z), and P2 overestimate L(z), least. Across all cities, 90% (5th-95th percentile, shading Fig. 8) of both the P2 and P3 profiles are <0.01 m⁻¹ different from P0, except for Sao Paulo between 5 and 10 m. In London and Birmingham, both P2 and P3 overestimate L(z) below z = 5 m but underestimate L(z) above, while in Sao Paulo and Auckland the opposite occurs. Overall, the largest errors occur in Sao Paulo.

Fig. 9. As Fig. 8, but for P4 and P5.

Table 5
As Table 4, but for SPARTACUS-Urban simulated total absorption when the solar zenith angle is 75° for (a) walls, (b) roof, and (c) ground facets.
Consistent with the errors in total λ w (Table 4), P4 and P5 perform the poorest for L(z) (Fig. 9). Generally, the absolute values of the L(z) BE are <0.01 for P4 (black) and P5 (red, Fig. 9), with close to 90% of the profiles being within a magnitude of 0.02 of P0. For all cities when P5 is used, the range of the 5th-95th percentile of profiles is wider.
5. Impacts of applying parameterisations for λ p and L in the SPARTACUS-Urban radiative transfer model

Impact on vertically integrated absorbed shortwave radiation
To assess the impact of approximating the urban morphology using parameterisations (P2-P5, Table 2) on simulated (section 3.3) total integrated shortwave absorption into urban facets, results are compared to the P0 in each grid-cell. Although three solar zenith The SPARTACUS-Urban simulated wall absorption (a Wall ) with θ 0 = 75° generally has nMBE magnitudes >3% for P2 (cf. > 15% as complexity increases to P5, Table 5) across all cities. Consistent with the poorer skill of the urban form parameterisations in Auckland (section 4), a Wall performance is poorest there (P4 15%; P5 25%; Table 5). The largest nBE in a Wall occur when using P1-P3 in Auckland, NYC, and Sao Paulo, in areas with the largest λ p (z = 0) and L(z = 0) (Fig. 2, cf. Fig. 10). However, for P4 and P5 no clear relation is evident between grid-cell morphology and a Wall nBE. Generally, P1-P3 underestimate the absorption of radiation by walls (Table 5), which is consistent with the morphology results (Table 4).
For simulated a Roof using P2-P5 morphology, all cities have a mean nBE < 5% (Table 5), and across all grid-cells the largest nBE magnitudes are <25% (Fig. 11). Consistent with the skill of the morphology parameterisations (section 4), a Roof is overestimated in Sao Paulo and Auckland, and underestimated in London and Birmingham. The highest nBE occur in areas with the largest λ p (z = 0) (e.g., CBD, industrial), and therefore where P0 a Roof is highest.
Across all parameterisations, many cities have an absolute value of nMBE in mean a Ground of <2% (Table 5), apart from P5 in Auckland (7.5%) and NYC (9%). The spatial patterns of a Ground show underestimates in Auckland and Sao Paulo, but overestimates in the other cities when using P1-P3 (Fig. 12).
Overall, the lowest nMBE occur when θ 0 = 0° with a Roof and a Ground nBE < 1% for all cities (Table SM 3

Impact on shortwave bulk albedo
We additionally assess the error introduced by approximating the urban morphology using P2-P5 in the simulated bulk albedo for each grid-cell (i.e., ratio of upwelling to downwelling shortwave radiation at the top of the urban canopy). For all θ 0 values tested, the nBE have similar magnitudes across the six cities, so we focus on θ 0 = 45°. The nBE results for θ 0 = 75° are slightly larger (Fig. SM 7).
Simulated bulk albedo generally increases with parameterisation complexity (i.e., P1-P5), but all have an absolute nBE < 10% (Fig. 13). This is smaller than the total facet absorption nBE (Figs. 10-12), which have a maximum of ±20-50% for P5 (largest for a Wall ). This suggests bulk effects result from within-canopy compensation (i.e., some absorption overestimates and underestimates). The net result is a good approximation of the shortwave bulk albedo for all parameterisations. Unlike absorption, albedo has a similar trend for all six cities, being overestimated for P1-P3 in all grid-cells and for P4-P5 in most grid-cells. For the latter, notably P5, the albedo is underestimated in areas with higher surface building fractions (Fig. 2).
Hence, the crudest assumptions about urban form have low impact on the bulk albedo. However, this contrasts with the error in shortwave absorption (section 5.1), suggesting that the within-canopy impacts work to cancel one another out.
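The compensation argument can be made concrete with a small worked example. It assumes that all downwelling shortwave radiation is either absorbed by one of the three facets or reflected out of the canopy (no absorption by vegetation or air), so that bulk albedo equals one minus the summed facet absorption fractions. The absorption values are purely illustrative and are not results from SPARTACUS-Urban.

```python
def bulk_albedo(a_wall, a_roof, a_ground):
    """Bulk albedo implied by facet absorption, each given as a fraction
    of the downwelling shortwave flux at the top of the canopy.

    Assumes facets are the only absorbers (an illustrative simplification).
    """
    return 1.0 - (a_wall + a_roof + a_ground)


def nbe_percent(estimate, reference):
    """Normalised bias error [%] of a single value."""
    return 100.0 * (estimate - reference) / reference


# Illustrative absorption fractions (hypothetical, not from the study).
reference = {"a_wall": 0.30, "a_roof": 0.25, "a_ground": 0.30}
parameterised = {"a_wall": 0.24, "a_roof": 0.27, "a_ground": 0.33}

facet_nbe = {name: nbe_percent(parameterised[name], reference[name])
             for name in reference}
albedo_ref = bulk_albedo(**reference)
albedo_par = bulk_albedo(**parameterised)
albedo_nbe = nbe_percent(albedo_par, albedo_ref)

for name, value in facet_nbe.items():
    print(f"{name}: nBE = {value:+.1f}%")
print(f"bulk albedo: {albedo_ref:.3f} -> {albedo_par:.3f}, nBE = {albedo_nbe:+.1f}%")
```

Here the wall underestimate is largely offset by roof and ground overestimates, so the summed absorption, and therefore the albedo, changes much less than any individual facet term.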

Conclusions
Given the challenges in collecting urban morphology data (Frantz et al., 2021; Masson, 2020), which are critical to simulating meteorological processes in cities (e.g., Hogan, 2019b; Krayenhoff et al., 2020), we propose methods to parameterise vertical profiles of building plan area fraction (λ p ) and normalised building perimeter length (L). The latter is simply the vertically resolved normalised wall area.
Although profiles can be derived from high resolution building height data, few datasets with large spatial extent exist. Here, data are analysed for six cities (in Europe, North America, Oceania, and South America) to derive both city specific and multi-city parameters, and to evaluate model skill.
The impact of parameter uncertainty is assessed using SPARTACUS-Surface (Hogan, 2019b) shortwave radiation simulations of albedo and of shortwave radiation absorption by three urban facets (roof, walls, ground). The grid-cell bulk albedo has an absolute normalised bias error of ≤10% when using the urban form parameterisations. Albedo is more likely to be underestimated in high-density central-city areas when the most assumptions are made about the urban morphology. Reducing the assumptions made, and therefore increasing the knowledge of the urban morphology used within the parameterisation, decreases the absolute value of normalised bias errors to ≤2% for all areas of cities.
The low error in bulk albedo hides errors within the urban canopy introduced by the parameterisations, as errors in shortwave absorption appear to cancel out. This may make the parameterisations acceptable for use with more complex models. Generally, low (< 20%) absolute values of normalised errors in absorption occur when the sun is overhead, when building fraction parameterisation is more influential (cf. wall area parameterisation). The largest normalised bias errors (up to 50%) occur in wall absorption for larger solar zenith angles, when wall area is assumed unknown (Section 4.4, 5.1). Thus, trying to overcome a lack of urban form data through parameterisations introduces larger errors. Hence, if vertical profiles of radiation absorption are the focus, appropriate wall area data are critical.
We conclude that building height variations can be approximated using a function of mean building height and surface λ p. Parameters are derived both across all cities and for each city. Overall, 90% of the profiles have building fraction error within ±0.03 at any height (cf. 'true' data). As there is little difference between the two methods in four out of six cities analysed, the multi-city approach may be acceptable.
Additionally, we find intra-city building horizontal extent can be approximated using an effective building diameter D derived either as a function of mean building height and λ p, or from the 'true' wall area. The latter is most constrained by data availability. Although we treat D as constant with height, we find evidence suggesting two values may improve representation of vertical variability. The near ground (for areas with λ p > 0.05) mean D across all six cities is ~21 m. For all methods used to determine D, the absolute value of median bias error is <6%, with the magnitude of bias error of 90% of all profiles <0.02 m⁻¹ (cf. 0.02-0.05 m⁻¹ at the surface).
The proposed parameterisations can be combined to give a full description of the vertical morphology; the five combinations used here have decreasing data requirements. Little skill is lost from using a more general (cf. detailed) parameterisation for λ p. Parameterisation of L has a larger impact on wall area. If wall area is known but λ p is parameterised, the mean total wall area is calculated within 3% for all cities, whereas if wall area is unknown this error can increase by up to an order of magnitude. The parameterisations for the six cities perform best in areas with both low λ p and L (e.g., suburban areas). In denser areas, errors are generally larger and increase without knowledge of wall area. Additionally, in denser areas the errors in absorbed shortwave and bulk albedo may have opposite signs to those in the outer city.
Overall, as our parameterisations can characterise vertical profiles of λ p and L, they have utility in multi-layer urban canopy models (e.g., SPARTACUS-Urban, BEP, TEB). However, the uncertainty in radiative fluxes shown here will impact other modelled fluxes (e.g., storage and sensible heat fluxes), so future work should assess these cascading effects. Alongside this, as the morphology within a given grid cell influences the surface roughness, future work could include assessing the impact of the parameterisations developed here on the momentum transfer. Here, only λ p and mean building height data are used to approximate urban form. Including building wall area information would improve this, but it is the most challenging of the three variables to obtain. Additionally, we note that the coefficients and best-fit parameters derived here, which impact absorption and albedo, may change as the range and mean of city properties are expanded and analysed at different scales.
If only mean building height and plan area fraction are available for a city at grid resolution, building height and λ p can be derived from existing empirical relations, such as Bohnenstengel et al. (2011), that are currently applied globally, despite being developed from data from a single city. However, these relations require input data (i.e., built or 'urban' [sic] fraction). Thus, there is a clear and increasing need to improve global datasets of urban morphological parameters for weather and climate models, but also for a wide range of other applications that use the outputs from these models for decision making and managing cities day-to-day, in emergencies and for the long-term changes (both climate and urban).

Fig. 1. Effective building diameter (D): (a) assumption it is constant with height (Eq. 3) implies an equal probability of both large and small buildings extending to different heights (cross section); and (b) plan view of two equivalent areas with buildings that are either cuboids of width D or cylinders of diameter D.
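The equivalence of the two building shapes in panel (b) can be checked numerically. The sketch below assumes the effective diameter is defined as D = 4 λ p / L, which is the form commonly used with SPARTACUS-Urban; Eq. 3 itself is not reproduced in this section, so this form is an assumption here. The building count and ground area are hypothetical, and D = 21 m is taken from the multi-city near-ground mean quoted in the conclusions.

```python
import math


def cuboid_metrics(n_buildings, width, ground_area):
    """Plan area fraction and normalised perimeter for square-plan cuboids."""
    lambda_p = n_buildings * width**2 / ground_area
    L = n_buildings * 4 * width / ground_area          # [m^-1]
    return lambda_p, L


def cylinder_metrics(n_buildings, diameter, ground_area):
    """Plan area fraction and normalised perimeter for cylinders."""
    lambda_p = n_buildings * math.pi * diameter**2 / 4 / ground_area
    L = n_buildings * math.pi * diameter / ground_area  # [m^-1]
    return lambda_p, L


def effective_diameter(lambda_p, L):
    """Effective building diameter, assuming D = 4 * lambda_p / L."""
    return 4 * lambda_p / L


# Hypothetical grid-cell: 10 buildings on 1 ha of ground, D = 21 m.
D, n, area = 21.0, 10, 10_000.0
for label, metrics in (("cuboids", cuboid_metrics(n, D, area)),
                       ("cylinders", cylinder_metrics(n, D, area))):
    lambda_p, L = metrics
    print(f"{label}: lambda_p={lambda_p:.3f}, L={L:.4f} m^-1, "
          f"D={effective_diameter(lambda_p, L):.1f} m")
```

Under this assumed definition both shapes return the same D even though their λ p and L differ, because area and perimeter scale with the same shape factor.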

Fig. 5. Mean building edge length at the surface (L(0)) and vertical profiles of bias error (BE, section 3.4) of normalised building edge length (L) determined using the 'linear-fit D' parameterisation (grey) and 'fixed D' parameterisation (red) (Table 2) for (a) Auckland, (b) Berlin, (c) Birmingham, (d) London, (e) NYC, and (f) Sao Paulo. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Relation between b (Eq. 4) and mean building height (H) for each city (Table 1) and across all cities, for H intervals (2 m: 2-20 m, 5 m: 20-40 m). H intervals with sample size ≥ 5, containing > 10% of individual city grid-cells or across all (multi-city) (filled, otherwise open) the 'variable b' spread (whiskers) in median b with height are calculated from 1000 bootstrap samples, random with repetition to the same sample size of each interval, for the full dataset (P0, Table 2).
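The bootstrap described in this caption can be sketched as follows: for one H interval, the b values are resampled with replacement to the original sample size 1000 times, and the median of each resample is stored. The b values, the seed, and the use of the minimum and maximum bootstrap median as the reported spread are assumptions made for illustration; the caption does not state which statistic the whiskers show.

```python
import random
import statistics


def bootstrap_medians(values, n_boot=1000, seed=0):
    """Medians of n_boot resamples drawn with replacement, each the same
    size as the original sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    medians = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=len(values))
        medians.append(statistics.median(resample))
    return medians


# Hypothetical b values for the grid-cells in one H interval.
b_values = [0.80, 1.10, 0.90, 1.30, 1.00, 1.20, 0.95]

medians = bootstrap_medians(b_values)
print(f"sample median b = {statistics.median(b_values):.2f}")
print(f"bootstrap median range = {min(medians):.2f} to {max(medians):.2f}")
```

Intervals with few grid-cells give a wider range of bootstrap medians, which is one reason to treat fits at the tallest H intervals with more caution.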

Fig. 6), which may be associated with smaller sample sizes. The multi-city b with H relation follows NYC closely (green, Fig. 6), as NYC has the most grid-cells sampled. Differences between the multi-city median and the NYC b at H = 40 m arise as extra grid-cells from other cities (i.e., with <5 grid-cells) are included. Auckland and Sao Paulo have b values that are most different from both other cities and the multi-city relation (Fig. 6). The UK cities, London and Birmingham, are similar to ~10 m but differ above this as London has higher H. Other cities with different variations in morphology may lie either side, or elsewhere in relation to this multi-city curve, e.g., taller M.A.Stretton et al.

Table 3
Evaluation of mean normalised wall area (λ w ) using the linear-fit D and fixed D parameterisations (Table 2) using the conserved wall area (CWA) D parameterisation for each city assessed with metrics (section 4.4): mean bias error (MBE), normalised mean bias error (nMBE, %) and mean absolute error (MAE). Table 1 gives number of grid-cells analysed per city.
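The three metrics named in this caption can be sketched as below. The definitions are the conventional ones and are assumptions here, since the section defining them is not reproduced: in particular, nMBE is taken as the MBE divided by the mean of the reference values. The λ w values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def mbe(estimates, references):
    """Mean bias error: mean of (estimate - reference)."""
    return mean([e - r for e, r in zip(estimates, references)])


def nmbe_percent(estimates, references):
    """Normalised MBE [%], assuming normalisation by the reference mean."""
    return 100.0 * mbe(estimates, references) / mean(references)


def mae(estimates, references):
    """Mean absolute error: mean of |estimate - reference|."""
    return mean([abs(e - r) for e, r in zip(estimates, references)])


# Hypothetical lambda_w values for three grid-cells (not from the study).
lambda_w_reference = [0.40, 0.60, 0.50]
lambda_w_estimate = [0.38, 0.63, 0.46]

print(f"MBE  = {mbe(lambda_w_estimate, lambda_w_reference):+.3f}")
print(f"nMBE = {nmbe_percent(lambda_w_estimate, lambda_w_reference):+.1f}%")
print(f"MAE  = {mae(lambda_w_estimate, lambda_w_reference):.3f}")
```

MAE is larger in magnitude than MBE in this example because positive and negative errors offset in the MBE but not in the MAE, which is why both are reported.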