Surface Urban Energy and Water Balance Scheme (SUEWS): development and evaluation at two UK sites

The Surface Urban Energy and Water Balance Scheme (SUEWS) is evaluated at two locations in the UK: a dense urban site in the centre of


Introduction
Improved understanding of the urban environment is of paramount importance to our future.The number of people living in urban areas is projected to exceed 6 billion by 2050 (United Nations, 2014); the current estimate stands at 4 billion, over half of the total global population.Designing, building and operating safe and sustainable cities is therefore a crucial part of managing development.Urbanisation impacts the environment in numerous ways.Replacing vegetation or soils with impervious anthropogenic materials reduces infiltration and storage capacity, increasing flood risk (e.g.Rodriguez et al., 2003); buildings modify the wind field and alter radiation and energy exchanges, leading to warmer temperatures in cities (e.g.Oke, 1982), which may be further augmented by energy released from anthropogenic activities (Ichinose et al., 1999).The effects of urban areas also extend beyond city borders.At regional scales, for example, observations indicate enhanced rainfall downwind of settlements (e.g.Shepherd et al., 2002).On a global scale, cities contribute to increasing concentrations of atmospheric carbon dioxide (The Keeling Curve, 2014) and are a major source of greenhouse gases (e.g.Velasco and Roth, 2010;Christen, 2014).Climate-sensitive urban design, urban climate mitigation and disaster management are increasingly important given the changing climate, as the frequency and magnitude of extreme events are predicted to increase (e.g.Fowler and Hennessy, 1995;Meehl and Tebaldi, 2004).To further our knowledge of how the urban surface and atmosphere interact, observational campaigns across a range of sites, climates and weather conditions are required.The most practical way of exploring these interactions in more detail and quantifying the effects of changes to the system (e.g.future climate, urban design scenarios) is to use models.
The Surface Urban Energy and Water balance Scheme (SUEWS) is a relatively simple model that can simulate both energy and water fluxes (Järvi et al., 2011).The model is centred on the urban energy balance (Oke, 1987), and urban water balance (Grimmond et al., 1986), where Q* is the net all-wave radiation, Q F the anthropogenic heat flux, Q H the turbulent sensible heat flux, Q E the latent heat flux and ΔQ S the net storage heat flux; P is precipitation, I e the water supplied by irrigation or street cleaning, E the evaporation, R the runoff (including above-ground runoff and deep soil runoff) and ΔS the net change in water storage (including water in the soil and water held on the surface).SUEWS, designed specifically for urban areas, considers seven surface types: paved surfaces (including roads, pavements, car parks), buildings, evergreen trees and shrubs, deciduous trees and shrubs, grass, bare soil and open water (e.g.rivers, lakes, swimming pools, fountains).Characteristics of these seven surface types must be provided as inputs to the model, including albedo, emissivity, moisture storage capacity, building height, tree height and, importantly, the plan area fractions of each surface type.Drainage characteristics are required for each surface, as are soil characteristics for the single soil layer that exists below each surface (except water), and vegetation characteristics for the three vegetated surfaces.If available, additional information about the anthropogenic energy and water use is beneficial, since these impact the available energy and partitioning of energy between the turbulent fluxes.Model output includes each term of the energy and water balance at every time-step.SUEWS is set up to require basic meteorological data as input (incoming shortwave radiation K ↓ , air temperature T air , atmospheric pressure p, relative humidity RH, wind speed U and precipitation P).If measurements are available, additional observational data can be supplied and used instead (for example incoming longwave radiation L ↓ can be calculated within the model or supplied if observations exist, Section 4.2.3).
SUEWS has been developed from the urban water balance model of Grimmond et al. (1986) and the urban evaporation-interception scheme of Grimmond and Oke (1991) and now incorporates several other submodels.The Objective Hysteresis Model (OHM) (Grimmond et al., 1991) calculates ΔQ S ; the Net All-wave Radiation Parameterisation (NARP) (Offerle et al., 2003) provides Q*; the Local-scale Urban Meteorological Parameterisation Scheme (LUMPS) (Grimmond and Oke, 2002) provides an initial estimate of the atmospheric stability.Järvi et al. (2011), hereafter Ja11, details how these and other sub-models combine to form SUEWS. Further development of SUEWS has since focused on snow-related processes relevant to cold-climate cities (Järvi et al., 2014, hereafter Ja14).In this paper, we describe and evaluate the latest model version (SUEWS_v2016a).Fig. 1 summarises the key concepts within SUEWS; for further details the reader is referred to Ja11 and the SUEWS manual (Ward et al., 2016).
Two key advantages of SUEWS are its relatively undemanding input requirements (i.e.basic meteorological data and surface information) and its simplicity, enabling runs of several years and multiple model grids to be carried out without specialised computing facilities.SUEWS can be coupled to meso-scale models, run on a standalone basis (as is done here) or used as a decision-making tool that sits behind a user interface tailored to suit the needs of urban planners or policy makers (Lindberg et al., 2015).
Fig. 1.Overview of the processes in SUEWS.C i is the amount of water on the canopy of each surface i, S i the moisture storage capacity of each surface, r b the boundary-layer resistance and z 0v the roughness length for water vapour; all other notation is defined in the text.
In order to use any model to aid decision-making it is critical that its performance has been assessed and understood for similar conditions.Original development, parameterisation and evaluation of SUEWS, and its predecessors, used data collected from suburban areas in Vancouver (Grimmond et al., 1986;Grimmond and Oke, 1991;Järvi et al., 2011).Other evaluations have used data from Los Angeles (Ja11), Helsinki (Ja14, Karsisto et al., 2015), Montreal (Ja14) and Dublin (Alexander et al., 2015).Applying an earlier version of SUEWS in Canberra, Mitchell et al. (2008) concluded that it offers great potential as a tool for urban planning (if developed further) but emphasised the need for evaluation in Australian cities and over a wider range of land uses.
In this paper SUEWS is evaluated at two UK sites, thus expanding the range of meteorological conditions, background climates, surface characteristics and patterns of human behaviour for which the model has been tested.Observational datasets spanning two complete years allow insights into seasonal variability in model performance.Recent developments to the model are described in Section 2. Section 3 provides more information about the evaluation sites and methodology.In Section 4 results are presented in the order of the model calculations, so that each quantity can be assessed with respect to the accuracy of the variables upon which it depends.Energy exchanges at the two sites are compared and contrasted in Section 5.

Model developments
SUEWS has recently been developed to run at a shorter time-step to represent rapid changes in the water balance, for example the movement of water following a rain event.The whole model now runs at a time-step specified by the user; 5 min is recommended but time-steps down to 1 min or up to 10 min are possible.
Previously, irrigated grass and non-irrigated grass were modelled as separate surface types and bare soil or unmanaged land was combined with the non-irrigated grass surface.In v2016a, there is one grass surface (the fraction of this surface that receives irrigation is a required input) and one (non-vegetated) bare soil surface.Note that this bare soil surface is different from the sub-surface soil stores underneath each surface.Irrigation is now also allowed for trees and shrubs.
Several small changes have been made to the water balance subroutines, including bug fixes associated with area normalisation (affecting irrigation) or unit conversion (affecting the horizontal movement of water between soil stores).Two major changes have been made to the calculation of evaporation.Firstly, there is now the ability to change the threshold above which evaporation from a wet surface is considered to take place.This affects the magnitude of the turbulent latent heat flux under partially wet conditions and the frequency with which latent heat fluxes are calculated assuming totally wet conditions.Secondly, a revised formulation for estimating the surface conductance is included (Appendix A).This new formulation aims to provide reasonable fluxes over a wide range of conditions, particularly for areas with little or no irrigation.
The albedo for evergreen trees and grass surfaces can now change with season, whereas previously only the albedo for deciduous trees could change.For additional details about changes to the model and for instructions on setting up and running the model, which is openly available, the reader is referred to the SUEWS manual (Ward et al., 2016).

Description of sites
SUEWS is evaluated at two UK sites: a dense urban site in central London based at the King's College Strand campus (Kc) and a residential suburban site in Swindon (Sw) about 100 km to the west (Fig. 2).At both sites eddy covariance (EC) observations of turbulent sensible and latent heat fluxes have been collected, along with measurements of incoming and outgoing shortwave and longwave radiation and basic meteorological variables (see Ward et al. (2013) and Kotthaus andGrimmond (2014a, 2014b)

for details).
A gap-filled meteorological forcing dataset (2011)(2012)(2013) is used to run the model.The period May 2011 to April 2013 (when flux observations are available from Sw) is used for evaluation, which allows 4 months for spin-up.The same evaluation period is used at both sites to facilitate the comparison between Sw and Kc.Using two complete years also means the evaluation spans a range of conditions without favouring any particular season.The Kc dataset includes observations from two sites on the same rooftop referred to as KSS and KSSW (separated by b45 m horizontally and 1.4 m vertically): data are from KSS for 01 January 2011-25 March 2012 (Kotthaus andGrimmond, 2014a, 2014b) and from KSSW for 04 April 2012-31 December 2013 (Bjorkegren et al., 2015).
This study concerns fluxes at the neighbourhood-or local-scale (10 2 m).The EC observations have a source area (defined as the portion of the upwind surface that influences the measurement) which changes with time, depending on wind direction, wind speed and stability (e.g.Schmid et al., 1991).Footprint models indicate the EC fluxes originate from an area within a few hundred metres of the flux towers.The surface characteristics required by SUEWS have been calculated based on the land cover within this area (Table 1).Although the radiometers are located on the same mast as the EC instrumentation they have a much smaller source area that is fixed in time.
Estimation of the anthropogenic heat flux is influenced by the spatial resolution of the data sources required, but aims to be representative of the EC footprint.At Sw energy consumption statistics were used to estimate Q F (see Appendix A of Ward et al. (2013)).For Kc the GreaterQf model (Iamarino et al., 2012) was used (Kotthaus and Grimmond, 2014a).The Q F values obtained are considered to be reasonably compatible with the EC fluxes, given the challenges associated with quantifying this highly spatially variable flux.
The key difference between the sites is the level of urbanisation, evident in the proportion of vegetation (45% at Sw compared to only 5% at Kc); proportion of impervious surfaces (paved and built surfaces cover 81% of the surface at Kc); height of buildings (Sw has mainly 1-2 storey houses whilst building heights at Kc are larger and more varied); and population density (Table 1).The local climate zone classification (Stewart and Oke, 2012) for Sw is 'open low-rise' whilst Kc is 'compact midrise'.

Table 1
Characteristics for the London (Kc) and Swindon (Sw) sites including the plan area fractions of paved surfaces ('Paved'), buildings ('Bldgs'), evergreen trees and shrubs ('EveTr'), deciduous trees and shrubs ('DecTr'), grass ('Grass'), bare soil ('BSoil') and open water ('Water').The surface cover has been determined based on the average footprint climatology at Kc (Kotthaus and Grimmond, 2014b) and for 500 m around the flux tower at Sw (Ward et al., 2013).The measurement height corresponds to the height of the wind speed measurement for the meteorological forcing data; for Kc the average of the KSS (48.9 m) and KSSW (50.3 m) heights is used.The proximity of the two sites means they experience very similar meteorological conditions (Ward et al., 2015).Temperatures in London tend to be slightly warmer than in Swindon, whilst humidity is slightly lower.Compared to previous studies using SUEWS, the climate (maritime temperate) is fairly similar to Vancouver and Dublin, warmer (and with less snow) than Helsinki or Montreal, and cooler than Los Angeles which experiences hotter and drier summers.
During the evaluation period, the south of the UK experienced both very wet and very dry conditions: rainfall was well below average in 2011 and spring 2012, whereas April-December 2012 was exceptionally wet; rainfall in spring 2013 was not atypical.Normal (1981Normal ( -2010) ) annual rainfall for southern England is about 780 mm and mean air temperature is about 10 °C (Met Office, 2014).Summer 2011 and 2012 were cloudy and slightly cooler than normal, whilst spring and autumn 2011 were much warmer and drier than normal.Winter 2011/12 was warmer, winter 2012/13 cooler and March 2013 much cooler than normal.Snow fell and remained on the surface for a few days at Sw during 10-12 February 2012 and 18-25 January 2013.Other light snow showers occurred but the snow did not settle.

Model setup
In this application SUEWS is run offline and forced using observational data.A model time-step of 5 min is specified.The input meteorological dataset has a resolution of 60 min, which is linearly interpolated to 5 min to run the model (precipitation is distributed evenly throughout each hour).The model output is averaged back to 60 min for comparison with observations.As the purpose of this work is to evaluate the various components of the model, rather than to obtain the 'best' results, runs have been performed using the most basic input meteorological dataset (K ↓ , T air , p, RH, U and P).
Fixed values of the roughness length and displacement height were provided, rather than calculated inside the model.Although SUEWS can simulate snow accumulation and melt in cold climates, this option was not used as settling snow is rare.In the absence of detailed information the same soil properties were assumed for the soil stores beneath each surface: a soil layer depth of 350 mm, with a maximum moisture capacity of 150 mm (saturated soil moisture content of 0.43 m 3 m −3 ).The initial soil moisture state was set to 80% of the saturation value.For Swindon, 2% of water from paved surfaces was allowed to flow to other surfaces (grass) and 10% of water from roofs was allowed to flow to other surfaces (2% to grass and 8% to paved surfaces).The remaining proportions (98% for paved surfaces and 90% for buildings) become runoff into pipes.Water from pervious surfaces is allowed to infiltrate into the soil stores beneath.The same conditions were used in London, except the 10% of water from buildings all goes to paved surfaces (none to grass) and 10% of water from evergreen trees and deciduous trees is allowed to flow to paved surfaces.Irrigation is assumed to be zero for these UK sites.The same assumption of negligible irrigation was made in Dublin, Ireland, (Alexander et al., 2015) on account of the mild and wet climate.Although some irrigation occurs in the UK, it is on a much smaller scale and less frequent compared to the North American sites where the model has been used previously.

Model evaluation
In the following, the subscript 'MOD' denotes model output and 'OBS' denotes observations used to evaluate the model (including quantities such as Q F which are not strictly observed, but which have been estimated independently of the SUEWS model).Daytime conditions are defined as those for which K ↓ N 5 W m −2 .Dry conditions are identified when there is zero rainfall and no water on the surface (according to the model output).Statistical measures used to assess model performance include the root mean square error (RMSE), coefficient of determination (r 2 ), mean absolute error (MAE) and mean bias error (MBE).

Seasonal cycle of vegetation phenology
The phenology, or state of vegetation, in SUEWS is based on leaf area index (LAI) calculated at a daily timestep according to the number of growing or senescence degree days (Ja14).Assessing the vegetation phenology is an important first step in ensuring the timing of the seasons is modelled appropriately.The seasonal cycle of LAI was evaluated using photographs of vegetation in Swindon and London and found to look reasonable.The base temperature for growing degree days was increased relative to Helsinki and Montreal (from 5 °C to 6 °C) with the effect of delaying leaf-out slightly.It is reasonable to expect the base temperature to be slightly higher for these UK sites as the length of the growing season varies with latitude.The base temperature for senescence was set to 11 °C (higher values cause leaves to fall too early).However, the model seems to reach full leaf-out too suddenly and slightly too early (see Section 4.5.3).Vegetation status derived from Earth observation would be useful for a more detailed evaluation, but the surface cover variability relative to pixel size makes these data challenging to interpret in cities.The seasonal cycle varies between years due to inter-annual variability in temperature, for example the start of leaf-out was relatively late in 2013 (not beginning until mid-April) due to cold weather during spring.

Incoming shortwave radiation
Incoming shortwave radiation, K ↓ , is required forcing data for SUEWS.The linear conversion to and from 5min resolution (Section 3.2) causes small differences between 60-min input and 60-min output K ↓ (Fig. 3a, b).There is a negligible bias between input and output K ↓ but some scatter (RMSE = 12 W m −2 ).

Outgoing shortwave radiation
Outgoing shortwave radiation, K ↑ , is calculated using a bulk albedo, α, based on the plan area fraction and albedo for each surface type (specified in the input files).Albedo values from Oke (1987), as used in Vancouver and Los Angeles (Ja11), result in bulk albedos of 0.13 (Kc) and 0.17 (Sw), which are larger than the observed values of 0.11 and 0.15.Observations suggest that European cities may have lower bulk albedos than North American cities, partly due to the building materials used.For example, values of 0.08 in Łódź, Poland (Offerle et al., 2003) and 0.11 in Basel, Switzerland (Christen and Vogt, 2004) have been measured.Therefore slightly lower albedos for buildings and paved surfaces are used here (Table 2), which improved model performance.Further improvements were achieved by enabling the modelled albedo of all vegetated surfaces to change with season (previously this only occurred for deciduous trees).The minimum albedo for deciduous trees was reduced from the original value to reflect the large change observed between leafon and leaf-off conditions (Table 2).The seasonal variation in albedo for evergreen trees is smaller and was neglected (the surface cover fraction of evergreen trees is very small for these sites, Table 1).
Modelled K ↑ compares better with observations when the adjusted albedos are used compared to the original albedos (MAE, MBE and RMSE decrease and the line of best fit is closer to 1:1).For Swindon, K ↑ is still very slightly overestimated, whereas for London K ↑ is slightly underestimated (Fig. 3c, d).Performance is poorer in winter, particularly the underestimation in albedo at Kc.Although radiometers are installed with the intention of providing radiative fluxes representative of the EC footprint, in reality surface heterogeneity and the size of individual elements (such as buildings and trees) mean there will be some bias.Therefore it may be disadvantageous to tune the input values to exactly match observations.The source area of the radiometers is spatially fixed in time and much smaller than the EC footprint upon which the model input site characteristics are based (Section 3.1).At Sw, the source area of the radiometer contains a relatively large proportion of road, which may make the measured albedo slightly lower than for the study area as a whole.At Kc, the presence of street canyons in the radiometer footprint gives rise to a lower measured albedo (0.11) compared to a second radiometer nearby (0.14), which sees mainly roof surface (Kotthaus and Grimmond, 2014b), and neither radiometer 'sees' the river surface comprising 14% of the Kc study area (Table 1).
Snow fell and settled at Sw on 10-12 February 2012, causing the observed albedo to increase to 0.3-0.6 during this period.A thicker layer of snow settled 18-25 January 2013.On 18 January, α OBS increased to 0.55 due to the fresh snow covering and by 26 January α OBS had fallen to 0.35.As the snow part of the model was not used for this evaluation, this behaviour is not represented and K ↑ OBS is clearly larger than K ↑ MOD for these days (light grey points in Fig. 3d).
Although SUEWS now takes into account seasonal variation in albedo for grass and trees, diurnal variation is ignored, as are changes due to surface conditions (wet/dry) or meteorology (cloudy/clear).Improved modelling of the albedo variability could be addressed in future, for example implementing the dependence of albedo on sun angle may improve wintertime performance at Kc (among others, Kotthaus and Grimmond (2014b) demonstrate higher albedos at lower sun angles).However, gains in performance are likely to be small as K ↑ is already well modelled with high r 2 of 0.96/0.99 and low RMSE of 4.06/3.11W m −2 at Kc/Sw.

Incoming longwave radiation
Incoming longwave radiation, L ↓ , is calculated using T air and RH to estimate cloud cover (Offerle et al., 2003;Loridan et al., 2011).The seasonal cycle of L ↓ is captured but the amplitude of diurnal fluctuations is underestimated, so the range of L ↓ MOD is smaller than observed (Fig. 4).The model tends to underestimate L ↓ (MBE = −0.8/−8.0W m −2 at Kc/Sw), particularly during the daytime (Fig. 4b, d), and often overestimates L ↓ at night with the result that Q* MOD is often less negative than Q* OBS at night, particularly when conditions are clear.0.16-0.170.14-0.15 a The bare soil surface was not fully implemented in previous model versions and the corresponding surface fraction was assigned the same albedo as the non-irrigated grass surface.The emissivity value for bare soil was based on the range suggested by Oke (1987).
The inability of the model to simulate the full range of observed L ↓ values is attributed to the empirical relation used to determine cloud fraction (Loridan et al., 2011): F CLD is then used to calculate L ↓ following Crawford and Duchon (1999), where ε clear is the clear-sky emissivity and σ is the Stefan-Boltzmann constant.According to Eq. ( 3), F CLD = 0 when RH = 0% (dotted lines, Fig. 5c, d).However, as RH rarely drops below 20%, the lowest modelled cloud fraction is 0.07.Similarly, full cloud cover is never modelled as T air and RH are never high enough at the same time for modelled F CLD to exceed 0.81.Hence the distribution of modelled F CLD is too narrow (Fig. 5); in reality clear skies (F CLD = 0) and full cloud-cover (F CLD = 1) both occur frequently.Indeed, observations of cloud fraction derived from ceilometer measurements demonstrate that clear skies and completely cloudy skies occur more often than partially cloudy skies (Kotthaus and Grimmond, 2014a).Comparison with Fig. 1 of Loridan et al. (2011) indicates the same issuethe parameterisation does not capture the full range of cloud cover observed.Attempts to re-scale F CLD obtained from Eq. ( 3) to fill the range 0-1 did not significantly improve modelled L ↓ , which is not surprising given the scatter evident in Fig. 5c and in Fig. 1 of Loridan et al. (2011).It would be possible to use observed K ↓ and top-of-atmosphere K ↓ (calculated using earth-sun geometry) to estimate cloud fraction during daytime, but this approach cannot be used at night and is unreliable for low solar elevation (Offerle et al., 2003).SUEWS has the option to use observed L ↓ if data are available.When observations are used, as for K ↓ , there is not exact agreement between the L ↓ input and output values due to the interpolation to 5-min data.As L ↓ changes more slowly than K ↓ , the discrepancies are smaller than for K ↓ (RMSE = 2 W m −2 ).Using observed, rather than modelled, L ↓ has a very minor effect on the outgoing longwave radiation, L ↑ , but does improve the net all-wave radiation, Q* (Section 4.2.5).

Outgoing longwave radiation
The model replicates the behaviour of the outgoing longwave radiation, L ↑ , remarkably well (Fig. 3g, h).The coefficients of determination are high (r 2 = 0.97-8) and scatter small (RMSE = 7 W m −2 ).Time series analysis reveals that the model tends to overestimate L ↑ at Sw, particularly for high values of L ↑ .This overestimation does not appear to be related to inaccuracies in L ↓ MOD but coincides with sunny conditions and is therefore attributed to a correction term in the longwave parameterisation which attempts to account for the difference between air temperature and effective radiative surface temperature using K ↓ (see Offerle et al. (2003) for further details).At Kc, L ↑ is underestimated, particularly for low values of L ↑ .This may be due to the influence of the anthropogenic heat flux on L ↑ OBS : Q F can cause an increase in surface temperature and thus an increase in L ↑ (Grimmond, 1992;Loridan and Grimmond, 2012), which would be inherently included in the observations but is not accounted for by the model.

Net all-wave radiation
Overall, SUEWS models Q* very well; r 2 is high (0.96-8) and the scatter is reasonably small (RMSE = 31/ 27 W m −2 at Kc/Sw).Most of the error comes from L ↓ , as it is the least well modelled of the radiation components (Section 4.2.3).Q* can be substantially improved by providing observed L ↓ in the meteorological forcing file.RMSE is reduced to 18/14 W m −2 at Kc/Sw (Fig. 6d, h).The underestimation in Q* is reduced, particularly at Sw at night (Fig. 6f).However, there is no significant improvement in the turbulent heat fluxes.At Sw, the remaining underestimation in Q* during daytime is a result of both K ↑ and L ↑ being overestimated, which again may partly result from the footprint composition of the radiometer not being represented by the local-scale land cover characteristics (Tables 1, 2).At Kc, the errors are reduced by using observed L ↓ (Fig. 6a-d Although RMSE values for different sites and analysis periods should be compared with caution, the values obtained here suggest similar or slightly better performance than for previous evaluations.Performance at the  3) for T air = −10 °C and 30 °C).F CLD OBS was not available at Sw, but the distribution is expected to be very similar to that at Kc due to the proximity of the two sites.
UK sites is better than for the Vancouver 1987 (Vs87) dataset (r 2 = 0.95, RMSE = 45 W m −2 ; Ja11).For cold-climates (Ja14), RMSEs between 25 and 41 W m −2 were obtained for the various sites and conditions (cold snow, melting snow and snow-free periods of evaluation); normalising the RMSEs by the range of observed Q* gives values between 0.028 and 0.061, which are still generally larger than for Kc/Sw (0.028/ 0.035).Q* was generally underestimated, especially in cold snow conditions.This behaviour is attributed to the L ↓ parameterisation, which performs less well in cold conditions: Fig. 5 shows that for temperatures of −10 °C modelled F CLD cannot exceed 0.5.Using observed, rather than modelled, L ↓ Karsisto et al. (2015) obtained small RMSE values of 5-25 W m −2 and high r 2 of 0.96-1.00for two sites in Helsinki.However, the low end of these RMSE (and r 2 ) values correspond to autumn and winter when Q* is smaller but model performance worse than in spring and summer.Similarly, for the North American studies, the highest RMSE obtained for Q* was 47 W m −2 for Vancouver in summer 1987 (when performance is strong) whilst the lowest was 25 W m −2 for Vancouver in winter 2009 (when the model substantially underestimates Q*).Whilst Q* is underestimated during daytime in Vancouver, it is overestimated in Los Angeles (Ja11).

Anthropogenic heat flux
In urban areas, the heat released to the environment as a consequence of human activities can significantly augment the available energy (e.g.Klysik, 1996;Taha, 1997;Ichinose et al., 1999;Hamilton et al., 2009).This anthropogenic heat flux, Q F , includes energy released from buildings (due to heating, air conditioning, cooking, using electrical appliances, etc), from transportation and from human metabolism.SUEWS estimates the anthropogenic energy release on a daily basis following the method of Sailor and Vasireddy (2006): where ρ pop is the population density.The coefficients a F0,1,2 can be specified separately for weekdays and weekends.The dependence on heating degree days (HDD) and cooling degree days (CDD) enables modelled Q F to vary with temperature to reflect the changing demand for building heating or cooling.The sub-daily variation in Q F is achieved by applying a diurnal profile specified in the model input files.
For the Swindon site, Q F has been estimated at 6-10 W m −2 using inventory data (Ward et al., 2013).These estimates are similar to those from other studies in suburban residential areas (e.g.Christen and Vogt, 2004;Bergeron and Strachan, 2010).Additionally, there is good agreement between carbon dioxide emissions estimated using the same approach and observed carbon dioxide fluxes (see Ward et al. (2013) for further discussion).Since Q F is very difficult to measure, these estimates are treated as the 'observed' values with which to compare the SUEWS model results.
As Q F is much larger and more spatially variable for the London site, there are greater errors associated with its estimation.The 'observed' values used here are derived from the GreaterQf model (Iamarino et al., 2012), adjusted for temperature, composition of the EC footprint and 25% overestimation as described in Kotthaus and Grimmond (2014a).Compared to the population-based SUEWS approach, GreaterQf is more applicable to the Kc site located in the central business district, as it implements diurnal energy use profiles for both domestic (i.e.residential) and industrial (including office) buildings.The latter gives a more appropriate representation of the human activities in the area.
The original values of the coefficients (Table 3) were derived for Vancouver and have been used at the other North American sites (Ja11, Ja14).However, they result in modelled Q F that is too large for the Swindon site.Although typical values for Q F are similar at Sw and the Vancouver Sunset site (Vs) (Grimmond, 1992), the population density at Vs is less than half that at Sw.This suggests that the energy release per capita is larger for the North American sites but their lower population densities mean the energy release per unit area is similar.In Swindon and London, there is no indication of an increase in Q F at high temperatures associated with extensive use of air conditioning (Fig. 7, see also Kotthaus and Grimmond (2014a)).As a result of cultural differences and a cooler climate, air conditioning in homes is far less prevalent in the UK than in North America.Coefficients were derived for the UK sites with the dependence on cooling degree days removed (a F1 set to zero).Slightly different coefficients were obtained for weekdays and weekends (Table 3).For Kc, the derived coefficients are similar to those for Vancouver (the population densities are responsible for the large difference in Q F values between Kc and Vs), but Q F rises more sharply with decreasing temperature at Vs.The diurnal profiles of energy use, derived from the Vs87 dataset (Table 3 of Ja11), are a reasonable match to 'observations' at Sw, but are less suitable for Kc, where Q F OBS peaks in the middle of the day (instead of during morning and evening rush-hours).Similar behaviour is also seen in the carbon dioxide flux (Ward et al., 2015).
As the Sw site is predominantly residential, the daytime population density (i.e. when people are at work) is much smaller (16.77ha 1 ) than when residents are at home during the evening and at night (47.63 ha − 1 ) (ONS, 2011).In the centre of London, representative population densities are even more critical.The workday population density for the borough of Westminster is 310 ha − 1 (ONS, 2011), the total daytime population density (including tourists) is considerably higher (460 ha − 1 , GLA (2013)), whilst the resident population density is much smaller (99 ha − 1 , ONS (2011)).An average of the workday and resident population densities has been used here (Table 1).However, the situation is complex: residents, workers and tourists have differing energy profiles (Iamarino et al., 2012); the land use comprises commercial (shopping, entertainment, hotels) and institutional (universities, hospitals) as well as residential areas, all with differing energy profiles and accurate estimation of Q F is hampered by its considerable spatial variability (Hamilton et al., 2009).The model coefficients given in Table 3 for the Kc and Sw sites have been used to provide an approximate estimate of Q F for these sites.However, both the Table 3 Coefficients for the anthropogenic heat model (Eq.( 5)) from the literature and fitted to the Kc and Sw data.
0.0102 0.0102 0.0073 0.0067 0.0037 0.0038 modelled and 'observed' values have substantial uncertainties.More accurate estimates of Q F could be made with more sophisticated models, but this is beyond the scope of this paper.

Net storage heat flux
SUEWS calculates the net storage heat flux, ΔQ S , using the objective hysteresis model (OHM) (Grimmond et al., 1991): where f is the surface cover fraction for each surface type, i, and t is time.The OHM coefficients a 1,2, 3 are selected from the literature for each surface type (Table 4).They are generally based on empirical fits to observational data (e.g.Doll et al., 1985;McCaughey, 1985;Yoshida et al., 1990) or simulation studies (e.g.Arnfield and Grimmond, 1998;Meyn and Oke, 2009;Sun et al., 2013).At the suburban site OHM generally performs well during the summer months (Fig. 8).The shape of the diurnal cycle is replicated well although the model slightly underestimates ΔQ S during daytime.In winter, Fig. 7. Mean daily anthropogenic heat flux, calculated using Eq. ( 5) with coefficients from Table 3, versus mean daily air temperature.'Observed' anthropogenic heat flux is also shown.Note different y-axis scales for Kc and Sw.

Table 4
Coefficients for each surface type used in the Objective Hysteresis Model (OHM) (Eq.( 6)) before adjustment for seasonal variation.if no adjustment is made to the OHM coefficients, ΔQ S is substantially underestimated (cyan, Fig. 8).ΔQ S MOD is typically 30-40 W m −2 more negative than ΔQ S OBS at night, due to the a 3 term (≈ −30 W m −2 ) in Eq. ( 6) which dominates when Q* is small.During daytime the underestimation is even larger.As ΔQ S is very difficult to measure directly, the residual (RES) of the observed energy balance is used for ΔQ S OBS .However, the findings are similar if the ground heat flux Q G (one component of the net storage heat flux) measured in Swindon is used instead of the residual, indicating that the behaviour is not caused by variations in energy balance closure with season.The observations (both RES and Q G ) suggest that night-time ΔQ S is much smaller in magnitude during winter than summer, as also noted by Keogh et al. (2012).

Surface type
The inability of the model to reproduce the observed wintertime behaviour is a result of the bias towards summertime observations in the currently available coefficients.There is evidence that the OHM coefficients should vary with season (Anandakumar, 1999;Ward et al., 2013), particularly the constant term a 3 .Seasonal variation in the OHM coefficients for these sites, obtained by least squares regression fits to Eq. ( 6) using observed ΔQ S and Q* by month, is shown in Fig. 9 alongside values for a dry asphalt surface from Anandakumar  (1999), hereafter An99.Although the coefficients for Kc, Sw and An99 correspond to different surface types, there is some similarity in their seasonal variation.Compared to summertime values a 1 is larger and a 3 smaller, or even positive, in winter.
As the wintertime performance is clearly problematic, the OHM coefficients a 1 and a 3 are adjusted for seasonal variation (summer and winter half-years) using scaling factors derived from An99 (a 2 is not adjusted as its seasonal behaviour is unclear).SUEWS uses one set of rescaled coefficients when the 5-day mean temperature is below 10 °C ('winter') and the other set the rest of the time ('summer').Despite the approximate nature of this approach (limited by the information available) the improvement in model performance is substantial.
At Kc, ΔQ S OBS has been calculated as in Kotthaus and Grimmond (2014a) using the residual of the observed energy balance accounting for the overestimation of Q F and underestimation of Q H and Q E .(ΔQ S OBS is smaller in the daytime and less negative at night than RES.) Accounting for seasonal variation in a 1 and a 3 also improves performance at Kc overall, although the nocturnal heat release is underestimated to a greater extent (Fig. 8).
Differences in ΔQ S between Kc and Sw are partly captured by the model as a result of their respective surface cover fractions, but the observations show a much larger difference in ΔQ S between the two sites than the model does.One reason for this is the similarity between OHM coefficients for different surface types (Table 4).The high value for a 1 for paved surfaces is responsible for most of the difference between ΔQ S MOD at the two sites, as it results in a diurnal cycle that closely follows Q* and gives high daytime values of ΔQ S MOD .For building surfaces, the mean of values from three sites was used (Yap, 1973;Taesler, 1980;Yoshida et al., 1990;Yoshida et al., 1991) as in Keogh et al. (2012).This average a 1 (=0.477) also contributes a substantial portion towards ΔQ S MOD , although the value was higher for the Yoshida study alone (a 1 = 0.82).The Yoshida study also found a large value for a 3 (−55.7 W m −2 ), which translates to a large nocturnal release.Fitting OHM coefficients to Kc/Sw observations yields average (year-round) values of 0.73/0.41for a 1 and −32.3/−7.1 W m −2 for a 3 , which, compared to the bulk modelled values (Table 4), differ from each other to a greater extent and in the direction expected (i.e.larger a 1 and more negative a 3 at the denser site).
Accounting for the three-dimensional nature of the urban surface by including the area of walls has previously been found to not significantly improve the performance of OHM, and actually decreased performance at sites where walls were important (Grimmond and Oke, 1999b).Therefore this has not been attempted here.Nevertheless, it seems reasonable to expect some dependence on built volume.Arnfield and Grimmond (1998) used a numerical model to demonstrate increasing a 1 and decreasing a 3 with increasing height-to-width ratio and density of building materials (see their Table 3).It is clearly important to consider building characteristics when selecting appropriate values for the coefficients.It seems that the materials mainly studied (gravel, tar, membrane) have low heat capacities and perhaps the coefficients for buildings are biased towards smaller storage fluxes than are representative of building materials in the UK (e.g.brick, stone and tile) and possibly other European cities. Roberts et al. (2006) reports an underestimation of the nocturnal release and daytime uptake for Marseille, and Karsisto et al. (2015) recommends that building properties in Jackson et al. (2010) (used by the Community Land Model) are adjusted to account for high-latitude cities with better insulated buildings.
The effect of Q F on the storage heat flux is an additional complicating factor at Kc. Grimmond (1992) incorporated Q F into the calculation of ΔQ S MOD (by replacing Q* with Q* + Q F in Eq. ( 6)), thus increasing ΔQ S MOD .Although daytime values may be in better agreement with ΔQ S OBS , the modelled nocturnal release of energy becomes even smaller, causing a greater underestimation of ΔQ S .Furthermore, relating ΔQ S to Q* + Q F distorts the shape of the diurnal cycle, which becomes particularly evident in winter when the diurnal cycle of Q F is wider than Q*.Anthropogenic heat may be released at a range of heights depending on building design (e.g.windows, ventilation systems), whilst Q F associated with traffic and human metabolism is directly released into the urban canopy layer where it warms the air volume incorporated in the definition of ΔQ S .Q F may also increase ΔQ S if Q H and Q E are limited by turbulence or moisture.Derived values of a 1 N 1 (Fig. 9) and larger a 1 for weekdays compared to weekends (not shown) further suggest that Q F impacts ΔQ S .However, since the OHM coefficients have been derived primarily using Q*, the impact of Q F has not been accounted for here and, consequently, we expect the model to underestimate ΔQ S during daytime.
Despite both modelled and observed storage heat flux estimates having considerable uncertainties, ΔQ S is evidently a major component of the urban energy balance.As well as seasonal variation, meteorological and surface conditions also influence the storage heat flux, such as wind speed, cloud fraction and wetness state (Offerle et al., 2005;Kawai and Kanda, 2010).These should be incorporated into SUEWS in the future, but first there is a real need for long-term observational datasets or simulations to inform more basic model parameterisation.
Accurate estimates of ΔQ S (along with Q* and Q F ) are important for determining the available energy required for calculation of the turbulent heat fluxes.The next section explores the atmospheric controls on the turbulent heat fluxes.

Friction velocity
Friction velocity, u ⁎ , is calculated using the input wind speed U and the logarithmic wind profile adjusted for stability, using the van Ulden and Holtslag (1985) and Högström (1988) stability functions for momentum.The stability is derived iteratively based on an initial estimate of Q H (Eq. 3 of Grimmond and Oke (2002)).SUEWS generally captures the variability of u ⁎ on the timescale of hours to days, as changes in u ⁎ closely follow changes in the input wind speed U.
Overall, u ⁎ is reasonably well modelled, particularly at Kc (Figs. 10,11).Around the London site, directional variations of the surface drag are relatively small, despite the complexity of the dense urban area (Kotthaus and Grimmond, 2014b).There are no major differences between the two measurement locations (KSS and KSSW, Section 3.1); the drag coefficient varies slightly differently with wind direction at each location but remains small (median value for 10°bins b 0.05).
At Sw, much of the scatter between modelled and observed u ⁎ is attributed to morphological differences around the flux tower.u ⁎ is underestimated for wind directions of 110-120°and 170-190°, whilst there is a slight overestimation for northerly wind directions (Fig. 10d).The roughness parameters (z 0m = 0.5 m, z d = 3.5 m) were derived for a circle (of radius 500 m) around the flux tower using fixed proportions of the mean obstacle height (Ward et al., 2013).However, the high values of the drag coefficient C D obtained close to 120°and 180°(Fig.10f) are consistent with the locations of nearby buildings and imply z 0m and/or z d for these wind sectors are larger than the nominal values.Similar results are found when comparing u ⁎OBS with u ⁎ derived using observed (instead of modelled) stability, indicating that most of the discrepancy between modelled and observed u ⁎ is caused by the heterogeneity of the urban surface affecting the observations, rather than the performance of the model.There may also be some micro-scale influences on the wind field due to these nearby buildings, which mean that similarity theory may be less valid for particular wind sectors.z 0m and z d can also change with time as a result of seasonal variations in leaf area (Grimmond and Oke, 1999a).Although the observations show some evidence of greater roughness when leaves are on the trees compared to when the trees are bare, incorporating changes of porosity of deciduous trees in the model makes little difference to the model performance.

Atmospheric stability
The model tends to predict greater near-surface instability than suggested by the observations at both sites, although the seasonality and diurnal variation are simulated reasonably well (Fig. 11).During winter night-times the model predicts more stable conditions than the observations, but if the OHM coefficients biased towards summertime are applied all year round, the performance is much worse as the model predicts unstable conditions even during winter at Sw (cyan line, Fig. 11).During summer, the Swindon observations indicate a tendency for slightly stable conditions at night, whereas SUEWS predicts slightly unstable conditions.Although this difference (in sign of the Obukhov length, L Ob , and hence the stability parameter, ζ) is apparent in Fig. 11 (row 5), the lower atmosphere is in the near-neutral regime so the question of whether conditions are very slightly stable or very slightly unstable is not particularly relevant.Modelled u ⁎ is not adversely affected by these differences in L Ob MOD and L Ob OBS , and the performance during night-time is good (row 4 of Fig. 11) as the stability-dependent term in the calculation of u ⁎ is essentially zero for both very slightly stable and very slightly unstable conditions.Therefore, although the sign of L Ob MOD should not be relied upon, the impact on other quantities is not major.(Note that the sign of L Ob does not directly determine the sign of Q H as Q H in SUEWS is determined by the residual of the energy balance.)The tendency of the model to predict greater instability than the observations does lead to an underestimation of the aerodynamic resistance, r av (Section 4.5.3).The largest differences occur when the atmosphere (L Ob OBS ) is stable (so r av OBS is large) whilst L Ob MOD is much smaller or negative (so r av MOD is small).Again, accounting for seasonality in the OHM coefficients improves L Ob and thus reduces the underestimation in r av .

Surface resistance
The aerodynamic and surface resistances are required to model the latent heat flux, Q E , using the Penman-Monteith equation (Penman, 1948;Monteith, 1965) modified for urban areas (Grimmond and Oke, 1991): where ρ is the density of air, c p the specific heat capacity of air at constant pressure, VPD the vapour pressure deficit, s the slope of the saturation vapour pressure curve, γ the psychrometric constant, r av the aerodynamic resistance for water vapour and r s the surface resistance.r av determines the rate at which water vapour is transported by turbulence between the surface and atmosphere.It is large for stable conditions and small for unstable conditions, with typical values of 40-70 s m −1 at Kc and 20-50 s m −1 at Sw.The surface resistance is analogous to the canopy resistance in natural environments and describes the environmental controls on evaporation for the whole urban surface (Grimmond and Oke, 1991;Järvi et al., 2011).It includes the stomatal responses of vegetation, but also includes the influence of other surfaces.The reciprocal of the surface resistance is the surface conductance g s .Despite several major land-surface models using g s to calculate evaporation and/or photosynthesis (e.g.Krinner et al., 2005;Best et al., 2011;De Kauwe et al., 2015), it is difficult to simulate g s in a generalised way.Various approaches have been suggested at both leaf-level (Damour et al., 2010) and canopy-level (Irmak and Mutiibwa, 2010).In SUEWS a Jarvis-Stewart formulation (Jarvis, 1976;Järvi et al., 2011) is used: where the sum is over the three vegetated surfaces and weighted by the surface cover fraction f of each surface i. g max is the maximum conductance and LAI (max) is the (maximum) leaf area index for each surface.G 1 is a constant.The functions g(K ↓ ), g(Δq), g(T air ) and g(Δθ) describe the control exerted by the incoming shortwave radiation, specific humidity deficit, air temperature and soil moisture deficit, respectively.In SUEWS_v2016a, the soil moisture deficit beneath vegetated surfaces is used.These functions range between a minimum (positive) value and a maximum value of 1 (when that quantity is not limiting).Various empirical relations are given in the literature for these functions (e.g.Jarvis, 1976;Stewart, 1988;Ogink-Hendriks, 1995;Järvi et al., 2011).The relations vary considerably between sites and there is little consensus on the coefficients, functional forms or even which variables to include.For example, several studies suggest the temperature dependence does not improve model fits (e.g.Adams et al., 1991;Wever et al., 2002) as temperature and humidity tend to be highly correlated.Values fitted for one site can be unsuitable for another site, and even values fitted for one year (or season) may not be appropriate for a different year or season (Stewart, 1988;Järvi et al., 2011;Zhang et al., 2014).
A new functional dependence on the controlling variables is presented (Appendix A) with the main objective of relaxing the control of soil moisture on evaporation.Using the North American parameters (Ja11) resulted in frequent periods of very low Q E MOD when Δθ approached 40 mm (the maximum permitted soil moisture deficit before evaporation was severely restricted).Whilst frequent irrigation at the North American sites maintains moist soils so that this limit was not reached in the Vs87 dataset, observations indicate this limit is often exceeded in the UK.With the new set of parameters the limiting Δθ is much larger (120 mm) and thus more suitable for the Swindon siteand presumably also for other sites with little or no irrigation.The new relations are designed to be less restrictive so that unrealistic values of the surface conductance (and Q E ) are avoided.
'Observed' surface conductances are obtained by rearranging Eq. ( 7) and using Q H and Q E from EC, At Sw the diurnal pattern of both observed and modelled g s exhibits the expected behaviour during summer: an asymmetrical shape, peaking in the morning and declining steadily through the afternoon as the vapour pressure deficit increases and plants close their stomata to conserve moisture; g s remains low throughout the night (Fig. 11).The diurnal cycle of g s is mainly determined by the changing ratio of VPD/Q E .Conductances are much smaller and have a less clearly defined pattern at Kc as there is less vegetation, reduced moisture availability and lower evaporation rates.In winter the diurnal cycle is shorter, more symmetrical and smaller in amplitude at Sw, whilst observed night-time values are higher at both sites, probably due to higher wind speeds and damp surfaces.Observed g s is also higher than suggested by the model during winter daytimes.This may be a result of evaporation occurring from impervious surfaces that are damp, or via cracks in these surfaces which would not be replicated by the model.There are also increased uncertainties associated with EC observations and 'observed' g s in winter.Nevertheless, the results seem to suggest that the dependence of modelled g s on LAI could be too strong, which is perhaps to be expected since the Jarvistype parameterisation was established for vegetated areas and is based on plant physiology.
The overestimation of g s in May and June at Sw appears to be related to the timing of leaf-out.In 2011, the model predicts that full leaf-out is reached at the end of April, so the surface conductance throughout May is calculated assuming vegetation is fully active, which is thought to be slightly premature.Similarly, in 2012 full leaf-out is reached in early May, a few weeks ahead of observations.In spring 2011, there could also be some influence of the soil moisture conditions prescribed at the start of the model run.Larger values of g s MOD are obtained in May 2011 compared to June 2011, partly because modelled soil moisture is higher in May than June as the initial soil moisture stores are being depleted (see Section 4.7.1).
For dry daytime conditions, the overall RMSE between 'measured' and observed g s is 3.3 mm s −1 at Sw, varying between 1.8 mm s −1 and 6.7 mm s −1 across the 24-month evaluation period.(At Kc the RMSE is smaller at 1.5 mm s −1 but as a result of generally smaller g s rather than better model performance.)Ja11 gives an RMSE of 7.4 mm s −1 , although the conductances obtained were larger than at Sw or Kc.For Swindon, r 2 is low in winter (b0.1) and around 0.5 in summer.

Latent heat flux
At Kc the model overestimates Q E (Figs. 12c,13).A considerable proportion of Q E MOD originates from the River Thames (as open water constitutes 14% of the source area, Table 1), yet the observations do not seem to have a distinct signal from the river.At this complex site one possible explanation is that the river's internal boundary layer is too shallow to reach the height of the sensors (Kotthaus and Grimmond, 2014b).It is also possible that the relatively low temperature of the deep river impedes evaporation (Sugawara and Narita 2012).However, when the model is run with 0% water (and the 14% redistributed proportionately among the other surface types) the model underestimates Q E and the performance decreases.As observed Q E is generally small at this site, a low signal-to-noise ratio probably enhances the frequency of occurrences of Q E OBS b 0 W m −2 (Kotthaus and Grimmond, 2014a).
At Sw, Q E is modelled well overall (Fig. 12d): r 2 = 0.72, similar to the performance for the Vs87 dataset (r 2 = 0.74; Ja11).However, the performance varies with atmospheric conditions, moisture availability and state of vegetation.SUEWS best captures the variability of Q E in the summer months, with high r 2 of around 0.8 (Fig. 14b).Correlation between modelled and observed Q E is higher in summer 2012 than summer 2011, and Q E is overestimated in summer 2012 but underestimated in 2011 and early 2013.There are times in summer 2011 when Q E MOD is much smaller than the observed value, coinciding with depletion of the soil moisture store under the grass surface.At these times lack of available water limits g s MOD and hence Q E MOD , whereas observations suggest evapotranspiration not restricted to the same extent.It is possible some residents watered their gardens during these warm dry spells which would provide additional moisture to support the observed evapotranspiration rates.Also tree roots may be able to access deeper reserves of soil moisture than permitted by the model.Future model development should consider adding a second soil layer to the model or allowing some evaporation to occur from paved and built surfaces (other than evaporation of intercepted water).
When surfaces are wet, the surface conductance is adjusted from the value obtained from Eq. ( 8).In wet conditions r s is set to zero and for partially wet conditions r s is adjusted following Shuttleworth (1978) (dashed box in Fig. 1).At night (when K ↓ = 0 W m −2 ) g s is set to 0.1 mm s −1 and then adjusted accordingly if the surface is wet or partially wet.Observations during and directly following rainfall are sparse because data from the open-path gas analysers used to derive Q E OBS cannot be used if the surfaces of the instrument are wet.However, there are some occasions when high evaporation rates are detected when the windows of the gas analyser are dry but the surrounding land surface is still in the process of drying.Although sensible and latent heat fluxes are very variable at these times, there is reasonably good agreement between model and observations.The available energy exerts strong control over Q E at these times when water is unlimited.
On the whole, the correlation between modelled and observed fluxes is better in summer than winter (Fig. 14a, b).There are several possible reasons for this.In summer there tend to be fewer measurement issues associated with EC data, the number of data points passing quality control tends to be larger, and the fluxes themselves tend to be larger so relative uncertainties are smaller.In winter shorter days mean the limitations and increased uncertainties associated with night-time data constitute a larger proportion of the dataset.Furthermore, most observational campaigns take place during summer; hence there is a bias in model parameterisation towards summertime conditions.These points should not be overlooked when comparing model performance using different datasets (e.g. the 24-month Kc and Sw results with the Vs87 summertime data); a higher r 2 value would be expected for a dataset restricted to summertime only.(Note the high r 2 for Q E at Kc seen in Fig. 14a for January 2013 should be discounted as Q E OBS is only available for 5% of this month.) Nevertheless, results for the UK sites seem to be broadly consistent with previous studies.Q E is underestimated at both sites in Helsinki, although performance is generally better at the suburban site than the city-centre site (r 2 = 0.21-0.62 and 0.06-0.25,respectively; Karsisto et al., 2015).Q E is also underestimated in Montreal, particularly during snow-free periods (MBE ≈ −10 W m −2 ; Ja14).These are  , 2015) and for the North American sites (20-56 W m − 2 across different seasons; Ja11).To attempt to account for the variation in the size of Q E between seasons and sites, Ja11 normalised the RMSE by the mean of observed Q E .This yields values of 1.3 for Kc and 0.7 for Sw which again suggests strong performance (particularly for Sw) compared with the values of 0.9-1.6 given for the North American sites.

Sensible heat flux
In SUEWS the sensible heat flux is calculated as the residual of the energy balance (Q and therefore accumulates the errors in all other terms.Nevertheless, there is reasonably good agreement with observations (Fig. 12a, b), more so at Sw than Kc.At Kc, Q H is almost always positive as there is a large energy input during both daytime (Q* plus a large Q F contribution) and night-time (Q F plus ΔQ S , as stored energy is released from the urban fabric).Q H is underestimated during the night and early morning at Kc, particularly during winter, due to underestimation of the stored energy release.Similar behaviour was seen at the city-centre site in Helsinki (Karsisto et al., 2015).Adjusting the OHM coefficients for wintertime makes performance worse at these times, although improves Q H during the day (Fig. 13).The generally high values of Q H , particularly in comparison with more vegetated suburban sites, means the RMSE is large (47 W m 2 ).High RMSE values are also found at the city-centre site in Helsinki and range between 38 and 67 W m −2 for the different seasons (Karsisto et al., 2015).At Kc the model represents the variability in Q H much better than in Q E (Fig. 14a), and although r 2 (= 0.53) is reasonable it is lower than at Sw (0.79) or Vs87 (0.77; Ja11) but comparable to the values obtained for Helsinki (0.34-0.67;Karsisto et al., 2015).The overestimation of Q H at Kc during daytime is a result of the underestimation of ΔQ S .Indeed this underestimation of ΔQ S results in overestimation of both turbulent fluxes at Kc, whilst at Sw the errors in Q H MOD and Q E MOD tend to compensate for each other.Similar behaviour is seen in Helsinki: daytime Q H is overestimated at the city-centre site and underestimated at the suburban site (Karsisto et al., 2015).
At Sw, accounting for the seasonality in the OHM coefficients has a major impact on Q H (Figs. 13, 14d).If summertime coefficients are applied all year round, Q H MOD often remains positive during night-time and during winter, in contrast to Q H OBS .Reducing the value of a 3 in winter (Section 4.4) improves ΔQ S and therefore Q H .However, a more sophisticated approach would improve performance further, for example in spring and autumn, night-time Q H is less well-modelled as the OHM coefficients should be somewhere between the summer and winter values applied.In summer, times when night-time Q H OBS is more negative than Q H MOD often coincide with times when Q* OBS is more negative than Q* MOD , as a result of the underestimation of the diurnal cycle of L ↓ (Section 4.2.3)., c) modelled (pink) and observed (grey) normalised soil moisture deficit (SMD).As observed SMD is available for only a short period at Kc, observed SMD has been normalised using maximum and minimum observed SMD for the period when observations are available (light grey line) and using maximum and minimum modelled SMD for the period when observations are available (dark grey line).The dashed pink line corresponds to the grass surface.
normalised between 0 (no deficit, wet soils) and 1 (maximum deficit, dry soils), as observed under grass (in a garden in Swindon, in a park in London) is shown in Fig. 15.
Summer 2012 was extremely wet, except for the first three weeks of September which were dry and fairly sunny.In contrast, most of 2011 and the start of 2012 (until April) had below average rainfall.In the autumn of both 2011 and 2012, the soil moisture stores in the model take longer to refill compared to the observations at Sw (Fig. 15c).The SMD for the grass surface decreases more than the SMD for the whole surface at these times, but the changes in SMD at the end of October 2011 and September 2012 are much smaller in the model compared to observations.Possibly, SUEWS is apportioning too much water to runoff at the expense of infiltration into the soil store.Exhaustion of the soil moisture store under the grass surface was identified as the cause of underestimated evaporation rates for periods of a few days in summer 2011 (Section 4.6.1).At Sw the modelled soil moisture for the grass surface is at its minimum during several periods between June and August 2011, which is not matched by the observations that reach a minimum for a short time in August 2011 only.Interestingly, there is better agreement between model and observations in terms of magnitude during summer 2012, which was much wetter.Perhaps the soil properties used are more suitable for soils that are regularly wetted by irrigation.Adding a second soil layer to SUEWS may also improve model capability, but unfortunately there are insufficient observations for these sites to attempt this here.Improvements to the accuracy of modelled soil moisture would be expected to improve the turbulent heat fluxes too (Section 4.6.1).
Other studies have highlighted the importance of specifying representative initial conditions (e.g.Best and Grimmond, 2013).The recommendation for SUEWS is to start the model run under conditions when soils can be assumed to be wet (but without snow), for example in winter for many regions.Even though the start of 2011 was actually fairly dry in the south of the UK, the initial assumption of high soil moisture does not have a major impact on the results as appreciable rainfall occurs near the start of the time series (85/60 mm at Kc/Sw in January 2011).Setting the soil to be too moist initially could contribute to an overestimation of the surface conductance for spring 2011 (see Section 4.5.3).
The surface characteristics at the two sites mean that the variation in soil moisture is much smaller at Kc than Sw.At Kc there is a substantial proportion of built and paved surfaces which limit infiltration and cannot evaporate water in the modelthey are treated as totally impervious and cracks in the pavement are ignored (Ja11).Future development of SUEWS should allow for some infiltration into and evaporation from built and paved surfaces to more closely represent reality (Hollis and Ovenden, 1988;Ramier et al., 2011), particularly as the use of permeable pavements is likely to become more widespread (Morgenroth and Buchan, 2009;Nakayama and Fujita, 2010).

Surface wetness
At every time-step, SUEWS outputs the amount of water held on the surface (e.g. in the tree canopy, on roofs, on pavements).These surface stores provide a supply of moisture that can evaporate, infiltrate or contribute to runoff at a later time-step, and whether the surfaces are wet, or not, affects the modelled evaporation rate (Section 4.6.1).Comparison of the modelled surface wetness state with data from a wetness sensor (which indicates whether surfaces are wet or dry but does not provide the amount of water present) suggests that the wetting and drying of surfaces following rainfall is well represented by the model.Results are shown here for July 2012 for Sw (Fig. 16); no data were available for Kc.

Energy exchanges at the two sites
SUEWS can be used to investigate differences in energy exchanges at these two sites.Initially, SUEWS is run for Sw, then each set of characteristics is adjusted (cumulatively) to match the input for Kc.First the meteorological input data are changed to those for Kc ('met'), next the building and tree height (and therefore z 0m and z d ) are adjusted to the Kc values ('z H '), next the Kc surface cover fractions are used ('f i '), and lastly the anthropogenic heat flux (model coefficients and population density) for Kc is used ('Q F ').Following this last step, almost identical model results are obtained for the Kc run and the adjusted Sw run (Fig. 17).
The similarity in meteorological conditions at the two sites means there is little impact on the overall climatology of the fluxes of using the Kc meteorological forcing at the Sw site.In a direct comparison of 60-min values (not shown), using the same meteorological forcing for the two sites reduces scatter in Q* and  therefore also in ΔQ S (and to a lesser extent Q H and Q E ) because there is a lag time between the sites even though they experience roughly the same synoptic conditions.Building and tree heights (and therefore surface roughness) are very different between the two sites (Table 1) but have little impact on the model results.The surface cover fractions are a significant control on the energy balance, however, as shown in numerous previous studies (e.g.Grimmond and Oke, 2002;Christen and Vogt, 2004;Offerle et al., 2006;Goldbach and Kuttler, 2013).In terms of the model results, Q* and ΔQ S are now the same for Kc and the adjusted Sw run (Fig. 17, column 5) because the radiation balance is identical (same meteorological forcing, same bulk albedo and emissivity values) and the OHM coefficients for the different surface types are now combined in the same proportions.Q H increases slightly whilst Q E is reduced slightly during the daytime as there is less vegetation and more impervious surface at Kc, but Q E increases at night because the water in the Kc footprint provides a continuous supply of moisture.However, comparison of columns 1 and 2 shows that the model does not perfectly capture the observed differences in Q E between the sites: Q E is slightly underestimated at Sw and substantially overestimated at Kc (Section 4.6.1),hence the difference in Q E MOD is much smaller than in Q E OBS .Adjusting the population density and Q F model coefficients (Table 3) accounts for the remaining differences in the modelled fluxes.The extra energy supplied by Q F primarily increases Q H and slightly increases Q E (column 6 of Fig. 17).
This exercise demonstrates the impact on surface energy exchanges of developing a suburban area into a dense urban region.SUEWS can be used in this way to explore options for designing sustainable cities or to assess the merits of various planning scenarios, such as including green-space or water bodies to mitigate heat stress.However, no model is perfect.This evaluation indicates that SUEWS does not fully replicate the observed differences in fluxes at these sites (compare columns 1 and 2 of Fig. 17).Q E is overestimated at Kc (largely due to the fact that evaporation from the river boosts Q E MOD yet does not seem to contribute substantially to Q E OBS ) and although the clear difference in Q H between the sites is replicated by the model, Q H is overestimated at Kc and slightly underestimated at Sw.In addition, the observed differences in storage heat flux are underestimated, as the OHM coefficients for the various surface cover types are too similar to generate the large differences observed in ΔQ S , even for very different source area characteristics (Section 4.4).

Conclusions
Recent developments to the SUEWS model are presented and the performance of the model is evaluated for two UK sites: a suburban residential neighbourhood in Swindon and a dense urban site in central London.These sites differ in various ways from the North American sites where the model was developed (e.g.building materials, energy and water use), particularly the central London site which is more built-up and more densely populated than sites where SUEWS has been applied previously.The universality or suitability of model parameters are assessed and new alternatives suggested.These will help model users select suitable values for other sites.The importance of accounting for seasonal variation in the OHM coefficients is highlighted.Several improvements have been made to the model itself, including an alternative formulation for the surface conductance.
The following conclusions are drawn from the model evaluation: -Vegetation phenology is generally modelled well but leaf-out is reached slightly too early and suddenly, leading to an overestimation of g s and Q E in spring.-Selection of suitable albedo values for a study site is important for obtaining realistic K ↑ , for example European cities tend to have lower albedos than North American cities.The model capability has been increased in terms of representing seasonal changes in albedo.Further developments may be beneficial (e.g.taking into account sun angle, wet/dry conditions) but are likely to have only a small impact.-Modelled L ↑ agrees well with observations but L ↑ may be slightly underestimated at Kc due to the impact of Q F on L ↑ OBS .-L ↓ can be modelled or observations can be provided if available.If L ↓ is modelled performance is reasonable, although the range of modelled values is smaller than the observed range and the amplitude of the diurnal cycle is underestimated.This is attributed to the narrowness of the cloud fraction distribution when estimated from RH and T air .Q* is improved if SUEWS is provided with L ↓ observations but there is no significant improvement in Q H or Q E .-Generally Q* is modelled well, in accordance with previous studies.
-For dense urban areas, obtaining an accurate estimate of Q F can be a significant issue, as Q F can be very large (~100 W m − 2 ) so relative uncertainties are substantial in absolute terms.An uncertainty of 5-10 W m −2 in Q F for suburban areas is generally fairly small compared to the energy available from Q*, whereas for city centres the uncertainty could easily be 50-100 W m −2 or more.-Inaccurate estimation of Q F can result in poor simulation of the other heat fluxes, particularly in areas of high population density where Q F is large and can be very variable spatially.The flexibility of SUEWS allows a time series of Q F to be provided as an input so that output from a more sophisticated model could be used if it is judged to be more appropriate than the simple approaches possible within SUEWS.-Wintertime storage heat flux modelled using OHM is significantly underestimated in suburban areas by about 30 W m −2 (i.e.nocturnal heat release is overestimated) due to a bias in derived parameters towards summertime field campaigns.For SUEWS, this impacts the sensible heat flux in particular.Even a basic approach to adjust the OHM coefficients for wintertime (larger a 1 , smaller a 3 ) significantly improves model performance.At dense urban sites, nocturnal heat release is underestimated.Furthermore, differences in ΔQ S between suburban and dense urban sites are underestimated, probably because (i) it is unclear how to incorporate the anthropogenic heat flux and (ii) the OHM coefficients are fairly similar for the different surface types.The narrow range of materials and conditions studied currently limits the applicability of the OHM approach.Further research is needed to inform parameterisations which better account for building volume, construction materials and seasonality.-SUEWS does not model the sign of the stability well during night-time at the suburban site, although this is not necessarily problematic as the stability is close to neutral.At both sites, stability is often underestimated by the model which leads to the aerodynamic resistance being underestimated.-Observations indicate Q H b 0 W m −2 at night in summer in Swindon but this pattern is not matched by the model.As Q H is calculated from the residual of the energy balance it collects the errors in all the other terms.The storage heat flux, in particular, is problematic, but accounting for seasonality in the OHM coefficients improves Q H considerably. -Q E is slightly underestimated at Sw but overestimated at Kc.The overestimation at Kc is thought to result from the complexity of the site and overestimation of evaporation from the river surface.Evaluation of SUEWS over open water would be beneficial, as well as more observational campaigns in cities with a substantial body of water within the flux footprint.
This study reinforces the importance of having a good understanding of the measurements used in model evaluations.In particular, uncertainties associated with observational data, limitations of the measurement techniques, representativeness of the observations and compatibility between model and observations must be considered.Time-series analysis is useful for checking for unrealistic model output which may be missed when considering summary statistics or plots.
The model was used to investigate the differences in surface energy exchange between the two sites.The biggest changes to the surface energy balance are attributed to the surface cover fractions (in particular the proportion of vegetation versus impervious surface) and the energy available (through the anthropogenic heat flux).As such, the model can be used to explore the impact of various urban-design scenarios on the local environment.
This work highlights the need to evaluate models under a range of conditions.The evaluation datasets here span two years and represent quite different conditions in terms of moisture availability.However, they by no means cover the full parameter space.Testing under more extreme conditions would be beneficial.This would need to be supported by a drive to capture extreme events, either by rapid deployment of instrumentation for intense observation periods during heat waves or droughts, for example, or by sustained support for longterm measurements.Priorities for future research include improving the seasonal variability in storage heat flux; more accurately representing the anthropogenic heat flux in densely populated areas; and evaluation across a wider range of sites, especially city centres.The SUEWS model and manual can be downloaded from http://micromet.reading.ac.uk and http://urbanclimate.net/umep/SUEWS.

Appendix A. Surface conductance parameterisation
For SUEWS to be applicable across a range of sites and conditions the surface conductance (g s ) parameterisation must be able to respond to a range of conditions.An alternative parameterisation for g s has been developed and is described here.Note that this is not intended to be the 'best-fit' to any particular dataset, but rather seeks to provide reasonable model performance over a wide range of circumstances, especially for those sites where little or no irrigation occurs.
The functional form of the dependence on incoming shortwave radiation, K ↓ , is retained (Stewart, 1988), where K ↓ max is the maximum observed incoming shortwave radiation (here K ↓ max = 1200 W m −2 ).As K ↓ increases, stomata open to allow photosynthesis and transpiration, thus g s increases.g(K ↓ ) reaches a maximum value of 1 when K ↓ = K ↓ max (if K ↓ N K ↓ max , then g(K ↓ ) is set to 1).The coefficient G 2 , determines the shape of the curvethe smaller G 2 , the more pronounced the corner; the larger G 2 , the more linear the relation.The value of 200 W m −2 used here is close to the middle of the range for other sites (Fig. A1a).If G 2 is too small, the diurnal cycle of g(K ↓ ) may exhibit artificially pronounced corners and a flat top.Different functional forms have been suggested for the dependence on specific humidity, Δq, including linear-piecewise (Jarvis, 1976;Stewart, 1988) and exponential (Irmak and Mutiibwa, 2010) relations (Fig. A1b).Here the relation suggested by Ogink-Hendriks (1995) is used: Note that the coefficients G 3 and G 4 used here are not directly comparable to those used in Jarvis (1976) or Ja11, as the functional form is different.This relation shows a similar decrease in stomatal conductance as the humidity deficit increases but does not have the sharp corner characteristic of the linear-piecewise model.The functional form of the dependence on air temperature, T air , is retained (Stewart, 1988), The peak value of g(T air ) occurs at T air = G 5 and the lower (T L ) and upper (T H ) temperature limits determine when evaporation switches off.The temperature limits are extended to T L = −10 °C and T H = 55 °C (Fig. A1c).
For the dependence on soil moisture deficit, Δθ, a new functional form is proposed which is normalised so that g(Δθ) ranges from 1 when there is no soil moisture deficit to 0 when the soil moisture is totally depleted and no water can be accessed for evapotranspiration (i.e.wilting point Δθ WP ): The wilting point has been set at Δθ WP = 120 mm, much larger than for Ja11 and larger than the 80-90 mm used at Thetford Forest (Stewart, 1988) but smaller than ≈140 mm used in Ogink-Hendriks (1995).The influence of soil moisture on surface conductance has been reduced relative to the Ja11 functions; now the behaviour is much more similar to the Stewart (1988) or Ogink-Hendriks (1995) behaviour, where soil moisture has little influence on g s until there is an appreciable deficit (Fig. A1d).In SUEWS_v2016a g s is calculated using the soil moisture deficit under vegetated surfaces only.
The coefficients summarised in Table A1 are informed by an empirical fit to Swindon data (g s from Eq. ( 9)) but aim to be as generic as possible.Non-linear least squares regression provided G 1 − G 6 once K ↓ max , T L , T H and Δθ WP were set to ensure a wide range of environmental and climatic conditions could be incorporated.
Compared to the Ja11 formulation, the new set of relations led to improved performance at Sw during summer (Fig. 14), mainly due to the relaxed control of soil moisture and reduction in the frequency of times when evaporation was restricted by unrealistic g s values.RMSE improves by 1-2 W m − 2 for both heat fluxes at Sw and for Q H at Kc (but the RMSE for Q E increases by the same amount); r 2 for Q E decreases slightly from 0.26 to 0.25 at Kc but increases from 0.68 to 0.72 at Sw.Despite the lack of dramatic improvement in model performance, the removal of unrealistic restrictions is important.When SUEWS performs best (i.e. at Sw in summer) the new parameterisation is statistically better (e.g.r 2 for Q E increases from 0.59/0.65 to 0.73/0.71for May/June).Promisingly, performance in Vancouver remains similar with this new parameterisation (Kokkonen et al., 2015).In SUEWS_v2016a, either Ja11 or this new parameterisation can be used.It is highly recommended that time series of Q E MOD and g s MOD are examined to ensure the selected parameterisation is appropriate.

Table A1
Values for the surface conductance parameterisation proposed in this study.Note that G 3 and G 4 are not directly comparable with the Ja11 values as they correspond to equations of a different form.

Fig. 2 .
Fig. 2. (a) Location within the UK and photographs of the (b) London (Kc) and (c) Swindon (Sw) sites.

Fig. 3 .
Fig. 3. Modelled versus observed radiation components for London (Kc) and Swindon (Sw).As K ↓ is required as a model input K ↓ MOD in (a, b) is taken from the 60-min model output file but it is not actually calculated by the model (see text for details).Light grey points in (d) indicate when snow settled in Sw.

Fig. 5 .
Fig. 5. (a, b) Frequency distributions of modelled and observed cloud fraction F CLD ; (c, d) cloud fraction versus relative humidity (dotted lines indicate F CLD calculated using Eq.(3) for T air = −10 °C and 30 °C).F CLD OBS was not available at Sw, but the distribution is expected to be very similar to that at Kc due to the proximity of the two sites.

Fig. 6 .
Fig. 6. (a, e) Median diurnal cycle and inter-quartile range for modelled and observed net all-wave radiation Q*; (b, f) difference between modelled and observed Q* by time of day; (c, g) scatter plot of modelled versus observed Q*. Results are also shown (in purple) for the case where observed L ↓ is provided with the meteorological forcing data (d, h) rather than modelled.

Fig. 8 .
Fig. 8. Median diurnal cycle and inter-quartile range by month for modelled and observed net storage heat flux ΔQ S and the difference between modelled and observed ΔQ S by time of day.Results are also shown (in cyan) for the case where the OHM coefficients are kept constant throughout the year (MOD CC ).Note different y-axis scales for Kc and Sw.

Fig. 9 .
Fig. 9. Variation of OHM coefficients by month for the Kc and Sw sites and for a dry asphalt surface (An99).

Fig. 10 .
Fig. 10.(a, b) Modelled versus observed friction velocity coloured according to wind direction; (c, d) difference between modelled and observed friction velocity as a function of wind direction; (e, f) observed drag coefficient C D = (u ⁎ /U) 2 in near-neutral conditions (−0.1 ≤ ζ b 0.1) as a function of wind direction.Boxplots indicate the median and inter-quartile range for bins of 10°.

Fig. 11 .
Fig. 11.Median diurnal cycle and inter-quartile range by month for modelled (pink) and observed (grey) friction velocity u ⁎ ; stability parameter ζ calculated from (z m − z d ) / L Ob MOD or (z mz d ) / L Ob OBS ; and, for dry conditions only, surface conductance g s .For ζ, results are also shown (in cyan) for the case where the OHM coefficients are kept constant throughout the year (MOD CC ).Note different y-axis scales (ζ and g s ) for Kc and Sw.

Fig. 12 .
Fig. 12. Modelled versus observed (a, b) sensible heat flux Q H and (c, d) latent heat flux Q E for London (Kc) and Swindon (Sw).

Fig. 13 .
Fig. 13.Median diurnal cycle and inter-quartile range by month for modelled (pink) and observed (grey) sensible heat flux Q H and latent heat flux Q E .Results are also shown (in cyan) for the case where the OHM coefficients are kept constant throughout the year (MOD CC )•.

Fig. 14 .
Fig. 14.Model performance in London (Kc) and Swindon (Sw): (a, b) coefficient of determination for net all-wave radiation, sensible heat flux and latent heat flux; mean bias error for (c, d) sensible heat flux and (e, f) latent heat flux for all data and data separated into daytime and night-time.Dashed lines in (a, b) are for the Ja11 surface conductance parameterisation (based on soil moisture under all surfaces as opposed to vegetated surfaces only, see Appendix A).Dotted lines in (c-f) are for the case where the OHM coefficients are kept constant throughout the year.
4.7.Surface and sub-surface moisture stores 4.7.1.Soil moistureModelled soil moisture responds well to drying and wetting periods at the timescale of a few days and the variation is in good agreement with observations across the seasons at Sw.The soil moisture deficit (SMD),

Fig. 15 .
Fig. 15.(a) Rainfall (60 min and cumulative total); (b, c) modelled (pink) and observed (grey) normalised soil moisture deficit (SMD).As observed SMD is available for only a short period at Kc, observed SMD has been normalised using maximum and minimum observed SMD for the period when observations are available (light grey line) and using maximum and minimum modelled SMD for the period when observations are available (dark grey line).The dashed pink line corresponds to the grass surface.

Fig. 17
Fig. 17.Median diurnal cycle and inter-quartile range of the energy fluxes for London (Kc) and Swindon (Sw) according to observations (OBS) and the model (MOD) and swapping Sw input for Kc input characteristics, including meteorological driving data (met), building and tree height (z H ), surface cover fractions (f i ) and anthropogenic energy use (Q F ). Kc results are the same in columns 2-6, whilst the Sw results change from representing the Sw site (column 2) to representing the Kc site (column 6).

.
Fig. 17.Median diurnal cycle and inter-quartile range of the energy fluxes for London (Kc) and Swindon (Sw) according to observations (OBS) and the model (MOD) and swapping Sw input for Kc input characteristics, including meteorological driving data (met), building and tree height (z H ), surface cover fractions (f i ) and anthropogenic energy use (Q F ). Kc results are the same in columns 2-6, whilst the Sw results change from representing the Sw site (column 2) to representing the Kc site (column 6).

Table 2
Original and adjusted albedo values and emissivity values.Minimum/maximum albedo values are for leaf-off/leaf-on periods.In the absence of additional information, emissivity values from Ja11 were used for the UK sites a .