Computers, Environment and Urban Systems
Quantifying how landscape composition and configuration affect urban land surface temperatures using machine learning and neutral landscapes

The urban heat island effect is an important 21st century issue because it intersects with the complex challenges of population growth, global climate change, public health and increasing energy demand. While the effects of urban landscape composition on land surface temperature (LST) are well-studied, less attention has been paid to the spatial arrangement of land cover types, especially in smaller, often more diverse cities. Landscape configuration is important because it offers the potential to provide refuge from excessive heat for both people and buildings. We present a novel approach to quantifying how both composition and configuration affect LST derived from Landsat imagery in Southampton, UK. First, we trained a machine-learning (generalized boosted regression) model to predict LST from landscape covariates that included the characteristics of the immediate pixel and its surroundings. The model achieved a correlation between predicted and measured LST of 0.956 on independent test data (n = 102,935) and included predictors for both the immediate and adjacent land use. In contrast to other studies, we found adjacency effects to be stronger than immediate effects at 30 m resolution. Next, we used a landscape generation tool (Landscape Generator) to alter landscape configuration by varying natural and built patch sizes and arrangements while holding composition constant. The generated neutral landscapes were then fed into the machine learning model to predict patterns of LST. When we manipulated landscape configuration, the average city temperature remained the same but the local minima varied by 0.9°C and the maxima by 4.2°C. The effects on LST and heat island metrics correlated with landscape fragmentation indices. Moreover, the surface temperature of buildings could be reduced by up to 2.1°C through landscape manipulation. We found that the optimum mix of land use types is neither at the land-sharing nor land-sparing extremes, but a balance between the two. In our city, maximum cooling was achieved when ~60% of land was left natural and distributed in 7-8 patches km⁻², although this could be location dependent and further work is needed. Opportunities for urban cooling should be required in the planning process and must consider both composition and configuration at the landscape scale if cities are to build capacity for a growing population and climate change.


Introduction
The urban heat island effect must be one of the most studied of all environmental phenomena and matters greatly in human terms because it intersects with four pressing challenges of the 21st century: population growth, global climate change, public health and increasing energy demand. Over half of the world's population is currently concentrated in cities and this proportion is forecast to grow, leading to expanded [...] (Johnson & Wilson, 2009). In contrast, wealthier inhabitants are likely to respond by purchasing more air conditioning units, increasing energy demand in cities (Mirzaei & Haghighat, 2010).
Most studies of urban temperatures have focused on large cities (Peng et al., 2012; Tran, Uchihama, Ochi, & Yasuoka, 2006) where classic heat island effects are more prominent (Oke, 1973; Tan & Li, 2015), while smaller cities (where the majority of people live) have received little attention (Heinl, Hammerle, Tappeiner, & Leitinger, 2015; Ivajnšič, Kaligarič, & Žiberna, 2014). There has also been a tendency to study less diverse environments (Oke, 1973; Tan & Li, 2015), minimising complicating factors such as elevation, proximity to water and landscape diversity (Fabrizi, Bonafoni, & Biondi, 2010; Tan & Li, 2013). In attempting to describe landscapes, two terms have long been used in landscape ecology (e.g. Gustafson, 1998). Landscape composition refers to the number (or proportions) of land use categories within a defined unit (e.g. patch, pixel or municipal area) whereas landscape configuration considers the spatial arrangement of those units. While the effects of landscape composition on urban land surface temperature (LST) are well-known (green areas tending to be cooler and impervious surfaces hotter), fewer studies have considered how landscape configuration affects temperatures (but see Asgarian, Amiri, & Sakieh, 2015; Gage & Cooper, 2017; Li et al., 2011). Adjacency effects (i.e. what is next to what) could be significant determinants of local urban temperatures but are relatively under-researched (Chun & Guldmann, 2014; Rajasekar & Weng, 2009; Su, Foody, & Cheng, 2012). Understanding what drives the diversity of temperature in urban areas may provide the clues needed to build capacity for mitigating some of the negative effects of climate change (Carter et al., 2015; Kleerekoper, Van Esch, & Salcedo, 2012) such as heatwaves, which are known to increase mortality (e.g. Anderson & Bell, 2011). In this context, temperature regulation is one of several ecosystem services afforded by green space in urban areas and is a vital component of the land-sharing v. land-sparing debate (Collas, Green, Ross, Wastell, & Balmford, 2017; Stott, Soga, Inger, & Gaston, 2015). In other words, should cities favour low-density built land interspersed with green space but no large parks (land-sharing); or should they feature high-density buildings with large contiguous blocks of green space being set aside (land-sparing)?
One way to study how landscape configuration affects urban temperatures is to compare multiple cities or spatial units within cities (Gage & Cooper, 2017), but results may be confounded by covariates that cannot be controlled, including differences in composition. In this paper, we took a different approach, conducting a virtual experiment in which land use configuration was varied while composition was held constant, based on the actual make-up of a real landscape. Firstly, an empirical model was built to predict land surface temperature (LST) from landscape composition at the place of interest and from the surrounding area. Adjacency effects are thus intrinsic to this model. Although a wide range of approaches has been used to model LST, the complex non-parametric, non-linear and interacting relationships among variables mean that conventional statistical methods such as regression analysis have limited use. Robust methods come from machine learning where the data themselves drive the form of the relationship rather than relying on assumed linearity or normality. Decision trees offer a good approach because they are intuitive, can handle mixed predictor types and missing data, are invariant under monotonic transformations, automatically handle interactions, are little affected by the inclusion of irrelevant predictor variables (Hastie, Tibshirani, & Friedman, 2009) and are also relatively immune to the effects of collinearity (Dormann et al., 2013). While single decision trees (e.g. regression trees as used by Guo et al., 2015) suffer from a potential lack of accuracy and instability due to the propagation of errors down successive splits of the tree, this may be overcome by combining the results of many trees through ensemble methods. In particular, stochastic gradient boosting (Hastie et al., 2009) through generalized boosted regression models has advantages in bias reduction over competing techniques such as bootstrap aggregation ("bagging") as used in Random Forests (Elith, Leathwick, & Hastie, 2008; Gage & Cooper, 2017) and was used in this paper.
After deriving a model for LST, we generated predicted temperature surfaces for synthetic landscapes in which land use configuration was varied while composition was held constant. Land-use patterns generated in this way are termed neutral landscapes since they are "neutral" to the processes that form real landscapes (Gardner, Milne, & Turner, 1987). Neutral landscape models typically allocate user-defined proportions of land use to pixels at random and then cluster those pixels according to some rule base (Saura & Martínez-Millan, 2000). The allocation of land use types defines composition, while the clustering defines the landscape configuration based on chosen patch metrics (such as size and shape) or adjacency rules (land use A next to land use B). Neutral landscapes thus mimic the composition and configurations of real landscapes but are constructed in an artificial way. Despite frequent use to test hypotheses in landscape ecology (Li et al., 2004; van Strien, Slager, de Vries, & Grêt-Regamey, 2016 and references therein), neutral landscapes have not previously been used to study patterns in LST to our knowledge. By using the neutral landscapes as test datasets in the predictive model, patterns of LST may be generated to address the question of whether landscape configuration affects the temperature and heat island characteristics of an urban area. Our focus in particular was on whether land-sharing or land-sparing is the best strategy in urban planning for temperature regulation and the provision of refuges from extreme conditions such as heatwaves.

Fig. 1. [...] (left) and March (right) with water masked out in white. Green spaces are clearly visible as cooler (blue) areas and the impervious surfaces of the docks stand out as hotter areas (red) in September. The distance-weighted centre of all buildings (used to derive the variable DIST_CENTRE) is also shown. Gaps due to cloud and SLC-off effects in the Landsat imagery have been filled using a 3 × 3 neighbourhood filter to improve visualization here.

Table 1 (footnotes). [...] Sobrino, Jiménez-Muñoz, and Paolini (2004). c Stathopoulou, Cartalis, and Petrakis (2007).

Study area
The study was centred on the city of Southampton covering an area of 51.8 km² at the confluence of the Test and Itchen rivers in Hampshire, southern England, UK, with a population of c. 245,000 people. Much of Southampton's western coast is dominated by impervious land surfaces devoted to the port and docks, but the city also retains over 50 well-distributed parks and open green spaces covering c. 11 km² (Fig. 1). These varied land cover classes make Southampton an ideal candidate for studying heterogeneity in LST in a complex, medium-sized city.

Land surface temperature
Land surface temperature (LST) was determined for an extended region (427 × 329 pixels at 30 m resolution = 126.4 km²) around the study city using Landsat 7 Enhanced Thematic Mapper Plus ([...] for other studies) with satellite overpasses at 10.48 h local time. Images were georectified against the Ordnance Survey (GB) MasterMap Topography layers (described below) using 10 or 11 ground control points with an RMS error of less than one pixel (7.96-10.46 m).
Landsat 7 ETM+ images collected after 31 May 2003 suffer from failure of the Scan Line Corrector that compensates for the forward motion of the satellite, resulting in ~22% of pixels being lost per scene on average (https://landsat.usgs.gov/slc-products-background). These SLC-off effects are most pronounced at the edge of a scene and decrease towards its centre. As other image characteristics are unaffected, careful selection of study areas and scenes allows Landsat ETM+ imagery to be used with only minor impacts. For our full study area, the percentage of affected pixels was only 1.9% in March, 1.5% in May, 1.4% in September and 1.0% in November. For the central area used in the neutral landscape models (below), only 0.07% of pixels were affected over the four months. Together with cloud and cloud shadow, these pixels were flagged as defective and omitted from our model building processes. As the location of SLC-off effects is non-systematic with [...] Moderate Resolution Imaging Spectroradiometer (MODIS) atmospheric correction routines to Level-1 data products. Water vapour, ozone, geopotential height, aerosol optical thickness and digital elevation are input with Landsat data to the Second Simulation of a Satellite Signal in the Solar Spectrum (6S) radiative transfer models to generate top of atmosphere (TOA) reflectance, surface reflectance, brightness temperature and quality assurance (QA) layers.
LST was derived from brightness temperature using the formula in Artis and Carnahan (1982):

\[ \mathrm{LST} = \frac{\mathrm{BT}}{1 + \left( \lambda \cdot \mathrm{BT} / \alpha \right) \ln \varepsilon} \]

where LST = land surface temperature in K; BT = brightness temperature in K; λ = wavelength of emitted radiance in m (for Landsat 7 ETM+ the midpoint of the thermal band is 11.45 μm, i.e. 11.45 × 10⁻⁶ m); α = constant derived as h × c / σ, where h = Planck's constant (6.626 × 10⁻³⁴ J s), c = the speed of light (2.998 × 10⁸ m s⁻¹) and σ = the Boltzmann constant (1.38 × 10⁻²³ J K⁻¹), giving a value of 1.438 × 10⁻² m K; and ε = emissivity of the surface in the range 0 to 1.
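The emissivity correction can be illustrated with a short calculation. The sketch below assumes the commonly cited form of the Artis and Carnahan (1982) relationship, LST = BT / (1 + (λ · BT / α) · ln ε), and uses the standard value of Planck's constant (6.626 × 10⁻³⁴ J s), which reproduces the quoted α of about 1.438 × 10⁻² m K. The function name and the example pixel values are hypothetical and are not taken from the study data.

```python
import math

# Physical constants in SI units (standard values).
PLANCK_H = 6.626e-34       # J s
LIGHT_C = 2.998e8          # m s^-1
BOLTZMANN = 1.38e-23       # J K^-1
ALPHA = PLANCK_H * LIGHT_C / BOLTZMANN   # m K, about 1.438e-2
WAVELENGTH = 11.45e-6      # m, midpoint of the Landsat 7 ETM+ thermal band


def lst_from_bt(bt_kelvin, emissivity):
    """Emissivity-corrected land surface temperature (K) from brightness temperature (K)."""
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must lie in (0, 1]")
    correction = (WAVELENGTH * bt_kelvin / ALPHA) * math.log(emissivity)
    return bt_kelvin / (1.0 + correction)


# Hypothetical pixel: brightness temperature 290 K, mean emissivity 0.96.
lst_kelvin = lst_from_bt(290.0, 0.96)
print("LST:", round(lst_kelvin - 273.15, 2), "deg C")
```

Because ln ε is negative for ε < 1, the corrected LST is always at least as high as the brightness temperature, and the two are equal for a perfect emitter (ε = 1).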
Surface emissivity values ε obtained from the literature (Table 1) were applied using the land classification method (Dash, Göttsche, Olesen, & Fischer, 2002). Six land classes were identified from Ordnance Survey (GB) MasterMap Topography layers and rasterised to 1 m resolution (see below). Each pixel was assigned an emissivity value from Table 1 and the mean emissivity was then calculated at 30 m resolution for analysis. In practice, urban pixels at 30 m resolution are rarely of a single land use type and the average emissivity of mixed pixels typically ranged from 0.94 to 0.98 in Southampton.

Fig. 2. Composition and configuration of the landscapes. All landscapes had the pixel composition of the original with 8.9% buildings (red), 34.5% hard surfaces (grey), 35.5% mixed surfaces (peach) and 21.1% natural surfaces (green). The numbers of mixed and hard patches were fixed at 403 and 1516 respectively for all landscapes while the numbers of built and natural patches were varied as shown. Contagion expresses the probability that two randomly selected adjacent pixels belong to the same land class, expressed as a percentage.
Predictor variables (Table 2; column 1) for modelling mean LST (the average of the four monthly LST values) were derived from high-resolution vector and raster products. Ordnance Survey (GB) MasterMap Topography data (scale 1:1250) were obtained using the EDINA Digimap Ordnance Survey Service (http://digimap.edina.ac.uk, downloaded 10 March 2015). The feature classes were grouped into six themes: natural surfaces (grassland, trees), hard surfaces (tarmac roads, car parks, paths, concreted areas), building footprints, glasshouses, mixed surfaces (mainly gardens comprising hard and natural surfaces) and the remainder (consisting of small unclassified areas, open water and coastal areas). Each feature class was rasterized to 1 m resolution to yield a single class per pixel. Pixels were then aggregated to the 30 × 30 m resolution of the Landsat imagery, resulting in percentage land cover classification for each Landsat pixel based on 900 sample pixels. Only the variables for natural, hard, mixed and built surfaces were used as predictors since proportions of glasshouses were tiny. Elevation, slope and aspect were derived from the OS Terrain 5 m DTM data set using the EDINA Digimap Ordnance Survey Service (http://digimap.edina.ac.uk), downloaded 13 April 2015.
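The aggregation from 1 m class pixels to percentage cover per 30 m Landsat pixel can be sketched as a block average. This is a minimal illustration only: the class codes, the function name and the random demonstration raster are assumptions, and the original processing was carried out in GIS software rather than with this code.

```python
import numpy as np


def percent_cover(class_raster, class_code, block=30):
    """Percentage of each block x block window occupied by class_code."""
    rows, cols = class_raster.shape
    if rows % block or cols % block:
        raise ValueError("raster dimensions must be multiples of the block size")
    hits = (class_raster == class_code).astype(float)
    # Give each window its own pair of axes, then average within windows
    # (900 one-metre cells per 30 m pixel when block = 30).
    windows = hits.reshape(rows // block, block, cols // block, block)
    return 100.0 * windows.mean(axis=(1, 3))


# Hypothetical class codes for the four land cover predictors.
CLASS_CODES = {"natural": 1, "hard": 2, "built": 3, "mixed": 4}

# Synthetic 1 m raster covering 2 x 3 Landsat pixels (60 m x 90 m).
rng = np.random.default_rng(0)
raster_1m = rng.integers(1, 5, size=(60, 90))
cover_30m = {name: percent_cover(raster_1m, code) for name, code in CLASS_CODES.items()}
print(cover_30m["built"].shape)
```

Since every 1 m cell carries exactly one class, the percentage layers sum to 100 in each 30 m pixel whenever all classes present are included.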
Variables for adjacency were calculated as percentage land cover in non-overlapping 30 m annuli around the focal pixel using ArcGIS 10.3. For example, the variable BUILD_ANN12 (Table 2; row 3) assigned to the focal pixel how much of the land in an annulus starting at 30 m and ending at 60 m from the pixel centre was occupied by buildings. Similarly, NAT_ANN34 denotes how much land in an annulus from 90 m to 120 m from the pixel centre was classed as natural land cover. Distance variables (from the pixel centre to the nearest water and the weighted centre of buildings; see location on Fig. 1) were included as neighbourhood effects.
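An annulus variable of this kind can be approximated on the 30 m grid as a focal mean over cells whose centres fall within the ring. In the sketch below, distances are expressed in cell units (1 cell = 30 m), so BUILD_ANN12 corresponds to inner = 1 and outer = 2. The inclusion rule (inner < distance ≤ outer), the treatment of grid edges and the function name are assumptions; the rule applied by the ArcGIS focal statistics used in the study may differ.

```python
import numpy as np


def annulus_mean(cover, inner, outer):
    """Mean of `cover` over cells whose centres lie inner < d <= outer cells away."""
    rows, cols = cover.shape
    r = int(np.ceil(outer))
    # Pad with NaN so that cells beyond the grid edge are ignored.
    padded = np.pad(cover.astype(float), r, constant_values=np.nan)
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            d = np.hypot(dy, dx)
            if not (inner < d <= outer):
                continue
            shifted = padded[r + dy : r + dy + rows, r + dx : r + dx + cols]
            valid = ~np.isnan(shifted)
            total += np.where(valid, shifted, 0.0)
            count += valid
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)


# Hypothetical 5 x 5 grid of percentage building cover: one fully built pixel.
built = np.zeros((5, 5))
built[2, 2] = 100.0
build_ann12 = annulus_mean(built, inner=1, outer=2)
print(build_ann12)
```

With this rule the focal pixel and its four rook neighbours are excluded from the 30-60 m ring, so the fully built pixel contributes to ring values two cells away but not to those of its immediate neighbours.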
To model LST, decision trees with stochastic gradient boosting were fitted using generalized boosted regression models (R package gbm 2.1.3; Southworth, 2015). Ten-fold cross-validation optimisation of the model was performed using stepwise selection (Elith et al., 2008) on a random training sample of 10% of pixels (n = 11,438). Model simplification was achieved by backwards elimination of the least important variables until the change in deviance exceeded its standard error in the original model (Elith et al., 2008). The simplified model was then fitted to the entire dataset using the optimised parameter values: number of trees = 5600, bag fraction = 0.5, tree complexity = 5 and learning rate = 0.01.
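The principle of stochastic gradient boosting can be shown with a deliberately small implementation. This is not the gbm package and does not reproduce the fitted model: it uses single-split trees (tree complexity of 1 rather than 5), squared-error loss, far fewer trees, and synthetic data. The function names and demonstration settings are assumptions; only the roles of the learning rate and bag fraction mirror the parameters named above.

```python
import numpy as np


def fit_stump(x, residual):
    """Least-squares regression stump: (feature, threshold, left value, right value)."""
    best = None
    for j in range(x.shape[1]):
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty.
        for t in np.unique(x[:, j])[:-1]:
            left = x[:, j] <= t
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = ((residual[left] - lv) ** 2).sum() + ((residual[~left] - rv) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    if best is None:  # all predictors constant in this subsample
        m = residual.mean()
        return 0, np.inf, m, m
    return best[1:]


def predict_stump(stump, x):
    j, t, lv, rv = stump
    return np.where(x[:, j] <= t, lv, rv)


def fit_boosted(x, y, n_trees=200, learning_rate=0.1, bag_fraction=0.5, seed=0):
    """Each tree is fitted to current residuals on a random subsample (the 'bag')."""
    rng = np.random.default_rng(seed)
    base = float(y.mean())
    pred = np.full(len(y), base)
    stumps = []
    n_bag = max(2, int(bag_fraction * len(y)))
    for _ in range(n_trees):
        idx = rng.choice(len(y), size=n_bag, replace=False)
        stump = fit_stump(x[idx], y[idx] - pred[idx])
        pred += learning_rate * predict_stump(stump, x)  # shrunken update
        stumps.append(stump)
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    pred = np.full(x.shape[0], base)
    for stump in stumps:
        pred += learning_rate * predict_stump(stump, x)
    return pred


# Synthetic demonstration data (not the Southampton dataset).
rng = np.random.default_rng(1)
x_all = rng.uniform(0.0, 1.0, size=(400, 2))
y_all = np.sin(3.0 * x_all[:, 0]) + 0.5 * (x_all[:, 1] > 0.5) + rng.normal(0.0, 0.1, 400)
x_train, y_train = x_all[:200], y_all[:200]
x_test, y_test = x_all[200:], y_all[200:]

model = fit_boosted(x_train, y_train)
test_corr = np.corrcoef(predict_boosted(model, x_test), y_test)[0, 1]
print("correlation on held-out data:", round(test_corr, 3))
```

A smaller learning rate requires more trees for the same fit, which is consistent with pairing a learning rate of 0.01 with several thousand trees.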

Neutral landscapes
Neutral landscapes were created using the Landscape Generator (Slager & De Vries, 2013; van Strien et al., 2016) because of its flexibility in generating plausible landscapes with the desired composition and configuration. The Landscape Generator uses a computer-intensive, heuristic optimization procedure to incrementally adjust a base landscape until it meets the user's objectives (Slager & De Vries, 2013). To keep the task manageable, we restricted analysis to the central portion of the city (190 × 150 pixels) and converted the original percentage composition of land classes to record only the commonest class per pixel. To separate the effects of configuration from composition, the proportions of pixels in this base landscape (8.9% buildings, 34.5% hard surfaces, 35.5% mixed surfaces and 21.1% natural surfaces) were held constant throughout. To alter configuration, the number of built or natural patches they formed was varied in two convenient ways to generate new landscapes.
In the first, the starting point was the original landscape (left image in Fig. 2a). Using the Landscape Generator, the number of built or natural patches was then repeatedly halved (to the nearest integer) to create new landscape patterns (left set of images in Fig. 2b). In the second, the original landscape was first fully randomized and then reaggregated using the Landscape Generator into an image with the original number of patches but with a different pattern (right image in Fig. 2a). The number of built or natural patches in this image was again repeatedly halved (to the nearest integer) to create a new set of landscape patterns (right set of images in Fig. 2b). To achieve these goals, the Landscape Generator starts by calculating how close the starting landscape is to the desired numbers of patches. An optimization loop then swaps pairs of cells, retaining the swap if the outcome is a closer match to the goal. The procedure continues to swap pairs of cells (taking from a few hours to several days to complete with an i7 processor) until the target number of patches has been reached (Slager & De Vries, 2013; van Strien et al., 2016).
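The pair-swapping idea can be sketched in a few lines. This is not the Landscape Generator itself: the sketch handles a single target class, counts patches using rook (4-neighbour) adjacency, and also accepts swaps that leave the mismatch unchanged so that the search does not stall. All of these choices, the function names and the toy grid are assumptions made for illustration. Swapping cells never changes class proportions, which is how composition is held constant.

```python
import random
from collections import Counter


def count_patches(grid, cls):
    """Number of rook-connected patches of class `cls` in a 2-D list grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != cls or seen[r][c]:
                continue
            patches += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:  # flood fill one patch
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == cls):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return patches


def swap_towards_target(grid, cls, target, n_iter=2000, seed=0):
    """Swap random cell pairs, keeping swaps that do not worsen the patch-count error."""
    rng = random.Random(seed)
    grid = [row[:] for row in grid]  # work on a copy
    rows, cols = len(grid), len(grid[0])
    error = abs(count_patches(grid, cls) - target)
    for _ in range(n_iter):
        if error == 0:
            break
        r1, c1 = rng.randrange(rows), rng.randrange(cols)
        r2, c2 = rng.randrange(rows), rng.randrange(cols)
        if grid[r1][c1] == grid[r2][c2]:
            continue
        grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
        new_error = abs(count_patches(grid, cls) - target)
        if new_error <= error:
            error = new_error
        else:  # undo a swap that moved away from the target
            grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
    return grid, error


# Toy landscape: 'B' = built, 'N' = not built, roughly 25% built.
demo_rng = random.Random(42)
start = [[demo_rng.choice("BNNN") for _ in range(12)] for _ in range(12)]
initial = count_patches(start, "B")
target = max(1, round(initial / 2))  # halve the number of built patches
result, error = swap_towards_target(start, "B", target)
print(initial, "->", count_patches(result, "B"), "built patches; target", target)
```

Recounting patches after every swap is what makes the approach computationally expensive on a 190 × 150 grid with several classes, in line with the run times reported above.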
Our aim in developing these 15 landscapes was to change the adjacency of land cover types, mimicking landscape fragmentation and an increase in the number of patches, typical consequences of urbanization (Dewan, Yamaguchi, & Rahman, 2012). Adjacency within the landscapes was quantified using both CONTAGION and CLUMPY (for the land class Build) from FRAGSTATS (McGarigal, 2015) because for our data, these were not collinear (Pearson's r < 0.7: Dormann et al., 2013). CONTAGION is the probability that two randomly selected adjacent pixels belong to the same land class, expressed as a percentage, where ~0 is maximally disaggregated and interspersed, while 100 is maximally aggregated. CLUMPY is a class-level metric describing deviation from a random distribution, −1 being maximally disaggregated, 0 random and 1 maximally clumped (see McGarigal, 2015 for formulae).
To assess the effect of the neutral landscapes on LST, the outputs from the Landscape Generator were first used to calculate the variables needed for adjacency effects (Table 2; Fig. 3) using ArcGIS 10.3 (ESRI, Redlands, CA). All other variables were kept at their original location specific values (e.g. altitude, slope and distance to water). In this way, each neutral landscape led to the creation of a synthetic dataset that was used to predict LST from the decision tree model. The entire workflow for the analysis is summarized in Fig. 3.
The LST predictions for each landscape were summarized as the mean, maximum, minimum and standard deviation of LST. Urban heat island (UHI) metrics were calculated as: UHI magnitude (the difference between maximum and mean temperatures; Rajasekar & Weng, 2009); hot island area (HIA, the area over which the observed temperature exceeded the mean plus one standard deviation; Zhang & Wang, 2008); its converse, the cold island area (CIA, the area over which the observed temperature was less than the mean minus one standard deviation); and the numbers and sizes of hot and cold islands (Rook's case connectivity). If landscape configuration is not important in determining LST or heat island characteristics, these UHI measures would be expected to remain consistent across all neutral landscapes and adjacency measures would not be important in the predictive model.
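These heat island metrics can be computed directly from a gridded LST surface, as in the sketch below. The function names, the use of the population standard deviation and the small demonstration grid are assumptions; the cell area of 0.0009 km² corresponds to a 30 m pixel. Islands are counted with rook (4-neighbour) connectivity.

```python
import numpy as np


def count_islands(mask):
    """Count rook-connected (4-neighbour) groups of True cells."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    rows, cols = mask.shape
    islands = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            islands += 1
            seen[r, c] = True
            stack = [(r, c)]
            while stack:  # flood fill one island
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        stack.append((ny, nx))
    return islands


def uhi_metrics(lst, cell_area_km2=0.0009):
    """UHI magnitude, hot/cold island areas and island counts for an LST grid."""
    mean, sd = lst.mean(), lst.std()
    hot = lst > mean + sd    # hot island cells
    cold = lst < mean - sd   # cold island cells
    return {
        "uhi_magnitude": float(lst.max() - mean),
        "hot_island_area_km2": float(hot.sum() * cell_area_km2),
        "cold_island_area_km2": float(cold.sum() * cell_area_km2),
        "n_hot_islands": count_islands(hot),
        "n_cold_islands": count_islands(cold),
    }


# Hypothetical 4 x 4 LST surface in deg C.
lst_demo = np.array([
    [10.0, 10.0, 10.0, 10.0],
    [10.0, 20.0, 20.0, 10.0],
    [10.0, 10.0, 10.0, 10.0],
    [0.0, 10.0, 10.0, 20.0],
])
metrics = uhi_metrics(lst_demo)
print(metrics)
```

In the demonstration grid the two adjacent 20°C cells form one hot island and the isolated 20°C corner cell forms a second, illustrating how the same hot island area can be split into different numbers of islands.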
To focus specifically on buildings, the predicted LST values for all built pixels within each neutral landscape were compared using a factorial GLM in SPSS (IBM SPSS Statistics 24). The series (whether derived from the original or randomized landscape) and numbers/types of clusters (natural or built) were specified as fixed factors and post-hoc comparisons made using the Ryan-Einot-Gabriel-Welsch Range procedure (Field, 2013).

Spatial and temporal patterns in measured LST
Emissivity-corrected LST ranges in Southampton were 9.0-28.8°C in March, [...] in November, i.e. spanning 17 to 24°C at any one time (Table 3). Southampton therefore shows very strong spatial heterogeneity in LST across spring to autumn. The general spatial pattern was consistent between months (examples in Fig. 1) with the built-up parts of the city showing higher LST than the surrounding rural areas and green spaces within the city. The "hot island area" (Zhang & Wang, 2008) for LST covered 20-26% of the study area, being larger in the warmer months (Table 3, rows 5 and 6). The UHI magnitude for LST (Rajasekar & Weng, 2009) was similarly greater in the warmer months although it reached its peak (16.6°C) in September as opposed to May for the hot island area. The seasonal change in hot island area reflected variation in the number and size of hot islands making up Southampton's heat island archipelago, 303 islands in May fragmenting to 1027 in November (Table 3, row 7). The largest island varied seasonally from around 10 to 15 km² but always lay at the heart of the city.
To summarise the pure effects of land cover types on LST, data were extracted only for 30 m pixels with 100% fractional cover of a single type (Fig. 4). In all months, the LST for pure pixels with buildings was higher than for hard surfaces, mixed surfaces and natural cover, and that order was maintained across months. For example, in May (the hottest month studied), pixels with 100% buildings were on average 9.98°C warmer than natural surfaces, 2.82°C warmer than hard surfaces, and 7.17°C warmer than mixed pixels.

Modelled predictors of LST
The optimised decision tree model for mean LST had a mean total deviance of 5.297 with estimated residual deviance of 0.471 (SE = 0.009) based on 10-fold cross-validation. The cross-validated correlation was 0.955 (SE = 0.001) within the training data and 0.956 with the independent test data (n = 102,935), indicating excellent predictive power. Two variables were dropped during model simplification leaving the relative variable importance in Table 2, column 3. The most important predictors of LST were the natural and building compositions of the immediate pixel and its neighbours, followed by the amount of adjacent impervious (hard) surface and elevation. It is significant that for both buildings and hard surfaces, it was the composition of the annuli that was most important rather than the composition of the pixel itself, indicating strong adjacency effects.

Landscape configuration and LST
Looking at individual pixel locations, predicted mean LST varied by up to 10.6°C between landscape scenarios, demonstrating the powerful effects of land composition and configuration on location-specific LST. Illustrative patterns in LST for the actual landscape through to the randomized landscape along a gradient where natural land cover patches were increasingly fragmented are shown in Fig. 5. Average LST across all pixels on each of the 15 neutral landscapes was, however, very consistent, ranging from 16.9 to 17.1°C (Table 4, column 2). Similarly, the minima varied within < 1°C.
The original landscape showed the largest standard deviation and range in LST (Table 4, top data row) and the fully randomized landscape the least (Table 4, bottom row). These represent the two extremes of land-sparing and land-sharing respectively in the analysis. The maximum LST for the fully randomized landscape was the lowest at 20.8°C, some 4.2°C lower than for the warmest case (Table 4, row 4). However, the fully randomized landscape also had the highest minimum LST (12.9°C) indicating that the cost of full intermixing is fewer opportunities for respite from hotter conditions.
The number of hot islands approximately equalled the number of cold islands on landscapes with the original patch mix (872 built/388 natural) irrespective of whether the landscape had been randomized or not (Table 4, rows 1 and 8). Reducing the number of built patches by clustering built pixels together decreased the number of hot islands and increased the number of cold islands (Table 4, rows 2-4 and 9-11). In contrast, reducing the number of natural patches led to more hot islands for configurations derived from the original landscape but not for the randomized landscape (Table 4, compare rows 5-7 with 12-14).

Fig. 6. [...] Fig. 2. N = 28,500. Panel (a) shows neutral landscapes derived from the original landscape whereas the landscapes in panel (b) were derived from the randomized landscape. Each histogram shows the relative proportions (density) of pixels with temperatures from 12°C to just over 24°C making up each landscape.
In fact, the parametric statistics in Table 4 conceal wide variation in the frequency distributions of LST between the landscapes (Fig. 6). The original landscape (with 872 built and 388 natural patches: Fig. 6a) exhibited three temperature peaks (at 13.5°C, 15.5°C and 19.5°C), a feature absent from all other landscapes. In general, grouping buildings into fewer patches reduced the variety of LST and removed the peak at 19.5°C from the original landscape. Putting green cover into fewer patches had a similar effect but with fewer pixels above 19.5°C. Overall, landscapes with broader, flatter distributions of LST had fewer hot or cold islands (a total of 364 for the original landscape) while the steeply peaked distribution of the fully randomized landscape (Fig. 6b) had 3689 islands.
The FRAGSTATS metrics for the neutral landscapes showed significant, usually non-linear relationships with heat island metrics (Table 5). As the level of CONTAGION in the landscape decreased (one way to quantify land-sharing), there was a significant increase in minimum LST and a strong trend towards more, smaller hot islands (Table 5, rows 3, 6 and 7). Clustering buildings (CLUMPY) led to higher maximum LST, greater UHI magnitude, fewer but larger hot islands, and a smaller cold island area (Table 5, right-hand column).

The built environment
Post-hoc comparisons from a factorial GLM recognised seven significantly homogeneous but distinctive subsets of LST in built pixels. Starting with the original landscape, built pixels had a mean LST of 19.7°C (Fig. 7, 4th bar). As the number of built clusters was halved from 872 on the original landscape, through 436 and 218 to 109 clusters, mean LST rose to 20.1°C, 20.5°C and 20.8°C respectively. Halving the number of built clusters while keeping the number of built pixels constant inevitably means clusters were larger, driving temperatures up.
In contrast, when the number of natural clusters was halved from 388 on the original landscape to 194 clusters, the mean LST on the built pixels dropped from 19.7°C to 19.4°C. Reducing the number of natural clusters further made no difference (within 0.03°C). This suggests for this particular landscape that the clustering of built pixels is dominant to clustering of natural habitats in driving LST and can be limited to some extent by intermixing natural patches. For the extreme fully randomized landscape, built pixels had a mean LST of 18.6°C, some 2.1°C cooler than the most clustered built landscape tested (with 109 built patches), setting the lower limit on what is possible to achieve by altering adjacency.

Discussion and conclusions
Vegetation and impervious surfaces are routinely the dominant predictors of LST within cities (Zhou, Qian, Li, Li, & Han, 2014) but there is debate over which is most important, e.g. Yuan and Bauer (2007) arguing that percentage impervious surface is a better predictor of LST than vegetation. In the present study, natural cover was the better predictor, perhaps because our predictive model contained separate variables for built and hard surfaces, together with adjacency effects that other studies may lack. Consistent with Chun and Guldmann (2014), the presence of buildings in Southampton always increased LST whether in the focal or adjacent pixels, and vegetation had the opposite effect. In particular, the extents of natural and built surfaces in an annulus 30-60 m around the focal pixel were particularly influential, as was the built surface 60-90 m away. What is new here is that we found the adjacency effects to be stronger than the immediate effects of the focal cell at 30 m resolution. Using spatial lag models, Chun and Guldmann (2014) also found evidence of adjacency effects, higher temperatures in neighbouring pixels being related with elevated temperatures in a target pixel. Similarly, Li, Zhou, Ouyang, Xu, and Zheng (2012) reported an effect of the spatial pattern in urban green space on LST, while Xie, Wang, Chang, Fu, and Ye (2013) found effects of the neighbouring vegetation and impervious surface fractions.
These studies strongly indicate that the spatial arrangement of green space and the built environment has a determining effect on LST (Asgarian et al., 2015). To explore this notion further, we tested for effects of landscape configuration on LST in a novel virtual experiment using 15 neutral landscapes. Taking the city as a whole, altering landscape configuration barely changed the mean or minimum LST (Table 4). This makes sense because the energy inputs to a cityscape are predominantly either external to the system (i.e. solar) or internal but dependent on landscape composition i.e. anthropogenic sources such as industry, motor vehicles and human metabolism (Sailor, 2011;Sailor & Lu, 2004) which relate to land use and were held constant. However, landscape configuration provides a mechanism for the redistribution of this fixed total energy such that each landscape had a different frequency distribution of LST classes (Fig. 6). This effect was so powerful that the maximum LST for a fully random landscape was 4.2°C lower than on the "hottest" landscape (similar to the actual configuration).
By manipulating the spatial arrangement of green space and built environment, there may be practical applications for these findings to mitigate overheating in city design, especially for new developments (Xie et al., 2013). Donovan and Butry (2009) and Wang, Chang, Merrick, and Amati (2016) have shown how urban shade can reduce the energy demand of buildings and the analysis here suggests that the spatial configuration of land classes may have a similar effect beyond the distance that shade is cast. In the configurations tested, the mean surface temperature of buildings could be reduced by > 1.3°C using realistic scenarios and up to 2.1°C with more extreme re-arrangements of land use. The probable impact of this on energy demand appears to vary geographically (Cruz Rios, Naganathan, Chong, Lee, & Alves, 2017) although all else being equal and for the existing building stock in our study site, decreased demand for summer cooling might be expected. As energy demand is expected to rise in response to global climate change, the spatial arrangement of landscape units potentially offers exciting possibilities for regulating demand in new developments. This is in addition to the benefit of manipulating albedo and building wall materials as ways to minimise LST (Liu et al., 2017). These findings also have implications for the land-sharing v. land-sparing debate (Collas et al., 2017; Stott et al., 2015) and the optimum spatial configuration of ecosystem service provision. In terms of temperature regulation, the Southampton study suggests that extreme land-sparing is likely to result in higher temperatures in the built environment because green space would be insufficiently fragmented to cool adjacent built space. However, buildings themselves increase the temperature of green space which therefore needs to be large enough to regulate the excess heat.
Extreme land-sharing on the other hand may lead to many small hot islands and fewer cold islands, essential spaces where people could cool down to reduce the acute effects of heat stroke. In our city, maximum cooling benefit was achieved when ~60% of a pixel and its immediate neighbours is natural land, distributed in 7-8 natural patches per km² (i.e. 194 patches in 25.7 km²). Using a footprint size of 90 m² per three-bed semi-detached house in the UK, a building density of 44 dwellings ha⁻¹ (approximately the density for UK new builds in 2008) would occupy an actual built area of 40% of the land. Clearly the remaining 60% cannot be left natural because properties must be served by roads, paths, car parking etc., suggesting that low-rise housing estates at 44 dwellings ha⁻¹ would occupy too much land per dwelling to achieve maximum cooling. As the ability to cool down is crucial in reducing heat stress, particularly with the increasing frequency of heatwaves, there are clear potential public health benefits of access to well-distributed cooler spaces (Depietri, Welle, & Renaud, 2013). We suggest that the 7-8 patches of natural land per km² should weave throughout the built environment to form green zones. Guidelines from the Netherlands (Kleerekoper et al., 2012) suggest that all buildings should be within 200 m of some green space. Furthermore, public spaces may be insufficient to achieve the amount of green required and the involvement of citizens through private gardening initiatives may be essential (Kleerekoper et al., 2012), although poorer residents may lack the resources to maintain greenspaces (Mushore, Mutanga, Odindi, & Dube, 2018).
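The arithmetic behind these figures can be checked in a few lines. All input values are taken from the text (190 × 150 pixels of 30 m, 194 natural patches, 90 m² footprint, 44 dwellings per hectare); the variable names are illustrative only.

```python
# Area of the central study window: 190 x 150 pixels at 30 m resolution.
CELL_SIZE_M = 30
window_area_km2 = 190 * 150 * CELL_SIZE_M ** 2 / 1e6

# Density of natural patches at the configuration giving maximum cooling.
natural_patches = 194
patch_density_per_km2 = natural_patches / window_area_km2

# Built fraction implied by 44 dwellings per hectare at 90 m2 per dwelling.
FOOTPRINT_M2 = 90
DWELLINGS_PER_HA = 44
HECTARE_M2 = 10_000
built_fraction = DWELLINGS_PER_HA * FOOTPRINT_M2 / HECTARE_M2

print("window area (km2):", round(window_area_km2, 2))
print("natural patches per km2:", round(patch_density_per_km2, 2))
print("built fraction:", round(built_fraction, 3))
```

The building footprints alone account for roughly 40% of the land at this density, before any roads, paths or parking are added, which is why the remaining land cannot all be natural.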
Future work should address the importance of patch shape on LST as this has practical implications for urban design. How far a patch of trees can reduce the surface temperatures of adjacent buildings without directly shading them also needs exploration, although 100-1000 m has been suggested (Kleerekoper et al., 2012). While in some areas LST is a good indicator of heat vulnerability (Mushore et al., 2018), the functional relationship between LST, air temperature and thermal comfort still needs further work. Ultimately, process-based models (e.g. Sodoudi, Zhang, Chi, Müller, & Li, 2018; Zhao, Sailor, & Wentz, 2018) together with experiments are needed to test the design rules generated here from predictive modelling on neutral landscapes.

Declaration of interest
The authors have no competing interests to declare.