A comparative analysis of surface and canopy layer urban heat island at the micro level using a data-driven approach

The Urban Heat Island (UHI) phenomenon, which is defined as the temperature differential between inner cities and surrounding areas, has been extensively studied over the past few years to unravel its mechanism and develop mitigation strategies. However, two distinct types of temperature can be used to measure UHI, namely, (1) air temperature, which refers to the temperature at the canopy layer (CUHI), and (2) surface temperature, which refers to the temperature at the surface layer (SUHI). The two types of UHI can have different driving mechanisms, and any effective mitigation strategy should be able to mitigate both concurrently. While efforts have been made to compare these two types of UHI, the studies so far have seldom used consistent data. This is because SUHI is mostly based on remote sensing data, whereas CUHI commonly requires field measurements. These data are often not consistent with respect to spatial resolution, frequency, and accuracy. To address this gap, the present research aims to perform a comprehensive comparative analysis of CUHI and SUHI using data collected by a tailor-made mobile data collection unit at a high frequency and resolution at the micro-level (i.e., street-level). Data were collected along a fixed 8 km-long route in the city of Apeldoorn, the Netherlands, over a period of one year. Two different machine learning techniques, i.e., random forest and neural network, were used to study SUHI and CUHI. The results indicated considerable variability between air and surface temperatures during the data collection campaign. Air temperatures ranged from -0.3 to 35.3 °C, while surface temperatures fluctuated over a wider range, from -12.0 to 48.4 °C, at the micro-level. This variability of temperatures translated into an average of 0.10 °C for CUHI and -0.48 °C for SUHI. More importantly, however, the results highlighted the importance of investigating the two types of UHI simultaneously. This is because, while urban features do not change dramatically in short periods of time, the impact of these same features on CUHI and SUHI is different; therefore, urban-heat resilience strategies planned for one type of UHI alone could have a different impact on the other type.


Introduction
The difference in temperature between the rural environment and inner cities, which is mainly due to the replacement of the natural landscape with man-made building materials, is referred to as the Urban Heat Island (UHI) phenomenon (Howard, 1833). On average, this difference can range from 2.3 to 5.3 °C (Van Hove, 2011) and it has significant negative impacts on heat stress, electricity consumption, cooling load, and air pollution, among others (Oke et al., 2017). Given that cities are expected to be home to 68 % of the world's total population by 2050, the importance of the UHI phenomenon for climate-conscious and heat-resistant urban design is continuously increasing (United Nations, 2019).
Several authors have argued for the urgency of containing and mitigating UHI (Hoverter, 2012; Kim & Brown, 2021; Kwak et al., 2020; Nwakaire et al., 2020; Parsaee et al., 2019; Qi et al., 2020; Van Hove, 2011) because the frequency and severity of heat waves suggest that the intensity of UHI has increased dramatically over the past few years. In particular, northern hemisphere countries, where the built environment was not designed to withstand high temperatures, are now at extreme risk of suffering from heat-related illnesses (Bednar-Friedl et al., 2022). In the Netherlands alone, 650 people died during the heat wave of 2020. However, only 10 % of the municipalities have created plans to deal with extreme heat. Thus, a more comprehensive understanding of the phenomenon is urgently needed by governments and urban planners, who are ultimately the ones designing and implementing mitigation and adaptation strategies (Parsaee et al., 2019).
For urban planners to be able to consciously take UHI into account in their daily design practices, it is first necessary to understand the impact of micro-level (i.e., street-level) mitigation strategies on UHI. This is because the decision-making of urban planners is focused on the micro-level resolution. Without adequate insight into how street-level design decisions can affect UHI, urban planners will be ill-prepared to develop long-term actionable strategies for combating UHI.
To be able to address UHI at the micro-level, it is first important to understand how it responds to design decisions. According to the core definition of UHI, i.e., the temperature differential between urban and rural areas, there are two main approaches to measure and quantify UHI: (1) UHI based on the temperature at the canopy layer, which is referred to as Canopy-layer UHI (CUHI), and (2) UHI based on surface temperatures, which is referred to as Surface-layer UHI (SUHI). Although both types of UHI have been widely studied in academia (Akbari & Kolokotsa, 2016; Chakraborty et al., 2020; Firozjaei et al., 2020; Hua et al., 2008; Kim & Brown, 2021; Li et al., 2018; Lu et al., 2021; Mirzaei, 2015; Mohajerani et al., 2017; Nwakaire et al., 2020; Ren et al., 2008; Rizwan et al., 2008; Shastri & Ghosh, 2019; Stewart, 2011), the conducted research varies significantly in terms of objectives, scope, granularity (i.e., macro or micro level UHI study), and data collection regime (i.e., data source, frequency, and resolution). This inconsistency in how UHI is defined and measured results in a palpable absence of insight into how to tackle UHI.
To further elaborate on the existing inconsistencies, one should bear in mind that SUHI is analyzed primarily by means of the land surface temperature (LST) obtained via remote sensing (Chakraborty et al., 2020; Firozjaei et al., 2020; Li et al., 2018; Lu et al., 2021; Shastri & Ghosh, 2019). CUHI, on the other hand, relies on weather stations at fixed locations outside the city or in the proximity of airports (Hua et al., 2008; Mirzaei, 2015; Ren et al., 2008). Although weather stations collect data at a high frequency, these measurements can only represent a small area in the vicinity of the station and therefore cannot be generalized to capture the temperature profile of the entire city, especially at the micro-level. Due to this limitation, CUHI, in the way it is currently measured, can hardly be used to understand the micro-level urban drivers of the UHI phenomenon (Stewart, 2011). On the other hand, while LST provides data with a sufficient spatial resolution to explain SUHI at the micro-level, it lacks temporal resolution because it depends primarily on satellite imagery.
To address the above problem, many researchers have tried to develop more rigorous data collection regimes using cross-sectional mobile measurements (e.g., vehicles, bicycles, foot transects). Table 1 presents a summary of the literature reviewed on mobile data campaigns in the last five years. Bicycles (Cao et al., 2020; Liu et al., 2017; Qiu et al., 2017; Rajkovich & Larsen, 2016; Romero Rodríguez et al., 2020; Yokoyama et al., 2018; Ziter et al., 2019) and vehicles (Dorigon & Amorim, 2019; Makido et al., 2016; Parece et al., 2016; Qaid et al., 2016; Yadav & Sharma, 2018) are widely used to measure mainly air temperature and humidity for CUHI studies using environmental sensors. Although the manner in which mobile units have been used up to now addresses the issue of data resolution for CUHI, it does not resolve the problem of data inconsistency for the concurrent study of SUHI and CUHI. This is because these mobile units primarily focus on air temperature and ignore surface temperature (Smith et al., 2011). However, considering only one of the temperatures ignores the diurnal, seasonal, and contextual disparities between CUHI and SUHI, which, in turn, does not provide a complete understanding of the spatio-temporal characteristics of the UHI (Du et al., 2021; Peng et al., 2022). This is a major oversight since previous studies indicate that despite commonalities between SUHI and CUHI (Kim & Brown, 2021), the mechanisms that drive each can be different (Du et al., 2021). This highlights the significance of the concurrent study of CUHI and SUHI, because a mitigation strategy can potentially have different degrees of effectiveness on each type of UHI, and this has considerable implications for the development and adoption of mitigation strategies (Du et al., 2021; Ho et al., 2016; Hu et al., 2019; Sun et al., 2020; Sun et al., 2019). For instance, Du et al. (2021) found that the relationship between features such as vegetation density, population size, and UHI intensity was not consistent. The urban-rural difference in vegetation coverage had a greater impact on the intensity of SUHI than on CUHI during the day, while the opposite occurred at night. Furthermore, as discussed by Hu et al. (2019), seasonal variations also play a crucial role in shaping the magnitudes and patterns of SUHI and CUHI across different seasons and geographies.
The concurrent study of SUHI and CUHI can only be achieved if consistent data are collected (in both resolution and frequency). To the best of the authors' knowledge, only a few studies have collected data usable for this type of study. Du et al. (2021) examined the intensity patterns of both SUHI and CUHI in 366 global cities with diverse climatic conditions. Hu et al. (2019) focused on three megacities in eastern China, and Peng et al. (2022) conducted a study in Kitakyushu City, Japan. These studies arrived at similar conclusions: during the daytime, SUHI intensity is frequently overestimated when compared to CUHI, and this overestimation varies throughout the year. Sheng et al. (2017) explored different measures of UHI intensity in Hangzhou, China, using both air temperature and LST. The results showed that the values obtained from LST and hourly air temperature measurements are not directly comparable. Additionally, the study highlighted the impact of weather conditions on UHI measurements, indicating that the LST-based UHI model is more accurate on hot, sunny days, whereas the air temperature-based UHI model is more reliable on dry, sunny days. Furthermore, Venter et al. (2021) examined 342 urban areas in Europe during the 2019 heat wave. Using data from citizen weather stations and satellites, they measured both CUHI and SUHI intensities. Their results showed that satellites overestimated UHI intensity by six times when compared to ground-based weather station measurements. Sun et al. (2020) found similar results, although they focused on assessing UHI trends and the correlation between LST and near-surface air temperature. The study showed a good statistical correlation between LST-UHI and air-UHI. However, when it came to minimum and maximum temperatures, there were significant discrepancies in temporal trends, with a considerable percentage of cities showing opposite trends. The authors suggested that combining air and surface temperatures is crucial to ensure reliable UHI trend data, especially for maximum and minimum temperatures.
There are two major limitations in how the two types of UHI have been compared: (1) the resolution of the data used for the analysis is very low (on the order of 1 km in the best-case scenario), which is insufficient to provide insight into micro-level urban planning decision-making; and (2) in the single study that used high-resolution data, the data lacked consistency and frequency, i.e., the data were collected for different routes and at different times of the day. Therefore, there is still a need for a more rigorous analysis of SUHI and CUHI using a high-volume, consistent dataset that can counter the effect of seasonal and contextual variabilities on UHI behavior. Should consistent temperature data be collected, the relationship between SUHI/CUHI and the urban planning design parameters can be studied using data-driven methods, as shown in the previous work of the authors (Pena Acosta et al., 2022). For instance, supervised learning approaches have been successfully applied to find correlations between urban geometries and the intensity of the UHI (Chakraborty et al., 2020; Makido et al., 2016; Oukawa et al., 2022; Wang, Gao, & Peng, 2020).
In light of the above, there is a clear need for a better understanding of the interplay between SUHI/CUHI and urban planning parameters using data that are (1) consistent (both in terms of resolution and frequency), (2) comprehensive (i.e., considering seasonal and contextual variabilities), and (3) at the micro-level. Therefore, this research aims to perform a comprehensive comparative analysis of CUHI and SUHI using data collected by a tailor-made mobile data collection unit at a high frequency and resolution and at the micro-level (i.e., street-level). The ultimate goal is to investigate the spatio-temporal patterns of CUHI/SUHI and their interplay with a wide range of street-level urban planning parameters.

Research Methodology
To tackle the complexity of the problem presented in Section 1, the methodology implemented in this research includes four main phases, as shown in Fig. 1. In a nutshell, the data collection phase included two main steps, namely, the development of the mobile unit to map the required intra-urban data, and the gathering of relevant public data at the highest available resolution. In the data processing phase, urban morphological and socioeconomic parameters and the data collected by the mobile unit were mapped and structured at the street-level. This process entailed the arrangement of the data in a two-dimensional table, where the rows represent each street, and the columns correspond to the socioeconomic, morphological, and temperature readings for each of those streets. In the third phase (i.e., model development), two supervised machine learning (ML) algorithms, namely, Artificial Neural Network (ANN) and Random Forest (RF), were implemented to study the correlation between the urban morphological/socioeconomic parameters and SUHI/CUHI. In the fourth and last phase, the predictions of the best-performing models were analyzed by evaluating the contribution of each socioeconomic and urban morphological parameter to the SUHI/CUHI variations. Each of these phases is explained in more detail in the following subsections.

Data collection
The city of Apeldoorn, the Netherlands (Fig. 2), was chosen as the study case for this research. With an area of 341.2 km² and a moderate oceanic climate, Apeldoorn is the 11th largest municipality in the Netherlands, with 165,611 inhabitants as of 2022. Due to its geographical location, Apeldoorn presents a unique combination of vegetation and built environment, which provides the basis for an interesting UHI analysis. It is also a good archetype of a medium-sized city in the Netherlands. A fixed 8 km-long circuit (Fig. 2, right side) was set as the basis for a data collection campaign. Data were collected over a period of one year, from March 2021 to February 2022, at a rate of six measurements per week in summer and three measurements per week in winter, resulting in a total of 165 measurements. The 8 km-long route features 105 streets with distinctive urban morphology and socioeconomic parameters.
Urban parameters can be classified into three main categories, namely, environmental, urban morphological, and socioeconomic factors, as shown in Table 2. The table also shows whether these parameters are within the jurisdiction of urban planning decision-making. In general, environmental parameters, such as reference temperature, are external to the jurisdiction of urban planners. As discussed in Section 1, the intrinsic nature of these parameters requires different data collection schemes. For example, urban morphology features and socioeconomic parameters can be retrieved from publicly available data sources, typically at a neighborhood or city resolution (Pena Acosta et al., 2021). On the other hand, environmental parameters are dynamic and influenced by the built environment, and therefore need to be measured at a higher spatio-temporal resolution. Hence, the data collection step consists of two main parts that can be conducted simultaneously: (1) the collection of the relevant cadastral datasets, and (2) the development of a mobile unit capable of capturing the required measurements. The previous work of the authors provides comprehensive explanations of the development of the mobile unit (Pena Acosta et al., 2022) and the publicly available datasets (Pena Acosta et al., 2021). Nevertheless, both processes are briefly presented in the following subsections for the completeness of the paper.

The mobile urban data-gathering station
A bicycle-based mobile urban data-gathering station is used to scan the city and collect geo-referenced and time-stamped air and surface temperature data. As shown in Fig. 3 (left side), the mobile unit is equipped with a sensor kit including (A) a GPS rover to provide accurate location data at a frequency of 1 Hz, (B) a thermologger for measuring air/surface temperatures and relative humidity, (C) a display to monitor the data collection campaign in real-time, (D) a thermal camera, and (E) a processing center. The primary objective of the processor is to register, synchronize, and store the temperature readings every second. The data are saved in a CSV file, where each row includes location information from the GPS rover (i.e., latitude, longitude, and altitude) and temperature readings from the environmental sensors. This is a crucial step, as the purpose of the mobile unit is to measure multiple temperatures at the same location. Fig. 3 (right side) shows an example of the geo-referenced, time-stamped data synchronization at the processing center. Table 3 summarizes the characteristics of the sensors used for this study and the respective data.
Regarding the frequency of the data collection, three measurements per day were collected (early morning at 5:30 UTC, middle of the day at 10:30 UTC, and evening at 16:30 UTC). The reason for this is that the thermodynamic processes involved in the creation of UHI phenomena are closely correlated with both the duration of the sunlight and the process by which solar energy is absorbed and released by the urban fabric (Chen & Jeong, 2018). A measurement was recorded every second at a consistent cycling speed of 8 km/h, which translates to a spatial resolution of roughly 2 m. This strategy was employed not only to enhance the granularity of the data but also to mitigate potential sources of error, such as signal loss from the GPS and latency issues with the Extech HD500. To further ensure the accuracy of the data, all sensors were tested before use.
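The spacing between consecutive readings follows directly from the cycling speed and the logging rate. The short sketch below works through that arithmetic; the function names are illustrative and not part of the mobile unit's software.

```python
def sample_spacing_m(speed_kmh: float, sampling_hz: float) -> float:
    """Distance travelled between two consecutive readings, in meters."""
    speed_ms = speed_kmh * 1000 / 3600  # km/h -> m/s
    return speed_ms / sampling_hz


def readings_per_lap(route_km: float, speed_kmh: float, sampling_hz: float) -> int:
    """Number of readings logged along one pass of the route."""
    lap_seconds = route_km / speed_kmh * 3600
    return round(lap_seconds * sampling_hz)


if __name__ == "__main__":
    # Values reported in the text: 8 km/h, one reading per second, 8 km route.
    print(f"spacing: {sample_spacing_m(8, 1):.2f} m")
    print(f"readings per lap: {readings_per_lap(8, 8, 1)}")
```

At 8 km/h and 1 Hz the spacing is slightly above 2 m, and one pass of the 8 km route takes about an hour.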

Data processing
As illustrated in Fig. 4, the final dataset consists of three main raw input data sources, each stored and structured differently. The objective of this step is to scale all data to a street-level resolution. To do so, each street is defined as the road segment between two intersections. Each street segment is assigned a 15 m buffer zone from the middle of the street section. This zone is considered the street jurisdiction, from which all features are processed. A schematic representation of street features and label estimation is provided in Fig. 5.
Fig. 1. Research methodology, comprising four phases: data collection, data processing, model development, and analysis of the results to better understand the correlation between urban elements and SUHI/CUHI at the street level.
Moreover, as previously explained by the authors (Pena Acosta et al., 2021), all features can be calculated within this jurisdiction. The densities per street segment (building, vegetation, and water) were calculated by dividing the areas of each feature by the area of the buffer. For instance, if 10 m² of greenery is present in a street with a buffer size of 1000 m², the vegetation density is 0.01. The predominant land use was determined by the land use that is most dominant within the buffer area. For instance, if buildings in a street are 50 % residential, 35 % commercial, and 15 % industrial, the dominant land use is residential. Population counts were mapped to a grid structure, and the mean population per street was calculated using a weighted average, considering the proportion of cells within the buffer area. The mean building height and elevation were also determined using a weighted average within the buffer area. Furthermore, given the granularity needed to evaluate the interplay between the two types of UHI at street-level, it is also important to take into consideration the street use. To this end, the intended usage and design for each street (i.e., for pedestrians, bicycles, and vehicles) is retrieved from cadastral datasets. Based on this information, for each street segment, the proportion of the overlap between this information and the area of jurisdiction for the street is calculated as summarized in Eq. 1.
$$\text{Street use}_s = \frac{\text{area}_{\text{pedestrian},s}\ \text{or}\ \text{area}_{\text{bicycle},s}\ \text{or}\ \text{area}_{\text{vehicle},s}}{\text{Total area of the street}_s} \tag{1}$$
Regarding the data collected by the mobile unit, since each measurement point is stored with its corresponding latitude and longitude, the data can be georeferenced using GIS software. In this research, ArcGIS Pro 2.7 was used (Esri, 2022). Once the data points had been georeferenced, the air and surface temperature readings were assigned to the corresponding streets. Thereafter, in each street, observations that fell below the 10th percentile or above the 90th percentile were marked as outliers and removed from the dataset. For the rest of the observations, the mean values of air temperature, surface temperature, and relative humidity (RH) were calculated.
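The per-street calculations described above can be illustrated with a minimal sketch using only the Python standard library. The function names are hypothetical, and the percentile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the text does not state which convention the GIS workflow applied.

```python
from statistics import mean, quantiles


def feature_density(feature_area_m2: float, buffer_area_m2: float) -> float:
    """Share of the street buffer covered by a feature (building, vegetation, water)."""
    return feature_area_m2 / buffer_area_m2


def dominant_land_use(shares: dict) -> str:
    """Land use with the largest share within the street buffer."""
    return max(shares, key=shares.get)


def trimmed_street_mean(readings: list) -> float:
    """Mean of the readings lying between the 10th and 90th percentiles.

    Assumption: percentiles use the 'inclusive' interpolation method.
    """
    deciles = quantiles(readings, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    kept = [r for r in readings if p10 <= r <= p90]
    return mean(kept)


if __name__ == "__main__":
    # The two worked examples given in the text.
    print(feature_density(10, 1000))
    print(dominant_land_use({"residential": 0.50, "commercial": 0.35, "industrial": 0.15}))
    # Illustrative readings for one street; the last value is an obvious outlier.
    temps = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 50.0]
    print(trimmed_street_mean(temps))
```

The same ratio used for feature density also expresses the street-use share in Eq. 1, with the area assigned to one mode of transport in the numerator.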
In order to estimate SUHI and CUHI, the first step involved defining the location of the temperature reference. Traditionally, a non-urban point in the proximity of the city is selected (based on the core definition of UHI). However, as discussed by Stewart and Oke (2012), this traditional approach has its limitations because the distinction between urban and rural regions is becoming less apparent due to urban expansion and decentralization, making it challenging to draw distinct boundaries between cities and the countryside. Given that the objective of this paper is to perform a comprehensive comparative analysis of CUHI and SUHI at a high frequency and resolution at street-level, it was necessary to find a suitable reference point with both surface and canopy level temperature data. As publicly available temperature data (such as satellite imagery and weather station data) could not provide such granularity, a reference location within the data collection path was selected for this study. This reference point was chosen for two main reasons: (1) it represents the coolest point based on the average LST during the summers of 2019, 2020, and 2021, as shown in Fig. 2 (left side, "temp reference location"), and (2) it is located away from the urban center to minimize the influence of anthropogenic heat sources and infrastructure associated with urban areas. To calculate the reference temperatures, the average temperatures collected by the mobile unit within a 100-meter distance on both sides of the reference location were taken.
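A reference temperature of this kind can be approximated by averaging all readings that fall close to the reference location. The sketch below is a simplification under stated assumptions: it uses a 100 m great-circle radius rather than a distance measured along the route, and the coordinates and function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in meters."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def reference_temperature(readings, ref_lat, ref_lon, radius_m=100.0):
    """Mean temperature of all (lat, lon, temp) readings within radius_m of the reference."""
    nearby = [t for lat, lon, t in readings
              if haversine_m(lat, lon, ref_lat, ref_lon) <= radius_m]
    if not nearby:
        raise ValueError("no readings within the reference radius")
    return sum(nearby) / len(nearby)


if __name__ == "__main__":
    # Hypothetical readings: two near the reference point, one about 220 m away.
    sample = [(52.2000, 5.9700, 20.0), (52.2005, 5.9700, 22.0), (52.2020, 5.9700, 30.0)]
    print(reference_temperature(sample, 52.2000, 5.9700))
```

The same function would be applied once to the air temperature readings and once to the surface temperature readings of each measurement run.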
For each street, a CUHI and a SUHI attribute are estimated as follows:
$$\mathrm{CUHI}_s = AT_s - AT_r \tag{2}$$
$$\mathrm{SUHI}_s = ST_s - ST_r \tag{3}$$
where $AT_s$ and $ST_s$ are the average air and surface temperatures at each street, respectively, and $AT_r$ and $ST_r$ are the air and surface temperatures at the reference location. CUHI and SUHI represent the temperature differentials that can be considered indicators of UHI. As will be shown in Section 3.4, CUHI and SUHI data can be used to determine the driving forces of each respective type of UHI by looking at the feature importance analysis. However, since this research also aims to understand the interplay between the two types of UHI, it is interesting to look into what drives the difference between CUHI and SUHI. Also, since urban planners are commonly interested in reducing CUHI and SUHI simultaneously and, therefore, most probably not interested in the interplay or trade-off between the two, it is sensible to define an aggregate parameter that accounts for both CUHI and SUHI at the same time. Simplistically, this can be done by summing up the CUHI and SUHI for each point. By trying to minimize the aggregate UHI, urban planners can make sure that the trade-off between the two types of UHI is accounted for. Therefore, this research also considers (1) ΔUHI (i.e., the difference between CUHI and SUHI) to generate insight into what causes the difference between the two types of UHI, and (2) ∑UHI (i.e., the added/aggregate effect of CUHI and SUHI) in order to equip urban planners with a practical indicator for supporting their decision-making process. These two derivative UHI indicators are estimated as shown in Eq. (4) and (5).
$$\Delta\mathrm{UHI}_s = \mathrm{CUHI}_s - \mathrm{SUHI}_s \tag{4}$$
$$\sum\mathrm{UHI}_s = \mathrm{CUHI}_s + \mathrm{SUHI}_s \tag{5}$$
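The four indicators can be computed per street from the street and reference temperatures. The following is a minimal sketch; the function name is illustrative, and the sign convention for ΔUHI (CUHI minus SUHI) is an assumption based on the wording "the difference between CUHI and SUHI".

```python
def uhi_indicators(at_s: float, st_s: float, at_r: float, st_r: float) -> dict:
    """UHI indicators for one street.

    at_s, st_s: mean air and surface temperature of the street (°C)
    at_r, st_r: air and surface temperature at the reference location (°C)
    """
    cuhi = at_s - at_r  # canopy-layer UHI
    suhi = st_s - st_r  # surface-layer UHI
    return {
        "CUHI": cuhi,
        "SUHI": suhi,
        "delta_UHI": cuhi - suhi,  # assumed sign convention
        "sum_UHI": cuhi + suhi,    # aggregate indicator
    }


if __name__ == "__main__":
    # Illustrative values only, not measurements from the campaign.
    print(uhi_indicators(at_s=21.5, st_s=24.0, at_r=21.0, st_r=25.0))
```

A street can therefore show a positive CUHI and a negative SUHI at the same time, which is the kind of divergence the two derived indicators are meant to expose.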

Model development
Fig. 4. Data processing procedure adapted from [52,55], where all features are scaled down to street-level by calculating all features within each street's jurisdiction.
Once the data are organized in a structured database, the models are developed using RF and ANN. RF and ANN are among the most widely used algorithms in supervised learning (Ahmad et al., 2017). The main reason for this lies in the prediction capabilities of both algorithms. However, it is commonly accepted that the predictive capability of ANN can be higher than that of RF, although at the cost of a higher computational effort (Roßbach, 2018). Nevertheless, the performance of both models depends heavily on the configuration of the internal parameters (commonly referred to as hyperparameters), among other factors. For instance, RF performance can change significantly when the number of trees in the model is changed. Therefore, it is important to find the best hyperparameter configuration for each model. This can be achieved by performing optimization on the training of the model, i.e., evolutionarily changing the value of the hyperparameters until the near-optimum configuration is found, for instance, by using a Genetic Algorithm (GA). Typically, tree-based algorithms, such as RF, require less hyperparameter tuning. Therefore, researchers are often faced with a dilemma when choosing between the two models. In this research, both algorithms were trained and optimized using GA, and then compared.
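The evolutionary search described above can be sketched with a small, self-contained genetic algorithm. This is an illustration under assumptions, not the adapted implementation used in the study: the search space, the toy loss function (a stand-in for a validation error), tournament selection, uniform crossover, the mutation probability, and elitism are all hypothetical choices. The defaults for population size, number of generations, and crossover probability mirror the configuration reported for this study.

```python
import random

# Hypothetical search space: (low, high) integer bounds per hyperparameter.
BOUNDS = {"n_estimators": (10, 500), "max_depth": (2, 30)}


def toy_loss(ind):
    """Stand-in for a validation error such as MAE; NOT the study's objective."""
    return ((ind["n_estimators"] - 200) / 100) ** 2 + ((ind["max_depth"] - 12) / 10) ** 2


def random_individual(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def tournament(pop, rng, k=3):
    """Pick the best of k randomly drawn individuals."""
    return min(rng.sample(pop, k), key=toy_loss)


def crossover(a, b, rng):
    """Uniform crossover: each gene is inherited from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in BOUNDS}


def mutate(ind, rng, mutpb=0.2):
    """Resample each gene within its bounds with probability mutpb (assumed value)."""
    return {k: (rng.randint(*BOUNDS[k]) if rng.random() < mutpb else v)
            for k, v in ind.items()}


def run_ga(pop_size=100, generations=50, cxpb=0.8, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(pop, key=toy_loss)
        history.append(toy_loss(best))
        children = [best]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(pop, rng), tournament(pop, rng)
            child = crossover(p1, p2, rng) if rng.random() < cxpb else dict(p1)
            children.append(mutate(child, rng))
        pop = children
    best = min(pop, key=toy_loss)
    history.append(toy_loss(best))
    return best, history


if __name__ == "__main__":
    best, history = run_ga()
    print(best, history[-1])
```

In a real setting, `toy_loss` would be replaced by a function that trains the RF or ANN with the candidate hyperparameters and returns its validation error, which is where most of the computational cost lies.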
The models were trained on 70 % of the data. The performance of the models was tested using the remaining 30 % of the data. By regressing the actual CUHI/SUHI on the predicted CUHI/SUHI, MAE and R-squared were calculated to assess the accuracy and precision of the models, respectively. In the process of optimizing the RF and ANN models, several parameters were adjusted to avoid overfitting and underfitting, as well as to find the near-optimum hyperparameter configurations. For this purpose, GA was used in this study, with a configuration of 100 individuals in the population and 50 generations. The crossover probability (CXPB) was set at 0.8. This approach allows for a systematic and efficient search for a near-optimal set of hyperparameters (Kerdan & Gálvez, 2020). By using 100 individuals as the population size and 50 generations, the GA is able to explore a wide range of potential solutions and converge toward a near-optimal set of hyperparameters. The high crossover probability also ensures that genetic information is exchanged between individuals, leading to a more diverse population and a more efficient search.
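The evaluation step can be illustrated with a minimal, dependency-free sketch of a 70/30 split and the two reported metrics. The helper names are hypothetical; in practice, the equivalent utilities of a machine learning library would normally be used.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle the rows and split them into 70 % training and 30 % test data."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    train, test = split_70_30(range(100))
    print(len(train), len(test))
    # Illustrative observed and predicted UHI values, not results of the study.
    observed = [0.0, 1.0, 2.0, 3.0]
    predicted = [0.5, 1.0, 2.0, 2.5]
    print(mae(observed, predicted), r_squared(observed, predicted))
```

MAE is expressed in the same unit as the target (°C), whereas R-squared is unitless, which is why the two metrics are usually reported together.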
Table 4 summarizes the hyperparameter configuration for the GA-based optimization of the RF and ANN models. For the RF model, the number of estimators determines the number of decision trees that are included in the forest. Generally, a higher number of estimators results in a more accurate model, although at the cost of increased computational time. Therefore, it is important to choose the minimum number of trees in the forest that does not compromise the performance. The maximum number of features determines the number of features that are considered when splitting a node in the decision trees. The maximum number of levels in RF determines the depth of each decision tree. A deeper tree allows the model to capture more complex relationships in the data. However, it also increases the computational time and may cause overfitting. The minimum number of data points required to split a node is set to limit overfitting. Lastly, the bootstrap parameter determines whether each tree is trained on a bootstrapped sample, which adds randomness to the model and decreases the chance of overfitting.
With regard to the ANN model, the activation function is used to introduce non-linearity in the ANN. The batch size refers to the number of samples used in one iteration to update the model weights. The dropout is used as a regularization technique to prevent overfitting in the ANN by randomly ignoring a certain percentage of neurons during training. The epochs refer to the number of times the model will see the entire dataset during training. The kernel initializer is used to initialize the weights of the neural network. Increasing the number of hidden layers, as well as the number of neurons in each hidden layer, can lead to higher accuracy, but renders the model more susceptible to overfitting. Finally, the optimizer is the algorithm used to update the model weights during training.
Table 2 presents the near-optimum models (i.e., models with the configuration of hyperparameters that led to the best scores in terms of R-squared and MAE) that were selected to investigate the spatio-temporal patterns of CUHI/SUHI and their interplay with street-level urban planning parameters. This study used Python 3.8 and the machine learning libraries Scikit-learn (Pedregosa et al., 2011) and Keras (Ketkar & Ketkar, 2017) to build the RF models and ANNs, respectively. The GA-based optimization algorithm was adapted from Kerdan and Gálvez (2020) and Fortin et al. (2012).

Feature analysis
The data analysis involved understanding the contribution of each urban parameter to the outcomes of the different models. For this, feature importance can be computed using SHAP values (SHapley Additive exPlanations) (Lundberg & Lee, 2017). In recent years, SHAP values have been successfully applied in the context of UHI (Kim & Kim, 2022; McCarty et al., 2021; Oukawa et al., 2022; Yu et al., 2020). The underlying principle of the SHAP analysis consists of evaluating the performance of the model with and without each of the features for each combination of features. The SHAP value, therefore, is the average incremental contribution of a feature among all the possible combinations of features.
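The idea of an average incremental contribution over all feature combinations can be made concrete with an exact Shapley value computation on a toy example. This sketch is for illustration only: the value function and feature names are hypothetical, and SHAP libraries rely on much faster model-specific approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value_fn(coalition | {f}) - value_fn(coalition))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical model output explained by a coalition of features."""
    value = 0.0
    if "vegetation" in coalition:
        value += 0.4
    if "building_height" in coalition:
        value += 0.2
    if {"vegetation", "building_height"} <= coalition:
        value += 0.1  # interaction term, shared equally between the two features
    return value


if __name__ == "__main__":
    print(shapley_values(toy_value, ["vegetation", "building_height", "water"]))
```

In this toy case the contributions sum to the value of the full coalition, and the feature that never changes the output receives a contribution of zero.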
SHAP value analysis was conducted only on the best-performing model from the previous step. As will be shown in Section 3, RF models outperformed ANN models. As a result, feature importance analysis was only conducted on the best-performing RF model. However, because RF models have an element of stochasticity, the feature importance can change slightly every time the model is trained (i.e., even when using the same hyperparameter configuration and training dataset). To counter the impact of stochasticity on the feature importance, the Monte Carlo method was applied. For this purpose, once the near-optimum hyperparameters were found, 50 different RF models were trained with the same model configuration and training dataset. Subsequently, SHAP values were calculated for each model, and feature importance was assessed stochastically (i.e., using the mean and standard deviation of the SHAP values). The average value of the feature importance was ranked and plotted for analysis. Finally, SHAP summary plots were used to visualize the distribution of the impact that each feature has on the model prediction outcome.
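The Monte Carlo aggregation step can be sketched as follows. To keep the example self-contained, the function that would train one RF model and return its mean absolute SHAP value per feature is replaced by a hypothetical stand-in that returns noisy, made-up importances; only the aggregation logic (mean, standard deviation, ranking over 50 runs) reflects the procedure described above.

```python
import random
from statistics import mean, stdev


def one_run_importances(rng):
    """Hypothetical stand-in for 'train one RF and compute mean |SHAP| per feature'.

    The base values and the noise level are invented for illustration.
    """
    base = {"vegetation_density": 0.30, "building_density": 0.20, "mean_building_height": 0.10}
    return {name: max(0.0, value + rng.gauss(0, 0.02)) for name, value in base.items()}


def monte_carlo_importance(n_runs=50, seed=0):
    """Aggregate feature importance over repeated runs: (mean, std) per feature plus a ranking."""
    rng = random.Random(seed)
    runs = [one_run_importances(rng) for _ in range(n_runs)]
    summary = {}
    for name in runs[0]:
        values = [run[name] for run in runs]
        summary[name] = (mean(values), stdev(values))
    ranking = sorted(summary, key=lambda name: summary[name][0], reverse=True)
    return summary, ranking


if __name__ == "__main__":
    summary, ranking = monte_carlo_importance()
    for name in ranking:
        avg, sd = summary[name]
        print(f"{name}: mean={avg:.3f}, std={sd:.3f}")
```

Reporting the standard deviation alongside the mean shows whether a difference in rank between two features is larger than the run-to-run variation of the model.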

Results
As described in Section 2.2, Table 5 presents an example of the urban parameters calculated for the streets covered in this research. As shown in Fig. 6 and Table 6, air temperatures ranged from -0,3 to 35,3 °C, while surface temperatures fluctuated over a wider range, from -12,0 to 48,4 °C. Air and surface temperatures were both at their highest from April to August. The difference between air and surface temperatures was greater during the colder months. Moreover, there is a strong positive correlation between the average air and surface temperatures, with a Pearson correlation coefficient of 0,9 and a p-value of approximately 0.
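For reference, the Pearson correlation coefficient reported above is the covariance of the two series divided by the product of their standard deviations. The sketch below shows the computation on a handful of hypothetical paired temperature averages; the numbers are illustrative and are not taken from the dataset, and the p-value is not computed here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical average air and surface temperatures (°C) for four periods
air = [5.0, 12.0, 20.0, 9.0]
surface = [3.0, 15.0, 28.0, 8.0]
print(f"r = {pearson_r(air, surface):.2f}")
```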

Sensitivity to the reference point location
As discussed in Section 2.2, the reference point selected for the calculation of the CUHI and SUHI effects was located on the data collection route to ensure that it has the same temporal resolution as the entire dataset. To investigate the extent to which the models developed later are sensitive to the selected reference point, a sensitivity analysis was performed. In this analysis, three different points on the route were selected. While they have different annual temperature profiles, all were chosen from locations with a cooler overall LST. Using these reference points, three different sets of CUHI and SUHI were calculated. An RF model was developed for each of the calculated CUHI and SUHI sets, and the feature importance analysis was performed. Fig. 7 shows the comparison of feature importance between the different models. As shown in this figure, the model structures remain rather consistent, especially for high-ranking features. This suggests that while small variations can be expected, in general, the underlying mechanisms, manifested through the feature importance analysis, remain more or less the same. Given that the main objective of this research is to shed light on the different mechanisms of CUHI and SUHI and to sensitize practitioners to the importance of the concurrent consideration of both types of UHI, it can be argued that the selection of a reference point is less relevant to the scope of this research. Therefore, out of the three points investigated, the reference point with the lowest overall average temperature was used for the rest of the research.

Table 5
Urban morphology and socioeconomic parameters derived from cadastral datasets.

UHI variations
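As shown in Fig. 8 and Table 7, the mean value of CUHI varied from 0,12 °C in spring (April to June) to 0,38 °C in summer (July to September), -0,05 °C in autumn (October to December), and -0,02 °C in winter (January to March), with an overall average of 0,10 °C. The mean value of SUHI varied from -0,01 °C in spring to -0,83 °C in summer, -0,38 °C in autumn, and -0,63 °C in winter, with an overall average of -0,48 °C. The difference between the average CUHI and SUHI clearly indicates an inherent discrepancy between the two types of UHI, despite the strong correlation between air and surface temperatures. Furthermore, the standard deviation values are relatively high, with an average value of 0,81 for CUHI and 2,45 for SUHI, indicating the variability of the phenomena throughout the year. This variability is further confirmed by the low Pearson correlation coefficient of 0,09 between CUHI and SUHI. Regarding ΔUHI and ∑UHI, the average values of the two throughout the year remained low (-0,57 °C and -0,38 °C, respectively). The standard deviations are 2,51 and 2,65 for ΔUHI and ∑UHI, respectively.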
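The four quantities discussed above can be related in a few lines of code. The sketch below rests on two assumptions that are not stated in this section: each UHI value is taken as the temperature at a street point minus the temperature at the reference point, and ΔUHI and ∑UHI are taken as SUHI - CUHI and CUHI + SUHI, respectively. The latter is consistent with the reported annual means, but the sign convention of ΔUHI remains an assumption. The names `Reading` and `uhi_metrics` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    air: float      # air temperature (°C)
    surface: float  # surface temperature (°C)


def uhi_metrics(street: Reading, reference: Reading) -> dict:
    """UHI intensities of a street point relative to the reference point."""
    cuhi = street.air - reference.air          # canopy-layer UHI
    suhi = street.surface - reference.surface  # surface UHI
    return {
        "CUHI": cuhi,
        "SUHI": suhi,
        "delta_UHI": suhi - cuhi,  # assumed sign convention
        "sum_UHI": cuhi + suhi,    # aggregate UHI with equal weights
    }


# Hypothetical simultaneous readings at a street point and at the reference point
metrics = uhi_metrics(Reading(air=21.5, surface=27.0), Reading(air=21.0, surface=29.0))
print(metrics)
```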

Models' optimization and performance
As presented in Figs. 9 and 10, the RF and ANN models have comparable predictive performances, although the RF models performed slightly better. Table 8 summarizes the configuration of the hyperparameters resulting from the GA-based optimization. For the RF model, the number of estimators is 150 and the maximum depth of the decision trees is 20. In turn, the best ANN model features four layers of artificial neurons with 454, 328, 173, and 1 neurons, respectively. Furthermore, as shown in Table 9, the ANN models required more time for both training and interpretation, and thus took longer to achieve results equivalent to those of the RF models. Therefore, the RF models were chosen to further investigate the spatio-temporal patterns of CUHI/SUHI and their interplay with the urban planning parameters.
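The two scores used to compare the models, MAE and R-squared, can be written out explicitly. The following is a minimal standard-library sketch with hypothetical observed and predicted values; in practice, Scikit-learn provides equivalent functions (`mean_absolute_error` and `r2_score` in `sklearn.metrics`).

```python
def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that is explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted UHI intensities (°C)
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 1.5, 3.0]
print(f"MAE = {mean_absolute_error(observed, predicted):.2f}")
print(f"R2  = {r_squared(observed, predicted):.2f}")
```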

Feature assessment
To better understand the relationship between the urban parameters and each type of UHI, SHAP values were calculated for all features. Fig. 11 plots the impact of each feature on the prediction of the models after applying the Monte Carlo method. The fact that SHAP values have a standard deviation suggests that the order of features can change between different runs of the same model. However, the low standard deviation values, ranging from 0 to 0,01 across different runs, suggest that the models are fairly stable and that the impact of stochasticity on the feature importance is minimal.
The SHAP plots for the CUHI model indicate that the part of the day (morning, afternoon, or evening), the average relative humidity (RH), and the average surface temperature at the reference location contribute the most to CUHI. The analysis showed that the higher the air temperatures at the reference location, the lower the magnitude of CUHI. As for the SUHI model, the variation in temperature at the reference location, the part of the day, and the average air temperature have significant impacts on the output of the model. Moreover, it is shown that a higher surface temperature at the reference location reduces SUHI, whereas a higher air temperature increases it. It is interesting to observe that all the top-ranking parameters for both SUHI and CUHI are environmental parameters. While urban planners can direct and manage urban socioeconomic and morphological parameters, the influential role of uncontrollable environmental parameters denotes that effective mitigation strategies (in terms of adjustment within the domain of controllable socioeconomic and urban features) need to be tailored to specific urban contexts. Despite similarities in the top-ranking parameters, there is a significant difference between the two models in terms of controllable parameters. For instance, vehicle use and building density have a much higher impact on SUHI than on CUHI. Conversely, the width of the street has a much higher influence on CUHI. The variation in the ranking of controllable parameters highlights the significance of considering both types of UHI when developing mitigation strategies and designing new neighborhoods or planning the expansion of existing ones.
As mentioned earlier, to better understand what causes the difference between the two types of UHI, ΔUHI was considered. As shown in Fig. 11, similar to the two previous models, environmental parameters play the most significant role in explaining the difference between the two types of UHI. Interestingly, the distribution of SHAP values has become more uniform, with all controllable parameters having a rather similar impact on the difference between the two types of UHI. Considering Fig. 11(a) and (b), this trend is reasonable because variations in the ranking of parameters across the two models are offset when considering the difference between CUHI and SUHI.
Finally, this research evaluated the aggregate UHI (i.e., ∑UHI) to provide urban planners with a model that considers the impact of their decisions on both types of UHI concurrently. As shown in Fig. 11(d), all features played a greater role in explaining aggregate UHI. Vehicle use of the street has the highest impact among the controllable parameters, followed by building height, population, and vegetation density. Again, looking at Fig. 11(a) and (b), this behavior is expected, since the discrepancy in the ranking of the parameters between the two types of UHI means that what has a high impact on CUHI can have a low impact on SUHI, and vice versa. However, when the aggregation of the two UHI types is considered, the importance of all parameters becomes aggregated and therefore more pronounced. These models suggest that urban planners can consider a wider variety of mitigation strategies to counter the impact of (aggregate) UHI. It should be noted that, from a physical standpoint, ∑UHI does not correspond to an actual physical (and tangible) phenomenon. However, the authors believe this model has great practical value because it would enable urban planners to develop mitigation strategies that are more impactful considering both types of UHI. In essence, this model can be used to optimize the mitigation strategy considering the cost of the intervention and the collective impact of the strategy on both types of UHI.

Discussion
Sustainable urban development and heat resilience are major topics of government debate. The implementation of mitigation strategies and heat-resilience policies does not account for the impact of UHI in its entirety, mainly because, as discussed in Section 1, the data needed to do so lack consistency in terms of objectives, scope, and granularity. Recent efforts have focused on the simultaneous investigation of SUHI and CUHI (Du et al., 2021; Ho et al., 2016; Hu et al., 2019; Peng et al., 2022; Sheng et al., 2017; Sun et al., 2020; Venter et al., 2021; Yang et al., 2020). However, as discussed in Section 1, these studies lack sufficient granularity to jointly study SUHI and CUHI at the street level, mainly due to the difficulty of obtaining consistent urban measurements that capture the two. In addition, the impact of a wide range of urban parameters in generating the differences between the two has not been sufficiently studied. Accordingly, the main contributions of this research include: (1) the development of a data collection method that allows for simultaneous analysis of surface and canopy UHI effects, and (2) the explication of differences in the mechanisms that drive these two types of UHI effects. To the best of the authors' knowledge, this is the first time CUHI and SUHI have been systematically compared using a consistent dataset that covers an entire year. This was achieved partly through the unique setup of the data collection unit and partly through the intensive and long data collection campaign.
It was observed that the magnitudes of CUHI and SUHI (at the street level) are systematically different. Notably, the significance of certain features in one model does not necessarily mirror their importance in the other. For instance, vehicle use and building density are among the top 10 most influential features of the SUHI model, but they do not have the same prominence in the CUHI model. This is because, in the case of the surface temperature, the key material-related parameter is the heat absorption characteristics of the horizontal surfaces, while for the air temperature, the picture is more complicated and depends on the morphology of the street, the reflectivity of all surfaces, and the insulation characteristics of the buildings. Since asphalt roads tend to absorb and retain more heat than concrete roads (Doulos, 2004), it is logical that the area of road used by vehicles (which is mostly covered by asphalt) is a major driving force of the surface temperature differential. Additionally, the interaction of these features at a local scale can also influence this discrepancy. In a densely built area with heavy vehicle use, an asphalt road might exacerbate the heat due to its heat-retaining properties. Therefore, the specific combination and interaction of these features in a particular area can lead to varying levels of feature importance, causing the observed discrepancy between the SUHI and CUHI models. This insight has significant implications for urban planners and decision-makers, as they can target their mitigation strategies locally.
Fig. 11 presents the variation of SHAP values of different parameters across all four models (this variation was plotted by looking at the SHAP values across all models). The figure illustrates that the five factors with the most impact on the model output are primarily environmental features. These factors are beyond the control of urban planners and cannot be manipulated (Rizwan et al., 2008). However, factors such as the type of street, building density, and land use can be examined by decision-makers and urban planners, as they have a palpable effect on the two types of UHI, as indicated by the ΔUHI model. It should be highlighted that this observation neither negates nor underestimates the impact of socioeconomic and urban morphology parameters. What it denotes is that the impact of socioeconomic and urban morphology parameters on UHI intensity depends greatly on the environmental parameters. In other words, the magnitude of the contribution of socioeconomic and urban morphology parameters to UHI varies significantly with changes in the environmental parameters. This is very much in line with the existing body of knowledge on the UHI mechanism. For instance, the work of Azevedo et al. (2016) suggests that UHI intensity in the same area can be three times higher during the day than at night. Also, the work of Palme et al. (2016) suggests that the magnitude of the impact of different parameters on UHI changes between day and night. This shows how sensitive UHI intensity is to environmental conditions.
A point worth mentioning from the analysis of Fig. 12 is that the ranking among controllable features varies considerably, which attests to the fact that a mitigation strategy can have different impacts on different types of UHI. As discussed by Peng et al. (2022), the two types of UHI occur differently, and therefore one should not be used as a proxy for the other. Although the average air and surface temperatures showed a strong positive correlation, the correlation between CUHI and SUHI was not as strong, even though they are inherently related. This suggests that one-size-fits-all mitigation strategies may not deliver the optimum impact. This was also demonstrated in the previous work of the authors, where it was found that SUHI models have different mechanisms for different urban contexts (i.e., cities) (Pena Acosta et al., 2023).
Considering this highly context- and type-sensitive mechanism, the data-driven models can be a practical solution for decision-makers, as these models can be translated into ready-to-use information. The authors believe that the aggregate models (i.e., ∑UHI) can be best utilized for the development of mitigation strategies, given that they can consider the collective impact on UHI. However, this research assumed equal importance for CUHI and SUHI. If this assumption is not accurate from health and energy-related perspectives, a different aggregate function based on, for instance, the weighted sum method can be used.
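The weighted-sum alternative mentioned above could take the following form. This is a minimal sketch: the function name `aggregate_uhi` and the example weights are hypothetical, and the unweighted case (both weights equal to 1) is assumed to correspond to ∑UHI as the plain sum of CUHI and SUHI, which is consistent with the reported annual means.

```python
def aggregate_uhi(cuhi, suhi, w_canopy=1.0, w_surface=1.0):
    """Weighted-sum aggregate of canopy (CUHI) and surface (SUHI) intensities.

    With the default weights this reduces to the plain sum CUHI + SUHI.
    """
    if w_canopy < 0 or w_surface < 0:
        raise ValueError("weights must be non-negative")
    return w_canopy * cuhi + w_surface * suhi


# Hypothetical street-level intensities (°C)
cuhi, suhi = 0.5, -2.0
equal = aggregate_uhi(cuhi, suhi)  # equal importance, as assumed in this research
# Hypothetical weighting that emphasizes the canopy layer
canopy_weighted = aggregate_uhi(cuhi, suhi, w_canopy=2.0, w_surface=0.5)
print(equal, canopy_weighted)
```

How the weights should be chosen (for example, from health or energy considerations) is outside the scope of this sketch.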
In view of the effects that the choice of a different location for the reference temperature may have on the intensity of both SUHI and CUHI, it is necessary to consider certain trade-offs when interpreting the results of this study. Traditionally, a reference point outside the city is considered (Stewart & Oke, 2012), because the effects that the built environment has on urban temperatures can be better represented by this approach. In this study, a reference temperature location was chosen within the circuit of the data campaigns. This was to ensure that the required granularity of data was met (both for CUHI and SUHI). This makes the results of the current models sensitive to that specific reference location. Nonetheless, this highlights the importance of a standardized understanding and subsequent rule/protocol for defining the reference location when analyzing UHI intensities and their embodiment throughout the urban environment. However, as highlighted by Yang et al. (2020), the selection of the appropriate indicator to characterize UHI is in itself a challenging endeavor, one that remains to be addressed in the future. That said, municipalities are invited to consider the development of reference stations outside the city that can measure both surface and canopy temperature at the same frequency.
Major efforts have been made to assess the applicability of data-driven models in the context of urban climates. However, it is the authors' conviction that the models need to be explainable and interpretable, especially in the context of urban climates, where decision-makers and urban planners are often unacquainted with the complexity of these models. Overall, the developed models have the potential to be a valuable tool for urban planners and decision-makers in the field of urban climate management, especially if combined with local interpretation methodologies, such as the one used in this research (i.e., SHAP). For instance, such models can be utilized to identify streets in the city that are particularly susceptible to trapping and absorbing heat. Equipped with this information, decision-makers can make informed decisions regarding the prioritization of green infrastructure, such as trees and cool pavement materials, while taking into account the specific surrounding urban infrastructure, or even a potential change of the land use surrounding a specific street/neighborhood. Similarly, the outcomes of the developed models can also be used for a high spatial resolution assessment of the UHI impacts on building energy consumption. Such information is particularly useful for supporting energy infrastructure planning and investment in the context of climate change and rapid urbanization.
Another interesting point of discussion is that this research assumes that all streets have similar mechanisms for each type of UHI. However, when the Pearson correlation coefficient between CUHI and SUHI is analyzed by street, it is found that the correlation between the two types of UHI varies greatly. High-correlation streets are more common than low-correlation ones, as seen in Fig. 13. This suggests that more specific mechanisms can perhaps be discovered based on the typology of streets. It can be hypothesized that by clustering streets based on the similarity of parameters (i.e., features), more accurate and specific models can be developed. Such clustering can also be beneficial for the design of mitigation strategies, as it can leverage economies of scale of various kinds. This will be explored in a future study by the authors.
This research has tried to go beyond the black-box analysis of the prediction models by looking into the underlying mechanisms of different models. This is a step towards the transition from black-box to white-box thinking based on data-driven models. However, the results of this research need to be interpreted with caution, as the collected dataset is still too small vis-a-vis the global scale of the UHI model. Yet, the scale-up of this research demands more global commitment from local and national governmental authorities and research agencies, since it is simply beyond the capacity of a single research unit. The problem here is the total absence of standards on how to measure and collect data for UHI. This is a major barrier in the domain of data-driven modeling of UHI because there is very little consistency in the available data (Stewart, 2011). The authors firmly believe that there is an urgent need for standardization in the domain of UHI. This can be achieved through the development of domain ontologies for UHI.

Conclusions
The results of this research shed light on data collection regimes for studying two distinctive types of UHI simultaneously at the street level. Two main conclusions can be drawn from this research: (1) different UHI types seem to have different mechanisms, and therefore a concurrent evaluation of both SUHI and CUHI is not only desirable but needed, particularly when designing urban heat-resilient cities; and (2) a well-planned and strategic data collection can significantly help unravel the underlying mechanisms of UHI and thus provide urban planners with a more robust toolkit for the development of tailor-made mitigation strategies.
This study presents a consistent dataset that holistically reflects the seasonal and contextual variabilities of the UHI phenomenon. Using the collected data, the authors have conducted a comprehensive comparison of CUHI and SUHI. It was shown that while the two are related and driven by similar parameters, the core mechanisms that drive each type are quite different. This casts serious doubt on the applicability of a one-size-fits-all mitigation strategy towards UHI, which is the dominant approach at the moment.
Furthermore, the development and application of data-driven models to capture the complex dynamics of UHI have profound implications for heat-resilient urban planning. The research highlights the critical role of controllable features, such as intended street use, building density, and vegetation density, in modulating both CUHI and SUHI. However, the impact of these features is not the same for both. This understanding enables urban planners to implement changes at the street level. This approach shifts generic strategies towards context-sensitive ones in managing UHI.
Regarding the performances of the developed models, it was shown that the ANN models take longer to train and interpret than the RF models. Moreover, the ease of hyperparameter tuning and of building a robust model gives tree-based algorithms an advantage when dealing with large and complex multi-dimensional datasets. Furthermore, the outcomes of the data-driven models underscore the unique performance of each feature at the street level, suggesting that the factors influencing the UHI phenomenon in one locality may not be the same in another due to the inherent variations in urban structure and socioeconomic aspects of each built environment. Moreover, while general principles and best practices for UHI mitigation exist, it is essential to recognize that the most effective strategies are context-specific and require careful consideration of the unique conditions of each environment at the street level. Therefore, street-level analysis can provide a more accurate and local-specific understanding of the phenomenon. The authors are currently pursuing this line of research.
Despite the contributions to the academic community and practitioners highlighted above, the study described in this paper exhibits some limitations: (1) sample size: the data were collected for one city. Subsequent efforts could consider expanding the data-gathering process to map more urban environments. A potential research line could look into expanding the data collection solutions to deploy increased numbers of mobile data collection units for use by the general public. Another potential application could be to miniaturize the UHI data collection unit by developing an integrated and embedded plug-and-play unit that can be used in different modes of transportation or municipal machinery, such as garbage collectors.
(2) need for standardization of the reference location: it should be noted that even a small change in the location of the reference point could potentially result in different magnitudes of UHI. Thus, the authors highlight the urgent need for a standardized protocol for defining the reference location to better understand and accurately measure both CUHI and SUHI at the same spatiotemporal resolution in order to improve the accuracy and consistency of UHI modeling.
(3) façades and rooftops: the study did not delve into the role of building façades and rooftops in UHI. However, these elements could affect the thermal properties and energy consumption of urban buildings, potentially having a significant impact on their overall energy performance (Santamouris, 2014, 2017).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Fig. 2 (right side) illustrates the average LST of the city during the summers of 2019, 2020, and 2021. In this figure, warmer areas are represented by red/orange colors and cooler areas are represented by blue/green colors.

Fig. 2. The location of Apeldoorn, the Netherlands, is shown on the left. On the right, a heatmap of Apeldoorn displays the average land surface temperature (LST) for the summers of 2019, 2020, and 2021 throughout the city. The heatmap also shows the location of the reference point used to calculate different types of UHI, and the fixed route chosen for the data collection campaign.

Fig. 3. On the left, the bicycle-based mobile urban data-gathering station. On the right, an example of the geo-referenced, time-stamped data synchronization at the processing center.

Fig. 5. Schematic representation of street features and labels estimation.The buffers represent the streets' jurisdiction, from where all features are calculated.

Fig. 6. Air and surface temperatures measured by the mobile unit from March 2021 to February 2022.

Fig. 7. Comparison of feature importance across three different reference locations (RTL 01, RTL 02, and RTL 03) for both CUHI and SUHI models.Each line represents a different feature, and its rank is plotted against the reference location.A lower rank indicates higher importance.

Fig. 8. Raincloud plots of the temperature variability of the four types of UHI studied in this research. The y-axis represents the months during which the data was collected and processed, while the x-axis represents the variation in UHI. (a) Summarizes the variation of CUHI with a temperature range from -4 to 4 °C. (b) Presents the variation of SUHI with a temperature range from -20 to 15 °C. (c) Presents the variation of ΔUHI with a temperature range from -20 to 15 °C, and (d) summarizes the variation of ∑UHI with a temperature range from -20 to 20 °C.

Fig. 9. Regression plots summarizing the performance of the ANN models. The x-axis indicates the R² value, highlighting the model's ability to explain the variance in the dataset. The y-axis represents the MAE value, illustrating the average magnitude of errors in the predictions.

Fig. 10. Regression plots summarizing the performance of the RF models. The x-axis indicates the R² value, highlighting the model's ability to explain the variance in the dataset. The y-axis represents the MAE value, illustrating the average magnitude of errors in the predictions.

Fig. 11. Bar plots illustrating the average SHAP (SHapley Additive exPlanations) values over 50 iterations for all the UHI models. Higher SHAP values indicate a more substantial impact on the model's output. These plots offer an in-depth understanding of feature importance across the different models.

Fig. 12. Comparison of feature importance across all UHI models included in this study.

M. Pena Acosta et al.

Table 1
Overview of the reviewed literature regarding mobile data campaigns between the years 2016-2021.

Table 2
Categorization of socioeconomic, morphological, and environmental parameters present in the built environment.
properties, which can affect the amount of heat that is absorbed or reflected.Surfaces that have high albedo and low emissivity tend to reflect more heat and absorb less.

Table 3
Sensors used in this study.
A. High performance ANN-MS, GPS antenna; measurement: coordinates; accuracy: RHCP (right-handed circular polarization); sample rate: 1 reading per second; potential limitations and sources of error: signal loss in dense urban areas due to buildings, atmospheric conditions, and potential inaccuracies due to speed and vibration from the bicycle.

Table 4
Hyperparameter configuration for GA-based optimization for the RF and ANN models.

Table 6
Mean and standard deviation values of data collected by the mobile unit (values are in °C).

Table 7
Mean and standard deviation values of CUHI, SUHI, ΔDUHI, and ΔAUHI (values are in °C).

Table 8
Architecture of the optimized RF and ANN models.

Table 9
Average of model performances after 100 iterations.