Comparison of different spatial temperature data sources and resolutions for use in understanding intra-urban heat variation



Introduction
Cities worldwide are growing fast to accommodate the increasing global population. According to the United Nations New Urban Agenda, by 2050, the world's urban population is expected to nearly double, which makes urbanization an inevitable trend (Habitat, 2017). The excess heat resulting from deforestation, land cover change, and the prevalence of impervious surfaces is great enough to raise cities' average temperature by several degrees over non-urbanized and rural areas, leading to the phenomenon known as the urban heat island effect (Oke, 1987). Increased temperature in cities impacts human health and well-being in several ways. However, the associated effects of heat exposure in cities are not spatially uniform. Land use, landscape features, surface covers, and morphological parameters make temperature distribution in cities uneven and cause disparities in the burden of heat exposure across sociodemographic groups (Hsu et al., 2021; Voelkel et al., 2018). While proactive heat mitigation strategies at the local level are integral to reducing the severity of extreme heat exposure, a lack of spatially explicit information on hotspots would undermine the specificity and applicability of the policies (Preston et al., 2011). Hence, the demand for systematic representations of spatial heat heterogeneity within cities is rising (Meng et al., 2018; Peng et al., 2020).
Representation of heat exposure variation can be facilitated through empirical measurements, allowing the direct observation of spatial and temporal variation in temperatures. For example, LANDSAT and MODIS images are widely applied to classify land cover/ land use changes and assess the association between spatiotemporal factors and the urban heat island effect (Chen, 2021;Deilami et al., 2016;Guha et al., 2018;Mukherjee & Singh, 2020;Schwarz et al., 2011). While empirical techniques such as satellite imagery and in-situ temperature measurements have moderate spatial and temporal resolutions, physically based modeling may provide higher resolution exposure representation by involving an array of physical and meteorological parameters at finer scales. Finer scales may be especially useful when engaging with residents because they enable more convincing causal relationships linked to the variation in temperatures (Berardi et al., 2020;Gunawardena & Steemers, 2019;Kianmehr & Lim, 2022). Moreover, modeling can incorporate more salient aspects of urban heat than remotely sensed infrared reflectance. For example, solar irradiance geometry models apply urban 3D models and meteorological data to estimate citizens' thermal comfort level in urban areas with a high resolution and through a human-centric point of view. Ultimately, however, the value of finer scale data depends on its intended use. For example, Wu et al. (2019) found that coarse regional air temperature data were sufficient to predict heat mortality during heat waves.
Urban planners are increasingly concerned with how decisions about the built environment can influence microclimate, giving rise to a need for models that can more precisely relate elements in the built environment with temperature impact (Keith & Meerow, 2022). Identification of underlying reasons for heat vulnerability and exposure variations within cities requires choosing appropriate proxies and relevant resolution of data (Karanja & Kiage, 2021). In recent years, advancements in "big data" acquisition and analysis have facilitated the operationalization of the known and theoretically based predictors, which could theoretically result in more precise predictions and explanations of spatial heat vulnerability associated with landscape elements. For instance, Google Application Programming Interface (API) for street view images (Google Street View, hereafter, "GSV") provides easy access to the extensive and invaluable sets of data taken from human-centric views , which allows for capturing three-dimensional elements with much higher resolution (Xu et al., 2012). In urban heat studies, street-level landscape features and urban form metrics derived from profile-based GSV images have been applied to identify the type, density, and distribution of urban greenery (Li et al., 2015b), track the temporal change of green index (Li, 2021b), and assess the accessibility of different neighborhoods and communities to the green areas (Li et al., 2015a). This promising source of data, coupled with other social indicators of heat vulnerability, such as poverty rate, educational attainment, ethnicity, gender, age, and historical segregation practices (Cutter et al., 2003;Dialesandro et al., 2021;Hoffman et al., 2020;Uejio et al., 2011;Wilson, 2020) can be used to create a picture of physical and social vulnerability of different localities toward the heat risk with a high resolution.
Whereas such suites of individual indicators leverage fine-scale analysis of temperature variations, aggregation techniques can change the nature of the gained information (Abson et al., 2012). A mismatch in spatial resolution among exposure data and physical and social vulnerability indicators (hereafter, vulnerability indicators) obstructs the consistent development of metrics and accurate representation of disparities (Cutter & Finch, 2008;Ho et al., 2015). This issue mainly emanates from a phenomenon called Modifiable Areal Unit Problem (MAUP) (Openshaw, 1984). The MAUP happens as the result of two distinct yet related conditions: the scale problem and the zoning effect (Fotheringham & Wong, 1991). The scale problem is associated with the aggregation of areal units' (e.g., pixel) data into the adjacent units, resulting in coarser units of analysis with the same areal size. In contrast, the zoning problem is related to the use of an alternative unit of analysis with different sizes while the total number of units is held constant (Ho et al., 2015;Ju et al., 2021). Both circumstances can change the results in multivariate analyses (Jelinski & Wu, 1996). For example, when the unit of analysis for the exposure and vulnerability analysis is determined arbitrarily, the analysis results may not reflect the real spatial heat heterogeneity. Such biased results would also hinder the accurate identification of suitable exposure and vulnerability indicators and proxies.
Selecting appropriate vulnerability indicators and data resolutions congruent with system dynamics is challenging, yet it is central to ascertaining appropriate heat mitigation strategies. There are many conceptual models in heat-related studies that scrutinize heat vulnerability at a fine scale to benefit site-specific policy-making (Conlon et al., 2020;Johnson et al., 2012;Mushore et al., 2018;Song et al., 2020). However, except for a few studies (Ho et al., 2015;Sobrino et al., 2012;Zhou et al., 2014), the literature lacks systematic analysis of data types and resolutions needed for the accurate detection of spatial heat variation and explainability of that variation. Therefore, this paper aims to explore what information can be gained about physical and social aspects of heat vulnerability and heat exposure through a data-driven approach that fuses different kinds of spatial data at a range of resolutions. As high-resolution data is not readily available for many locations due to the resource limitations and computational expenses (Deilami et al., 2018), this study specifically aims to evaluate the feasibility of using lower-resolution temperature data and new sources of vulnerability indicators to explain intra-urban heat variations. The key research questions this study seeks to answer are: first, what is the satisfactory range of exposure data resolution for accurately representing spatial temperature variations? Second, which groups of vulnerability indicators can better explain the variations in air temperature, land surface temperature, and mean radiant temperature? And, finally, what is the effect magnitude of specific landscape features and urban form metrics on changing temperatures? And how do the direction and magnitude of this effect differ in the whole study area and high-density zones?

Study area
Taking the city of Atlanta, Georgia, in the Southern United States as the case study, we systematically assess the compatibility of specific vulnerability indicators and temperature data and the suitability of specific spatial data at a range of resolutions for the representation of spatial temperature variations within cities. For this purpose, we employ various types of predictors, including specific street-level features, sociodemographic-based, and zone-based variables, to explain spatial air temperature (AT), land surface temperature (LST), and mean radiant temperature (MRT) variation within the city of Atlanta. The vector and raster-based data, including field measurements of air temperature, satellite imagery, and modeling data of various resolutions, are treated as dependent variables in the multivariate regression analyses.
The city of Atlanta has a population of around 500,000 and is growing quickly (United States Census Bureau, 2021). Located in Fulton County, the city of Atlanta is characterized by humid subtropical weather with four seasons. Simulation results predict that Atlanta will experience a higher frequency and longer duration of heatwaves (Habeeb et al., 2015). According to a survey, only 57% of respondents can afford or use central air conditioning when needed (Larsen et al., 2022). In addition, heat exposure in Atlanta is disproportionate: the greater burden of the urban heat island effect falls on the poor (Chakraborty et al., 2019).

Data
Following studies such as Adger (2006), Guillard-Gonçalves and Zêzere (2018), and Turner et al. (2003), we refer to vulnerability indicators as the composite of biophysical and socioeconomic indices. Based on theories related to physical and social vulnerability, we used three types of vulnerability indicators, including street-level landscape features and urban form metrics, population-based, and zone-based data, as explanatory variables in the regression analyses. Moreover, to study spatial temperature variation, we employed three exposure data types as dependent variables in multivariate regression analyses: air temperature, land surface temperature, and mean radiant temperature. Thermal comfort provided by shading and wind speed is highly dependent on fine-scale landscape features and urban form metrics. Studying the variations in thermal comfort and shading was the primary motivation for using the mean radiant temperature as one of the exposure data types in this study. Air and surface temperature might not capture thermal comfort as thoroughly as other heat indexes, yet they are the most prevalent and accessible types of temperature data and are worth further exploration to test their suitability for representing spatial heat variations.

Air Temperature (AT).
Air temperature data were obtained from urban heat campaign (UHC) measurements for Atlanta. Funded by NOAA, over the past five years, the urban heat campaign has taken place in several localities in the U.S. (Mapping Campaigns, 2022). In this study, we used the afternoon measurements (3:00 pm-4:00 pm) of the urban heat campaign in Atlanta, which took place on September 4th, 2021, and contained 23,386 observation points. Morning and evening measurements from this campaign are also available. However, given the significance of heat-related studies during the hottest hours of the day and one of the purposes of this study, which is to analyze the shading effect of street-level elements, we utilized the afternoon data from the UHC measurements. Further information about the air temperature data employed in this study can be found in Table 1. This vector-based source of data is treated as one of the dependent variables in our multivariate regression analyses. In Fig. 1, we represent the normalized (scaled to the maximum-minimum range) spatial distribution of air temperature along with the other types of heat exposure data used in this study (land surface temperature and mean radiant temperature) for comparison purposes. Fig. 1(a) shows the normalized spatial distribution of air temperature data, the study area, and the observation points of this study.

Land Surface Temperature (LST).
The land surface temperature data were acquired through the bulk download of Landsat 8 images from the United States Geological Survey (USGS) website. Thermal infrared sensor (TIRS) images of Atlanta for the summer months (June 1 to September 30) of the years 2019 to 2021 were collected (14 scenes in total) and processed, including cloud masking and handling of missing pixels. In the next step, the daytime land surface temperature for each grid point was calculated by converting the raster values into degrees Celsius and taking the mean across all scenes. As a result, a single-layer daytime land surface temperature image was created for the city of Atlanta. Note that the original resolution of the Landsat TIR sensor is 100 m. However, the raster data used in this study had been resampled to 30 m as part of the Analysis Ready Data (ARD) dataset prepared by the USGS, to match the resolution of the other (visible band) data distributed in ARD. For more information about the resolution of Landsat images, please refer to the USGS website (USGS, 2023). This dataset was also used as a dependent variable in the multivariate regression analysis (Fig. 1(b)).
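The compositing step described above (cloud-masked scenes averaged per pixel and expressed in degrees Celsius) can be sketched as follows. This is only an illustration under stated assumptions: the scenes are taken to be already converted to Kelvin, masked or missing pixels are represented as `None`, and the function name `mean_lst_celsius` is hypothetical rather than part of the processing chain used in the study.

```python
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in Kelvin


def mean_lst_celsius(scenes_kelvin):
    """Average co-registered scenes pixel by pixel, skipping masked values.

    scenes_kelvin: list of 2D lists with identical shape; None marks a
    cloud-masked or missing pixel (an assumption of this sketch).
    Returns a 2D list in degrees Celsius; a pixel that is masked in every
    scene stays None.
    """
    n_rows = len(scenes_kelvin[0])
    n_cols = len(scenes_kelvin[0][0])
    composite = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            valid = [s[r][c] for s in scenes_kelvin if s[r][c] is not None]
            if valid:
                row.append(sum(valid) / len(valid) - KELVIN_OFFSET)
            else:
                row.append(None)
        composite.append(row)
    return composite


# Two tiny hypothetical scenes (1 row x 2 columns); one pixel is cloud-masked.
scene_a = [[300.15, None]]
scene_b = [[302.15, 290.15]]
composite = mean_lst_celsius([scene_a, scene_b])
```

Averaging only the valid observations keeps a pixel usable even when it is clouded in some scenes, at the cost of each pixel being based on a different number of dates.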

Mean Radiant Temperature (MRT).
MRT more closely represents the thermal comfort that humans feel because it is derived by summing all shortwave and longwave radiation fluxes that the human body is exposed to (both directly and reflected) (Lindberg et al., 2008). However, this means that MRT cannot easily be measured as a spatial dataset and is usually calculated through mathematical modeling. Using SOLWEIG (SOlar and LongWave Environmental Irradiance Geometry) model, besides spatial variations of 3D radiation fluxes, we simulated the shadow pattern and the sky view factor of Atlanta's urban settings, taking into account the influence of shade on the thermal comfort of the human body. The two major inputs of the model are terrain features including ground topography and building configurations, and meteorological data, such as air temperature, relative humidity, wind speed, direct radiation, and diffuse radiation (Li, 2021a). For Atlanta, meteorological data and the high-resolution (1 m) 3D urban model generated from LiDAR and aerial images were used as inputs of SOLWEIG to calculate the spatial distribution of average mean radiant temperature and, thus, the human outdoor thermal exposure level across neighborhoods. More information about the modeling process can be found in our previous study (Li, 2021a). This raster-based data also was treated as the dependent variable in our regression analysis ( Fig. 1(c)). Further details about the distribution and original resolution of this dataset can be found in Table 1. LST and MRT data of Atlanta show a similar pattern for the distribution of hotspots in this city, where the highest land surface and mean radiant temperature were observed in industrial, high-density commercial, and office institutional zones. The most elevated air temperatures were observed in the southern part of the city, where the high-density residential and single-family residentials are located. 
The distinction between the distribution of hot spots and cool spots of air temperature and other measures of heat, such as land surface temperature, has been noted in other studies. This can be attributed to the anthropogenic activities and physical and landscape parameters that affect air temperature variations in urban areas (Amani-Beni et al., 2022).

Heat vulnerability indicators

Landscape features and urban form metrics.
In this study, vulnerability indicators refer to both biophysical and social aspects of vulnerability, such as lack of vegetation and shading, increased sky view factor, the prevalence of impervious surfaces, and a larger population of marginalized groups. Landscape features and urban form metrics are considered one subset of vulnerability indicators and refer to morphological characteristics such as sky view factor and street-level built environment and landscape features that pedestrians can directly perceive. To collect data related to such physical aspects of heat vulnerability, we used Google Street View images of Atlanta. Using the Urban Heat Campaign measurement point locations, we obtained a list of available images in different years and months for those measurement points. We filtered the image list for September of the years between 2017 and 2019 to get the most recent and relevant images for download. This list contained 12,321 panorama identification numbers (I.D.). Because four images are available for each panorama I.D., we downloaded 49,284 images in total for those specific locations.
To quantify the landscape features and built environment of downloaded images, we used Pyramid Scene Parsing Network (PSPNet), a superior framework for pixel-level predictions (Zhao et al., 2017). This image scene parsing and semantic segmentation algorithm enabled us to quantify 150 features (including buildings, trees, grass, road, sky, water, person, and car) that appeared in the downloaded images and analyze the physical characteristics of desired locations. Fig. 2, created by the authors, illustrates the process of quantifying street-level elements in Google Street View images using PSPNet.
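Once a segmentation model has assigned a class label to every pixel, turning an image into street-level feature values is a counting exercise. The sketch below assumes the model output is available as a 2D grid of integer class indices; the mapping `CLASS_NAMES` and the function `class_fractions` are hypothetical and do not reproduce PSPNet's actual label set or the authors' code.

```python
from collections import Counter

# Hypothetical mapping from class index to feature name.
CLASS_NAMES = {0: "sky", 1: "tree", 2: "building"}


def class_fractions(label_mask, class_names):
    """Return the share of pixels belonging to each named class.

    label_mask: 2D list of integer class indices, one per pixel.
    class_names: dict mapping class index -> feature name.
    """
    counts = Counter(label for row in label_mask for label in row)
    total = sum(counts.values())
    return {name: counts.get(idx, 0) / total
            for idx, name in class_names.items()}


# A tiny 2 x 4 "image": 2 sky pixels, 5 tree pixels, 1 building pixel.
mask = [
    [0, 0, 1, 1],
    [2, 1, 1, 1],
]
fractions = class_fractions(mask, CLASS_NAMES)
```

Each fraction can then serve as one predictor value for the observation point where the image was taken.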
We further used GSV panoramas to calculate the sky view factor (SVF), which refers to the ratio between the radiance received by a planar ground and the entire hemispheric radiation. The value of SVF ranges from 0 to 1, where 0 represents the total enclosure of the urban environment by trees or buildings, and 1 represents complete openness. We generated hemispherical images from the GSV panorama images using a geometrical transform model and quantified the visible portions of the sky to calculate the SVF of each observation point. For further details about calculating SVF from GSV images, refer to Li and Ratti (2018).

Population-based data.
In addition to landscape features and urban form metrics, sociodemographic variables have been shown to have explanatory power over spatial temperature distribution (Karanja & Kiage, 2021). This is because of discriminatory housing and urban planning processes and residential segregation. Socioeconomic indices such as race, income, education, gender, and age can be used to assess the social aspects of heat vulnerability and the risk associated with heat in different localities (Harlan et al., 2006; Uejio et al., 2011; Wilson, 2020). We obtained population data for our temperature observation points from the City of Atlanta's open data portal, including total population, population density, and the percentage of each race (White, African American, Hispanic, and Asian) in 2010. We included those variables in stepwise regression analysis to identify the most important population-based variables to include in our regression models.

Zone-based data.
Urban heat island intensity is associated with the dominant land use and land cover zones (Weng et al., 2007;Yang et al., 2017). The urban thermal environment in a city varies due to the differences in land use and surface characteristics (Chen et al., 2023;Hart & Sailor, 2009), and the effects of these differences may not be captured through the street-centric landscape elements derived from GSV. To incorporate the effect of broader land use and land cover characteristics in the analysis of intra-urban heat variation, we included Atlanta's zone class categories in our analyses. Atlanta's zone class categories were obtained from the open data portal of the City of Atlanta. We also examined other categorical types of data, such as land use, neighborhood planning units, and statistical areas. However, we found that the zone class categories contributed the most to changes in temperatures.

Methodology
As discussed above, a variety of methods were adopted to extract and prepare data for statistical analysis. Fig. 3 reviews the workflow of the present study.

Changing data resolution
To examine the suitability of specific spatial data at a range of resolutions for the representation of spatial temperature variations, we downgraded the resolution of the raster-based exposure data (LST and MRT). Because our air temperature data is vector-based, changing resolution does not apply to this data type in our analysis. We therefore treated LST and MRT at various spatial resolutions, alongside AT at its original resolution, as dependent variables in our multivariate regression analyses.
We used the coefficient of determination (R-squared), a measure of goodness-of-fit, to compare the explanatory power of land surface and mean radiant temperature data at a range of resolutions for the representation of spatial heat variations.
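One common way to downgrade a raster is to average non-overlapping blocks of neighbouring pixels. The sketch below illustrates that idea; it assumes block-mean aggregation and an integer aggregation factor, which may differ from the resampling tool and settings actually used, and the function name `aggregate_blocks` is hypothetical.

```python
def aggregate_blocks(raster, factor):
    """Coarsen a 2D raster by averaging non-overlapping factor x factor blocks.

    raster: 2D list of numbers (rows of pixel values).
    factor: number of original pixels per coarse pixel along each axis.
    Trailing rows or columns that do not fill a whole block are dropped.
    """
    n_rows, n_cols = len(raster), len(raster[0])
    coarse = []
    for r0 in range(0, n_rows - n_rows % factor, factor):
        coarse_row = []
        for c0 in range(0, n_cols - n_cols % factor, factor):
            block = [raster[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# A 4 x 4 grid coarsened by a factor of 2 becomes a 2 x 2 grid.
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = aggregate_blocks(fine, 2)
```

Applying the same function with increasing factors yields a series of progressively smoother rasters, each of which can be sampled at the observation points and used as a dependent variable.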

Statistical analyses
To identify the important variables to include in our regression models, we used forward and backward stepwise regression. Based on the outputs of this feature selection method, we developed various models with different groups of predictors (i.e., landscape features and urban form metrics, population-based, and zone-based variables) and dependent variables (i.e., AT, LST, and MRT of various resolutions). We also checked for multicollinearity and removed variables that showed a strong relationship with each other. The final set of selected variables for the regression model can be found in Table 2. We used JMP software to perform ordinary least squares (OLS) regression and used the resulting R-squared values (and adjusted R-squared values for the nested models) to study the power of different groups of predictors in explaining the variations in AT, LST, and MRT over a range of resolutions (Eq. (1)). The initial number of observation points in our study was 8895, and each observation point was located at least 10 m from adjacent points.
Eq. (1) shows the statistical specification of the ordinary least squares regression:

$$Y_i = \beta_0 + \beta_1 X_{g,i} + \beta_2 X_{p,i} + \beta_3 X_{z,i} + \varepsilon_i \tag{1}$$

where $Y_i$ is one of the dependent variables, including AT, LST, and MRT at a specific resolution at location $i$; $X_{g,i}$ is the vector of landscape features and urban form metrics (GSV-driven variables) observed at location $i$; $X_{p,i}$ is the vector of population-based variables in the area in which location $i$ falls; $X_{z,i}$ is the vector of zone-based variables in which location $i$ falls; $\beta_0$ is the intercept; $\beta_1$, $\beta_2$, and $\beta_3$ are vectors of the estimated coefficients of the predictor variables; and $\varepsilon_i$ is the error term observed for location $i$.
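To make the regression specification concrete, the following dependency-free sketch fits an OLS model with an intercept by solving the normal equations and reports R-squared. It is an illustration only: the analysis in this study was run in JMP, the function names are hypothetical, and the sample data are invented so that the fit is exact.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_fit(predictors, y):
    """Fit y on the predictors plus an intercept; return (coefficients, R-squared).

    predictors: list of rows, one row of predictor values per observation.
    The first returned coefficient is the intercept.
    """
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Invented data generated as y = 2 + 1.0 * x1 - 0.5 * x2 (no noise).
rows = [[0, 1], [1, 0], [2, 1], [3, 3], [4, 2]]
y = [1.5, 3.0, 3.5, 3.5, 5.0]
beta, r_squared = ols_fit(rows, y)
```

Solving the normal equations directly is adequate for a small, well-conditioned example; statistical packages typically use more numerically stable decompositions.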
Following OLS estimation of the coefficients, we also checked for spatial autocorrelation. Details about those analyses can be found in Appendix A.
To further explore the role of shading and vegetation in temperature variations, we investigated the direction and magnitude of the effect of specific landscape features and urban form metrics (such as plants, buildings, and the sky view factor) on temperature exposure data using the standardized regression coefficient (SRC). The SRC typically falls between −1 and +1 and indicates both the direction and magnitude of the change in the response variable, in standard deviation units, associated with a one-standard-deviation change in the independent variable. We also estimated regressions using two subsets of the data: (1) the whole study area and (2) high-density zones (high-density residential, commercial, and office institutional zones) only. We compared the estimated coefficients on the high-density subset because we hypothesized that very tall buildings, by providing shading and reducing the sky view factor in these areas, would lower the mean radiant temperature (MRT) and perhaps moderate the effect of vegetation.
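As a rough illustration of what a standardized coefficient expresses, the sketch below rescales an unstandardized slope by the ratio of the sample standard deviations of predictor and response, assuming the common definition SRC = b · s_x / s_y. The function names and the data are hypothetical and are not taken from the study's JMP output.

```python
from statistics import mean, stdev


def standardized_coefficient(b, x, y):
    """Rescale an unstandardized slope b by the ratio of sample standard deviations."""
    return b * stdev(x) / stdev(y)


def simple_slope(x, y):
    """Least-squares slope for a single predictor (used only for this illustration)."""
    x_bar, y_bar = mean(x), mean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den


# Invented data: tree share in the street view vs. a temperature reading,
# generated exactly as temperature = 34.5 - 10 * tree_share.
tree_share = [0.05, 0.10, 0.20, 0.35, 0.50]
temperature = [34.0, 33.5, 32.5, 31.0, 29.5]

b = simple_slope(tree_share, temperature)
src = standardized_coefficient(b, tree_share, temperature)
```

Because the invented relationship is perfectly linear and decreasing, the slope is about −10 in the original units while the standardized coefficient is about −1, which shows how standardization removes the dependence on measurement units.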

Figs. 4 and 5 show the R-squared value of the regression model (Eq. (1)) of LST and MRT data at a range of resolutions. The Y axis in these plots represents the R-squared values of the regression models, and the X axis shows the resolution of the dependent variables (LST and MRT). According to Fig. 4, the R-squared value of the regression model, that is, the explanatory power of the independent variables included in the model, did not change significantly when the LST data resolution was downgraded to 500 ft (~152 m). Beyond 500 ft, the R-squared value started to drop at a higher rate. Overall, downgrading the LST data resolution from ~100 ft (30 m) to 1000 ft (~305 m) lowered the R-squared value from about 0.7 to about 0.6. This pattern suggests that even lower resolutions of LST data can still be appropriate for explaining the variations in land surface temperature. A similar pattern is observed for MRT with decreasing spatial resolution (Fig. 5). The higher range of R-squared values of the regression models with LST data at various resolutions implies that the groups of independent variables considered in this study (Table 2) have higher explanatory power for variations in land surface temperature than for variations in mean radiant temperature.

The explanatory power of different groups of vulnerability indicators
As mentioned in the previous sections, we included landscape features and urban form metrics (GSV-driven variables), population-based, and zone-based variables in our regression model as predictors. To study the effect of each group of these variables separately, we included them in the model one by one, ran the regression model one at a time, and compared the adjusted R-squared. As different numbers of variables were included in each model, an F-test also was performed to identify whether the more complex models (models with more variables) have a significant improvement over the simpler models (models with a reduced number of variables). Pairwise comparisons of the F-ratio (using the ANOVA test) demonstrate that more complex models offer significant improvements over the simpler models for each dependent variable. More details about the F-test comparisons can be found in Appendix B. Fig. 6 shows the adjusted R-squared values of each of those regression models with air temperature, land surface temperature, and mean radiant temperature in their original resolutions (at the resampled ~100 ft (30 m) resolution for LST and ~2 ft (1 m) for MRT) as the dependent variables. According to this plot, the regression results for the LST data show the highest adjusted R-squared values (0.55, 0.55, 0.72) compared to the MRT (0.52, 0.52, 0.56) and AT (0.20, 0.29, 0.47). Moreover, this figure shows that the GSV-driven variables have higher explanatory power for explaining MRT and LST data variations than the air temperature. However, it appears that the population-based and especially the zone-based variables have the most important impacts in   explaining the variations in air temperature. This fact suggests air temperature of different locations in a city is more associated with sociodemographic and land use/ land cover characteristics of the area rather than the landscape features and urban form characteristics.
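The comparison of nested models described above rests on two standard formulas: the adjusted R-squared, which penalizes additional predictors, and the F statistic for the improvement of a fuller model over a reduced one. The sketch below shows both, assuming models fitted with an intercept on the same n observations; the function names and the numbers in the example are hypothetical and are not the values reported in Appendix B.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for a model with k predictors (plus intercept) and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def nested_f_statistic(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic testing whether the extra predictors of the full model add explanatory power.

    The reduced model must be nested in the full model (k_reduced < k_full).
    Degrees of freedom: (k_full - k_reduced, n - k_full - 1).
    """
    numerator = (r2_full - r2_reduced) / (k_full - k_reduced)
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical example: adding one predictor raises R-squared from 0.50 to 0.60
# with 103 observations.
f_value = nested_f_statistic(0.50, 0.60, n=103, k_reduced=1, k_full=2)
adj = adjusted_r_squared(0.50, n=11, k=5)
```

The resulting F value would then be compared against the F distribution with the stated degrees of freedom to obtain a p-value, a step omitted here to keep the sketch free of external dependencies.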
In addition, the low adjusted R-squared value of the air temperature regression model with GSV-driven variables implies the air temperature data would not necessarily benefit from higher spatial resolution data collection since it is already not explainable by high-resolution landscape features and urban form metrics data. However, higher temporal resolution might mitigate the low explanatory power of these variables in regression models, which requires further investigation.
Based on Fig. 6, we also noticed that while the population-based variables help explain air temperature variations, they did not improve the adjusted R-squared values for the MRT and LST regressions. However, the zone-based data improved the adjusted R-squared value of the LST regression. Overall, among the variables considered in this study, landscape features and urban form metrics (GSV-driven variables) were found to be the most important group of variables for explaining the variation in MRT and LST.
As the role of GSV-driven variables in explaining LST and MRT variations proved to be significant, we also studied the explanatory power of this group of variables with the lowest resolutions of LST and MRT data. Fig. 7 represents the results (R-squared) of regression models in which the GSV-driven variables were included as independent and the lowest resolution LST and MRT data as the dependent variables. We also reported the regression result with the original resolutions in this plot to compare the impact of data resolution on the explanatory power of this group of variables.
As expected, lowering data resolutions for LST and MRT lowered the explanatory power of the high-resolution GSV-driven variables. However, even at the lowest resolutions studied in this paper (1000 ft (~305 m) for the LST and 128 ft (~39 m) for the MRT data), those variables still showed a relatively moderate potential for explaining the variation in LST and MRT (R-squared value of 0.43 for the LST and 0.35 for the MRT regression model). Table 3 presents standardized regression coefficients with only GSV-driven variables included in the models. As this table suggests, most of the landscape features and urban form metrics, even at lower data resolutions, have a statistically significant effect (P-value < 0.05) on land surface and mean radiant temperature. Moreover, according to this table, the direction and magnitude of the effect of GSV-driven variables showed a consistent trend among higher and lower resolutions of LST and MRT data. This finding demonstrates the applicability of GSV-driven variables in explaining land surface and mean radiant temperature variations. Moreover, it suggests the usefulness of the lower resolutions of LST and MRT data for representing spatial heat heterogeneity in the case of high-resolution data paucity.

Table C1
Average values of exposure data and vegetation and urban form metrics in the whole study area and high-density zone.

Table C2
The correlation between SVF, tree, and building in the whole study area and high-density zone.

                           Tree     Building
Whole Study Area (SVF)     −0.73    0.14
High-Density Zone (SVF)    −0.32    −0.27

Effect magnitude of specific landscape features and urban form metrics on temperature variations in the whole study area and high-density zones
According to Table 4, plants in both the whole study area and high-density zones are significantly (P-value < 0.05) related to AT, LST, and MRT. As with the full dataset, vegetation parameters (plants and grass) showed a negative relationship with AT, LST, and MRT. However, the magnitude of their effects on AT, LST, and MRT was slightly different, with high-density zones having a slightly reduced effect of plants and grass on air temperature (SRC values of −0.11 and −0.18 for the whole study area compared to −0.06 and −0.18 for the high-density zones).
The direction and strength of the relationship between buildings and temperature-related data were found to be different in the whole study area and high-density zones. For example, while buildings showed a statistically insignificant positive (+0.01) relationship with MRT for the whole study area, in the high-density zones, buildings had a significant negative effect (−0.12) on MRT. Because these are standardized coefficients, this result implies that a one-standard-deviation increase in building density is associated with a 0.12-standard-deviation decrease in MRT, which is consistent with the shading effect of buildings in high-density residential and commercial zones. However, the relationship between buildings and LST (both in the whole study area and high-density zones) remained positive, suggesting that an increase in building density would lead to a rise in the land surface temperature. The association of land cover (in this case, built areas) and land surface temperature can explain such a pattern. The relationship between air temperature and buildings, both in the whole study area and the high-density zones, was found to be statistically insignificant. Table 4 also shows a significant positive relationship between SVF and all types of temperature data in the whole study area and high-density zones, emphasizing that an increase in the street's openness would lead to a rise in temperatures. SVF showed the most substantial effect on MRT. However, the magnitude of this association proved to be smaller for the high-density area than for the whole study area. This difference was especially pronounced for the LST data, where the magnitude of association dropped from 0.26 to 0.09.

The spatial resolution effect
Downgrading the resolution of both raster-based heat exposure datasets examined in this study (LST and MRT) did not significantly affect the R-squared values of the regression models. For the LST data, the explanatory power of the predictors for variations in land surface temperature remained almost stable up to a downgraded resolution of 500 ft (~152 m). It is worth noting that, as described in the "Data" section, we used down-sampled Landsat TIR sensor data with a resolution of 30 m (approximately 100 feet). As the native resolution of Landsat TIR sensor data is 100 m, aggregating the pixels of our data and coarsening the resolution to this range did not significantly reduce the accuracy of the information obtained. This may explain why the R-squared values of LST regression models using data with a resolution of about 100 m did not decrease significantly.
For the MRT data, the R-squared values of the regression models remained within the range of the original data resolution up to a downgraded resolution of 48 ft (~15 m). These findings support the use of lower-resolution LST and MRT data in explaining temperature variations when higher-resolution data are unavailable due to resource and computational constraints. The drop in the R-squared values of the regression models at coarser resolutions can be explained by the Modifiable Areal Unit Problem (MAUP). Coarser units of analysis, which result from aggregating adjacent pixels and smoothing pixel values, introduce spatial data quality concerns (Griffith et al., 2015; Marceau, 1999). Despite this, Sobrino et al. (2012) reported that the spatial resolution of LST data could be as coarse as 165 ft (~50 m) and still represent the differences in urban heat island effect between districts. Wu et al. (2019) also found no significant change in the association between adverse health outcomes and land surface temperature at three spatial resolutions (zip codes, 12.5 km grids, and 1 km grids). Although these findings align with the current study's results and support the use of lower-resolution data to represent variations in MRT and LST, data resolution should be chosen based on the specific purpose of the study and careful consideration of the physical phenomenon being represented. This is especially crucial when producing heat risk hotspots from spatial vulnerability and exposure data, where the data are usually aggregated to match the spatial units employed (e.g., census tract, postal code) (Ho et al., 2015). The results of this study have important implications for heat-related modeling and for studies that use heat exposure data to estimate heat morbidity and citizens' needs during extreme heat events (Kianmehr & Pamukcu, 2021).
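The resolution experiment discussed above can be illustrated with a small sketch: a raster is aggregated to coarser cells, and the R-squared of a simple regression is recomputed at each resolution. This is only an illustration under stated assumptions. It uses block averaging as a stand-in for the bilinear resampling applied in the study, a single hypothetical predictor (`veg`), and synthetic data generated with a fixed random seed, so the printed values say nothing about the study's results.

```python
import random


def block_mean(grid, factor):
    """Aggregate a 2D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            cells = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def flatten(grid):
    """Turn a 2D grid into a flat list of cell values."""
    return [v for row in grid for v in row]


def r_squared(x, y):
    """R-squared of a one-predictor OLS fit (the squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
size = 16
# Hypothetical predictor: west-to-east vegetation gradient plus noise
veg = [[c / size + rng.gauss(0, 0.1) for c in range(size)] for _ in range(size)]
# Hypothetical temperature surface that cools with vegetation, plus noise
temp = [[40.0 - 5.0 * veg[r][c] + rng.gauss(0, 0.5) for c in range(size)]
        for r in range(size)]

results = {}
for factor in (1, 2, 4):
    results[factor] = r_squared(flatten(block_mean(veg, factor)),
                                flatten(block_mean(temp, factor)))
    print(f"aggregation factor {factor}: R-squared = {results[factor]:.3f}")
```

Because aggregation changes both the number of observations and the variance retained in each cell, R-squared values computed this way reflect the unit of analysis as well as the underlying relationship, which is the concern raised by the MAUP.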

The explanatory power of different groups of vulnerability indicators
Examining different groups of vulnerability indicators showed that population and zone-based variables have the greatest impact in explaining the variations in air temperature. The association of air temperature with socio-demographic and land use/land cover characteristics can explain this observation (Ngarambe et al., 2021). Moreover, in our analyses, landscape features and urban form metrics (GSV-driven variables) showed the highest explanatory power for LST and MRT variations. It is widely acknowledged that land surface temperature is strongly influenced by local landscape features (e.g., plants, trees, and grass) and urban form metrics (e.g., urban geometry, the sky view factor, aspect ratio, etc.) (Gage & Cooper, 2017; Yang et al., 2021). Recent studies have demonstrated the application of GSV images for estimating sky view factor, urban greenery, shade provision, and residents' outdoor heat exposure (Li, 2021b; Li & Ratti, 2018). In this study, GSV-driven variables showed moderate explanatory power for the variations in LST and MRT data, even at the lowest data resolution. However, the lower adjusted R-squared values of the regression models with air temperature as the dependent variable can be attributed to the more complex relationship between local air temperature and factors such as anthropogenic activities and physical and landscape characteristics (Amani-Beni et al., 2022).

The effect magnitude of specific landscape features and urban form metrics
For the vegetation parameters examined in this study (plants and grass), the direction of their effects on temperatures was negative and consistent across all three types of exposure data, confirming the results of previous studies (Dimoudi & Nikolopoulou, 2003; Giridharan et al., 2008). The most notable impact of buildings was observed on MRT in high-density zones, where a significant negative effect was recorded (−0.12); this pattern was not observed with LST. A similar pattern regarding the shading effect of buildings and their role in reducing MRT levels in urban environments was observed in previous studies (Lindberg & Grimmond, 2011; Nasrollahi et al., 2021). The role of the sky view factor in changing temperature was also significant. The direction of the SVF effect on the temperature exposure data was consistent (a significant positive effect) across all variables in the whole study area and the high-density zones. This finding is in line with other studies. For example, a study in Phoenix, Arizona, showed that the sky view factor derived from GSV images has a statistically significant positive correlation with daytime and nighttime LST (0.52 and 0.11, respectively) (Zhang et al., 2019). Similarly, a study in Beijing, China, showed that highly shaded areas (SVF < 0.3) would significantly reduce the frequency of thermal discomfort (He et al., 2015). Our study additionally shows that the effect magnitude of this variable decreased in high-density zones. This might be explained by the sparse and homogenized tree density in high-density zones (Table C1). As trees play a central role in controlling SVF, this can affect the strength of the association between SVF and temperatures. However, in high-density areas, buildings show a strong negative relationship with SVF (Table C2), suggesting that an increase in buildings would decrease street openness (SVF) and would ultimately help lower temperatures (especially mean radiant temperature). This finding provides important evidence that street-level elements other than trees (such as buildings) can impede direct solar radiation and improve thermal comfort in urban environments.

Limitations and future research
This study has its limitations. In terms of data sources, we only included air temperature, land surface temperature, and mean radiant temperature at a range of resolutions as the heat exposure variables. However, there are other heat indexes, such as Wet Bulb Globe Temperature (WBGT), Universal Thermal Climate Index (UTCI), Physiological Equivalent Temperature (PET), and Predicted Mean Vote (PMV), that involve human body characteristics as well as humidity, radiation, and wind speed, in addition to temperature, to measure thermal comfort level (Höppe, 1999; Jendritzky et al., 2012; Wei et al., 2022a, 2022b). Investigating the appropriate resolution ranges of such heat measures can be the subject of future research. Moreover, in this study, we focused on a limited range of spatial resolutions, while the effect of the temporal resolution of data is also substantial and requires further investigation. A wider range of spatial data resolutions could also be applied to reach more comprehensive conclusions. In terms of methods, we employed only one resampling technique (bilinear interpolation) for changing data resolutions. Different resampling techniques (cubic convolution, nearest neighbor, etc.) might slightly change the results. Therefore, further research is needed to study the choice of resampling method and its influence on the results. Finally, we note that the current analysis was conducted only for Atlanta, so the results are not generalizable to locations with different climates, landscapes, and demographic characteristics. Cross-site evaluations can be performed in future research to explore the consistency of the observed results.
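To illustrate why the choice of resampling technique can matter, the sketch below coarsens a small grid with nearest-neighbor and with bilinear sampling at the output cell centers. It is a simplified, assumption-laden illustration: the `lst` grid is hypothetical, the function names are ours, and GIS software may differ in how it aligns cells and handles edges.

```python
import math


def sample_nearest(grid, y, x):
    """Value of the input cell whose center is closest to (y, x)."""
    r = min(len(grid) - 1, max(0, math.floor(y + 0.5)))
    c = min(len(grid[0]) - 1, max(0, math.floor(x + 0.5)))
    return grid[r][c]


def sample_bilinear(grid, y, x):
    """Distance-weighted average of the four cells surrounding (y, x)."""
    y = min(max(y, 0.0), len(grid) - 1)      # clamp to the grid extent
    x = min(max(x, 0.0), len(grid[0]) - 1)
    r0, c0 = math.floor(y), math.floor(x)
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fy, fx = y - r0, x - c0
    top = grid[r0][c0] * (1 - fx) + grid[r0][c1] * fx
    bottom = grid[r1][c0] * (1 - fx) + grid[r1][c1] * fx
    return top * (1 - fy) + bottom * fy


def resample(grid, out_rows, out_cols, sampler):
    """Resample grid to out_rows x out_cols by sampling at output cell centers."""
    in_rows, in_cols = len(grid), len(grid[0])
    out = []
    for i in range(out_rows):
        y = (i + 0.5) * in_rows / out_rows - 0.5
        out.append([sampler(grid, y, (j + 0.5) * in_cols / out_cols - 0.5)
                    for j in range(out_cols)])
    return out


# Hypothetical 4 x 4 surface temperature grid (deg C)
lst = [[30.0, 32.0, 34.0, 36.0],
       [31.0, 33.0, 35.0, 37.0],
       [32.0, 34.0, 36.0, 38.0],
       [33.0, 35.0, 37.0, 39.0]]

coarse_nn = resample(lst, 2, 2, sample_nearest)
coarse_bl = resample(lst, 2, 2, sample_bilinear)
print(coarse_nn)  # [[33.0, 37.0], [35.0, 39.0]]
print(coarse_bl)  # [[31.5, 35.5], [33.5, 37.5]]
```

In this toy case the nearest-neighbor output keeps single original cell values, while the bilinear output averages neighboring cells, so the two coarse grids differ by 1.5 deg C in every cell even though they come from the same input.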

Conclusion
In this paper, we examined the satisfactory range of exposure data resolutions for accurately representing spatial temperature variations. Moreover, by including different types of physical and social vulnerability metrics, we explored which groups of vulnerability indicators better explain the variations in temperature-related data. Further, we investigated the effect of specific landscape features and urban form metrics on temperatures in urban environments. Finally, we compared the effect of those variables on temperatures in the whole study area and in high-density zones to assess the role of specific street-level elements in providing shading and thermal comfort.
The results of this study revealed that downgrading resolutions of land surface temperature (up to 152 m) and mean radiant temperature data (up to 15 m) would not substantially reduce the power of social and physical vulnerability metrics in explaining the variations in temperatures. Therefore, the lower resolution of LST and MRT data may still satisfactorily represent spatial urban temperature variations. Moreover, among vulnerability indicators studied in this paper, landscape features and urban form metrics showed the highest explanatory power in regression models. While the sky view factor proved to have the most influence in changing temperatures in the whole study area, buildings showed a significant effect on reducing the mean radiant temperature (with the SRC value of − 0.12) in high-density zones. These findings highlight the usefulness of street-level elements in providing shading and thermal comfort in high-density urban areas. The results of this study provide insights vis-a-vis appropriate sets of data and relevant resolution of temperature measurements for representing spatial urban heat variations which have important implications for heat-related policies and planning.

Declaration of Competing Interest
None.

Data availability
Data will be made available on request.

Acknowledgement
This material is partially supported by the National Science Foundation under Grant Number 1735139. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Theodore Lim's time was partially supported by a seed grant from Virginia Tech's Institute for Society, Culture, and Environment. We would also like to thank Meng Qi (School of Public and International Affairs at Virginia Tech) for providing technical support and Temple University High-Performance and Scientific Computing Cluster for providing the computing resources. We also thank four anonymous reviewers of the Sustainable Cities and Society journal for their insightful comments and suggestions.

Appendix A. Spatial autocorrelation effect
We used Moran's I test to check for spatial autocorrelation (the correlation among observation points due to spatial proximity). The positive values of Moran's I for AT, LST, and MRT, together with P-values < 0.001, confirmed the presence of spatial autocorrelation in our data (Table A1). To address concerns about the validity of the OLS regression results in the presence of spatial autocorrelation, we followed two common approaches. First, to minimize the potential effect of spatial autocorrelation, we randomly selected 3000 points (about one-third of the total observation points) to include in our OLS regression analyses. Second, using GeoDa, an exploratory spatial data analysis tool, we ran two common spatial regression models, the spatial lag model (SLM) and the spatial error model (SEM), to compare their results with the OLS method. According to our analyses, the R-squared values of the OLS model were lower than those of the SLM and SEM models (Table A2), suggesting the absence of bias due to the autocorrelation effect in the OLS method. Moreover, no significant difference in the value and direction of the regression coefficients of the SLM, SEM, and OLS methods was observed. Based on these observations, we proceeded with the OLS model with the random selection of observation points. We note that this choice was also based on the capability of the OLS method to include zone-based categorical variables in the regression analyses, which were important for the purpose of this study.
Table A1 presents the Moran's I test using the total observation points to detect the potential spatial autocorrelation effect.
Table A2 presents the R-squared values of the SLM, SEM, and OLS methods using sampled observation points; AT, LST, and MRT as dependent variables; and street-level and population-based variables as independent variables (Table 2).
(Table B1, Table B2).
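For reference, the global Moran's I statistic used in Appendix A can be sketched as follows. This is a minimal illustration, not the GeoDa procedure used in the study: it assumes binary contiguity weights for points arranged on a line, uses hypothetical temperature values, and omits the significance test that produces the P-values reported in Table A1.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weights matrix."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations between every pair of neighbors
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Binary contiguity weights for n points on a line (adjacent points are neighbors)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


# Hypothetical temperatures (deg C) at six consecutive observation points
clustered = [30.0, 30.5, 31.0, 34.0, 34.5, 35.0]    # similar values sit together
alternating = [30.0, 35.0, 30.0, 35.0, 30.0, 35.0]  # neighbors are dissimilar

w = chain_weights(6)
i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
print(f"clustered: {i_clustered:.2f}, alternating: {i_alternating:.2f}")
```

Positive values, as in the clustered series, indicate that nearby observations tend to have similar values, which is the pattern reported for AT, LST, and MRT; negative values indicate that neighbors tend to differ.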

Appendix C. Comparison of the whole study area and high-density zone
(Table C1, Table C2).