Microclimate Refugia: Comparing Modeled to Empirical Near-Surface Temperatures on Rangeland

.


Introduction
Near the Earth's surface, ambient temperature can vary on very small spatial scales, and animals exploit microhabitat variations in temperature to find refuge from extreme temperatures, locate optimum temperatures in spaces that enhance reproduction, and modify many other key fitness components that are temperature dependent [1,2]. Microhabitat variations in temperature have been identified as a means by which organisms can escape the negative impacts of climate change [3][4][5]. Refugia from extreme temperatures are important for the survival and conservation of species in decline (e.g., spittlebugs [6]). Microhabitats also provide refuges for pest populations and beneficials in agricultural regions [7,8]. To reduce pest populations, microhabitats can be managed in agroecosystems (e.g., fruit flies [9]).
Refugia make characterizing the availability of microhabitats on a broad spatial and temporal scale important. Standard meteorological data are collected at 1.5 to 2 m above ground to assess historical climate and predict future climate change. However, many biological processes occur much nearer to the ground level, where the monitoring of temperatures is sparse [10]. Biophysical models, such as NichMapR [11], use data from Geographies 2023, 3 345 meteorological stations to broadly estimate temperature and other physical parameters across space and time that fill gaps in empirical data. Modeled results require validation in the geographic space and seasons that are relevant to the biological processes under investigation.
Microclimate is biophysically associated with specific attributes of topography [12], but the effect of assemblages of topographic attributes on a local climate is much less studied. For example, topographic variation results in microhabitat heterogeneity that may modify the climate through local feedback in aggregate (reradiation, water vapor, evapotranspiration, etc.). As a result, topographically induced climate may be different from what it would be if similar attributes were found over a wider area, resulting in greater homogeneity [12].
The expansion and migration patterns of Mormon cricket Anabrus simplex populations are of particular interest due to their potential dependence on refugia. Cowan [13][14][15] hypothesized that Mormon cricket populations persist in topographically variable refuges (also known as 'hold-over areas' or 'endemic centers'). One effect of topographic variability on climate is to provide greater microhabitat diversity. Some microhabitats have favorable conditions and provide refuge when the climate is generally stressful, particularly in areas with less variable topography. Hence, endemic centers are heterogeneous in microhabitats, relative to homogeneous areas that are more directly defined by climate, and are prone to shifts with climatic extremes. Heterogeneous areas relevant to Mormon crickets are typically mountain ranges and canyonlands in the western United States, whereas homogeneous areas are plateaus and valleys. Cowan hypothesized that Mormon cricket populations persist in heterogeneous areas, from which they migrate to encounter homogeneous topography when climate conditions are favorable. This temporal heterogeneity in climate can lead to periods of population expansion and aggregation into bands, followed by periods of population contraction and persistence in areas of high topographic variability. Coincidentally, homogeneous areas are also more likely to have roads and be suitable for farming, ranching, and, in modern times, irrigated cropland and tourism, all of which increase the impacts of Mormon crickets on human activities and the local economy.
In this paper, we report near surface soil and air temperatures measured in Utah and Colorado and compare them to modeled temperatures for the same locations and times. We then investigate the effects of slope and aspect on discrepancies between modeled and empirical data.

Study Sites
In June 2012, we set up five arrays to monitor temperature at each of two regions: a set of five arrays in the Mineral Mountains near Beaver in southern Utah and a set of five in the Uintah Mountains near Vernal in northern Utah and Colorado (although two sites were technically in Colorado, we call the latter region northern Utah; Figure 1). All sites were rangeland, where livestock graze and where Mormon crickets had been observed banding within the past five years ( Figure 2). Mormon crickets are large (approximately 2 g as adults) omnivorous katydids that do not have functional wings for flight. They migrate on the ground and lay eggs in soil at approximately 2.5 cm beneath ground level. Accordingly, we monitored temperatures at approximately 2.5 cm above and below ground level (T air and T soil , respectively).
In each array, a wooden stake (4 × 2 × 30 cm) was erected at a central location, and a stake was placed 50 m away from the central stake in each of the four cardinal directions (Figure 1). Recording T air and humidity, a hygrochron ibutton (DS1923, Maxim Integrated, San Jose CA, USA; diameter 15 mm, height 6 mm) on a plastic fob was screwed to the north side of the stake 2.5 cm above ground level. Recording T soil , a thermochron ibutton (DS1922) on a plastic fob was attached to the stake with a 38 cm wire and buried the length of the wire away at 2.5 cm beneath the surface. For each stake, latitude, longitude, and elevation were recorded (Topcon GPS with sub-meter accuracy), the steepest slope of a 10 × 4 × 165 cm wooden board centered at the stake was measured with an inclinometer (Ranger 15T, Silva-USA, Sandy, UT, USA), and the aspect of the slope was classified to eight magnetic compass points (declination ca. +10 • ). In total, there were 25 air sensors and 25 soil sensors placed in southern Utah and 25 of each placed in northern Utah. T air and T soil were logged at 7700 s, the maximum rate to record one year of data before overwriting. Sensors were downloaded annually in September for five years in southern Utah. Arrays in northern Utah were downloaded annually for nine years and continue to be monitored. In each array, a wooden stake (4 × 2 × 30 cm) was erected at a central location, and a stake was placed 50 m away from the central stake in each of the four cardinal directions (Figure 1). Recording Tair and humidity, a hygrochron ibutton (DS1923, Maxim Integrated, and 25 soil sensors placed in southern Utah and 25 of each placed in northern Utah. Tair and Tsoil were logged at 7700 s, the maximum rate to record one year of data before overwriting. Sensors were downloaded annually in September for five years in southern Utah. Arrays in northern Utah were downloaded annually for nine years and continue to be monitored. A proportion of ibuttons stopped logging each year because of corrosion, battery failure, and other causes, resulting in loss of the data for the entire year. In 2016, 46% of the ibuttons had stopped logging, and we did not analyze data for that year because of the gaps. We also terminated the arrays in southern Utah in September 2016. Occasionally, stakes broke, or soil monitors came out of the ground, typically due to livestock activities or thawing. We inspected the data and compared them to neighboring sensors to determine anomalies indicating when the accident occurred, and, when we felt confident, we recovered data prior to the accident. As both southern and northern Utah were monitored between 2011 and 2015, we analyzed data from those years separately from data collected in 2017-2020 when only northern Utah was monitored.

Modeling Tair and Tsoil
Tair and Tsoil at each stake were modeled with NichMapR version 3.2.0 [11,16]. For the base model, we entered the latitude, longitude, elevation, slope, and aspect of each site into the NichMapR package using the micro_usa() function. For Tair, we set the height to A proportion of ibuttons stopped logging each year because of corrosion, battery failure, and other causes, resulting in loss of the data for the entire year. In 2016, 46% of the ibuttons had stopped logging, and we did not analyze data for that year because of the gaps. We also terminated the arrays in southern Utah in September 2016. Occasionally, stakes broke, or soil monitors came out of the ground, typically due to livestock activities or thawing. We inspected the data and compared them to neighboring sensors to determine anomalies indicating when the accident occurred, and, when we felt confident, we recovered data prior to the accident. As both southern and northern Utah were monitored between 2011 and 2015, we analyzed data from those years separately from data collected in 2017-2020 when only northern Utah was monitored.

Modeling T air and T soil
T air and T soil at each stake were modeled with NichMapR version 3.2.0 [11,16]. For the base model, we entered the latitude, longitude, elevation, slope, and aspect of each site into the NichMapR package using the micro_usa() function. For T air , we set the height to 2.5 cm. Other parameters were set to default values, including no shading by surrounding topography, hydraulic properties of loam-textured soil, and surface reflectivity of 0.15 [16]. We lacked slope at four sites, which were set to 0. Output included T air , T soil at 0 and 2.5 cm, and snow depth each hour. In order to compare the modeled temperatures with empirical data that were collected less frequently, we used linear interpolation to estimate modeled temperature at the empirical time. We then calculated the root mean square deviance (RMSD base ) between modeled and empirical data sets. We also applied leastsquares regression to relate empirical to modeled temperatures and report the intercept, slope, and variance explained (R 2 ). In order to determine whether the two measures of deviance were directly proportional, we correlated the intercepts with RMSD base .

Additions to the Base Model
Using the elevatr() package in R (v. 4.0), we used elevatr::get_elev_raster to collect a 5 × 5 raster around each stake. Digital elevations were extracted from the raster at 1/3 arc second or approximately 10 m resolution. Vertical accuracy of the digital elevation model was 0.35 m when compared with LIDAR surveyed points [17]. Slope and aspect were calculated using these small rasters at each stake with the raster::terrain() command.
Using the same 5 × 5 raster, we adjusted the base model at each stake for topographic shading by local features. Horizon angle was calculated at each of 36 compass points (every 10 • ) using the digital elevation at the stake and the elevation of the surrounding raster points. From these 36 points, the horizon angles at 24 compass points were interpolated and added to the 'hori' parameter in micro_usa. The RMSD between empirical and modeled temperatures (RMSD hori ) was calculated and compared with RMSD base .
For each stake, we adjusted the base model for five soil parameters downloaded for Utah and Colorado (for the two sites in Colorado located a few km east of the Utah state border) from the SSURGO database of the National Resources Conservation Service (usda.nrcs.gov). The map unit key (mukey) closest to each stake was extracted using the soilDB::SoilWeb_spatial_query command. From the nearest soil horizon data, a weighted mean of each soil parameter for the top 10 cm of the horizon was calculated and added to nichMapR: albedo (called REFL in micro_usa), bulk density (BD), air entry potential (PE), saturated conductivity (KS), and Campbell's soil b (BB). For calculating RMSD with the soil parameters (RMSD soil ), we focused on one site per array, except for one array, at which we modeled all five sites. RMSD soil was compared with RMSD from the base model. In the base model, roughness length was 0.004 m, which was an intermediate value for tilled soils (0.002 to 0.006 m [18]). Sagebrush, which was found on all of the arrays in northern Utah, had a roughness length of 0.02 m [19]. The roughness length for grasses was also 0.02 m for a canopy of 0.07 to 0.14 m [20], which was a reasonable range of heights for the grasses that dominate arrays in southern Utah. Hence, we modeled the air and soil temperatures with a roughness length of 0.02 m, calculated the RMSD (RMSD ruf ), and compared these with RMSD base for both soil and air.
NichMapR accounted for snow depth when modeling T soil but not T air . We created a hybrid model that replaced T air at 2.5 cm with T soil at 0 cm depth when the snow depth output from NichMapR was 3 cm or greater (i.e., sufficient to cover the sensor). RMSD between empirical and modeled temperatures (RMSD hybrid ) was then calculated and compared with RMSD from the base model.

Results
For the years 2012-2015, the RMSD between the sensors and the base model (RMSD base ) was not significantly different from a normal distribution (Shapiro-Wilk W = 0.97, p = 0.0501). The empirical T air differed from the modeled T air (Figures 3-5) by more than the empirical T soil differed from the modeled T soil (F 1,84 = 148.8, p < 0.0001; average RMSD base for T air : 7.4 • C, n = 49; for T soil : 4.4 • C, n = 37; . Using least-squares regression to estimate the linear relationship between empirical and modeled air temperatures for each site (e.g., Figure 4), the RMSD base was positively correlated with the intercepts (Pearson's r = 0.67, p < 0.0001, Table S1).
temperatures for each site (e.g., Figure 4), the RMSDbase was positively correlated with the intercepts (Pearson's r = 0.67, p < 0.0001, Table S1).        Modeled snow depth, which is also a factor in modeling soil temperature, is overlaid as a histogram. The difference between the two data sets (RMSDbase = 4.53 °C) at this site is near average for soil temperature.  Modeled snow depth, which is also a factor in modeling soil temperature, is overlaid as a histogram. The difference between the two data sets (RMSDbase = 4.53 °C) at this site is near average for soil temperature.

Effects of Digital Elevation Model Estimates of Slope and Aspect
The RMSD with the digital elevation model measurements of slope and aspect (RMSDDEM) was not significantly different from a normal distribution (n = 20, Shapiro-Wilk W = 0.95, p = 0.41). The RMSDDEM was not affected by whether the sensor was in air or soil (ANCOVA, F = 0.3, p = 0.59) and resulted in very little change relative to the RMSD of the base model (RMSDDEM = 0.20 + 0.98 × RMSDbase, R 2 = 0.99, p < 0.0001). Although the slope derived from the digital elevation model (DEM) increased less steeply than the slope measured locally (slopeDEM = 2.4 + 0.4 × slopelocal), the modeled temperatures constructed

Effects of Digital Elevation Model Estimates of Slope and Aspect
The RMSD with the digital elevation model measurements of slope and aspect (RMSDDEM) was not significantly different from a normal distribution (n = 20, Shapiro-Wilk W = 0.95, p = 0.41). The RMSDDEM was not affected by whether the sensor was in air or soil (ANCOVA, F = 0.3, p = 0.59) and resulted in very little change relative to the RMSD of the base model (RMSDDEM = 0.20 + 0.98 × RMSDbase, R 2 = 0.99, p < 0.0001). Although the slope derived from the digital elevation model (DEM) increased less steeply than the slope measured locally (slopeDEM = 2.4 + 0.4 × slopelocal), the modeled temperatures constructed

Effects of Digital Elevation Model Estimates of Slope and Aspect
The RMSD with the digital elevation model measurements of slope and aspect (RMSD DEM ) was not significantly different from a normal distribution (n = 20, Shapiro-Wilk W = 0.95, p = 0.41). The RMSD DEM was not affected by whether the sensor was in air or soil (ANCOVA, F = 0.3, p = 0.59) and resulted in very little change relative to the RMSD of the base model (RMSD DEM = 0.20 + 0.98 × RMSD base , R 2 = 0.99, p < 0.0001). Although the slope derived from the digital elevation model (DEM) increased less steeply than the slope measured locally (slope DEM = 2.4 + 0.4 × slope local ), the modeled temperatures constructed with digital elevations measured remotely were not much different than those modeled with ground measurements of the slope and aspect.

Shading Effects Due to Surrounding Topography
The RMSD for the horizon modification to the base model (RMSD hori ) was significantly different from a normal distribution (n = 86, Shapiro-Wilk W = 0.97, p = 0.045). Thr RMSD hori resulted in very little change relative to RMSD base for both T air (RMSD hori = −0.12 + 1.01 × RMSD base ) and T soil (RMSD hori = 0.01 + 0.99 × RMSD base ).

Effects of Soil Parameters from Empirical Soil Database SSURGO
The RMSD for the soil parameter modification to the base model was not significantly different from a normal distribution (n = 27, Shapiro-Wilk W = 0.96, p = 0.48). The RMSD for the soil parameter modification (RMSD soil ) was significantly related to the RMSD base (F = 54.2, p < 0.0001) and also affected by the sensor location (F = 13.6, p = 0.0011; air: 8.0, soil: 9.6 • C). Incorporating mapped soil parameters made modeled temperatures less similar to empirical data than the base model, with the relation between modeled soil temperature and empirical data particularly worsening (on average, T air models were 19% poorer, whereas T soil models were 86% poorer than the RMSD base model).

Effects of Roughness Length
For the years 2012-2015, the RMSD between the sensors and the temperatures modeled with the roughness length modification (RMSD ruf ) was not significantly different from a normal distribution for air or soil (Shapiro-Wilk W = 0.97, p > 0.31). The RMSD ruf resulted in very little change relative to the RMSD base for both T air (RMSD ruf = −0.23 + 1.07 × RMSD base ) and T soil (RMSD ruf = −0.02 + 0.99 × RMSD base ). On average, the RMSD ruf was 0.3 • C (3.7%) poorer for air temperature and 0.1 • C (1.7%) improved for soil temperature relative to RMSD base .

Effects of Snow on Air Temperature
The RMSD for the hybrid modification to the base model (RMSD hybrid ), which only modified T air , was not significantly different from a normal distribution (n = 49, Shapiro-Wilk W = 0.99, p = 0.98). The RMSD hybrid was linearly related to RMSD base and showed an improvement of 8% on average (RMSD hybrid = 1.09 × RMSD base −1.23, R 2 = 0.82, p < 0.0001). Forty-two sites saw an improvement between model temperature and empirical T air , with the greatest change in the RMSD being 31% (Figure 9), whereas only seven sites saw agreement between model temperature and empirical T air worsen, with the greatest by 9% ( Figure 10). Both the RMSD hybrid for T air and the RMSD base for T soil were independent of the sample size (p = 0.652 and 0.961, respectively). The RMSD hybrid was positively correlated with the intercept of the regression of empirical temperature on the hybridmodeled T air (Pearson's r = 0.57, p < 0.0001).
As the hybrid model resulted in a substantial improvement in the simulated T air data, we investigated the effects of aspect and slope on model and empirical data agreement using the RMSD hybrid for T air and the RMSD base for T soil . The aspect was categorized into four compass points: 316-45 • = N, 46-135 • = E, 136-225 • = S, and 226-315 • = W. For 2012-2015, the RMSD base was directly proportional to slope, whereas the RMSD base did not vary among aspects (ANCOVA: slope F 1,21 = 10.8, p = 0.0035; aspect F 3,21 = 0.67, p = 0.578; after interaction p = 0.39 was pooled with error). The addition of flat areas as a fifth category did not change the result qualitatively (slope F 1,29 = 8.6, p = 0.0064; aspect F 4,29 = 0.58, p = 0.678, interaction pooled with error). The RMSD hybrid was independent of slope, whereas it was significantly affected by the aspect (ANCOVA: slope F 1,30 = 0.34, p = 0.567; aspect F 3,30 = 4.50, p = 0.010; after interaction p = 0.58 was pooled with error). The post hoc multiple comparison of the means indicated that the empirical and modeled T air agreed best on the northern aspects and worst on the eastern aspects (E a = 7.8, W a = 7.2, S a = 6.9, N b = 5.8 • C). The addition of flat areas resulted in the RMSD hybrid being independent of the slope and aspect (ANCOVA: slope F 1,39 = 0.54, p = 0.468; aspect F 4,39 = 2.52, p = 0.056; interaction pooled with error).  . In contrast to Figure 9, which shows the location where the hybrid model improved the fit to empirical data the greatest, the hybrid model worsened the fit of the base model to empirical data at Upper Rye in northern Utah the most (RMSDhybrid = 7.81 vs. RMSDbase = 7.18, n = 4095). As in Figure 9, empirical data in black, base model data in grey, and soil temperature data in red substituted for the base model when snow depth was 3 cm or greater.
As the hybrid model resulted in a substantial improvement in the simulated Tair data, we investigated the effects of aspect and slope on model and empirical data agreement using the RMSDhybrid for Tair and the RMSDbase for Tsoil. The aspect was categorized into four compass points: 316-45° = N, 46-135° = E, 136-225° = S, and 226-315° = W. For 2012-2015, the RMSDbase was directly proportional to slope, whereas the RMSDbase did not vary among aspects (ANCOVA: slope F1,21 = 10.8, p = 0.0035; aspect F3,21 = 0.67, p = 0.578; after interaction p = 0.39 was pooled with error). The addition of flat areas as a fifth category  . In contrast to Figure 9, which shows the location where the hybrid model improved the fit to empirical data the greatest, the hybrid model worsened the fit of the base model to empirical data at Upper Rye in northern Utah the most (RMSDhybrid = 7.81 vs. RMSDbase = 7.18, n = 4095). As in Figure 9, empirical data in black, base model data in grey, and soil temperature data in red substituted for the base model when snow depth was 3 cm or greater.
As the hybrid model resulted in a substantial improvement in the simulated Tair data, we investigated the effects of aspect and slope on model and empirical data agreement using the RMSDhybrid for Tair and the RMSDbase for Tsoil. The aspect was categorized into four compass points: 316-45° = N, 46-135° = E, 136-225° = S, and 226-315° = W. For 2012-2015, the RMSDbase was directly proportional to slope, whereas the RMSDbase did not vary among aspects (ANCOVA: slope F1,21 = 10.8, p = 0.0035; aspect F3,21 = 0.67, p = 0.578; after interaction p = 0.39 was pooled with error). The addition of flat areas as a fifth category Figure 10. In contrast to Figure 9, which shows the location where the hybrid model improved the fit to empirical data the greatest, the hybrid model worsened the fit of the base model to empirical data at Upper Rye in northern Utah the most (RMSD hybrid = 7.81 vs. RMSD base = 7.18, n = 4095). As in Figure 9, empirical data in black, base model data in grey, and soil temperature data in red substituted for the base model when snow depth was 3 cm or greater.
For 2012-2015, the RMSD hybrid was significantly different among the topographic assemblage as defined by the array of sensors (F 9,39 = 2.16, p = 0.0469). Little Rock had the greatest RMSD hybrid and Dinosaur 2 the least (Figure 11). The arrays were significantly different in the RMSD base for T soil (F 9,27 = 3.67, p = 0.0041), with Gilles Hill 2 having the greatest RMSD and Dinosaur 2 the least (Table 1). For 2017-2020, there was no significant relation between the RMSD hybrid and the array (F 4,20 = 1.42, p = 0.265), but the arrays were significantly different in the RMSD base for T soil (F 4,20 = 3.85, p = 0.0187). Upper Rye had the greatest RMSD and Lower Rye the least (Table 1).
Geographies 2023, 3, FOR PEER REVIEW 12 Figure 11. For the years 2012 to 2015, RMSDhybrid for air temperature (a) and RMSDbase for soil temperature (b) were significantly different among arrays. Little Rock in southern Utah had the greatest discrepancy between empirical and modeled air temperature, whereas Dinosaur 2 in northern Utah showed the least difference. Gilles Hill 2 in southern Utah had the greatest discrepancy between empirical and modeled soil temperature, whereas Dinosaur 2 showed the least difference. Figure 11. For the years 2012 to 2015, RMSD hybrid for air temperature (a) and RMSD base for soil temperature (b) were significantly different among arrays. Little Rock in southern Utah had the greatest discrepancy between empirical and modeled air temperature, whereas Dinosaur 2 in northern Utah showed the least difference. Gilles Hill 2 in southern Utah had the greatest discrepancy between empirical and modeled soil temperature, whereas Dinosaur 2 showed the least difference.

Discussion
Many organisms are affected by temperatures within 2.5 cm of the Earth's surface both above and below ground [21,22]. Physical modeling of near-surface temperature between weather stations is logistically feasible, but we found that empirical data from two regions of Utah and Colorado differed from temperature data modeled with NichMapR by an average of 7.4 • C for air temperature and 4.4 • C for soil temperature. The modification of default parameters resulted in little improvement in error, suggesting that the modeled temperatures were remarkably insensitive to changes. The incorporation of snow cover into the modeled T air generally made the modeled data more similar to the empirical temperatures (an average difference of 6.9 • C). These hybrid models were the most effective when NichMapR accurately modeled snow depth. In addition to regional variation in snowfall, snow depth was also affected by drift and other sources of wind-induced error. Topographic variation undoubtedly affected snow accumulation and melting.
The difference between the empirical and modeled temperatures was larger in T air than T soil . In early morning and late evening hours when the air sensor was not in the shade provided by the stake, heating of the silver metal sensor housing by sunlight [23] would introduce error to the empirical T air but not T soil and potentially contribute to the 2.5 • C difference in RMSD between the hybrid model for T air and that for T soil . As an aside, ground surface temperatures can be extremely inhospitable to an insect. In the tracking of a migratory band of Mormon crickets in Nevada in June 2008 [24], surface temperature, measured with an Omega thermocouple thermometer, exceeded 50 • C when the ambient temperature at 1.5 m above ground was only 28.3 • C and reached 59.6 • C when ambient temperature was 31.8 • C. These hand-collected data support the maximum temperatures recorded by the air sensors in Utah, which occasionally exceeded 60 • C (Figures 3, 4 and 10).
For the years 2012-2015, the empirical and modeled T air were most alike on the northern aspect, where shading was greatest. In contrast, errors in T soil increased with slope, which might have resulted from some loss of soil over the sensor due to erosion. Contrarily, these differences may have been due to the adverse effects of the slope and aspect on the accuracy of the modeled data themselves. Differential effects of the slope and aspect on T soil and T air , whether modeled or empirical, may explain why the RMSD base for T soil was not associated with the RMSD hybrid for T air (n = 36, r = 0.264, p = 0.12).
The shading from vegetation may have also contributed to differences between the empirical and modeled temperatures. In addition to the modeled temperatures, assuming 0% vegetation cover that we reported, NichMapR generated an output for 90% vegetation cover. However, most of the sites that we sampled were sparse in vegetation and were much nearer to the assumption of 0% cover. Even sagebrush and juniper (e.g., Figure 2) were spaced so that the canopy over the sensors was typically open. The average area of bare ground in a 50 × 50 cm square encompassing the air and soil temperature sensors in northern Utah was 81% (Figure 12). Although the temperature for vegetation cover in between 0 and 90% cover could be calculated by interpolation, the addition of any vegetation cover would decrease the modeled temperature and thus increase the RMSD between the modeled and empirical temperatures. The canopy photos that track the sun's path provided precise measurements of incident sunlight, but we did not deem those to be necessary in the sparse vegetation. In understanding the role of topography in climate refugia, modeled data would be the least biased if slope or aspect had no effects on their agreement with the empirical temperatures. Differences between empirical and modeled data for the years 2017-2020 were not affected by slope or aspect. Similarly, when flat areas were included in the empirical and modeled data for the years 2012-2015, neither slope nor aspect affected the RMSDhybrid or RMSDbase, respectively. Hence, over a broadened topography, modeled temperatures were unbiased by slope and aspect in two discrete, four-year time spans.
For the years 2012-2015, the difference between the empirical and modeled soil temperature was greatest at Gilles Hill 2 and least at Dinosaur 2. These two sites also had significantly different RMSDhybrid, with Dinosaur 2 being the lesser. The two topographic assemblages were very different: Gilles Hill 2 is a mountain grassland on volcanic soil in southern Utah, whereas Dinosaur 2 is a rocky outcropping on sedimentary soil in Dinosaur National Monument. The sites at Gilles Hill 2 face north or south, with steep slopes between 20 and 21°, whereas those in Dinosaur 2 face north or south, with shallower slopes ranging between 0 and 8°. For the years 2017-2020, Upper Rye had the greatest difference between empirical and modeled soil temperature, and Lower Rye had the least. However, these two topographic assemblages are very similar; both are relatively flat (facing east, south, or west from 0-7° for Lower Rye and facing east or south from 0-6° for Upper Rye), dark-soil grasslands interspersed with sagebrush on the Diamond Plateau in northern Utah, which makes the difference in the RMSDbase difficult to explain.
Despite the differences in the modeled and empirical temperatures that we highlighted over a nearly ten-year time period in Utah, the advantages of the modeled data far outweighed the inaccuracies when modeling demographics, range shifts, and other events on a regional scale. For example, the Mormon cricket outbreak of the first decade of this In understanding the role of topography in climate refugia, modeled data would be the least biased if slope or aspect had no effects on their agreement with the empirical temperatures. Differences between empirical and modeled data for the years 2017-2020 were not affected by slope or aspect. Similarly, when flat areas were included in the empirical and modeled data for the years 2012-2015, neither slope nor aspect affected the RMSD hybrid or RMSD base , respectively. Hence, over a broadened topography, modeled temperatures were unbiased by slope and aspect in two discrete, four-year time spans.
For the years 2012-2015, the difference between the empirical and modeled soil temperature was greatest at Gilles Hill 2 and least at Dinosaur 2. These two sites also had significantly different RMSD hybrid , with Dinosaur 2 being the lesser. The two topographic assemblages were very different: Gilles Hill 2 is a mountain grassland on volcanic soil in southern Utah, whereas Dinosaur 2 is a rocky outcropping on sedimentary soil in Dinosaur National Monument. The sites at Gilles Hill 2 face north or south, with steep slopes between 20 and 21 • , whereas those in Dinosaur 2 face north or south, with shallower slopes ranging between 0 and 8 • . For the years 2017-2020, Upper Rye had the greatest difference between empirical and modeled soil temperature, and Lower Rye had the least. However, these two topographic assemblages are very similar; both are relatively flat (facing east, south, or west from 0-7 • for Lower Rye and facing east or south from 0-6 • for Upper Rye), dark-soil grasslands interspersed with sagebrush on the Diamond Plateau in northern Utah, which makes the difference in the RMSD base difficult to explain. Despite the differences in the modeled and empirical temperatures that we highlighted over a nearly ten-year time period in Utah, the advantages of the modeled data far outweighed the inaccuracies when modeling demographics, range shifts, and other events on a regional scale. For example, the Mormon cricket outbreak of the first decade of this century encompassed 7.2 million acres, an area the size of West Virginia, spread over six states in the western United States [25,26], and it is not yet possible to measure near-surface microclimate on that broad a spatial scale. However, an understanding of biological phenomena will be much more accurate with empirical measures of T air and T soil by an array of sensors on a local scale and the near-surface empirical data expanded to landscape and regional scales with the remote sensing of topography [27]. For example, we are currently investigating the egg diapause of Mormon crickets, which can be prolonged for several years (Figure 12, [28,29]), while simultaneously measuring T air and T soil with the sensor arrays in northern Utah and Colorado described in this paper. Empirical temperatures give the most accurate measurements of heat accumulation over time for predicting temperature-dependent biological processes, such as degree days for egg development in soil. Despite differences with empirical temperatures, temperatures modeled with nichMapR at the height of the organism are much more accurate than simply using temperatures from weather stations at 1.5 to 2 m above the ground surface for calculating degree-day accumulation.
Future projections of spatial and temporal variations in microclimate availability can only be modeled from macroclimate projections. In order to understand the ecological consequences of climate change, biophysical models that downscale climate to ecologically relevant spatial and temporal scales need to be created to understand changes in niche structure [30]. To achieve the widest applications possible, biophysical models need to be validated with empirical data from a wide variety of altitudes, latitudes, soil types, topographies that organisms currently inhabit, and where their ranges might expand to in the future. This study of two regions of Utah contributes to that effort.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/geographies3020018/s1, Table S1: RMSD values and regressions comparing modeled to empirical temperatures. Funding: Funding was provided by Congressional appropriation to USDA-ARS, and by Kent State University to P.D.L.

Data Availability Statement:
The datasets used and/or analyzed during the present study are available from the corresponding author on reasonable request.