Using remotely sensed temperature to estimate climate response functions

Temperature data are commonly used to estimate the sensitivity of many societally relevant outcomes, including crop yields, mortality, and economic output, to ongoing climate changes. In many tropical regions, however, temperature measures are often very sparse and unreliable, limiting our ability to understand climate change impacts. Here we evaluate satellite measures of near-surface temperature (Ts) as an alternative to traditional air temperatures (Ta) from weather stations, and in particular their ability to replace Ta in econometric estimation of climate response functions. We show that for maize yields in Africa and the United States, and for economic output in the United States, regressions that use Ts produce very similar results to those using Ta, despite the fact that daily correlation between the two temperature measures is often low. Moreover, for regions such as Africa with poor station coverage, we find that models with Ts outperform models with Ta, as measured by both R² values and out-of-sample prediction error. The results indicate that Ts can be used to study climate impacts in areas with limited station data, and should enable faster progress in assessing risks and adaptation needs in these regions.


Introduction
Historical data on climatic variables such as temperature and precipitation are key for understanding how human and natural systems respond to climatic change. While many global-scale gridded weather datasets do exist for this purpose [1,2] and have provided fundamental insights into climatic responses, accuracies are often limited by the underlying station data availability which can vary substantially over time and space. For instance, according to our measure of quality, defined as stations with at least 10 years of data and missing less than 30% of daily observations, high-quality station density in the Global Historical Climatology Network (GHCN) database peaked in Africa in 1976 and peaked globally in 2001 (figure 1). By 2010 the database contained just 215 high-quality weather stations in all of Africa. This combination of low spatial density of stations, and stations that go on and offline at different times, can lead to substantial measurement error in interpolated datasets which in turn can bias estimates of societal impacts [3].
An alternative and less common approach is to use satellites rather than ground-based measures to study climate variables of interest. For instance, several satellites measure surface emission of thermal energy, which can be converted into estimates of surface skin temperature (Ts), a product that the Moderate Resolution Imaging Spectroradiometer (MODIS) has provided at 1 km resolution daily for over a decade. Past studies have evaluated agreement between MODIS and weather stations on daily time scales, often finding weak correlations for daytime temperatures because factors other than Ta, such as cloudiness and soil moisture, can affect Ts [4][5][6]. However, these results could be of limited relevance for estimating how societal outcomes respond to climatic change, since estimates of societal impacts often rely on year-to-year variations in seasonally aggregated measures of temperature exposure, and correlations between station and satellite data tend to increase as the period of aggregation lengthens. For instance, in the United States, the R² value associated with regressing daytime Ts on maximum Ta is 30% higher for seasonal averages than for 8-day averages (figure 2).
Direct evaluation of Ts in societal applications thus appears warranted. Although a previous study evaluated Ts in cross-sectional regressions [7], most econometric studies rely on time-variation to identify climate response functions.
Another motivation for using Ts is that it may be a more direct measurement of the relevant temperature for certain applications. In agricultural settings, Ts measures canopy temperature, and the deviation of canopy temperature from Ta is often used as an indicator of plant water stress for drought monitoring or irrigation scheduling [8,9]. Ts may therefore better represent environmental conditions for predicting crop yields than Ta, as illustrated for wheat experiments in Europe [10].

Methods
To better understand how satellite-based temperature models could inform our understanding of societal responses to climatic change, we revisited three previous studies that had used standard measures of Ta to study impacts: maize yields in Africa [11], maize yields in the United States [12], and gross domestic product (GDP) in the United States [13]. In each of these studies, we re-estimated relationships using both Ta and Ts and compared model performance across temperature measures. We used MODIS Aqua MYD11C2 (8-day) version 5 products as our estimates of land surface temperature (Ts) for all three analyses (see table 1 for more information on data sources). MODIS 8-day composited averages have a resolution of 0.05° (approximately 5.6 km) and are available from mid-2002 to the present. We used Ts estimates from the Aqua satellite because it captures images at approximately 1:30 AM and PM local time, which more closely approximates the timing of daily temperature extremes than the Terra satellite schedule (10:30 AM and PM) [4]. Missing observations were replaced with inverse-distance-weighted averages of the nearest four non-missing cells.
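The gap-filling step described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function name `fill_missing_idw`, the weighting exponent (`power=1.0`), and the use of distances in grid-cell units instead of great-circle distances are all assumptions, since the text specifies only that the nearest four non-missing cells were averaged with inverse-distance weights.

```python
import math


def fill_missing_idw(grid, k=4, power=1.0):
    """Fill missing cells (None) with an inverse-distance-weighted mean
    of the k nearest non-missing cells.

    Assumptions (not stated in the source): distances are Euclidean in
    grid-cell units and weights are 1 / distance**power with power=1.
    """
    # Collect (row, col, value) for every cell that has an observation.
    valid = [(r, c, val) for r, row in enumerate(grid)
             for c, val in enumerate(row) if val is not None]
    filled = [list(row) for row in grid]
    for r, row in enumerate(grid):
        for c, val in enumerate(row):
            if val is not None:
                continue
            # Sort observed cells by distance to the missing cell, keep k.
            neighbours = sorted(
                ((math.hypot(r - vr, c - vc), vval) for vr, vc, vval in valid),
                key=lambda pair: pair[0],
            )[:k]
            if not neighbours:
                continue  # nothing observed anywhere: leave the gap
            weights = [1.0 / dist ** power for dist, _ in neighbours]
            filled[r][c] = (
                sum(w * v for w, (_, v) in zip(weights, neighbours))
                / sum(weights)
            )
    return filled
```

Only originally observed cells are used as donors, so one filled value never feeds into another. The brute-force neighbour search is adequate for a small tile; a full 0.05° grid would call for a spatial index.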
For each analysis, Ts measures were constructed analogously to the Ta measures that had been used in the previous studies. The Africa analysis drew on more than 15 000 historical maize trials, including 12 500 fields under optimal management and 2 500 fields under drought management conditions. Ta and precipitation data were previously interpolated from publicly available daily weather station data using thin-plate splines [11]. Following this previous work, we estimated a fixed-effects model (equation (1)) with quadratic functions of maximum temperature and total precipitation averaged over field-specific 150-day growing seasons. In equation (1), Y_ist is the natural logarithm of reported maize yield for the ith trial at field station s in year t, Tmax is maximum temperature averaged over the 150 days following planting, Pr is total precipitation around anthesis, γ is a field site fixed effect, δ is a year fixed effect, and ε is an error term. For each field site, Ts observations were constructed by taking the inverse-distance-weighted average of the nearest 9 MODIS cells. These values were then averaged over the 150-day growing period at each field site. The precipitation values interpolated from weather stations were included in both Ta and Ts regressions.
The analogous analysis in the United States drew on more than 12 000 county-year maize yield observations from USDA's National Agricultural Statistics Service and on the PRISM data set, which consists of high-resolution gridded daily maximum temperature and precipitation [14]. The regression of US maize yields on temperature is given by equation (2), in which Y_it is log yield in county i and year t, Tmax_it is the maximum temperature averaged over the approximate three-month maize growing season (JJA), JulyMax is the maximum temperature averaged across July, Pr is the total precipitation across the growing season, γ is a county fixed effect, δ is a year fixed effect, and ε is an error term.
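The display equations for the two maize models are not reproduced in the text above. The block below is a hedged reconstruction from the variable definitions alone, not the authors' typeset equations. For equation (1) the quadratic form in temperature and precipitation is stated in the text; the coefficient names β1 to β4 and the subscripts on Tmax and Pr are assumptions. For equation (2) the text lists the regressors but not their functional form, so they are written with generic functions f, g, and h, which are placeholders. Fixed effects and error terms are written as γ, δ, and ε.

```latex
\begin{align}
% Equation (1): Africa maize trials, field-site and year fixed effects
Y_{ist} &= \beta_1 \mathit{Tmax}_{ist} + \beta_2 \mathit{Tmax}_{ist}^2
         + \beta_3 \mathit{Pr}_{ist} + \beta_4 \mathit{Pr}_{ist}^2
         + \gamma_s + \delta_t + \varepsilon_{ist} \\
% Equation (2): US county maize yields, county and year fixed effects
Y_{it}  &= f(\mathit{Tmax}_{it}) + g(\mathit{JulyMax}_{it}) + h(\mathit{Pr}_{it})
         + \gamma_i + \delta_t + \varepsilon_{it}
\end{align}
```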
We estimated this simple specification in order to facilitate tractable comparison across temperature metrics and because model performance was similar between our model and more flexible growing degree day models, such as those used in [12,15]. For both Ts and Ta, grid cells were spatially aggregated to the county level using agricultural area weights and temporally averaged over JJA and July. The same PRISM precipitation values were used in both the Ta and Ts regressions.
For temperature-GDP relationships in the United States, following [13] we used nearly 35 000 county-year observations of GDP from the Bureau of Economic Analysis to estimate the regression given by equation (3), in which Y_it is per-capita GDP in county i and year t, Y_i,t-1 is lagged per-capita GDP, T^m_it is the number of days in the mth 3-degree temperature bin in county i and year t (with coefficient β_m), Pr is total precipitation in the year, γ is a county fixed effect, δ is a year fixed effect, and ε is an error term. Population weights were used to spatially aggregate MODIS and PRISM grid cells to the county level, and PRISM precipitation estimates were used in both regressions.
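The display equation for the GDP model is likewise not reproduced in the text. The following is a hedged reconstruction from the variable definitions, not the authors' typeset equation. The coefficient ρ on lagged GDP and the generic precipitation function h are assumptions; the sum over temperature bins follows from the description of T^m_it as the number of days in the mth bin, with one bin omitted as the reference category.

```latex
\begin{equation}
% Equation (3): US county per-capita GDP with binned daily temperature
Y_{it} = \rho\, Y_{i,t-1} + \sum_{m} \beta_m T^{m}_{it} + h(\mathit{Pr}_{it})
       + \gamma_i + \delta_t + \varepsilon_{it}
\end{equation}
```

Under this form, each β_m is read as the effect of one additional day in bin m relative to a day in the omitted bin.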
Estimates of equations (1) through (3) were then used to plot the climate response functions shown in figures 3 and 4. To plot the relationships shown in figure 3, splines were fit to mean impacts at each temperature level estimated by equations (1) and (2). Bootstrapped standard errors were then calculated and used to estimate 95% confidence intervals. Figure 4 shows the coefficients for each temperature bin estimated by equation (3). Because Ts has a different support from Ta, we mapped Ts to Ta by matching distribution quantiles and plotted the two response functions with Ta on the x-axis, which allows a straightforward comparison. The mapping was done by calculating 1 000 equally spaced quantiles separately for Ts and Ta, then defining a function that matched Ts quantiles to Ta quantiles. This procedure transformed the Ts distribution into the Ta distribution and allowed a simple comparison in familiar units.
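The quantile-matching procedure can be illustrated with the short sketch below. It is an assumed implementation, not the authors' code: the function names, the linear interpolation between matched quantiles, and the clamping of values outside the sampled range are choices made here for illustration.

```python
from bisect import bisect_right


def quantiles(values, n=1000):
    """Return n equally spaced empirical quantiles (min to max), using
    linear interpolation between neighbouring order statistics."""
    xs = sorted(values)
    out = []
    for j in range(n):
        pos = j * (len(xs) - 1) / (n - 1)   # fractional index into xs
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        out.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return out


def make_quantile_map(ts_sample, ta_sample, n=1000):
    """Build a function that converts a Ts value into Ta units by
    matching the n quantiles of the two samples."""
    ts_q = quantiles(ts_sample, n)
    ta_q = quantiles(ta_sample, n)

    def to_ta_units(ts):
        # Clamp values outside the observed Ts range (an assumption).
        if ts <= ts_q[0]:
            return ta_q[0]
        if ts >= ts_q[-1]:
            return ta_q[-1]
        # Locate the pair of Ts quantiles that bracket the value, then
        # interpolate linearly between the matching Ta quantiles.
        i = bisect_right(ts_q, ts) - 1
        frac = (ts - ts_q[i]) / (ts_q[i + 1] - ts_q[i])
        return ta_q[i] + frac * (ta_q[i + 1] - ta_q[i])

    return to_ta_units
```

For example, `make_quantile_map(ts_values, ta_values)(ts_bin_midpoint)` gives the Ta value at the same position in its distribution, which is what allows Ts-based coefficients to be plotted on a Ta axis.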

Results
For maize yields, we find downward-sloping responses to temperature for both temperature measures (figure 3). For the Africa analysis, Ts had slightly higher explanatory power than Ta, with R² values for Ts 2.7% and 4.2% higher, respectively, for the optimal and drought management trials examined in the original study. For the United States, the R² values are virtually the same across models, consistent with a dense and high-quality ground-station network.
To further assess model performance, we also calculated out-of-sample prediction error by repeatedly estimating the models on randomly selected 75% subsets of locations, predicting values for the 25% of locations that had been excluded from estimation, and calculating the RMSE of out-of-sample predicted values relative to actual values. For both optimal and drought management systems in Africa, we find that the model with Ts has lower out-of-sample prediction errors. However, for the United States, the prediction RMSE values are nearly identical across temperature measures. This finding is consistent with our assertion that Ts is most useful in regions with poor station coverage, where Ta is measured with significant levels of error.
While Ts predicts crop yields well, it is less clear whether it could be used to estimate response functions for non-agricultural applications. Recent research suggests that a variety of economic activities respond negatively to higher temperatures [13,16], and our findings suggest that Ts is, in fact, suitable for estimating economic responses to temperature changes. For our GDP analysis, we find similar non-linear response functions using Ta and Ts over most of the temperature support, particularly at the upper end of the temperature distribution where income appears to be most sensitive to temperature (figure 4). The R² values for the two models are similar (0.541 for the Ta model, 0.539 for the Ts model) and the out-of-sample prediction RMSE values are indistinguishable.
One apparent difference across temperature measures is that the model with Ta finds a positive effect of extreme low temperatures on income, while the model with Ts finds no effect over the same range of the temperature distribution. However, the confidence intervals for the two estimates overlap at low temperatures.
Figure 4. Per-capita GDP response to temperature in the United States. Results from estimating equation (3) following the analysis in [13]. Coefficients measure the effect of a day in a given temperature bin, relative to a day in the omitted bin (12°C), on per-capita GDP. Data include more than 34 000 county-year observations spanning 2003-2014. In order to plot air and surface temperature on the same axis, surface temperature units were transformed into air temperature units using the quantile mapping procedure described in Methods.
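The out-of-sample evaluation described in Results (repeated random 75%/25% splits by location, scored by RMSE) can be sketched as below. This is a simplified illustration under stated assumptions: the model is a pooled quadratic in temperature with no fixed effects or precipitation terms, which is not the specification estimated in the paper, and the function names, the number of repetitions, and the record layout `(location_id, temperature, outcome)` are hypothetical.

```python
import math
import random


def fit_quadratic(temps, ys):
    """Least-squares fit of y on [1, T, T^2]; returns a predictor.
    Temperature is centred before fitting for numerical stability."""
    centre = sum(temps) / len(temps)
    rows = [[1.0, t - centre, (t - centre) ** 2] for t in temps]
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    b = [a[i][3] / a[i][i] for i in range(3)]
    return lambda t: b[0] + b[1] * (t - centre) + b[2] * (t - centre) ** 2


def holdout_rmse(records, n_rep=100, train_frac=0.75, seed=0):
    """Mean RMSE over repeated splits that hold out whole locations.
    records: iterable of (location_id, temperature, outcome)."""
    rng = random.Random(seed)
    locations = sorted({loc for loc, _, _ in records})
    n_train = int(train_frac * len(locations))
    rmses = []
    for _ in range(n_rep):
        train_locs = set(rng.sample(locations, n_train))
        train = [(t, y) for loc, t, y in records if loc in train_locs]
        test = [(t, y) for loc, t, y in records if loc not in train_locs]
        predict = fit_quadratic([t for t, _ in train], [y for _, y in train])
        sq_err = [(y - predict(t)) ** 2 for t, y in test]
        rmses.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(rmses) / len(rmses)
```

Running `holdout_rmse` once with Ta-based records and once with Ts-based records for the same outcomes gives the kind of comparison reported above. Splitting by location rather than by observation keeps all observations from a held-out site out of the estimation sample.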

Discussion
Overall, we find that Ts is a suitable replacement for Ta in all three applications considered, with Ts even outperforming Ta with respect to prediction error in the Africa study, a region of low station density. Another approach to evaluating Ts performance is to compare the aggregated impacts from 1°C warming estimated with models using Ts and Ta (figure 5). In doing so, we again find similar estimates for all applications. This overall consistency is perhaps somewhat surprising, given the often low correlations between anomalies in Ts and Ta at the daily or 8-day time scale.
We view four factors as important in explaining the relative success of Ts. First, some of the 'noise' in Ts vs. Ta relationships stems from errors in the Ta measures, particularly in regions such as Africa where Ta is often interpolated from anomalies at stations tens of kilometers away. Second, much of the noise likely cancels out when aggregating temperatures to the monthly or seasonal time scales that are used in regressions that relate outcomes to temperature. For applications that require finer temporal resolution of temperature measures, the noise in Ts may become more important, although again, whether it is larger than noise in high-temporal-resolution Ta remains an empirical question. Third, unlike ground measurements, satellite data come from a consistent sensor. Relative spatial variations could therefore be captured more precisely with satellites than with ground measurements from different instruments. Fourth, in vegetated areas much of the noise in the daytime Ts vs. Ta relationship arises from anomalous canopy transpiration rates, with stressed canopies often several degrees warmer than Ta whereas healthy canopies are typically several degrees below Ta [8,10]. Thus, Ts provides a more direct measure of crop condition than Ta, and this represents an advantage of Ts for agricultural applications that may compensate for some of its deficiencies.
The substitutability of Ts for Ta suggests the potential usefulness of Ts for future study in areas with limited availability of reliable temperature data. For example, widespread surveys of health and economic activity such as the Demographic and Health Survey (DHS) and Living Standards Measurement Study (LSMS) are available in areas throughout the world with extremely poor weather station availability. Linking these measured outcomes to the MODIS Ts record, which now spans more than 13 years, will enable improved understanding of how climate trends and extremes affect human livelihoods around the world.
Figure 5. Comparing Climate Impacts Estimated with Ground Stations and Satellite Data (axis: impact of +1°C warming). Results show the average impact from 1°C warming across the sample populations for different applications. Impacts were estimated with models using air temperatures from ground stations (Ta) or surface temperatures from satellite data (Ts). Error bars show the bootstrapped 95% confidence intervals.
Environ. Res. Lett. 12 (2017) 014013