Global evaluation of terrestrial near-surface air temperature and specific humidity retrievals from the Atmospheric Infrared Sounder (AIRS)

Corresponding author: Kaighin A. McColl (kmccoll@seas.harvard.edu)

Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
Department of Earth and Planetary Sciences, Harvard University, Cambridge, MA 02138, USA
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138, USA
Center for Excellence in Tibetan Plateau Earth Sciences and National Tibetan Plateau Data Center, Institute of Tibetan Plateau Research, Chinese Academy of Sciences, Beijing 100101, China
NASA-GSFC Hydrological Sciences Laboratory, Greenbelt, MD, USA


In this section, we describe the datasets used in this study, detail how they are compared and deseasonalized, and give an overview of TC.

[...] where the overbar denotes the sample mean, and the hatted terms are the relative additive and multiplicative biases, respectively. For simplicity of notation, we drop the hat symbol from the relative bias terms for the remainder of the manuscript.

The multiplicative bias can be interpreted as the temporal 'sensitivity' of the measurement to the underlying target variable T: a small multiplicative bias results in small temporal fluctuations in the measurement X even for large temporal fluctuations in T. The term 'sensitivity' has different meanings in different contexts. In this work, the AIRS temporal 'sensitivity' refers to the multiplicative bias estimated for the Level 3 AIRS product in its current form. It does not refer to the sensitivity of the AIRS instrument. For example, the Level 3 AIRS product may exhibit lower temporal sensitivity to the observed temperature than the AIRS-observed radiances due to artifacts of the retrieval algorithm or other processing. We also distinguish 'temporal sensitivity' (estimated in this study) from 'vertical sensitivity', which is a measure of the spatial (vertical) resolution of the AIRS profile (Maddy and Barnet, 2008). This study focuses solely on AIRS near-surface products, and therefore does not evaluate vertical sensitivity.

Like all validation metrics, quantities estimated by TC are subject to sampling error. We used bootstrapping (Efron and Tibshirani, 1994; chapter 6) with 5000 replicates to quantify the uncertainty in estimates of the relative additive and multiplicative biases. When plotting the additive bias, estimates with a 95% confidence interval that overlapped zero were manually set equal to zero. When plotting the multiplicative bias, estimates with a 95% confidence interval that overlapped one were manually set equal to one.
This ensures that reported non-zero estimates of the additive bias and non-unity estimates of the multiplicative bias are unlikely to be artifacts of sampling error. In addition, if any TC-estimated random-error variance was negative, or any TC-estimated coefficient of determination R² was negative or greater than one, it was discarded. Similarly, in rare cases in which estimates of the multiplicative bias were negative or greater than two, they were discarded, along with the corresponding additive bias. These values can arise if sampling error is significant or if one of the assumptions of TC is violated.

Since the primary focus of this study is on the error statistics of the AIRS products, rather than the HadISD or reanalysis products, we simplify our notation for the remainder of the study. Specifically, rather than carrying product-specific subscripts, we label the standard deviation of the random error and the coefficient of determination for the AIRS products with '(AIRS)', for example R²(AIRS). The relative additive and multiplicative biases of the AIRS products are labelled in the same way.
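The estimation and screening steps described above can be illustrated with a minimal sketch. It assumes the standard covariance-based TC estimators under a linear error model with mutually uncorrelated errors, treats the first series as the scaling reference, and uses a simple percentile bootstrap that resamples time steps independently. The function names (`tc_estimates`, `bootstrap_ci`, `screen`), the choice of reference, and the percentile interval are illustrative assumptions and are not taken from the study itself.

```python
import numpy as np

def tc_estimates(ref, x, y):
    """Covariance-based TC estimates for product x (assumed estimator form).

    Returns [relative additive bias, relative multiplicative bias,
    random-error variance, coefficient of determination] for x,
    with biases expressed relative to the reference series `ref`.
    """
    q = np.cov(np.vstack([ref, x, y]))
    q_rx, q_ry, q_xy, q_xx = q[0, 1], q[0, 2], q[1, 2], q[1, 1]
    signal = q_rx * q_xy / q_ry          # variance of x explained by the target
    err_var = q_xx - signal              # random-error variance of x
    r2 = signal / q_xx                   # squared correlation of x with the target
    b = q_xy / q_ry                      # multiplicative bias of x relative to ref
    a = x.mean() - b * ref.mean()        # additive bias of x relative to ref
    return np.array([a, b, err_var, r2])

def bootstrap_ci(ref, x, y, n_boot=5000, seed=0):
    """95% percentile bootstrap intervals (assumes independent time steps)."""
    rng = np.random.default_rng(seed)
    n = len(ref)
    reps = np.empty((n_boot, 4))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample time steps with replacement
        reps[i] = tc_estimates(ref[idx], x[idx], y[idx])
    lo, hi = np.percentile(reps, [2.5, 97.5], axis=0)
    return lo, hi

def screen(est, lo, hi):
    """Apply the screening rules; returns None if the estimates are discarded.

    Simplification: any invalid value discards the whole set of estimates.
    """
    a, b, err_var, r2 = est
    if err_var < 0 or not (0 <= r2 <= 1) or not (0 <= b <= 2):
        return None
    if lo[0] <= 0 <= hi[0]:   # additive bias indistinguishable from zero
        a = 0.0
    if lo[1] <= 1 <= hi[1]:   # multiplicative bias indistinguishable from one
        b = 1.0
    return a, b, err_var, r2
```

In this sketch, `ref` would play the role of the station observations, `x` the AIRS retrievals, and `y` a reanalysis, all as deseasonalized, collocated time series of equal length. Resampling individual time steps ignores any temporal autocorrelation in the series, which is a further simplifying assumption.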

Results
In this section, we present the major results of the triple collocation validation analysis of AIRS retrievals of near-surface air temperature and specific humidity. The estimated coefficient of determination R²(AIRS) is relatively high at mid- and high-latitudes for both air temperature and specific humidity ([...] 6) and specific humidity (Fig. 7).

Maps of relative additive and multiplicative biases (Figs. 8 and 9, respectively) are presented, again, for retrievals of both near-surface air temperature and specific humidity. In most parts of the world, relative additive biases are indistinguishable from zero for AIRS retrievals of specific humidity (Fig. 8b). For air temperature, they are negative in most parts of the world, and are most negative in the eastern United States and Europe (Fig. 8a). In the tropics, the relative additive bias is closer to zero. The relative multiplicative bias is less than one for AIRS retrievals of both air temperature [...]

There is some concern in the use of TC in this study that its assumptions are violated by including reanalyses, which ingest AIRS observations, perhaps inducing error correlations between measurements that are assumed to be zero by TC. In addition, the AIRS retrievals include a component based on a neural net trained on ECMWF reanalysis (Blackwell and Milstein, 2014; Milstein and Blackwell, 2016). Near the surface, the AIRS retrieval may be substantially influenced by the reanalysis training set, again potentially creating error correlations between measurements that violate the assumptions of TC. In Appendix A, we demonstrate that error cross-correlation between the AIRS retrieval and the reanalyses is unlikely to explain the estimates of lower AIRS multiplicative bias in the tropics (Fig. 9). We show that, if anything, the presence of error cross-correlation would cause the multiplicative bias of AIRS in the tropics to be overestimated.
Therefore, our results are unlikely to be an artifact of violations of the assumptions of TC.

[...] temperature over land is lowest in the tropics, the measured signal must be proportionally lower, either due to a lower multiplicative bias, lower variability in T, or both. TC is not able to estimate the variance of T, but it is likely that lower variability in temperature and humidity in the tropics contributes to the lower correlations observed in the tropics (although differences in the seasonal cycle between the tropics and higher latitudes do not contribute, since all time series were deseasonalized prior to analysis, and qualitatively similar results are obtained if the analysis is conducted separately for each season). However, in addition to this effect, a substantial contributor to the reduction in measured signal is the relatively low multiplicative bias of AIRS, which dampens the observed signal relative to station observations (i.e., the AIRS multiplicative bias is less than one), particularly in the tropics (Fig. 9). Therefore, the low AIRS correlation coefficients in the tropics for near-surface air temperature over land are due, at least in part, to relatively low temporal sensitivity (i.e., a low multiplicative bias) rather than relatively high noise (a high random-error standard deviation), above and beyond likely differences in variability of near-surface air temperature and specific humidity between the tropics and mid-latitudes.
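The argument above can be made concrete with a short numerical sketch. It assumes the linear error model commonly used with TC, in which a measurement equals an additive bias plus a multiplicative bias times the target T plus zero-mean random error that is uncorrelated with T. Under that assumption, the squared correlation between the measurement and T is the ratio of the measured signal variance to the total measurement variance. The function name `r_squared` and the numbers used are hypothetical and chosen only for illustration.

```python
def r_squared(beta, var_t, sigma_eps):
    """Squared correlation between a measurement and its target.

    Assumes X = alpha + beta * T + eps, with eps uncorrelated with T.
    beta: multiplicative bias; var_t: variance of T;
    sigma_eps: standard deviation of the random error.
    """
    signal = beta ** 2 * var_t            # measured signal variance
    return signal / (signal + sigma_eps ** 2)

# Same target variability and same noise level, different temporal sensitivity:
high_sensitivity = r_squared(beta=1.0, var_t=4.0, sigma_eps=1.0)  # 0.8
low_sensitivity = r_squared(beta=0.5, var_t=4.0, sigma_eps=1.0)   # 0.5
```

With the noise level and the variability of T held fixed, halving the multiplicative bias lowers the squared correlation from 0.8 to 0.5 in this example, which is the sense in which a low correlation can reflect low temporal sensitivity rather than high noise.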

Possible causes of lower temporal sensitivity in the tropics over land
A major result of this study is that AIRS retrievals of terrestrial near-surface state show significant potential in the extra-tropics but less potential in the tropics, where correlation with ground observations is relatively low. Why does AIRS perform well outside the tropics, but not in the tropics? In particular, why is the temporal sensitivity

Summary and Conclusions
This study has evaluated the performance of AIRS retrievals of near-surface air temperature and specific humidity over land. Our evaluation is novel in at least two respects. First, to our knowledge, this is the first study to apply triple collocation to evaluating retrievals of near-surface air temperature and specific humidity. Second, it is the first evaluation study of any kind of AIRS near-surface atmospheric measurements that is both global (rather than specific to a particular site or region) and spatially resolved (rather than averaging results, for example, over all land surfaces). The novel aspects of the study's methodology allow us to reach the main new finding of this study: AIRS retrievals of the near-surface atmospheric state are less accurate in

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.