Intercomparison of daily precipitation persistence in multiple global observations and climate models

Daily precipitation persistence is affected by various atmospheric and land processes and provides complementary information to precipitation amount statistics for understanding precipitation dynamics. In this study, daily precipitation persistence is assessed in an exhaustive ensemble of observation-based daily precipitation datasets and evaluated in global climate model (GCM) simulations for the period 2001–2013. Daily precipitation time series are first transformed into categorical time series of dry and wet spells with a 1 mm d−1 precipitation threshold. Subsequently, Pdd (Pww), defined as the probability that a dry (wet) day is followed by another dry (wet) day, is calculated to represent daily precipitation persistence. The analysis focuses on the long-term mean and interannual variability (IAV) of the two indices. Both multi-observation and multi-model means show higher values of Pdd than Pww. GCMs overestimate Pww with a relatively homogeneous spatial bias pattern. They overestimate Pdd in the Amazon and Central Africa but underestimate Pdd in several regions such as southern Argentina, western North America and the Tibetan Plateau. The IAV of both Pdd and Pww is generally underestimated in climate models, but more strongly for Pww. Overall, our results highlight systematic model errors in daily precipitation persistence that are substantially larger than the already considerable spread across observational products. These findings also provide insights into how precipitation persistence biases on a daily time scale relate to well-documented persistence biases at longer time scales in state-of-the-art GCMs.


Introduction
Daily precipitation persistence on land is regulated by various regional weather features, such as atmospheric blocking [1], mesoscale convective systems [2], and monsoons [3], which are often influenced by large-scale modes of climate variability governed by sea surface temperature variability [4][5][6][7][8] or, in some regions to a similar degree, by land processes [9][10][11][12][13]. Prolonged dry or wet spells lasting from several days to weeks, which result from all the processes that affect daily precipitation dynamics, can have considerable impacts on agricultural productivity [14][15][16][17] and thus on human society. Hence, daily precipitation persistence is an important criterion for evaluating precipitation data sets and models with regard to their ability to capture or simulate precipitation dynamics. Still, there is a lack of evidence on how observations characterize daily precipitation persistence globally and how global climate models (GCMs) simulate it. Only a few studies have analyzed daily precipitation persistence in observations, and they are often regional in scope and mainly investigate changes in daily precipitation persistence. Changes in precipitation persistence associated with changes in the intensity of extreme rainfall were identified in the northeastern US [18] and Europe [19,20] during the 20th century. In Switzerland, little change in spell-length statistics associated with significant trends in precipitation intensity during the 20th century was reported [21]. Conversely, [22] documented a significant decreasing trend of annual maximum consecutive dry days (CDD) in Australia. Significant trends with different signs and magnitudes in the characteristics of dry and wet spell lengths during the 20th century were identified in India [14,23,24].
While these studies show that some regions of the world have experienced significant changes in daily precipitation persistence, a lack of consensus on the metrics used to characterize precipitation persistence makes it difficult to compare the results from different studies.
Studies evaluating precipitation persistence in GCMs have typically focused on indices such as CDD, maximum consecutive wet days, heavy precipitation days (number of days with rainfall of more than 10 mm, R10mm) and very heavy precipitation days (R20mm) [22,26,27], as recommended by the WMO/WCRP/JCOMM Expert Team on Climate Change Detection and Indices [25]. While these indices measure extreme dry or wet persistence, they are all based on yearly maxima of a given variable and do not account for mean persistence characteristics. Furthermore, they might artificially introduce a large IAV in a grid-scale analysis, depending on how maximum values located at the beginning or end of a year are treated [26].
In the following, we analyze global daily precipitation persistence using indices that represent its mean characteristic in an exhaustive set of observational data sets and use these results to evaluate a comprehensive ensemble of GCM simulations. Section 2 describes the observational data and models used in the analysis. Section 3 describes the two indices representing dry and wet precipitation persistence. In section 4, comparisons of long-term daily precipitation indices and their IAV in observational products and GCMs are presented with discussions on the results. The major conclusions of our analyses are presented in section 5.

Data
2.1. Observation-based precipitation products
We used 23 observation-based precipitation products at daily temporal and 1° × 1° spatial resolution from the Frequent Rainfall Observations on GridS [28,29] data collection (table 1). All considered products belong to one of the following categories: interpolated station measurements, satellite-only products, satellite products calibrated with station measurements, and atmospheric reanalyses. The common time period 2001-2013 is used in our analysis. More information regarding the observation datasets is available in table 2 of [28]. All observational precipitation products were aggregated to a common 2.5° × 2.5° resolution before calculating the indices to facilitate comparison with GCMs.

Model simulations
Daily precipitation outputs from the historical and the representative concentration pathway 8.5 (RCP8.5) simulations, covering the period 1950-2013, from 33 GCMs belonging to the Coupled Model Intercomparison Project Phase 5 (CMIP5) [42] have been used (table S1 is available online at stacks.iop.org/ERL/14/105009/mmedia, supporting information). For the analysis in this study, the longest common observational period, 2001-2013, was used. All model outputs were regridded to a common 2.5° × 2.5° resolution before calculating the indices.

Methods
To estimate day-to-day wet (i.e. rainy) and dry (i.e. non-rainy) persistence, daily precipitation time series are transformed into binary time series using a 1 mm d−1 precipitation threshold. Subsequently, the two precipitation persistence indices, Pdd and Pww, are calculated as the fraction of dry (wet) days that are followed by another dry (wet) day. For the analyses conducted in this study, the indices are calculated in two ways: as annual values and as long-term values. For the calculation of long-term values, daily precipitation during 2001-2013 is considered. Annual values are estimated using data for each year and allow us to investigate the IAV. IAV is quantified as the standard deviation of the yearly values, and it also partially reflects the statistical uncertainty of the indices.
Table 1 (fragment): [39] 1979-2017; JRA-55 [40] 1958-2017; MERRA1 [41] 1979-2015; MERRA2 [41] 1980-2017.
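As a minimal sketch of the index calculation described above, the following Python code thresholds a daily series at 1 mm d−1 and counts dry-to-dry and wet-to-wet transitions. The function names, the treatment of a day with exactly 1 mm as wet, the omission of transitions across year boundaries, and the use of the sample standard deviation for the IAV are illustrative assumptions that are not specified in the text.

```python
import math
from statistics import stdev

WET_THRESHOLD = 1.0  # mm per day, the threshold stated in the text


def persistence_indices(precip, threshold=WET_THRESHOLD):
    """Return (Pdd, Pww) for one daily precipitation series in mm per day.

    Assumption: a day is wet when precip >= threshold.
    Only days that have a following day in the series are counted.
    """
    wet = [p >= threshold for p in precip]
    dry_days = dry_to_dry = wet_days = wet_to_wet = 0
    for today, tomorrow in zip(wet, wet[1:]):
        if today:
            wet_days += 1
            wet_to_wet += tomorrow
        else:
            dry_days += 1
            dry_to_dry += not tomorrow
    p_dd = dry_to_dry / dry_days if dry_days else math.nan
    p_ww = wet_to_wet / wet_days if wet_days else math.nan
    return p_dd, p_ww


def interannual_variability(yearly_series):
    """Return the standard deviation of yearly Pdd and Pww values.

    yearly_series: list of daily series, one per year (hypothetical layout).
    Assumption: sample standard deviation, no handling of undefined years.
    """
    yearly = [persistence_indices(series) for series in yearly_series]
    return stdev(v[0] for v in yearly), stdev(v[1] for v in yearly)
```

For the short series `[0.0, 0.2, 3.0, 5.0, 0.0, 0.0, 1.0, 2.0, 0.5]`, two of the four dry days with a successor are followed by a dry day and two of the four wet days are followed by a wet day, so both indices equal 0.5.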
To estimate the robustness of the multi-observation and multi-model mean error of Pdd, Pww and their IAV, we use the coefficient of variation (CV), defined as the ratio of the standard deviation to the mean. In this case, the standard deviation is calculated for all combinations of observations or models, respectively. When the absolute value of the CV is smaller than 1, the estimate is considered robust.
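The robustness criterion can be illustrated with a short sketch for a single grid cell. It rests on one possible reading of the text, namely that the error is computed for every observation-model pair and that the CV is the standard deviation of these errors divided by their mean. That reading, the function name, and the use of the population standard deviation are assumptions.

```python
from itertools import product
from statistics import mean, pstdev


def error_robustness(obs_values, model_values):
    """Return (mean_error, cv, is_robust) for one grid cell.

    obs_values, model_values: index values (e.g. Pww) from each product.
    Assumption: errors are model minus observation for all pairs.
    """
    errors = [m - o for o, m in product(obs_values, model_values)]
    mean_error = mean(errors)
    spread = pstdev(errors)
    # A zero mean error gives an undefined CV, treated here as not robust.
    cv = spread / mean_error if mean_error != 0 else float("inf")
    return mean_error, cv, abs(cv) < 1
```

With hypothetical observed values of 0.60, 0.62 and 0.64 and modelled values of 0.70, 0.75 and 0.80, the mean error is about 0.13 and the CV is about 0.34, so the error would be classified as robust under this reading.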
Besides using Pdd and Pww as indicators of precipitation persistence, previous studies also used Pdd and Pww to simulate the original daily categorical time series under the first-order Markov chain assumption, whereby the goodness-of-fit between the original and simulated dry and wet spell length distributions was assessed using a two-sample Kolmogorov-Smirnov test [43][44][45]. At monthly and annual time scales, the test suggested for both observations and models that Pdd and Pww can be used to represent the dry and wet spell length distributions. We conducted the same statistical test at a daily time scale to check whether this approximation is also valid at a shorter time scale (figure S1, supporting information). For dry spell lengths, the approximation using the Markov model was statistically confirmed across different observational products and models in a few regions, such as eastern North America and South Australia, but not in other regions. For wet spell lengths, the hypothesis does not hold mainly in the tropics, while the distributions are reproducible with Pdd and Pww in the other regions of the world. The fact that Pdd and Pww do not allow a good simulation of the spell length distributions at a daily scale might be due to strong seasonality or long-term variability in the time series, which violates the first-order Markov chain assumption. Hence, Pdd and Pww alone are not sufficient to represent the dry and wet spell length distributions at a daily time scale. Nonetheless, they remain useful indicators of precipitation persistence and are used for that purpose in this paper.
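The Markov chain check described above can be sketched as follows: a wet/dry series is simulated from Pdd and Pww, spell lengths are extracted, and the two-sample Kolmogorov-Smirnov statistic is compared with its large-sample critical value. All function names are hypothetical, the 5% significance level is an assumption, and the critical-value formula is the standard asymptotic approximation for continuous data, so the test is only approximate for integer spell lengths.

```python
import math
import random
from bisect import bisect_right


def spell_lengths(wet, state):
    """Lengths of consecutive runs equal to `state` in a boolean series."""
    lengths, run = [], 0
    for day in wet:
        if day == state:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def simulate_markov(n_days, p_dd, p_ww, first_wet, rng):
    """Simulate a first-order two-state Markov chain (True means wet)."""
    wet = [first_wet]
    for _ in range(n_days - 1):
        stay = p_ww if wet[-1] else p_dd
        wet.append(wet[-1] if rng.random() < stay else not wet[-1])
    return wet


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        gap = abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        d = max(d, gap)
    return d


def ks_rejects(a, b, alpha=0.05):
    """True if the samples differ at level alpha (asymptotic critical value)."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((len(a) + len(b)) / (len(a) * len(b)))
    return ks_statistic(a, b) > critical
```

In use, one would estimate Pdd and Pww from an observed categorical series, call `simulate_markov` with a seeded `random.Random` instance, and pass the observed and simulated output of `spell_lengths` to `ks_rejects`, separately for dry and wet spells.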

Results and discussion
4.1. Long-term mean persistence characteristics of daily precipitation
Figure 1 shows the multi-observation and the multi-model mean of daily Pdd and Pww over the period 2001-2013. The observations and models show similar spatial patterns of Pdd and Pww. While Pdd is lowest in the regions around the equator and increases towards the poles, Pww shows the opposite pattern, with the highest values observed over the whole tropics, consistent with precipitation climatologies. Also, the global mean of Pdd is generally higher than the global mean of Pww. Equivalent results for different seasons are presented in the supporting information (figures S2-S5). Compared to the monthly and annual Pdd and Pww presented in [43], both indices show, as expected, a wider range of values towards both extremes at the daily scale.
Multi-model mean errors in both dry and wet persistence are shown in figure 2. The Pdd error shows a more heterogeneous spatial structure, while the Pww error is consistently positive in most regions of the world. This is consistent with the broad notion that models tend to rain more frequently than observed, although the spread across the observational products in the frequency of light precipitation events is larger than for more intense categories of precipitation events [46]. These errors are not sensitive to the choice of the threshold within the range of light intensity rainfall (5 mm d−1) (not shown here). In addition, in both observations and models, Pdd and Pww are significantly correlated with the total amount and frequency of precipitation in most regions (figure S6). The different spatial structure of the errors of the two indices contrasts with what was identified at monthly and annual time scales, where both indices were dominantly underestimated by GCMs [43]. The positive Pdd error is largest in the Amazon and Central Africa. Other regions, including southern Argentina, western North America and the Tibetan Plateau, show an underestimation of Pdd.
Figure 3 shows the multi-observation and multi-model mean of the standard deviation of the yearly values of daily Pdd and Pww, here also referred to as the IAV of the respective index. The spatial structure of the IAV has some commonalities with that of the mean Pdd and Pww values (figure 1). The IAV is large in regions with low Pdd and Pww values, namely the Amazon, central Africa, the Sahara, Australia, and South Africa. In other parts of the world, the magnitude of the IAV is relatively constant. Figure 4 shows the multi-model mean error of the IAV of Pdd and Pww. For the IAV of Pdd, the Amazon stands out with the largest mean error, but with a large spread across either the observations or the models, as indicated by a CV larger than 1.
Overall, there is a robust underestimation of the IAV of Pww in most regions, except for the Sahara and the Middle East, where it is overestimated. The strong underestimation in the IAV of Pww in models suggests that they are oversimplifying some aspects of daily precipitation dynamics. This might also explain how consistent overestimation of daily Pdd and Pww can be concurrent with previously identified underestimation of the two indices at monthly and annual time scales [43].

Observational product intercomparison
Based on the previously presented results, we identify four regions (Western North America, Amazon, West Africa, and South Africa) where the errors of daily precipitation persistence or the errors of their IAV are relatively large. The regions were defined according to the Special Report on Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation (SREX) [47] (see table S2 and figure S7 for the definition of the regions) and are indicated in figure 2. In figure 5, the time series of regionally averaged annual Pdd and Pww over the period 1950-2013 are presented. The dark and light gray shades indicate the interquartile and the total range of model simulations, whereas the black line indicates the median. Colored lines indicate each category of observational products (see also table 1). Results for all other SREX regions are presented in the supporting information (figures S8-S12). For the GCM simulations used in this study, the year-to-year evolution is not expected to agree among the models or with observations year by year, unlike the long-term mean magnitude of the IAV in individual models quantified as the standard deviation (figure 3). Therefore, the models' yearly values in figure 5 cannot be directly compared to observations. A larger (smaller) IAV in the regions with constantly very low (high) Pdd or Pww values is partially due to the statistical uncertainty of the yearly indices themselves, which is affected by the number of transitions from dry or wet states.
In Western North America, both observations and models show relatively small spreads in both Pdd and Pww. The IAV of Pdd in observations is generally small compared to other regions, and the observations agree well on the year-by-year variation. For Pww, the IAV differs depending on the observation product; for instance, some of the satellite products show different temporal behavior than the rest. Also, the full range of Pww in observations is below the interquartile range of the models, indicating consistent overestimation in the models. In the Amazon, both observations and models show a large interproduct spread, evident from the magnitude of the IAV in Pdd. In particular, the reanalysis products show a larger IAV than the other products. West Africa is one of the regions with the largest spread in Pww across observational products, even larger than that of the models. The range of Pww in the reanalysis products is well separated from the rest. In South Africa, there is good agreement in Pdd between observations and models, with a small spread across both. For Pww, the median of the models is larger than any observational product, confirming consistent overestimation by the models. The very low or high values of the indices in some satellite products at the beginning of their time series in the Amazon and West Africa are due to temporarily smaller spatial coverage.
Figure 6. Globally averaged root mean square differences (RMSD) from the multi-observation mean of Pdd (first row), Pww (second row), and the interannual variability (IAV) of Pdd and Pww (third and fourth rows), for observations and models. The boxplots indicate the median, interquartile range, and full range of all observational products (OBS) and models (CMIP5).
Figure 6 shows the globally averaged root mean square differences (RMSD) from the multi-observation mean of Pdd and Pww as well as the IAV of both indices for observations and models. For Pdd and Pww, the reanalysis products show the largest RMSD values, while the remaining observational products show values smaller than 0.1. Models show a range of RMSD of Pdd similar to that of the reanalysis products, and a higher RMSD for Pww with a larger intermodel spread. The similarity between the GCMs and reanalyses is likely due to the fact that the dynamical properties of a reanalysis are highly dependent on the underlying atmospheric model and that precipitation is often not assimilated directly. Specifically, they commonly simulate more frequent rainy days [48], which is significantly correlated with the persistence characteristics (figure S6). The RMSD of the IAV of Pdd is similar across observational products, with slightly larger values in the reanalyses. The RMSD of the IAV of Pww is consistently around 0.02 across all observational products. The RMSD of the IAV of both indices is larger for models than for observations by around 0.01, with a small intermodel spread. As observational estimates were analyzed in different categories based on data processing and measurement methods, a similar a priori approach could be applied to GCMs, considering the shared components between models [49,50]. Such an approach may help to identify the reasons underlying the model errors found in this study.
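A globally averaged RMSD from the multi-observation mean, as discussed above, could be computed along the following lines. This is only a sketch: the text does not state how the global average is taken, so the cosine-of-latitude area weighting, the nested-list grid layout, and the function names are assumptions.

```python
import math


def multi_product_mean(fields):
    """Cell-by-cell mean over equally shaped lat x lon fields."""
    n_lat, n_lon = len(fields[0]), len(fields[0][0])
    return [
        [sum(f[i][j] for f in fields) / len(fields) for j in range(n_lon)]
        for i in range(n_lat)
    ]


def area_weighted_rmsd(field, reference, lats_deg):
    """RMSD between a field and a reference over all grid cells.

    Assumption: each cell is weighted by the cosine of its latitude,
    a common approximation of relative cell area on a regular grid.
    """
    num = den = 0.0
    for row, ref_row, lat in zip(field, reference, lats_deg):
        weight = math.cos(math.radians(lat))
        for value, ref in zip(row, ref_row):
            num += weight * (value - ref) ** 2
            den += weight
    return math.sqrt(num / den)
```

Here `reference` would be the output of `multi_product_mean` applied to the observational fields of one index (for example Pww), and `area_weighted_rmsd` would then be evaluated once per observational product and once per model.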

Conclusions
In this study, an exhaustive set of observational precipitation products was compared to GCMs with regard to their daily precipitation persistence. A consistent and statistically robust overestimation of Pww in GCMs was identified, while the model errors in Pdd were spatially heterogeneous in both sign and magnitude. A statistically robust underestimation of the IAV of Pww in GCMs was identified around the globe, except for the Sahara, the Arabian Peninsula, and India, where an overestimation was found. A majority of models show an underestimation of the IAV of Pdd in the tropics. In many regions, the spread of Pdd and Pww across all observational products is similar to or even larger than the intermodel spread, while the magnitude of year-to-year variations agrees well between the observational products. At the global scale, the reanalysis products are found to exhibit a larger difference from the multi-product mean than other groups of observational products. The results of this study highlight the consistent model error in daily precipitation persistence, despite the considerable spread across observational products. Contrasting model errors in the long-term mean of daily precipitation persistence and in its IAV show how a consistent overestimation of daily persistence can relate to the previously identified underestimation of dry and wet persistence at longer time scales (e.g. monthly and yearly).