A monitoring and prediction system for compound dry and hot events

Compound dry and hot events (i.e. concurrent or consecutive occurrences of dry and hot events), which may cause larger impacts than those caused by extreme events occurring in isolation, have attracted wide attention in recent decades. Increased occurrences of compound dry and hot events in different regions around the globe highlight the importance of improved understanding and modeling of these events so that they can be tracked and predicted ahead of time. In this study, a monitoring and prediction system of compound dry and hot events at the global scale is introduced. The monitoring component consists of two indicators (standardized compound event indicator and a binary variable) that incorporate both dry and hot conditions for characterizing the severity and occurrence. The two indicators are shown to perform well in depicting compound dry and hot events during June–July–August 2010 in western Russia. The prediction component consists of two statistical models, including a conditional distribution model and a logistic regression model, for predicting compound dry and hot events based on El Niño–Southern Oscillation, which is shown to significantly affect compound events of several regions, including northern South America, southern Africa, southeast Asia, and Australia. These models are shown to perform well in predicting compound events in large regions (e.g. northern South America and southern Africa) during December–January–February 2015–2016. This monitoring and prediction system could be useful for providing early warning information of compound dry and hot events.


Introduction
Droughts and hot extremes may cause severe impacts on society and ecosystems (Mishra and Singh 2011, Perkins et al 2012, Coumou and Robinson 2013, Russo et al 2016, Oliver et al 2018). These two extremes are interconnected and may occur concurrently or consecutively (i.e. compound dry and hot events). Numerous studies have analyzed the variability of compound dry and hot events through observations and model projections and highlighted increased occurrences of compound events (Beniston 2009, Hao et al 2013, Mazdiyasni and AghaKouchak 2015, Sharma and Mujumdar 2017, Zhou and Liu 2018, Chen et al 2019). For example, previous work showed an increased likelihood of hot and dry seasons in many regions (e.g. southern Africa and central Europe), which may result from both the warming trend and the strengthened dependence between precipitation and temperature. Zhou and Liu (2018) investigated the likelihood of compound extremes in China based on the copula model and found increased occurrences of compound hot and dry events in the warm season over southwestern and northeastern China. The increased occurrences and large impacts of compound dry and hot events in different regions across the globe call for improved understanding of underlying mechanisms as well as reliable early warning. Since both dry and hot events may result from large-scale global circulation, compound dry and hot events have been shown to be linked to common forcing factors (e.g. El Niño-Southern Oscillation, or ENSO) (López-Moreno et al 2011, Seneviratne et al 2012, Kopp et al 2017). Previous studies have explored the potential predictability of compound dry and hot events based on ENSO. For example, Hao et al (2019) proposed to employ the meta-Gaussian model for the prediction of compound events in southern Africa based on ENSO.
Due to the large impacts of droughts and hot extremes, a variety of information systems at regional and global scales have been established for the monitoring and prediction of these events (Hao et al 2014, Nijssen et al 2014, Yuan et al 2015, Zink et al 2016). For example, based on the standardized precipitation evapotranspiration index (SPEI), a global drought monitoring system has been developed to track drought conditions over global land areas (https://spei.csic.es/map/). Yuan et al (2015) developed the Princeton global seasonal hydrologic forecast system for hydrologic drought forecasting based on climate forecasts and the variable infiltration capacity model. Zink et al (2016) developed an online platform for daily drought monitoring in Germany based on root-zone soil moisture estimates from a hydrologic model.
The large impacts from compound dry and hot events highlight the necessity to track their conditions and predict their occurrences ahead of time; however, an information system for monitoring and predicting compound dry and hot events is still lacking.
The objective of this study is to develop a monitoring and prediction system of compound dry and hot events at the global scale. Data and methods for the development of the system are introduced in section 2. Results of monitoring and prediction are presented in section 3, followed by the discussion and conclusion in section 4.

Monitoring and prediction system
In this study, a compound dry and hot event is defined based on monthly precipitation and temperature to illustrate different components of the system. The standardized compound event indicator (SCEI) and a binary variable are used for characterizing compound events at the global scale. The prediction component predicts the severity (i.e. SCEI) and occurrence (i.e. the binary variable) based on the conditional distribution model and logistic regression model, respectively (Hao et al 2018a, Hao et al 2019). The framework of the system is summarized in figure 1 and its components are introduced in detail in the following sections.

Data
Global precipitation and temperature data were obtained from the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) reanalysis product, which provides global estimates of land surface conditions for the period 1980-present at a spatial resolution of 0.5° × 0.625° (Gelaro et al 2017, Reichle et al 2017a). This dataset has been shown to perform better than its previous versions and is thus selected in this study (for the period 1980-2018). Two types of precipitation products are available in the MERRA-2 system (one is generated by atmospheric models and the other is corrected based on observations). The corrected MERRA-2 precipitation, which involves merger and disaggregation of observational products and model estimates, was used in this study (Reichle et al 2017b). Another dataset of global monthly precipitation and temperature from 1951 to 2016 at 0.5° spatial resolution was obtained from the Climatic Research Unit (CRU) (Harris et al 2014). This dataset provides a longer record of precipitation and temperature from which to extract compound dry and hot events and is also used in this study.
The Niño 3.4 Sea Surface Temperature (SST) index (NINO34), defined as the area-averaged SST over 5°S-5°N and 170°W-120°W, was used as the ENSO indicator for the prediction of compound dry and hot events. In addition, indices of the Pacific Decadal Oscillation (PDO) and North Atlantic Oscillation (NAO) were used for analyzing their impacts on compound dry and hot events. These data were obtained from the Global Climate Observing System (GCOS) Working Group on Surface Pressure (WG-SP) (https://esrl.noaa.gov/psd/gcos_wgsp/Timeseries/).

Indicators and monitoring component
The drought condition was characterized by the standardized precipitation index (SPI) based on accumulated precipitation of different time scales (e.g. 3 month) (McKee et al 1993). The standardized temperature index (STI) was employed to assess the hot condition and was computed in a similar way to the SPI. Usually, a distribution function is fitted to precipitation (or temperature) to estimate the marginal probability, which is then transformed to a standardized index based on the standard normal distribution. To avoid assumptions of distribution forms, the empirical Gringorten plotting position (Gringorten 1963) was employed to estimate marginal probabilities and compute the SPI and STI based on precipitation and temperature of June-July-August (JJA) and December-January-February (DJF) (i.e. 3 month time scale) for the period from 1980 to 2018. Specifically, the empirical probability for each period, with i denoting the rank of its observation (in ascending order) in a sample of size n, was estimated as:
$$P(x_i) = \frac{i - 0.44}{n + 0.12}$$
To facilitate statistical modeling, the NINO34 was also transformed into a standardized index (i.e. SNINO) using the same method.
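The standardization step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the standard Gringorten form (rank − 0.44)/(n + 0.12), the function names and sample values are hypothetical, and ties are broken by order of appearance.

```python
from statistics import NormalDist


def gringorten_probabilities(values):
    """Return the Gringorten plotting position (rank - 0.44) / (n + 0.12) per value.

    Ranks run from 1 (smallest) to n (largest). Ties receive consecutive ranks
    in order of appearance, which is a simplification made for this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    probs = [0.0] * n
    for rank, k in enumerate(order, start=1):
        probs[k] = (rank - 0.44) / (n + 0.12)
    return probs


def standardized_index(values):
    """Map empirical probabilities to standard normal quantiles (SPI/STI style)."""
    inverse_normal_cdf = NormalDist().inv_cdf
    return [inverse_normal_cdf(p) for p in gringorten_probabilities(values)]


# Hypothetical JJA precipitation totals (mm) for five years at one grid cell.
jja_precip = [120.0, 80.0, 150.0, 60.0, 100.0]
spi = standardized_index(jja_precip)
print([round(v, 2) for v in spi])
```

The same function applied to seasonal mean temperature or to NINO34 would give STI-like and SNINO-like series, since all three are described as using the same method.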
A compound dry and hot event was defined by low precipitation (i.e. low SPI) and high temperature (i.e. high STI) of the same period. Two types of indicators of compound dry and hot events were defined in this system. The first indicator is the SCEI derived from the bivariate distribution function of precipitation (X) and temperature (Y) (or SPI and STI) (Hao et al 2019). Specifically, the joint probability distribution of low precipitation and high temperature can be expressed as:
$$p = P(X \le x,\; Y > y) \tag{1}$$
A variety of distribution families can be used to estimate the distribution in equation (1). To avoid the assumption of bivariate distribution forms, we derive the joint probability in equation (1) following the concept of the Gringorten plotting position as follows:
$$P(x_i, y_i) = \frac{n_i - 0.44}{n + 0.12} \tag{2}$$
where n is the length of observations; $n_i$ is the number of occurrences of the pair $(x_k, y_k)$ with $x_k \le x_i$ and $y_k > y_i$ ($1 \le k \le n$).
Since the joint probability in equation (1) is not uniformly distributed, an empirical distribution F (based on the Gringorten plotting position) can be fitted to the joint probability to remap it into the uniform space (Mo and Lettenmaier 2014). Following a methodology similar to that used to compute the SPI in the univariate case, the standardized index of compound dry and hot events can be derived by transforming the remapped joint probability through the inverse of the standard normal distribution function Φ. Specifically, the SCEI based on the joint probability of precipitation and temperature can then be expressed as (Hao et al 2019):
$$\mathrm{SCEI} = \Phi^{-1}\left(F\left(P(X \le x,\; Y > y)\right)\right) \tag{3}$$
Lower SCEI values indicate more severe conditions of compound dry and hot events. The advantage of this indicator is that it can be used to characterize the severity of a compound dry and hot event.
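The chain from joint probability to SCEI can be sketched as follows, assuming the empirical joint probability takes the Gringorten form (n_i − 0.44)/(n + 0.12), where n_i counts the periods that are at least as dry and strictly hotter. The function names and the tie handling in the remapping step (rank = number of values less than or equal to the current one) are assumptions of this sketch, not details given in the text.

```python
from statistics import NormalDist


def empirical_cdf(values):
    """Gringorten probability of each value, with rank = count of values <= it."""
    n = len(values)
    return [(sum(1 for u in values if u <= v) - 0.44) / (n + 0.12) for v in values]


def joint_probabilities(precip, temp):
    """Empirical P(X <= x_i, Y > y_i) for each period i.

    The value is slightly negative when n_i = 0; this is harmless here because
    only the ordering of the joint probabilities is used after remapping.
    """
    n = len(precip)
    probs = []
    for x_i, y_i in zip(precip, temp):
        n_i = sum(1 for x_k, y_k in zip(precip, temp) if x_k <= x_i and y_k > y_i)
        probs.append((n_i - 0.44) / (n + 0.12))
    return probs


def scei(precip, temp):
    """Remap the joint probability to uniform space, then to normal quantiles."""
    remapped = empirical_cdf(joint_probabilities(precip, temp))
    inverse_normal_cdf = NormalDist().inv_cdf
    return [inverse_normal_cdf(p) for p in remapped]


# Hypothetical seasonal precipitation (mm) and temperature (deg C) for five years.
precip = [10.0, 50.0, 30.0, 80.0, 60.0]
temp = [30.0, 20.0, 25.0, 15.0, 18.0]
values = scei(precip, temp)
print([round(v, 2) for v in values])
```

In this toy series the first year is both the driest and the hottest, so it receives the lowest SCEI, in line with lower values indicating more severe compound conditions.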
The second indicator is a binary variable (Z=1 for occurrence and Z=0 for non-occurrence), which indicates the occurrence based on precipitation (P) and temperature (T) (or based on SPI and STI). For specific thresholds $p_0$ and $t_0$ of precipitation and temperature, respectively, the occurrence of a compound dry and hot event can be expressed as:
$$Z = \begin{cases} 1, & P \le p_0 \text{ and } T > t_0 \\ 0, & \text{otherwise} \end{cases} \tag{4}$$
This indicator can be obtained simply by assessing the concurrence of low precipitation and high temperature for a specific period. For example, a compound dry and hot event can be defined to occur (i.e. Z=1) when precipitation is lower than or equal to the 50th percentile (P ≤ P50) and temperature is higher than the 50th percentile (T > T50). However, it falls short in characterizing severity, since there are only two values (1 for occurrence and 0 for non-occurrence). To alleviate this shortcoming, we defined five thresholds of precipitation and temperature based on different percentiles, including P50/T50, P40/T60, P30/T70, P20/T80, and P10/T90, for each period (JJA and DJF) and each grid cell. The compound event was classified into 5 categories based on these five levels of occurrences for characterizing compound dry and hot conditions. When the threshold value becomes more extreme (e.g. from P50/T50 to P10/T90), the number of occurrences generally decreases (figure S1 is available online at stacks.iop.org/ERL/14/114034/mmedia). The occurrence of a compound event based on the threshold P10/T90 indicates more severe conditions than that based on the threshold P50/T50.
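The binary indicator and the five threshold levels can be sketched as follows. The percentile estimator (`statistics.quantiles` with its default method) and the function names are assumptions of this sketch; the text does not state how the percentiles were computed.

```python
from statistics import quantiles

# (precipitation percentile, temperature percentile) for the five levels.
LEVELS = [(50, 50), (40, 60), (30, 70), (20, 80), (10, 90)]


def decile_thresholds(values):
    """Return {10: P10, 20: P20, ..., 90: P90} estimated from the sample."""
    cuts = quantiles(values, n=10)  # nine cut points: 10th to 90th percentile
    return {10 * (k + 1): cuts[k] for k in range(9)}


def occurrence(precip, temp, p0, t0):
    """Z = 1 when precipitation <= p0 and temperature > t0, else 0."""
    return [1 if p <= p0 and t > t0 else 0 for p, t in zip(precip, temp)]


def categories(precip, temp):
    """Category 0 (no event) to 5 (P10/T90 event) for each period.

    The levels are nested, so the number of levels met equals the most
    extreme level met.
    """
    p_pct = decile_thresholds(precip)
    t_pct = decile_thresholds(temp)
    result = [0] * len(precip)
    for p_level, t_level in LEVELS:
        z = occurrence(precip, temp, p_pct[p_level], t_pct[t_level])
        result = [c + zi for c, zi in zip(result, z)]
    return result


# Hypothetical series: precipitation rises while temperature falls.
precip = [float(v) for v in range(10, 120, 10)]   # 10, 20, ..., 110
temp = [float(v) for v in range(30, 19, -1)]      # 30, 29, ..., 20
print(categories(precip, temp))
```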

Prediction component

Conditional distribution model
The SCEI was used as the predictand for predicting the severity based on antecedent SCEI and SNINO, which represent the persistence and external forcing, respectively (Hao et al 2019). Specifically, the 1 month lead prediction of SCEI ($W_{t+1}$) can be achieved based on the conditional distribution given the two predictors at period t, $W_t$ (antecedent SCEI) and $X_t$ (SNINO), which can be expressed as:
$$f(W_{t+1} \mid W_t, X_t) = \frac{f(W_{t+1}, W_t, X_t)}{f(W_t, X_t)} \tag{5}$$
By assuming a multivariate normal distribution of the three standardized variables (SPI, STI, and SNINO), the conditional distribution in equation (5) is essentially a normal distribution with mean μ and variance $\sigma^2$ (Wilks 2011, Hao et al 2019). The conditional mean μ can be regarded as the predicted severity of the compound event. The Pearson correlation coefficient between observed and predicted SCEI was used to evaluate the prediction skill of the conditional distribution model.
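Under a multivariate normal assumption, the conditional mean and variance have a closed form. The sketch below assumes the three jointly normal variables are the target SCEI, the antecedent SCEI and SNINO, and estimates the moments from sample covariances; the function names and data are hypothetical and this is not necessarily how the cited studies implemented the model.

```python
from statistics import fmean


def covariance(u, v):
    """Sample covariance of two equally long series."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def fit_conditional_normal(target, w_prev, x_prev):
    """Fit the mean and variance of target | (w_prev, x_prev) for a trivariate normal."""
    s_ww = covariance(w_prev, w_prev)
    s_xx = covariance(x_prev, x_prev)
    s_wx = covariance(w_prev, x_prev)
    s_yw = covariance(target, w_prev)
    s_yx = covariance(target, x_prev)
    det = s_ww * s_xx - s_wx ** 2
    # Solve the 2x2 system Sigma_22 * b = Sigma_21 for the predictor weights.
    b_w = (s_yw * s_xx - s_yx * s_wx) / det
    b_x = (s_yx * s_ww - s_yw * s_wx) / det
    # Conditional variance: Sigma_11 - Sigma_12 * Sigma_22^{-1} * Sigma_21.
    variance = covariance(target, target) - (b_w * s_yw + b_x * s_yx)
    means = (fmean(target), fmean(w_prev), fmean(x_prev))
    return means, (b_w, b_x), variance


def predict(model, w_t, x_t):
    """Return the conditional mean (predicted severity) and variance."""
    (m_y, m_w, m_x), (b_w, b_x), variance = model
    mu = m_y + b_w * (w_t - m_w) + b_x * (x_t - m_x)
    return mu, variance


# Hypothetical standardized series: antecedent SCEI, SNINO and target SCEI.
w_prev = [-1.0, 0.5, 1.2, -0.3, 0.8, -1.5]
x_prev = [0.2, -0.7, 1.0, 0.4, -1.1, 0.6]
target = [0.5 * w + 0.3 * x for w, x in zip(w_prev, x_prev)]
model = fit_conditional_normal(target, w_prev, x_prev)
print(predict(model, 1.0, 1.0))
```

The toy target is an exact linear combination of the predictors, so the fitted conditional variance is essentially zero; real series would leave a positive residual variance that quantifies prediction uncertainty.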

Logistic regression model
For the prediction of occurrences of compound events (Z=1), the logistic regression model was employed and can be expressed as (Hao et al 2018a):
$$\ln\left(\frac{\pi}{1 - \pi}\right) = \alpha + \beta x \tag{6}$$
where π is the probability of occurrence P(Z=1|x); α is the constant and β is the regression coefficient; x is the predictor (i.e. NINO34). The 1 month lead prediction of the probability of occurrences of a compound dry and hot event (i.e. Z=1) can then be expressed as:
$$P(Z_{t+1} = 1 \mid x_t) = \frac{\exp(\alpha + \beta x_t)}{1 + \exp(\alpha + \beta x_t)} \tag{7}$$
The Brier Skill Score (BSS) was used to evaluate the probabilistic prediction skill of the logistic regression model, which was defined as:
$$\mathrm{BSS} = 1 - \frac{\sum_{i=1}^{n} (P_i - O_i)^2}{\sum_{i=1}^{n} (R_i - O_i)^2} \tag{8}$$
where n is the number of periods (or instances) of prediction; $P_i$ is the predicted probability for period i; $O_i = 1$ if the compound event in observations occurs and $O_i = 0$ otherwise; $R_i$ is the reference prediction, which is defined as the climatological frequency of occurrences of compound dry and hot events during the period 1980-2018. The BSS ranges from −∞ to 1 with a positive value indicating skillful prediction (i.e. better prediction performance than the reference prediction).
We also assessed other historical occurrences of compound dry and hot events and overall the two indicators performed well in identifying historical compound events (not shown). We used the threshold P50/T50 to define the occurrence of compound dry and hot events. This setting enables the extraction of a relatively large number of occurrences for statistical modeling of compound events. Significant (and positive) regression coefficients β (at the 0.05 significance level) of lag 1 month and 3 month in equation (6) were used to assess the relationship between occurrences of compound events during DJF and NINO34 in previous seasons (NDJ and SON), which is shown in figures 3(c), (d). Similar to figures 3(a), (b), regions with significant (and positive) β are mainly located in northern South America, southern Africa, southeast Asia, and parts of Australia.
The positive values of β imply more occurrences of compound dry and hot events with higher values of NINO34 during DJF (or during El Niño years) in these regions.
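The logistic regression model and the Brier Skill Score can be sketched as follows. The text does not state how the model was fitted, so the Newton-Raphson fit, the function names and the toy data are assumptions of this sketch. The intercept and slope are named `a` and `b` and correspond to the constant and the regression coefficient.

```python
import math
from statistics import fmean


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def fit_logistic(x, z, iterations=25):
    """Fit P(Z=1|x) = sigmoid(a + b*x) by Newton-Raphson on the log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [sigmoid(a + b * xi) for xi in x]
        g0 = sum(zi - pi for zi, pi in zip(z, p))
        g1 = sum((zi - pi) * xi for zi, pi, xi in zip(z, p, x))
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:  # ill-conditioned, e.g. perfectly separable data
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def brier_skill_score(predicted, observed, reference):
    """BSS = 1 - sum (P_i - O_i)^2 / sum (R_i - O_i)^2."""
    num = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    den = sum((r - o) ** 2 for r, o in zip(reference, observed))
    return 1.0 - num / den


# Hypothetical predictor (e.g. antecedent NINO34 anomaly) and observed occurrences.
x = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
z = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_logistic(x, z)
predicted = [sigmoid(a + b * xi) for xi in x]
climatology = [fmean(z)] * len(z)  # reference: climatological frequency
print(round(b, 2), round(brier_skill_score(predicted, z, climatology), 2))
```

A positive `b` means the probability of occurrence rises with the predictor, and a positive BSS means the fitted probabilities beat the climatological reference. Here the score is computed in-sample purely for illustration.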

Prediction of the severity and occurrence
For certain limited regions (e.g. southern North America), the opposite pattern of relationships between ENSO and compound events is shown (i.e. significant positive r and negative β). This indicates that La Niña (or low NINO34 values) is associated with increased occurrences of compound dry and hot events in southern North America. The main reason is that during La Niña, the Pacific jet stream often tends to shift northward, leading to dry and hot conditions in the southern United States (Cook and Schaefer 2008).
In previous sections, only 39 years of monthly precipitation and temperature data from MERRA-2 (i.e. 1980-2018) were used to evaluate the predictor. The number of compound dry and hot events extracted from historical records may not be sufficient for statistical modeling in certain regions. To address the potential uncertainty of datasets, we used the CRU data with a longer record (1951-2016) to assess ENSO impacts. The correlation coefficient (r) between SCEI and NINO34 and the regression coefficient (β) from the logistic regression model of lag 1 month and 3 month based on the CRU data are shown in figure S2. Overall, a similar pattern of ENSO impacts on compound dry and hot events is shown (negative r and positive β in regions such as northern South America and southern Africa). These results indicate that ENSO is a useful predictor of compound events in these regions.

Model validation
To assess the prediction skill of the two prediction models, leave-one-out cross validation (LOOCV) was used for the period from 1980 to 2018 (n=39), in which the fitting procedure is repeated n times, each time with a sample of size n−1 by leaving out one sample for prediction and evaluation (Wilks 2011). For the prediction of SCEI, the positive correlation (significant at the 0.05 significance level) between observed and predicted SCEI values (1 month and 3 month lead time) of DJF during 1980-2018 based on LOOCV is shown in figure S3. Due to the persistence of SCEI, correlation coefficients between observations and 1 month lead predictions are high for large land areas. When the lead time increases to 3 month, high correlation coefficients between observations and predictions are mainly shown in the southern US, northern South America, southern Africa, southeast Asia and parts of Australia. For the probabilistic prediction of occurrences, the BSS for the 1 month and 3 month lead prediction is shown in figure S4. Skillful predictions are shown in regions with significant relationships between ENSO and occurrences of compound dry and hot events (e.g. positive BSS values in northern South America and southern Africa). Overall, these results show that ENSO provides skillful prediction of compound events during DJF for 1 month and 3 month lead time for regions including the southern US, northern South America, southern Africa, southeast Asia and parts of Australia.
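The leave-one-out procedure can be sketched as follows. For brevity, a one-predictor least-squares line stands in for the two prediction models described in the text; that substitution, the function names and the data are assumptions of this sketch.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def loocv_predictions(x, y):
    """Refit n times on n-1 samples and predict each left-out sample."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions


def pearson(u, v):
    """Pearson correlation coefficient between two series."""
    mu, mv = fmean(u), fmean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


# Hypothetical predictor and predictand with some scatter around a line.
x = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
y = [-0.8, 1.2, 2.7, 5.3, 6.9, 9.1]
held_out = loocv_predictions(x, y)
print(round(pearson(y, held_out), 3))
```

Because each prediction comes from a model that never saw the left-out sample, the resulting correlation is a less optimistic estimate of skill than an in-sample fit.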

Model application
Based on the analysis of the predictor and prediction skill above, we then applied the two models to predict compound events during DJF 2015-2016 as a case study. The monitoring of the compound event for this period is shown in figures 4(a), (b). From figure 4(a), the SCEI is particularly low in regions including northern South America, northern and southern Africa, southeast Asia, parts of Australia, and northern Russia. From figure 4(b), the occurrences of different categories of compound events are found in similar regions (e.g. occurrences in northern South America).
The prediction of the SCEI and occurrence is first illustrated at one grid cell to show the application of the two models. The observed SCEI values and occurrences of compound events during DJF from 1980 to 2018 for one grid cell in southern Africa (longitude: 22.5, latitude: −20) are shown in figure 5, which indicates historical compound dry and hot events during certain periods, such as DJF 2015-2016. The 1 month and 3 month lead predictions of the SCEI and occurrence from the two models based on LOOCV are shown in figures 5(a) and (b), respectively. In figure 5(a), the correlation between observations and predictions of SCEI values is significant and relatively high (0.76 and 0.46 for 1 month and 3 month lead prediction, respectively). For the predicted probability of occurrences in figure 5(b), if 0.5 is selected as the threshold to define the occurrence, the probabilistic prediction from the logistic regression model performs well in identifying a large number of historical occurrences of compound events (e.g. during DJF 2015-2016). For example, the 1 month and 3 month lead predictions of the probability of occurrences of compound dry and hot events during DJF 2015-2016 are 0.91 and 0.90, respectively, indicating high likelihoods of occurrences during this period.
The 1 month and 3 month lead prediction of SCEI over global land areas is shown in figures 4(c), (e). The low SCEI values from the prediction generally resemble observations in figure 4(a) for large regions (e.g. northern South America, southern Africa, southeast Asia). In addition, the 1 month and 3 month lead prediction of occurrences is shown in figures 4(d), (f). Higher probability of occurrences of compound dry and hot events is predicted in similar regions to those with low SCEI values from the prediction in figures 4(c), (e), which is consistent with observed occurrences in figure 4(b). The relatively good prediction performance of compound dry and hot events for this period in these regions mainly results from the strong impact of ENSO (Hao et al 2019). However, the prediction model based on ENSO fails to predict compound dry and hot events in certain regions, such as northern Russia, where no significant impact of ENSO is shown from figure 3. These results highlight the useful early warning information of compound dry and hot events from this system for regions with significant impacts from ENSO. Meanwhile, improved understanding and modeling of compound events beyond the region significantly affected by ENSO is a pressing need to improve the system.

Discussion and conclusion
A monitoring and prediction system for compound dry and hot events at the global scale is introduced in this study. The monitoring component consists of two indicators incorporating both dry and hot conditions, which are shown to perform well in depicting the compound event during summer 2010 in western Russia. For the prediction component, the conditional distribution model and logistic regression model are employed for predicting the severity and occurrence of compound dry and hot events based on ENSO. These two models perform well in predicting the compound dry and hot event during DJF 2015-2016 in large regions (e.g. northern South America and southern Africa) for 1 month and 3 month lead time due to the strong impact of ENSO.
Though ENSO provides skillful prediction of compound dry and hot events, a significant relationship between ENSO and compound events exists only for certain global land areas. Thus, influences of other modes of climate variability (e.g. PDO and NAO) on compound events need to be assessed to improve the prediction in different regions. Similar to figure 3, we show the impact of PDO on compound dry and hot events based on the lag 1 month and 3 month correlation coefficient (r) and regression coefficient (β) for DJF (significant at the 0.05 significance level) in figures S5 and S6 based on MERRA-2 and CRU data, respectively. High values of PDO tend to increase the likelihood of compound dry and hot events (negative r and positive β) in regions including northern North America, northern South America, and parts of Australia (Mantua and Hare 2002), which is roughly similar to the impact of ENSO. Similarly, based on MERRA-2 and CRU data, NAO is shown to affect compound dry and hot events in southern Europe and parts of the Mediterranean region (figures S7, S8), which is consistent with previous studies (Brandimarte et al 2011, López-Moreno et al 2011). These results indicate that other modes of climate variability can be employed to improve the prediction of the system.
We mainly characterize compound dry and hot events based on two indicators incorporating precipitation and temperature at a monthly time scale. This method can be applied to more variables or indicators (e.g. SPEI) at finer time scales (e.g. defining hot conditions based on heat waves). A potential limitation of the prediction component is that information on dry and hot conditions is combined into indicators while their joint status is not predicted explicitly. This can be alleviated by extending the conditional model in equation (5) to predict the joint distribution function of dry and hot conditions based on ENSO (Hao et al 2018b).
In addition, the prediction of compound events is achieved based on statistical models, which rely on empirical relationships in historical records and generally fall short in capturing complicated physical processes. To address this limitation, seasonal prediction products from advanced general circulation models (or GCMs) (e.g. North American Multi-Model Ensemble) (Doblas-Reyes et al 2013, McEvoy et al 2016, Schubert et al 2016) could also be used for the prediction of compound dry and hot events to improve the performance in different regions. Results of this study will be available at the Global Compound Dry-hot Monitoring and Prediction System (GCDMaPS) website (gcdmaps.bnu.edu.cn). We stress that the purpose of this system is to provide alternatives to current efforts or systems in tracking droughts or hot extremes, such as the Global Drought Monitor (https://spei.csic.es/map/), with a focus on compound events. This system could be useful for tracking and predicting compound dry and hot events at regional and global scales to reduce their potential impacts.