ENSO-Based Predictability of a Regional Severe Thunderstorm Index

Here we use coupled climate model forecasts of Niño 3.4 and a regional (Texas, Oklahoma, Arkansas, and Louisiana) tornado environment index (TEI) to examine the modulation of US severe thunderstorm activity by the El Niño-Southern Oscillation (ENSO). The large number of forecast initializations, leads, and ensemble members reduces sampling variability and increases detail in the analysis. The strongest negative relations between TEI and concurrent Niño 3.4 are found in February and March. Both the average of TEI and its spread are larger during cool ENSO conditions, which raises the question of how predictability differs between warm and cool conditions. Predictability is measured using perfect-model skill scores. For a deterministic skill

severe thunderstorm activity is likely modest since the ENSO signal is modest even in less-variable, well-observed quantities such as seasonal averages of US near-surface temperature and precipitation. Furthermore, over the last 40 years, there have been fewer than 20 ENSO events, with the exact number depending on the choice of threshold and season. The small number of ENSO events in the historical record makes it challenging to estimate the ENSO signal in US severe thunderstorm activity, as well as to estimate the variability that is unexplained by ENSO. Consequently, analysis based on storm reports tends to involve spatial and temporal aggregation, for example, 3-month seasons and multi-state regions (Moore, 2019). Despite this smoothing in time and space, the estimated ENSO signal in severe thunderstorm activity may still overly reflect particular events in the historical record, such as the Super Outbreaks of 1974 and 2011, which accompanied La Niña conditions. The other tool of climate science, physics-based model experiments, is difficult to apply directly to the question of ENSO and severe thunderstorm activity because global climate models (GCMs) that resolve ENSO and its teleconnections do not resolve thunderstorms. Even the grid spacing of dynamical, convection-permitting downscaling models is relatively coarse (e.g., 4 × 4 km) compared to the spatial scale of thunderstorms (Gensini & Mote, 2015; Hoogewind et al., 2017).
GCMs are able to resolve so-called thunderstorm ingredients, large-scale meteorological quantities such as convective available potential energy (CAPE) and vertical wind shear, which are associated with severe thunderstorm activity (Rasmussen & Blanchard, 1998). Compared to storms themselves, these ingredients can be more readily observed, modeled, and predicted. Combining storm reports and reanalysis ingredients has provided a clearer picture of the relation between ENSO and US severe thunderstorm activity. In addition, this combination of storm reports and ingredients can be used to construct statistical seasonal predictions of tornado and hail activity based on ENSO state that are skillful in some parts of the United States (Lepore et al., 2017). Moreover, GCMs can also be used to model the ingredient response to idealized tropical SST forcings (Chu et al., 2019; Lee et al., 2012). Despite the success of the ingredient approach, the relatively small number of ENSO events remains a source of uncertainty. Reanalysis ingredients, while spatially smoother than storm report data, still represent the outcome of a specific realization of weather, with all its particular synoptic and climatic features, in addition to the background ENSO state.
While the observational record can only be extended with the passage of time, the amount of data available from GCM simulations is limited only by computational resources. In climate change applications, large ensemble simulations have been created for the specific purpose of quantifying uncertainty due to internal variability (Kay et al., 2015). In other applications, ensembles of "opportunity" have been used, sometimes taken from forecasting systems with extensive real-time and reforecast archives. For instance, forecast model output has been used to estimate statistics of extreme and rare events (Kelder et al., 2020; V. Thompson et al., 2017; van den Brink et al., 2004), and to estimate more accurately the variability in US west coast winter precipitation during strong El Niño events (Kumar & Chen, 2017; L'Heureux et al., 2021). Here we combine the thunderstorm ingredient and forecast model output approaches to compute a novel estimate of the dependence of severe thunderstorm activity on ENSO state. We do so by analyzing concurrent values of a regional tornado environment index (TEI) and the Niño 3.4 index in the reforecast and real-time outputs of a coupled climate forecast model. The forecast record length, frequency of initialization, and ensemble size provide sample sizes that are orders of magnitude larger than the observational record and that allow detailed analysis of the following questions: To what extent does TEI depend on ENSO in a climate forecast model? How does the strength of the dependence vary by month, and what are the relative contributions of the TEI constituents? What are the implications for TEI predictability?
The paper is organized as follows. Section 2 describes the data. Section 3 describes the TEI formulation and the predictability measures used. Results are given in Section 4. A summary and discussion are provided in Section 5.

TEI computed from NARR has been shown to explain many aspects of the regional climatology and interannual variability of US tornado report numbers. The region considered in this study overlaps with NOAA's South and Southeast climate regions, and correlations between TEI and tornado report numbers averaged over the South and Southeast regions are statistically significant in all the months considered here (Tippett et al., 2014). The climatology of monthly TEI computed from first-lead CFSv2 forecasts is higher than NARR TEI during December-March, roughly the same in April, and lower during May in the South and Southeast regions (Lepore et al., 2018).
Previous work has used gridpoint values of cPrcp and SRH. Here we use box averages over the region defined above. We compared the box average of TEI computed using gridpoint values of cPrcp and SRH to TEI computed using box averages of cPrcp and SRH in NARR data (grid spacing of ∼32 km). The correlation of TEI computed in these two different ways exceeded 0.976 for each month December-May separately.
Therefore we used box averages to compute TEI. To obtain the TEI-predicted regional number of tornadoes, we multiplied the box-average TEI by 50, the number of 1° × 1° grid cells in the region.
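The box-average check above can be sketched with synthetic fields. Here `tei_like` is a hypothetical product-form stand-in (the actual TEI formulation of Tippett et al. (2014) uses fitted coefficients not reproduced here), and the field statistics are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def tei_like(cprcp, srh):
    # Hypothetical product-form stand-in; the actual TEI formulation
    # (Tippett et al., 2014) uses fitted coefficients not reproduced here.
    return cprcp * srh

n_months, ny, nx = 240, 10, 5  # 50 cells, like the 1-degree by 1-degree box
grid_tei, box_tei = [], []
for _ in range(n_months):
    amp = rng.gamma(2.0, 1.0)  # shared month-to-month environmental variability
    cprcp = amp * (1.0 + 0.3 * rng.standard_normal((ny, nx)))
    srh = amp * (1.0 + 0.3 * rng.standard_normal((ny, nx)))
    grid_tei.append(tei_like(cprcp, srh).mean())        # box average of gridpoint index
    box_tei.append(tei_like(cprcp.mean(), srh.mean()))  # index of box averages

r = np.corrcoef(grid_tei, box_tei)[0, 1]
print(r)  # high correlation when spatial noise is small relative to the monthly signal
```

In this synthetic setting, as in the NARR comparison, the two ways of computing the index agree closely because month-to-month variability dominates within-box spatial variability.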

Regression Lines and Statistical Significance
We computed regression lines using ordinary least squares and reported 95% confidence intervals for the slope using the nominal sample size. Slopes and r² values were judged statistically significant when the 95% confidence intervals of the slopes did not include zero. The width of the confidence intervals depends on the sample size; March has the fewest samples, with 8,500. This sample size means that r² values as small as 0.05% are statistically significant if the samples are independent. If the sample size is reduced by a factor of nine (the number of forecast lead months), r² values as small as 0.5% are still statistically significant.
However, such small and weak signals may have little practical value because they tend to be overwhelmed by noise and possibly model error.
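The sample-size argument above can be reproduced with a small calculation, assuming independent samples and a two-sided 5% test based on the t-statistic of the correlation (the helper name is ours):

```python
import math

def min_significant_r2(n, t_crit=1.96):
    """Smallest r^2 judged significant at the two-sided ~5% level for n
    independent samples, from t = r * sqrt((n - 2) / (1 - r^2))."""
    return t_crit**2 / (t_crit**2 + n - 2)

# March has the fewest samples (8,500); dividing by the nine forecast
# lead months gives roughly 944 samples.
print(f"{min_significant_r2(8500):.3%}")       # about 0.05%
print(f"{min_significant_r2(8500 // 9):.3%}")  # under 0.5%
```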

The Expected Mean-Squared Error Skill Score (MSESS)
Consider a continuous random variable X and a condition c. Here X may be TEI, cPrcp, or SRH, and the conditions are "cool" and "warm" ENSO states. The best, in a mean-squared error sense, forecast of X given c is the conditional mean E[X | c], where E denotes expectation and the vertical line means "conditional on." The squared error of the conditional mean forecast is (X - E[X | c])², and the expected mean-squared error is

E[(X - E[X | c])² | c] = var(X | c).   (1)

The expected mean-squared error skill score (MSESS), measured relative to the climatological forecast E[X], is

MSESS = 1 - var(X | c) / E[(X - E[X])² | c] = (E[X | c] - E[X])² / [(E[X | c] - E[X])² + var(X | c)].   (2)

MSESS is a signal-to-total ratio and varies between 0 and 1. MSESS and the signal-to-noise ratio are equivalent in the sense that they are monotonic functions of each other. MSESS is a measure of predictability that depends only on the mean and variance of the conditional distribution.
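A minimal sketch of estimating this signal-to-total ratio from conditional samples, using synthetic Gaussian data (the function name and the illustrative numbers are assumptions, not values from the paper):

```python
import numpy as np

def msess(x_cond, clim_mean):
    """Signal-to-total estimate of the expected MSESS under one condition:
    squared mean shift divided by (squared mean shift + conditional variance)."""
    shift2 = (x_cond.mean() - clim_mean) ** 2
    return shift2 / (shift2 + x_cond.var())

rng = np.random.default_rng(0)
clim_mean = 0.0
warm = rng.normal(-1.0, 1.0, 100_000)  # mean shift of 1, spread 1
cool = rng.normal(+1.0, 2.0, 100_000)  # same-size mean shift, larger spread
# Equal mean shifts, but the wider conditional spread lowers MSESS.
print(msess(warm, clim_mean), msess(cool, clim_mean))
```

This illustrates why a condition with the same mean shift but larger spread scores lower on a deterministic measure.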

The Expected Value of the Rank Probability and Log Skill Scores
The ranked probability score (RPS) of a forecast (P_B, P_N, P_A) for three ordered categories (below, normal, and above) with verifying categorical observations (O_B, O_N, O_A), where O is 1 for the category that occurs and 0 otherwise, is

RPS = (P_B - O_B)² + (P_B + P_N - O_B - O_N)².   (3)

If the forecast is reliable (expected occurrence equals forecast probability), the expected value of the first term on the right-hand side of Equation 3 is P_B(1 - P_B), since O_B is then a Bernoulli variable with probability P_B. Likewise, the expected value of the second term is (P_B + P_N)(1 - P_B - P_N), so the expected RPS of a reliable forecast is

E[RPS] = P_B(1 - P_B) + (P_B + P_N)(1 - P_B - P_N).   (4)

In the case of climatologically equally likely categories, the climatological forecast is (1/3, 1/3, 1/3), and by the same argument its expected RPS is (1/3)(2/3) + (2/3)(1/3) = 4/9, where we have taken the forecast to be reliable and replaced the expected occurrence with the forecast probability. Therefore, the expected ranked probability skill score (RPSS) of a reliable forecast (P_B, P_N, P_A) is

RPSS = 1 - [P_B(1 - P_B) + (P_B + P_N)(1 - P_B - P_N)] / (4/9).   (5)

The log score (LS) is the logarithm of the forecast probability of the event that occurs (Good, 1952; Roulston & Smith, 2002). Therefore, the expected LS of a reliable three-category forecast is P_B log P_B + P_N log P_N + P_A log P_A, which is the negative of the entropy of the forecast. The LS of the climatological forecast is log(1/3). The expected log skill score (LSS) of a reliable forecast (P_B, P_N, P_A) is

LSS = P_B log(3P_B) + P_N log(3P_N) + P_A log(3P_A),   (6)

which is the relative entropy (Kullback-Leibler divergence) from the forecast distribution to the climatological distribution. LSS is positive unless (P_B, P_N, P_A) is the climatological forecast.
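As a sketch of these expected-score formulas, the following hypothetical helpers compute the expected RPSS and LSS of a reliable three-category forecast against climatologically equally likely categories:

```python
import numpy as np

def expected_rps_reliable(p):
    """Expected RPS of a reliable 3-category forecast p = (P_B, P_N, P_A):
    each cumulative term contributes C * (1 - C), with C the cumulative probability."""
    pb, pn, _ = p
    return pb * (1 - pb) + (pb + pn) * (1 - pb - pn)

def expected_rpss_reliable(p):
    # Climatological forecast (1/3, 1/3, 1/3) has expected RPS 4/9.
    return 1.0 - expected_rps_reliable(p) / (4.0 / 9.0)

def expected_lss_reliable(p):
    """Expected LSS = sum_i P_i * log(3 * P_i), the relative entropy
    (Kullback-Leibler divergence) from the forecast to climatology."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * np.log(3.0 * p)))

p = (0.5, 0.3, 0.2)  # hypothetical probability shift away from equal chances
print(expected_rpss_reliable(p), expected_lss_reliable(p))
```

Both scores are zero for the climatological forecast and grow as the probabilities shift away from equal chances.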

Dependence of TEI on Niño 3.4
Before examining Niño 3.4 dependence, we checked for the presence of trends, which are expected due to a warming climate (Diffenbaugh et al., 2013). CFSv2 does have some ability to capture long-term trends in near-surface temperature and includes changing radiative forcings (Saha et al., 2014). For the region considered here, we found that SRH has a modest but statistically significant negative trend in every month (Figure S2). Only the May r² value is greater than 1% (1.3%). cPrcp has a modest but statistically significant positive trend January-May (Figure S3). When those two ingredients are combined in TEI, TEI has a statistically significant negative trend in May, which indicates a very modest (r² of about 1%) reduction in model-simulated environments favorable for tornadoes (Figure S4). There are no statistically significant trends in the months of December-April.
During January-March, CFSv2 values of TEI (Figure 1, gray dots) compare reasonably well with NARR ones (Figure 1, red dots), in the sense of having the same range of values. The range of NARR TEI is lower than that of CFSv2 TEI in December and higher in April and May, and this conclusion is supported by quantile-quantile plots (not shown). A linear fit (black line) between CFSv2 TEI and Niño 3.4 shows a statistically significant negative correlation in all months. The highest correlations are in February and March, with r² values of 12% and 11%, respectively. Correlations (not shown) between simultaneous NARR TEI and ERSST5 Niño 3.4 are negative in January-April but are not statistically significant.
In all months, the 15th, 50th, and 85th percentiles of TEI (blue curves) are decreasing functions of Niño 3.4. The 15th to 85th percentile range of TEI is wider for negative values of Niño 3.4 than for positive values in all months. The difference is largest in February and March, and higher percentiles of TEI have steeper negative slopes. To the extent that the dependence of the mean of TEI on Niño 3.4 is approximately linear, the magnitude of the mean shift is the same for warm and cool ENSO conditions. However, the TEI spread is smaller for warm conditions than for cool conditions, especially in February and March, when TEI has a relatively strong Niño 3.4 dependence.
Applying the analysis in Figure 1 to cPrcp and SRH separately shows a statistically significant negative correlation between cPrcp and Niño 3.4 December-April, with r² values exceeding 5% only in February and March (Figure S5; 7.7% and 5.6%, respectively). There is a statistically significant, slightly positive relation between Niño 3.4 and cPrcp in May (Figure S5). The linear relation between Niño 3.4 and SRH is stronger than the relation between Niño 3.4 and cPrcp in each month (Figures S5 and S6). The statistically significant negative correlation between SRH and Niño 3.4 December-May is strongest in January-March (Figure S6; r² of 15%, 20%, and 17%, respectively). There is no apparent dependence of the spread of SRH on Niño 3.4 and only a very modest dependence of the spread of cPrcp on the Niño 3.4 index in February and March. One explanation for the spread of TEI depending on Niño 3.4 follows from TEI being a product.

Consider independent random variables X and Y. The variance of their product is var(XY) = var(X)var(Y) + E[X]² var(Y) + E[Y]² var(X), which is larger when the means of X and Y are larger. Here the means of cPrcp and SRH are larger when Niño 3.4 is negative, which would lead to the variance of TEI being larger when Niño 3.4 is negative. This differing spread raises the question of how TEI predictability differs during warm and cool conditions.
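The product argument can be checked numerically. The exact formula below holds for independent factors (an idealization; cPrcp and SRH are not strictly independent), and `var_product_indep` is a hypothetical helper name:

```python
import numpy as np

def var_product_indep(mx, vx, my, vy):
    """Exact variance of X*Y for independent X and Y:
    var(XY) = vx*vy + mx**2 * vy + my**2 * vx."""
    return vx * vy + mx**2 * vy + my**2 * vx

# With unit variances, doubling both means (the cool-condition-like case)
# triples the variance of the product: 3 -> 9.
small_means = var_product_indep(1.0, 1.0, 1.0, 1.0)
large_means = var_product_indep(2.0, 1.0, 2.0, 1.0)
print(small_means, large_means)

# Monte Carlo spot check of the formula.
rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.0, 500_000)
y = rng.normal(2.0, 1.0, 500_000)
print((x * y).var())  # close to 9
```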

Predictability of TEI
To investigate the differing predictability of TEI during warm and cool ENSO conditions, we define warm ENSO conditions as months when Niño 3.4 is greater than its monthly standard deviation, and cool ENSO conditions as months when Niño 3.4 is less than the negative of its monthly standard deviation. The Niño 3.4 standard deviation decreases from 1.37°C in December to 0.7°C in May (Figure 1, green diamonds). The predictability of TEI based on ENSO state is the difference between the probability density function (PDF) of TEI conditional on ENSO state and its climatological (unconditional) PDF (DelSole & Tippett, 2007), and there are many measures of this difference. Measures based on information theory have attractive properties, including invariance to nonlinear transformations, but their use may be less familiar in weather and climate applications. Alternatively, the expected values of skill scores computed assuming that observations are drawn from the forecast distribution (the so-called perfect-model assumption) are measures of predictability whose interpretation is directly analogous to that of the underlying skill score: they tell us what skill to expect. With that in mind, we measure predictability using expected skill scores. For joint-Gaussian distributed forecasts and observations, the expected values of skill scores including MSESS (Section 3.3), RPSS, and LSS (Section 3.4) are equivalent to the information-theoretic quantity mutual information in the sense that they are all invertible and monotonic functions of the squared correlation (Tippett, 2019; Tippett et al., 2019). This fact means that for joint-Gaussian distributed forecasts and observations, these predictability measures give the same rankings. However, the distribution of TEI is not Gaussian, and the conditional (warm and cool months) and unconditional (all months) PDFs of TEI are strongly skewed to the right, with the peak to the left of the mean (Figure 2).
The increased width of the TEI PDF during cool conditions is especially clear for February and March. Differences in the TEI distributions for cool and warm conditions are smallest in May.
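The qualitative behavior described above (larger mean and larger spread of TEI under cool conditions) can be illustrated with a synthetic sketch; the lognormal form, coefficients, and sample sizes below are illustrative assumptions, not the paper's model output:

```python
import numpy as np

rng = np.random.default_rng(2)
nino = rng.normal(0.0, 1.0, 50_000)  # stand-in for monthly Nino 3.4 values
# Skewed, ENSO-dependent stand-in for TEI: lognormal with negative Nino 3.4 dependence.
tei = np.exp(0.5 - 0.4 * nino + 0.5 * rng.standard_normal(nino.size))

sd = nino.std()
cool = tei[nino < -sd]  # cool: Nino 3.4 below minus one standard deviation
warm = tei[nino > sd]   # warm: Nino 3.4 above plus one standard deviation

# A multiplicative (lognormal-like) index shifted toward larger values also widens.
print(cool.mean() > warm.mean(), cool.std() > warm.std())
```

For such a multiplicative index, a shift of the mean toward higher values automatically carries a wider spread, mirroring the warm/cool asymmetry in Figure 2.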

Expected Mean-Squared Error Skill Scores
MSESS depends on the mean shift and variance of the conditional distribution compared to the climatological distribution (see Equation 2, Section 3.3), and MSESS values can be interpreted as the fraction of explained variance; higher values are better. There are modest and inconsistent (from month to month) differences in the magnitude of the mean shifts of TEI for warm and cool conditions (Figure 3a), with larger shifts in some months for warm conditions (e.g., in December) and larger shifts in some months for cool conditions (e.g., in February). On the other hand, the spread of TEI (15th to 85th percentile range, shown in Figure 3a as error bars) is consistently larger during cool conditions. This difference in spread leads to TEI MSESS values being larger during warm conditions than during cool conditions in all months (Figure 3d). The lower value of TEI MSESS during cool conditions reflects the wider range of possible outcomes. TEI MSESS values are highest in February and March during warm conditions and substantially lower during cool conditions. The lowest TEI MSESS values are for cool conditions during April and May.
The predictability of TEI depends on that of cPrcp and SRH. There are only modest mean shifts for cPrcp during warm and cool conditions (Figure 3b), which leads to low cPrcp MSESS values (Figure 3e). Differences in cPrcp MSESS values between warm and cool conditions for a given month reflect the differences in the mean shifts. The larger mean shifts for SRH (Figure 3c) result in higher SRH MSESS values (Figure 3f) compared to those for cPrcp. Larger mean shifts during warm conditions in December, April, and May lead to higher SRH MSESS values.

Expected Probabilistic Skill Scores
During warm and cool conditions, the probability of TEI falling into one of three climatologically equally likely categories (below, normal, and above) shifts substantially away from 1/3 (equal chances) in all months except April and May (Figure 4; see Figure 2 for category boundaries). Expected RPSS and LSS values are higher for larger probability shifts (see Equations 5 and 6, Section 3.4), and the highest values are in February, when cool-condition scores are slightly higher than warm-condition ones due to the larger shift in probabilities. March scores are relatively high too, and the scores (and probability shifts) are nearly the same for warm and cool conditions. The behavior of the expected probabilistic scores contrasts with that of MSESS, which is much lower in February and March for cool conditions than for warm conditions. The larger spread during cool conditions does not translate to lower expected probabilistic skill scores because the probability shifts are still roughly as large as during warm conditions. Probability shifts and expected RPSS and LSS values are small in April and May.

Summary and Discussion
We examined the dependence of a US regional TEI on concurrent values of Niño 3.4 in a large set of CFSv2 forecasts with December-May monthly targets. In CFSv2, the negative relation between TEI and Niño 3.4 is strongest in February and March. Previous studies have often focused on 3-month seasons (i.e., December-February and March-May), which separate the months that here have the strongest ENSO signal (Moore, 2019). The behavior of TEI is understood in terms of its two constituents: cPrcp and 0-3 km SRH. SRH has a relatively strong negative relation with Niño 3.4 in all months analyzed, but cPrcp has a strong negative relation in February and March only, and in fact has a weak positive relation during May. The strong relation that we find in February is consistent with the NARR-based study of Koch et al. (2021), who found that Niño 3.4 was a useful covariate in February for extreme values of SRH and of the product of SRH and the square root of CAPE, but not for CAPE separately, and not in other months. The Enhanced Fujita (EF) scale-dependent tornado environmental indices of Lepore and Tippett (2020) show increased sensitivity to SRH with higher EF rating, which would here imply larger percent increases in TEI-EF2 (and higher) than in TEI-EF1 during La Niña conditions. Such behavior is in line with the earlier report-based finding (their Table 2) that during December-February, the relative increase when going from El Niño to La Niña in tornadoes rated EF2 and higher in the Southeast US was 16 percentage points greater than for tornadoes rated EF1 and higher.
In addition to TEI being larger on average for cool values of Niño 3.4, its spread is also larger then, and the PDF of TEI is extended rightward toward higher values. Higher percentiles of the TEI distribution are more sensitive to shifts in the Niño 3.4 index than are the median and mean. Greater spread during cool Niño 3.4 conditions is also consistent with previous findings such as Moore (2019), who noted higher variance of EF1+ tornado numbers in December-February and March-May (MAM) during cool ENSO conditions. Our results are also consistent with extended logistic regression forecasts of MAM severe thunderstorm activity, which produced larger probability shifts for cool ENSO conditions than for warm ENSO conditions. We demonstrated here that the change in spread with the Niño 3.4 index may reflect the fact that TEI is a product, since the variance of a product is larger when the means of its factors are larger.
Differences in TEI spread between warm and cool ENSO conditions have implications for predictability, which we measured by computing the expected value of skill scores assuming that observations are drawn from the model distribution, the so-called perfect-model assumption. Whether TEI is more predictable during warm or cool conditions depends on whether deterministic or probabilistic skill scores are used to measure predictability. The differing answers are due in part to TEI being non-Gaussian distributed. Measuring predictability in terms of signal and noise variance is equivalent to using a deterministic skill score such as the expected value of the MSESS. MSESS is larger (more predictability) for warm conditions than for cool conditions because of the reduced TEI spread. Probabilistic skill scores provide an alternative view of predictability. We found that the expected values of the RPSS and LSS are greater (more predictability) for cool ENSO conditions during February and March because of larger probability shifts. Higher probabilistic skill for cool conditions is consistent with Lepore et al. (2017), who found higher Brier skill scores in ENSO-based MAM forecasts when the Oceanic Niño Index was less than -1.0°C.
The approach here of supplementing the observational record with data from large forecast datasets has the potential to address other questions related to severe thunderstorms. For instance, the importance of other large-scale circulation features could be investigated. Childs et al. (2018) examined November-February EF1+ tornado numbers in the southeastern US and found a stronger relation with the Arctic Oscillation than with ENSO. We characterized ENSO state using the Niño 3.4 index, but other indices and SST patterns could be used (Lee et al., 2012, 2016). Another question for future work is the extent to which the strong ENSO signal in February and March is particular to the model and region analyzed here. We note that using forecast datasets to analyze climate and weather connections does not require the forecasts to have skill in the usual sense. Instead, the key requirement is that the forecasts be faithful representations of possible realizations of the climatological distribution. In fact, lack of skill implies greater independence across ensemble members and initializations, which increases the effective sample size.
Finally, we emphasize that here we have analyzed tornado-favorable environments, not tornadoes. The occurrence of tornado-favorable environments does not guarantee the occurrence of a tornado. Furthermore, we have used environments from a climate model, not from observations or reanalysis. The response of tornado-favorable environments to ENSO in a climate model may differ from that in nature.