Southeastern Pacific Error Leads to Failed El Niño Forecasts

In El Niño‐Southern Oscillation (ENSO) dynamical predictions, the ensemble members may show a large spread, leading to low prediction accuracy. The reasons for unreasonable forecast spreads in dynamical predictions are investigated based on hindcast/forecast results from the North American Multimodel Ensemble system. A member is assigned to the failed‐forecasting category if it forecasts a negative winter Niño3.4 index for an observed El Niño event. Compared with reasonable‐forecasting members, the failed‐forecasting members show significant cold sea surface temperature anomalies (SSTAs) in the southeastern Pacific (SEP). Such cold SSTAs can be traced back to the initial cold error in the SEP region. The initial cold error can be enhanced by positive feedback near the SEP region and further hinder warm SSTAs in equatorial regions, leading to a failed prediction. This result highlights the essential role of the SEP region and may help improve ENSO forecast skill.


Introduction
The El Niño-Southern Oscillation (ENSO) is a large-scale ocean-atmosphere interaction in the tropical Pacific Ocean that occurs every 3-7 years. ENSO exerts worldwide influences on weather and climate variability (Wang et al., 2000). Relatively high ENSO prediction skill is usually limited to a lead time of 1 year (Ludescher et al., 2013). Moreover, reasonable ENSO predictions remain a challenge for the current forecast systems based on dynamical models (Jeong et al., 2012), which motivates us to investigate ENSO prediction skill from the perspective of dynamical models.
Several factors could affect the ENSO prediction skill. The two well-known requisites for ENSO are the accumulated equatorial Pacific heat content, which is determined by the warm water volume (Meinen & McPhaden, 2000), and westerly wind bursts (Chen et al., 2015). However, the relationship between the leading warm water volume and ENSO formation is not robust (Su et al., 2018), and the timing and strength of westerly wind bursts are difficult to predict (Chen et al., 2015). In addition to these two fundamental factors, the North Pacific Oscillation (Linkin & Nigam, 2008) and the North Pacific meridional mode (Alexander et al., 2010) may affect ENSO development, but their relationship with ENSO is also complicated. Moreover, predicting the ENSO flavor beyond one season remains challenging (Hendon et al., 2009).
On the other hand, some other studies have highlighted that the southeastern Pacific (SEP) region is important to ENSO predictions (Ding et al., 2015; Imada et al., 2015; Min et al., 2015, 2017; Su et al., 2014; You & Furtado, 2018; Zhang et al., 2014). For example, the cooling sea surface temperature anomalies (SSTAs) in the SEP played a key role in hindering the development of the 2014 El Niño (Min et al., 2015). The cold SSTAs in the SEP region contributed to the anomalous easterly winds in the eastern equatorial Pacific, which suppressed the equatorial SST warming. Meanwhile, following the wind-evaporation-SST feedback, the cold SSTAs in the SEP could sustain themselves and penetrate into the equatorial Pacific (Min et al., 2015, 2017; Su et al., 2018). The SSTAs in the SEP region can favor SSTA development in the equatorial eastern Pacific and may lead to different flavors of El Niño (Min et al., 2017; Zhang et al., 2014). In addition, the key influence of SEP atmospheric variability with regard to ENSO predictions has also been confirmed in very recent studies (You & Furtado, 2018).
In dynamical ENSO predictions, one dynamical model usually runs multiple members from the same initial time, and considerable spreads always exist among these members. In particular, some members might drift to a state that departs markedly from the observed value and cause unreasonable spreads. Such unreasonable spreads can lead to low prediction accuracy, which is undesirable in dynamical prediction. Thus, it is worth checking whether unreasonable spreads occur in ENSO dynamical predictions. If so, what are the reasons for the unreasonable spreads? Can the key regions affecting ENSO in observations (such as the SEP) lead to unreasonable spreads?
The spring predictability barrier usually causes a relatively poor score for forecasts initialized before June (Webster, 1995; Webster & Yang, 1992). Physically, the spring barrier is an observed natural phenomenon due to the complicated coupling in the real climate system, which could inherently lead to a large uncertainty in the evolution of spring SSTs. However, this natural large uncertainty of spring SSTs in observations is essentially different from the large spread of initial SSTs in climate models, and these two issues should be investigated separately. As this study is intended to highlight the performance of North American Multimodel Ensemble (NMME) models, the following analyses mainly focus on the forecasts initialized at the beginning of June.

Data and Methods
Dynamical ENSO predictions are obtained from the NMME system (Kirtman et al., 2014). The NMME phase-I data sets are available from the website (http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME), and the NMME phase-II data sets are available online (https://www.earthsystemgrid.org/search.html?Project=NMME). The 16 models from phase-I and 4 models from phase-II of the NMME system were used in this study, while the several remaining models were excluded because their monthly mean sea surface temperature data were unavailable or scarce. The available NMME data set covers the period from 1981 to 2019 in phase-I and 1981 to 2012 in phase-II models, and the period covered varies by model. Following Table 1 of Becker et al. (2020), a detailed description of each model, including model name, valid periods, and lead times, is provided (Table S1 in the supporting information). The climatological monthly mean SST, as a function of both the initial month and lead time, is first calculated based on the entire period available for each model. Then, the monthly SSTAs for each model are calculated by removing the corresponding climatology. The following results are not sensitive to the climatology calculation. As this study focuses on the El Niño predictability from the developing to the peak phase of ENSO, the SSTAs from forecasts initialized near 1 June are used for the following analyses.
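The anomaly calculation described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the array layout (year, member, initial month, lead), the function name `remove_lead_dependent_climatology`, and the choice to average the climatology over ensemble members as well as years are all assumptions, and the data are synthetic.

```python
import numpy as np

def remove_lead_dependent_climatology(sst):
    """Return SST anomalies and the climatology they were computed from.

    sst: array with hypothetical layout (n_years, n_members, n_init_months, n_leads).
    The climatology depends on initial month and lead time only; it is averaged
    over all available years and (by assumption) over all members.
    """
    clim = np.nanmean(sst, axis=(0, 1), keepdims=True)
    ssta = sst - clim
    return ssta, clim.squeeze(axis=(0, 1))

# Synthetic hindcast for one model: 30 years, 10 members, 12 start months, 9 leads.
rng = np.random.default_rng(0)
sst = 26.0 + rng.normal(size=(30, 10, 12, 9))
ssta, clim = remove_lead_dependent_climatology(sst)
```

Because the climatology is indexed by initial month and lead time, any mean drift of the model with lead time is removed along with the seasonal cycle.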
To check whether a forecast member predicts El Niño events successfully, the forecasted D(0)J(+1)F(+1) Niño3.4 index (0 indicates the current year and +1 indicates the following year) is used as a criterion. If the forecasted DJF Niño3.4 index during the 13 observed El Niño events is positive, the forecast member contributes to a reasonable spread around the forecast ensemble mean. However, if the forecasted DJF Niño3.4 index is below 0, the forecast member is considered to have a cold error and may lead to an unreasonable cold spread for the ensemble mean. Although some reasonable forecast spread is expected in El Niño predictions, such unreasonable cold spread should be avoided. Hence, all the forecasted results are classified into two categories. One category (hereafter category 1; failed-forecasting category) represents the cases with unreasonable cold error, and the other (hereafter category 2; reasonable category) represents the cases with reasonable spread. The following analyses focus on the features of these two categories. One specific model may provide several forecast members, and each member is counted separately as an independent case.
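The classification rule above can be illustrated with a short sketch. It assumes the input SSTAs are already DJF means on a regular grid with longitudes in 0-360; the helper names `box_mean` and `classify_members` are hypothetical. The Niño3.4 box is the standard 5°S-5°N, 170°W-120°W region. The text does not say how an index of exactly zero is treated, so assigning it to category 2 here is an assumption.

```python
import numpy as np

NINO34_LAT = (-5.0, 5.0)
NINO34_LON = (190.0, 240.0)  # 170W-120W expressed in 0-360 longitudes

def box_mean(ssta, lat, lon, lat_bounds, lon_bounds):
    """Cosine-latitude-weighted mean over a lat/lon box; ssta is (..., n_lat, n_lon)."""
    lat_sel = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_sel = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    sub = ssta[..., lat_sel, :][..., lon_sel]
    weights = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(lon_sel.sum())
    return (sub * weights).sum(axis=(-2, -1)) / weights.sum()

def classify_members(djf_nino34):
    """Category 1 (failed) if the DJF index is below 0, otherwise category 2."""
    djf = np.asarray(djf_nino34, dtype=float)
    return np.where(djf < 0.0, 1, 2)

# Two synthetic members for one observed El Nino event: one cold, one warm.
lat = np.arange(-10.0, 11.0, 5.0)
lon = np.arange(180.0, 260.0, 10.0)
ssta_djf = np.stack([np.full((lat.size, lon.size), -0.6),
                     np.full((lat.size, lon.size), 1.2)])
djf_nino34 = box_mean(ssta_djf, lat, lon, NINO34_LAT, NINO34_LON)
categories = classify_members(djf_nino34)
```

Each member is classified on its own, which matches the statement that every member is counted as an independent case.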

Spatial Pattern of the Forecast Error
Significant differences are found in the composite SSTAs of the two categories (Figure 1). The forecasted SSTAs in the equatorial eastern Pacific are positive from June-July-August (JJA) to DJF in category 2, reflecting the development of warm SSTAs during the recorded El Niño events. However, the warm SSTAs in the equatorial eastern Pacific in JJA in category 1 are very weak, and the SSTAs become negative during September-October-November (SON) and DJF, forming a La Niña pattern. During the developing-mature phase (SON-DJF), obvious SSTA differences between the two categories can be found in several regions: the broad SEP (110°W-70°W, 5°S-25°S), the northeastern Pacific, and the northern Atlantic Ocean.
In these regions, the SSTAs are negative in category 1 but positive in category 2. In particular, the SSTA differences in the SEP region between the two categories show a relatively strong magnitude and a broad extent during SON-DJF. Meanwhile, the SEP SSTA differences can be directly linked with the SSTAs in key areas of El Niño events, such as the Niño1+2 region (90°W-80°W, 0°-10°S). Thus, the SEP SSTAs may play an important role in causing considerable SSTA spreads in the equatorial eastern Pacific.
Furthermore, the SSTA differences in the SEP between the two categories can be traced back to the initial JJA phase. The SSTAs of opposite sign in the SEP are already present in JJA, with a pattern similar to that in SON/DJF. In category 1, the cold SSTAs in the SEP, which originate in JJA with a salient magnitude, can persist and be enhanced for several months. The cold SSTAs in the SEP region then gradually invade the equatorial eastern Pacific and eventually cover a broad region in the eastern Pacific, from the equatorial region to the SEP region (Figure S1). This local enhancement of the SEP SSTAs is similar to that of the South Pacific meridional mode (Min et al., 2017; Zhang et al., 2014), which can affect ENSO formation. Hence, it can be deduced that the forecast members with cold error in the SEP region can contribute to the cold spreads of forecasted SSTAs in the equatorial regions.
To highlight the differences between the two categories, an occurrence frequency is defined for each category at every grid point. The occurrence frequency is the number of positive/negative SSTA occurrences divided by the total number of forecast members at every grid point in each category. The horizontal patterns of the frequency of positive SSTAs (hereafter positive frequency) and negative SSTAs (hereafter negative frequency) in the two categories are then compared (Figure 2). The overall spatial pattern of the occurrence frequency is consistent with that of the composite SSTAs for each category (Figure 1). As discussed above, the sign (positive or negative) of the SEP SSTAs is a key factor in determining the final classification of the forecasted SSTA spread. In fact, negative SSTAs occur more frequently in the SEP in category 1, whereas positive SSTAs occur more frequently in the SEP in category 2. The difference in occurrence frequencies between the two categories shows an outstanding magnitude and broad extent in the SEP region, which persists from JJA to DJF. Similar to the evolution of negative SSTAs in category 1 (Figure 1), the significant negative frequency in the SEP starts in JJA and then strengthens in the following phases. In accordance with the negative SSTA development (Figure S1), the negative frequency in the SEP region gradually invades the equatorial eastern Pacific and finally covers the broad eastern Pacific, including the equatorial and SEP regions (Figure S2). Hence, it can be deduced that the forecast members with cold error in the SEP region are favorable to the formation of unreasonable cold events in these El Niño forecasts.
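The occurrence frequency defined above amounts to counting, at each grid point, the fraction of members in a category whose SSTA is positive or negative. A minimal sketch, assuming the members of one category are stacked along the first axis of an array (the function name `occurrence_frequency` and the tiny synthetic field are illustrative only):

```python
import numpy as np

def occurrence_frequency(ssta):
    """Positive and negative SSTA occurrence frequency at every grid point.

    ssta: (n_members, n_lat, n_lon) anomalies for all members of one category.
    Returns two (n_lat, n_lon) arrays with values between 0 and 1.
    """
    n_members = ssta.shape[0]
    positive = (ssta > 0).sum(axis=0) / n_members
    negative = (ssta < 0).sum(axis=0) / n_members
    return positive, negative

# Four synthetic members on a 1 x 2 grid.
ssta = np.array([[[-1.0, 2.0]],
                 [[-0.5, 1.0]],
                 [[0.3, -2.0]],
                 [[-0.2, 0.1]]])
pos_freq, neg_freq = occurrence_frequency(ssta)
```

Unlike the composite mean, this measure is insensitive to a few members with very large anomalies, since every member contributes equally to the count.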

Classification of the Forecasted Cold Error
The negative frequency in the SEP does not reach 100% in JJA in category 1; instead, the value is approximately 70% (Figure 2a). In other words, although the SSTAs in the SEP are mostly negative in category 1, there are still some occurrences of positive SSTAs in the SEP region. As the SEP SSTAs are essentially connected to the evolution of SSTAs in the key Niño1+2 region for El Niño development, it is necessary to further investigate the response of the SEP SSTAs to the initial condition in the dynamical forecast system. Category 1 is therefore divided into two subcategories, depending on whether the Niño1+2 index in June is negative (hereafter category 3; with 276 members) or positive (hereafter category 4; with 154 members) (Figure 3; see also the SSTAs in Figure S3). The spatial pattern of occurrence frequency in June is almost identical to that in JJA in the two subcategories, implying the steady persistence of the initial condition over this region. The spatial pattern of the frequency in category 3 is rather similar to that in category 1 (Figure 2), and the occurrence frequency in category 1 can be largely attributed to that in category 3. The high frequency of cold errors in the broad eastern Pacific in category 3 can persist and be enhanced for three seasons. From SON to DJF, the cold errors are amplified along the equatorial region, and the Niño3.4 region is completely covered by cold errors in winter, implying a failed El Niño prediction. However, the failed prediction in category 4 seems to be caused by another mechanism. Some positive errors near the equatorial region can be seen in category 4, and those positive errors decrease gradually from June to DJF. Ultimately, the positive frequency in most of the equatorial Pacific in June/JJA gradually changes to negative in the following months, corresponding to a failed El Niño prediction. The positive frequency in category 4 in June/JJA occurs not only in the equatorial Pacific but also near the equator in the Indian Ocean and Atlantic Ocean. As category 4 has 154 members, the failed prediction caused by synchronous warming errors in the three basins should also be considered in ENSO dynamical predictions. Many previous studies pointed out that the SSTAs of the Indian Ocean and Atlantic Ocean could influence the ENSO evolution (Ham & Kug, 2015; Izumo et al., 2010; Polo et al., 2015; Wang et al., 2017). The impacts of the Indian Ocean and Atlantic Ocean can also be inferred in the reasonable-forecasting category, where the initial cold error over the Indian Ocean and Atlantic Ocean may also contribute to reasonable predictions (Figures 1d-1f). Hence, the potential influence of the Indian Ocean and Atlantic Ocean on the ENSO forecast deserves further research but is beyond the scope of this study.

Figure 2. The SSTA occurrence frequency of (a-c) failed-forecasting category 1 and (d-f) reasonable-forecasting category 2. From top to bottom, the SSTA occurrence frequencies in (a, d) JJA, (b, e) SON, and (c, f) DJF are presented. The SSTA occurrence frequency for each category is defined as the number of positive/negative SSTA occurrences divided by the total number of forecast members in each category. The total numbers of forecast members in category 1 and category 2 are (a-c) 430 and (d-f) 2,000, respectively. The red color indicates that the positive SSTA occurrence frequency is greater than 50%, and the blue color indicates the same information for the negative SSTA occurrence frequency.

Figure 3. The SSTA occurrence frequency for each category is defined as the number of positive/negative SSTA occurrences divided by the total number of forecast members in category 1. The total number of forecast members in category 1 is 430, of which 276 are in category 3 and 154 are in category 4. The red color indicates that the positive SSTA occurrence frequency is greater than 30%, and the blue color indicates the same information for the negative SSTA occurrence frequency.

Hua and Su, Geophysical Research Letters, 10.1029/2020GL088764

Statistical Cold Error Analysis Related to Each Recorded El Niño Event
To check whether the above conclusions are sensitive to a particular dynamical model, the cold spreads of the forecasted SSTAs are calculated for each model (Figure 4). Here, the cold error occurrence frequency for each El Niño event is defined as the number of forecast members in category 1 from each model divided by the total number of ensemble members from the 20 obtained models. This frequency can be used to evaluate the probability that each model predicts unreasonable cold events. The cold error is not produced by one or two specific models; rather, it is a common error in NMME models. Almost all the NMME models make a considerable contribution to the cold spreads of the forecasted Niño3.4 SSTAs. The failed-forecasting frequencies differ among the models, and the performance of each model varies among the observed events. The performance of individual forecast members also varies among the observed El Niño events (Figure S4). In addition, an observed event with a high Niño3.4 index (1982, 1991, 1997, 2009, and 2015) likely corresponds to a low failed-forecasting frequency, indicating that such events may be easier to forecast with the present dynamical models. During the three super El Niño events (1982, 1997, and 2015), significant positive SSTAs had already appeared in the SEP in May (before June) (Figure S5). Such warm SSTAs in the SEP are a favorable condition for the formation of a strong El Niño because they promote westerly anomalies along the equator (Min et al., 2017; Su et al., 2018). In such a situation, it is hard to obtain a negative (cold) SSTA in the SEP from any kind of perturbation in data assimilation. Hence, positive SSTAs (regardless of their amplitude) are created in the initial condition in all the members and in all the models. As a result, the forecasted DJF ONI is generally positive, falling into the reasonable-forecasting class.
Similarly, the corresponding failed-forecasting frequencies for category 3 and category 4 are calculated (Figure S6). For category 3, the frequency is dispersed among numerous events, while the majority of the frequency in category 4 is confined to one specific event (1987). The high frequency in category 1 generally occurs in six El Niño events: 1986, 1987, 1994, 2002, 2004, and 2006. Category 3 explains most of the frequency in the 1986, 1994, 2004, and 2006 events, but category 4 causes a significant frequency in the 1987 event. Meanwhile, category 3 and category 4 share the frequency in the 2002 event. This finding means that category 3 makes a stronger contribution to category 1 than category 4 does. Consequently, the cold error frequency originating from category 3 accounts for the majority of the failed-forecasting members. In category 3, some models have a slight/mild total failed-forecasting frequency, implying that they perform better in forecasting El Niño events. However, cold spreads would still be produced whenever these "better" models are given a cold error in the SEP region, leading to failed predictions. Thus, more attention should be paid to the SEP region, which should be considered an important indicator in future El Niño predictions even if models have high operational forecast skill.
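The per-model cold error frequency can be sketched as below. This is one reading of the definition in the text, not the authors' code: for each event, the count of category 1 members from one model is divided by the total number of members from all models for that event. The function name `cold_error_frequency`, the model labels, and the sample values are hypothetical.

```python
import numpy as np

def cold_error_frequency(model_ids, event_years, categories):
    """Cold error frequency with rows = models and columns = events.

    Each input holds one entry per forecast member. The denominator is the
    total number of members from all models for the event (an assumption
    about how the definition in the text is applied).
    """
    model_ids = np.asarray(model_ids)
    event_years = np.asarray(event_years)
    categories = np.asarray(categories)
    models = np.unique(model_ids)
    events = np.unique(event_years)
    freq = np.zeros((models.size, events.size))
    for j, event in enumerate(events):
        in_event = event_years == event
        total = in_event.sum()
        for i, model in enumerate(models):
            failed = in_event & (model_ids == model) & (categories == 1)
            freq[i, j] = failed.sum() / total
    return models, events, freq

# Six synthetic members from two hypothetical models and two events.
models, events, freq = cold_error_frequency(
    model_ids=["A", "A", "B", "B", "A", "B"],
    event_years=[1986, 1986, 1986, 1986, 1997, 1997],
    categories=[1, 2, 1, 1, 2, 2],
)
event_total = freq.sum(axis=0)  # total failed-forecasting fraction per event
```

With this normalization, summing over models gives the overall fraction of failed-forecasting members for each event, so the contributions of individual models can be compared directly.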

Cases Initialized Before June and in La Niña Events
Similar results can also be obtained based on the forecasts initialized in earlier months, either in May (Figure S7) or in April (Figure S8). For forecasts initialized in May or April, significant differences in SSTA occurrence frequencies between the two categories during the period from JJA to DJF can also be found in the SEP region, with large magnitude and broad extent similar to those initialized in June. Such signals in the SEP region can be traced back to the conditions in May (Figure S7). However, the difference in negative/positive frequency between the two categories is relatively weak in the SEP region in April (Figure S8). From the perspective of SEP SSTA error, the forecast skill for initialization in April is lower than that for initialization in May/June, which is probably linked with the spring predictability barrier.
In addition, the initial error of SEP SSTAs also affects the prediction skill in La Niña events. According to the observed ONI, there were 13 La Niña events (1983, 1984, 1988, 1995, 1998, 1999, 2000, 2005, 2007, 2008, 2010, 2011, and 2017) during the period from 1981 to 2019. Following the same method applied to the El Niño events, the SSTA occurrence frequency for the negative phase of ENSO is also obtained (Figure S9). There are some positive SSTA signals around the SEP region in the failed-forecasting category. On the other hand, the SEP is generally occupied by negative SSTAs in the reasonable-forecasting category in all three seasons. Hence, it can be deduced that the SEP SSTA errors in the initial months could also play an important role in causing failed La Niña predictions.

Summary and Discussion
In the dynamical predictions for the recorded El Niño events, some unreasonable cold spreads were generated by the dynamical models, leading to a failed prediction. The southeastern Pacific region is a key region for such cold spreads, as the forecast members with cold error in the SEP region can contribute to unreasonable forecasted SSTAs near the Niño regions. It is suggested that the SEP cold error should be given more attention in ENSO forecasting as well as seasonal climate predictions.
In addition to the SEP cold error, other regions may also contribute to the cold spreads in El Niño forecasting. One example occurred in the 1987 El Niño event, during which both the Indian Ocean and Atlantic Ocean showed simultaneous warming, which was different from category 3. The 1987 El Niño event is unique and needs further careful analysis, as the difficulty in continually predicting El Niño conditions during 1987-1988 was also mentioned in another study (Barnston et al., 2019).
In ensemble predictions, members with extreme departures from the observed true value can cause low prediction skill, which is not expected. The failed-forecasting members seem to be inevitable. Some common features of such failed-forecasting members in ENSO prediction have already been obtained in this study.
Further in-depth investigation of this issue can provide useful clues for creating better initial conditions for dynamical predictions and can also help to improve ENSO prediction skill. In particular, the gradual evolution of the initial errors within 1 or 2 months is a key issue, during which random synoptic errors could be enhanced by atmosphere-ocean interaction into a steady large-scale climatic anomaly pattern. Hence, future research on ENSO prediction should be conducted under the framework of seamless prediction, covering the period from days to months.