How likely is an El Niño to break the global mean surface temperature record during the 21st century?

The likelihood of an El Niño breaking the annual global mean surface temperature (GMST) record during the 21st century is derived from 38 climate models from the Fifth Coupled Model Intercomparison Project (CMIP5). We find that, under a low emission scenario, one out of three El Niño events breaks the GMST record. The probability significantly increases to four out of five in a high emission scenario. About half of strong El Niños, but only one-fifth of weak El Niños, can set new GMST records in a low emission scenario. By contrast, even weak El Niños break the GMST record more regularly (68 ± 8% chance) in a high emission scenario. Both a stronger El Niño and a higher emission scenario induce a higher record-breaking GMST, with a magnitude ranging from 0.03°C to 0.21°C above the previous record. El Niño accounts for more than half of record-breaking GMST occurrences in all emission scenarios. A comparison between CMIP3, CMIP5, and CMIP6 suggests that the analyses are not affected by model generation.


Introduction
The efficiency and rapidity of El Niño in pushing up the record warm global mean surface temperature (GMST) have been demonstrated in recent decades, notably during the strong 1997/98 and 2015/16 El Niños. As a naturally occurring phenomenon, an El Niño can cause a strong and transient warming of the global climate, superimposed on the gradual and persistent warming induced by greenhouse gas (GHG) forcing (Newell and Weare 1976, Pan and Oort 1983, Timmermann et al 1999, Trenberth 2002). The gradual background warming continues to shift the probability distribution of GMST toward higher values, thereby increasing the chance of record-breaking GMST once any internal fluctuation occurs (Wergen and Krug 2010). Recently, Power and Delage (2019) demonstrated the likelihood of a monthly record-breaking surface temperature occurring under different future scenarios. Since 1980, 11 out of 14 record-breaking annual GMSTs have coincided with an El Niño event (Yin et al 2018). Over a long period, these record-breaking GMSTs reflect the general warming trend (Su et al 2017), while on short timescales, they reveal extreme events with non-linearity or threshold behaviors inherent in the climate system (Yin et al 2018).
The interactions between El Niño and global warming and their combined effect on GMST are important topics in climate change detection and attribution (Foster and Rahmstorf 2011). During the 21st century, GHG forcing is projected to alter the characteristics and statistics of El Niño, such as its frequency, amplitude, duration, pattern, and teleconnection (Chen et al 2016). Besides El Niño, other factors can also influence GMST variations on various timescales, including the Pacific Decadal Variability (PDV)/Inter-decadal Pacific Oscillation (IPO) (Meehl et al 2016, Su et al 2017), Atlantic Multi-decadal Variability (AMV) (Schlesinger and Ramankutty 1994, Zhang et al 2007, Muller et al 2013), Atlantic Meridional Overturning Circulation (Schleussner et al 2014), and Arctic Oscillation (AO) (Buermann et al 2003, Zanchettin et al 2013). Regardless of the many factors listed above, Trenberth (2002) shows the importance of diabatic heating from the tropical Pacific Ocean for the GMST. Thus, when an El Niño occurs during the 21st century, an important question for near-term climate change prediction is: how likely is the El Niño to break the GMST record, and by how much? To answer this, we analyze the historical simulations and future projections from 38 climate models in the Fifth Coupled Model Intercomparison Project (CMIP5) (Taylor et al 2011), supplemented with results from CMIP3 and CMIP6. It should be noted that we do not attempt to make actual, real-time predictions for particular events, but rather to obtain the statistics from the CMIP5 models about El Niño, global warming, and record-breaking GMST. We evaluate these statistics for each model before calculating their multi-model ensemble mean.

Models and methods
The CMIP5 multi-model ensemble has demonstrated improvements in the convergence of El Niño amplitude and life cycle between models when compared to CMIP3 (Bellenger et al 2014). To identify the El Niño events and associated record-breaking GMSTs during the 21st century (2006-2100), we use the monthly 'tos' and 'tas' outputs from a total of 38 CMIP5 models under the same ensemble run and calculate the Oceanic Niño Index (ONI) and GMST. The variables 'tos' and 'tas' represent sea surface temperature (SST) in the oceanic models and the near-surface (two-meter) air temperature in the atmospheric models, respectively. The ONI is routinely used by the Climate Prediction Center (CPC) at the National Oceanic and Atmospheric Administration (NOAA) to monitor the condition of the tropical Pacific and to issue El Niño/La Niña warnings once thresholds are passed (https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php).
To obtain the ONI in the CMIP5 models, we calculate the area-weighted mean of monthly SST anomalies in the Niño 3.4 region (5°N-5°S, 170°W-120°W) with the seasonal climatology and long-term trend removed before calculating the three-month running mean. The detrending follows the method from the CPC: for each successive five-year period, we remove the mean of the 30-year window centered on that period. This prevents the trend from affecting the identification of El Niño events in the different Representative Concentration Pathway (RCP) scenarios. An El Niño event is identified whenever the ONI is higher than 0.5°C for more than five consecutive months, the same criterion set by the CPC. We also tested two other methods for ONI detrending: removal of the linear trend, and removal of the low-frequency component identified with a low-pass Butterworth filter. All three methods generate results that vary within the error bars shown in this study. In this study, we use the CPC method to determine El Niño.
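The smoothing and event-identification steps described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names `running_mean_3` and `find_events` are hypothetical, the input is assumed to be an already detrended monthly Niño 3.4 anomaly series, and the default `min_months=5` assumes the CPC convention of at least five consecutive months (set it to 6 for a strict reading of "more than five").

```python
import numpy as np

def running_mean_3(x):
    """Centered three-month running mean; the first and last values are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    out[1:-1] = (x[:-2] + x[1:-1] + x[2:]) / 3.0
    return out

def find_events(oni, threshold=0.5, min_months=5):
    """Return (start, end) index pairs (end exclusive) of runs in which
    oni > threshold for at least min_months consecutive months."""
    above = np.asarray(oni, dtype=float) > threshold  # NaN compares as False
    events, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # a warm run begins
        elif not flag and start is not None:
            if i - start >= min_months:    # run just ended; keep if long enough
                events.append((start, i))
            start = None
    # Handle a run that is still ongoing at the end of the series.
    if start is not None and len(above) - start >= min_months:
        events.append((start, len(above)))
    return events
```

The peak of the smoothed index within each returned run can then be used to classify the strength of the event.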
For the GMST in each model and RCP, we first derive a monthly time series of GMST from the area-weighted average of 'tas' with the seasonal cycle removed. We further calculate an annual mean time series to identify the record-breaking annual GMSTs. For each model, we set 1850-1899 in the historical run as the reference period to identify the first record-breaking GMST and then the following records during the 20th and 21st centuries. Trenberth (2002) has shown that the GMST response typically lags the Niño 3.4 index by around three months in the observational data. Therefore, we pick out the years in which the record-breaking annual GMST overlaps the El Niño period shifted by a three-month lag (figure 1(a)). However, there is a possibility of incorrectly linking El Niño to a record-breaking GMST. This happens when a record-breaking GMST is largely induced by other mechanisms and merely coincides with the year of El Niño onset. To avoid any incorrect attribution to El Niño, we consider two particular situations based on the onset time of El Niño (figure 1(b)). If the onset happens in the first half of year 1, we calculate the mean of the monthly GMST with the seasonal cycle removed after the El Niño onset. Only if this mean is higher than the annual mean GMST do we attribute the record-breaking GMST to the occurrence of the El Niño. If the El Niño onset happens in the second half of year 1, we do not associate the GMST, even if record-breaking, with this El Niño. This is because the El Niño does not have enough time to significantly influence the annual mean GMST given the three-month lag.
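As an illustration of the GMST steps above, the sketch below computes an area-weighted mean on a regular latitude-longitude grid and then scans an annual series for records. It rests on two assumptions not stated in the text: that the area weights are proportional to the cosine of latitude, and that the first record is any year warmer than the warmest year of the 1850-1899 reference period. The function names are hypothetical.

```python
import numpy as np

def area_weighted_mean(field, lat_deg):
    """Mean of a (nlat, nlon) field on a regular grid, weighted by cos(latitude)."""
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    zonal_mean = np.asarray(field, dtype=float).mean(axis=1)
    return float(np.average(zonal_mean, weights=weights))

def record_years(years, annual_gmst, ref_start=1850, ref_end=1899):
    """Return (year, magnitude) for each record-breaking year after the
    reference period; magnitude is the excess over the previous record.
    Assumes `years` is sorted in ascending order."""
    years = np.asarray(years)
    gmst = np.asarray(annual_gmst, dtype=float)
    in_ref = (years >= ref_start) & (years <= ref_end)
    current = gmst[in_ref].max()           # warmest year of the reference period
    records = []
    after = years > ref_end
    for y, t in zip(years[after], gmst[after]):
        if t > current:                    # strictly warmer than the standing record
            records.append((int(y), float(t - current)))
            current = t
    return records
```

Each returned magnitude is the difference between the new and the previous record, which is the quantity accumulated later for events that span consecutive record years.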
We calculate the likelihood (%) of an El Niño to break the GMST record according to

$$P(T|\epsilon) = \frac{N_{\epsilon,T}}{N_{\epsilon}} \times 100\%,$$

where $T$ and $\epsilon$ represent a record-breaking GMST and El Niño event, respectively. $N_{\epsilon}$ is the total number of El Niño events. $N_{\epsilon,T}$ is the number of El Niños that induce at least one record-breaking GMST. In other words, $P(T|\epsilon)$ represents, in all El Niño events, the probability of an El Niño to induce record-breaking GMST. Similarly, we calculate the fraction of record-breaking GMST years attributable to the occurrence of an El Niño according to

$$P(\epsilon|T) = \frac{N_{T,\epsilon}}{N_{T}} \times 100\%,$$

where $N_{T}$ is the total number of record-breaking GMSTs and $N_{T,\epsilon}$ is the number of record-breaking GMSTs induced by El Niño. Figure 1(c) shows the relation between the different numbers used in the equations above ($N_{\epsilon}$, $N_{T}$, $N_{\epsilon,T}$, and $N_{T,\epsilon}$). The number of record-breaking GMSTs that are induced by an El Niño ($N_{T,\epsilon}$) can exceed the number of El Niños that induce a record ($N_{\epsilon,T}$), because a single El Niño event can be associated with consecutive record-breaking years. We also calculate the magnitude of a record-breaking event associated with El Niño ($M(T|\epsilon)$). The magnitude of a record-breaking GMST is simply the difference between the new and previous records. Since there are some cases of consecutive record-breaking years associated with one El Niño event, we accumulate the record-breaking magnitudes to study the total effect of the El Niño.
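The two ratios and the accumulation of magnitudes can be illustrated with a short sketch. The helper names are hypothetical, and the counts in the usage note are the ensemble-mean values reported later in the text; because the study evaluates each model before averaging, a ratio of ensemble means only approximates the reported likelihoods.

```python
def likelihood_percent(n_joint, n_total):
    """Likelihood in percent: the joint count divided by the total count."""
    if n_total == 0:
        return float("nan")   # undefined when there are no events
    return 100.0 * n_joint / n_total

def accumulated_magnitudes(records_by_event):
    """Sum the record-breaking magnitudes within each El Niño event.

    records_by_event: one list of magnitudes (degC) per El Niño; events
    with no record are skipped.
    """
    return [sum(mags) for mags in records_by_event if mags]
```

For example, `likelihood_percent(8, 22)` gives about 36%, in line with the statement that roughly one out of three El Niños breaks the record in RCP2.6, and `likelihood_percent(20, 25)` gives 80% for RCP8.5.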
We use four emission scenarios, RCP2.6, RCP4.5, RCP6.0, and RCP8.5, to study how different emission scenarios change the likelihood and magnitude during the 21st century. For any two estimated values $X \pm x$ and $Y \pm y$, the difference is $D \pm d$, where $D = |X - Y|$ and $d = \sqrt{x^2 + y^2}$ based on error propagation. $D$ is statistically significant when $D > d$.
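The significance criterion above is easy to check numerically. The following sketch uses a hypothetical helper name and applies the rule to two pairs of ensemble-mean values quoted elsewhere in the text.

```python
import math

def difference_with_error(x_val, x_err, y_val, y_err):
    """Return (D, d, significant) for two estimates X ± x and Y ± y,
    with D = |X - Y|, d = sqrt(x^2 + y^2), and significance when D > d."""
    big_d = abs(x_val - y_val)
    small_d = math.sqrt(x_err ** 2 + y_err ** 2)
    return big_d, small_d, big_d > small_d

# 64 ± 8% versus 70 ± 4%: D = 6, d is about 8.9, so not significant.
print(difference_with_error(64, 8, 70, 4))
# 20 ± 2 versus 8 ± 1 events: D = 12, d is about 2.2, so significant.
print(difference_with_error(20, 2, 8, 1))
```

The first case reproduces the 6 ± 9% difference reported for the CMIP3 and CMIP5 comparison.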

Likelihood of record-breaking GMST during El Niño
To calculate the likelihood of record-breaking GMST during an El Niño ($P(T|\epsilon)$), we first identify El Niños, record-breaking GMSTs, and record-breaking GMSTs associated with El Niño in each model and RCP (figure 2 and figures S1-S4 available in the online supplementary material at stacks.iop.org/ERL/14/094017/mmedia). The ensemble mean of $N_{\epsilon}$ is 22 ± 3 in RCP2.6, 24 ± 2 in RCP4.5, 24 ± 3 in RCP6.0, and 25 ± 2 in RCP8.5 (figure 3(a)), which is consistent with the pre-industrial control (piControl) run during a 100-year period with $N_{\epsilon}$ = 23 ± 2. The ranges shown in this study represent the 95% confidence interval. The result indicates that, in most models, GHG forcing has little impact on El Niño frequency during the 21st century. The differences in $N_{\epsilon}$ are relatively small between RCPs for each model (i.e., scenario uncertainty). The largest difference in $N_{\epsilon}$ comes from the difference between models. The multi-model ensemble mean helps reduce this structural uncertainty (Palmer et al 2005, Tebaldi and Knutti 2007). Most models simulate 14-35 El Niño events in their RCP projections, consistent with the El Niño cycle of 2-7 years. One model (INM-CM4) shows a significant reduction of the total number of El Niños from 28 in the historical run to 6 in the RCP projections. Internal variability and external forcing could be the possible causes for this reduction (Wittenberg et al 2014). By removing the three models that simulate fewer than 14 El Niño events in RCPs, the statistics shown in this study change by less than 5% and are still within the error bars of the ensemble mean estimates (table 1).
The number of El Niño events that are associated with record-breaking GMSTs ($N_{\epsilon,T}$), however, increases significantly with increasing emissions. $N_{\epsilon,T}$ is 8 ± 1 in RCP2.6, 13 ± 1 in RCP4.5, 14 ± 2 in RCP6.0, and 20 ± 2 in RCP8.5 (figure 3(a)). The differences in $N_{\epsilon,T}$ are statistically significant between RCPs except between RCP4.5 and RCP6.0. Due to the different responses of $N_{\epsilon}$ and $N_{\epsilon,T}$, $P(T|\epsilon)$ increases with the increasing emissions in the RCPs (table 1). Except for the insignificant difference between RCP4.5 and RCP6.0, there are significant differences between any other two scenarios. The largest increase of 43 ± 6% in $P(T|\epsilon)$ occurs when the emission increases from RCP2.6 to RCP8.5. The result is consistent with the fact that the larger the warming trend in GMST, the higher the chance for internal variability to break the GMST record.
For consistency and to further investigate how El Niño strength influences record-breaking GMST, we categorize El Niño events based on the maximum value of the three-month mean ONI during each El Niño period. El Niños are categorized into weak, moderate, and strong events with ONI ranges of 0.5 < ONI ≤ 1.0, 1.0 < ONI ≤ 1.5, and ONI > 1.5, respectively (Trenberth 2018). Table 1 shows the ensemble mean of $P(T|\epsilon)$ under different categories of El Niño and RCPs (figures S6-S8). As in the case of all El Niños, there is a positive correlation between emission strength and $P(T|\epsilon)$ in each category. The larger error bars in each category are due to fewer El Niño events when compared to the all El Niño case (figures S5-S8). Despite the changes in error bar, in each category, the $P(T|\epsilon)$ in RCP8.5 is still significantly higher than in RCP2.6. For example, only about one-fifth (21%) of weak El Niños can break the GMST record in RCP2.6. The chance significantly increases to two-thirds (68%) in RCP8.5. The result implies that the increase of emissions from RCP2.6 to RCP8.5 significantly increases $P(T|\epsilon)$ regardless of the strength of El Niño. Strong El Niños in the high emission scenario (RCP8.5) are very likely (86%) to break the GMST record.
Table note. $P(T|\epsilon)$ represents, in all El Niño events, the probability of an El Niño to induce a record-breaking GMST. $P(\epsilon|T)$ represents, in all record-breaking GMSTs, the probability of those induced by an El Niño. $M(T|\epsilon)$ represents the average magnitude of a record-breaking GMST associated with El Niño. Error bars represent the 95% confidence interval.
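The strength categories used in this section amount to a simple threshold rule on the peak three-month mean ONI of each event. A minimal sketch follows; the function name is hypothetical, and the upper bounds are assumed to be inclusive (weak up to and including 1.0, moderate up to and including 1.5).

```python
def el_nino_category(peak_oni):
    """Classify an El Niño by the peak three-month mean ONI (degC).

    Assumed ranges: weak 0.5 < ONI <= 1.0, moderate 1.0 < ONI <= 1.5,
    strong ONI > 1.5. Returns None below the El Niño threshold.
    """
    if peak_oni > 1.5:
        return "strong"
    if peak_oni > 1.0:
        return "moderate"
    if peak_oni > 0.5:
        return "weak"
    return None
```

Grouping events by this label before computing the likelihood gives the per-category values of the kind listed in table 1.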
Between RCP8.5 and RCP4.5, for which we have the most available models, we find that only the weak and moderate El Niño cases show statistical significance in the differences of $P(T|\epsilon)$. The differences between RCP4.5 and RCP8.5 in $P(T|\epsilon)$ decrease when El Niño strength increases, from 31 ± 11% (weak El Niño) and 16 ± 12% (moderate El Niño) to 8 ± 10% (strong El Niño). The error bar of the $P(T|\epsilon)$ difference does not change much between categories, but the difference in the strong case is around 75% smaller than in the weak case. This relatively small difference between RCP4.5 and RCP8.5 in the strong El Niño case implies a 'saturation' of $P(T|\epsilon)$. In other words, a strong El Niño can trigger a GMST variation large enough to ensure a record-breaking year regardless of the differences in warming rates between the two scenarios (table 1).
To highlight the important role of El Niño in record-breaking GMSTs, we apply the bootstrapping method with 1000 iterations in each RCP scenario and compare the average magnitudes of record-breaking GMSTs during El Niños and during non-El Niño periods. In all RCPs, the average magnitudes are larger during El Niños than during non-El Niño periods. We find differences of 0.06 ± 0.01°C in RCP8.5, 0.05 ± 0.01°C in RCP6.0, 0.04 ± 0.01°C in RCP4.5, and 0.04 ± 0.01°C in RCP2.6.
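The text does not describe the bootstrap procedure in detail, so the following is only one plausible sketch: both samples of record-breaking magnitudes are resampled with replacement 1000 times, and the distribution of the difference in means is summarized by its mean and a 95% percentile interval. The function name and the input values in the usage note are hypothetical.

```python
import numpy as np

def bootstrap_mean_difference(a, b, n_iter=1000, seed=0):
    """Bootstrap the difference mean(a) - mean(b).

    Returns (mean difference, lower 2.5th percentile, upper 97.5th percentile).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diffs = np.empty(n_iter)
    for i in range(n_iter):
        # Resample each group with replacement, keeping its original size.
        a_star = rng.choice(a, size=a.size, replace=True)
        b_star = rng.choice(b, size=b.size, replace=True)
        diffs[i] = a_star.mean() - b_star.mean()
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    return float(diffs.mean()), float(lower), float(upper)
```

Called with, say, made-up El Niño magnitudes `[0.10, 0.12, 0.14, 0.11, 0.13]` and non-El Niño magnitudes `[0.05, 0.06, 0.07, 0.05, 0.06]`, the interval lies entirely above zero, which is the kind of evidence the comparison in the text relies on.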

Record-breaking magnitude of GMST during El Niño
By considering different El Niño strengths, $M(T|\epsilon)$ also shows a higher value under higher emission scenarios (table 1). Although the differences in $M(T|\epsilon)$ between RCP2.6, RCP4.5, and RCP6.0 are not statistically significant under the same El Niño category, the differences between RCP8.5 and RCP4.5/RCP2.6 appear to be statistically significant regardless of the El Niño category. This shows that $M(T|\epsilon)$ is critically influenced by external forcing. By reducing emissions, we can reduce the average magnitude of a record-breaking GMST during El Niño. However, one should be aware that reducing emissions also decreases the total number of record-breaking GMSTs (figure 3(b)). In addition, a stronger El Niño also shows a higher $M(T|\epsilon)$ under a fixed RCP, which is expected because a stronger El Niño can induce a larger GMST variation. The differences in $M(T|\epsilon)$ between weak and strong El Niños show statistical significance in all RCPs.

Likelihood of record-breaking GMST linked to an El Niño during the 21st century
Since El Niño is not the only factor that can cause a record-breaking GMST, we calculate the likelihood for a record-breaking GMST being associated with an El Niño event ($P(\epsilon|T)$). The total number of record-breaking GMSTs ($N_{T}$) is closely correlated with the strength of emissions in all models (figure 3(b)). The positive correlation between the emission scenario and the number of record-breaking GMSTs associated with El Niño ($N_{T,\epsilon}$) is also evident in most models. Interestingly, figure 3(b) and table 1 show a decreasing $P(\epsilon|T)$ when emission increases, which is opposite to $P(T|\epsilon)$. $P(\epsilon|T)$ in RCP2.6 is higher than in all other RCPs (table 1). In other words, the result suggests that there are higher chances for a record-breaking GMST to be associated with El Niño in a low emission scenario than in a high emission scenario. However, due to the large uncertainty and relatively small difference in $P(\epsilon|T)$, we find no statistical significance in the difference between RCPs. In all RCPs, more than half of the record-breaking GMSTs are associated with an El Niño. In the piControl run, on the other hand, the ONI changes can explain a little more than one-quarter of the GMST variance with a three-month lag, which is consistent with the result in Trenberth (2002). These results confirm that, among various factors, El Niño is perhaps the most dominant in causing record-breaking GMSTs on interannual timescales. Therefore, when the warming trend in GMST due to RCP forcing is small, other factors that generate smaller GMST variations than El Niño are 'over-shadowed' by the GMST records induced by El Niños. This is consistent with the bootstrapping test, which shows that the average magnitudes of record-breaking GMSTs are larger during El Niños than during non-El Niño periods. Under moderate and strong El Niño cases, we see the same phenomenon of high $P(\epsilon|T)$ in a low emission scenario. We also see that, in all RCPs, more record-breaking GMSTs are generated by strong El Niños than by weaker El Niños.

Influence of different types of El Niño
Many studies have shown that the two types of El Niño, central Pacific El Niño (CP) and eastern Pacific El Niño (EP), are related to different mechanisms and teleconnections (e.g. Larkin and Harrison 2005, Ashok et al 2007, Kao and Yu 2009). CP and EP El Niños differ in the location of their SST maximum in the tropical Pacific. Following Yu and Kim (2013), we classify the El Niño events into CP and EP. There are more CPs than EPs in all RCP scenarios (table 2, figures S9-S10), which is consistent with Kim and Yu (2012). As a result, the larger $P(\epsilon|T)$ in CP compared to EP shows that there are more CP-related record-breaking GMSTs than EP-related ones. However, the differences are not large enough to be statistically significant. On the other hand, EP shows larger $P(T|\epsilon)$ than CP in all RCPs except RCP8.5, which means that a record-breaking GMST is more likely to happen once an EP event occurs. This could relate to the increasing occurrence of strong EP events under a warming climate (Cai et al 2018), which makes EP more likely to generate a record-breaking GMST than CP. However, the differences are not large enough to show statistical significance, either. The two types of El Niño generate similar $M(T|\epsilon)$ under different RCP scenarios. The above analyses suggest a more dominant role of RCP scenarios over types of El Niño in all statistics.
Since the discovery of different types of El Niño, new classifications have been proposed to better capture different SST spatial patterns (e.g. Ren and Jin 2011, Kim and Yu 2012). Considering the large number of CMIP models used here and their different simulations for El Niño, we choose a relatively simple and straightforward classification method based on ONI. Nonetheless, it is not perfect, as the ONI is based on a fixed location over the ocean. If future El Niños evolve to a different spatial pattern, the statistics might be affected.

Inter-comparison between CMIPs
To verify whether the results are affected by the development and improvement of the climate models, we conduct the same analyses for CMIP3 and for the CMIP6 models available so far on the Earth System Grid Federation (ESGF) database (downloaded as of 6/3/2019). As a common experiment across CMIPs, we use the 1% per year CO2 increase (1pctCO2) experiment to complete the inter-comparison of different phases of CMIP (figures S11-S13). We choose the first 80 years in the 1pctCO2 experiments. The ensemble mean of $P(T|\epsilon)$ in CMIP3 is 64 ± 8% compared to 70 ± 4% in CMIP5 and 69 ± 6% in CMIP6 (figure 4(a)). All ensemble means are reasonably close, with a maximum difference of 6 ± 9% in $P(T|\epsilon)$. CMIP3 shows a larger spread in $P(T|\epsilon)$ between models when compared to CMIP5 and CMIP6. This might be related to the model convergence in simulating El Niño-Southern Oscillation (ENSO) frequency and amplitude as pointed out in the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (Flato et al 2013). The ensemble mean of $M(T|\epsilon)$ in CMIP3 is 0.11 ± 0.01°C compared to 0.10 ± 0.01°C in CMIP5 and 0.12 ± 0.02°C in CMIP6 (figure 4(c)), with a maximum difference of 0.02 ± 0.03°C. We see a similar standard deviation of $M(T|\epsilon)$ in all CMIPs. Finally, the ensemble means of $P(\epsilon|T)$ show a value of 64 ± 11% in CMIP3 compared to 55 ± 6% in CMIP5 and 73 ± 13% in CMIP6 (figure 4(b)). The larger uncertainty and higher value in CMIP6 when compared to other CMIPs are largely due to the small number of available models (nine CMIP6 models thus far) and two relatively high $P(\epsilon|T)$ values from the Goddard Institute for Space Studies center. Therefore, the question of whether there is an increase in record-breaking GMST caused by El Niño in CMIP6 still needs further investigation. For $P(\epsilon|T)$, CMIP3 still shows the largest variability between models, with a standard deviation of 22% compared to 14% in CMIP5 and 17% in CMIP6.

Comparison between model result and observation
The nature of the CMIP5 design is such that the model simulations do not attempt to capture individual events as in the observations, but rather to obtain the long-term statistics. A short-term mean would be susceptible to a specific event during the averaging period (i.e. timing and response of the event). For instance, considering an El Niño cycle of five years during a 30-year period, the count for just one El Niño could cause ∼17% difference in $P(T|\epsilon)$. A long-term mean with a larger sample size of El Niño would make the statistics more robust. However, the observed record-breaking GMSTs associated with an El Niño event only become less uncertain after 1980 due to improvement of data spatial coverage (Cowtan et al 2018), and are less affected by volcanic activities.
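The ∼17% sensitivity quoted above follows from simple counting, assuming one El Niño per five-year cycle so that a 30-year window contains six events. Changing the number of record-breaking El Niños by one then shifts the likelihood by one part in six:

$$N_{\epsilon} = \frac{30\ \text{years}}{5\ \text{years per event}} = 6, \qquad \Delta P(T|\epsilon) = \frac{1}{N_{\epsilon}} \times 100\% = \frac{1}{6} \times 100\% \approx 17\%.$$

By the same counting, a 95-year projection with about 24 events would shift by only about 4% per event, which is why the longer averaging period gives more robust statistics.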
For the reasons above, we focus the comparison between models and observations on $M(T|\epsilon)$. Yin et al (2018) show observed record-breaking magnitudes ranging from 0.02°C to 0.24°C and a value of $M(T|\epsilon)$ that agrees well with the value of 0.08 ± 0.01°C in the CMIP5 historical run. A regression between GMST and ONI, which shows 0.09 ± 0.02°C per ONI in the piControl run, also agrees well with 0.08°C per Niño 3.4 index based on observations (Trenberth 2002). The two particularly large record-breaking GMSTs (0.19°C during the 1997-98 El Niño and 0.24°C during the 2015-16 El Niño) are also within the range of $M(T|\epsilon)$ estimates in CMIP5 models (0.03 ± 0.02°C to 0.21 ± 0.04°C).
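The regression of GMST on ONI mentioned above can be sketched as an ordinary least-squares fit with the GMST series shifted by the three-month lag. This is an assumed implementation with a hypothetical function name; it also returns the squared correlation, which corresponds to the fraction of GMST variance explained by the lagged ONI.

```python
import numpy as np

def lagged_regression(oni, gmst, lag=3):
    """Regress gmst[t + lag] on oni[t].

    Returns (slope in degC per unit ONI, explained variance r^2).
    Both inputs are monthly series of equal length.
    """
    oni = np.asarray(oni, dtype=float)
    gmst = np.asarray(gmst, dtype=float)
    x = oni[: oni.size - lag]      # ONI leads ...
    y = gmst[lag:]                 # ... GMST by `lag` months
    slope, _intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(r ** 2)
```

On a synthetic series in which GMST is exactly 0.09 times the ONI of three months earlier, the function returns a slope of 0.09 and an explained variance of 1; real model output would give a smaller explained variance because other sources of variability contribute.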
We further calculate the difference between the historical simulation and the future projections (RCPs), and find that the differences in $M(T|\epsilon)$ show no statistical significance in any RCP except RCP8.5, which shows an increase in magnitude with a difference of 0.05 ± 0.01°C. The historical simulation shows a $P(T|\epsilon)$ of 25 ± 3%. The $P(T|\epsilon)$ values in the RCPs (table 1) are larger than in the historical simulation, with differences showing statistical significance. For $P(\epsilon|T)$, only RCP8.5 shows a significant difference (10 ± 8%), with a value smaller than in the historical simulation.
While the models have global coverage, some observational datasets (e.g., HadCRUT4 (Morice et al 2012)) do not. To investigate whether the spatial coverage affects the statistics in this study, we apply the HadCRUT4 mask to the model output. The GMST from the masked model output only shows a variance change of 0.1%, which includes the trend difference mainly due to the difference in polar region coverage. The comparison shows few changes in magnitude (<0.01°C) and less than 3% changes in both $P(T|\epsilon)$ and $P(\epsilon|T)$, except for a 6% decrease in one of these likelihoods for RCP2.6. All differences between masked and

Conclusions
In this study, we investigate the combination of the strong and transient warmth induced by El Niño with the gradual and persistent warming induced by greenhouse gas forcing (figure 2). We focus on the multi-model ensemble statistics of record-breaking global mean surface temperature (GMST) induced by El Niño. These statistics include both frequency and magnitude from CMIP5 projections for the 21st century. The emission scenario has a significant influence on the likelihood of record-breaking GMST happening during El Niño ($P(T|\epsilon)$), with stronger emission scenarios causing higher $P(T|\epsilon)$. Under a low emission scenario (RCP2.6), one out of three El Niño events breaks the GMST record during the 21st century. The probability increases to four out of five in a high emission scenario (RCP8.5). The result shows the importance of climate change mitigation in reducing record-breaking GMSTs during important internal variability such as El Niño. However, the emission scenario has a limited influence on the likelihood of a record-breaking GMST occurring during a strong El Niño. A strong El Niño produces a GMST variation that is large enough to break the GMST record regardless of the GMST trend difference between RCP4.5 and RCP8.5. A stronger El Niño also induces a higher record-breaking magnitude ($M(T|\epsilon)$) in each RCP. $M(T|\epsilon)$ can range from 0.03°C to 0.21°C based on individual CMIP5 models. El Niño accounts for more than half of record-breaking GMST occurrences in all RCPs. The critical influence of background warming on the statistics of the record-breaking GMST suggests that these statistics could be used to infer or confirm the background warming.
For the likelihood of a record-breaking GMST happening during El Niño and the record-breaking magnitude, the inter-comparison of CMIPs shows consistent results with little difference. The likelihood for a record-breaking GMST being associated with an El Niño ($P(\epsilon|T)$) shows little difference between CMIP3 and CMIP5. It should be noted that the statistics in this study are averages for the 21st century. The exact numbers can change decade by decade due to nonlinear warming trends such as those in RCP2.6 and RCP8.5. For example, in RCP2.6, there are few record-breaking GMSTs in the second half of the century due to the deceleration of the warming trend (figure 2). When fewer record-breaking GMSTs are associated with El Niño, whether because of a shorter period or because of the nonlinear trend, the resulting larger error bar leads to a less conclusive result from the same analyses. The possibly under-represented inter-basin connection and decadal variability in the models could also slightly alter the statistics shown in this study. The inter-basin differences in SST, which are important to the observed low-level easterly wind anomaly over the tropical Pacific, can affect the estimates of long-term Walker circulation change (Zhang and Karnauskas 2016). The under-represented decadal variability, such as Pacific decadal variability, might also affect the simulated El Niño events (Deser et al 2011). Another source of uncertainty in the statistics is possible future volcanic eruptions. Their effect can be seen in the observational data and the historical run. However, where and when future volcanic eruptions will occur in the 21st century is highly uncertain. As such, the IPCC assessments did not incorporate volcanic forcing in the 21st century projections (Pachauri and Meyer 2014). For the same reason, the statistics in this study only focus on the influence of El Niño without considering the volcanic forcing.