Multi‐Decadal Skill Variability in Predicting the Spatial Patterns of ENSO Events

Seasonal hindcasts have previously been demonstrated to show multi‐decadal variability in skill across the twentieth century in indices describing El‐Niño Southern Oscillation (ENSO), which drives global seasonal predictability. Here, we analyze the skill of predicting ENSO events' magnitude and spatial pattern, in the CSF‐20C coupled seasonal hindcasts in 1901–2010. We find minima in the skill of predicting the first (in 1930–1950) and second (in 1940–1960) principal components of sea‐surface temperature (SST) in the tropical Pacific. This minimum is also present in the spatial correlation of SSTs, in 1930–1960. The skill reduction is explained by lower ENSO magnitude and variance in 1930–1960, as well as decreased SST persistence. The SST skill minima project onto surface winds, leading to worse predictions in coupled hindcasts compared to hindcasts using prescribed SSTs. Questions remain about the offset between the first and second principal components' skill minima, and how the skill minima impact the extra‐tropics.

There is substantial diversity in ENSO events' spatial and temporal characteristics (Ashok et al., 2007; Kao & Yu, 2009; Kug et al., 2009; Okumura, 2019). El Niños, for example, can be categorized as Eastern Pacific (EP) or Central Pacific (CP), named after the location of their maximum positive SST anomalies (Capotondi et al., 2015). EP events tend to have greater SST anomalies than CP events, and they evolve differently in time and space (Capotondi et al., 2020). Causes and classification of ENSO flavors are the subject of active research (see e.g., Capotondi et al. (2020); Dieppois et al. (2021)).
Seasonal forecasts need to skilfully predict both an ENSO event's magnitude and its spatial pattern, as both help determine its remote impacts (e.g., Ashok et al. (2007); Garfinkel and Hartmann (2008)).

RESEARCH LETTER 10.1029/2023GL107971
Key Points:
• The skill of predicting the spatial pattern of ENSO events in coupled seasonal hindcasts is low in 1930-1960, compared to before and after
• The skill minimum is attributable to lower ENSO variability and decreased SST persistence in the mid-century period
• Decreased skill in the SST spatial pattern is linked to decreased surface wind skill in coupled hindcasts (vs. atmosphere-only hindcasts)

Supporting Information:
Supporting Information may be found in the online version of this article.
For example, Aleutian Low deepening caused by El Niño events occurs closer to the equator in CP events than EP events (Alizadeh et al., 2022), and precipitation impacts in Japan and New Zealand are opposite for EP and CP events (Ashok et al., 2007). Some simple models, and many GCMs, simulate this diversity, implying it is part of ENSO's natural variability (Geng & Jin, 2022). The frequency of CP El Niños has increased in recent years, which is attributed to global warming (Yeh et al., 2009) or natural variability (Cai et al., 2021; Wittenberg, 2009).
Hindcasts, or reforecasts, produced by running a model with past initial conditions, are used to assess a seasonal forecast model's skill (e.g., Weisheimer et al. (2017)). Hindcasts can be compared to observations and/or reanalyses, to assess model skill. Most studies of seasonal forecast skill analyze approximately the last 30 years (e.g., C3S (2018)), failing to sample multi-decadal variability, including in ENSO.
To examine multi-decadal variability in seasonal forecast skill, Weisheimer et al. (2017, 2020) produced and analyzed 110 year hindcast datasets (atmosphere-only runs, ASF-20C, and coupled runs, CSF-20C). They found robust decadal variability in seasonal forecast skill in both coupled and uncoupled hindcasts, with mid-century periods of reduced skill in Niño 3.4, North Atlantic Oscillation (NAO) and Pacific-North American (PNA) indices. Since skill does not change monotonically with time, changes in skill are attributed to changes in the climate system's physical state, not just inferior initial conditions. The skill minimum is more pronounced in CSF-20C, but also present for the NAO and PNA in ASF-20C (O'Reilly et al., 2017).
Additionally, studies have looked at multi-decadal variability in ENSO, its diversity, and its predictability. Dieppois et al. (2021) showed multi-decadal variability in type of ENSO event, which may impact predictability. Lou et al. (2023) used model analogs to test ENSO hindcast skill, showing multi-decadal variations in forecast skill from 1800 to the present.
Here, we build on previous work to show that coupled seasonal forecasts exhibit multi-decadal variability in their skill at predicting the spatial pattern of SST anomalies in the ENSO region. We propose reasons for this drop in skill, and examine implications for surface wind skill, by comparing to atmosphere-only hindcasts.

Model Experiments, Data, and Skill Metrics
The Coupled Seasonal Forecasts of the Twentieth Century (CSF-20C) (Weisheimer & O'Reilly, 2020) reforecasts were performed with ECMWF's coupled seasonal forecasting model (Johnson et al., 2019), which includes state-of-the-art atmospheric, land surface, oceanic, and sea-ice components. The atmospheric model was fully coupled to the Nucleus for European Modeling of the Ocean (NEMO). Horizontal atmospheric resolution was TL255 (≃80 km), with 91 vertical levels; ocean resolution was 1°, with 42 vertical levels. Initial conditions were from CERA-20C: ECMWF's coupled twentieth century reanalysis (Laloyaux et al., 2018). CERA-20C assimilates surface pressure and marine wind observations (no satellite data) in the atmosphere, and subsurface temperature and salinity profile observations in the ocean. SSTs are relaxed toward monthly HadISST2 reconstructed data (Rayner et al., 2003).
This study analyses hindcasts initialized on 1 November every year from 1901 to 2010. Fifty-one ensemble members per start date were created through a combination of stochastic perturbations to atmospheric model physics, and sampling initial conditions from CERA-20C's 10 realizations. Time-varying forcings from greenhouse gases, the solar cycle, and volcanic and sulfate tropospheric aerosols were prescribed; other tropospheric aerosols were constant. See Johnson et al. (2019) for full details. The linear trend and mean-state bias (against HadISST2) were removed before this analysis. This helps offset the known cold tongue bias in the ECMWF model (Johnson et al., 2019). Some bias may remain, but this should not impact the EOFs or PCs as they are calculated using anomalies.
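The pre-processing step described above can be illustrated with a minimal sketch. It assumes the hindcast ensemble mean and the reference SSTs are already on a common grid and flattened to (year, gridpoint) arrays; the function name `remove_bias_and_trend`, the per-gridpoint least-squares trend, and the synthetic data are illustrative assumptions, not the authors' actual procedure.

```python
import numpy as np

def remove_bias_and_trend(hindcast, obs):
    """Remove mean-state bias (against obs) and the linear trend, per gridpoint.

    hindcast, obs: arrays of shape (n_years, n_points).
    Assumption: the trend is a least-squares line fitted at each gridpoint.
    """
    # Mean-state bias: difference between the two time-mean fields
    bias = hindcast.mean(axis=0) - obs.mean(axis=0)
    corrected = hindcast - bias

    # Fit a straight line in time at every gridpoint (polyfit accepts 2-D y)
    t = np.arange(hindcast.shape[0])
    slope, _intercept = np.polyfit(t, corrected, 1)

    # Subtract the trend about the mid-point so the time mean is unchanged
    trend = np.outer(t - t.mean(), slope)
    return corrected - trend

# Synthetic example: a hindcast with a warm bias and a spurious drift
rng = np.random.default_rng(2)
years = np.arange(1901, 2011)
obs = rng.standard_normal((years.size, 5))
hindcast = obs + 0.8 + 0.01 * (years - years[0])[:, None]
corrected = remove_bias_and_trend(hindcast, obs)
```

Subtracting the trend about the central year keeps the corrected time mean equal to the reference climatology, so the two operations do not undo each other.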
Data from the equivalent, uncoupled, atmosphere-only hindcasts (ASF-20C) is used for comparison. ASF-20C used the same atmospheric model as CSF-20C, with prescribed SSTs from HadISST2. The initial conditions were from ECMWF's ERA-20C reanalysis (Poli et al., 2016).
Two SST datasets (both cropped to 1901-2010) are used for validation: HadISST2, and ERSSTv5 (Huang et al., 2017). Both assimilate global ocean observations to produce temporally and spatially consistent reconstructed SST datasets, and incorporate bias-corrected observations from floats and ship-based instruments. HadISST2 uses reduced space optimal interpolation (RSOI) to reconstruct the full SST field at 1° × 1° resolution.
ERSSTv5 uses an EOF-based approach, projecting modern EOFs onto past observations to achieve a full SST field at 2° × 2° resolution. The ERA-20C reanalysis is used to validate atmospheric variables.
Empirical orthogonal functions (EOFs) are found by computing eigenvectors of the monthly SST anomaly covariance matrix in the Pacific basin (30°N-30°S, 150°E-80°W), and the principal components (PCs) by a projection of the data onto these eigenvectors. The eigenvalues associated with each EOF represent the proportion of the total variance attributable to that mode of variability.
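As an illustration of the EOF calculation described above, the following sketch computes the leading eigenvectors of an anomaly covariance matrix and projects the data onto them. The function name `compute_eofs` and the synthetic field are hypothetical. Area weighting (commonly the square root of the cosine of latitude) is omitted because the text does not state whether it was applied.

```python
import numpy as np

def compute_eofs(anomalies, n_modes=2):
    """EOFs, PCs and explained-variance fractions of a (time, space) anomaly array."""
    n_time = anomalies.shape[0]
    # Spatial covariance matrix of the anomalies
    cov = anomalies.T @ anomalies / (n_time - 1)
    # eigh is for symmetric matrices; eigenvalues come back in ascending order
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    eofs = eigvecs[:, :n_modes].T            # (n_modes, space), unit-norm patterns
    pcs = anomalies @ eofs.T                 # (time, n_modes), projection onto EOFs
    var_frac = eigvals[:n_modes] / eigvals.sum()
    return eofs, pcs, var_frac

# Synthetic field: two spatial patterns with different amplitudes plus noise
rng = np.random.default_rng(0)
n_time, n_space = 110, 40
x = np.linspace(0.0, np.pi, n_space)
field = (3.0 * rng.standard_normal((n_time, 1)) * np.sin(x)
         + 1.0 * rng.standard_normal((n_time, 1)) * np.sin(2 * x)
         + 0.1 * rng.standard_normal((n_time, n_space)))
anomalies = field - field.mean(axis=0)
eofs, pcs, var_frac = compute_eofs(anomalies)
```

For large grids a singular value decomposition of the anomaly matrix is usually cheaper than forming the full covariance matrix; the eigen-decomposition is used here only because it mirrors the description in the text.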
Hindcast skill is evaluated using the ensemble mean anomaly correlation coefficient (ACC) and the root mean square skill score (RMSSS).
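The ACC between ensemble mean forecast index $g_k$ and observation index $b_k$ pairs over the years $k = 1, \ldots, P$ is calculated as

$$\mathrm{ACC}_{\mathrm{temporal}} = \frac{\sum_{k=1}^{P} (g_k - \bar{g})(b_k - \bar{b})}{\sqrt{\sum_{k=1}^{P} (g_k - \bar{g})^2 \, \sum_{k=1}^{P} (b_k - \bar{b})^2}} \quad (1)$$

where $\bar{g}$ and $\bar{b}$ are the means over the $P$ years. The spatial ACC between ensemble mean forecast $f_{ij}$ and observation $a_{ij}$ pairs over latitudes $i = 1, \ldots, N$ and longitudes $j = 1, \ldots, M$ is defined as

$$\mathrm{ACC}_{\mathrm{spatial}} = \frac{\sum_{i=1}^{N} \sum_{j=1}^{M} f'_{ij} \, a'_{ij}}{\sqrt{\sum_{i=1}^{N} \sum_{j=1}^{M} f'^{2}_{ij} \, \sum_{i=1}^{N} \sum_{j=1}^{M} a'^{2}_{ij}}} \quad (2)$$

using the ensemble mean forecast anomaly (relative to climatology calculated over years $k = 1, \ldots, P$)

$$f'_{ij} = f_{ij} - \frac{1}{P} \sum_{k=1}^{P} f_{ijk}$$

and the corresponding observed anomaly $a'_{ij}$. The spatial RMSSS is defined as

$$\mathrm{RMSSS} = 1 - \sqrt{\frac{\sum_{i=1}^{N} \sum_{j=1}^{M} (f_{ij} - a_{ij})^2}{\sum_{i=1}^{N} \sum_{j=1}^{M} (c_{ij} - a_{ij})^2}} \quad (3)$$

using, at each gridpoint, the ensemble mean forecast value $f_{ij}$, the observed value $a_{ij}$ and the climatological value $c_{ij}$ (the time-mean across the whole period at $ij$). Using the climatology across the whole period enables analysis of all years from 1901, but this may slightly increase the observed skill (Risbey et al., 2021). The year of interest has not been excluded; this would only negligibly affect the 110 year mean. The RMSSS expresses the differences between forecasts and observations compared to climatology. Higher values are better: 0 means a forecast adds no value over climatology; 1 is perfect.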
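The three skill metrics described above can be sketched in a few lines. This is a minimal illustration assuming the standard uncentred spatial anomaly correlation and an RMSSS of the form one minus the ratio of forecast to climatological root-mean-square error; the function names are hypothetical and no area weighting is applied.

```python
import numpy as np

def acc_temporal(g, b):
    """Correlation between forecast index g and observed index b over the years."""
    g_anom, b_anom = g - g.mean(), b - b.mean()
    return np.sum(g_anom * b_anom) / np.sqrt(np.sum(g_anom**2) * np.sum(b_anom**2))

def acc_spatial(f_anom, a_anom):
    """Pattern correlation between forecast and observed anomaly fields for one year."""
    return np.sum(f_anom * a_anom) / np.sqrt(np.sum(f_anom**2) * np.sum(a_anom**2))

def rmsss(f, a, c):
    """Skill relative to climatology c: 0 means no better than climatology, 1 is perfect."""
    rmse_forecast = np.sqrt(np.mean((f - a) ** 2))
    rmse_climatology = np.sqrt(np.mean((c - a) ** 2))
    return 1.0 - rmse_forecast / rmse_climatology

# Small worked example on a 3 x 4 grid
rng = np.random.default_rng(4)
climatology = np.full((3, 4), 27.0)
observed = climatology + rng.standard_normal((3, 4))

perfect = rmsss(observed, observed, climatology)         # forecast equals observations
no_skill = rmsss(climatology, observed, climatology)     # forecast equals climatology
pattern = acc_spatial(observed - climatology, observed - climatology)
opposite = acc_temporal(np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0]))
```

Because the spatial ACC only measures pattern agreement while the RMSSS penalizes amplitude errors, the two can diverge in weak-anomaly years, which is why the text reports both.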
For spatial measures of skill, 5-95% confidence intervals are calculated using the standard error of the ensemble mean. For temporal measures of skill, 5-95% confidence intervals are calculated using bootstrapping with replacement: sampling 21 years within the period 10,000 times and computing statistics on the resulting distribution.
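The bootstrap described above might look like the following sketch for a single 21 year window. It assumes the years inside the window are resampled with replacement as forecast-observation pairs and that the interval is read off as the 5th and 95th percentiles of the resampled correlations; `bootstrap_acc_ci` and the synthetic series are illustrative.

```python
import numpy as np

def bootstrap_acc_ci(forecast, obs, n_boot=10_000, seed=0):
    """5-95% interval of the temporal correlation within one window of years."""
    rng = np.random.default_rng(seed)
    n_years = len(forecast)
    # Each row is one resample of year indices, drawn with replacement
    idx = rng.integers(0, n_years, size=(n_boot, n_years))
    f, o = forecast[idx], obs[idx]
    f_anom = f - f.mean(axis=1, keepdims=True)
    o_anom = o - o.mean(axis=1, keepdims=True)
    corr = (f_anom * o_anom).sum(axis=1) / np.sqrt(
        (f_anom**2).sum(axis=1) * (o_anom**2).sum(axis=1)
    )
    return np.percentile(corr, [5, 95])

# Synthetic 21 year window: a forecast that tracks the observations closely
rng = np.random.default_rng(1)
obs_window = rng.standard_normal(21)
forecast_window = obs_window + 0.3 * rng.standard_normal(21)
lower, upper = bootstrap_acc_ci(forecast_window, obs_window)
```

Resampling years as pairs preserves the forecast-observation relationship within each year, which is what the correlation is meant to measure.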

First and Second Principal Components
To examine how well CSF-20C represents ENSO events' spatial pattern, DJF-mean tropical Pacific SST EOFs were calculated from 1901 to 2010. The first two EOFs from CSF-20C, HadISST2 and ERSSTv5 are shown in Figures 1a-1f.
Whilst the EOFs appear smoother in ERSSTv5 (due to coarser resolution) and CSF-20C (due to being an ensemble mean) than in HadISST2, the first EOF is similar in all three. All show warming through the Pacific cold tongue, with cooling at westerly longitudes and in the subtropics. The warming associated with the first EOF is further east in CSF-20C than HadISST2 or ERSSTv5, and warm anomalies transition to cold anomalies further west. This is a known issue in models (e.g., Luo et al. (2005)), including SEAS5 (Beverley et al. (2023), Figure S3 in Supporting Information S1), and suggests the hindcast model may have a slight bias for EP over CP El Niño events.
There are more obvious differences in the second EOF, which can be interpreted as representing ENSO's spatial variability. The overall pattern is similar between datasets: cooling in the cold tongue, accompanied by warming in the rest of the basin. Compared to the two observational datasets, especially ERSSTv5, CSF-20C's second EOF cools over a much smaller range of latitudes, and warms more strongly in the west Pacific. Also note that the second EOFs of HadISST2 and ERSSTv5 do not entirely agree. This is likely due to worse observational coverage toward the start of the century, meaning different reconstruction techniques (see Section 2) play a larger role in determining the SST pattern.
The skill (as 21 year rolling ACC_temporal) of the coupled hindcasts at predicting the DJF-mean value of the PCs associated with the EOFs is shown in Figures 1g and 1h, alongside the rolling correlation between the observational datasets. To ease comparison, the ERSSTv5 EOF patterns were used for all, with time series of other datasets formed by projection onto these. The analysis was also performed using the CSF-20C and HadISST2 patterns (Figure S1 in Supporting Information S1), with qualitatively similar results. Note that PC1 is approximately equal to the Niño 3.4 index, and PC2 to the Trans-Niño index (TNI).
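Forming time series for every data set from one common set of patterns amounts to a projection, sketched below. The example assumes all fields have already been regridded to the reference grid and flattened to (year, gridpoint) arrays; `project_onto_eofs` and the random orthonormal patterns standing in for the reference EOFs are hypothetical.

```python
import numpy as np

def project_onto_eofs(anomalies, reference_eofs):
    """Project (time, space) anomalies onto reference patterns of shape (n_modes, space)."""
    # Dividing by the squared norm makes the result independent of pattern scaling
    norms_sq = np.sum(reference_eofs**2, axis=1)
    return anomalies @ reference_eofs.T / norms_sq

# Two orthonormal stand-in patterns on a 30-point grid (QR gives orthonormal columns)
rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.standard_normal((30, 2)))
reference_eofs = q.T

# A second data set built from known amplitudes of those same patterns
amplitudes = rng.standard_normal((110, 2))
other_dataset = amplitudes @ reference_eofs
projected = project_onto_eofs(other_dataset, reference_eofs)
```

Using a single set of patterns means differences between the resulting time series reflect differences in the data rather than differences in each data set's own EOFs.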
There is clear multi-decadal variability in both PC1 and PC2 skill, but PC1 is better predicted than PC2 throughout the whole century. The changes in skill are non-monotonic, suggesting they are not simply due to changes in initial condition (and reference data set) quality, but changes in the coupled atmosphere-ocean system's physical state. PC1's minimum in skill in 1930-1950 is consistent with the minimum in Niño 3.4 Index skill shown by Weisheimer et al. (2022). The correlation between the two reconstructed datasets also decreases in 1930-1950, and the uncertainty in that correlation increases.
There are also multi-decadal variations in PC2 skill, which is important for determining ENSO events' spatial pattern. There is still a drop in skill in the middle of the century, but this is not in-phase with PC1: the minimum in PC2 skill occurs in 1940-1960, lagging the PC1 minimum by 10 years. This slightly later period is coincident with minima in NAO and PNA skill shown by O'Reilly et al. (2017).
The relationship between PC2 in the two reconstructed datasets is also different to PC1. At the start of the century, PC2 in HadISST2 and ERSSTv5 are much less well correlated than PC1, but have good agreement from 1940 onwards. Unlike for PC1, there is no observational disagreement coincident with the skill minimum, suggesting the reasons for the minimum in PC2 skill are independent of observational quality.

ENSO's Spatial Pattern
We also calculated the hindcast ACC_spatial (Figure 2a, black line) and RMSSS (Figure 2b, blue line) at predicting the spatial pattern of SST anomalies in the tropical Pacific. Both measures of skill (ACC_spatial based on correlations, RMSSS on the absolute values of hindcast anomalies) show similar behavior. The smoothed lines show an extended period of reduced skill (1930-1960), which encompasses the minima in skill of both PC1 and PC2 (Figure 1).
There is substantial variation in individual years' skill scores. Whilst some events, for example 1997/98, have high ACC_spatial and RMSSS (approaching 'perfect' prediction of ENSO's spatial pattern), others have spatial skill scores close to zero, or negative. These low/negative scores are clustered in the middle period. They tend to be moderate or neutral ENSO years (DJF-mean Niño 3.4 between −0.5 and +0.5 K). Whilst we may expect ACC_spatial to be very sensitive in moderate/neutral years, the fact that the RMSSS gives the same result shows that these years are, indeed, being more poorly predicted than years in the early and late periods. Uncertainties also increase significantly in the mid-century, shown by a two-sided Student's t-test, particularly for the strongest El Niño and La Niña events. Figures 3a and 3b show a strong, significant positive relationship between the magnitude of an ENSO event and its ACC_spatial, and a strong negative relationship between the magnitude and the uncertainty in ACC_spatial (i.e., strong ENSO events have high skill and low ensemble spread). The relationship is consistent for strong events, but weaker ENSO events show a larger range of ACC_spatial values and uncertainties. This analysis was repeated for the Niño 1+2 index and the TNI, but no strong relationship was found (Figures S2 and S3 in Supporting Information S1).
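The event categories used in this section follow the thresholds given in the Figure 2 caption (neutral within ±0.5 K, moderate between ±0.5 and ±1 K, strong beyond ±1 K of DJF-mean Niño 3.4). A minimal sketch of that classification is given below; the function name and the treatment of values exactly on a threshold are assumptions.

```python
def classify_enso(nino34_djf):
    """Label a DJF-mean Niño 3.4 anomaly (in K) by strength and phase."""
    magnitude = abs(nino34_djf)
    if magnitude > 1.0:
        strength = "strong"
    elif magnitude > 0.5:
        strength = "moderate"
    else:
        # Assumption: values exactly at ±0.5 K are treated as neutral
        return "neutral"
    phase = "El Niño" if nino34_djf > 0 else "La Niña"
    return f"{strength} {phase}"

labels = [classify_enso(value) for value in (2.3, -0.7, 0.2)]
```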
This relationship between ENSO magnitude and ACC_spatial suggests that ENSO variability plays a role in seasonal hindcast skill of ENSO's spatial pattern. ENSO variability is reduced in the middle of the century, and this contributes to the minimum in spatial skill. The standard deviation in the DJF-mean Niño 3.4 Index in HadISST2 decreased from 1.0 K in 1910-1930 to 0.8 K in 1940-1960, before rising again to 1.1 K in 1990-2010. This was tested further by taking 10,000 random 30-year samples from the whole data set: this showed a strong, statistically significant relationship between ENSO variance and ACC_spatial (not shown).
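One way such a resampling test could be set up is sketched below. The text does not say how the 30-year samples were drawn, so this sketch assumes 30 individual years are drawn at random without replacement, and that the relationship is summarized by correlating each sample's Niño 3.4 standard deviation with its mean spatial ACC. All names and the synthetic data are illustrative.

```python
import numpy as np

def variance_skill_relationship(nino34, acc, n_samples=10_000, sample_size=30, seed=0):
    """Correlate ENSO variability with mean spatial skill across random subsets of years."""
    rng = np.random.default_rng(seed)
    n_years = len(nino34)
    stds = np.empty(n_samples)
    mean_skill = np.empty(n_samples)
    for i in range(n_samples):
        # Assumption: years are drawn individually, without replacement
        idx = rng.choice(n_years, size=sample_size, replace=False)
        stds[i] = nino34[idx].std(ddof=1)
        mean_skill[i] = acc[idx].mean()
    return np.corrcoef(stds, mean_skill)[0, 1]

# Synthetic data in which skill increases with event magnitude
rng = np.random.default_rng(3)
nino34 = rng.normal(0.0, 1.0, size=110)
acc = np.clip(0.3 + 0.3 * np.abs(nino34) + rng.normal(0.0, 0.05, size=110), -1.0, 1.0)
r = variance_skill_relationship(nino34, acc, n_samples=2000)
```

Because the random samples overlap heavily, the resulting correlation indicates the strength of the relationship, but a significance test on it would need to account for that dependence.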
As shown by the location of the stars in Figures 3a and 3b, the mid-century reduction in ENSO variability led to lower spatial skill, and larger uncertainties. However, lower variability does not explain the whole picture: even the neutral and weak ENSO events, shown in black in Figure 3, have a significantly lower ACC_spatial in mid-century compared to the early and late periods, as tested by a two-sided Student's t-test (not shown).
This suggests there are other factors contributing to the mid-century minimum in skill. One factor is likely the reduction in SST persistence in the middle of the century, shown in Figure 3c. The autocorrelation of November SSTs with December, January and February SSTs has a minimum in 1930-1960. This reduction in persistence is also present in CSF-20C, and is observed for the TNI too (Figure S3 in Supporting Information S1). For the Niño 1+2 index (Figure S2 in Supporting Information S1), whilst there is a reduction in persistence in the observations in 1940-1970, this is not captured by the hindcast model; this likely reduced skill even further. We propose that this change in behavior, with less persistence of SSTs throughout the season (which may be linked to lower inter-annual variability), contributes to the mid-century skill drop.
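The persistence measure described above (correlation of the November index with a later month's index, in rolling 21 year windows plotted at the central year) can be sketched as follows. The function name and the synthetic index series are assumptions for illustration.

```python
import numpy as np

def rolling_correlation(years, x, y, window=21):
    """Correlation of x and y in sliding windows, returned with each window's central year."""
    half = window // 2
    n_windows = len(years) - window + 1
    centres = np.empty(n_windows, dtype=int)
    corr = np.empty(n_windows)
    for start in range(n_windows):
        sl = slice(start, start + window)
        corr[start] = np.corrcoef(x[sl], y[sl])[0, 1]
        centres[start] = years[start + half]
    return centres, corr

# Synthetic November and January indices with strong month-to-month persistence
rng = np.random.default_rng(5)
years = np.arange(1901, 2011)
november = rng.standard_normal(years.size)
january = 0.9 * november + 0.3 * rng.standard_normal(years.size)
centres, persistence = rolling_correlation(years, november, january)
```

With 110 start dates and a 21 year window, the first and last central years are 1911 and 2000, so the first and last decade of the record are not represented on such a curve.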

Wind Anomaly Skill
ENSO interacts with the tropical atmosphere through surface wind anomalies. When the east-west SST gradient across the tropical Pacific is strengthened (La Niña events), this leads to easterly wind anomalies. During El Niño events, when the SST gradient is weakened, westerly wind anomalies are observed. Figure 4 shows a Hovmöller plot of the 21 year smoothed meridionally averaged (5°N-5°S) zonal surface wind anomalies from CSF-20C, ASF-20C, and ERA-20C, along with hindcast skill.
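The data reduction behind such a Hovmöller plot can be sketched as a meridional average followed by a running mean in time. The sketch assumes a (year, latitude, longitude) array of DJF-mean anomalies and an unweighted latitude average (cosine weights are close to one within 5° of the equator); the function name and the test field are hypothetical.

```python
import numpy as np

def hovmoller_field(u10, lats, window=21, lat_band=(-5.0, 5.0)):
    """Meridional mean over lat_band, then a running mean over `window` years."""
    in_band = (lats >= lat_band[0]) & (lats <= lat_band[1])
    meridional_mean = u10[:, in_band, :].mean(axis=1)         # (year, lon)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie fully inside the record
    return np.apply_along_axis(
        lambda series: np.convolve(series, kernel, mode="valid"), 0, meridional_mean
    )                                                          # (year - window + 1, lon)

# Constant test field: smoothing and averaging should leave the value unchanged
lats = np.linspace(-10.0, 10.0, 21)
u10 = np.full((110, lats.size, 36), 2.0)
smoothed = hovmoller_field(u10, lats)
```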
Whilst the magnitude of the anomalies is much smaller (as expected from an ensemble mean), ASF-20C (Figure 4b) predicts the smoothed, decadally varying zonal wind anomalies very well compared to ERA-20C (Figure 4c) throughout the whole century. This high skill is clearly visible in Figure 4e: 21 year smoothed ACC_temporal is very high throughout the whole century, especially in the central Pacific, and always statistically significant. This is perhaps expected, as ASF-20C uses prescribed SSTs to force the atmosphere. However, other atmospheric signals, including the PNA and NAO, show a minimum in skill in ASF-20C in the middle of the century, despite prescribed SSTs (Weisheimer et al., 2017).
CSF-20C's surface wind anomalies (Figure 4a) show much worse agreement with ERA-20C. At the start of the century agreement is high, before there is a mid-century period of persistent, strong easterly anomalies which is not present in the reanalysis. This leads to multi-decadal variations in the surface wind ACC_temporal (Figure 4d). In the central Pacific (the region best predicted in ASF-20C) there is a period in 1940-1960 of reduced skill, which coincides with the skill minimum in SST PC2. There is also a band around 230°E where the surface wind anomalies are almost always badly predicted. This could be due to differences in the location of the maximum/minimum SST anomalies, compared to ERA-20C. As with the SST metrics, there is a strong relationship between the magnitude of an ENSO event and the skill at predicting its spatial pattern in CSF-20C (Figure S4 in Supporting Information S1). This is also true in ASF-20C, even though skill is much higher overall (Figure S5 in Supporting Information S1). We conclude that lower SST spatial skill in the middle of the century leads to worse predictions of surface wind anomalies in the tropical Pacific, compared to hindcasts with prescribed SSTs.

Discussion and Conclusions
Section 3 shows multi-decadal variations in the skill of seasonal hindcasts at predicting ENSO events and their spatial pattern throughout the twentieth century. There are two distinct low skill periods: in 1930-1950 for PC1 (Niño 3.4 Index), and 1940-1960 for PC2 (Trans-Niño Index). The PC1 low-skill period is consistent with the Niño 3.4 Index skill minimum shown by Weisheimer et al. (2020, 2022). The PC2 low-skill period is consistent with the PNA and NAO skill minima shown by O'Reilly et al. (2017). There is a prolonged minimum (1930-1960) in the spatial skill of predicting tropical Pacific SST anomalies.
The skill minima for SST indices are related to the agreement between different reconstructed SST datasets. The correlation between PC1 in HadISST2 and ERSSTv5 also has a minimum in the middle of the century, which coincides with the skill minimum. For PC2, there is no period of observational disagreement coincident with the minimum, suggesting it is caused by other factors.
The variation in correlation between the two reconstructed datasets' PC1s adds evidence that mid-century seasonal prediction is hampered by lower quality and coverage of initialization and observational data. The different reconstruction methods used by HadISST2 and ERSSTv5 may contribute to their differences in this period when they are less constrained by sparser observations. The EOF approach of ERSSTv5 assumes stationarity of the SST EOF patterns, which may not be true throughout the whole century, and the interpolation used by HadISST2 is less accurate with fewer observations. Sparser observations may also contribute to the increased ensemble spread in the mid-century, with seasonal hindcasts less constrained by initial conditions.
Inferior quality initialization data does not explain the full story: at the start of the century, when there are just as few observations, the skill in PC1, PC2 and spatial pattern is very high, comparable to recent years. Further, whilst the PC1 minimum in skill coincides with observational disagreement, the PC2 minimum does not. Other factors contribute to the mid-century skill minimum. The hindcast skill at predicting the spatial pattern of SST anomalies is strongly related to the magnitude of ENSO events that year. During strong ENSO events, whether El Niño or La Niña, the hindcasts accurately predict the spatial pattern of SST anomalies in the tropical Pacific, with a lower ensemble spread. During weak or neutral events, the skill is lower, with higher spread. This is perhaps expected, as weak/neutral initial condition anomalies can lead to El Niños, La Niñas, or neither; so small perturbations can dramatically alter the DJF-mean outcome. If initial conditions have strong anomalies, the signal-to-noise ratio is larger, leading to a more predictable system. In the mid-century period, ENSO events have a lower magnitude, which contributes to the skill minima.
Multi-decadal variation in the persistence of SSTs may also contribute to the observed variations in skill and explain why weak/neutral ENSO events in 1930-1960 are still worse predicted than weak/neutral events in the rest of the century. The reduction in autocorrelation of November SSTs with December-February SSTs in the reconstructed datasets in 1930-1960 signifies a reduction in predictability, which contributes to low hindcast skill. The skill at predicting the atmospheric component of ENSO (here surface winds) was shown to also be impacted in 1930-1960 by this skill minimum, with coupled hindcasts performing much worse than atmosphere-only equivalents.
ENSO influences the extra-tropics through remote teleconnections to the NAO (Hurrell & Deser, 2009) and PNA (Horel & Wallace, 1981). Whilst it has high skill throughout the century in the surface wind anomalies, ASF-20C does show a skill minimum in 1940-1960 for the PNA and the NAO (O'Reilly et al., 2017; Weisheimer et al., 2017). These indices do not have lower variability than average in 1940-1960. Instead, O'Reilly et al. (2017) show that real-world teleconnections weakened during this period, whilst model-world teleconnections stay constant. The impact of ENSO skill variations on extra-tropical skill is another aspect that warrants further investigation, particularly comparing responses in coupled and uncoupled hindcasts.
There is still work to do examining and interpreting multi-decadal variations in seasonal hindcast skill, to fully quantify how much of the skill minimum is caused by inferior initialization data, and how much by reduced predictability in the middle of the century. It is also important to understand why different signals show different periods of reduced skill (e.g., why the PC1 skill minimum leads the PC2 skill minimum by 10 years). Other suggestions for multi-decadal variability in hindcast skill include poor representation of air-sea coupling in seasonal hindcast models (Yao et al., 2022), which was important in the real world in the middle of the century.
In the future, we intend to examine further causes of the skill minima. For example, tropospheric aerosol forcings vary on multi-decadal timescales, so we will quantify the effects of changing this external forcing on seasonal hindcasts and their skill. We would also like to extend this analysis to longer lead times and May initialisations, to further understand variability in spatial skill.

Data Availability Statement
Data from ASF-20C and CSF-20C have become publicly available through a dedicated online dissemination platform hosted by the CEDA archive at https://catalogue.ceda.ac.uk/uuid/6e1c3df49f644a0f812818080bed5e45. A set of standard monthly mean atmospheric variables including temperature, precipitation, mean sea level pressure, geopotential height, wind, and thermal and radiative fluxes have been provided as global gridded data in netCDF format.
The HadISST2 reconstructed SST data set used in this study was taken from the ERA-20C data set, which uses this SST data set to force it. The ERA-20C data set is available on the ECMWF MARS data store at https://apps.ecmwf.int/mars-catalogue/?class=e2. The catalogue is public but to download data, access must be requested from ECMWF. The ERSSTv5 reconstructed SST data is available for public download on the NOAA website at https://psl.noaa.gov/data/gridded/data.noaa.ersst.v5.html.
The CERA-20C reanalysis, used for initial conditions for the hindcasts, is available on the ECMWF MARS data store at https://apps.ecmwf.int/mars-catalogue/?class=ep. The catalogue is public but to download data, access must be requested from ECMWF.
The observed anomaly used in the spatial ACC is $a'_{ij} = a_{ij} - \frac{1}{P} \sum_{k=1}^{P} a_{ijk}$ at each gridpoint.

Figure 1. (a)-(f) The first and second EOFs from: (a) and (d) ERSSTv5; (b) and (e) CSF-20C ensemble mean; (c) and (f) HadISST2. (g) and (h) 21 year rolling ACC_temporal, plotted against the central year of the 21 year period, of the (g) first, and (h) second principal components. The EOFs and PCs were calculated in the domain 30°N-30°S, 150°-280°E from the DJF-mean SSTs from each data set. The PCs are calculated by projecting each data set's EOFs onto the EOFs from ERSSTv5. 5-95% confidence intervals in (g) and (h) were calculated using bootstrapping with replacement.

Figure 2. The (a) ACC_spatial, and (b) RMSSS between the spatial fields of DJF-mean SST anomalies in CSF-20C ensemble mean and HadISST2. The domain was 5°N-5°S, 160°-270°E: the Niño 3 and Niño 4 regions. The thick lines (black in (a); blue in (b)) show rolling 21 year mean spatial skill. 5-95% confidence intervals are calculated using bootstrapping with replacement. Individual crosses show each year's spatial skill, with error bars showing the 5-95% confidence interval within CSF-20C ensemble members. Neutral events are defined as having a DJF-mean Niño 3.4 index of between −0.5 and +0.5 K; moderate events as between ±0.5 and ±1 K; strong events as exceeding ±1 K.

Figure 3. (a) and (b) Scatter plots demonstrating the relationship between the magnitude of an ENSO event in HadISST2, as measured by the Niño 3.4 Index, and (a) ensemble mean ACC_spatial for CSF-20C; (b) 5-95% confidence intervals for ACC_spatial within ensemble members. Colors denote different types of ENSO events: red shows strong El Niño events (DJF-mean Niño 3.4 index > +1 K); blue shows strong La Niña events (< −1 K); black shows all others. Symbols represent different periods: upwards triangles show years from 1901 to 1930; stars show 1931-1960; downwards triangles show 1961-2010. (c) Correlation between the Niño 3.4 index in November and (blue line) December, (orange) January and (green) February, for (solid lines) HadISST2 and (dashed lines) CSF-20C. Correlations are taken in rolling 21 year periods, and the value plotted against the central year.

Figure 4. 21 year smoothed DJF-mean anomaly of the meridionally averaged (5°N-5°S) zonal wind at 10 m height: (a) CSF-20C ensemble mean; (b) ASF-20C ensemble mean; (c) ERA-20C reanalysis. The rolling 21 year ACC_temporal between the hindcasts and reanalysis is shown in (d) for CSF-20C, and (e) for ASF-20C. Values are plotted at the central year of the 21 year period. Stippling in (d) and (e) shows significance at the 5% level. Subplots (a) and (b) share a color bar; note this is different to the color bar for (c).