1 Introduction

Seasonal forecasts of wind speed have a variety of applications. Recent initiatives such as the EU-funded project EUPORIAS (Buontempo et al. 2018) are encouraging the use of seasonal forecasts within the wind energy sector. The amount of wind power generated in a season has a large effect on electricity price and redispatch costs (Wohland et al. 2018), so advance knowledge of extreme seasons is useful information for energy traders. In China, wind power is becoming increasingly important, with the Chinese government pledging to produce 15% of all electricity by 2020 using renewable resources, including 210 GW of wind power expected to come online by the end of the decade (National Energy Administration 2016). Energy trading in China may become important as the market evolves (Yu et al. 2017). The wind energy industry can also use seasonal forecasts to anticipate conditions that could damage energy infrastructure (Bett et al. 2017, Clark et al. 2017). Other areas such as the health sector could also benefit from seasonal forecasts of wind, given the link between wind speed and air pollution (e.g. Csavina et al. 2014). Air pollution is estimated to contribute to 1.6 million deaths per year in China (Rohde and Muller 2015), so advance warning of seasons likely to have poor air quality could aid decision makers in reducing the impact.

There is strong seasonal variability in wind speeds over China, with the highest wind speeds generally in winter (Yu et al. 2016). In winter a large pressure gradient develops between the Aleutian Low and the Siberian-Mongolian High (SMH), which results in strong northerly winds along the east coast (Chang et al. 2006). This circulation is one of the main features of the East Asian Winter Monsoon (EAWM). In the mid-troposphere, a prominent feature of the EAWM is the East Asian Trough (EAT) near Japan, with a strong sub-tropical jet to the south (Chang et al. 2006). There is considerable inter-annual variability in winter wind speeds (Yu et al. 2016), and so skill in predicting wind speeds on seasonal time scales could be particularly useful to the sectors mentioned above.

Interannual variability in winter wind speeds over China can have many causes. The El Niño Southern Oscillation (ENSO) is strongly linked with climate in southern China (e.g. Chen et al. 2013). The Arctic Oscillation (AO) has been shown to correlate with climate in northern China due to its effect on snow cover in the SMH region: a positive phase of the AO reduces snow cover and hence the build-up of cold air during the SMH development, resulting in weaker circulation (e.g. Chang and Lu 2012, Chang et al. 2011). Variations in the East Asian Trough are also linked to variations in the SMH and Aleutian Low (e.g. Song et al. 2016) and thus also affect surface winds. The strength of the Middle Eastern Jet Stream (MEJS) has also been linked to climate in central China (e.g. Zuo et al. 2015, Wen et al. 2009, Li and Sun 2015), with changes in the MEJS linked to Rossby wave activity.

Bett et al. (2017) investigated the ability of the Met Office seasonal forecasting system, GloSea5 (Global Seasonal Forecasting System 5), to predict winter and summer means of variables relevant to the energy sector in China, including 10 m wind speeds. They found that GloSea5 has skill in predicting winter (December–January–February, DJF) mean 10 m wind speeds over regions of China, in particular in south-eastern China and the South China Sea (SE China), and a region in northern-central China, south of Mongolia (NC China). They proposed that skill in SE China was related to ENSO, whereas the NC China region was less well understood.

The aim of this paper is to investigate the robustness of wind speed prediction skill in the areas identified by Bett et al. (2017), by seeing if the results hold for two independent ensembles over different time periods, and to identify the sources of predictability. By finding the sources of predictability of wind speed over China, it may be possible to improve the direct model skill, for example using the atmospheric circulation as a proxy for wind speed (Scaife et al. 2014, Clark et al. 2017). These studies showed that the skill of forecasting temperature over the UK and Europe is significantly improved by predicting the temperature from the forecast of the North Atlantic Oscillation (NAO, Hurrell 1995) rather than using the temperature directly from the model.

The paper is structured as follows: In Sect. 1.1 we describe the seasonal forecast model and observational dataset used, and the methods are described in Sect. 1.2. In Sect. 2.1 we assess the seasonal prediction skill of the model and also discuss the signal-to-noise paradox. In Sects. 2.2 and 2.3 we examine the sources of predictability of wind speed in SE and NC China, and in Sect. 2.4 we discuss using these sources in linear regression models. Discussion and conclusions are given in Sect. 3.

1.1 Data

We assess the skill in predicting winter (DJF) mean 10 m wind speeds from forecasts initialised 1 month in advance, from two in-house ensembles generated by the second Global Coupled configuration of the Hadley Centre Global Environment Model version 3 (HadGEM3-GC2; Williams et al. 2015). HadGEM3-GC2 uses the GA6.0 configuration of the Met Office Unified Model (UM, version 8.4) as its atmospheric component, on an N216 grid (a horizontal resolution of 0.83° in longitude and 0.55° in latitude) and 85 vertical levels reaching a height of 85 km near the mesopause (Walters et al. 2017). This is coupled to the GL6.0 configuration of the JULES land surface model (Best et al. 2011), the GO5.0 configuration of the NEMO ocean model with a 0.25° nominal resolution and 75 vertical levels (version 3.4, Megann et al. 2014; Madec 2008), and the GSI6 configuration of the CICE sea ice model (version 4.1, Rae et al. 2015; Hunke and Lipscomb 2010).

The first ensemble is a set of retrospective forecasts (hindcasts) from the GloSea5 prediction system (Global Seasonal Forecasting System 5; MacLachlan et al. 2015). It is the same ensemble as used in Bett et al. (2017). This ensemble is produced by collating 8-member ensemble hindcasts initialised each year on 25th October, 1st November and 9th November, giving a total of 24 members per season. The 8 members started on each date differ due to a stochastic physics scheme (MacLachlan et al. 2015). The hindcasts cover DJF 1992/93–2011/12 (20 years).

The second ensemble is a set of decadal hindcasts from DePreSys3 (Decadal Prediction System 3; Dunstone et al. 2016). Here, 40 ensemble members are started from initial conditions provided by assimilation runs on 1 November. The ensemble members differ due to the same stochastic physics scheme as in GloSea5. Only the first winter is used, and the DePreSys3 hindcasts cover DJF 1980/81–2014/15 (35 years).

We also combine the two ensembles to create a 64-member ensemble. Since the two ensembles were created by the same underlying model (HadGEM3-GC2), it is straightforward to combine them, with each of the 64 members given equal weight. This results in the DePreSys3 ensemble being given more weight than GloSea5, due to the larger number of ensemble members. By taking the mean of this larger ensemble, unpredictable ‘noise’ is further reduced, leaving a more accurate estimate of the forced (predictable) signal. The greater number of ensemble members should therefore give a better estimate of the true skill of our systems over the common time period, DJF 1992/93–2011/12.

We use the ERA-Interim re-analysis dataset (ERAI, Dee et al. 2011) as a proxy for observations to assess the performance of the ensemble hindcasts. Previous studies comparing ERAI wind speeds to station and sounding observations over China have shown that although biases exist in ERAI, the dataset captures observed inter-annual and seasonal variations well (Zha et al. 2017).

1.2 Methods

Skill is calculated using the Pearson correlation coefficient (r). A correlation skill score is chosen because it is not affected by errors in amplitude from the forecasts, which is important given the signal-to-noise paradox previously identified in seasonal NAO forecasts (Eade et al. 2014; see Sect. 2.1). The Pearson correlation coefficient is favoured over the rank correlation coefficient because the former puts more weight on prediction of extreme seasons, which is likely to be important for users. All significance level thresholds are calculated assuming a two-tailed Student’s t test, and the significance level used throughout the paper is 95%.

When calculating internal relationships between climate indices and variables within ERAI [Figs. 4a, 5a, b and in Table 2 (second row)], we calculate the correlation coefficient for the full ERAI period (DJF 1979/80–2015/16, 37 years of data). For calculating the same internal relationships within the model [Figs. 4b, 5c, d and Table 2 (second row)], we use the data from all individual ensemble members from the combined ensemble. Since the combined ensemble covers 1993–2012 and has 64 members, this amounts to 1280 years of data. Consequently there is a much lower threshold of r for an internal correlation to be considered significant compared to the threshold in ERAI.
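As an illustration of how the sample size sets the significance threshold, the following minimal sketch computes the smallest |r| that passes a two-tailed Student's t test at the 5% level, using r = t/√(t² + n − 2). It is not the authors' code: the function names are hypothetical, the t quantile is obtained by simple numerical integration so that only the standard library is needed, and every data point (including each member-year of the model) is assumed to be independent.

```python
import math


def t_pdf(t, dof):
    """Density of Student's t distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(t * t / dof))


def t_cdf(t, dof, steps=2000):
    """CDF for t >= 0: 0.5 plus Simpson's-rule integral of the density on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, dof) + t_pdf(t, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 + total * h / 3


def critical_r(n_points, alpha=0.05):
    """Smallest |r| significant at level alpha (two-tailed) for n_points pairs."""
    dof = n_points - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection for the (1 - alpha/2) quantile of t
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    return t_crit / math.sqrt(t_crit ** 2 + dof)


# Sample sizes used in the paper: hindcast years, ERAI years, model member-years
for n in (20, 37, 1280):
    print(n, round(critical_r(n), 3))
```

Under these assumptions, the thresholds for 37 and 1280 data points differ by roughly a factor of six, which is why much weaker internal correlations are significant in the model than in ERAI.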

To assess the signal-to-noise ratio in the combined ensemble in Sect. 2.1 (Fig. 3), we use the method of Dunstone et al. (2016). We estimate the skill, r, of the combined ensemble for a given number of ensemble members, n, by randomly sampling n ensemble members for each hindcast year independently without replacement to create n ‘new’ ensemble members. Thus each ‘new’ member is a combination of years from different members of the original ensemble. This is possible due to the construction method of the ensemble and reinitialisation of the model every winter. The mean of the n-member ensemble is taken and r is calculated. This is repeated 5000 times, and the mean r of the 5000 samples is plotted against ensemble size.
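The resampling procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for Fig. 3: the data layout (`hindcast[y][m]` as a nested list of years and members), the function names and the fixed random seed are all assumptions made for the example.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def resampled_skill(hindcast, obs, n, n_samples=5000, rng=None):
    """Mean skill of n-member ensemble means built by resampling members.

    hindcast[y][m] is the value for hindcast year y and ensemble member m
    (hypothetical layout); obs[y] is the verifying observation.
    """
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    total = 0.0
    for _ in range(n_samples):
        # Draw n members independently for each year, without replacement
        ens_mean = [sum(rng.sample(members, n)) / n for members in hindcast]
        total += pearson_r(ens_mean, obs)
    return total / n_samples
```

Calling `resampled_skill(hindcast, obs, n)` for n = 1, 2, ..., 64 would give a curve of skill against ensemble size of the kind described in the text.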

To estimate the skill of the model predicting itself (dashed line in Fig. 3), the method above is repeated but a new ‘observational’ time series is created by randomly sampling from the ensemble members. The theoretical relationship between number of ensemble members (n) and skill (\( r_{n} \)) depends only on the average skill of a single ensemble member in predicting observations (\( r_{\text{sing}} \)), and is given by \( r_{n} = n^{1/2} r_{\text{sing}} /\left[ 1 + (n - 1)r_{\text{sing}} \right]^{1/2} \) (Murphy 1990). To obtain an accurate estimate of \( r_{\text{sing}} \) we sample as above (using n = 1) but with 10,000 samples.
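The theoretical relationship can be evaluated directly. The short sketch below implements the formula exactly as written above; the function name and the example value of the single-member skill are illustrative assumptions, not values from the paper.

```python
import math


def theoretical_skill(n, r_sing):
    """Skill of an n-member ensemble mean given single-member skill r_sing.

    Implements r_n = sqrt(n) * r_sing / sqrt(1 + (n - 1) * r_sing).
    """
    return math.sqrt(n) * r_sing / math.sqrt(1 + (n - 1) * r_sing)


# Example with an arbitrary single-member skill of 0.25 (illustrative only)
for n in (1, 8, 24, 40, 64):
    print(n, round(theoretical_skill(n, 0.25), 3))
```

For n = 1 the expression returns the single-member skill, and for large n it tends to the square root of the single-member skill, so the curve saturates rather than approaching 1.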

The ratio of predictable components between the observations and model (RPC, Eade et al. 2014), calculated in Sect. 2.1, is estimated as \( \text{RPC} \ge r/\sqrt{\sigma_{sig}^{2} /\sigma_{tot}^{2}} \), where \( \sigma_{sig}^{2} \) is the variance of the model signal (i.e. the variance of the model mean), and \( \sigma_{tot}^{2} \) is the variance of the ensemble members. If the model and observations are interchangeable then RPC ≈ 1. If the model ensemble is overconfident (which results in an under-dispersed ensemble), then RPC ≪ 1. RPC ≫ 1 corresponds to the more unusual case of underconfident forecasts, which has recently been reported for winter seasonal predictions of the NAO (Scaife et al. 2014).
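A minimal sketch of the RPC estimate is given below. It assumes a nested-list layout `hindcast[y][m]` (year, member) and interprets the total variance as the variance across years of each individual member, averaged over members; that reading, and the function names, are assumptions made for this example and may differ in detail from the calculation used in the paper.

```python
import math


def variance(x):
    """Population variance of a sequence."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))


def rpc_lower_bound(hindcast, obs):
    """Estimate RPC >= r / sqrt(sig_var / tot_var)."""
    n_years, n_members = len(hindcast), len(hindcast[0])
    ens_mean = [sum(members) / n_members for members in hindcast]
    sig_var = variance(ens_mean)  # variance of the ensemble-mean signal
    # Assumption: total variance = mean over members of each member's variance
    tot_var = sum(variance([hindcast[y][m] for y in range(n_years)])
                  for m in range(n_members)) / n_members
    r = pearson_r(ens_mean, obs)  # skill of the ensemble mean
    return r / math.sqrt(sig_var / tot_var)
```

In this sketch, an ensemble whose members all reproduce the observations exactly gives a value of 1, while adding member noise that cancels in the ensemble mean increases the total variance and pushes the estimate above 1.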

For the analysis in Sects. 2.2 and 2.3, and the linear regression models in Sect. 2.4, the ENSO index used is the Niño 3.4. We use a fixed definition of the Arctic Oscillation (AO, Thompson and Wallace 1998), defined as the difference in MSLP between two bands of latitude at 35°N–45°N and 60°N–70°N. The National Oceanic and Atmospheric Administration (NOAA)/Climate Prediction Center (CPC) define the AO as the projection of the 1000 mb height anomalies poleward of 20°N onto the first leading mode from EOF analysis of monthly mean height anomalies at 1000 hPa. Our fixed definition avoids discrepancies between the model and observations due to possible different loading patterns, and is very similar to the definition used by Li and Wang (2003), who use zonally averaged MSLP anomalies between 35°N and 65°N. In ERAI, the AO definitions are very highly correlated, with r = 0.9.

The definitions of the bespoke climate indices for the NC China region (Z500-N, Z500-S, MSLP dipole index and MEJS), used in the analysis in Sect. 2.3 and linear regression modelling in Sect. 2.4, are summarised in Appendix 1.

The composites of MSLP and 10 m wind vector anomalies for El Niño (La Niña) winters in Fig. 4 are made by selecting winters which have a Niño 3.4 index value more than one standard deviation above (below) the mean in ERAI over the entire ERAI period. The corresponding model composites are for the same winters as ERAI. Only DePreSys3 data is used for the model composites so that more years can be included. Similarly, the composites of 500 hPa geopotential height and wind vector anomalies in Fig. 6 for windy (calm) winters in NC China are made by selecting winters with a normalised area averaged wind speed over NC China more than one standard deviation above (below) the mean in ERAI, and only DePreSys3 data is included in the model composites.
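The selection of composite winters can be sketched as below. The example interprets the threshold as the normalised index exceeding +1 (or falling below −1) standard deviation, and uses the population standard deviation; both choices, and the function names, are assumptions for illustration rather than a statement of the exact procedure used.

```python
def select_composite_years(years, index):
    """Split winters into positive and negative composites (1 s.d. threshold)."""
    mean = sum(index) / len(index)
    sd = (sum((v - mean) ** 2 for v in index) / len(index)) ** 0.5
    normalised = [(v - mean) / sd for v in index]
    positive = [y for y, z in zip(years, normalised) if z > 1.0]
    negative = [y for y, z in zip(years, normalised) if z < -1.0]
    return positive, negative


def composite_mean(anomaly_by_year, selected_years):
    """Average a per-year anomaly (a scalar here, a map in practice)."""
    return sum(anomaly_by_year[y] for y in selected_years) / len(selected_years)
```

Applied to the ERAI Niño 3.4 index, `positive` and `negative` would correspond to the El Niño and La Niña winters; the same lists of years would then be used to composite the model anomalies.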

In Sects. 2.2 and 2.3, we identify the likely sources of skill for wind speeds in SE and NC China. The correlations between the predicted index describing the skill source (e.g. ENSO for SE China) and the observed wind speed give an indication of the skill that would be obtained if using the index rather than the wind speed directly from the model. However, this indirect method requires the relationship between the predictor and observed wind speed to be established. For a real forecast the observation would be unknown, so to obtain a fairer indication of the performance of the indirect method, cross-validated linear regression is necessary. This is performed in Sect. 2.4. Here, the wind speed for a given year is predicted using the linear regression model derived from the remaining years in the hindcast. The regression coefficients are calculated using normalised values of the predictors from the dynamical model (combined ensemble) and normalised wind speeds from the observations.
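A leave-one-out version of the cross-validated regression can be sketched as follows for a single predictor (such as the predicted ENSO index for SE China). This is a simplified illustration under stated assumptions: the function names are hypothetical, only one predictor is used, and the normalisation step is omitted because, for a least-squares fit with an intercept, it rescales the coefficients without changing the predictions.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def leave_one_out_predictions(predictor, target):
    """Predict each year from a regression fitted to all the other years."""
    predictions = []
    for i in range(len(predictor)):
        x_train = predictor[:i] + predictor[i + 1:]
        y_train = target[:i] + target[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(intercept + slope * predictor[i])
    return predictions
```

The skill of the indirect method would then be the correlation between these predictions and the observed wind speed, since each prediction is made without using the observation for that year.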

2 Results

2.1 Skill assessment

Figure 1a, b show maps of the correlation coefficient (r) at each grid point between the ERAI DJF 10 m wind speed and the ensemble mean DJF 10 m wind speed from GloSea5 and DePreSys3 respectively. Figure 1a reveals five regions over China with statistically significant skill at the 5% level, marked in Fig. 1a–c with boxes. These regions are (1) south-eastern China and the South China Sea (SE China, 108°E–125°E, 18°N–28°N), (2) Yunnan province (98°E–105°E, 22°N–30°N), (3) Southern Tibet (81°E–91°E, 28°N–35°N), (4) Northern central China (NC China, 100°E–112°E, 34°N–43°N), and (5) North East China (NE China, 115°E–123°E, 40°N–50°N). The correlation coefficients for the area averaged DJF mean wind speed for regions (1)–(5), for GloSea5, DePreSys3 and the combined ensemble, are given in Table 1. For DePreSys3 the results are given over the whole ensemble time period (1981–2015) and the common time period (1993–2012).
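An area average over one of these boxes can be sketched as below for a field on a regular latitude–longitude grid. The cosine-of-latitude weighting is a common convention and is an assumption here, since the text does not state how the area averages were weighted; the function name and data layout are likewise hypothetical.

```python
import math


def box_mean(field, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Area-weighted mean of field[i][j] (lat i, lon j) inside a lat-lon box."""
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        weight = math.cos(math.radians(lat))  # grid-cell area scales with cos(lat)
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                total += weight * field[i][j]
                weight_sum += weight
    return total / weight_sum


# Usage for the SE China box (108°E–125°E, 18°N–28°N), given a wind-speed grid:
# se_china = box_mean(wind_speed, lats, lons, 18, 28, 108, 125)
```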

Fig. 1 Winter seasonal prediction skill (correlation) for China in a GloSea5 (24 members, 1992/93–2011/12); b DePreSys3 (40 members, 1980/81–2014/15); and c the combined ensemble for the common period (64 members, 1992/93–2011/12). Stippling shows regions where the skill is significant at the 5% level, and the boxes correspond to the regions in Table 1

Table 1 Skill of regional wind speed predictions

The skill appears to be robust to using an independent ensemble and longer time period because all regions also show areas of statistically significant skill in the DePreSys3 ensemble (Fig. 1b), apart from NE China (region five). Given the longer time period and larger ensemble size of the DePreSys3 hindcast, this gives added confidence that the skill in regions (1)–(4) is robust. Table 1 shows that DePreSys3 does have significant skill in NE China when only 1993–2012 is considered, and the DePreSys3 skill is also higher for this time period for NC China and SE China compared to 1981–2015. This may hint at more predictability in the later period for these regions, as has previously been observed for other phenomena such as the NAO (Weisheimer et al. 2017), although there is not yet enough evidence to say this definitively here. The skill map of the combined ensemble (Fig. 1c) resembles that of GloSea5 and DePreSys3 individually, with significant skill in all five areas. These results confirm the findings of Bett et al. (2017), showing that they hold for a larger ensemble and also over a longer time period.

For the remainder of this paper we focus on finding the sources of skill in SE China and NC China, because these are the regions where we are most confident the skill is robust, since they have the highest skill in the combined system (r = 0.83 and 0.63 respectively; see Table 1), and the skill is statistically significant in the GloSea5 and DePreSys3 ensembles independently. In addition, according to the freely available map images of wind farm locations on thewindpower.net, there is a substantial number of wind farms within NC China (especially within Ningxia province and around its border with Inner Mongolia), and in SE China along the Fujian coastline.

The observed and ensemble mean time series of area averaged DJF wind speed for SE and NC China are shown in Fig. 2a, b. Note that for NC China the observed and model wind speeds are plotted on different scales due to the model having a negative bias of approximately 1 m s−1 (~30%) and smaller variability by a factor of five. The bias may be due to the complex terrain of the NC China region, since it includes the edge of the Tibetan plateau. The true bias, however, may not be as pronounced as shown in Fig. 2b since ERAI shows a positive bias in surface wind speed compared to station observations in general over China (Zha et al. 2017). Small amplitude variability is expected in ensemble means because the wind speed is the sum of a predictable (signal) and unpredictable (noise) component, so taking the mean leaves only the signal. The reason this is more of an issue in NC China compared to SE China could be because the predictable signal comes from different mechanisms (see Sects. 2.2 and 2.3) and is weaker in NC China. The variability of the model mean may be further underestimated if the relative strength of signal and noise in the model differs from observations, as discussed below.

Fig. 2 Time series of area averaged DJF mean wind speed in ERAI, DePreSys3 (1981–2015), GloSea5 and combined ensemble (1993–2012) for (a) SE China (region 1) and (b) NC China (region 4) respectively. Note the different y-scale for ERAI and dynamical model wind speeds in (b). The dashed line in each plot shows the cross validated linear regression models described in Sect. 2.4. In (a) the covariate is ENSO and in (b) the covariates are the Z500-N and Z500-S indices

Recent studies have shown that for extratropical climate indices such as the NAO, a large ensemble is required to achieve high skill (Scaife et al. 2014, Eade et al. 2014). This is interpreted as the signal-to-noise ratio in the model being significantly smaller than estimated from observations, i.e. the real world appears to be more predictable than the model. We test if this is the case for wind speed in SE and NC China in Fig. 3 by plotting the estimated skill, r, of the combined ensemble, against number of ensemble members, n (see Sect. 1.2). Comparing Fig. 3a, b, a larger ensemble is required to achieve high skill in NC China compared to SE China. The ratio of predictable components between observations and model (RPC, Eade et al. 2014; see Sect. 1.2) is 2.1 for NC China, whereas for SE China the RPC is closer to 1 (RPC = 1.1).

RPC greater than 1 also implies that the skill of the model in predicting one of its own ensemble members is less than the skill of the model in predicting observations (Eade et al. 2014). The skill of the model predicting itself is shown by the dashed lines in Fig. 3. Indeed, for NC China the skill of the model predicting itself is much lower than the skill in predicting observations and not statistically significant, whereas for SE China these quantities are comparable. This is in agreement with previous results, that extra-tropical phenomena such as the NAO and AO appear less predictable (i.e. have a smaller forced component) in seasonal forecast models compared to the real world (Scaife et al. 2014, Eade et al. 2014, Stockdale et al. 2015, Dunstone et al. 2016, Athanasiadis et al. 2017, Kumar and Chen 2017).

Fig. 3 Prediction skill (r) against ensemble size for a SE China, and b NC China. The solid line shows the skill measured experimentally by calculating the mean skill of repeated random samples of ensemble members for each year to create ‘new’ ensemble members. The dotted line shows the theoretical relationship (Murphy 1990). The dashed line shows the skill of the model in predicting one of its own members, measured experimentally

Table 2 Skill of predicting climate indices and their correlation with Chinese wind speed

2.2 SE China

In this and subsequent sections, all skill values presented refer to the combined ensemble unless otherwise stated. The model shows very high skill (r = 0.83) in predicting DJF wind speed over SE China. The strength of the EAWM is often quantified by anomalies in the lower tropospheric meridional winds around the SE China region (Yang and Lu 2014), but note that here we find very similar skill in both the time mean meridional and zonal components (r = 0.74 and 0.77 for the meridional and zonal components respectively).

Bett et al. (2017) attributed the skill in this region to the El Niño Southern Oscillation (ENSO). ENSO is the leading mode of variability in the tropics (Barnston et al. 2011) and is highly predictable in winter on seasonal timescales (the skill of the Niño 3.4 index for the combined ensemble is r = 0.99, see Table 2). Figure 4a shows the correlation of the area averaged wind speed in SE China with sea surface temperatures (SST) over the period DJF 1979/80–2015/16, which does indeed reveal a La Niña-like pattern. The model reproduces this pattern (Fig. 4b), although the region of significance is much larger than in ERAI due to the larger amount of data (1280 years in the model compared to 37 years for ERAI). In Table 2 we also see that the correlations between the Niño 3.4 index and area averaged wind speed in SE China in ERAI and in the model ensemble members are −0.66 and −0.68 respectively, confirming this relationship. Note that the relationship is stronger in the model when calculated using the ensemble mean ENSO and ensemble mean wind speed, with r = −0.92. The correlation between the model predicted Niño 3.4 index and the observed area averaged wind speed in SE China is −0.83, indicating that using the predicted ENSO index to predict wind speed in SE China gives as good a prediction as using the wind speed taken directly from the model. This is consistent with ENSO being the main source (>50% variance) of wind speed skill in this region. This is discussed further in Sect. 2.4 where we test cross-validated linear regression models.

Fig. 4 ENSO connection to SE China winds. Correlation of area averaged DJF wind speed in SE China (region marked with inner box) with sea surface temperatures in ERAI (a) and in the combined model ensemble, calculated on ensemble members (b). Stippling shows significance at the 5% level [|r| > 0.32 for ERAI (37 data points), |r| > 0.055 for the model (1280 data points)]. The larger box shows the area in the panels below. c, d Mean DJF MSLP and 10 m wind vectors over China in ERAI and the combined model respectively (over the whole ERAI and combined model period). e, f Composites of MSLP and 10 m wind vector anomalies in ENSO positive (El Niño) winters (1983, 1992, 1998, 2010 and 2016) in ERAI and DePreSys3 respectively. g, h Composites of MSLP and 10 m wind vector anomalies in ENSO negative (La Niña) winters (1989, 1999, 2000, 2008 and 2011) in ERAI and DePreSys3 respectively. Note that for panels f and h, only DePreSys3 data is used rather than the combined model (see Sect. 1.2)

The mechanism by which ENSO affects wind speeds in SE China is shown in Fig. 4c–h. For ERAI, Fig. 4c shows the mean MSLP (mean sea level pressure) and DJF 10 m wind vectors in the region, and composites of the anomalies for El Niño and La Niña winters are shown in Fig. 4e, g. The composites show high (low) pressure anomalies off the south-east coast of China in El Niño (La Niña) winters. The flow around these anomalies opposes (enhances) the mean flow shown in Fig. 4c, leading to decreased (intensified) wind speeds. These results are consistent with the findings of Lu et al. (2017), who found that the anomalous south-westerly flow in SE China in El Niño winters results in enhanced precipitation over the region.

The model mean flow and composite anomalies are shown in Fig. 4d, f, h, with the anomalies plotted for the same years as for ERAI. The composites show that the model captures the surface wind anomalies in El Niño and La Niña years remarkably well.

These results are consistent with the findings of Yang and Lu (2014), who investigated the predictability of 21 different EAWM indices in the ENSEMBLES dataset (van der Linden and Mitchell 2009). They found that the EAWM indices measuring anomalies in lower tropospheric meridional winds around SE China were the most predictable, and these same indices also had the strongest correlations with ENSO.

2.3 NC China

The skill of wind speed predictions in NC China is not related to ENSO. Over the ERAI period the correlation between the observed Niño 3.4 index and observed wind speed in NC China is 0.11, and over the period covered by the combined model hindcast (1992/93–2011/12) it is even smaller (−0.02). As for SE China, the model skill is very similar for the time mean meridional and zonal components (r = 0.34 and 0.33 respectively), although the lower skill in the components indicates that the model is better at capturing changes in wind speed rather than direction.

Large scale climate modes such as the Arctic Oscillation (AO) have particular spatial patterns in MSLP, so in Fig. 5a, c we show the correlation of area averaged DJF wind speed in NC China with MSLP in ERAI and the combined model ensemble. Both ERAI and the model show that the wind speed in NC China is related to a local east–west pressure gradient (the difference in MSLP between the black solid and dashed boxes), but since the mean elevation of the NC China region is approximately 1600 m in our model and has an average surface pressure of ~800 hPa in DJF, using a local MSLP gradient as a predictor of surface winds makes little sense here. In fact, the model has insignificant skill at the 5% level in predicting this MSLP gradient, and the correlation between the model predicted MSLP difference and observed wind speed in NC China is just r = −0.05 (see Fig. 5e and Table 2), confirming that this is not the source of predictability seen in the model. Figure 5a, c also show that the wind speed in NC China is not dominated by the intensity of the Siberian-Mongolian High (SMH), which is centred north west of the eastern MSLP box (solid line). The correlation of the SMH (using an index defined by MSLP anomalies in the region 40–60°N and 70–120°E; Chang and Lu 2012) with wind speed in NC China in ERAI data is only r = −0.24, reflecting this fact.

Fig. 5 Relations between surface wind speeds in NC China and other variables. a Correlation of area averaged wind speed in NC China (red box) with MSLP in ERAI; c Correlation of area averaged wind speed in NC China with MSLP in the combined model ensemble (calculated on ensemble members); e Skill of the combined model ensemble in predicting MSLP. The solid and dashed black boxes mark the regions used to calculate the MSLP dipole index in Sect. 2.3. b, d, f As left hand panels but for Z500. The solid and dashed boxes outline the Z500-S (central Asia, 45°E–85°E, 30°N–50°N) and Z500-N (eastern Siberia, 90°E–135°E, 45°N–65°N) regions in Sect. 2.3 respectively. Stippling marks significance at the 5% level [|r| > 0.32 for a and b (37 data points), |r| > 0.055 for c and d (1280 data points), |r| > 0.44 for e and f (20 data points)]

Figure 5a, c, however, do also resemble a negative AO pattern, which can be skilfully predicted by our systems (MacLachlan et al. 2015). From Table 2, we see that the skill in predicting the AO is 0.56 (significant at the 1% level), and it has a significant anti-correlation with wind speed in NC China in both the observations and model (r = −0.45 and −0.35 respectively, both significant at < 1%). However, the correlation between the model predicted AO and observed wind speed in NC China is only r = −0.19 (statistically insignificant). So although wind speed in NC China appears to be related to the AO, it is not the full source of predictability and does not appear to explain the wind speed skill seen in the model.

In Fig. 5b, d, f we examine 500 hPa geopotential height signals. Figure 5b, d show the correlation of wind speed in NC China with the geopotential height at 500 hPa (Z500) in ERAI and the combined ensemble respectively, and the skill of the model in predicting Z500 is shown in Fig. 5f. Two regions are identified where there is strong correlation between Z500 and wind speed in NC China, and where the model has areas of significant skill in predicting Z500: over eastern Siberia (Z500-N) and central Asia (Z500-S) (marked with the dashed and solid black boxes in Fig. 5b, d, f). This Z500 pattern has a slight resemblance to the type 2 Eurasian pattern, also known as the East Atlantic/West Russia (EATL/WRUS) pattern (Liu et al. 2014), although we find that the correlation of the EATL/WRUS index as calculated by the Climate Prediction Center/National Oceanic and Atmospheric Administration (CPC/NOAA, https://www.cpc.ncep.noaa.gov/data/teledoc/telecontents.shtml) with wind speed in NC China in ERAI is low (r = −0.23). Therefore, we define an index based on the Z500 anomalies in the regions mentioned above.

Table 2 shows that the model has significant skill in predicting the difference in Z500 (Z500-dipole) over these two regions (r = 0.45, significant at the 5% level), and that Z500-dipole is very highly correlated with wind speed in NC China in both the observations and model (r = 0.77 and 0.71 respectively, both significant at < 1%). The correlation between the model predicted Z500-dipole and observed wind speed in NC China is 0.70 (significant at < 1%) and comparable to the direct model skill (0.63). This could indicate that the model’s skill in wind speed in NC China is related to its skill in predicting Z500 over the identified regions. Using Z500 as a predictor of wind speed in NC China is discussed further in Sect. 2.4.

The mechanism by which Z500 in these regions influences wind speed in NC China is shown in Fig. 6. Figure 6a, b show the ERAI and model mean Z500 and 500 hPa wind vectors for DJF, revealing the average north-westerly flow at 500 hPa in NC China. Figure 6c, e show composites of the Z500 and 500 hPa wind vector anomalies for observed windy and calm winters, and Fig. 6d, f show the model composite anomalies for the same years. The anomalies seen in the model are much smaller compared to ERAI (and hence different colour and vector scales are used in Fig. 6). This is to be expected since the model composites are based on the ensemble mean anomalies, so they only show the predictable component (the signal) of the anomalies in the same years as the observations. Furthermore, in Sect. 2.1 we demonstrated that the signal in the model over NC China is smaller than observations (RPC > 1), resulting in even smaller model mean anomalies. If instead we made model composites taking the windy/calm years from individual ensemble members we would expect similar magnitude variations.

Fig. 6
Windy and calm winters over NC China. a, b Mean DJF Z500 and 500 hPa wind vectors in ERAI and the combined ensemble respectively (over the whole ERAI and model periods). c, d Composites of Z500 and 500 hPa wind vector anomalies in windy winters over NC China (1981, 1983, 1987, 2001, 2009, 2010 and 2011) in ERAI and DePreSys3 respectively. e, f Composites of Z500 and 500 hPa wind vector anomalies in calm winters over NC China (1989, 1990, 1992, 2008 and 2012) in ERAI and DePreSys3 respectively. The small black box marks the NC China region, and the larger black boxes mark Z500-N and Z500-S. The red box marks the MEJS region defined in Sect. 2.3.

In windy winters, both ERAI and the model show there is a strong negative Z500-N anomaly, with Z500-S remaining close to climatology. Cyclonic circulation around this anomaly intensifies the mean flow in NC China. In calm winters, in ERAI there is a strong negative anomaly in Z500-S and weaker positive anomaly in Z500-N, and the cyclonic and anti-cyclonic flow around these anomalies respectively weakens the mean flow in NC China. The model captures the pattern of these anomalies well, but their relative strength is reversed.
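The compositing used for these windy and calm winters amounts to averaging anomalies over the selected years. A minimal sketch is given below, using the years listed in the Fig. 6 caption; the anomaly values are hypothetical and the function would be applied at every grid point to build a map.

```python
def composite(anomaly_by_year, years):
    """Mean anomaly over the selected winters."""
    return sum(anomaly_by_year[y] for y in years) / len(years)

# Winters listed in the Fig. 6 caption.
WINDY = [1981, 1983, 1987, 2001, 2009, 2010, 2011]
CALM = [1989, 1990, 1992, 2008, 2012]

# Hypothetical Z500-N box-mean anomalies (m); all other winters set to zero.
z500_n_anom = {year: 0.0 for year in range(1981, 2013)}
z500_n_anom.update({2001: -40.0, 2009: -55.0, 2010: -60.0, 2008: 10.0, 2012: 15.0})

windy_composite = composite(z500_n_anom, WINDY)
calm_composite = composite(z500_n_anom, CALM)
print("windy:", round(windy_composite, 1), "calm:", round(calm_composite, 1))
```

Because a composite is a plain average, a few winters with large anomalies can dominate it, which is worth bearing in mind when interpreting composite maps.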

Fig. 7

Cross-validated linear regression fits to wind speed in SE and NC China. The ENSO coefficient is for the SE China regression model, and the remaining coefficients (Z500-N, Z500-S and AO) are for the NC China model. For NC China, the coefficients for the regression model including all three predictors are plotted with thin black lines, and the coefficients for the model including both Z500-N and Z500-S (but not AO) as predictors are plotted with thick black lines. The boxes show the range of coefficient values resulting from the cross-validation procedure, and the whiskers give the mean 95% confidence intervals.

It should be noted that there is much diversity in the Z500 patterns in both the ERAI and model composites. The Z500-S anomalous cyclone in the ERAI calm composites (see Fig. 8) is dominated by the calm winters 2008 and 2012, and this feature was also well captured by the DePreSys3 anomalies for the same years (Fig. 9). The Z500-N anomalous cyclone in the ERAI windy composites is dominated by the winters 2001, 2009 and 2010 (Fig. 10). Of these, only winter 2010 appears to be well forecast by the model (Fig. 11). The time series of wind speed in NC China in Fig. 2 shows that 2008 and 2012 were both extremely calm years, while 2010 was extremely windy, and it is clear that much of the model’s skill comes from the accurate forecasting of these years. In fact, if these years are removed from the time series, the skill for the combined model reduces to r = 0.18 (statistically insignificant). The significance levels tell us it is unlikely these winters were well forecast by chance, but this highlights the sensitivity of the skill to individual years.
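A sensitivity test of this kind can be sketched as follows: the correlation skill is recomputed after dropping selected winters. The series below are hypothetical and were chosen so that three extreme winters dominate the correlation; they are not the data analysed here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def skill_without(years, forecast, observed, drop):
    """Correlation skill after removing the winters listed in `drop`."""
    keep = [i for i, y in enumerate(years) if y not in drop]
    return pearson([forecast[i] for i in keep], [observed[i] for i in keep])

# Hypothetical standardised wind speed anomalies.
years = [2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012]
observed = [0.1, -0.1, 0.1, -2.0, -0.1, 2.0, 0.1, -2.0]
forecast = [-0.1, 0.1, 0.1, -1.0, 0.1, 1.0, -0.1, -1.0]

full_skill = skill_without(years, forecast, observed, drop=set())
reduced_skill = skill_without(years, forecast, observed, drop={2008, 2010, 2012})
print("all winters:", round(full_skill, 2), "extremes removed:", round(reduced_skill, 2))
```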

Also revealed in Fig. 6c–f is a region over the Middle East where strong negative (positive) anomalies in the zonal component of the 500 hPa wind are seen in windy (calm) years, marked by the red box (35°E–70°E, 25°N–35°N). These anomalies are also seen higher up in the atmosphere, at 250 hPa, where the geopotential height anomalies resemble those at 500 hPa. These zonal anomalies are due to both the anomalies in Z500-S and a smaller anomaly of the opposite sign over the Arabian Sea. We mention this because it has previously been noted that the strength of the upper atmosphere Middle Eastern jet stream (MEJS) has an influence on the winter climate in central China (e.g. Zuo et al. 2015, Wen et al. 2009, Zhang et al. 2009). The correlation between the MEJS (defined by the area averaged anomaly in zonal wind over the red box) and wind speed in NC China is − 0.64 in ERAI and − 0.61 in the combined ensemble (calculated on ensemble members), with both values significant at < 1%. The skill of the combined ensemble in predicting the MEJS is 0.67 (significant at < 1%), and the correlation between the model predicted MEJS and observed wind speed in NC China is − 0.46 (significant at 5%). This implies that the MEJS could also be used as a predictor of wind speed in NC China, although it does not give as high a skill as the Z500-dipole index. Furthermore, being able to skilfully predict the MEJS could have implications for predicting other variables such as temperature and precipitation.
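For readers wishing to construct a similar index, an area average over the red box could be computed as in the sketch below. The cosine-of-latitude weighting, the regular 5° grid and the field values are assumptions made for illustration; the text above specifies only that the index is an area averaged zonal wind anomaly over the box.

```python
import math

def box_mean(field, lats, lons, lat_range, lon_range):
    """Cosine-latitude weighted mean of field[i][j] over a lat/lon box."""
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            if lon_range[0] <= lon <= lon_range[1]:
                total += weight * field[i][j]
                weight_sum += weight
    return total / weight_sum

# Hypothetical regular grid and zonal wind anomaly field (m/s).
lats = [20.0, 25.0, 30.0, 35.0, 40.0]
lons = [30.0 + 5.0 * k for k in range(10)]  # 30E to 75E
u_anom = [[1.0 if 25.0 <= lat <= 35.0 else 0.0 for _ in lons] for lat in lats]

# Box limits taken from the text: 35E-70E, 25N-35N.
mejs = box_mean(u_anom, lats, lons, (25.0, 35.0), (35.0, 70.0))
print("MEJS index:", mejs)
```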

We also tested whether wind speeds in NC China were related to any of the EAWM indices mentioned by Yang and Lu (2014). Wind speed in NC China tends to have the highest correlation with the east–west pressure gradient EAWM indices, although the correlations are only moderate (six of these eight indices have |r| ~ 0.2–0.45, with the remaining two having r close to zero). Above we found that wind speed in NC China is related to an east–west pressure gradient, although the model predicted gradient was a poor predictor of observed wind speed, possibly due to the complex orography in the region. There was also reasonably strong correlation with the East Asian Trough index, I16 (r = 0.59, He et al. 2013), which measures Z500 anomalies in a large area overlapping the Z500-N region. These moderate correlations with the existing EAWM indices imply that these indices are also unlikely to be good predictors of wind speed in NC China, and instead a more specialised index is needed.

2.4 Linear regression models

Here we present the results of the cross-validated linear regression models (see Sect. 1.2), to test whether we have correctly identified the sources of skill in each region. Furthermore, linear regression models are widely used in seasonal forecasting research and operational forecasts and can sometimes lead to an improvement of skill (e.g. Clark et al. 2017, Palin et al. 2016). For SE China, the predictor in the regression model is ENSO, and the range of values of the ENSO coefficient is shown by the box-whisker symbol on the left in Fig. 7. The box shows the coefficient range resulting from the cross-validation procedure, and the whiskers give the mean 95% confidence intervals. The skill of the cross-validated regression model is 0.78, very close to the direct model skill of 0.83. Figure 2a shows the cross-validated regression model fit (dashed line), which closely resembles the output from the dynamical models. The regression model has been scaled to have the same mean and standard deviation as ERAI which accounts for the offset between it and the dynamical models.
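The rescaling step mentioned above, in which the regression output is given the same mean and standard deviation as the reanalysis, can be sketched as follows. The input values are hypothetical. The population standard deviation is used here, but because both series have the same length the choice between population and sample estimates does not affect the result.

```python
import statistics

def rescale_to_reference(series, reference):
    """Shift and scale `series` to the mean and standard deviation of `reference`."""
    mean_s, std_s = statistics.fmean(series), statistics.pstdev(series)
    mean_r, std_r = statistics.fmean(reference), statistics.pstdev(reference)
    return [mean_r + std_r * (v - mean_s) / std_s for v in series]

# Hypothetical raw regression output and reanalysis wind speed (m/s).
regression_fit = [0.4, -0.2, 0.1, -0.5, 0.3]
reanalysis = [3.4, 2.9, 3.1, 2.6, 3.5]

scaled = rescale_to_reference(regression_fit, reanalysis)
print([round(v, 2) for v in scaled])
```

Such a linear rescaling changes the offset and amplitude of the series but leaves its correlation with the observations unchanged.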

For NC China, the possible covariates identified were Z500-N and Z500-S. In Sect. 2.3 we showed that a dipole index constructed by taking the difference between Z500-N and Z500-S had a strong correlation with the wind speed in NC China, but since Z500-N and Z500-S are actually independent in the observations (r = 0.09), we keep them as separate covariates in the linear regression model. We also test including the AO in the model, which is significantly correlated with Z500-N in the observations (r = 0.65, significant at < 1%), but not with Z500-S (r = − 0.02, statistically insignificant).

Values of the coefficients of the two multi-linear regression models (with and without AO) show that wind speed in NC China has the strongest dependence on Z500-S (mean coefficient values are 0.62 and 0.72 in the models with and without AO respectively), followed closely by Z500-N (mean coefficient values are − 0.51 and − 0.62 in the models with and without AO respectively). Both of these coefficients are significantly different to zero with 95% confidence (Fig. 7), and the skill is higher using both predictors (r = 0.57) compared to using Z500-S and Z500-N individually (r = 0.29 and 0.042 respectively in cross-validated single linear regression models). Since the Z500-N and Z500-S coefficients are of similar magnitude but opposite sign, the results are very similar if the Z500-dipole index is used as a single covariate (r = 0.62).
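A two-predictor cross-validated regression of this kind can be sketched as below. Leave-one-out cross-validation is an assumption made here for illustration (the procedure actually used is the one described in Sect. 1.2 and may differ), and the standardised predictor and wind speed values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_two_predictors(x1, x2, y):
    """Least-squares fit y = a + b1*x1 + b2*x2 via the centred normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2

def loo_predictions(x1, x2, y):
    """Predict each year from a model fitted to all other years."""
    predictions = []
    for k in range(len(y)):
        train = [i for i in range(len(y)) if i != k]
        a, b1, b2 = fit_two_predictors(
            [x1[i] for i in train], [x2[i] for i in train], [y[i] for i in train]
        )
        predictions.append(a + b1 * x1[k] + b2 * x2[k])
    return predictions

# Hypothetical standardised indices and wind speed anomalies.
z500_s = [0.5, -1.2, 0.3, 1.8, -0.7, 0.1, -0.9, 1.1]
z500_n = [1.0, 0.2, -1.5, 0.4, -0.3, 1.3, -0.8, 0.0]
wind = [-0.2, -1.0, 1.2, 1.0, -0.4, -0.8, -0.1, 0.9]

predictions = loo_predictions(z500_s, z500_n, wind)
skill = pearson(predictions, wind)
print("cross-validated skill:", round(skill, 2))
```

Collecting the coefficients returned by `fit_two_predictors` for each left-out year would give a range of coefficient values analogous to the boxes in Fig. 7.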

Figure 2b shows the model fit without AO (dashed line) (scaled to ERAI), and shows that the regression model closely follows the variations seen in the dynamical model, consistent with the hypothesis that skill in Z500-N and Z500-S is responsible for the wind speed skill in NC China. Note in particular how the regression model captures the well forecast period of 2008–2012 and the poorly forecast winters 2001 and 2004.

The dependence on AO is much weaker, with a mean coefficient value of 0.25. Including AO in the regression model also makes little difference to the skill (r = 0.57 without AO, and 0.56 with AO). The MEJS was not included in the multi-linear regression models as it is a response to the anomalies in Z500-S (it has a very high correlation with Z500-S of r = − 0.72 in both ERAI and the combined ensemble members), but we tested using the MEJS alone as a predictor, which gave a skill of r = 0.18.

We therefore conclude that both Z500-N and Z500-S are required as covariates in the regression model for NC China, and the skill of this regression model (r = 0.57) is comparable to the direct skill of the dynamical model (r = 0.63). The results are very similar when analysing only the DePreSys3 ensemble over the longer time period, with the linear regression model using Z500-N and Z500-S as covariates giving the highest correlation (r = 0.47, very similar to the skill of DePreSys3 r = 0.43).

In theory, the linear regression (indirect) method should have a higher skill than using the wind speed directly from the model because it bypasses the noise added in the relationship between the predictors and wind speed (Palin et al. 2016). For both SE and NC China, however, the direct model skill is higher than the indirect method skill. This could be because the correlations are calculated on only 20 years of data, giving large uncertainties on the values. However, the fact that the linear regression models do give such high skill gives confidence that the covariates identified are responsible for most, if not all, of the wind speed skill in both regions.
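To give a rough sense of the sampling uncertainty on a correlation computed from 20 years, the sketch below applies the Fisher z-transformation. This is not the significance method used in this study, and it assumes independent, approximately bivariate normal samples, which may not hold for seasonal mean time series.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r from n samples."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

direct_lo, direct_hi = fisher_ci(0.63, 20)      # direct model skill, NC China
indirect_lo, indirect_hi = fisher_ci(0.57, 20)  # regression model skill, NC China
print("direct:", round(direct_lo, 2), round(direct_hi, 2))
print("indirect:", round(indirect_lo, 2), round(indirect_hi, 2))
```

Under these assumptions the intervals are wide, consistent with the point made above that differences of a few hundredths between the direct and indirect skill values cannot be distinguished with 20 years of data.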

3 Discussion and conclusions

We have shown that Met Office forecast systems have robust skill in forecasting DJF wind speeds over SE and NC China, since it is seen in two independent ensembles covering different time periods. These regions are of importance to the wind energy industry, given the large number of wind farms located there (https://thewindpower.net), and to the health sector given their large populations (the NC China region contains the large population centre Xi’an, and the SE China region includes the highly populated Fujian coast and northern coast of Guangdong, including the major cities Fuzhou, Quanzhou and Xiamen).

The high skill over these regions is seen in two independent ensembles which cover different time periods. The NC China region suffers from a similar signal-to-noise problem to the NAO, where the model appears to be less predictable than the real world. This problem tends to be seen more in extra-tropical regions compared to the tropics, despite the tropical sources of NAO prediction skill (Scaife et al. 2017).

In SE China, the predictability of wind speeds comes from the model’s ability to predict ENSO. The situation is more complicated in NC China, where there appear to be two independent regions of Z500 anomalies which are in balance with the surface wind speeds. The regions are SW (Z500-S) and NE (Z500-N) of NC China, and negative anomalies in Z500-S appear to be associated with calm winters, whereas negative anomalies in Z500-N are associated with windy winters. Linear regression modelling shows that anomalies in both regions are needed to predict wind speed in NC China. The AO is significantly anti-correlated with wind speed in NC China in observations, and although the model has skill in AO, including the AO in linear regression models does not lead to an improvement in wind speed forecasts over NC China. The linear regression models using these covariates (ENSO for SE China, Z500-N and Z500-S for NC China) give similar skill in wind speed compared to the direct model output, but do not improve it as would be expected if all drivers have been identified. However, this could be because the skill is only measured on 20 years of data, leading to uncertainties in the exact skill values. There is not currently enough evidence to recommend using the linear regression models over the dynamical model ensemble, but this should be re-assessed as more years of data become available.

The Z500-S anomaly is also associated with anomalies in the upper level jet streams over central Asia and the Middle East. The strength of the Middle Eastern jet stream (MEJS) is known to be linked with climate in China, and has been associated with temperature and precipitation anomalies in central and southern China (e.g. Zuo et al. 2015, Wen et al. 2009, Zhang et al. 2009). The combined ensemble has significant skill in predicting the MEJS, with r = 0.67. Bett et al. (2017) showed that winter temperature is poorly predicted by GloSea5 over China, so in the future it is worth investigating whether the skill in the MEJS demonstrated here could be used to improve predictions of winter temperature in some areas of China.

We note that throughout this paper, the variable analysed is 10 m wind speed. Since wind turbine hub heights tend to be 80–120 m, for relevance for wind power generation, wind speed forecasts at these heights would be more useful. The correlations between DJF mean wind speed at 10 m and at model levels at 71.9 m and 124.5 m in ERAI, over SE and NC China, range from 0.99 to 1.00, so the results presented in this paper also apply at hub height.

It is also not obvious that the skill results for area averaged seasonal mean wind speed apply to potential wind power generated, which is clearly of more interest to the energy industry. To estimate wind power, high temporal resolution (e.g. 6 hourly) wind speed data is converted to power using a turbine specific wind power curve (e.g. Kiss et al. 2009, Lydia et al. 2014, Bett et al. 2017), then the seasonal mean is taken. The area averaged seasonal mean wind power thus requires knowledge of the distribution of wind farms and their power curves, as well as being more computationally expensive to estimate. However, the recent study of Bett et al. (2018) showed that over Europe, after taking the seasonal and country averages, the resulting wind power capacity factors are very highly correlated with the corresponding mean wind speeds. Thus, where there is significant wind speed skill, there is also likely to be seasonal skill in forecasting wind power. It will, however, be necessary to test whether this result holds for China where there may be different sub-seasonal variability of wind speeds, and for different turbine properties and hub heights.
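The conversion described above can be sketched with an idealised power curve. The cut-in, rated and cut-out speeds and the cubic ramp below are hypothetical and do not represent any particular turbine; the point is only that power is computed from high frequency wind speeds first and averaged over the season afterwards.

```python
def capacity_factor(speed, cut_in=3.0, rated=12.0, cut_out=25.0):
    """Idealised power curve: fraction of rated power at a given wind speed (m/s)."""
    if speed < cut_in or speed >= cut_out:
        return 0.0
    if speed >= rated:
        return 1.0
    # Cubic ramp between cut-in and rated speed (an assumption for illustration).
    return (speed ** 3 - cut_in ** 3) / (rated ** 3 - cut_in ** 3)

def seasonal_capacity_factor(speeds):
    """Convert each wind speed to power, then take the seasonal mean."""
    return sum(capacity_factor(v) for v in speeds) / len(speeds)

# Hypothetical 6 hourly hub-height wind speeds (m/s).
speeds = [4.0, 11.0, 6.5, 13.0, 2.0, 9.0, 26.0, 7.5]
mean_speed = sum(speeds) / len(speeds)

print("seasonal mean capacity factor:", round(seasonal_capacity_factor(speeds), 3))
print("capacity factor at the mean speed:", round(capacity_factor(mean_speed), 3))
```

Because the power curve is non-linear, the two printed values generally differ, which is why the relationship between seasonal mean wind speed and seasonal mean power needs to be tested rather than assumed.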

In addition, if power generation data from specific wind farms in SE and NC China were available, it may be possible to generate site-specific seasonal forecasts using the relationship between the model wind speed (or climate indices) and observed wind power generated.