Attributing correlation skill of dynamical global GCM precipitation forecasts to statistical ENSO teleconnection using a set theory based approach

. Climate teleconnections are essential for the verification of valuable precipitation forecasts generated by global climate models (GCMs). This paper develops a novel approach to attributing correlation skill of dynamical GCM forecasts to statistical El Niño-Southern Oscillation (ENSO) teleconnection by using the coefficient of determination ( R 2 ). Specifically, observed precipitation is respectively regressed against GCM forecasts, Niño3.4 and both of them and then the intersection 15 operation is implemented to quantify the overlapping R 2 for GCM forecasts and Niño3.4. The significance of overlapping R 2 and the sign of ENSO teleconnection facilitate three cases of attribution, i.e., significantly positive anomaly correlation attributable to positive ENSO teleconnection, attributable to negative ENSO teleconnection and not attributable to ENSO teleconnection. A case study is devised for the Climate Forecast System version 2 (CFSv2) seasonal forecasts of global precipitation. For grid cells around the world, the ratio of significantly positive anomaly correlation attributable to positive 20 (negative) ENSO teleconnection is respectively 10.8% (11.7%) in December-January-February (DJF), 7.1% (7.3%) in March-April-May (MAM), 6.3% (7.4%) in June-July-August (JJA) and 7.0% (14.3%) in September-October-November (SON). The results not only confirm the prominent contributions of ENSO teleconnection to GCM forecasts, but also present spatial plots of regions where significantly positive anomaly correlation is subject to positive ENSO teleconnection, negative ENSO teleconnection and teleconnections other than ENSO. Overall, the proposed attribution approach can serve as an effective tool 25 to investigate the source of predictability for GCM seasonal forecasts of global precipitation. Kug, The relationship between forecast skill and the state of sea surface temperatures (SSTs) was evaluated for the seasonal outlook of precipitation over the United States and the skill was found to case CFSv2 South America in JJA (Cai et al., 2020) and Middle East in SON (Mariotti, 2007). 7.3%, 7.4% and 14.3% of grid cells are with significantly positive anomaly correlation attributable to negative ENSO teleconnection respectively in MAM, JJA and SON. Representative regions are southeast Asia in MAM, JJA and SON and large parts of Australia in SON, which . The pattern in southeast Asia is consistentsuggest that with previous studies that precipitation therein is strongly correlated with ENSO in the 350 dry season (Jiang & Li, 2017). Australian precipitation in SON has also been found to be greatlysubstantially influenced by the extratropical teleconnection pathway of ENSO (Cai & Weller, 2013). Further, 12.6%, 10.3% and 13.3% of grid cells are with significantly positive anomaly correlation not attributable to ENSO teleconnection. This result calls for the investigation of other teleconnection patterns for GCM seasonal precipitation forecasts. for GCM seasonal precipitation forecasts. The proposed approach in this paper can serve as an effective tool to achieve the attribution and can be further used to investigate other sources of skill of seasonal precipitation. The results not only confirm the prominent 425 contributions of ENSO teleconnection to GCM forecasts, but also present spatial plots of regions where significantly positive anomaly correlation is subject to positive ENSO teleconnection, negative ENSO teleconnection and teleconnections other than ENSO. Overall, the attribution approach proposed in this paper can serve as an effective tool to investigate the source of predictability for GCM seasonal forecasts of global precipitation.

. Despite the importance, the forecasting of precipitation remains a formidable task due to complex interactions of ocean, atmosphere and land surface processes (Doblas-Reyes et al., 2013;Vano et al., 2014;Johnson et al., 2019;Zhao et al., 2019;Tesfa et al., 2020). Comparing multiple sets of global temperature and precipitation forecasts from the North American Multi-Model Ensemble (NMME) experiment, Becker et al. (2020) highlighted that there are substantial improvements in temperature forecasts over both land and ocean during the 35 past decades and that there is still plenty of room for improvement of global precipitation forecasts.
Global climate models (GCMs) generate valuable forecasts of worldwide precipitation for hydrological modelling and water management (Doblas-Reyes et al., 2013;Kirtman et al., 2014;Schepen et al., 2020). Nowadays, GCMs have been employed by major climate centers around the world to produce operational climate outlooks (Demargne et al., 2014;Delworth et al., 2020). For example, the Climate Forecast System version 2 (CFSv2) of the U.S. National Centers for Environmental Prediction 40 (NCEP) has been implemented for coupled ocean-atmosphere forecasting since 2011 (Saha et al., 2010); the European Centre for Medium-Range Weather Forecasts (ECMWF) System 5 model became operational in 2017 (Johnson et al., 2019); and the Seamless System for Prediction and Earth System Research (SPEAR) became the next generation modelling system at the Geophysical Fluid Dynamics Laboratory (GFDL) in 2020 (Delworth et al., 2020). In the meantime, GCM forecasts have been increasingly incorporated into forecasting systems of streamflow, crop yield and soil water and they are shown to create 45 enormous socioeconomic benefits (e.g., Vano et al., 2014;Peng et al., 2018;Wang et al., 2019).
There have been in-depth investigations of ENSO for GCM forecasts as it is one of the most prominent modes of climate variability (Fu et al., 1997;Wang et al., 2003;Feng & Hao, 2021). For instanceexample, Attention is usually paid to regions subject to prominent ENSO influences (Kim et al., 2016;Rivera & Arnould, 2020;Vashisht et al., 2021). For instance, possible contributions of ENSO to precipitation forecast skill were investigated for the west coast of North America (Pegion & Kumar, 60 2013;Chen & Kumar, 2020). Tthe influence of ENSO on the East Asian-western Pacific climate was studied for CFSv2 forecasts (Yang & Jiang, 2014) and also for climate projections in the Coupled Model Inter-comparison Project Phase 5 (Gong et al., 2015;Kim et al., 2016;Kim & Kug, 2018). The relationship between forecast skill and the state of sea surface temperatures (SSTs) was evaluated for the seasonal outlook of precipitation over the United States and the skill was found to be dominantly attributed to ENSO in late autumn to late spring (Quan et al., 2006;Pegion & Kumar, 2013;Shin et al., 2019). 65 Understanding the sources for skillpredictability is important for physically-based validations of GCM forecasts necessary for the prospects for seasonal forecast skill (Neelin & Langenbrunner, 2013;Manzanas et al., 2014;Shin et al., 2019). Processbased evaluation linking ENSO to summer precipitation was performed over eastern Africa (Vashisht et al., 2021).
Nevertheless, previous studies tended to pay attention to Attention is usually paid to regions subject to prominent ENSO influences (Kim et al., 2016;Rivera & Arnould, 2020;Vashisht et al., 2021). aAt the global scale, the relationship 70 betweenattribution of ENSO teleconnection and GCM precipitation forecasts to ENSO teleconnection is yet to be illustratedconducted. For instance, possible contributions of ENSO to precipitation forecast skill were investigated for the west coast of North America (Pegion & Kumar, 2013;Chen & Kumar, 2020). This paper is devoted to attributing correlation skill of dynamical CFSv2 forecasts (e.g., Saha et al., 2010;Yuan et al., 2011;Jia et al., 2015;Becker et al., 2020;Zhao et al., 2020a) to statistical ENSO teleconnection at the global scale. A novel approach 75 based on the coefficient of determination (R 2 ) and from the perspective of set theory based on the coefficient of determination (R 2 ), which measures the ratio of the explained variance to the total variance, is devised to facilitate the attribution. With R 2 characterizing the ratio of explained variance and set operations illustrating the overlapping The R 2 , the significance test by grid cell of overlapping R 2 is tested byis conducted by bootstrapping in order to identify where correlation skill is attributable to ENSO teleconnection. The novelty of the approach is the implementation of classical set operations through simple linear 80 regression. As will be demonstrated through the case study of CFSv2 forecasts, three cases are effectively revealed: (1) significantly positive anomaly correlation attributable to positive ENSO teleconnection, (2) significantly positive anomaly correlation attributable to negative ENSO teleconnection and (3) significantly positive anomaly correlation not attributable to ENSO teleconnection.

Data description 85
GCM forecasts comprise a typical high-dimensional dataset (Kirtman et al., 2014;Saha et al., 2014;Chen & Kumar, 2016;Becker et al., 2020;Zhao et al., 2020b). For the CFSv2 forecasts investigated in this paper, there are five dimensions: (1) forecast start time s; (2) lead time l; (3) ensemble size n; (4) latitude y; and (5) longitude x. s represents the number of months since the benchmark time that is January 1982; l is the number of months ahead, which ranges from 0 to 9 month for the CFSv2 forecasts; n =1, …, 24, i.e., the total number of ensemble members is 24; y ranges from −90 to 90 while x is from 0 to 359, 90 with a horizontal resolution of 1.0º latitude by 1.0º longitude. The set of forecasts is denoted by , , , , in which f represents forecast values specified by the five dimensions and F is the dataset of forecasts.
There are three dimensions for the dataset of observed precipitation corresponding to forecasts (Xie et al., 2007;Infanti & Kirtman, 2015;Schneider et al., 2016). They are target time t, which is equal to the sum of start time s and lead time l to align observations with forecasts; latitude y; and longitude x. The set of observed precipitation corresponding to the forecasts is 95 denoted as , , The CPC global daily Unified Rain-gauge Database (CPC-URD), which has been widely used in the analysis of regional and global precipitation (Xie et al., 2010), is used as the referenced observed precipitation.
The correlation skill, which is in the form of the Pearson's correlation coefficient, is calculated so as to relate CFSv2 precipitation forecasts to CPC-URD observations: 100  (Yuan et al., 2011;Zhao et al., 2020a;Zhao et al., 2020b).
It is noted that forecasts/observations, which are monthly, are aggregated into seasonal. The aggregation is meant to facilitate 105 the analysis by season. The aAThe attention is paid to the latest forecasts. That is, seasonal precipitation forecasts generated at the beginning of one the season are investigated. For example, the December-January-February (DJF) forecasts are generated at the beginning of December. Similarly, seasonal forecasts for March-April-May (MAM), June-July-August (JJA) and September-October-November (SON)) are respectively produced at the starts of March, June and September.

Mathematical formulation
The approach to attributing correlation skill of GCM seasonal forecasts to ENSO teleconnection is built upon the coefficient of determination, i.e., R 2 (Koster et al., 2010). Mathematically, R 2 is equivalent to the squared value of the Pearson's correlation 120 There is a difference in the meaning of r in relating observed precipitation to forecasts and ENSO. As to forecasts, r tends to be positive, i.e., high (low) values of forecasts can be indicative of high (low) values of observations (Yuan et al., 2011;Zhao et al., 2020a;Zhao et al., 2020b (Mason & Goddard, 2001).
Both positive and negative correlations contribute to R 2 . Therefore, this paper proposes to use R 2 is respectively derived for forecasts and ENSO in relating them to as an indicator of the association of to associate observed precipitation with forecasts and ENSO respectively. Specifically, To obtain R 2 , simple linear regression models are set up to regress observed precipitation respectively against GCM forecasts 130 and Niño3.4 respectively: where β1 and β2 are the regression coefficients; ϵ1 and ϵ2 are the residuals; k represents the target year. Further, through bivariate linear regression, the variance explained by the union of forecasts and Niño3.4 is calculated: in which the union operator is introduced to represent the joint effect. If GCM forecasts were independent from Niño3.4, then R 2 (o~f∪Niño3.4) could conceptually be obtained by simply adding up R 2 (o~f) and R 2 (o~Niño3.4). On the other hand, if GCM 135 forecasts were dependent on Niño3.4, then there would be some overlaps for R 2 (o~f) and R 2 (o~Niño3.4). As a result, R 2 (o~f∪Niño3.4) would not be as large as the sum of R 2 (o~f) and R 2 (o~Niño3.4).
In accordance with the set theory, the intersection between R 2 (o~f) and R 2 (o~Niño3.4) is derived by subtracting R 2 (o~f∪Niño3.4) from the sum of R 2 (o~f) and R 2 (o~Niño3.4): in which the intersection operator is introduced to formulate the overlapping R 2 between forecasts and Niño3.4. Specifically, 140 the value of R 2 (o~f∩Niño3.4) quantifies the overlapping part of the explained variance of observed precipitation accounted for by GCM forecasts and Niño3.4.

Attribution of correlation skill
There are three steps to attribute correlation skill of GCM forecasts to ENSO teleconnection. As shown in Figure 1, the first step is the implementations of three linear regression models to derive R 2 (o~f), R 2 (o~Niño3.4) and R 2 (o~f∪Niño3.4) so as to 145 derive R 2 (o~f∩Niño3.4). In the absence of a theoretical distribution function for the overlapping R 2 , the significance is tested by bootstrapping (Efron, 1979). Specifically, GCM forecasts and Niño3.4 are randomly sampled with replacement for 1000 times under the null hypothesis that observed precipitation is independent from either GCM forecasts or Niño3.4. In this way, R 2 (o~f∩Niño3.4) is tested to examine whether the overlapping R 2 between f and Niño3.4 is significant. Secondly, the significance of correlation coefficient is calculated for GCM precipitation forecasts in relating them to global precipitation. Specifically, r(o, f) is obtained for each grid cell over global land. The two-tailed significance test is implemented 155 and r(o, f) is therefore identified to be significantly positive, neutral, or significantly negative. In this paper, the significance levels in the first and second steps are set to be 0.10.
The third step focuses on significantly positive r(o, f) that indicates informative forecasts (Becker et al., 2020;Zhao et al., 2020a;Zhao et al., 2020b). There are two criteria: (1) the significance of overlapping R 2 and (2) the sign of ENSO teleconnection. Overall, three cases are obtained: (1) significantly positive anomaly correlation attributable to positive ENSO 160 teleconnection, (2) significantly positive anomaly correlation attributable to negative ENSO teleconnection and (3) significantly positive anomaly correlation not attributable to ENSO teleconnection.

An illustrative example
An example based on synthetic data is devised to illustrate how the overlapping R 2 is influenced by the strength of association between two variables. Samples of x1, x2 and y are randomly drawn from a tri-variate normal distribution 165 where μ and Σ are the mean vector and covariance matrix, respectively: In the example, the correlations of y with x1, x2 are fixed to be 0.5 respectively. As a result, the focus is on r(x1, x2) that determines the intersection between x1 and x2. The value of r(x1, x2) is set to be 0.0, 0.1, 0.2, 0.3, 0.4 and 0.5. For each prespecified r(x1, x2), 1000 samples of x1, x2 and y are drawn to facilitate linear regression models to derive R 2 (y~x1), R 2 (y~x2), R 2 (y~x1∪x2) and R 2 (y~x1∩x2). For R 2 , the median and inter-quartile ranges are estimated through 1000 Monte Carlo 170 experiments.
That is, they They remain to be around 0.25, i.e., 0.5 2 , as r(x1, x2) increases from 0.0 to 0.5. That is to say,The indication is that the change in correlation between forecasts and Niño3.4 does not influence the amount of information that they respectively 185 provided respectively. By contrast, Figure 2c shows that R 2 (y~x1∪x2), which represents the ratio of variance explained by the union of x1 and x2, decreases with the increase of r(x1, x2). In the meantimeBy contrast, R 2 (y~x1∩x2) increases with r(x1, x2).
Figure 3 further shows the influence of r(x1, x2) by using the Venn diagram that illustrates the extent to which R 2 (y~x1) and R 2 (y~x2) intersect. and 3 show the influence of r(x1, x2) on overlapping R 2 . In the context of GCM forecast, the influence on the overlapping R 2 can represent the change in the overlapping information provided by forecasts and ENSO teleconnection. 190 According to Figures 2a and 2b, the medians of R 2 (y~x1) and R 2 (y~x2) approximate the squared value of pre-specified r(y, x1) and r(y, x2). They remain to be 0.25, i.e., 0.5 2 , as r(x1, x2) increases from 0.0 to 0.5. That is to say, the change in correlation between forecasts and Niño3.4 does not influence the amount of information they provided respectively. By contrast, Figure   2c shows that R 2 (y~x1∪x2), which represents the ratio of variance explained by the union of x1 and x2, decreases with the increase of r(x1, x2). In the meantime, R 2 (y~x1∩x2) increases with r(x1, x2). Figure 3 further shows the influence of r(x1, x2) by 195 using the Venn diagram that illustrates the extent to which R 2 (y~x1) and R 2 (y~x2) intersect. The area of the circle is proportional to the percentage of variance explained by the corresponding explanatory variable. The intersection is depicted represented byas the overlapping area between the two circles. As is illustratedFrom this figure, it can be seen that it is the correlation between x1 and x2 that leads to the decrease of R 2 (y~x1∪x2) and the increase of R 2 (y~x1∩x2). For global precipitation forecasting, the intersection reflects the overlapping information for GCM forecasts and ENSO teleconnection. Figure 3 implies that the 200 correlation between GCM forecasts and Niño3.4 would leads to a decrease of total information and an increase of overlapping information.This represents that the total information provided by forecasts and Niño3.4 decrease with their correlation.

Correlation skill and ENSO teleconnection in DJF
Global maps of correlation skill and ENSO teleconnection in DJF, which is the peak season of ENSO, are shown in the upper 205 part of Figure 4. In Figure 4a, correlation skill is observed to be largely positive, indicating that CFSv2 forecasts are skillful in general (Saha et al., 2010). In Figure 4b, ENSO teleconnection exhibits both positive and negative values. That is, observed precipitation around the world can be positively or negatively correlated with Niño3.4 (Mason & Goddard, 2001;Chen & Kumar, 2020;Vashisht et al., 2021). The two-tailed significance test is applied to anomaly correlation and ENSO teleconnection at each grid cell.  and 4f, which are respectively for CFSv2 forecasts and Niño3.4, respectively conform to Figures 4a and 4b. This outcome is due to that R 2 is mathematically equal to the squared value of correlation coefficient (Krause et al., 2005). The union in Figure  225 4g exhibits a higher value of R 2 than that in either Figure 4e or Figure 4f. The subtraction of the union from the sum facilitates the intersection. As illustrated in Figure 4h, deep blue grid cells are seen to distribute in southern North America, northern South America, East Africa, Southern Africa and Southeast Asia. Over these regions, both GCM forecasts and Niño3.4 index can explain a considerable part of the variance of observed precipitation (Figures 4e and 4f). More importantly, their explained variances intersect (Figure 4h). 230

Attribution at the global scale
The significance of the intersection for CFSv2 forecasts and Niño3.4 is tested by bootstrapping and then shown in Figure 5. In and ENSO teleconnection ought to be large enough to facilitate a significant intersection. Largely owing to overlapping R 2 , anomaly correlation is observed to increase with the increase of positive ENSO teleconnection and also with the decrease of negative ENSO teleconnection. For Figure 5b, there notably exist some outliers that suggest ENSO teleconnection could contribute to negative anomaly correlation. CFSv2 forecasts are generally wrong in these cases and the cautious grid cells are marked in black in Figure 5a. Further, the scatter plot in Figure 5c is for grid cells where overlapping R 2 is non-significant. For 240 a fair number of grid cells, anomaly correlation can rise above 0.50 but ENSO teleconnection remains nearly 0.00. The implication is that the corresponding anomaly correlation is not relevant to ENSO teleconnection.  generally located in Europe, North Asia, northwestern Africa and South Australia. Therein, skillful forecasts can relate to teleconnections other than ENSO, such as AO and NAO (Minami & Takaya, 2020).

Figure 6: (a) Spatial distribution of grid cells for the three cases attributing anomaly correlation of CFSv2 forecasts generated in December to ENSO teleconnection. Scatter plots of anomaly correlation against ENSO teleconnection for grid cells (b) with significant overlapping R 2 R2 and (c) without significant overlapping R 2 R2.
Figure 7 presents a sunburst diagram that quantifies the percentages of significantly positive anomaly correlation and its 265 attribution results. As shown by the central cycle, anomaly correlation is identified to be significantly positive, neutral, or significantly negative (Zhao et al., 2020a;Zhao et al., 2020b). The brown slice suggests that more than half of grid cells around the globe are of neutral anomaly correlation, indicating that GCM precipitation forecasts still have plenty of room for improvement (Kim et al., 2012;Jia et al., 2015;Delworth et al., 2020). The pink slice indicates that 39.4% of the grid cells exhibit significantly positive anomaly correlation. Three cases of attribution are performed for significantly positive anomaly 270 correlation. The results are shown by the extended slices, of which the color scheme is the same as that of Figure 6. It can be seen that significantly positive anomaly correlation is attributable to positive (negative) teleconnections for 10.8% (11.7%) of grid cells around the globe. Further, significantly positive anomaly correlation is not attributable to ENSO teleconnection for 16.9% of grid cells around the globe.

Attribution for selected grid cells 280
Four grid cells are selected from Figure 6a to showcase the attribution of CFSv2 forecasts to ENSO teleconnection. As shown in Figure   Grid cell A shown in the first row of Figure 8 presents the case of significantly positive anomaly correlation attributable to positive ENSO teleconnection. The coordinate of the grid cell is (38ºN, 77ºW). In southern North America, DJF precipitation is known to be modulated by the ENSO-induced Pacific North American (PNA) pattern (Jong et al., 2021). Specifically, in 295 southern North America, PNA tends to cause an enhanced DJF Pacific jet stream that extends further east than normal during El Niño events and there are nearly reversed patterns during La Niña events . Since the jet stream determines the paths of DJF storms, the PNA pattern enables ENSO to affect precipitation in the southern North America. It can be seen that forecasts exhibit a high correlation with Niño3.4. This result suggests that CFSv2 forecasts can reasonably represent the influence of ENSO. As a result, R 2 explained by forecasts and Niño3.4 largely overlap in grid cell A. 300 Grid cell B shown in the second row of Figure 8 is for the case of significantly positive anomaly correlation attributable to negative ENSO teleconnection. Its coordinate is (5ºN, 60ºW). In DJF, there is a negative ENSO teleconnection over northern South America; it is owing to that ENSO-related SSTs drive changes in the climatological Walker circulation that promotes anomalous descending (ascending) motion and contributes to negative (positive) precipitation anomalies in El Niño (La Niña) events (Kayano & Andreoli, 2006). The high correlation between forecasts and Niño3.4 highlights the effectiveness of CFSv2 305 in capturing the negative ENSO teleconnection. There is a considerable intersection between the variability explained by Grid cell C displayed in the third row of Figure 8 is for the case of significantly positive correlation not attributable to ENSO teleconnection. Its coordinate is (49ºN, 2ºE). The remote influence of ENSO on Europe has pathways through the North Atlantic or Arctic regions, including the tropospheric and stratospheric bridges (Butler et al., 2014). However, the amplitude of ENSO impacts is weak and generally not significant in the European region (Butler et al., 2014). Besides, DJF precipitation in Europe is known to be modulated by the NAO (Greuell et al., 2018), the Eurasian snow cover extent and the Quasi-biennial 320 Oscillation (Butler et al., 2014). It can be seen that while observed precipitation shows a neutral correlation with Niño3.4, CFSv2 forecasts explain a substantially fraction of observed precipitation variability. This result indicates the capability of CFSv2 in capturing teleconnection patterns other than ENSO.
Grid cell D shown in the last row of Figure 8 represents the case of neutral anomaly correlation. The coordinate is (40ºN, 116ºE). Over East Asia, precipitation is known to be influenced by the response of Rossby waves to ENSO (Yang et al., 2018). 325 Also, winter monsoon activities in East Asia are profoundly influenced by wind-SST-evaporation feedbacks over tropical central Pacific to northwestern Pacific (Kim & Kug, 2018). It can be observed that there is a moderate but not significant correlation between observed precipitation and Niño3.4. However, CFSv2 forecasts exhibit a neutral anomaly correlation, suggesting that the information of ENSO teleconnection is not well represented in the forecasts.

Extended analysis of the other seasons 330
The attribution analysis is further extended to the other seasons, i.e., MAM, JJA and SON. Global maps of the three cases of attribution are illustrated by season in Figure 9. The results of attribution analysis are observed to vary considerably across the four seasons (Figures 6, 7 and 9). It is generally owing to the facts that ENSO teleconnection patterns vary by season (Kim & Kug, 2018;Steptoe et al., 2018;Wang et al., 2019) and that GCMs also formulate not only ENSO teleconnection but also other non-ENSO teleconnection patterns other than ENSO (Saha et al., 2014;Jia et al., 2015;Delworth et al., 2020). The percentage 335 of significantly anomaly correlation is 27.0%, 24.0% and 34.6% respectively in MAM, JJA and SON. Overall, the percentage of significantly positive anomaly correlation in the three seasons is less than that in DJF and is 27.0%, 24.0% and 34.6% respectively in MAM, JJA and SON, which tend to be smaller than that in DJF. This result can be is due to consistent with the seasonal cycle of ENSO, i.e., that the SST forcing tends to be weaker induring JJA compared to that in DJF and therefore translates into weaker precipitation variability (Yang et al., 2018). 340 The results of Representative regions of the three cases of attribution are revealedshown by in the spatial plotting in Figure   9maps. Among themthe significantly positive anomaly correlation, 7.1%, 6.3% and 7.0% are The percentage of significantly positive anomaly correlation attributable to positive ENSO teleconnection is respectively 7.1%, 6.3% and 7.0% in MAM, JJA and SON. Representative rRepresentative regions for this case shown in the distribution maps are western United States in MAM, parts of South America in JJA and Middle East in SONsupport previous studies associating atmospheric predictability 345 with ENSO teleconnections over theare western United States in MAM (Pegion & Kumar, 2013), parts of South America in JJA (Cai et al., 2020) and Middle East in SON (Mariotti, 2007). 7.3%, 7.4% and 14.3% of grid cells are with significantly positive anomaly correlation attributable to negative ENSO teleconnection respectively in MAM, JJA and SON.
Representative regions are southeast Asia in MAM, JJA and SON and large parts of Australia in SON, which . The pattern in southeast Asia is consistentsuggest that with previous studies that precipitation therein is strongly correlated with ENSO in the 350 dry season (Jiang & Li, 2017). Australian precipitation in SON has also been found to be greatlysubstantially influenced by the extratropical teleconnection pathway of ENSO (Cai & Weller, 2013). Further, 12.6%, 10.3% and 13.3% of grid cells are with significantly positive anomaly correlation not attributable to ENSO teleconnection. This result calls for the investigation of other teleconnection patterns for GCM seasonal precipitation forecasts.

Discussion 360
The correlation skill between forecast and observed precipitation is one of the most important indicators of the usefulness of GCM forecasts (Yuan et al., 2011;Becker et al., 2014;Vano et al., 2014;Johnson et al., 2019;Zhao et al., 2020b). To facilitate forecast applications, correlation skill is conventionally calculated from data and then presented using spatial plotting (Zhao et al., 2020a(Zhao et al., , 2020b. Focusing on the relationship between correlation skill and ENSO teleconnection, this paperthe present paper highlightss that significantly positive anomaly correlation, which is always advantageous for practical applications of 365 GCM forecasts (Vano et al., 2014;Yuan et al., 2015;Peng et al., 2018), can be attributed to positive (negative) ENSO teleconnection or not to ENSO. In DJF, significantly positive anomaly correlation for CFSv2 forecasts is attributable to positive ENSO teleconnection in southern North America and East Africa and it is attributable to negative ENSO teleconnection in northern South America and southern Africa. Moreover, significantly positive anomaly correlation in Europe can be attributable to teleconnections other than ENSO. The different cases of attribution also exist for MAM, JJA and SON, but their 370 spatial extents vary considerably. These results conform to previous findings that regions exhibiting positive (negative) ENSO teleconnection change substantially by season (Mason & Goddard, 2001;Chen & Kumar, 2020;Vashisht et al., 2021) and that performances of GCM forecasts vary by season (Vano et al., 2014;Johnson et al., 2019;Zhao et al., 2020b).
Lagged relationships between seasonal precipitation and ENSO are useful for plays a useful role in long-lead seasonal forecastsing (Schepen et al., 2012;Peng et al., 2014;Steinschneider & Lall, 2016). To investigate the possible sources of 375 forecast skill provided by the lagged relationship, tThe attribution method is further performed tofor different lagged Niño3.4 indices at different time lags. Specifically, Tthe concurrent Niño3.4 in the analysis is replaced by observed Niño3.4 in September, October and November is applied to the regression analysis with the DJF forecasts generated in December and DJF observations and thereforeso as to investigate the overlapping resultsR 2 at for 3-, 2-and 1-month lags are obtained. Figures S1 to S4 in the supplementary generally show that the additional results tend to be similar; the similarities are generally owing 380 to the temporalThe spatial maps and corresponding sunburst diagrams ( Figure S1) are nearly equivalent to that obtained by concurrent relationship due to the strong persistency of Niño3.4 (Yang et al., 2018). Furthermore, We also perform aanother experiment is devised to to explore the effect of the lead time of for GCM forecasts at the lead time of 3, 2 and 1 months. That is, in the attribution analysis, Using the forecasts of DJF precipitation generated in September, October and November are used to replace forecasts generated in December. From Figures S5 to S8, it can be observed that the number of grid cells 385 exhibiting significantly positive anomaly correlation tends to reduce as the lead time increases. Forecasts at a longer lead time are generally less skilful (Jia et al., 2015;Johnson et al., 2019;Zhao et al., 2019). Meanwhile, the three cases of attribution remain in particular for southern North America, northern South America and southern Africa., the same attribution procedure is performed to the precipitation and Niño3.4 in DJF. The 3-to 1-month lead forecasts exhibit more limited areas of significantly positive anomaly correlation ( Figure S2) compared to that from 0-month lead. It is worth noted that the proportion 390 of grid cells attributable to ENSO teleconnection remains stable as the lead time increases from 1 month to 3 months. By contrast, the proportion almost doubles from a 1-month lead to a 0-month lead. It seems that the long-lead forecasts fail to capture some of the lagged relationships with ENSO shown in Figure S1, especially in East Asia, East Africa and Central Asia.
The capability to formulate climate teleconnections is an essential part in the evaluation of GCM forecasts (Molod et al., 2020;395 Jong et al., 2021). Adding to previous studies that investigated GCM forecasts for regions subject to prominent ENSO influences (Pegion & Kumar, 2013;Manzanas et al., 2014;Jha et al., 2016), this paper presents an investigation of ENSO teleconnection at the global scale. For grid cells around the world, the ratio of significantly positive anomaly correlation attributable to positive (negative) ENSO teleconnection is respectively 10.8% (11.7%) in DJF, 7.1% (7.3%) in MAM, 6.3% (7.4%) in JJA and 7.0% (14.3%) in SON. Furthermore, the ratio of significantly positive anomaly correlation not attributable 400 to ENSO teleconnection, which suggests that other climate teleconnections are at play in determining the skill of GCM forecasts, is respectively 16.9%, 12.6%, 10.3% and 13.3% in DJF, MAM, JJA and SON. Overall, the spatial plots and the

Conclusions 405
Climate teleconnections, in particular ENSO, have been extensively used in conventional statistical hydrological forecasting. This paper is devoted to investigating the relationship between statistical ENSO teleconnection and correlation skill of dynamical CFSv2 forecasts. A novel mathematical approach is built upon the coefficient of determination (R 2 ) that measures the ratio of explained variance to total variance. Specifically, taking advantage of simple linear regression, the ratios of variance explained by GCM forecasts, Niño3.4 and their union are respectively obtained; then, the overlapping R 2 for GCM forecasts 410 and Niño3.4 is derived based on the intersection operation. Based on the significance of overlapping R 2 and the sign of ENSO teleconnection, three cases of attribution are derived. They are significantly positive anomaly correlation attributable to positive ENSO teleconnection, attributable to negative ENSO teleconnection and not attributable to ENSO teleconnection.
The effectiveness of the developed approach is demonstrated through the case study of CFSv2 seasonal forecasts of global precipitation. Spatial plots of the attribution are illustrated by season. The results not only confirm the prominent contributions 415 of ENSO teleconnection to GCM forecasts, but also present spatial plots of regions where significantly positive anomaly correlation is subject to positive ENSO teleconnection, negative ENSO teleconnection and teleconnections other than ENSO.
Overall, the attribution approach proposed in this paper can serve as an effective tool to investigate the source of predictability for GCM seasonal forecasts of global precipitation. Spatial plots of the attribution are illustrated by season. The spatial patterns of forecast skill attributed to different types of ENSO teleconnections confirms previous studies associating seasonal 420 precipitation variability with ENSO and therefore revealshighlight the capability of CFSv2 in capturing the pathways of ENSO teleconnections. The attribution attributionmethod proposed in this paper of forecast skill can lay a basis for furtherture evaluations of other teleconnections and investigations of predictability sources evaluation of future prospects for GCM seasonal precipitation forecasts. The proposed approach in this paper can serve as an effective tool to achieve the attribution and can be further used to investigate other sources of skill of seasonal precipitation. The results not only confirm the prominent 425 contributions of ENSO teleconnection to GCM forecasts, but also present spatial plots of regions where significantly positive anomaly correlation is subject to positive ENSO teleconnection, negative ENSO teleconnection and teleconnections other than ENSO. Overall, the attribution approach proposed in this paper can serve as an effective tool to investigate the source of predictability for GCM seasonal forecasts of global precipitation.