Decline of Arctic sea ice: Evaluation and weighting of CMIP5 projections

Trends of Arctic September sea ice area (SSIA) are investigated through analysis of Coupled Model Intercomparison Project phase 5 (CMIP5) data. The large range across models is reduced by weighting them according to how they match nine observed parameters. Calibration of this refined SSIA projection to observations of different 5 year averages suggests that nearly ice‐free conditions, where ice area is less than 1 × 10⁶ km², will likely occur between 2039 and 2045, not accounting for internal variability. When adding internal variability, we demonstrate that ice‐free conditions could occur as early as 2032. The 2013 rebound in ice extent has little effect on these projections. We also identify that our refined projection displays a change in the variability of SSIA, indicating a possible change in regime.


Introduction
The decline of Arctic ice area has important consequences for global climate, such as reducing global albedo [Hudson, 2011], initiating the release of large quantities of carbon dioxide from thawing permafrost [Lawrence et al., 2008], and reducing the strength of the thermohaline circulation [Jahn and Holland, 2013]. It also appears to be impacting Northern Hemisphere weather conditions, being linked to the recent cold winters in Europe and northern Asia [e.g., Petoukhov and Semenov, 2010].
The September sea ice area (SSIA) in 2013 rebounded significantly from its 2012 area, which was the lowest in the satellite record, and possibly in the past 1450 years [Kinnard et al., 2011]. The recent string of record area lows (2002, 2006, 2007, 2012) highlights the increased rate of decline observed since the start of the century. Stroeve et al. [2011] identified an increased rate of decline post-1999, extended in Figure 1 to include the 2012 record low. Studies [Kay et al., 2011] have attributed approximately half of the observed SSIA trend to internal variability; however, external forcing is still considered to be the principal driver of this decline [Semenov et al., 2012]. Ice-albedo feedback, higher Arctic temperatures, and a thinning ice pack could all be helping to accelerate the decline [Stroeve et al., 2011]. In this paper we re-examine future Arctic sea ice trends using the CMIP5 projections and consider when the threshold of nearly ice-free conditions (hereinafter referred to as ice-free), defined here as ice area less than 1 million km², will be reached.
The CMIP5 models replicate the observed trend in SSIA more accurately than the last generation of climate models [Stroeve et al., 2012], but still show considerable range in their projections of when such conditions will be reached (Figure 2). The mean of these ensembles projects ice-free conditions to occur in 2045, but the spread is large: ±1 standard deviation either side of the mean gives a 1σ range of 54 years for the projection of ice-free conditions, between 2029 and 2083.
There are two existing studies that already use CMIP5 projections to provide a prediction of when ice-free conditions will occur. Wang and Overland [2012] (WO2012) and Massonnet et al. [2012] (M2012) propose the time intervals of 2030-2039 and 2041-2060, respectively, for the date when ice extent will fall below 1 × 10⁶ km², under the high external forcing "representative concentration pathway" (RCP8.5) [Moss et al., 2010]. There is also debate [Tietsche et al., 2011; Wadhams, 2012] as to whether a tipping point, resulting in a rapid and irreversible transition to a seasonally ice-free Arctic regardless of future emissions, has been reached.

Ensemble Weighting and Calibration
Similar to WO2012 and M2012, we aim to refine CMIP5 projections to limit the effect of model error and bias and so provide the most likely projection of when ice-free conditions will occur. In a break from these studies, however, we investigate future projections of SSIA, rather than ice extent. Ice extent is a measure of the areal sum of the satellite data cells that have an ice concentration greater than (typically) 15%, while ice area averages the ice concentration of individual cells (provided they are over a threshold limit, again usually 15%) to provide a measure of "actual" ice coverage. We suggest that, despite the potentially higher error of SSIA observations [Koldunov et al., 2010], from a global climate perspective, ice area is a more important measure as it better reflects the Arctic ice albedo and ice concentration.
Correspondence to: P. M. Forster, p.m.forster@leeds.ac.uk
We analyze 65 CMIP5 ensembles from 27 model groups (listed in supporting information, Table A1), which is a greater selection than used in either WO2012 or M2012. We employ a new approach, weighting the ensembles by their accuracy in replicating observations of nine key parameters (identified in Table 1 and discussed below), building on the five parameters proposed by M2012 of mean extent, mean volume, mean thin ice extent, trend, and seasonal cycle amplitude. These nine parameters include ice area and volume measures, as well as Northern Hemisphere temperature. Ensembles that replicate these parameters less accurately are given less weight in determining our refined projection. Such a model weighting has been used for stratospheric ozone projections (e.g., Waugh and Eyring [2008]).
First, we reflect the importance of the ensembles' ability to replicate the observed conditions, as stressed by both WO2012 and M2012, and define parameter 1 (P1) as the error of (uncalibrated) model ensembles in replicating the observed 1979-2012 SSIA average (data from Fetterer et al. [2012]). Error in our study refers to the sum of absolute differences between observational and model data, summed over each year of the observed time period considered. In the case of P1, this is the 1979-2012 period (see Table 1). Historical ensembles run up to 2005, after which RCP8.5 simulations are used (the differences between alternative simulations are negligible in the 2005-2012 time interval [Massonnet et al., 2012]). Our study therefore extends the analysis of WO2012 and M2012, who limit their analysis from 1979 to 2005 and 2010, respectively. This new information allows us to assess the ability of the models to replicate the 2012 low and 2013 rebound. This is important as the 2012 low was not entirely due to anomalous atmospheric conditions [Zhang et al., 2013].
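The error measure defined above (the sum of absolute model-minus-observation differences over each observed year) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `absolute_error`, the dictionary layout, and the toy values are assumptions made for the example.

```python
def absolute_error(model, obs):
    """Sum of |model - obs| over every year present in the observations.

    Both arguments map year -> value (for SSIA, in 10^6 km^2).
    """
    missing = [year for year in obs if year not in model]
    if missing:
        raise ValueError(f"model series lacks years: {missing}")
    return sum(abs(model[year] - obs[year]) for year in obs)


# Toy numbers for illustration only; these are not observational data.
obs = {2010: 3.0, 2011: 2.9, 2012: 2.2}
model = {2009: 3.6, 2010: 3.4, 2011: 3.0, 2012: 2.9}
error = absolute_error(model, obs)  # 0.4 + 0.1 + 0.7
```

Model years outside the observed period (2009 here) are ignored, which matches the statement that the sum runs over the observed time period only.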
Using this error of a given ensemble member (comparing the ensemble to a specified time period of observations from Table 1), the SSIA of each ensemble member is scaled to effectively make this error zero. This forms a calibrated model ensemble member. Similar to WO2012, these calibrated ensemble members are used to create additional ranking parameters, designed to capture internal variability. Our second parameter (P2) is the error of an ensemble, once calibrated to the 2012 minimum, in replicating the average 1979-2012 SSIA value. In the same vein, our third parameter (P3) is the ensemble error, once calibrated to the 1999-2012 observations, in replicating the 2012 record low; it is used to select ensemble members that by chance, as a result of internal variability, begin their projection phase from a low 2012 value somewhat like that observed. Additionally, an important feature of the declining ice pack, not considered in the two aforementioned CMIP5 studies, is the increasing variability of ice minima [Stroeve et al., 2011]. We therefore propose that (P4) the error in replicating SSIA 2 year trend variability (i.e., the magnitude of SSIA change from year to year, averaged over the observed record, 1979-2012) should be a key parameter when assessing the accuracy of ensembles. As discussed later, this parameter also plays an important role in determining whether a tipping point has been reached.
The parameters so far suggested are measures of the ability of the ensembles to accurately replicate the trends and variability in SSIA. However, it is also critical to consider the ensembles' ability to replicate the observations of changes that are driving SSIA trends. Therefore, we repeat parameters 1 to 4 but with ice volume measures, comparing modeled data with data calculated using the Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS, originally by Zhang and Rothrock [2003]) (P5-P8). PIOMAS data, despite also [...] [Schweiger et al., 2011]. The baseline ice thickness is strongly related to SSIA, and the underestimation of SSIA trends from previous generations of climate models has been attributed to their failure to sufficiently replicate ice thickness observations [Laxon et al., 2003; Mahlstein and Knutti, 2012]. It is therefore essential to include parameters to evaluate ensemble performance in replicating the changing September sea ice volume (SSIV).
Additionally, we include (P9) the error of the ensembles in replicating the annual average Northern Hemispheric temperature over the period of 1950-2012 (data from Jones et al. [2012]) as our final parameter, due to the importance of increasing temperatures in increasing the melt season length and reducing the ability of SSIA to recover [Markus et al., 2009; Stroeve et al., 2011]. We do not use Arctic temperature as a parameter due to the lack of a reliable observational record.
In a departure from WO2012 and M2012, we do not consider the seasonal cycle to be a useful parameter. We believe that the distortion of the seasonal ice coverage cycle resulting from the increasingly dominant first-year ice fraction [Maslanik et al., 2007] means that the seasonal cycle is of limited importance when considering the ensembles' ability to replicate and project future trends in SSIA. The coverage of "thin-ice," proposed by M2012, is also not considered as a separate parameter, as it is to some extent accounted for in our sea ice variability parameter (P4).
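The calibration step and the variability measure behind P4 can be illustrated with a short sketch. The text does not specify how the scaling is applied, so the code below assumes a constant additive offset that matches the model mean to the observed mean over the calibration window; the function names and toy values are hypothetical.

```python
def calibrate_offset(series, obs, years):
    """Shift a model series so its mean over `years` equals the observed mean.

    Assumption: a constant additive offset. The source only states that the
    ensemble is scaled so that the calibration error becomes zero.
    """
    years = list(years)
    model_mean = sum(series[y] for y in years) / len(years)
    obs_mean = sum(obs[y] for y in years) / len(years)
    shift = obs_mean - model_mean
    return {year: value + shift for year, value in series.items()}


def mean_abs_annual_change(series):
    """Average magnitude of the year-to-year change (the quantity behind P4).

    Assumes the series holds one value per consecutive year.
    """
    years = sorted(series)
    steps = [abs(series[b] - series[a]) for a, b in zip(years, years[1:])]
    return sum(steps) / len(steps)


# Toy numbers in 10^6 km^2, for illustration only.
model = {2010: 5.0, 2011: 4.5, 2012: 4.6, 2013: 4.0}
obs = {2011: 3.5, 2012: 3.0}
calibrated = calibrate_offset(model, obs, [2011, 2012])
variability = mean_abs_annual_change(calibrated)
```

Under this additive assumption the year-to-year variability of a member is unchanged by calibration, and calibrated values are allowed to become negative rather than being clipped at zero.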
The performance of the 65 ensemble members is assessed against these nine individual parameters (IPs), and the members are subsequently ranked in order of performance, according to which they are allocated a weighting based on 1/ranking². Weights are calculated as $w_i = (1/r_i^2) / \sum_j (1/r_j^2)$, where $r_i$ is the ranking of ensemble $i$ and the denominator is summed over all ensembles. Therefore, the sum of the weights equals one. This weighting has been chosen so as to predominantly represent only the most accurate ensembles, with 90% of the weighting attributed to the top 5 ranked ensembles. In addition to assessing the ensembles' performance against individual parameters, their performance against all parameters (AP) is evaluated. The AP weights are generated by developing a new rank from the summed ranks of the nine IPs and then using this with the same 1/ranking² weighting as before. The IP ranking, rather than the direct ensemble error, determines the weighting so that for the AP-weighting, rankings for different parameters can be combined. Graphical depictions of the weighting functions used are provided in Figure 3.
This study, similar to WO2012, uses model calibration to project future trends from observational data by scaling the model to the observational data over the time intervals given in Table 1. We assume that the AP-weighting represents the most likely SSIA projection, and the deviation in IP-weighted projections represents the uncertainty in our model calibration methodology, considering any calibration method equally justifiable. We build on previous work by investigating multiple calibration periods and performing the calibration with a greater number of parameters. If models perfectly replicated observed trends, the years to which they were calibrated would make no difference to their projections, provided that the averaging periods are long enough not to be affected by internal variability.
Therefore, by calibrating our IP- and AP-weighted projections to both the 2003-2007 and 2008-2012 observed average values, a range of projections can be created that helps account for model error, although these time periods could still be too short to account fully for internal variability. Nevertheless, the two AP-weighted projections produced from these calibrations provide a range of dates when ice-free conditions will be reached, defined as our refined estimate. The broader range of IP-weighted projections provides a measure of the uncertainty associated with ensemble calibration. We additionally calibrate ensemble projections using the 2012 record low. We recognize that calibration of model data to a single year is potentially unreliable. However, as the SSIA in 2012 was not solely a result of internal variability [Zhang et al., 2013], it is not necessarily biased and provides a useful "extreme" case scenario.
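The rank-based weighting described in this section, including the all-parameter (AP) rank built from summed individual-parameter ranks, can be sketched as below. The function names, the ensemble labels, and the tie-breaking rule are assumptions for illustration; the text does not say how ties in the summed ranks are resolved.

```python
def rank_weights(ranks):
    """Normalised 1/rank^2 weights; ranks run from 1 (best) to N (worst)."""
    raw = {name: 1.0 / rank**2 for name, rank in ranks.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def ranks_from_scores(scores):
    """Rank ensembles by ascending score (smaller error means better rank).

    Ties are broken by name order, an arbitrary choice made for this sketch.
    """
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def all_parameter_ranks(ranks_per_parameter):
    """Re-rank ensembles by the sum of their individual-parameter ranks."""
    names = list(next(iter(ranks_per_parameter.values())))
    summed = {n: sum(r[n] for r in ranks_per_parameter.values()) for n in names}
    return ranks_from_scores(summed)


# 65 hypothetical ensembles ranked 1..65, mirroring the ensemble count used here.
weights = rank_weights({f"ens{i:02d}": i for i in range(1, 66)})
top5_share = sum(sorted(weights.values(), reverse=True)[:5])

# Two hypothetical parameters combined into an AP rank.
ap_ranks = all_parameter_ranks({
    "P1": {"a": 1, "b": 2, "c": 3},
    "P2": {"a": 2, "b": 3, "c": 1},
})
```

With 65 members, `top5_share` comes out close to 0.9, consistent with the statement that about 90% of the weighting goes to the five top-ranked ensembles.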

Ice-Free Projection
Initially, we use this weighting methodology to refine projections of SSIA, assuming no influence of internal variability because such effects are averaged out in the calibrated (AP-weighted) ensemble average. Under this assumption, our refined estimate suggests that ice-free conditions will occur between 2039 and 2047, with the extreme 2012-calibrated case suggesting this could occur by 2035. The broader IP range spans from 2020 to 2064 (Figure 4 and Table 1), suggesting that the AP-weighted average does not fully capture uncertainties.
Similar to Kay et al. [2011], we find that internal variability appears to account for approximately 50% (47.5%) of the 1979-2012 trend, as the 1979-2012 observed trend of −0.100 × 10⁶ km²/yr (which retains variability) is about 50% larger than the AP-weighted trend, which to a large extent averages out the variability amongst the 65 ensemble members.
It is clear that internal variability will have an important role in determining when ice-free conditions will be reached, as it affects the distribution of Arctic sea ice volume between thickness and area. We therefore propose a new methodology, designed to take into account the potential effects of internal variability that were not fully considered by either WO2012 or M2012. We reintroduce internal variability into the weighted trends and use SSIV, as we found that it better represents variability at small ice areas compared to SSIA because it allows for both area and thickness changes. With SSIV projections, we consider when this measure will fall below a threshold of 1.735 × 10³ km³. Using CMIP5 data, this threshold was identified as being the maximum ice volume when ice area fell below 1 × 10⁶ km² (and becomes ice-free). We suggest that the Arctic is unlikely to become ice-free while ice volume is above this threshold because it does not do so in any of the 65 ensemble simulations. Below this threshold, we suggest ice-free conditions become a possibility depending on the role of internal variability, because individual ensembles start to project ice-free conditions once SSIV values fall below 1.735 × 10³ km³. Therefore, this threshold enables us to identify where ice-free conditions have the potential to occur. Assuming the AP-weighting removes model error and internal variability, this methodology enables the reintroduction of the role of internal variability and an assessment of its effects on a "reliable" SSIA projection.
We use the same weighting as before, as the parameters include ice volume components, and calibrate our projections to PIOMAS ice volume data, which we consider to be the best observational baseline. Again, we calibrate to both 2003-2007 and 2008-2012 average values to create our refined estimate, as well as the 2012 value to demonstrate the extreme case scenario.
When the role of internal variability is considered, our refined SSIV projection suggests that ice volume will fall to a level where there is potential for ice-free conditions to occur between 2032 and 2046, with the extreme case suggesting these conditions could occur by 2021 (Figure 5 and Table 2). The IP-projection range spans from 2013 to 2100+, which is larger than the IP-projection range for SSIA. The upper end of this large range (2100+) is likely unphysical because the recalibration process offsets some ensembles with zero ice volume. However, this only affects one IP-projection, and if this is ignored the IP-projection ranges, and therefore degrees of uncertainty, for our SSIV and SSIA refined estimates can be considered similar.
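Reading an ice-free date off a weighted projection amounts to forming the weighted ensemble mean for each year and finding the first year it drops below the 1 × 10⁶ km² threshold. The sketch below assumes calibrated member series stored as year-to-value dictionaries; the function names and numbers are illustrative only.

```python
def weighted_projection(members, weights):
    """Weighted mean across ensemble members for every year they all share."""
    shared = set.intersection(*(set(series) for series in members.values()))
    return {
        year: sum(weights[name] * members[name][year] for name in members)
        for year in sorted(shared)
    }


def first_year_below(series, threshold=1.0):
    """First year in which the series is below the threshold, else None."""
    for year in sorted(series):
        if series[year] < threshold:
            return year
    return None


# Toy calibrated SSIA series in 10^6 km^2 with hypothetical weights.
members = {
    "a": {2040: 1.4, 2041: 1.1, 2042: 0.8},
    "b": {2040: 1.6, 2041: 1.3, 2042: 1.2},
}
weights = {"a": 0.8, "b": 0.2}
projection = weighted_projection(members, weights)
ice_free_year = first_year_below(projection)
```

In this toy case member "b" alone never crosses the threshold, yet the weighted mean does in 2042 because member "a" carries most of the weight.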
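The way the volume threshold is identified (the largest ice volume found in any ensemble-year whose area is already below the ice-free limit) can be sketched as follows. The data layout, function name, and values are assumptions for illustration; areas are in 10⁶ km² and volumes in 10³ km³.

```python
def volume_threshold(area_by_member, volume_by_member, area_limit=1.0):
    """Largest volume in any member-year whose area is below `area_limit`.

    Returns None if no member-year is ice-free.
    """
    candidates = [
        volume_by_member[name][year]
        for name, areas in area_by_member.items()
        for year, area in areas.items()
        if area < area_limit
    ]
    return max(candidates) if candidates else None


# Toy September values for two hypothetical ensemble members.
area = {
    "a": {2050: 1.2, 2051: 0.9},
    "b": {2050: 0.7, 2051: 0.5},
}
volume = {
    "a": {2050: 2.0, 2051: 1.5},
    "b": {2050: 1.7, 2051: 1.0},
}
threshold = volume_threshold(area, volume)
```

By construction, no member in the sample is ice-free while its volume exceeds the returned value, which is the property the text relies on when treating the threshold as the point where ice-free conditions become possible.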

Variability in SSIA
We examine short-term variability employing a similar methodology to Kay et al. [2011]. We look specifically at the interannual trend vector (ΔSSIA/year) and standard deviation of 2, 5, 10, and 20 year trends in three 50 year time periods: a control period of 1890-1939, 1963-2012, and 2013-2062. We chose to investigate the change in terms of area rather than percentage because percentage change becomes an unrepresentative measure as the ice area tends to 0. Any negative recalibrated values are not zeroed, as they remain physically relevant in the model ensemble. Therefore, this methodology is able to provide a fully representative analysis for the time period of 2013-2062, even if SSIA values fall below 0 km². Relative to the control period of 1890-1939, an increase in variability for all trend lengths was observed in the recent period of 1963-2012 (Figure 6 and Table 3). Following this, however, there is projected to be a decrease in variability for all trend lengths, most significantly for the longer trend lengths.
The fraction of positive 2-20 year trends reduced, but not drastically, in the 1963-2012 period relative to the control period. The projection for 2013-2062, however, is for significant reductions in the occurrence of 5-20 year recovery periods. Significantly, the fraction of 5 year positive trends reduces to 4%, while 10 and 20 year positive trends are projected not to occur at all in this time period. This suggests that the influence of internal forcing (e.g., North Atlantic Oscillation) loses the ability to create positive longer-term trends. However, 22% of 2 year trends are projected to be positive, suggesting that internal forcing will still affect short-term trends. Generally, the average magnitude of SSIA decline is projected to increase for all trend lengths in 2013-2062, relative to 1963-2012 (following the same pattern when comparing 1963-2012 to the control period), suggesting that SSIA is on an increasingly negative trajectory.
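The trend statistics discussed here (mean trend, standard deviation of trends, and fraction of positive trends for a given trend length) can be sketched as below. The text does not state how an n-year trend is computed, so this example assumes a least-squares slope over every run of n consecutive annual values; for n = 2 this reduces to the year-to-year difference. Function names and values are hypothetical.

```python
from statistics import pstdev


def window_slopes(values, length):
    """Least-squares slope of every run of `length` consecutive annual values."""
    if length < 2 or length > len(values):
        raise ValueError("length must be between 2 and len(values)")
    xs = range(length)
    x_mean = (length - 1) / 2
    denom = sum((x - x_mean) ** 2 for x in xs)
    slopes = []
    for start in range(len(values) - length + 1):
        window = values[start:start + length]
        y_mean = sum(window) / length
        numer = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, window))
        slopes.append(numer / denom)
    return slopes


def trend_statistics(values, length):
    """Mean, population standard deviation, and positive fraction of trends."""
    slopes = window_slopes(values, length)
    return {
        "mean": sum(slopes) / len(slopes),
        "std": pstdev(slopes),
        "fraction_positive": sum(s > 0 for s in slopes) / len(slopes),
    }


# Toy annual SSIA values in 10^6 km^2, for illustration only.
ssia = [5.0, 4.8, 4.9, 4.5, 4.6, 4.0]
two_year = trend_statistics(ssia, 2)
```

Applying `trend_statistics` separately to each 50 year period and each trend length (2, 5, 10, and 20 years) would give the kind of comparison summarized in Table 3.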

Discussion and Conclusions
This study finds that ice-free conditions are likely to occur between 2039 and 2047, assuming no influence of internal variability. We find that the influence of internal variability has the potential to bring our refined ice-free estimate forward to between 2032 and 2046, or even as early as 2021 in the extreme case. An analysis of sea ice ensembles in RCP4.5 found an almost identical range to the RCP8.5 ensembles used here, with only 10% less model-mean sea ice extent at 2040. Therefore, these predictions are considered to be valid for all likely emission scenarios, due to the negligible differences out to ~2040 between the climate responses of the different RCP emission scenarios [Forster et al., 2013].
The stratification of model performance is low under the AP-weighting, i.e., there is not a large difference in model performance between ensembles when considering all nine parameters (Figure 3). This supports the notion that combining climate models and ensembles often provides the best depiction of climate processes or trends due to the different distributions of model bias resulting from parameterization [Knutti et al., 2010]. Excluding some of the nine parameters or adopting a different weighting has little effect on our ice-free projections. For example, using a 1/ranking, rather than 1/ranking², weighting changes our overall projection range from 2021-2047 to 2021-2048, and excluding the Northern Hemisphere temperature parameter has little effect (see supporting information). The most important role of weighting is to exclude poorly performing models.
Our extreme case scenario could be biased by internal variability, and there is a possibility of a rebound effect [Sedláček et al., 2012]. Such a rebound was witnessed in 2013. Nevertheless, it shows that ice-free conditions could occur as early as 2021. Our overall projection range (2021-2047) can also be considered to largely agree with the findings of WO2012, who predicted ice extent to fall below 1 × 10⁶ km² in the 2030s, and suggests that M2012 have perhaps underestimated the future decline of Arctic ice coverage. Neither of these studies takes into account the role of internal variability in its projections, and therefore their conclusions perhaps do not adequately convey how soon ice-free conditions may occur. Our results also align with the conclusions of Overland and Wang [2013], who use expert judgment to suggest that ice-free conditions are very likely to occur before 2050.
In Figures 4 and 5, we depict the SSIA and SSIV values from 1 September 2013, the most up-to-date values available when this article went to publication. This SSIA value is from Cryosphere Today (http://arctic.atmos.uiuc.edu/cryosphere/) and the SSIV value is from PIOMAS. Despite there being an increase in both SSIA and SSIV relative to 2012, the 2013 SSIV minimum value importantly will still be within our projection range (see Figure 5). The 2012 and 2013 SSIA values encompass the range of our refined SSIA projection, which helps support the refined estimate as a fair projection, excluding variability. Tietsche et al. [2011], through artificially perturbing climate models, suggest that September ice coverage can rebound even once completely removed. However, the pattern of variability change projected by the refined CMIP5 data, with an increase in the 1963-2012 period, followed by a projected decline in the period of 2013-2062, is indicative of a regime change [Scheffer et al., 2009]. This could be a result of the inability of the system to form multiyear ice once ice-free conditions are reached [Wadhams, 2012]. Within the period of 2013-2062 in our refined CMIP5 projection, 0% of 10 and 20 year trends (and only 4% of 5 year trends) are projected to be positive, suggesting that a sustained decline in SSIA will occur in the coming decades until the system reaches ice-free conditions. Interestingly, the previous generation of climate models (CMIP3) did not find this and projected a continued increase in variability for 2-10 year trends in the coming decades.
Table 3. SSIA Trend Statistics From AP-Weighted CMIP5 Model Data for the Periods of 1890-1939, 1963-2012, and 2013-2062
It is important to recognize that this study relies on the accuracy of model projections, and while a certain amount of confidence can be gained from the fact that the analysis is based on weighting model data, the inherent model biases and error will affect the reliability of our findings, especially as model error is often correlated amongst model groups [Knutti et al., 2010]. Our weighting method mainly serves to reduce the influence of the worst performing models. However, the remaining models would still be expected to have many deficiencies in their sea ice, due to their poor spatial resolution and their poor representation of Arctic clouds and surface albedo [Stroeve et al., 2012]. The large range of our weighted projections, particularly IP-weighted projections, confirms that there are still large inherent uncertainties associated with the CMIP5 ice coverage projections and our understanding.