An ensemble approach to quantify global mean sea-level rise over the 20th century from tide gauge reconstructions

We present an ensemble approach to quantify historical global mean sea-level (GMSL) rise based on tide gauge reconstructions. This approach combines the maximum internal uncertainty across the ensemble with an estimate of structural uncertainty to provide a conservative estimate of the total uncertainty. Comparisons of GMSL rise over the 20th century based on deltas and linear trends (and their respective uncertainties) are consistent with past Intergovernmental Panel on Climate Change assessments and show good agreement with satellite altimeter timeseries. Sensitivity tests show that our estimates of GMSL rise are robust to the choice of reference period and central estimate timeseries. The methods proposed in this study are generic and could be easily applied to other global or regional climate change indicators.


Introduction
Observational timeseries of historical global mean sea-level (GMSL) change are essential to our understanding of climate variability and change. Their uses include: assessments of the current state of our climate (e.g. Thompson et al 2020); monitoring and advancing knowledge of sea-level variability and change (Ponte et al 2019a, Hamlington et al 2020; evaluation of climate and Earth system models (e.g. Slangen et al 2017); provision of sea level services such as hindcasts and real-time forecasts (Ponte et al, 2019b); and to inform assessments of future change and their socio-economic and environmental impacts (Church et al 2013a, Oppenheimer et al 2019. All of the above applications require rigorous evaluation of the observational uncertainties.
Reconstruction of GMSL change from the available coastal tide gauge observations represents a substantial scientific challenge with a number of processing steps, each of which can influence the resultant timeseries (see section 2). Differences in methodological approaches across the available products leads to a diversity of GMSL reconstructions in terms of the underlying trends and interannual-todecadal variability (e.g. Oppenheimer et al 2019). The importance of this 'structural uncertainty' (Thorne et al 2005) has led to several studies adopting an ensemble approach to quantify uncertainty in essential climate indicators related to sea-level change (e.g. WCRP 2018, von Schuckmann et al 2020. In this paper we present an ensemble approach to estimate observed GMSL rise, based on five tidegauge reconstructions as an example, or 'case study' . While some previous studies have made use of ensemble-based estimates of observed change (e.g. WCRP 2018, Oppenheimer et al 2019, von Schuckmann et al 2020, we take a more holistic approach to the estimation of total uncertainty. This is achieved by combining an ensemble-based estimate of internal uncertainty (i.e. the uncertainty in GMSL associated with each reconstruction, sometimes referred to as 'parametric' uncertainty) with an ensemble estimate of the structural uncertainty, which is informed by the ensemble spread (Thorne et al 2005). We find that both sources of uncertainty are substantial and therefore omission of either will tend to underestimate the total uncertainty.
In addition, we introduce elements of expert judgement and steps to avoid circularity in modelobservation comparisons that use our ensemble timeseries. Based on our ensemble, we evaluate GMSL rise for a number of different periods and compare these to individual reconstructions and past Intergovernmental Panel on Climate Change (IPCC) assessments. This comparison uses both delta (i.e. the total change over a given period) and linear-trend calculations, and we discuss the suitability of these change metrics for different applications. The methods presented here are generic, and could prove useful in the assessment of other global or regional climate change indicators (e.g. Trewin et al 2021).
The outline of the paper is as follows. In section 2 we present the datasets used in our ensemble estimate of historical GMSL change, including a summary of the different methods used by the dataset originators. In section 3 we describe the construction of our ensemble GMSL estimate and the delta and trend calculations presented in this paper. The main results are shown in section 4, including timeseries of GMSL change with uncertainties and comparisons of the estimated sea-level change for several example periods. Finally, in section 5 we present a summary of our main findings.

Data
Our 20th century ensemble analysis is based on a selection of available GMSL estimates, spanning different methodological approaches and including uncertainty estimates (table 1). Where necessary, units are converted to mm and annual mean timeseries are generated from monthly mean data, weighting by the number of days in each calendar month. The long tide gauge records needed to perform reconstructions over the 20th century have a highly heterogeneous temporal-spatial distribution, with greatest data availability in the North Atlantic sector (e.g. Church and White 2011;hereafter CW2011). In addition, tide gauge records are of varying quality and often subject to additional non-climatic signals that can, to some extent, be corrected for in order to better isolate the underlying GMSL change. Therefore, GMSL reconstructions are subject to diversity in the corrections applied to tide gauge records as well as the quality and criteria for selection of geographic locations to be included or excluded from the analyses (e.g. Hamlington and Thompson 2015, Hay et al 2017, such as excluding records in tectonically active areas and/or where sediment compaction is important. Tide gauge records can be corrected for the effects of atmospheric pressure loading known as inverted barometer (IB, e.g. Piecuch and Ponte 2015), however, a relationship between air pressure and sea level is not straightforward at decadal and longer timescales and thus no adjustment might be preferable to a partial or a potentially poor adjustment (e.g. Miller and Douglas 2006). Geodynamical models are used to correct for ongoing glacial isostatic adjustment (GIA, e.g. Peltier et al 2004). These models, however, do not consider other potential sources of land movement (e.g. sediment compaction, terrestrial water storage and/or cryosphere loadings). First-difference of annual values can be used to account for uncertain datum and allow a greater number of stations to be considered (e.g. Church and White 2011). For other potential tide gauge corrections and more specific details, we refer the reader to WCRP Global Sea Level Budget Group (WCRP 2018) and references therein. A summary of the corrections applied to each dataset used in this study is listed in table 1.
Once the tide gauge records have been selected and any corrections applied there are fundamentally two approaches to determining GMSL (e.g. Church and White 2011). The first, and simplest approach, is to apply spatial averaging to the tide gauge records, to arrive at an estimate of GMSL change. Several studies have adopted a 'virtual station' (VS) approach, which averages neighbouring station sea-level changes in several regions before averaging the regions to get an estimate of GMSL change (e.g. Frederiske et al 2020; hereafter FR2020). The VS method avoids overweighting of densely-sampled regions in the global average. FR2020 combined this approach with sealevel budget analysis of individual ocean basins using a large Monte Carlo simulation to sample across the estimated uncertainties.
The second type of approach uses estimates of the large-scale patterns of sea-level variability and change to interpolate between the tide gauge locations and thus estimate GMSL change. CW2011 applied the reduced space optimal interpolation (RSOI) technique (Kaplan et al 2000, Church et al 2004, using spatial patterns of sea-level variability informed by empirical orthogonal functions applied to satellite altimeter data. A strength of this approach is that it relies purely on observational data to inform the estimate of GMSL change. However, it assumes that the sea-level variability captured in the satellite altimeter record since 1993 is representative of the entire period, i.e. it assumes statistical stationarity. Calafat et al (2014) have evaluated the ability of RSOI methods to represent trends and variability of GMSL using an analytical solution and synthetic tide gauge observations based on an ocean reanalysis. The authors caution that the approach taken in CW2011 will Frederikse et al (2020,2018) a Indicates 'hybrid' products that include information from CMIP climate models as part of the estimate of regional sea-level changes.
tend to alias regional coastal variability at decadal timescales and over-estimate interannual variations in GMSL. However, they found that the approach of CW2011 captured the long-term trends in GMSL reasonably well, which is the focus of the present study. Hay et al (2015; hereafter HA2015) used a Kalman Smoother (KS) approach to combine the tide gauge data with temporal-spatial patterns of sea-level change associated with: (a) ocean processes from six CMIP5 climate model simulations (Taylor et al 2012); and (b) 'fingerprints' of sea-level change associated with GIA and mass loss from Antarctica, Greenland and mountain glacier regions. A strength of the Hay et al approach is that it allows a more comprehensive treatment of the drivers of observed sea-level change and accommodates non-stationarity of trends and incomplete data records. However, an important caveat is that in the absence of observations, the KS relies on the model dynamics. Based on these previous works, Dangendorf et al (2019;hereafter DA2019) sought to unify the RSOI and KS approaches in their estimate of GMSL change in an attempt to better represent both the long-term signals (KS) and sea-level variability (RSOI). A summary of dataset methods is provided in table 1, along with the main references, where further details can be found.

Methods
In this section we present our approach to constructing an ensemble estimate of historical GMSL change that includes a conservative approach to estimating the overall uncertainty.
Before describing the details of the methodology, there is an additional aspect to consider when, e.g. using observations in model evaluation (e.g. Slangen et al 2017) or detection and attribution studies (e.g. Marzeion et al 2014, Slangen et al 2016; that is the issue of 'circularity' . Circularity, or circular reasoning, can arise in this context when information from models is used to influence observational analyses of past climate change. A common example is the use of spatially complete data from climate model simulations to establish temporal-spatial relationships that can be used to inform interpolation of observational data voids (see section 2). Such approaches have been applied to tide gauge reconstructions of GMSL change (e.g. Hay et al 2015, Dangendorf et al 2019 and also estimates of past ocean temperature and heat content change (e.g. Smith andMurphy 2007, Cheng et al 2017). These 'hybrid' products that convolve observations and climate model simulations may be compromised for use in the evaluation of climate model simulations and attribution studies, because they are no longer independent from model simulations (e.g. Torkaska et al 2019).
Care must be taken not to introduce circularity in climate model evaluation or detection and attribution studies. As far as we are aware, none of the tide gauge reconstructions used here make direct use of climate model information in determining their estimate of GMSL change. Therefore, we do not eliminate the hybrid products from our ensemble. The framework we introduce in this study is generic and suitable for other variables and applications. However, the methodological details should consider the application in question.
For example, if we were to consider regional sealevel changes as part of this study, we would seek to maximize the information content of the reconstructions in our ensemble, while avoiding circularity with climate model simulations. This could be achieved by omitting hybrid estimates from the ensemble that is used to determine the central estimate timeseries. However, hybrid estimates would be retained for the purposes of informing the ensemble estimate of internal and structural uncertainty (section 3.1), following the terminology of Thorne et al (2005).

An ensemble estimate of GMSL change
There are four steps in the construction of the ensemble estimate of GMSL change: (a) construct the central estimate timeseries; (b) construct the ensemble timeseries of internal uncertainty; (c) construct the ensemble timeseries of structural uncertainty; (d) combine the timeseries. Each of these steps is described below in sequence.
The choice of central estimate is made based on the application. For this study, the focus is on determining long-term changes in GMSL. Since Calafat et al (2014) found the RSOI approach for CW2011 to be reasonable for these purposes, we include this RSOI estimate in our ensemble. Of the two reconstructions that employ a KS method to inform the long-term GMSL changes, we use the more recent study of DA2019. Similarly, of the two VS-based reconstructions we use the more recent study of FR2020, which is more up-to-date and includes a more comprehensive treatment of uncertainties. The central estimate timeseries is constructed by simply taking the average of CW2011, DA2019 and FR2020, i.e. each of the three methods-RSOI, KS and VS (table 1)-is given equal weight in our ensemble central estimate.
The next step is to estimate the internal uncertainty. Internal uncertainty is the term we use to describe the uncertainty estimates that are associated with each individual GMSL reconstruction ( figure 1(a)). Different estimation methods are used among the products and result in substantial differences in both the magnitude and time evolution of internal uncertainty. We take a conservative approach to constructing the ensemble internal uncertainty by selecting the largest value at any given time across all available products (including 'hybrid' products, figure 1(a), dashed black line). If there were good scientific justification to do so, the estimate could be based on a subset of reconstructions. However, our approach assumes that all internal uncertainty estimates are equally valid and hence all are included.
As discussed by Thorne et al (2005), structural uncertainty arises from the choice of approach in the reconstruction of observational datasets. This is particularly important when the spatial sampling density is low (as is the case for tide gauge reconstructions of GMSL change), as there are different approaches to account for the missing information, such as RSOI or the VS method. We estimate the structural uncertainty through computing the standard deviation of the central estimates of all products relative to the ensemble average ( figure 1(a), grey shading). To do this, we first express every timeseries to a reference period by subtracting the average value for 1991-2010. This is the latest 20 year period that is common to all products and offers improved coverage and more numerous gauge records compared to earlier periods (e.g. Calafat et al 2014). Note that the central estimate timeseries is also computed relative to the same 1991-2010 reference period.
Finally, we assume that the internal and structural uncertainty estimates are independent with a normal distribution and we therefore add the timeseries in quadrature to provide a timeseries of the total observational uncertainty. This timeseries is then combined with the ensemble central estimate to provide an ensemble timeseries of GMSL change with uncertainties ( figure 1(b)).
To assess the robustness of our results, we carry out two sensitivity tests. In the first, we use a much longer reference period of 1902-2010 to explore the impact this has on our results. The choice of reference period has an impact on the estimate of structural uncertainty, as shown in figure S1 (available online at stacks.iop.org/ERL/16/044043/mmedia). For the second test, we eliminate hybrid products from our ensemble central estimate, by replacing the DA2019 timeseries with DA2017. This test affects the central estimate timeseries but not the ensemble-based uncertainties, as shown in figure S3. Overall, the impact of the sensitivity tests on our results are small, with the largest variations in central estimates and total uncertainties for GMSL change not exceeding a few mm for all periods considered in this study (figures S2 and S4). Further discussion is provided in section 4.3.

Metrics of GMSL change: deltas vs trends
We compare our ensemble timeseries to individual products and past assessments of observed GMSL reported by the IPCC on the basis of two change metrics.
The first, which we refer to as a 'delta' or ∆, is simply the difference in GMSL between two years in the historical record. The uncertainty in each delta value is estimated by combining the total uncertainty for each of the two years in quadrature-under the assumption that the uncertainties are independent. While we consider this assumption to be reasonable for years that are temporally distant, it may tend to overestimate the uncertainty for shorter periods (e.g. interannual-to-decadal timescale). Deltas are perhaps the simplest approach to quantifying sea-level change but they can be sensitive to end-point effects in the presence of substantial year-to-year variability. Deltas represent a logical choice for budget studies and may have advantages in accommodating accelerations in rates of GMSL change, compared to linear-trend estimates.
The second metric of change we use is the slope of a linear fit to the GMSL central estimate applied over a specified period, which we refer to as a 'trend' . This approach has been widely used in the IPCC AR5 and SROCC reports to characterize the average rate of change of key indices, such as globally averaged surface temperature, globally integrated ocean heat content, and GMSL. While a trend over a longer period is effective for removing the effects of internal variability, the assumption of a linear trend may be less appropriate in case of substantial accelerations in the rate of GMSL change. For consistency with IPCC assessments, we compute the uncertainty in the trend based on the approach of Santer et al (2008), which accounts for serial autocorrelation in the timeseries. Note that this approach does not make use of the a priori uncertainty estimates presented in figure 1. The uncertainty arises purely from the variability and temporal autocorrelation in the central estimate timeseries (2008, 2012).

Ensemble timeseries of GMSL change and uncertainties
Estimates of internal uncertainty (expressed as the 90% confidence interval) across the GMSL reconstructions range from between about 30 and 40 mm at the start of 20th century to between about 5 and 25 mm at their lowest values in the early 21st century ( figure 1(a)). While all estimates show a substantial decrease in the estimated internal uncertainty over the first half of the 21st century, the non-hybrid estimates tend to level off around 1960. In contrast, the two hybrid estimates (HA2015 and DA2019) continue their much smoother reduction of uncertainty into the 21st century. Our conservative ensemble estimate of internal uncertainty is based on the maximum value across all products for any given year over the period 1900-2010 ( figure 1(a), dashed line). Using this criterion, the ensemble internal uncertainty estimate is based on the HA2015 and DA2017 estimates, with values ranging from >40 mm to about 25 mm.
The estimate of ensemble structural uncertainty is computed as the standard deviation of the ensemble relative to the ensemble mean, when all timeseries are expressed relative to their 1991-2010 mean value ( figure 1(a), grey shading). The structural uncertainty (expressed as the 90% confidence interval) is largest prior to the mid-20th century, with an average value of around 20 mm between 1900and 1950. Between about 1950and 1970 there is a substantial reduction in structural uncertainty, which has an average value of around 5 mm between 1980 and 2020. The estimate of structural uncertainty has quite a lot of residual  Frederikse et al (2020) noise, with typical year-to-year variations of several mm. Overall, the ensemble structural uncertainty is about 50% smaller than the ensemble internal uncertainty, making a lesser but still important contribution to the total uncertainty.
The estimates of structural and internal uncertainty are combined in quadrature to estimate the ensemble total uncertainty ( figure 1(b), light grey shading), under the assumption that they are essentially independent and have a normal distribution. Total uncertainty ranges from about 50 mm in the early 20th century to about 25 mm in the early 21st century. The ensemble central estimate is determined by taking the mean of the three non-hybrid products, i.e. CW2011, DA2017 and FR2020. The ensemble central estimate has a very similar timeseries to FR2020, with some reduction in the higher frequency variability as a result of averaging across several estimates. We note that the CW2011 product, which is based on an RSOI method, shows less variability than products based on a VS approach (DA2017 and FR2020).

Comparison of long-term changes in GMSL
In this section we compare changes in GMSL from our ensemble estimate with the individual central estimates of the products listed in table 1 and the assessment ranges presented in recent IPCC reports. We consider estimates of GMSL change based deltas and trends (see section 2) for the periods 1901-2010, 1901-1990, 1901-1993, 1993-2010 and 1971-2010. These periods are informed by the periods used in past IPCC assessments and the availability of satellite altimeter data from 1993.
For all periods considered, the 90% confidence interval of each delta based on our ensemble GMSL estimate encompasses the individual products ( figure 2(a)). Our ensemble estimate is consistent with the assessment of AR5 (Church et al 2013a) for the period 1901-2010, with a slightly lower central estimate and substantially larger uncertainties (table 2). We note that reconstructions published since the AR5, tend to show lower rates of GMSL rise over the first half of the 20th century than earlier studies ( figure 1(b); Hay et al 2015).
The deltas presented in figure 2(a) can be simply converted to a mean rate of GMSL rise for each period by dividing by the period length in years ( figure 2(b)). Unsurprisingly, the comparison to AR5 is very similar when expressed as an average rate. Our ensemble estimate for the period 1901-1990 shows remarkably good agreement with the assessment presented in the recent SROCC (Oppenheimer et al 2019). The ensemble estimate uncertainties for 1993-2010 and 1971-2010 seem large compared to the spread of individual GMSL reconstructions. Since structural uncertainty is relatively small during these periods, the ensemble total uncertainty is dominated by the internal uncertainty estimate (figure 1).
The comparison of average linear trends and uncertainties following Santer et al (2008) (figure 2(c)) presents a very different picture to the analysis based on deltas ( figure 2(b)). While the ensemble central estimate is broadly similar to the rates based on deltas, there is a large reduction in the estimated uncertainties. As a result, for every period, one or more individual GMSL Table 2. Summary of the estimated deltas, rates and trends in GMSL. Uncertainties are expressed as the 90% confidence interval. 1901-2010 1901-1990 1901-1993 1993-2010 1971-2010 Ens. ∆GMSL (mm) 175 ± 50 120 ± 50 121 ± 50 54 ± 33 83 ± 36 Ens. rate (mm yr −1 ) 1.6 ± 0.5 1.3 ± 0.6 1.3 ± 0.5 3.0 ± 1.9 2.1 ± 0.9 Ens. trend (mm yr −1 ) reconstructions exhibit a trend that lies outside the ensemble 90% confidence interval. The AR5 assessment for 1901-2010 has similar, albeit slightly larger, uncertainties than our ensemble. The AR5 assessment was based on the reconstruction of CW2011, using ordinary least squares linear fit with the uncertainty representing the 90% confidence interval (Rhein et al 2013;their Table 3.1) using a similar method to Santer et al (2008) to account for serial autocorrelation. We note that our ensemble approach (section 3) will tend to reduce variability in our central estimate timeseries and may explain why our uncertainties are smaller than those of AR5, based on the Santer et al method.
The SROCC (Oppenheimer et al 2019) used an ensemble estimate that augmented the GMSL reconstructions available at the time of AR5 with the additional estimates of Hay et al (2015) and Dangendorf et al (2017) to arrive at an assessed rate of GMSL rise of 0.8-2.0 mm yr −1 for the period 1901-1990. We arrive at very similar 90% confidence interval for this period, i.e. 1.0-2.0 mm yr −1 , based on the standard deviation of trends computed using Church and White (2011), Ray andDouglas (2011), Jevrejeva et al (2014), Hay et al (2015) and Dangendorf et al (2017). The SROCC approach is similar to the ensemble estimate presented here and shows remarkably good agreement with our estimated uncertainties based on deltas. If we were to carry out our ensemble method using an identical ensemble to SROCC, we might expect to see larger overall uncertainties arising from the estimate of internal uncertainty in addition to structural uncertainty.
The ensemble estimates of GMSL change are highly consistent with values based on the satellite altimeter timeseries presented by Frederikse et al (2020) for the period 1993-2010 (figures 2(a)-(c); table 2). For the GMSL change based on deltas, as expected, the satellite altimeter timeseries have much smaller uncertainties than our tide gauge reconstruction ensemble (figures 2(a) and (b)), which nevertheless encompass our ensemble central estimate (table 2) The trend results presented here call into question the suitability of the Santer et al approach for estimating uncertainties in GMSL rise, which does not account for structural or internal uncertainty in the ensemble (figure 1), and appears to underestimate the overall trend uncertainty (c.f. figure 2(b) and (c)). We note that this method was devised primarily in the context of estimates of global mean surface temperature (GMST) change, which has very different observational properties to GMSL. Compared to GMSL, estimates of GMST are supported by more numerous and well-distributed observations and the methodological uncertainties are relatively small compared to the observed signals (e.g. Morice et al 2012). GMST is subject to large internannual-to-decadal variability associated with ENSO and other modes of climate variability and therefore Santer et al (2008) is a very useful approach for quantifying uncertainties in the underlying multi-decadal trends. For climate indices with lesser variability and larger observational uncertainties (e.g. GMSL, ocean heat content) our results suggest that it is important to incorporate a priori estimates of internal and structural uncertainty when assessing the observed changes. We would therefore caution against using the Santer et al (2008) method for assessing trend uncertainty in GMSL. Our results suggest that the spread in linear trends across the ensemble is a more reliable basis for estimating uncertainty in GMSL change.

Sensitivity tests
As a fist sensitivity test, we recompute our GMSL ensemble using a longer reference period of 1902-2010 (figure S1), since the choice of reference period is somewhat subjective. This results in an estimate of structural uncertainty that is more time-constant and does not show the substantial reduction after 1950 see in figure 1(a). Relative to our original ensemble, the total uncertainty increases in the last few decades of the timeseries. The sensitivity to reference period arises due the different decadal rates of change seen across the reconstructions, particularly during the period 1940-1980 ( figure 1(b)). While the change in reference period has no impact on the central estimates of GMSL change, it does increase the uncertainty of the delta in GMSL for the periods 1993-2010 and 1971-2010 by 6 mm and 4 mm, respectively (figure S2). The effect on the uncertainty for the longer periods does not exceed~1 mm.
In a second sensitivity test, we eliminate hybrid estimates from our central estimate ensemble by replacing DA2019 with DA2017, while retaining the 1991-2010 reference period. While the multi-decadal characteristics of GMSL are largely unchanged, the new ensemble has substantially larger interannual and decadal variability (figure S3). However, the uncertainties in delta GMSL are unaffected by the change of central estimate, and differences in the central estimate does not exceed 3 mm for any period (figure S4).

Summary
We have presented a simple ensemble-based method to estimating GMSL change over the 20th century. This method considers the contribution from both internal uncertainty (i.e. the estimates of uncertainty in GMSL change associated with individual ensemble members based on their chosen method) and the structural uncertainty (i.e. the differences among ensemble member central estimates that arise from choice of method). Our results suggest that both internal uncertainty and structural uncertainty make an important contribution to the overall uncertainty. The implication is that estimates of GMSL change based on a single product are likely to underestimate the total uncertainty, as are ensemble-based estimates that do not account for internal uncertainty (i.e. ensemble spread alone is not enough to characterize uncertainty).
The method is flexible and can easily accommodate elements of expert scientific judgement. For example, for some applications it might be appropriate to remove hybrid estimates (i.e. those that convolve observations with climate model simulations) from the central estimate in order to avoid circularity in the context of climate model evaluation or detection and attribution studies. Other examples of expert scientific judgement might include: selecting a subset of products to estimate the uncertainties; or selecting a sub-ensemble to inform the central estimate based on those that are less subject to known sampling biases.
While we have used tide gauge reconstructions of GMSL to illustrate the method, it could be readily applied to other metrics of global or regional climate change of relevance to climate monitoring, model evaluation or detection and attribution studies.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).