Sub-basin-scale sea level budgets from satellite altimetry, Argo floats and satellite gravimetry: a case study in the North Atlantic Ocean

. In this study, for the ﬁrst time, an attempt is made to close the sea level budget on a sub-basin scale in terms of trend and amplitude of the annual cycle. We also compare the residual time series after removing the trend, the semiannual and the annual signals. To obtain errors for altimetry and Argo, full variance–covariance matrices are computed using correlation functions and their errors are fully prop-agated. For altimetry, we apply a geographically dependent intermission bias (Ablain et al., 2015), which leads to differences in trends up to 0.8 mm yr − 1 . Since Argo ﬂoat measurements are non-homogeneously spaced, steric sea levels are ﬁrst objectively interpolated onto a grid before averaging. For the Gravity Recovery And Climate Experiment (GRACE), gravity ﬁelds full variance–covariance matrices are used to propagate errors and statistically ﬁlter the gravity ﬁelds. We use four different ﬁltered gravity ﬁeld solutions and determine which post-processing strategy is best for budget closure. As a reference, the standard 96 degree Dense Decorrelation Kernel-5 (DDK5)-ﬁltered Center for Space Research (CSR) solution is used to compute the mass component (MC). A comparison is made with two anisotropic Wiener-ﬁltered CSR solutions up to degree and order 60 and 96 and a Wiener-ﬁltered 90 degree ITSG solution. Budgets are computed for 10 polygons in the North Atlantic Ocean, deﬁned in a way that the error on the trend of the MC plus steric sea level remains within 1 mm yr − 1 . Using the anisotropic Wiener ﬁlter on CSR gravity ﬁelds expanded up to spherical harmonic degree 96, it is possible to close the sea level budget in 9 of 10 sub-basins in terms of trend. Wiener-ﬁltered Institute of Theoretical geodesy and Satellite Geodesy (ITSG) and the standard DDK5-ﬁltered CSR solutions also close the trend budget if a glacial isostatic adjustment (GIA) correction of 10–20

Abstract. In this study, for the first time, an attempt is made to close the sea level budget on a sub-basin scale in terms of trend and amplitude of the annual cycle. We also compare the residual time series after removing the trend, the semiannual and the annual signals. To obtain errors for altimetry and Argo, full variance-covariance matrices are computed using correlation functions and their errors are fully propagated. For altimetry, we apply a geographically dependent intermission bias (Ablain et al., 2015), which leads to differences in trends up to 0.8 mm yr −1 . Since Argo float measurements are non-homogeneously spaced, steric sea levels are first objectively interpolated onto a grid before averaging. For the Gravity Recovery And Climate Experiment (GRACE), gravity fields full variance-covariance matrices are used to propagate errors and statistically filter the gravity fields. We use four different filtered gravity field solutions and determine which post-processing strategy is best for budget closure. As a reference, the standard 96 degree Dense Decorrelation Kernel-5 (DDK5)-filtered Center for Space Research (CSR) solution is used to compute the mass component (MC). A comparison is made with two anisotropic Wienerfiltered CSR solutions up to degree and order 60 and 96 and a Wiener-filtered 90 degree ITSG solution. Budgets are computed for 10 polygons in the North Atlantic Ocean, defined in a way that the error on the trend of the MC plus steric sea level remains within 1 mm yr −1 . Using the anisotropic Wiener filter on CSR gravity fields expanded up to spherical harmonic degree 96, it is possible to close the sea level budget in 9 of 10 sub-basins in terms of trend. Wiener-filtered Institute of Theoretical geodesy and Satellite Geodesy (ITSG) and the standard DDK5-filtered CSR solutions also close the trend budget if a glacial isostatic adjustment (GIA) correc-tion error of 10-20 % is applied; however, the performance of the DDK5-filtered solution strongly depends on the orientation of the polygon due to residual striping. In 7 of 10 sub-basins, the budget of the annual cycle is closed, using the DDK5-filtered CSR or the Wiener-filtered ITSG solutions. The Wiener-filtered 60 and 96 degree CSR solutions, in combination with Argo, lack amplitude and suffer from what appears to be hydrological leakage in the Amazon and Sahel regions. After reducing the trend, the semiannual and the annual signals, 24-53 % of the residual variance in altimetryderived sea level time series is explained by the combination of Argo steric sea levels and the Wiener-filtered ITSG MC. Based on this, we believe that the best overall solution for the MC of the sub-basin-scale budgets is the Wiener-filtered ITSG gravity fields. The interannual variability is primarily a steric signal in the North Atlantic Ocean, so for this the choice of filter and gravity field solution is not really significant.

Introduction
If the sum of individual components is statistically equal to the total sea level variations, the budget is closed. Total sea level variations and its components are observed by in situ and satellite measurements, but can also be modelled. Several studies have attempted to close the sea level budget by using satellite altimetry, satellite gravimetry and observations or reanalyses of ocean temperature and salinity on a global scale. Closure of the budgets is required to get a consistent division between the mass component (MC) and steric-related sea level changes. This helps us to identify the contributors to present-day sea level changes. Contributors that affect the MC are glacier and ice sheet melt and land water storage, while heat fluxes between ocean and atmosphere contribute to steric changes. Ocean dynamics have an effect on both the MC and the steric change in sea level.
One of the first attempts to close the sea level budget compared time series of total sea level from satellite altimetry with the sum of the MC from satellite gravimetry and the steric component from Argo floats (Willis et al., 2008). That study showed that between the middle of the years 2003 and 2007 the sum and the total sea level have comparable seasonal and interannual sea level variability; however, the 4-year trends did not agree. In that same year, Cazenave et al. (2008) found comparable estimates of steric sea level estimated from Argo and from the difference between altimetry and the Gravity Recovery And Climate Experiment (GRACE) observations over 2003-2008. Using the same methods as Willis et al. (2008) the global sea level budget was closed within error bars by Leuliette and Miller (2009) (Bretherton et al., 1976) to create grids of Argo data. Li et al. (2013) attempted to close the global budget using temperature and salinity grids from Ishii et al. (2006).
While time series of satellite gravimetry and Argo observations became longer and the processing of gravity fields improved, it became possible to look at basin-scale budgets and patterns. Chambers and Willis (2010) compared gravimetry-derived maps of ocean bottom pressure (OBP) to those obtained with steric-corrected altimetry, whereas Marcos et al. (2011) investigated the distribution of steric and OBP contributions to sea level changes and looked at basin-scale differences. Purkey et al. (2014) analysed differences between basin-scale OBP from satellite gravimetry and steric-corrected altimetry using conductivity-temperaturedepth (CTD) profiles over the period 1992-2013. They showed that both methods captured the large-scale OBP change patterns, but that differences occur when deep-steric contributions below 1000 m are not considered. Over the North Atlantic Ocean, the OBP trends were found to be statistically equal, but with large error bars for the stericcorrected altimetry results. Von Schuckmann et al. (2014) found global and large-scale regional (a third of the total ocean) consistency in sea level trends of the three systems in the tropics as long as areas like the tropical Asian archipelago are not considered, but they did not manage to close the budget between 30 and 60 • N and argued that the inability of Argo to resolve eddies in the western intensifications caused the difference in trends.
Some other studies focussed on sea level budgets in small basins. García et al. (2006) compared sea level trends in the Mediterranean from satellite altimetry, satellite gravimetry and the ECCO (estimating the circulation and climate of the ocean) model. ECCO is also used by Feng et al. (2012) to determine trends in the South China Sea. Time series of sea level budgets have been investigated in the Red Sea using Ishii grids (Feng et al., 2014).
Compared to previous studies, we improve the treatment of each dataset, in particular with respect to an accurate description of the uncertainties. We avoid using precomputed grids for Argo and altimetry, because no covariances between grid cells are provided, and we use full variance-covariance matrices of the GRACE gravity field solutions. Secondly, we address the effect of several processing steps particularly on gravimetry data in terms of trend, annual amplitude and (residual) time series. For altimetry, we briefly discuss the effect of different averaging methods and analyse the effect on the trends of having a latitude-dependent intermission bias (Ablain et al., 2015). For GRACE, DDK5-filtered solutions (Kusche, 2007;Kusche et al., 2009) are compared with the anisotropic Wiener-filtered (Klees et al., 2008) solutions. Finally, basin-and sub-basin-scale budgets are created, problematic areas are identified and potential causes for nonclosure are discussed.
We apply our method to the North Atlantic basin, because the coverage of Argo is sufficient during the 2004-2014 period, which allows the construction of budgets over a 10-year time span. Secondly, for both steric sea level and the MC, different regimes are present in terms of trend, annual cycle and interannual variability, which allows us to investigate the performance of the method under various conditions. Additionally, we are able to address the effect of the glacial isostatic adjustment (GIA) on the trends, which is a large contributor in the northwest of the considered basin and therefore also a potentially large source of error.
This article will briefly describe the data used in Sect. 2. Secondly, the processing of the three datasets is discussed in the methodology section. In Sect. 4, the processed datasets are compared to existing products. The resulting basin-and sub-basin-scale budgets are described in Sect. 5. In the final section, conclusions are drawn based on the results.

Data description
This section briefly discusses the data from the three observing systems that are used to determine the sea level budgets. For the determination of the sum of the steric and the mass components of sea level, satellite altimetry data are used. The altimetry data are obtained from the Radar Altimetry Database System (RADS) (Scharroo et al., 2012). RADS contains 1 Hz along-track data, which corresponds to an along-track separation of sea level measurements of approximately 6 km. The files contain ranges, orbits and geophysical corrections for all altimeters that have been flown. In this study, only the data of the Jason-1 and Jason-2 satellites are considered to have consistent spatial and temporal sampling over the period [2004][2005][2006][2007][2008][2009][2010][2011][2012][2013][2014]. The data of Jason-1 during  (Canabes et al., 2013), which means that there is, on average, approximately 1 float per 3 × 3 • box. For the North Atlantic Ocean, steric sea levels can be analysed from 2004, because most areas are sampled already by Argo floats as shown in Fig. 1. In the North Atlantic Ocean, the areas around the Antilles and north of Ireland are the only problematic areas. Most floats descend to a depth around 1000-2000 m and measure temperature and salinity while travelling upward. The resurfacing time of an Argo float is approximately 10-12 days. Using the distribution of temperature and salinity over depth, the steric sea level is computed.
The Earth's time-variable gravity field is measured since 2002 by GRACE. This mission measures changes in the Earth's gravity field by low Earth orbit satellite-to-satellite tracking. Traditionally the Earth's gravity field is expressed in spherical harmonics. In this study, the release five monthly spherical harmonic solutions computed at the Center for Space Research (CSR) (Tapley et al., 2004), together with the ITSG-GRACE2016 solutions (Klinger et al., 2016) computed at the Institute for Theoretical geodesy and Satellite Geodesy (ITSG). The CSR solutions are computed up to degree and order 60 and 96, while the ITSG solutions are computed up to degree and order 90. All three products are provided with full variance-covariance or normal matrices, which allows for statistically optimal filtering. In case of a proper error description, we expect that the results of the CSR 60 and 96 degree solutions give comparable results, except in areas with large gradients in gravity. However, since the differences in variance-covariance matrices are small during the periods July 2003-December 2010 and February 2011-July 2013, but the orbit geometry substantially varies within these periods, the provided variance-covariance matrices are not expected to give a proper representation of the error, which leads to reduced quality filtering. Klinger et al. (2016) showed that the gravity field variability over the oceans indeed increases substantially during periods when GRACE enters repeat orbits. As a consequence, the months July-October 2004 are excluded from the analysis, when GRACE entered a near 4-day repeat orbit. The addition of the ITSG solutions enables us to compare an independent solution computed with a different approach to the standard CSR products. The non-dimensional gravity field coefficients are converted to units of equivalent water height (EWH) before filtering, to make them compatible with the other two observing systems. For comparison, we also used the publicly available Dense Decorrelation Kernel (DDK)filtered solutions of CSR; however, no variance-covariance matrices for those solutions are publicly available. From here on, the designations listed in Table 1 are used to refer to the GRACE gravity field solutions. In the processing phase, the atmospheric and ocean de-aliasing level 1B (AOD1B) product is incorporated (Dobslaw et al., 2013), which is based on the Ocean Model for Circulation and Tides (OMCT) and the European Centre for Medium-range Weather Forecast (ECMWF) model. Monthly averages of the OMCT and the ECMWF are restored after processing to the time-varying gravity field in the form of spherical harmonics (Chambers and Willis, 2010); details of this are described in Sect. 3.3.

Methodology
The data described in the previous section are processed such that they are suited for establishing monthly regional sea level budgets. It implies that the equation is satisfied within uncertainties, whereh sla,GIA is the GIAcorrected mean sea level (MSL) anomaly derived from the Jason satellites,h ssla the mean steric sea level anomaly derived from Argo andh mca,GIA the mean GIA-corrected MC anomaly in terms of EWH derived from GRACE. Note that MSL is inverse barometer-corrected and the MC anomaly from GRACE is made consistent. This section therefore describes the processing strategies for the three observation types from individual measurements to an average over a specified region in the ocean including the propagation of the formal errors. As far as altimetry is concerned, after computing individual along-track sea level anomalies, two important processing steps are described in this section: a suitable averaging method to come to a time series of MSL for a given area and a way to deal with geographical dependencies of the intermission bias between the two Jason missions (Ablain et al., 2015).
To compute steric sea levels from Argo temperature and salinity measurements the Thermodynamic Equation Of Seawater-2010 (TEOS-10) software is used (Pawlowicz et al., 2012). Since the Argo measurements are nonhomogeneously distributed over the ocean, the steric sea levels are first interpolated using an objective mapping procedure to a grid of 1 × 1 • before being averaged. Monthly GRACE solutions of CSR and ITSG are provided with full variance-covariance and normal matrices, which allows the use of an anisotropic Wiener filter (Klees et al., 2008). Compared to other existing filters, it strongly reduces the stripes that are still present in the DDK-filtered solutions (as will be shown in Sect. 4), while not reducing the spatial resolution by applying a large Gaussian filter. A fan filter is applied after the optimal filter to reduce ringing artefacts that occur close to Greenland due to the limited number of spherical harmonic coefficients (degree and order 60-96).

Jason sea level
Individual sea level anomalies h sla measured with the Jason-1 and Jason-2 satellites are computed with respect to the mean sea surface (mss) DTU13 as where a is the satellite altitude, R the Ku-band range and R corr the applied geophysical corrections obtained from the RADS database. The satellite altitude is taken from the GDR-D orbits and the latest versions of the geophysical corrections are applied, as listed in Table 2. The 35 s smoothed dualfrequency delay is used to reduce the relatively large noise in the individual ionospheric corrections. For the wet tropospheric correction, we use the latest delay estimate from the radiometer, while the dry tropospheric delay is computed from the ECMWF pressure fields. Tidal corrections from the GOT4.10 model are applied, which are based on Jason data instead of TOPEX data as in the GOT4.8 model (Ray, 2013). The Cartwright-Taylor-Edden solid earth tide model is applied (Cartwright and Taylor, 1971;Cartwright and Edden, 1973) and an equilibrium model for the pole tide (Wahr, 1985). For the sea state bias correction, a nonparametric model is used (Tran et al., 2012). To correct for high-frequency (< 20 days) wind and pressure effects on the sea surface, a dynamic atmospheric correction is applied based on the MOG2D model (Carrère and Lyard, 2003). The dynamic atmosphere correction in RADS also includes an inverse barometer correction, which corrects for the lowfrequency (> 20 days) sea level anomalies caused by regional sea level pressure variations with respect to the timevarying global mean over the oceans. Sea level anomalies larger than 1 m are removed from further processing, as in the National Oceanic and Atmospheric Administration (NOAA) GMSL time series (Masters et al., 2012).
In GMSL time series, an intermission bias correction is applied, which is determined from the average GMSL difference between Jason-1 and Jason-2 during their tandem phase, in which the satellites orbit the same plane only a minute apart . However, the differences reveal a geographical dependence as shown in Fig. 2. Regional sea level budgets established in this study are more prone to these geographical differences than when estimating global sea level budgets. This problem is partly corrected for by estimating a polynomial through the intermission differences, which only depends on latitude (Ablain et al., 2015) and is given by where λ is the latitude and h sla,ib (λ) is the intermission correction. The sea level anomaly h sla,c corrected for inter-Ocean Sci., 12, 1179-1203, 2016 www.ocean-sci.net/12/1179/2016/ Figure 2. Geographical differences between Jason-1 and Jason-2 sea level estimates averaged over the tandem period.
This correction is only applied to Jason-2 sea level anomalies. The parameters c n , with n = 0, 1, . . .4, depend on the applied geophysical corrections. For the corrections given in Table 2, the values for the parameters are given in Table 3.
In the middle of the North Atlantic Ocean (approximately 40 • N), the intermission difference is several millimetres less than when only including the constant c 0 parameter (which is slightly different if the other parameters are not estimated). This results in an approximate trend difference of several tenths of a millimetre over a period of 10 years. Due to the limited sampling of the Argo network and the relatively large errors in the gravity field solutions it is necessary to integrate sea level anomalies over extended areas. Previous studies producing GMSL time series have used two different techniques (Masters et al., 2012): gridding or lati-tude weighting based on the inclination of the orbit (Wang and Rapp, 1994;Nerem, 1995), which was simplified for a spherical Earth approximation by Tai and Wagner (2011). From here on, the latter is referred to as the Wang and Rapp method. The gridding method is problematic when using the Jason satellites, because of their large track spacing at the Equator, causing the number of individual observations per grid cell to decrease at low latitudes (Henry et al., 2014). A solution is to increase the grid cell size, but this has a disadvantage if sea level budgets are constructed over an irregular and/or a small polygon. The Wang and Rapp method has the disadvantage that it underweights measurements at high latitudes (> 50 • ) (Scharroo, 2006), because it assumes the number of measurements to reach infinity at the inclination of the satellites.
Therefore, it is suggested to average the sea level anomalies based on the number of available measurements within a latitude band. The method connects the weights assigned www.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 to the measurements to the number of measurements N l in a latitude band l of 1 • and the area of the sea surface A l in the following way: These weights are normalized: A MSL anomalyh sla for an area is computed with whereŵ is the vector of normalized weights andĥ sla,c is the vector of sea level anomalies corrected for intermission differences.
For the error estimation, variance-covariance matrices are computed as described in Le Traon et al. (1998). Observations over depths smaller than 1000 m are removed using the general bathymetric chart of the oceans (GEBCO) 1 min grid, because Argo steric sea level is referenced to this depth. This method separates the long-wavelength errors from the representativity errors due to ocean variability. White measurement noise is not considered here, because it becomes very small when averaged over large areas. Among the longwavelength errors, we consider the orbit, ocean tide and inverse barometer errors. These errors are assumed to be fully correlated between measurements within the track and uncorrelated between inter-track measurements. It is noted that those correlations do not hold over large basins (> 2000 km in the zonal direction) (Le Traon et al., 1998), and therefore the error is overestimated. For the other error sources, an efolding decorrelation time of 15 days is assumed and the zero crossing of the correlation distance function d corr is given by (Le Traon et al., 2001): where λ avg is the average latitude of two measurements. Ultimately, this results in equations for the covariance of respective measurements in different tracks and on the same track: where ρ ij is correlation computed with the decorrelation time and distance provided above, σ 2 ov is the ocean variability variance and σ 2 lw is the long-wavelength variance. The values for σ ov and σ lw are assumed to be 100 and 15 mm, where the first number comes from typical mesoscale variability (Chelton et al., 2007). By putting these equations in the variancecovariance matrix C sla , the standard errorσ sla for the mean sea level anomaly is computed usinḡ Both the satellite altimetry mean sea level anomalies as well as the MC from GRACE are affected by GIA. For the corrections to GRACE and altimetry, we use the solution of Peltier et al. (2015) based on an Earth model with VM5a viscosity profile and ICE-6G deglaciation history. The altimetric measurements are corrected by subtracting the GIA geoid trend averaged over the region of interest. Errors in the GIA trends are typically assumed to be on the order of 30 % of the signal (Von Schuckmann et al., 2014).
Because the CSR gravity fields are created on a monthly basis and the altimetry measurements are averaged over a cycle of approximately 10 days, the altimetry measurements are low-pass filtered. A low-pass filter f lp is computed by taking an inverse discrete time Fourier transform, which results in with f c the cut-off frequency, which is taken as 12 cycles per year, t is time vector of the altimetry time series and t m the time at the middle of a month. This filter is infinitely long, so therefore we cut it at 2 months. To obtain a better frequency response, the filter is windowed using a Hamming window w H , so that where L is the length of the window which is 2 months. The applied filter h lp is then written in the time domain as The mean of the GIA-corrected, low-pass filtered time series is subtracted, which leaves the MSL anomalyh sla,GIA from Eq. (1).

Argo steric sea level
Using the Argo profile instead of a precomputed temperature/salinity (T /S) grid has the primary advantage that covariances can be computed between steric sea level grid cells. First, steric sea levels are computed from the individual Argo T /S profiles using the TEOS-10 package. This package requires the conversion of the PSS-78 practical salinity  (Grosso et al., 2010) as well as the ITS-90 temperatures to conservative temperature as defined in the TEOS-10 user manual (IOC, 2010). The TEOS-10 program numerically integrates the equation for the geostrophic steric sea level (IOC, 2010): where P 0 is the surface pressure, P is the reference pressure, which is set to 1000 dbar (approximately 1000 m depth) and g 0 is a constant gravitational acceleration of 9.763 m s −2 .
In the analysis, only profiles that reach at least 1000 m depth are included and have at least a measurement above 30 m depth, which is the typical depth of the mixed layer. A virtual measurement is created at 1 m depth, assuming the same salinity and potential temperature values as the highest real measurements, so that the top steric signal is not missed. Only measurements that have error flag "1" (good) or "2" (probably good) are used and the measurements are cleaned by moving a 5 × 5 • block to remove steric sea level estimates more than 3 rms from the mean.
To be able to average measurements monthly over a basin or a polygon, a grid is constructed by statistical interpolation of the steric sea levels at the profile locations based on the method described in Bretherton et al. (1976) and . First, a background field is constructed by estimating a model through the 1000 closest measurements of a profile or grid cell location. This model contains a constant, a second-order 2-D longitude-latitude polynomial and six intra-annual to annual cycles (Roemmich and Gilson, 2009). The background sea surface height is taken as the model evaluated at the grid cell (or profile) location.
Consecutively, the background field vectorĥ ssl,b is subtracted from the sea level estimates, which results in where δĥ ssl are the residuals. The ocean variance σ 2 t is assumed to be 100 cm 2 (typical mesoscale variability) (Chelton et al., 2007), which is close to the average squared rms of fit of the differences of the measurements with the model. These variances are subdivided into three components to represent different correlation scales (Roemmich and Gilson, 2009) as follows: which are then used to construct covariance matrices C based on those used for the Scripps fields (Roemmich and Gilson, 2009), such that and the measurement and representativity error matrix R: where d is a measure for the distance between the profiles p and the grid points g, such that The parameter a is one above 20 • latitude and below this latitude that a decays linearly to 0.25 at the Equator, in order to represent the zonal elongation of the correlation scale here (Roemmich and Gilson, 2009).
Using the covariances C pg (between profiles and grid points) and C p (between profiles), the weight matrix K is computed as The weight matrix is then used to compute a vector of steric sea levelsĥ ssl,g for every grid point within the area: for which also the variance-covariance matrix C ssl,g is computed, so that where C g are the covariances of the background grid. To average the steric sea level anomalies, the values are weighted by the cosine of the latitude, which results in Like for altimetry, the weights are normalized: Eventually these are used to compute the mean steric sea levelh ssl and its associated errorσ ssl , with and Subtraction of mean from the mean steric sea level time series yields the steric sea level anomaliesh ssla used in Eq. (1).

GRACE mass
We use the full variance-covariance and normal matrices to filter the spherical harmonic coefficients with an anisotropic non-symmetric (ANS) filter (Klees et al., 2008). This Wiener filter exploits the ratio between the variance of the error and www.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 of the signal to filter the coefficients. With the variancecovariance matrices C x and D x for the errors and the signals, respectively, the spherical harmonic coefficientsx are filtered aŝ For the filtered coefficientsx of , a corresponding variancecovariance matrix C x,of is computed. This is a joint inversion of a static background field, which is set to zero, and the timevarying coefficients, resulting in The derivation is elaborated upon in Appendix A.
The filtered grids contain ringing effects around strong signals over Greenland and the Amazon region, which can have substantial effects on the estimated trends in the ocean, which is discussed in Sect. 4. If averaged over large areas this will have hardly an effect, but on regional scales the ringing should be reduced. To obtain smoother fields, we use a fan filter (Zhang et al., 2009;Siemes et al., 2012), which is given aŝ where the degreel and orderm are related to the maximum degree, l max . For a maximum degree of 60 and 96, this is comparable to a Gaussian filter of 280 and 110 km, respectively (Siemes et al., 2012). Suppose F f = diag(sinc(ˆl l max ) • sinc(m l max )); then, the resulting covariance matrix C x,ff is written as Note that there is a fundamental difference between filtering the CSR and ITSG solutions. The CSR solutions are computed with respect to a static gravity field, while the ITSG solutions are computed with the respect to a static gravity field including a secular trend and an annual cycle. As a consequence, the CSR spherical harmonic coefficients and signal variance-covariance matrices include the annual and the secular trend during Wiener and fan filtering. The ITSG gravity fields are Wiener-filtered first, then the annual cycle and the secular trend are added back and eventually the fan filter is applied.
Since the degree-1 coefficients are not measured by GRACE, we add those of Swenson et al. (2008) to the CSR solutions. For the ITSG solutions, a dedicated degree-1 solution is computed using the same approach. Furthermore, we replace the C 20 coefficient with satellite laser ranging estimates (Cheng et al., 2013).
The intersatellite accelerations of GRACE are de-aliased for high-frequency ocean and atmosphere dynamics with the Atmospheric and Ocean De-aliasing Level-1B (AOD1B) product. Monthly averages of the AOD1B are provided as the GAD product for both CSR and ITSG, where the mass changes over land are set to zero. To be able to combine the GRACE MC with inverse barometer-corrected altimetry, the GAD products containing the modelled oceanic and atmosphere mass are added back in the form of spherical harmonics. Because the ocean model in the AOD1B product is made mass conserving by adding/removing a thin uniform layer of water to or from the ocean, the degree zero is removed before subtraction from the GAD product to compensate for the mean atmospheric mass change over the ocean, which is not measured by inverse barometer-corrected altimetry (Chambers and Willis, 2010).
To compute the MC at a specific grid cell, the 4πnormalized associated Legendre functionsŷ are evaluated at its latitude-longitude location. The MC of the grid cell h mc is consecutively calculated with For multiple grid cells, the vectorŷ becomes a matrix Y, such thatĥ mc becomes a vector of EWHs. This is written in the form It is possible to compute the grid's variance-covariance matrix C mc (Swenson and Wahr, 2002) as The averaging over an area is equal to that of the Argo grids. Suppose thatŵ are the normalized latitude weights for the gridded MC; then is the mean MC of an area and is its error.
To correct the GRACE MC for the GIA trend, we first convert the GIA spherical harmonic coefficients into EWH. The GRACE degree-2 coefficients are different than those for the altimetry, due to changes in the Earth's rotation axis (Tamisiea, 2011). Note that GRACE only measures degrees 2 and higher, and therefore the coefficients of degree 0 and 1 are not taken into account. Eventually, the GIA spherical harmonic coefficients are converted to spatial grids, which are then averaged over the considered area, and consecutively the mean GRACE MC is corrected for the mean GIA trend.
The mean MC anomalyh mca,GIA used in Eq.
(1) is obtained by applying the GIA correction to the mean MCh mc and subtracting the mean of the time series.

Comparison with existing products
In this section, a comparison is made between existing products and the sea levels from altimetry, gravimetry and Argo floats. First, we compare the MSL time series over the North Atlantic Ocean with the existing time series provided by the NOAA Laboratory for Satellite Altimetry (Leuliette and Scharroo, 2010) and we show the effect of a latitudedependent intermission bias. Secondly, amplitude and trend grids of steric sea level are compared to those computed from Scripps salinity and temperature grids (Roemmich and Gilson, 2009) and the Glorys reanalyses grids (Ferry et al., 2010). Thirdly, the optimal and fan-filtered gravimetry grids are compared to the DDK5-filtered gravity fields (Kusche, 2007;Kusche et al., 2009).  Table 2, while the geophysical correction in the first column are applied to the light blue line.

Total sea level
As visible from the figure, hardly any differences are observed between all four time series. The rms differences between all time series computed in this study and NOAA are on the order of 3-4 mm, which is slightly larger than differences found between the GMSL time series (Masters et al., 2012). The fact that the red and blue line resemble each other indicates that the under-weighting of high-latitude measurements in the Wang and Rapp method hardly has any effect on the trend. This also holds for averaging over smaller areas in the North Atlantic Ocean, where the only noticeable differ-ence occurs when a substantial number of satellite tracks are missing due to some maintenance or orbit manoeuvres. However, the time series may differ in places where strong decorrelations or strong differences in trends occur, especially in the north-south direction at high latitudes.
The application of a latitude-dependent intermission bias has a substantial effect on the trend. From the NOAA time series, a trend of 1.5 mm yr −1 is found, while the time series from the Wang and Rapp method and our method provide a trend of 1.8 mm yr −1 . This difference can be explained by the combination of another averaging technique and geographically varying trends in sea level. However, if the difference in MSL is computed between Jason-1 and Jason-2 over the North Atlantic Ocean during the tandem phase and this used as the intermission bias correction, trends of 1.3 and 1.4 mm yr −1 , respectively, for the theoretical and the empirical weighting method are found. This is comparable to the trend computed by applying the geographical dependence of the intermission bias (the light blue line), which is 1.4 mm yr −1 . To further illustrate this, Fig. 4 shows the differences in trend if a constant intermission bias is used or a latitude-dependent one is used. The mean difference of 0.4 mm yr −1 is already significant, but locally the differences in trend increase to approximately 0.8 mm yr −1 .

Steric sea level
In Fig. 5, grids for the amplitudes and trends of the steric signal are shown. The Scripps grids (Roemmich and Gilson, 2009) and our solution are solely based on Argo data, while the Mercator reanalyses product Glorys 2V3 assimilates various types of data including altimetry (Ferry et al., 2010), sea surface temperature and Argo. Besides the different input data, the Glorys relies on a volume-conserving ocean model, while the other two methods are statistical, and the results can differ quite substantially. Since we use the same correlation structures as Scripps, the resulting grids should resemwww.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 ble each other quite closely. However, to be able to create a variance-covariance matrix between grid cells, it was required to do a 2-D interpolation of the steric sea levels instead of a 3-D interpolation of temperature and salinity profiles. In order to do this, the criteria for removing profiles, as described in Sect. 3.2, is different for both methods. As a consequence of the 2-D interpolation and the differences in the removal criteria, the results differ.
In terms of the amplitudes of the annual signal, all three methods provide similar results in terms of the large-scale features. Typically, large signals are found in the Gulf Stream region and close to the Amazon Basin, while the areas around Greenland and west of Africa have small amplitudes. The Glorys grid differs from the others primarily in the Labrador Sea and northwest of Ireland. Secondly, the grid computed in this study and the Glorys grid exhibit more short-wavelength spatial variability than the Scripps grid. As long as the regions over which budgets are made are large enough, the methods will not differ substantially in terms of annual amplitude.
The plots in the right column of Fig. 5 immediately reveal a significant difference between the statistical methods and the reanalysis in terms of trend. It is not completely clear what the cause for this difference is. Since the Scripps grid and our grid resemble each other in terms of large-scale features and are purely based on T /S data, we trust the interpolation of Argo. The differences between those two methods are again primarily the noise in the grids and the area around the Antilles, where Argo samples poorly, as discussed in Sect. 2.

Mass component
In Fig. 6, the trends and amplitudes of the CSR96-DDK solution are compared with those obtained from CSR60-W, CSR96-W and ITSG90-W. Note that the Wiener-filtered solutions are also fan-filtered, as discussed in Sect. 3.3, but will be referred to as "Wiener-filtered" from here on. Both in the annual amplitude and the trend grids some residual striping effects are present for the CSR96-DDK solutions, yielding non-physical trend patterns in the MC. The Wiener filter strongly reduces the striping and, as a result, especially the trend grids are smoother. However, the ITSG grids also exhibit striping (as it appears at shorter wavelengths), which is the results of adding the trend and annual cycle back from the static field, as discussed in Sect. 3.3. A second observation is that the CSR96-DDK and ITSG90-W amplitudes are slightly larger, which indicates that a part of the annual sig-  Ocean Sci., 12, 1179-1203, 2016 www.ocean-sci.net/12/1179/2016/ nal is lost in the CSR60-W and CSR96-W solutions. Thirdly, Tamisiea et al. (2010) estimated a slight increase in MC amplitudes using fingerprint methods based on forward models of water mass redistribution around the Sahel and Amazon of 10-15 mm. In the Wiener-filtered CSR grids, larger amplitudes are also visible in these regions; however, their amplitude of 30-60 mm is far too large and is probably the result of hydrological leakage. This leakage is slightly reduced in CSR96-W compared to those of the CSR60-W.
To determine how this effects sub-basin-scale MC time series, it is first required to determine the minimum area over which the measurements have to be integrated. GRACE gravity fields have a resolution of typically 250-300 km half wavelength (Siemes et al., 2012). For small ocean signals after applying filtering procedures, we expect the resolution to be closer to 400-500 km. Argo has approximately one to two floats per 3 × 3 • box, so its resolution is in the same range as that of GRACE. Jason-1 and Jason-2 have an inter-track spacing of 315 km at the Equator, which decreases substantially towards 60 • N. Considering all systems, this theoretically makes it possible to create budgets over grid cells of approximately 500 × 500 km; however, due to the limited length of the time series, the error bars on the trends become much larger than the signals. The size of the polygons is therefore chosen based on the criterion that the error on the trends does not exceed 1 mm yr −1 .
To illustrate the effects of different filters and residual striping on sub-basin-scale budgets, Fig. 7 shows time series of mass averaged over the polygons shown in Fig. 6. All three polygons have approximately the same size, but have different orientations. The location is chosen in the middle of the Atlantic to avoid effects of hydrological leakage. Except for the months surrounding the near 4-day repeat period in 2004, where the variance-covariance matrices of CSR probably do not properly described the noise of the gravity fields, the time series resemble each other best for the zonally oriented polygon. In the zonal polygon, the noise in CSR96-W is substantially larger than the other results. Furthermore, it becomes clear that the CSR96-DDK solutions do not contain a substantial signal above degree 60, because the red and yellow lines are on top of each other, while CSR60-W and CSR96-W are substantially different.
The month-to-month noise of CSR60-W and CSR96-W time series is comparable for all three polygons. The CSR60-DDK and CSR96-DDK time series become much noisier for the meridionally oriented polygon, where month-to-month jumps of 10-20 mm occur. In addition, the DDK time series exhibit a substantially different trend in the meridional polygon than the other time series, because the orientation of the polygon is aligned with the residual stripes (Fig. 6). So, in terms of trend and noise, the DDK time series strongly depends on the orientation of the polygon. Even though the ITSG90-W trend and amplitude grids suffer from striping, they do not become significantly noisier for the meridionally oriented polygon.

Results and discussion
The first objective of this section is to reveal patterns of sea level amplitudes and trends in the North Atlantic Ocean and how these resemble for the two different methods: altimetry and the combined method of Argo and GRACE, hereafter referred to as Argo+GRACE. Secondly, this section discusses the closure of sea level budgets over polygons of approximately 1/10 of the North Atlantic Ocean in terms of trend, annual amplitude and residual variability. It is shown for which regions the budget is closed and possible causes for non-closure are discussed. Thirdly, we focus on the best choice of GRACE filter solutions for the MC.

North Atlantic sea level patterns
In Fig. 8, grids of trends and amplitudes computed from Argo+GRACE are overlaid with Jason-derived trends and amplitudes at the ground tracks. In areas where the ground tracks of altimetry are barely visible, there is a good resemblance between Argo+GRACE and altimetry.
The grids and ground tracks shown in the left column show that large annual signals are present in the Gulf Stream region and in a tongue stretching from the Amazon to the Sahel. A region without any substantial annual signal is located just west of Africa, which is clearly visible in both the Argo+GRACE grid and altimetry. Both methods reveal these large-scale oceanographic features in amplitude, but there are also quite some differences. East of the Antilles, altimetric measurement show an annual amplitude of more than 60 mm, whereas Argo+GRACE estimates are in the range of 40-50 mm, depending on the choice of GRACE filter. Note that in this area, there are barely any Argo floats (Fig. 1), which might lead to interpolation problems. A second difference is observed in the Wiener-filtered grid (bottom left) at the Amazon and Sahel regions. This is exactly at the areas where the Wiener-filtered MC grids of Fig. 6 exhibit probable hydrological leakage.
The trends from altimetry in the right column of Fig. 8 show a distinct pattern, where positive trends are found south of 35 • N and negative trends north of it, with the exception of the North American coastline. Large trends along the North American coast are also found by tide gauge studies (Sallenger at al., 2012), where they attribute this to a weakening Atlantic meridional overturning circulation (AMOC). The Argo+CSR96-W solution resembles the trend patterns derived from altimetry measurement better, while the residual stripes in the CSR96-DDK solution are clearly visible. Note that a significantly larger altimetric trend is visible west of the Mediterranean. Possible causes will be discussed below.

Sub-basin-scale budgets
The North Atlantic Ocean is split into 10 regions, divided in the middle by the mid-Atlantic ridge, while trying not to cut www.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 Then, budget closure in terms of trend and annual amplitude is addressed and the corresponding best gravity filter is determined. Ultimately, the trends, semiannual and annual signals are reduced from the time series and the best filter choice in terms of residual variability is determined.

Time series
Budgets for three representative regions, using the Wienerfiltered MC solutions, are shown in Fig. 9. The time series for the rest of the regions can be found in the Supplement. The left plots confirm that the main driver for annual fluctuations in sea level is the steric sea level, but that the trend is strongly influenced by a mass component. On the right side, we see that the sum of the components and the total sea level agree to within the error bars, but some problems arise in the Gulf Stream area (region D), probably caused by sharp gradients in sea level. The sea levels in polygons D and I also contain some interannual signals, which is especially pronounced between 2010 and 2012. The left column shows that the interannual variability is primarily a steric signal. Note that the larger size of the error bars in regions B and I is due to the decrease in altimetry track density closer to the Equator and the elongation of the correlation radius for the interpolation of Argo floats.

Trends
Trends computed from the time series of Fig. 9 are given in Table 4. Close to the Equator (A, B, I and J) and over the whole North Atlantic Ocean, the trend budget is closed within 2 standard deviations no matter which of the MC solutions is used. This confirms the results of Fig. 8. In this region, the GIA correction is relatively small and no sharp gradients or strong features are present in the trend grids, which contribute to proper budget closure. Budget closure is also achieved by all filters in the northeast of the Atlantic (F, G). Again, this is a relatively quiet region, with no significant gradients in trends and a small GIA correction. The results of the Argo+GRACE, however, show a larger spread www.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 Table 4. Trends of total sea level (mm yr −1 ) and their standard deviations from altimetry (Jason) and the sum of steric and mass from Argo (A.) and GRACE (CSR, ITSG) for different filter solutions. NA is the trend for the complete North Atlantic Ocean between 0 and 65 • N. A 0.4 mm yr − 1 drift error is taken into account for altimetry based on the comparisons with tide gauges (Mitchum, 1998(Mitchum, , 2000.  for the different MC solutions. It is important to note that especially region F suffers from some ringing artefacts before the fan filter is applied and that the far northeast is not very well covered by Argo floats. The trends of CSR96-DDK in region G are a bit further off than the other solutions, probably resulting from the striping, as visible in Fig. 6. In the northwest of the Atlantic, the choice of gravity field filter either substantially influences the estimated trends (D and E), or they are just outside of 2 standard deviations (C) for one or more solutions. Using the CSR96-W solution, the budget is closed within 2 standard deviations for all three polygons, whereas the other solutions do not close the budget. For region C, the results of the all filters resemble one another quite well, but some are just outside of 2 standard deviations from altimetry. For region D, the CSR60-W results are far off, but the other results are close again. In this region, sharp gradients occur not only in the MC with the presence of a neighbouring continental shelf but also in the steric component. This might lead to leakage of the continental shelf mass signal or problematic interpolation of the Argo steric sea levels. In addition, for both of the aforementioned regions, the GIA correction on the MC is relatively large. Adding a GIA correction error of 10-20 %, which is smaller than discussed in Sect. 3.3, to the mass trends would close the budget in these regions for all the solutions, except for the CSR60-W solution in region D. In region E, a clear split is visible between the Wiener-filtered CSR solutions, which close the budget, and the other two solutions, which do not close the budget. The difference in results could be caused by the filter not being able to handle the large gradients (Klees et al., 2008) in the MC within this region (Fig. 8). However, if we would again add only a 10-20 % GIA correction error, it would suffice to close the budget for all filters.
Ultimately, only the budget in region H cannot be closed with any of the solutions and there is no strong GIA signal present, which could be responsible for a large bias. In addition, the sea level in this polygon does not exhibit any strong gradients and the number of Argo floats is substantial. This excludes interpolation or filtering problems. Therefore, we argue that this can be explained by a deep-steric effect that could be related to variations in the export of saline water from the Mediterranean (Ivanovic et al., 2014), which is not captured by Argo.
In conclusion, it is possible to close the sea level budget within two standard deviations for 9 out of 10 regions using CSR96-W. If a 10-20 % GIA correction error is taken into account, the budget for 9 out of 10 polygons is also closed for CSR96-DDK and ITSG90-W. This also suggests that the commonly assumed GIA correction error of 20-30 % (Von Schuckmann et al., 2014) is probably overestimated.

Annual signal
We indicated that the seasonal cycles are primarily caused by steric variations in sea level (Fig. 9). By comparing the first column with the last column in Table 5, it becomes clear that in most cases an additional mass signal is required to close the budget in terms of annual amplitude. The discrepancy between Argo and altimetry for the whole North Atlantic Ocean reveals that, on average, in-phase mass signals with an amplitude of approximately 7 mm are required to close the budgets, which is in line with the modelled results of Tamisiea et al. (2010). They modelled, using fingerprints, amplitudes of the MC ranging from 3 to 12 mm, and phases (not shown here) between days 210 and 330, which is in phase with the steric signal. Table 5 shows that for virtually every region the choice of filter matters. On top of this, there is a clear difference between the Wiener-filtered CSR solutions and the other two solutions. Adding the CSR60-W and CSR96-W solutions increases in a few cases even the discrepancy with altimetry, which is caused by an out-of-phase mass signal. Especially in regions A and J, where the amplitude is underestimated and overestimated, respectively. Only in four regions (C, D,  H and I) the amplitude budget closes within 2 standard deviations using these solutions. Even though no error bars are computed for the CSR96-DDK, it is clear that the results are far better in terms of budget closure. The results are comparable to ITSG90-W, which closes 7 out of 10 budgets within 2 standard deviations. CSR DDK5+Argo underestimates the amplitude in region B, while ITSG90-W+Argo overestimates the amplitude with respect to altimetry in region D. In region B, the estimate of ITSG90-W+Argo is relatively small, and in region D, the CSR96-DDK+Argo also relatively large. Note that the number of Argo floats in region B is often small (Fig. 1) and that large gradients in the steric sea level in region D could cause interpolation problems for steric sea level. Secondly, in both northern polygons E and F, both combinations of Argo+GRACE underestimate the amplitude compared to altimetry. Why this underestimation occurs is not completely clear. A likely culprit is the gravity field filtering, but yearly deep convection events in these regions (Vå ge et al., 2009), which transport surface water to depths below 1000 m, and the limited number of Argo floats, could also be contributing factors.
Using ITSG90-W, it is also possible to close the budget on the scale of the whole North Atlantic Ocean (last row of Table 5). The ITSG90-W+Argo performs best in terms of amplitude budget closure in most regions, even though it is often characterized by slightly smaller amplitudes than those derived from altimetry. This suggests that there is either a long-wavelength underestimation of the amplitude in GRACE, an overestimation in altimetry or a missing steric effect in Argo. This is in line with Storto et al. (2015), where on a global scale, steric sea levels computed from reanalyses and gridded T /S fields are found to be smaller than those indirectly derived from altimetry GRACE. Additionally, Marcos et al. (2011) found differences in phase and amplitude of steric-corrected altimetry and the MC from de-striped 500 km Gaussian-filtered GRACE solutions in the North Atlantic.

Residual variability
Time series for the same regions as in Fig. 9 are shown in Fig. 10, but their trend, semiannual and annual signals have been reduced to show the residual variability. For the rest of the regions, plots of the residuals are given in the Supplement. In contrast to the time series for the whole North Atlantic Ocean (not shown), the sub-basin-scale time series show significant interannual variability. Region D, located at the east coast of the United States, shows a jump of 60-70 mm within 3 months at the end of 2009. This jump coincides with a shift in the Gulf Stream described by Pérez-Hernández and Joyce (2014) as the largest in the decade, which they related to the North Atlantic Oscillation. As illustrated in the left column, the shift in the Gulf Stream is primarily of steric nature; however, small deviations in the mass signal are also present. It is remarkable that at the same time on the other side of the Atlantic (region I, bottom figures), an increase in sea level is observed by both altimetry and Argo. This suggest a link between the latitude of the Gulf Stream and sea level temperatures in the east of the Atlantic. In region B, we also observe a small interannual effect by altimetry and Argo. However, the amplitude of the signal is larger for altimetry than is captured by Argo, which suggests either some interpolation issues in an area without many Argo floats or a deep-steric effect. Using any of the filtered CSR or ITSG solutions, it is possible to detect the interannual variability described, probably because most of the signal is of steric origin. However, for the interannual signals that are less pronounced, or for highfrequency behaviour of sea level, there are some differences between the MC solutions. Table 6 shows the fraction of variance of the residual signal of altimetry (trend, semiannual and annual cycles removed) explained by Argo+GRACE.
The third column indicates that Argo in combination with CSR96-W does not explain much of the residual variance, but mostly introduces additional noise, which causes the negative values. Using the DDK5-filtered MC, the explained variance increases, but the best performance is obtained with the CSR60-W and especially the ITSG90-W gravity fields. The last column shows that after reducing the trend, and the semiannual and annual signals, between 24 and 53 % of the residual signal can be explained by the combination of Argo and ITSG90-W. It is remarkable that for the whole North Atlantic Ocean (last row), no variance is explained by the Argo+GRACE, primarily due to the absence of a clear interannual signal. Note that the value −1.21 for the CSR96-W gravity fields indicates that variance increases after its subtraction from altimetry, which indicates that the Argo+GRACE time series is substantially noisier than the altimetry time series.

Conclusions
For the first time, it is shown that sea level budgets can be closed on a sub-basin scale. With the current length of the time series, it is possible to establish budgets over areas of approximately 1/10 of the North Atlantic Ocean. To obtain error bars on the annual amplitudes, trends and time series, errors for altimetry and Argo profiles are propagated from existing correlation functions, while for GRACE full variance-covariance matrices are used. For altimetry, a latitude-dependent intermission bias is applied and it is shown that this leads to trend differences ranging up to 0.8 mm yr −1 if the period from 2004-2014 is considered.
To obtain proper averaged mass for sub-basin-scale polygons, the gravity fields have to be filtered. The application of an anisotropic Wiener filter on the CSR96 solutions leads to the best closure of the budget in terms of sea level trends, with closure in 9 out of 10 regions. In the considered regions, also the CSR96-DDK and the ITSG90-W solutions appear to close just as many budgets when a 10-20 % GIA correction www.ocean-sci.net/12/1179/2016/ Ocean Sci., 12, 1179-1203, 2016 error is added. The results of the CSR96-DDK filter, however, strongly depend on the orientation of averaging area due to residual meridional striping. The strong resemblance between trends also indicates that the errors on the GIA model are probably smaller than the commonly assumed 20-30 %. Furthermore, a large difference in trend between altimetry and Argo+GRACE is observed in front of the Mediterranean Sea where only a small GIA correction is applied. We believe that this originates from steric effects below the considered 1000 m, where saline water enters the Atlantic Ocean from the Strait of Gibraltar and dives to large depths. Further research is needed to confirm this hypothesis. The CSR60-W and CSR96-W solutions appear to underestimate the amplitude of the annual signal substantially. They also suffer from what appears to be leakage around the Amazon and Sahel, regions with a substantial annual hydrological cycle. Using the CSR96-DDK gravity fields and the ITSG90-W solutions, the sum of the steric and mass components becomes significantly closer to that of altimetry, with closure in 7 out of 10 regions. However, it must be noted that the altimetry signals tend to be slightly larger. This is likely due to partial destruction of the signal by filtering of the gravity fields or limited Argo coverage or, in some regions, deepsteric signals.
By removing the semiannual and annual signals and trends, interannual variability can be detected. Since most of the interannual variability in the North Atlantic Ocean is contained in the steric component, the type of filter on the gravity fields is not really important. However, if we look at differences on a month-to-month basis, high-frequency variations or small interannual fluctuations in mass, it is best to use the CSR60-W the ITSG90-W solutions, because the fraction of explained variance of the altimetric sea level time series by the sum of the components using these solutions is largest. Using the ITSG90-W solution, 24-53 % of the variability in the altimetry-derived sea level time series is explained. The CSR96-W solution only introduces noise and explains virtually no residual variability of the altimetry time series. Especially in the 4-day repeat orbits in 2004 and even the months around them, the Wiener-filtered solutions do not give proper estimates of the MC, which partly contributes to a lower explained variance.
To summarize, using the ITSG Wiener-filtered solution, the trend budgets close when an error of 10-20 % on the GIA correction is assumed. They perform, together with the standard DDK5-filtered CSR solution, best in terms of annual amplitude budget closure. Additionally, the combination of ITSG mass and Argo steric sea levels explains the largest fraction of variance in altimetry time series. Based on this, the best option to establish budgets, at scales considered in this paper, is the ITSG90-W solution. However, due to residual striping in the trend grids from the static background field that are added back after Wiener filtering, one must take care when averaging the MC over even smaller regions, or meridionally oriented polygons, which is a even a bigger problem for the standard CSR96-DDK solutions.

Appendix A
The Wiener filter is, in principle, a joint inversion between the spherical harmonic coefficients of the background field x b and those of the time-varying gravity fieldx. Suppose that C x is the error variance-covariance matrix ofx and D x the signal variance-covariance matrix; then the filtered coefficientsx f are expressed aŝ Assuming the spherical harmonic coefficients of the background field are zero, this equation reduces to Eq. (27). Its filtered variance-covariance matrix C x,f is computed using Since the matrices (C −1 x +C −1 x ) −1 and C −1 x are symmetric, it is possible to simply change the order underneath the transpose sign and leave which is further simplified by using the identity C −1 x C x = I to Finally, this equation is rewritten, such that which eventually is simplified to Eq. (28).

Information about the Supplement
We added data in the supplement, so other people can reproduce trend plots and grids from the results section.
The Supplement related to this article is available online at doi:10.5194/os-12-1179-2016-supplement.