4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry

. The reconstruction of sea surface currents from satellite altimeter data is a key challenge in spatial oceanog-raphy, especially with the upcoming wide-swath SWOT (Surface Water and Ocean and Topography) altimeter mission. Operational systems, however, generally fail to retrieve mesoscale dynamics for horizontal scales below 100 km and timescales below 10 d. Here, we address this challenge through the 4DVarnet framework, an end-to-end neural scheme backed on a variational data assimilation formulation. We introduce a parameterization of the 4DVarNet scheme dedicated to the space–time interpolation of satellite altimeter data. Within an observing system simulation experiment (NATL60), we demonstrate the relevance of the proposed approach, both for nadir and nadir plus SWOT al-timeter conﬁgurations for two contrasting case study regions in terms of upper ocean dynamics. We report a relative improvement with respect to the operational optimal interpolation between 30 % and 60 % in terms of the reconstruction error. Interestingly, for the nadir plus SWOT altimeter con-ﬁguration, we reach resolved space–timescales below 70 km and


Introduction
Satellite altimetry is the main data source for the observation and reconstruction of sea surface dynamics on a global scale (Chelton et al., 2001).Current satellite altimeters only deliver along-track nadir observations.This results in a very scarce sampling of the ocean surface.Interpolation schemes are then key components of the operational processing of satellite altimetry data.Current operational products (Taburet et al., 2019;Lellouche et al., 2018) show however a limited ability to retrieve the full-range of mesoscale dynamics.Upcoming wide-swath altimetry SWOT mission, see e.g.(Gaultier et al., 2015), will provide for the first time two-dimensional observation of the sea surface height.The space-time sampling of satellite altimeters will however still remain scarce for a long time, which has motivated a recent research literature towards the improvement of the interpolation of satellite-derived SSH fields, see e.g.(Lopez-Radcenco et al., 2019;Lguensat et al., 2017;Beauchamp et al., 2021;Ballarotta et al., 2019).
Here, we explore further this avenue and more specifically the 4DVarNet framework recently introduced in Fablet et al. (2021).As it relies on a variational data assimilation formulation, it appears particularly suited to the spacetime interpolation of sea surface variables from irregularly-sampled observations.We propose a parametrization of the 4VarNet scheme dedicated to SSH interpolation from satellite altimeter data and report OSSE (Observing System Simulation Experiment) results to support the relevance of the proposed scheme.Our main contributions are as follows: The proposed 4DVarNet-SSH scheme delivers an end-to-end neural architecture using as inputs raw satellite altimeter data and optimally-interpolated fields.We also address uncertainty quantification issues using an ensemble method.
For OSSE on two case-study regions, respectively along the GULFSTREAM and for an open ocean area dominated by mesoscale eddy dynamics, the 4DVarNet-SSH scheme outperforms previous work and significantly improves performance metrics with respect to the operational processing.We also support the relevance of wide-swath SWOT altimeter data to significantly improve the reconstruction of sea surface dynamics compared to nadir-only satellite altimeters.
We deliver an open source code for the proposed 4DVarNet-SSH scheme.It relies on a Pytorch and associated state-of-the-art packages.As such, it supports multi-GPU configuration and can scale up to large-scale domains.
We believe these contributions to contribute to the development of deep learning approaches for satellite altimetry, and more broadly for operational oceanography.
This paper is organized as follows.Sect. 2 briefly reviews key methodological aspects and related work.We describe the proposed 4DVarNet-SSH approach in Sect. 3 and Sect. 4 presents the considered OSSE setting.We report our results in Sect. 5 and discuss further our main contributions in Sect.6.

Background and related work
From a methodological point of view, interpolation problems in geoscience are classically regarded as data assimilation issues (Asch et al., 2016).They aim at estimating the state x t of a multi-dimensional dynamical system: The first equation relates to the forecast step which describes the evolution of the system from time t to t + dt according to the potentially non-linear model M .The second equation introduces the observations y t at time t where H t is the corresponding observation operator, usually known, but also potentially trainable.η(t) is the model error and ε(t) the observation error.Both errors are generally assumed to be Gaussian, unbiased and uncorrelated over time.When discretized on a spatio-temporal grid where index k = 1, • • • , T refers to time t k , their associated covariance matrices write Broadly speaking, a vast family of data assimilation methods stems from the minimization of some energy or functional which involves two terms, a dynamical prior and an observation term.We may distinguish two main categories of data assimilation approaches (Evensen, 2009): variational and statistical data assimilation.Specifically, within a variational data assimilation framework, the state analysis x a results in a gradient-based minimization of the defined variational cost J (x) = J Φ (x, y, Ω) (Asch et al., 2016).The latter generally combines the sum of an observation term and a regularization term involving an operator Φ: 4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry (2) In a weak-constrained 4DVar scheme (Carrassi et al., 2018), prior operator Φ is a time-stepping operator associated with dynamical model M .H is the observation operator, Ω = {Ω k } is the set of subdomains of D with observations at time t k , k = 1, • • • , T .Q and R are, respectively, the background and the observation error covariance matrices and Φ(x) denotes here the background estimation, i.e. the physical prior, more often noted as the deterministic forecast x b .Regarding statistical data assimilation, many state-of-the-art methods rely on optimal formulation (OI), Interestingly, the analyzed state obtained from OI matches the minimization of the 3DVar cost function, which relates to the stationary case of 4DVar formulation described above, see e.g.Carrassi et al. (2018).This establishes the formal link between the statistical DA frameworks and the optimal control theory used in the variational formulation.Optimal Interpolation (OI) has been used for decades (Taburet et al., 2019) for the interpolation of along-track nadir altimeter datasets and is still used today for the operational Marine (CMEMS) and Climate (C3S) production of the E.U.Copernicus program.It involves a significant smooting, solving spatial scales up to 150km.Extensions of OI schemes to multi-scale to better account for mesoscale sea surface dynamics have recently been proposed (Ardhuin et al., 2020;Ubelmann et al., 2016).Variational DA schemes have also been widely explored for the assimilation of satellite altimeter data in ocean general circulation models, see e.g.Ngodock et al. (2015), Benkiran et al. (2021) or Li et al. (2021).Previous works have also considered quasigeostrophic (QG) dynamics as an approximate and reduced-order dynamical prior for sea surface dynamics, leading to state-of-the-art performance (Ubelmann et al., 2016;Le Guillou et al., 2020).Overall, BOOST-SWOT 2020 data challenge (https://github.com/ocean-data-challenges/2020aSSHmappingNATL60)provides a representative benchmarking framework to assess the performance of SSH mapping schemes for nadir-only and nadir+SWOT altimetry datasets.It further stresses the limited ability of the state-of-art to retrieve fine-scale dynamics below 1 • and 10 days.
Whereas model-driven and optimal interpolation approaches are the state-of-the-art solutions for operational products, data-driven strategies have recently emerged as promising alternatives to improve the space-time resolution of interpolated products.We may cite among others DINEOF (Beckers and Rixen, 2003a;Alvera-Azcárate et al., 2005;Alvera-Azcárate et al., 2009) and the Analog Data Assimilation, AnDA (Lguensat et al., 2017;Tandeo et al., 2020) and the recent developments of deep learning schemes (Barth et al., 2019;Beauchamp et al., 2020) .Beauchamp et al. (2020) have reported a benchmarking experiment, which supported the relevance of data-driven schemes compared with the operational OI product.Here, we further explore deep learning approaches, and more particularly the 4DVarNet scheme (Fablet and Chapron, 2022), which bridges variational data assimilation and deep learning.As detailed herefater, we introduce a parameterization of the 4DVarNet scheme dedicated to SSH interpolation issues and demonstrate its relevance in the context of the benchmarking settings introduced in BOOST-SWOT 2020 data challenge.

Method
This section details the proposed learning-based framework for the interpolation of satellite altimeter data.We first briefly review 4DVarNet framework recently introduced in Fablet et al. (2021) in Sect.3.1 and present the proposed parameterization for SSH mapping from nadir and SWOT altimeter data in Sect.3.2.We describe the resulting Pytorch package and associated implementation details in Sect.3.4 and the proposed learning setting in Sect.3.3.

4DVarNet framework
4DVarNet framework introduced in Fablet et al. (2021) provides a generic approach for the learning of 4DVar models and solvers.They have been shown to outperform classic 4DVar solver for toy case-studies such as Lorenz-63 and Lorenz-96 dynamics, when considering partially-observed systems.4DVarNet framework can be regarded as an extension using trainable gradient-based solvers of the deep learning scheme, which led to the best SSH interpolation performance in our previous work (Beauchamp et al., 2020).
From a methodological point of view, 4DVarNet framework derives an end-to-end neural architecture from an underlying variational DA formulation: 4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry y(Ω) Q by a standard mean-square norm for sake of simplicity.In the regularization term, we substitute to the traditional dynamical prior M a neural operator Φ which convolutional architecture.Then, we can exploit the automatic differentiation tools embedded in deep learning framework to consider the following iterative gradient-based solver for the minimization of variational cost J Φ w.r.t.state x: where L is a convolutional LSTM model, see e.g.(Shi et al., 2015), α a normalization scalar and T a linear mapping.This iterative rule based on a trainable LSTM operator is similar to that classically used in meta-learning schemes (Andrychowicz et al., 2016).Due to the ability of LSTM models to capture long-term dependencies, it results in a trainable gradient descent with momentum.
Overall, a 4DVarNet scheme defines a neural architecture which runs a predefined number of iterative gradient-based update (see Eq. 4).The resulting neural architecture is referred as an end-to-end architecture in the sense as it uses as inputs raw observation data y and an initial guess x (0) and as outputs the reconstructed state x.Let us denote by Ψ Φ,Γ (x (0) , y, Ω) the output of the 4DVarNet architecture for given priors Φ and solver Γ, see Fig. 1 and Algorithm below, the initialization x (0) of state x and the observations y on domain Ω.
Then, the joint learning of operators {Φ, Γ} is stated as the minimization of a reconstruction cost, see Sect.3.3: In Appendix A, a parameter-free fixed point version of the solver is also given, based on the previous results of (Beauchamp et al., 2020).In addition, Beauchamp et al. (2021) have already shown how the iterative gradient-based update is more efficient that the simpler fixed-point formulation.

4DVarNet-SSH parameterization
The proposed 4DVarNet-SSH framework aims at exploiting and improving the mapping performance of current operational OI products.Given that OI products retrieve consistent large-scale dynamics, we rely on the following multiscale decomposition: where the anomaly dx is seen as the difference between the true state x and the large-scale components x.Regarding the observations data, let us denote by y(Ω) = {y k (Ω k )} the partial and potentially noisy altimetry observations associated with masks Ω = {Ω k } ⊂ D, where Ω k corresponds to the gappy part of the field and index k refers to time t k .We use the operational OI product as a gap-free obervation data, denoted as y, for state component x, whereas the observation data for anomaly dx is y − y over domain Ω.
Numerical experiments showed that an augmented state formulation led to better interpolation performance regarding potential stripping artifacts due to the nadir along-track sampling.This results in the application of 4DVarNet model (see Eq. 5) to augmented states x and observations ỹ defined as follows: This augmented state parameterization introduces two anomaly components.While only the first one is actually observed, the reconstructed SSH state is given by x + dx 2 .
Following Fablet et al. (2021), the operator Φ follows a purely data-driven parameterization with a two-scale residual architectures involving bilinear units (Fablet et al., 2020).The number of residual blocks is set to 2 and the bilinear units are made of two hidden convolutional layers, respectively with linear and ReLU activations, followed by a linear scheme combining the outputs of the second layer.A final convolutional layer with linear activation is involved to bring the outputs back to the initial state dimension.In its current implementation, Φ contains about 500.000parameters.In any case, the number of gradient iterations for the solver Γ is fixed at 5. Complementary tests showed that a higher number of iterations leads to a large increase of the training time (because of the implicit number of parameters which grows linearly with this number of iterations) without a significant gain in terms of 4DVarnet reconstruction skills.
Regarding the initial state for iterative gradient-based rule (4), we consider the OI field y for state component x, y − y for anomaly component dx 1 and a zero state for anomaly component dx 2 .For anomaly component dx 1 , gaps are initialized to 0.
4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry

Learning setting
We implement a classic supervised learning strategy using gap-free targets.The considered training loss L combines reconstruction losses and additional regularization terms: i.e., the L2-norm of the difference between state x and reconstruction x as well as for the their gradients, and regularisation losses according to prior Φ to enforce that both the true states and the reconstructed ones are correctly encoded by prior Φ. w = {w i }, i = 1, • • • , N denotes a weighting vector along the data assimilation window of size N (=7 here).
To give more importance to the center of the DAW, we use: This training loss is used in Sect. 4 for the OSSE-based BOOST-SWOT data challenge framework.
Let precise that state sequences x k±l = x k−l:k+l of length N = 2l + 1 are used in the training for the interpolation problem.The idea is to optimize the results for the time t k at the center of the window [t k−l ;t k+l ].The value of N has to be chosen according to the dynamics of the geophysical field considered.In the following experiments, we use a value of N = 7 which seems to be enough to describe the spatio-temporal correlations of the anomaly between the Ground Truth and the DUACS OI scheme.
Regarding the training configuration, when the domain is small (see GULFSTREAM and OSMOSIS regions definition in Sect.4), we use a single GPU and Adam optimizer with batch size of 2 over 200 epochs.The same set of parameters holds for larger domain (NATL and cNATL, see again Table 1) but we use the 4DVarNet-distributed version of the code over 4 GPUs.The computational time of the training procedure lies between 4 and 5 hours for the small-domain setup and between 7 and 8 hours for the large-domain setup.

Implementation aspects
We provide an open source PyTorch implementation of the 4dVarNet-SSH scheme1 .Pytorch is a state-of-the-art deep learning framework.We benefit from associated packages such as lightning and hydra to provide a high-level environment and make easier the reproduction of the experiments as well as the development of other applications.Through lightning package, our implementation supports multi-GPU distributed learning configurations.This may be highly relevant to speed up the training process.
Regarding computational issues, the OSSE-based applications, see Sect. 4, involves the processing of 7x240x240 tensors (i.e., 7-day time series over a 12 • × 12 • domain with a 1/20 • resolution).GPU with a significant RAM (typically above 30Go), such as NVidia V100, A40, A100, can process such tensors through the proposed 4DVarNet architecture.The direct training 4DVarNet models over larger spatial domains is however limited by the GPU memory.To address this issue, we develop a specific data management module, through the so-called dataloaders.Our dataloader module automatically extracts patches of a predefined size (typically, 7x240x240 in the reported experiments) from the considered training dataset according to stride parameters as sketched in Fig. 2. One can exploit the same approach to apply a learnt model to a large domain during the evaluation or production stage.In both cases, we benefit from the fully-convolutional feature of the considered neural architecture.This guarantees that, up to border effects, the 4DVarNet processing is translation-invariant.
4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry 4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry The GULFSTREAM and OSMOSIS domains (blue and red solid lines in Fig. 3) are the domains used by the BOOST-SWOT project in the framework of the NATL60 OSSE throughout the different related studies, see their 2020 ocean data challenges and Le Guillou et al. (2020).Because we aim at exploring the capabilities of 4DVarNet to deploy at the basin scale, we also propose the two alternate GULFSTREAM2 and OSMOSIS2 domains (blue and red dashed lines in Fig. 3), with similar dynamical properties than the two initial domains, as well as a larger domain centered in the North Atlantic basin (cNATL, purple dashed lines).The full extent of the subdomains are summarized in Table 1: Table 1: Description of the NATL subdomains used for assessing 4DVarNet capabilites generalization he GULFSTREAM regions display physical processes 100 times more energetic at scales larger than 100km with a greater temporal variability than the OSMOSIS regions.As a consequence, the SSH spatial gradient at scales above 100km is lower for OSMOSIS regions which explains why we can see more small scales related structures on such domains.In addition of their intrinsic differences in terms of dynamical regimes, the latitudes of GF-based and OSMOSIS-based regions implies different SWOT temporal samplings.For OSMOSIS regions, one SWOT observation is available every day, while over the low-latitude GULFSTREAM domains, the SWOT sampling is irregular leading to sequences of several days with only pseudo-nadir observations.Over these regions, the Sea Surface Height (SSH) resolution of the nature run is downgraded to 1/20 • , which is enough to capture both mesoscale dynamical regimes and the OSMOSIS-related smaller scales, while avoiding unnecessary heavy computation time.
The NATL60 nature run will then be used as the reference Ground Truth (GT) in an observing system simulation experiments (OSSE).The pseudo-altimetric nadir and SWOT observational datasets will be generated by a realistic sub-sampling of satellite constellations.

Simulated altimetry datasets
Regarding the pseudo-nadir altimetry dataset, representative of the current pre-SWOT observational altimetric dataset, we use the groundtracks of 4 altimetric missions (TOPEX/Poseidon, Geosat, Jason-1 and Envisat) picked up from the 2003 constellation to interpolate the NATL60 simulation from October 1st, 2012 to September 29th, 2013.A Gaussian white noise with variance σ 2 = (4 • • • 9)cm 2 is added to the interpolated NATL60 simulation by the SWOTsimulator tool to mimic a noise with a spectrum of error consistent with global estimates from the Jason-2 altimeter (Dufau et al., 2016).We aggregate the nadir pseudo-observations on a daily basis to procude the gappy daily fields used as inputs by 4DVarNet-SSH.We proceed similarly to simulate SWOT pseudo observations using the swotsimulator tool (Gaultier et al., 2015) in its swath mode with an along-track and across-track 2km spatial resolution (the same theoretical resolution that the upcoming SWOT mission derived products is expected to provide).Let us note that we consider error-free SWOT pseudo-observations.

Evaluation framework
Our evaluation framework exploits and extends the one introduced in Le Guillou et al. ( 2020) as follows: Training and evaluation setting: We train all learning-based models using the time period from 2013, February 4 to September 30 as training period.During the training procedure, we select the best model according to metrics computed over the validation period from 2013, January 1 to February 2. Overall, we evaluate performance metrics over the test period from 2012, October 22 to December 2 for intercomparison purposes.
Evaluation metrics: We use BOOST-SWOT DC metrics to benchmark 4DVarnet-SSH scheme with respect to the state-of-the-art SSH interpolation schemes.They comprise: RMSE-scores, in terms of mean -µ(RMSE)-and standard deviation -σ (RMSE)-, and minimal spatial and temporal scales resolved (λ x and λ t).We refer the reader to Le Guillou et al. (2020) for the detailed description of these metrics.Besides this quantitative metrics, we analyse the space-time distribution of the interpolation error.We also explore the impact of the interpolaion onto the characterization of mesoscale eddy dynamics.Based on the work of Mason et al. (2014), we detect anticyclonic and cyclonic eddies in the Ground Truth NATL60 outputs and interpolated SSH fields using py-eddy-tracker toolbox (https://py-eddy-tracker.readthedocs.io)and analyze how key features of matching eddies, such as speed radius (km), outter radius (km), amplitude (cm) and speed max (cm/s), are retrieved.

Results
This section presents the considered OSSE for the evaluation of the 4DVarNet-SSH scheme.We first report the benchmarking experiments with respect to the state-of-the-art (Sect.5.1).Sect.5.2 studies the impact of wide-swath SWOT data to improve the reconstruction of finer-scale SSH pattern.Last, we analyze generalization issues and uncertainy quantitication in Sect.5.3 and 5.4.

Benchmarking experiments
Regarding the BOOST-SWOT OSSE data challenge on the GULFSTREAM domain, we provide both performance with 4 nadirs and 1 swot + 4nadirs in Table 2.For both settings, the improvement is quite significant with respect to all benchmarked schemes, i.e. not only compared to DUACS OI (Taburet et al., 2019), but also with respect to the recently proposed SSH interpolation schemes Le Guillou et al. (2020)  4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry   We can draw similar conclusions from the experiments reported in Table 3 and Fig. 7 for the OSMOSIS domain.We may emphasize that 4DVarNet-SSH interpolation for the 1 swot + 4 nadirs configuration, see e.g.Fig. 7, retrieves most of the fine-scale features of the SSH fields, which are smoothed out by the optimal interpolation.
4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry

Impact of SWOT data on the interpolation performance
Thanks to its ability to reconstruct finer-scale patterns, 4DVarNet-SSH complements the assessment of the potential impact of SWOT data onto the reconstruction of mesocale sea surface dynamics.Though the interpolation performance (Tab. 2 and 3) improves with the use of SWOT data for all the interpolation methods, the relative improvement strongly depends on the interpolation method.Interestingly, contrary to OI DUACS scheme, we report a significant improvement when using SWOT data with 4DVarNet-SSH for both GULFSTREAM and OSMOSIS regions.These results emphasize the ability of our scheme to exploit irregularly-sampled high-resolution data.For instance, for the OSMOSIS region, we truly benefit from SWOT data to reconstruct mesoscale dynamics up to 0.4 • and 7 days, whereas OI DUACS smooths out the altimetry signals in the mesoscale range below 1 • and 14 days.
While we report relative gains of 20 − 25% for the GULFSTREAM region for the different evaluation metrics, it reaches 40 − 60% for the OSMOSIS domain.We interpret these results as a direct consequence of differences in the space-time sampling of SWOT data for these two regions.As revealed by Fig. 8(b) and 9(b) no SWOT data may be available over 4 (resp.1) consecutive days for the GULFSTREAM (resp.OSMOSIS) domain, This time variability of the sampling pattern translates for the GULFSTREAM region into a periodic variability of the MSE time series.By contrast, the OSMOSIS region leads to a much lower time variability of the interpolation performance.The PSD-based analysis reported in Fig. 8(c) and 9(c) further supports these conclusions.
To complement with the analysis of contribution of SWOT altimetry on the interpolation performance, Fig. 10 displays eddy identification results on 2022, October 25 after application of a 200km high pass filter when using 1 swot + 4 nadirs configuration.Additional Figures are given in Appendix B for illustrations for both the GULFSTREAM and OSMOSIS domains in the two observational configurations (4 nadirs and 1 swot + 4 nadirs).Clearly, 4DVarNet-SSH improves the matching between true and interpolated eddies (39 vs 35), and the features of the matching eddies are also more similar to those of the true eddies, in terms of speed radius (km), outter radius (km), amplitude (cm) and speed max (cm/s), with respect to their true values.Again, the interpolation of eddy-related dynamics significantly improves with the exploitation of SWOT data.
4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry

Generalization performance
Whereas the results reported in the previous sections involve 4DVarNet-SSH models evaluated on the same domain as the training one, we assess how 4DVarNet-SSH schemes trained for a specific domain may also apply to another one.
Besides the GULFSTREAM and OSMOSIS regions, we consider three additional domains: cNATL domain: a larger 20 • ×40 • North Atlantic domain, which involves a variety of dynamical regimes; GF2 domain: a domain similar to the reference GULFSTREAM domain in terms of upper ocean dynamics, but with a disjoint spatial extent; OSMOSIS2 domain: a domain similar to the reference OSMOSIS domain in terms of upper ocean dynamics, but with a disjoint spatial extent; For the 1 swot + 4 nadirs configuration, we train 4DVarNet-SSH schemes on these three domains.We then evaluate how these models compare with the models reported in Sect.5.1 for the GULFSTREAM and OSMOSIS domains.
We also evaluate how the different models apply to the cNATL domain.Table 4 summarizes the resulting performance metrics.
As expected for each evaluation domain, we retrieve the best performance for the model trained on this domain.For the GULFSTREAM regions, the difference in terms of minimal temporal scales is negligible while the minimal spatial scales may exhibit an increase of 30% using the model trained on the GULFSTREAM2 domain.This does not hold in the other way when applying on the GULFSTREAM2 domain a model learnt on GULFSTREAM, with similar spatial scales resolved in the end.The same conclusions hold for the OSMOSIS regions, except that the minimal resolved temporal scales also display a slight increase (lower than 20%) over OSMOSIS.These results are consistent with the dynamical properties given in Sect. 4 and support the generalization capabilities of 4DVarNet-SSH schemes.
The comparison with the performance metrics reported for the model trained on the cNATL domain suggests that the considered 4DVarNet-SSH parameterization applies to a regional scale.This training configuration only leads to a relatively marginal gain w.r.t.OI DUACS when applied to the GULFSTREAM region.We report a slightly better performance for the OSMOSIS domain.We expect future work to explore new 4DVarNet-SSH parameterizations, which could better account for basin-scale variabilities.Besides gap-free fields, operational interpolation products generally require to provide some evaluation of the reconstruction uncertainty.While this is a built-in feature of OI and statistical DA methods, uncertainty quantification may 4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry involve specific methodological or computational methods for other data assimilation schemes, among which ensemble methods represent a widely-considered family of approaches, see e.g.Asch et al. (2016).Their common feature is to generate an ensemble of solutions, generally through some randomization process.

Uncertainty Quantification for 4DVarNet-SSH interpolations
Here, we benefit from the stochastic nature of the training procedure of 4DVarNet-SSH schemes (Goodfellow et al., 2016).Similarly to most deep learning schemes, we exploit a stochastic gradient descent during the learning stage and a random initilization of model parameters.As such, for a given training configuration, we can learn an ensemble of 4DVarNet-SSH schemes by running multiple training procedures.We apply this approach to build an ensemble of nine 4DVarNet-SSH schemes for a given training configuration, which comprises a training dataset, the considered 4DVarNet-SSH parameterization and given training hyperparameters (i.e., number of epochs, learning rates, optimizers).For a given observation time window, we then retrieve nine interpolations, from which we can compute a median field and the associated standard deviation.We report in Tab. 5 the performance metrics for the GULFSTREAM domain of the nine trained models as well as the median model.It reveals the internal variability of the training process.Though it does not reach the best performance, the median model combines a resolved spatial scale below 0.7 • and a resolved time scale below 5 days, which is only the case of 6 over 9 of the trained models.Fig. 12(a) further illustrates this aspect.Interestingly, the standard deviation of the ensemble of 4DVarNet-SSH schemes correlates to the interpolation error, with an R2 coefficient of determination equals to 0.86, see Fig. 13.As such, it can be regarded as an indicator of the interpolation error.

Conclusion and Discussion
This paper introduced 4DVarNet-SSH scheme, an end-to-end neural architecture for the space-time interpolation of SSH fields from nadir and wide-swath satellite altimetry data.4DVarNet-SSH scheme draws from recent methodological development to bridge data assimilation and deep learning with a view to learning 4DVar DA models and solvers from data.Numerical experiments within an OSSE setting support the relevance of 4DVarNet-SSH scheme with respect to the state-of-the-art.
We discuss further our main contributions according to three aspects: the added value of deep learning scheme for satellite altimetry and operational oceanography, the exploitation of upcoming SWOT data and the ability to scale up learning approaches from regional case-studies to the global scale.
Deep learning for satellite altimetry and operational oceanography: This study contributes to a growing research effort regarding the potential benefit of deep learning schemes for space and operational oceanography challenges, see e.g.Ballarotta et al. (2020).Given the sampling of available satellite and in situ data sources, interpolation problems naturally arise as critical challenges.This study brings an additional evidence of the potential of deep learning schemes to outperform the state-of-the-art operational techniques, generally based on optimal interpolation and data assimilation.Importantly, we do not rely on the off-the-shelf application of some reference deep learning architectures.The considered class of neural architectures relates to a variational DA formulation, such that it can be regarded as the implementation of a neural and trainable version of a DA model and solver.Our results for satellite altimetry are in line with other recent studies for other ocean parameters, such as sea surface temperature (Barth et al., 2019), suspended sediments (Vient et al., 2022) and 3D temperature and salinity fields (Pauthenet et al., 2022).All these studies support the potential of neural approaches to retrieve finer-scale variabilities from available satellite and/or in situ observations.Regarding satellite altimetry, future challenge includes the application to real altimetry datasets, see e.g. the 2021 Observation System Experiment (OSE) BOOST-SWOT data challenge https://github.com/ocean-data-challenges/2021aSSHmappingOSE,as well as the exploitation of multimodal synergies (Fablet and Chapron, 2022).
Making the most of SWOT data: Our study brings new evidence that the wide-swath space-time sampling of upcoming SWOT mission could lead to a very significant improvement of the reconstruction of mesoscale sea surface 4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry dynamics.For the considered case-study regions, with contrasted dynamical regimes in play and revisit times of SWOT orbits, we report relative gains from 20% to 60% compared to nadir altimetry data only in terms of RMSE and resolved space-time scales.These results assume an error-free SWOT product.Therefore, exploring further how these results could generalize to error-prone (Esteban-Fernandez, 2014;Gaultier and Ubelmann, 2010) and uncalibrated SWOT data (Febvre et al., 2022) is a critical challenge.Preliminary preprocessing of the pseudo-SWOT observations (Metref et al., 2020) to filter out its correlated components and avoid major issues in the assimilation and/or learning process of the interpolation methods may also be considered.The extension of the considered OSSE to multi-swot configurations could also provide new means to optimize the deployment of multi-satellite configurations in coming years.
Scaling up to a global scale with learning-based scheme: Our numerical experiments focused mainly on a regionalscale, typically 10 • × 10 • domains as illustrated by the GULFSTREAM and OSMOSIS regions.The reported results support the relevance of the proposed 4DVarNet-SSH parameterization to account for such regional space-time variabilities.Scaling up to a basin scale or even the global scale naturally arises as a key challenge for future work.
Through the built-in features of Pytorch framework and associated packages, our open-source code can leverage multi-GPU distribution learning schemes and on-the-fly mini-batch generation tools to deal with larger-scale dataset from a computational point of view.To account for a greater diversity of dynamical regimes in play on the global scale, or even on a basin scale, it also seems necessary to explore more complex 4DVarNet-SSH parameterizations, especially regarding dynamical prior Φ.This could benefit from the variety of neural architectures recently introduced in computational imaging (Barbastathis et al., 2019), especially using attention mechanisms (Vaswani et al., 2017) to achieve some decomposition of the underlying space-time variabilities.4DVarNet-SSH: end-to-end learning of variational interpolation schemes for nadir and wide-swath satellite altimetry

Figure 1 :
Figure 1: Sketch of the gradient-based algorithm: the upper-left stack of images corresponds to an example of SSH observations temporal sequence with missing data used as inputs.The upper-right stack of images is an example of intermediate reconstruction of the SSH gradient at iteration i while the bottom-left stack of images identifies the updated reconstruction fields used as new inputs after each iteration of the algorithm.

Figure 2 :Figure 3 :
Figure 2: Patch-based strategy: the whole spatio-temporal dataset is split into small patches.The temporal size of the patches corresponds to the data assimilation window.The spatial size of the patches is chosen to match the maximal distance with spatial autocorrelation of the SSH Figure 4: NATL60 Ground Truth (a) and its gradient (b) ; one day accumulated along-track 4 nadirs (c) and wide-swath SSH pseudo-observations + 4 nadirs (d) on 2012, October 25 (domain: GULFSTREAM)

Fig. 6
Fig. 6 displays the SSH gradient field of DUACS OI and 4DVarNet-SSH interpolations on October, 25.The comparison to the associated Groundtruth displayed in Fig. 4(b) clearly reveals the improvement brought by 4DVarNet-SSH, in particular along the main meandrum of the GULFSTREAM.

Figure 12 :
Figure 12: Interpolation performance of an ensemble of nine 4DVarNet-SSH models trained using similar training parameters (number of epochs, learning rates, optimizers, gradient steps, etc.) but different random initialization of both Φ and Γ weights.(a) : spatial RMSE time series on the BOOST-SWOT DC evaluation period ; (b) 4DVarNet median run (2012-10-25, GF domain) and (c) its spatial standard deviation

Figure 13 :
Figure 13: Standard deviation of the 4DVarNet-SSH median ensemble interpolation error (left) and average of the daily standard deviation interpolation errors on the test period (right)

Figure 15 :
Figure 15: 4DVarNet generalization capabilities (GULFSTREAM): spatial, temporal and spectral performance on the BOOST-SWOT DC evaluation period based on three different training domains: GULFSTREAM, GULFSTREAM2 and cNATL

Figure 16 :
Figure 16: 4DVarNet generalization capabilities (OSMOSIS): spatial, temporal and spectral performance on the BOOST-SWOT DC evaluation period based on three different training domains: OSMOSIS, OSMOSIS2 and cNATL

Figure 17 :
Figure 17: 4DVarNet generalization capabilities (cNATL): spatial, temporal and spectral performance on the BOOST-SWOT DC evaluation period based on three different training domains: cNATL, GF and OSMOSIS

Figure 18 :
Figure 18: 4DVarNet generalization capabilities (GULFSTREAM2): spatial, temporal and spectral performance on the BOOST-SWOT DC evaluation period based on three different training domains: GULFSTREAM2, GULFSTREAM and cNATL

Figure 19 :
Figure 19: 4DVarNet generalization capabilities (OSMOSIS2): spatial, temporal and spectral performance on the BOOST-SWOT DC evaluation period based on three different training domains: OSMOSIS2, OSMOSIS and cNATL

Table 2 :
4DVarNet-SSH performance on the GULFSTREAM domain compared to DUACS OI (traditional covariancebased Optimal Interpolation), BFN (Back and Forth Nudging of A QG model), MIOST (Multi-scale OI), DYMOST (Dynamic OI accounting for non-linear temporal propagation of the SSH fields), and Fixed-point versions with 10 and a single iteration of the 4DVarNet solver, over the period from2012-10-22 to 2012-12-02 (42 days)

Table 5 :
4DVarNet performance on the GULFSTREAM domain based on nine different trainings with random initialization of both Φ and Γ weights but similar training parameters (number of epochs, learning rates, optimizers, gradient steps, etc.) over the period from 2012-10-22 to 2012-12-02 (42 days) The open-source 4DVarNet version of the code is available at (https://doi.org/10.5281/zenodo.7186322).The datasets is shared through the BOOST-SWOT data challenge also available on Github (https://github.com/ocean-data-challenges/2020aSSHmappingNATL60)Videosupplement The animations corresponding to the 4DVarNet comparison to DUACS OI on the BOOST-SWOT DC test period are given for both GULFSTREAM and OSMOSIS domains in the 4 nadirs and 1 swot + 4 nadirs configuration.They can be found on the AI CHair OceaniX Youtube channel:Author contributions Maxime Beauchamp designed the experiments, ran the analysis of the results and wrote the paper.Ronan Fablet is the principal investigator of the 4DVarNet methodology.Quentin Febvre led the developments of the 4DVarNet implementation on large domains.Hugo Georgenthum ran the experiments used in this paper.All the authors actively participate to the open-source 4DVarNet version of the code available at (https://doi.org/10.5281/zenodo.7186322) GF (4 nadirs): https://youtube.com/shorts/QKXukBRd5EGF (1 swot + 4 nadirs): https://youtube.com/shorts/i91Z1pMm4gYOSMOSIS (4 nadirs): https://youtube.com/shorts/Pxcsd0Afco0OSMOSIS (1 swot + 4 nadirs): https://youtube.com/shorts/HbVSJFtdG6Q