Satellite-based Time-Series of Sea Surface Salinity designed for Ocean and Climate Studies

Sea Surface Salinity (SSS) is an increasingly-used Essential Ocean and Climate Variable. The SMOS, Aquarius, and SMAP satellite missions all provide SSS measurements, with very different instrumental features leading to specific measurement characteristics. The Climate Change Initiative Salinity project (CCI+SSS) aims to produce a SSS Climate Data Record (CDR) that addresses well-established user needs based on those satellite measurements. To generate a homogeneous CDR, instrumental differences are carefully adjusted based on in-depth analysis of the measurements themselves, together with some limited use of independent reference data. An optimal interpolation in the time domain without temporal relaxation to reference data or spatial smoothing is applied. This allows preserving the original datasets variability. SSS CCI fields are well-suited for monitoring weekly to interannual signals, at spatial scales ranging from 50 km to the basin scale. They display large year-toyear seasonal variations over the 2010-2019 decade, sometimes by more than +/-0.4 over large regions. The robust standard deviation of the monthly CCI SSS minus in situ Argo salinities is 0.15 globally, while it is at least 0.20 with individual satellite SSS fields. r2 is 0.97, similar or better than with original datasets. The correlation with independent ship thermosalinographs SSS further highlights the CCI dataset excellent performance, especially near land areas. During the SMOS-Aquarius period,

when the representativity uncertainties are the largest, r2 is 0.84 with CCI while it is 0.48 with the Aquarius original dataset. SSS CCI data are freely available and will be updated and extended as more satellite data become available.

Introduction
Salinity is a key ocean and climate variable that plays a fundamental role in the thermohaline global ocean circulation, the hydrological cycle, and climate variability (Durack et al., 2012;Siedler et al., 2001).
Along with temperature, salinity controls sea water density. This parameter influences the ocean stratification, water mass formation, and ultimately the general circulation of the ocean. At high latitudes, in cold polar surface waters (typically Sea Surface Temperature (SST)= 2°C), a change of 0.11 in Sea Surface Salinity 1,2 (SSS), is equivalent, in terms of density, to a change of 1°C in SST. In warm regions (SST=28°C), a larger salinity change of 0.44 is equivalent to a 1°C change in SST, as per its density contribution. This is the reason why salinity variations play a key role in controlling the global thermohaline circulation, in particular due to high salinities anomalies strongly contributing to convective overturning (i.e. transporting waters from the ocean surface to the deep ocean) at high latitudes. It is also a tracer of oceanic processes (advection, mixing). SSS bears the signature of freshwater fluxes originating from evaporation minus precipitation (E-P), river discharges and ice melting or freezing. Freshwater fluxes can modify the vertical stratification in density and strongly influence the air-sea exchange through the development of so-called salt-stratified (halocline) barrier layers (Lukas and Lindstrom, 1991). Indirectly, salinity contributes to El-Nino Southern Oscillation (ENSO) (Vialard and Delecluse, 1998), Indian Monsoon (Shenoi et al., 2002) and primary productivity (Picaut et al.,1 Sea Surface Salinity in this article refers to the salinity determined from a satellite microwave radiometer sensing the thermal emission at 1.4GHz due to its changing dielectric properties. Thus the measured quantity is representative of the upper few cm of the sea surface depending on the depth of surface foam present on the sea surface. 2 Sea Surface Salinity is expressed with no physical units. It is defined according to the Practical Salinity Scale (UNESCO, 1985) as conductivity ratio: a seawater sample of Practical Salinity 35 has a conductivity ratio of 1.0 at 15°C and 1 atmosphere pressure, using a potassium chloride (KCl) standard solution containing a mass of 32.4356 grams of KCl per Kg of solution. 2001). At high latitude, the sharp halocline shields the heat stored in the deep ocean from reaching the surface layer and melting the sea ice (Carmack et al., 2016;Lique, 2015;Steele et al., 2004).
The ocean is a major component of the Earth's water cycle: 86% of evaporation and 78% of precipitation take place over the ocean. The near-surface salinity also carries the imprint of the anthropogenic influence on water cycle, with a tendency for SSS to increase in evaporation regions and decrease in precipitation-dominated regions (Yu et al., 2020). This so-called observed 'dry gets dryer and wet gets wetter' tendency also occurs in climate models projections (Bindoff et al., 2019). Both positive and negative trends in ocean salinity and freshwater content have been observed throughout much of the ocean including sea surface and ocean interior, providing indirect evidence that the E -P pattern over the oceans is amplifying. In addition, evaporated water above the ocean is transported above the continents by atmospheric circulation. Li et al. (2016) recently found correlations between satellite SSS anomalies in the North Atlantic and anomalies of rain in Sahel lagged by several months suggesting that large-scale SSS anomalies could bring skill as precursor indicators of rain over the continents.
In addition, being the total mass of dissolved salts per kg of seawater, salinity affects the entire oceanic carbonate system and its components: total alkalinity, dissolved inorganic carbon, pH, and fugacity of CO2 (Millero, 2007).
Due to its role both as a variable that influences the oceanic circulation, air-sea exchanges, upper-ocean biogeochemical state and as a tracer of the water cycle and its changes, SSS has been identified as an Essential Climate Variable (ECV) (https://public.wmo.int/en/programmes/global-climate-observing-system/essential-climatevariables).
SSS measurement by satellite remote sensing was motivated by the essential need for better monitoring, understanding and constraining the marine components of the climate system.
At the end of the 1990s, the Global Ocean Data Assimilation Experiment (GODAE) group estimated that it was necessary to measure SSS with an accuracy of 0.1 at monthly, 100 km (or every 10 days at 200 km) scales. Since the beginning of the 21st century, the Argo network of insitu profiling floats has continuously evolved (Roemmich et al., 2019) and provided invaluable measurements of 3D oceanic salinity. The Argo array reached global coverage in 2006 with one manuscript submitted to Journal of Geophysical Research: Oceans measurement of both temperature and salinity every ~10 days and every ~300 km with a vertical sampling range of between about 5 -2000m depth. Nevertheless, there is increasing consensus that the GODAE specifications that are well covered by the Argo network do not serve all the needs for ocean and climate studies. The analysis of ship measurements (Delcroix et al., 2005), and later of satellite measurements have revealed a natural variability of SSS much larger than 0.1 in regions characterized by large mesoscale variations (Fournier et al., 2016;Hasson et al., 2019;Reul et al., 2014), in river plumes and areas characterized by strong precipitation events (e.g., Figure 6 in Boutin et al. 2016). In these regions, SSS fields derived from satellite data depict SSS variability much better than the salinity products derived from the in situ Argo network alone .
Satellite SSS data are available with regular repeat global coverage since 2010, thanks to to microwave radiometers operating at a frequency of 1.4 GHz (wavelength 21 cm; L-Band). The brightness temperature (Tb) measured by ocean-observing microwave radiometers is related to the emissivity of the ocean surface layer. The L-Band Tb depends primarily on the dielectric properties of the surface seawater (i.e., seawater conductivity and, thus, salinity and temperature) and its geometric characteristics, determined by sea surface roughness . At this frequency, the atmosphere is almost transparent (except for strong precipitation). Molecular oxygen has the most significant effect on the measured Tb with small contributions from water vapor, cloud liquid water, and rain. Tb has a relatively low sensitivity to SSS, especially at low SST: the Tb change per SSS unit is ~0.8 K at 30°C and ~0.2 K at 0°C (Yueh et al., 2001). The radiometer measurements of SSS in cold waters at high latitudes are thus particularly challenging.
SSS measured from satellites is not directly comparable to near surface in-situ measurements due to the vertical structure of salinity in the upper ocean. At L-Band in foam-free conditions, the emissive ocean surface layer is ~1cm deep (when the foam is present, emissions may emanate from a layer >5 cm thick (Anguelova and Gaiser, 2011)), while the upper measurements performed by most in-situ devices (such as Argo floats, thermosalinograph onboard ships, or moorings networks) are in the depth range of 1 to 20 m. Obtaining high vertical resolution measurements in the upper few meters of the ocean is particularly challenging even when using all available in-situ data sources. Hence, significant differences between satellite SSS compared to in-situ salinity have been observed with mean vertical differences manuscript submitted to Journal of Geophysical Research: Oceans 6 larger than 0.1 in the Pacific, Atlantic, and Indian Oceans coinciding with the average position of the inter-tropical convergence zone (ITCZ) heavy precipitations. The upper ocean stratification effects are the subject of dedicated experiments and their quantification via air-sea coupling modeling an active research domain (e.g. (Drushka et al., 2019) and references herein).
Furthermore, L-band satellite measurements are integrated over a large spatial footprint, from ~40 km to more than 100 km, which poses obvious representativity issues when compared to punctual samples from Argo floats or ship transects. These issues have a strong impact when using in situ data for validation and vicarious calibration of satellite SSS products which must be managed with care in these areas of potentially significant vertical salinity stratification.
In the open ocean, the uncertainty of the SMOS (Soil Moisture and Ocean Salinity), Aquarius, and SMAP (Soil Moisture Active Passive) satellite SSS averaged over the GODAE scales is now estimated to be of the order of 0.2 Vinogradova et al., 2019).
SMOS data initially suffered from large systematic errors in the vicinity of the coast and in areas polluted by radio frequency interferences (RFI). SMAP data are less polluted than SMOS in these areas because of advanced RFI filtering capabilities. However, recent SMOS processing largely reduces these systematic errors making SMOS SSS often very close to SMAP SSS (e.g. (Akhil et al., 2020;Fournier and Lee, 2021)).
As this study started, individual satellite SSS is capable of resolving the upper register of the mesoscale spectrum (spatial scale of 50-500 km and temporal scale of 10-100 days) (e.g (Hasson et al., 2019;Huang et al., 2021;Kolodziejczyk et al., 2021;Lin, 2019;Olivier et al., 2020)). These scales play an important role in the mixing and exchanges of water masses close to fronts, as a result of ocean circulation and atmospheric fluxes. At larger scales, ENSO events in the tropical Pacific have serious climatic repercussions at planetary scale and are the main source of interannual climatic variability with strong societal consequences (agriculture, marine ecosystems, health…) in many areas (McPhaden et al., 2006). We list below progresses which have been allowed by satellite SSS data. Zhu et al. (2014) have shown that salinity variability may play an active role in ENSO evolution, and is thus an important information to be taken into account for a better understanding of air-sea interaction processes during ENSO. Furthermore, monitoring ENSO phases via dedicated SSS-based climate indexes is useful to complement existing SST-based indexes (Qu and Yu, 2014). There is also some recent evidence that SSS provides an additional forecast skill for ENSO prediction (Hackert et al., 2020). Given the scientific advances made possible by satellite-based SSS, the redefinition of the observation strategy in the tropical Pacific Ocean (Cravatte, 2016) and in the tropical Atlantic ocean (Foltz et al., 2019) underlined the need to pursue SSS satellite measurements.
Variations of precipitations above continents lead to variations of river discharges (Amazon/Orinoco, Congo, Niger, Mississippi…) which, together with ocean circulation, lead to a large SSS variability in river plumes. These have been well observed and documented by SMOS, Aquarius, and SMAP measurements in the tropics (e.g. (Akhil et al., 2020;Fournier et al., 2016;Houndegnonto et al., 2021;Reverdin et al., 2021) and references herein).
Furthermore, SMOS and SMAP data analysis have shown that the active role of the salinity in the development of barrier layers might intensify the water cycle in some tropical areas. Under tropical cyclones, it limits vertical mixing and hence cooling of the ocean surface which influences the development of the cyclones themselves (Balaguru et al., 2020;Reul et al., 2021). These SSS feedbacks on climate are of increasing importance for climate studies.
Satellite SSS provides a precious information to better understand and constrain the airsea CO2 exchanges, due to the dependency of carbonate properties on seawater salinity. Satellite SSS is in particular useful to study the total alkalinity and pH (Fine et al., 2017;Land et al., 2015;Salisbury et al., 2015).The impact of fresh water input onto air-sea CO2 exchanges has been highlighted from satellite data in high precipitation regions (Brown et al., 2015;Ho and Schanze, 2020) and in river plumes (Ibánhez et al., 2017;Lefèvre et al., 2014).
The goal of the European Space Agency (ESA) Climate Change Initiative Sea Surface Salinity (CCI+SSS) project is to optimize satellite SSS time series by merging satellite SSS acquired by various instruments. For the first time, SMOS, Aquarius, and SMAP measurements are combined to produce Level 4 (L4) gridded multi-mission estimates of SSS. Such a combination reduces the noise of satellite SSS fields owing to better sampling and improves the spatial resolution of large mesoscale SSS features. Examples of the derived merged SSS L4 maps are illustrated in Figure 1 during a period with SMOS data-only, a period with SMOS plus Aquarius data and a period with SMOS plus SMAP SSS measurements. The large mesoscale features around tropical river plumes (e.g., Amazone, Orinoco, Congo), frontal regions (e.g. Gulf Stream) are nicely captured by the satellite products. The addition of SMAP SSS leads to a reduced noise during the SMOS and SMAP period with respect to the SMOS-only period, and to a reduced blurring with respect to the SMOS-Aquarius period, owing to the better spatial resolution of SMOS and SMAP compared to Aquarius. We have polled potential users of the CCI+SSS product using a Web Survey (available at https://forms.gle/BVDroYrNpVvpxFJu9), in order to better define their needs. A more detailed overview of the results of this poll can be found in the supporting information S1, but we briefly summarize the main points here. Most users were interested by global L4 products that cover a period of at least 9 years, with a typical resolution of 1 day to 1 month and 0.25 to 1°. In fact, there appear to be both the need for high-resolution, low accuracy (weekly 0.25° ~ 0.3) and lower-resolution higher accuracy (montly 1°, < 0.1) products. Users also request some simple characterization of the quality of the data, along with its value. Finally, many users thought that it was important to merge SSS datasets from several sources. In this paper, we address the merging of several satellite products, also using a limited statistical information about SSS variability derived from in situ and model reanalyses.
The objective of this paper is to describe the method and data employed to build the CCI L4 SSS time series, and then the ones used in the validation exercise; to analyze the resulting fields and their validity; to review and discuss the strength of satellite merged SSS and its remaining caveats and propose research avenues to solve or mitigate them.

Overview
The CCI+SSS algorithm described in this paper corresponds to the version 2 of the algorithm. It is summarized in Figure 2 and in this subsection, with each step and dataset more thoroughly described in the remaining of section 2. Figure 2: Outline of the CCI+SSS merging methodology which is performed independently on each node of the EASE 25km grid. Input satellite SSS data are indicated in light blue boxes, additional input information is in green boxes (origin of this information in bold black), processing steps are in yellow boxes, temporary informations output of monthly OI are in light violet boxes and output CCI+SSS L4 fields are in violet boxes.
The main inputs are the satellite datasets. The SMOS input data has the same rectangular 25km Equal-Area Scalable Earth (EASE) 2.0 grid than our final CCI SSS dataset. The SMAP and Aquarius datasets are all reprojected on the same grid. The spatial resolution of each level 4 SSS is driven by the spatial resolution of the SMOS and SMAP SSS measurements, i.e. ~50 km x 50 km. Most of the input datasets biases are treated in the optimal interpolation (OI) step (see below). The input datasets hence simply undergo a minimal level of pre-processing of well known biases. This includes a better model of the dielectric constant and a correction for seasonal latitudinal biases for SMOS (see section 2.2 for more details). Seasonal latitudinal biases are corrected in the SMAP original processing and are neglected in case of Aquarius. The latter could be revised in a future version.
The monthly CCI SSS maps are obtained based on an OI in the time domain, i.e. each grid point is treated separately, the intent being to preserve the spatial SSS features contained in the original datasets. A larger representativeness error is attributed to Aquarius to account for the fact that its effective resolution is lower than that of SMAP and SMOS. The a-priori value of the SSS is set to a constant value, the median of the observed SSS over the full time period. The OI is performed over the entire time period of the three datasets.
The errors on the input satellite datasets to be corrected during the OI include: • Poorly modeled instrumental effects (solar influence, land contamination, absolute calibration …), • Radiometric noise, • Flaws in the radiative transfer model (e.g. deficiencies in rough sea surface emission, in scattering of galactic and solar radiation…), • Uncertainty in auxiliary geophysical data used as priors in the retrieval algorithms (wind speed, SST…), • Measurement contamination by spurious effects (e.g. RFI).
Both a random component and a systematic component of the error in each dataset are estimated by the OI. The systematic component (also called bias) depends on each satellite measurements geometry: it is constant in time but highly variable in space (for instance it tends to be higher near coasts where RFI and pollution by land signals are more common).
The OI relies on error statistics to estimate these two components of the error. These statistics include the error variance which is deduced from the data itself (see section 2.4.2) and a temporal covariance. The bias correction is nonlocal (it uses the entire time period of the three datasets), while the random part is treated more locally. A theoretical estimate of random uncertainty associated with the CCI L4 SSS product is provided by the OI.
The previous OI step is meant to combine the various datasets into a consistent one, but a final calibration of the absolute reconstructed SSS is still needed. This is done by adjusting the median value (or the upper quartile in highly variable regions, see section 2.3.1) of the multisatellite SSS to that of the ISAS product, reconstructed from in situ data.
The weekly data is obtained through another OI, but this time using as inputs the satellite SSS corrected from the systematic uncertainties estimated by the monthly OI, using the monthly data as an a priori value and using a representativity uncertainty between monthly and weekly SSS derived from an ocean model reanalysis.
In the following, we present the satellite datasets and the auxiliary datasets used in the OI. We then describe the OI method and the methodologies followed to derive the various components (such as the error statistics) involved in the OI.

Input Satellite Data
The main characteristics of L-band radiometric satellite missions enabling global SSS measurements are summarized in Table 1.  (Piepmeier et al., 2017) We consider the longest SMOS, SMAP, and Aquarius SSS retrieved with up-to-date algorithms available at the time of the development of the CCI L4 SSS ( Table 2).  : where SSSobs is the observed SSS, t is the time of the measurement, , and , are respectively the latitude and the longitude of the considered pixel over the ocean, xswath corresponds to the pixel location across the swath, xorb indicates the satellite orbit direction (ascending or descending), blat is a latitudinal correction that varies seasonally as a function of the month, m, and SSSsmosref is a reference SMOS SSS taken at a given xswath and xorb. In order to avoid land-sea contamination, blat is derived from SMOS measurements at several hundred kilometers from coast. blat is removed from all SMOS SSS whatever their distance to coast and before estimating the temporally constant land-sea contamination correction (section 2.4.4).
Actually, the seasonal latitudinal bias is expected to affect all pixels, whatever their distance to land.
SMAP v3 and Aquarius v5 SSS retrieval algorithms are described in . SMAP algorithm has been updated towards version 4 as described in Meissner et al. (2019) with improved land correction, sea-ice mask, ice flagging. SMAP and Aquarius SSS are reprojected on the SMOS CATDS spatial grid, the rectangular EASE 2.0 grid (Brodzik et al., 2012) with 625 km 2 surface resolution (referred to as 25km EASE grid in this paper) using the nearest neighbor criterion. We do not apply seasonal latitudinal correction to SMAP SSS as the reflector emissivity correction (  Kao et al. (2018) and Meissner et al. (2018) found latitudinal seasonal SSS systematic differences with respect to Argo SSS or between ascending and descending passes that can reach +/-0.2.
RFI and calibration stability are two main challenges to deal with before retrieving reliable SSS measurements, given the low sensitivity of Tb to salinity (0.5K per salinity unit at 20°C at nadir). The data processing for each satellite mission is summarized below.
SMOS, Aquarius, and SMAP missions operate in a protected spectrum band (1400-1427 MHz) that is nevertheless known now to be affected by numerous RFI. Areas affected by RFI might experience data loss or result in inaccurate salinity retrieved values. To alleviate this situation, several strategies were set up to filter RFI contaminated measurements. SMOS, launched in 2009, was the first satellite with a radiometer operating in L-band, and it does not have any on-board hardware/software to filter RFI, so that RFI counteracting only relies on data post-acquisition processing. Filtering is significantly improved for SMAP (Soldo et al., 2019) (and to a lesser extent for Aquarius (Le Vine and Matthaeis, 2014)), as they are (were) equipped with on-board frequency/time-domain-based RFI filters. Nonetheless, RFI remains a major problem (e.g. (Kristensen et al., 2019)).
Onboard calibration and correction for parasitic effects does not allow to reach enough Tb stability and accuracy to cope with the SSS remote sensing requirements. As a consequence, vicarious calibrations, the so-called Ocean Target Transformation (OTT) for SMOS (Yin et al., 2013) and Ocean Target Calibration (OTC)  for Aquarius and SMAP, are applied for removing spatially homogeneous but time-varying biases in the measured minus expected radiometric signal. OTT is computed in a limited region, almost free of RFIs and land sea contamination, while OTC is estimated from global comparisons. The SSS references come from either a climatology (SMOS), an ocean model (SMAP) or an Argo-derived field (Aquarius), the choice being partly driven by the reference product availability with respect to production temporal constraints. In addition, a latitudinal seasonal varying correction for the SMAP antenna emissivity is derived for each day of the year from Scripps Argo SSS .
The auxiliary geophysical information necessary to initialize the SSS retrieval is a possible source of uncertainty. In the case of SMOS, wind, SST and atmospheric parameters are extracted from the European Center for Medium Weather Forecast (ECMWF) Integrated Forecast System (IFS). In the case of Aquarius and SMAP, atmospheric parameters are extracted from the National Centers for Environmental Prediction (NCEP) General Data Assimilation System (GDAS) and SST is from the Canadian Meteorological Center (CMC). In the future, the Copernicus Microwave Imaging Radiometer (CIMR) (Donlon, 2020)) will provide L-band measurements together with higher frequency data that are sensitive to SST (C-band), and surface roughness (K-band) allowing a more precise initialization of SSS retrieval.

ISAS
Monthly gridded fields of salinity derived from in-situ measurements are obtained from the ISAS v6 algorithm, an optimal interpolation (OI) tool (Bretherton et al., 1976) developed for the synthesis of the Argo global dataset (Gaillard et al., 2016). ISAS OI considers spatiotemporal length scales between 300km and 4 times the Rossby radius and between 3 weeks and 1 week. In practice, the smoothing is very dependent on the availability of in situ measurements, with, on average, one Argo measurement every 10 days and every 3°x3° (Roemmich et al., 2019).
We use the ISAS fields reconstructed at 5 m depth on a half degree horizontal grid for two purposes. First, ISAS 2011-2016 latitudinal profiles are qualitatively used for defining the reference across swath location for the SMOS latitudinal seasonal bias correction. Second, at each grid point, a statistical distribution property of ISAS SSS is used to adjust the long-term absolute reference of the CCI L4 SSS corrected fields (see Figure 2 and section 2.4.4). In very variable regions, the SSS statistical distribution is skewed towards low SSS values (e.g. (Bingham et al., 2002)) and these low SSS values are poorly reproduced by gridded in situ salinity products . Hence, in very variable regions, we adjust the 80% quantile of CCI SSS to that of ISAS one whereas we adjust the medians (50% quantiles) in less variable regions.

Ocean model reanalysis
We estimate the amplitude of the SSS variability at spatio-temporal scales that are not resolved from satellite missions. To that end, we use salinity fields at 0.5 m depth between 2011 and 2014, from the GLORYS eddy resolving (1/12°) global oceanic re-analysis produced by Mercator Ocean. This model re-analysis assimilates satellite sea level anomalies, SST, sea ice concentration, in-situ temperature and salinity vertical profiles (but not satellite SSS) and is forced using climatological runoffs. More information about this product can be found on https://resources.marine.copernicus.eu/?option=com_csw&view=details&product_id=GLOBAL _REANALYSIS_PHY_001_030.

OI methodology:
We summarize below the basic principle of the OI methodology. A more detailed description is given in supporting information S3.
In order to estimate SSS at a given time, the algorithm uses an OI in the time domain, applied independently to each node of the spatial grid. We do not apply any catch-up to reference climatological fields which might remove or attenuate important interannual and/or large mesoscale variations.
Together with SSS and its associated random uncertainty, the CCI+SSS monthly OI estimates a systematic correction, bc, that is applied to each satellite SSS observation. bc mainly corrects for biases related to land-sea contamination and permanent RFI. The algorithm is very similar to the one described in  but applied to different sensors and geometries.
To estimate SSS time series at spatial resolution R1 and temporal resolution T1, knowing that SSSobs are at spatial resolution r1 and temporal resolution t1, the cost function (to be minimized) is written as (for monthly OI): with: there is no bias correcteion as the monthly OI is used as an a-priori value, only SSS is estimated (m=SSS). Ct is the observation error covariance matrix containing the expected random errors of the satellite SSS relative to the OI SSS. It includes random uncertainty of the satellite SSS observation, Co, and a representativity uncertainty, Cr, that is added to take into account the difference in resolution of fields (R1;T1) and (r1;t1):

Ct=Co+Cr
This covariance is seasonally dependent and specific to each grid node.
The a priori error covariance matrix, Cm, contains the random fluctuations allowed around the a priori fields of SSS, SSS, and of bc, bc. In the monthly OI, SSSprior is constant over time and SSS are derived using a monthly climatology of SSS variability. bc is set to an arbitrary constant value (chosen equal to 4 pss, a value well above reasonable values for retrieved bc). In case of the weekly OI, SSSprior are taken equal to the monthly L4 SSS and SSS are derived from the uncertainties of the monthly L4 SSS estimated by the monthly OI combined with the expected variability between monthly and weekly fields derived from model simulations. The variability fields are seasonally dependent and specific to each grid node.
The solution of the minimization provides the estimated SSS and bc, m_est:

m_est=m_prior+Cm'•H T • (H•Cm•H T +Ct) -1 • (SSS obs -F(m_prior))
with H the matrix of the partial derivatives of each sensor SSS relative to the estimated parameters. Cm is the time covariance matrix operating in the observation space and Cm' the time covariance matrix between the observation space and OI product space. A posteriori uncertainty term, corresponding to L4 SSS random uncertainty, is also derived.
The way we derive the various terms involved in equation (2) is described in the following sections.

Climatology of SSS variability:
We explain below how the information about SSS variability introduced in the Cm matrix is built.
SSS variability is introduced to distinguish between SSS fluctuations due to an instrumental flaw or to a plausible geophysical SSS variation. For instance, in a river plume, during runoff season, the SSS is allowed to fluctuate much more than during calm periods. SSS variability is considered at two temporal scales in the OI: -the sub-monthly SSS variability combined with interannual SSS variability in each pixel, SSSvarmonth, that is used to estimate monthly SSS, taking a constant SSS value as SSSprior; -the variability of SSS at 50km/7days relative to SSS at 50km/1month resolution, SSSvarweek, that is used to estimate weekly SSS, taking monthly SSS as SSSprior.
SSSvarmonth, is estimated on each grid node and for each month of the year, as the root mean square difference, RMSD between satellite SSS for the considered month, and the SSS averaged over all months and all years, y, over the whole time period, ̅̅̅̅̅ : Hence the variability is seasonal and SSS is allowed to fluctuate more or less around its mean value with the seasons. To remain as coherent as possible with respect to the variability sampled by the satellite measurements, we prefer to avoid estimating it from model simulations.
Instead, we estimate it recursively from satellite SSS fields as follows. In a first step, we estimate this variability from the SMOS filtered SSS in regions that are not strongly contaminated by errors (low to middle latitudes, excluding RFI contaminated areas). In regions where SMOS SSS are very noisy (such as in RFI contaminated areas e.g. close to Fiji island or in the Bay of Biscay, we artificially set a low variability; at very high latitudes, we consider ISAS SSS variability increased by a factor of 2 in order to leave the possibility of unexpected fluctuations). These monthly variability fields are then used to provide a first estimate of CCI L4 SSS weekly fields.
The CCI L4 SSS weekly fields are then used to derive a SSSvarmonth. The use of weekly OI fields instead of individual satellite SSS aims at avoiding too large RMSD values due to outliers (see Figure 3 and supporting information S4).
The monthly SSS variability is the dominant term among the representativity uncertainties between the several spatio-temporal scales involved in the OI and the monthly to weekly variability which will be described below. An example SSSvarmonth is illustrated on  Estimates of the variance of weekly fluctuations relative to the monthly averaged fields are derived from GLORYS simulated SSS. Since river discharges used in the model simulations are based on climatological river runoff information, temporal and spatial variations in river plumes are not properly considered and may be underestimated. We, therefore, take an estimate for this variability defined by twice the RMSD between 50 x 50 km 2 daily and 50 x 50 km 2 30day mean SSS for each month, obtained from GLORYS fields at 1/12 th of a degree resolution and then smoothed over space scales of 200 km (see supporting information S5).

Random uncertainties of satellite SSS measurements:
We describe below how the random uncertainties, linked to observational errors (Vinogradova et al., 2019) and reported in the Co matrix are built.
The SMOS processing provides a 'theoretical' random error, ESSS_L2, derived from the Jacobian of Tb relative to SSS and auxiliary parameters, the expected radiometric noise, the expected random errors on auxiliary parameters, and radiative transfer model. A more realistic estimate of the SSS random uncertainty is obtained by multiplying ESSS_L2 by the normalized  of the Bayesian retrieval, N .
For SMAP and Aquarius SSS, we derive random uncertainties from self-consistency analyses of the observed SSS as a function of SST. We check that they correspond to a realistic radiometric noise given the SST dependency of Tb derivative with respect to SSS (Yueh et al., 2001). A theoretical uncertainty is derived that is used for our analysis.
For SMAP, we derive empirical uncertainties by comparing collocated retrieved SSS between fore and aft acquisitions. We computed the standard deviation of the difference, STDD, which should be an estimator of the SSS random uncertainty multiplied by √2 (assuming that the random uncertainty is the same for aft and fore acquisitions). The STDD is very close to modeled uncertainty with a 0.45K radiometric noise.
For Aquarius, the random uncertainties are derived from comparisons of Aquarius SSS at time t with the Aquarius SSS at time t + 7 days given the 7-day periodicity of this satellite orbit.
For that, we assume that SSS in the open ocean does not change significantly during 7 days, so that the STDD relates to the random error (multiplied by √2). A fit to the observed random errors is derived as a function of SST (see supporting information S6).
We check the reasonable behavior of estimated random uncertainties, σ , by considering the statistical distribution of the centered reduced SSS, namely, SSSc: Where SSSobs is the retrieved SSS possibly corrected from systematic errors and SSSref is a reference SSS. We find that the distribution of SSSc is most often close to a Gaussian law with a standard deviation, STD, equal to 1 over the open ocean at distances further than 800 km from the coast. Closer to the coast, STD(SSSc) deviates from 1. Hence, we estimate multiplicative factors, f(dcoast), to be applied to σ , for each instrument independently. Since part of this difference can be associated with representativity errors, we quantify STD(SSSc) in grid nodes with low spatio-temporal SSS variability, lower than 0.2, as derived from each instrument RMSD. In regions with greater SSS variability, it is likely that representativity errors will make the SSSc distribution non-Gaussian and generate outliers. Hence instead of using a criteria on STD(SSSc), we prefer to use robust STD, STD* that filters out outliers (STD*(x) is defined as the median(abs(x -median (x)))/0.67; STD*(x) is equal to STD(x) in case of a Gaussian distribution of x). We then adjust f(dcoast) so that STD*(SSSc) becomes close to 1 (see supporting information S6).

Representativity uncertainties:
We explain below how the representativity uncertainties that are reported in the Cr matrix are estimated. They originate from the different samplings of the various sensors and they are also called sampling uncertainties (Vinogradova et al., 2019).
In the monthly OI, we neglect the representativity uncertainty for SMOS and SMAP, that corresponds to the transition from acquisition time (about one second) to monthly resolution (30 days), because, in the majority of cases, SMOS and SMAP random uncertainties, on the order or larger than 0.5, are much larger than their representativity uncertainties. This might be an issue in highly variable regions, such as river plumes but there, the SSS variability is poorly known, and the frequent revisits of SMOS and SMAP in a month should alleviate this issue. Nevertheless, this will be included in a future version. For Aquarius, which has much smaller observational uncertainty and a poorer spatial resolution, the algorithm takes into account the representativity uncertainty. It corresponds to SSS variability at CCI spatio-temporal scales smaller/shorter than those resolved by the Aquarius fields. The aim is to increase the uncertainty associated with Aquarius data by a representativity uncertainty. We estimate it as the STD between SSS fields smoothed over 150 x 150 km 2 and 50 x 50 km 2 which is calculated off-line using GLORYS simulations. This representativity uncertainty depends on the month and increases in highly variable regions (see supporting information S7). In some parts of the ocean, SSS does not vary over large distances and durations: in this case, the representativity uncertainty tends towards 0.
Tests using this representativity uncertainty lead to too smoothed L4 SSS fields particularly in very variable areas suggesting that the representativity uncertainty is not large enough. This might be due, among other reasons, to a spatial variability of SSS underestimated in GLORYS simulations, especially given the limited 2011-2014 time period we considered, as well as to non-Gaussian distribution of SSS not handled by our method. In order to bypass these artifacts and to preserve small scale features detected by satellite data, possibly at the expense of an increased noise of L4 fields, the variability was spatially smoothed and doubled. This rather coarse methodology should be revised in the future.
Representativity uncertainties are generally low (compared to observational uncertainties) for grid nodes in the open ocean but are large near river mouths, in rainy areas, and very dynamical regions (e.g. Gulf Stream).
Close to the coast and near the river mouths, these uncertainties are difficult to assess to a large extent because of the high interannual variability that limits the validity of the statistical approach. These uncertainties can be greater than 1 and becomes dominant in the error budget as they are larger than SSS observational random uncertainties of original L2 satellite data.

Systematic uncertainty:
We describe below how bc is parametrized and we illustrate how it is retrieved together with SSS.
Systematic uncertainties, the bc term in equation (2), that depend on the satellite measurements geometry, are retrieved in the course of the monthly OI. Only systematic uncertainties constant over the full time period, which have been shown to be a main source of contamination (e.g. ), are estimated in the frame of the OI. The amplitude of the systematic uncertainties depends strongly on the sensor passes (ascending or descending), on the satellite pixel location on Earth, and, on the acquisition geometry (e.g., beam number (for Aquarius), across-swath position in the field of view (for SMOS), aft or fore view acquisition (for SMAP)). We assume that land sea contamination biases are constant over time.
This is a reasonable assumption given that temporal variations in Tb on land and sea are generally an order of magnitude smaller than the contrast between land and sea Tb. bc varies as a function of location, instrument and geometry of observation: SSSobs(t, , ,X, xorb)=SSS0 (, ) + SSSrel (t, , )bc(, ,X, xorb) (4) where the notations are as in equation 1 and X accounts for the geometry of observation, and can pertain to different instruments. X corresponds to fore or aft acquisition for SMAP, to across swath location for SMOS and does not vary for Aquarius. Actually, for a given xorb, a pixel at a SSScorr= SSSobs+bc+ΔSSSq (5) In regions with low SSS variability, ΔSSSq is derived from the 50% quantile (median) and the quantile is increased up to 80% in regions with high SSS variability, where SSS statistical distribution is skewed towards low values which are not well represented in ISAS fields, due to the Argo undersampling and ISAS smoothing. To avoid the irregular spatial sampling of Argo matchups ( Figure 5), we also report statistics of gridded collocated datasets. In order to keep a significant number of matchups per grid point, the Argo/CCI pairwise MDB is re-gridded on the CCI grid subsampled by a factor of 7 both in latitude and longitude. It corresponds to an Equal Area EASE grid resolution of 175 km. The same bi-weekly temporal sampling as CCI monthly product is conserved. The median value for each grid point of all pairwise MDB values is taken.
We summarize below the characteristics of each dataset and the methodology used to build the matchup data bases (MDB) with CCI L4 SSS.

Comparisons with Argo SSS:
Argo is a global array of almost 4,000 free-drifting profiling floats that measure

Comparisons with SSS collected by ship thermosalinograph:
Ship ThermoSalinoGraph (TSG) measurements are sampled at a few kilometres resolution along ship transects, hence, reducing the spatial representativity uncertainty in a satellite pixel with respect to Argo. These higher resolution 'validation' data are therefore very useful in very variable regions and the vicinity of land. In this study, we have considered the GOSUD-RV and the LEGOS-DM datasets located mostly at low to mid latitudes, which are very complementary to Argo in high SSS variability regions ( Figure 5).
The TSG-GOSUD-Research-vessel dataset corresponds to French research vessels, which have been collecting data since early 2000 as a contribution to the GOSUD program. The set of homogeneous instruments is permanently monitored and regularly calibrated. Water samples are taken daily by the crew and later analyzed in the laboratory. The careful calibration and instrument maintenance, complemented with a rigorous adjustment on water samples lead to reach an accuracy of a few 10 -2 in salinity. This delayed mode dataset (Kolodziejczyk et al., 2020) is updated annually and freely available on https://www.seanoe.org/data/00284/39475/ .
Adjusted values when available and only collected TSG data that exhibit quality flags 1 or 2 were used.
The TSG-LEGOS-DM dataset is delayed mode data derived from voluntary observing ships collected, validated, archived, and made freely available by the French Sea Surface Salinity Observation Service (Alory et al., 2015). Adjusted values when available and only collected TSG data that exhibit quality flags=1 and 2 were used. A spatial smoothing (running median filter with a window width of 25 km) is applied to TSG data before co-locating TSG with CCI L4 SSS, in order to reduce the spatial representativity error.
Each TSG smoothed data is then co-located with all CCI L4 SSS data within a radius of 12.5 km and +/-15 days (for the monthly CCI product) from the TSG data location. If several satellite SSS pixels meet these criteria, the final satellite SSS match-up pixel is selected as the CCI L4 SSS whose central time is closest to the TSG measurement date.

Overview of large scale SSS variability
The ten year-long CCI L4 SSS time series include large scale SSS anomalies, as shown by quarterly SSS anomalies ( Figure 6). These SSS anomalies are particularly large in the tropical and subtropical bands and are detected with great realism as indicated by ISAS derived SSS anomalies (see supporting information S8). This comparison is fully independent as no information about interannual SSS variability in ISAS is introduced in the CCI+SSS OI. In areas strongly affected by rain, the comparison must be considered cautiously as the vertical representativeness of Argo and satellite SSS can contribute to non-negligible measurement differences. We shortly recall below the extreme events which occurred during this period and

Comparisons with in-situ SSS:
The statistics of the comparisons between monthly CCI SSS and in-situ SSS are reported in Table 3 to Table 6; the ones for weekly SSS are reported in the supporting information S9.
Over the global Ocean, when considering all matchups of monthly (weekly) CCI SSS with Argo SSS, STDD* is 0.16 (0.17), STDD is 0.27 (0.28), r 2 is 0.97 (0.97) (  Table 4), the STDD* decreases to 0.13 (0.14), the STDD to 0.16 (0.17) and r 2 is 0.97(0.97). These metrics are amongst the best performance of satellite-Argo comparisons made at the PI-MEP (Table 3 & Table 4).  The large decrease in STDD between All cases and case C1 is in a large part due to the increasing representativity errors between Argo and CCI fields in regions close to land, but could also reflect remaining errors related to land-sea contamination. The statistical comparisons with Argo exhibit better performances with Aquarius L4 IPRC data likely because the latter are adjusted monthly with Argo SSS, contrary to CCI L4. With respect to the statistical indicators obtained with L3 SSS fields for each individual sensor, the STD and RMS differences with CCI L4 SSS are systematically reduced, because of the noise reduction associated with the OI method. r 2 is much improved with respect to the Aquarius L3, especially when considering "All cases". It is reduced when considering only open ocean data (0.95 with Aquarius weekly data instead of 0.97 with CCI weekly data) where variability at scales smaller than 100 km is significantly less than when the data are within a distance of 800 km from the nearest coastlines.
Comparisons of CCI data with TSG data ( Table 5 & Table 6) also show systematic improvement of all statistical indicators with respect to L3 SSS from individual sensors. The Aquarius L4 IPRC SSS still outscores CCI SSS in term of STDD* but not for the STDD and r 2 for all cases comparisons: the STDD of CCI SSS with respect to ship TSG SSS is 23% less and r 2 is higher by 16% for monthly CCI L4 SSS compared to Aquarius L4 IPRC SSS (Table 5). This is likely because ship TSGs dominantly explore tropical and mid-latitudes regions ( Figure 5) and better sense SSS spatial variability than Argo punctual measurements, in particular in case of extreme values, hence strengthening the better sampling of spatial SSS variability with CCI L4 SSS.  With a regular sampling (gridded collocated dataset), the CCI statistical indicators improve slightly: the STDD* is 0.15, the STDD is 0.24. This is likely because of the Argo oversampling in regions strongly contaminated by RFIs (e.g., along Asian coasts; Figure 5).
The global pattern of SSS, its variability and the global SSS statistical distribution are consistently observed by both Argo and the CCI+SSS fields ( Figure 7). The spatial patterns are especially coherent in the tropics. At higher latitudes, the CCI+SSS variability in very dynamical regions (Gulf Stream, Agulhas current, Rio De La Plata river plume) appears slightly smaller with CCI fields than it is with Argo individual measurements, possibly due to variability at smaller spatio-temporal scale than 50 x 50 km 2 and one month. On the opposite, the CCI product variability is larger in regions with very stable SSS, e.g. in the Southern Ocean, likely a result of the increasing satellite SSS noise with decreasing temperatures. The statistical distribution of the CCI SSS is close to the one of Argo SSS (Figure 7, bottom left). The statistical distribution of the difference is much narrower than a Gaussian distribution derived from the STDD, and very close to that derived from STDD*, due to the presence of outliers at the edges of the distribution. It is interesting to consider the temporal evolution of the median, STDD and correlation coefficient, r, between CCI and Argo SSS (Figure 8 and Figure 9). STDD* is always less than 0.2 ( Figure 8, 3 (Figure 8 and Figure 9). This is possibly because the thermal model of the SMAP reflector was empirically adjusted using Argo temporal-latitudinal SSS profiles . However, we also noticed that seasonal   All these studies require an accuracy between 0.1 to 0.3 at spatio-temporal scales greater than 100km and one week (see https://www.wmo-sat.info/oscar/variables/view/sea_surface_salinity).
While the spatial and temporal resolution requirements of the users and the previously mentioned approach is, therefore, upstream of the optimal interpolations which correct satellite SSS biases using in-situ SSS fields on a monthly basis or less, such as (Melnichenko et al., 2016). The CCI+SSS fields could be used as inputs to such method, as was done with SMOS data (Nardelli et al., 2016) or with SMOS and SMAP data (Kolodziejczyk et al., 2020).
Nevertheless, there is still room for improving CCI L4 SSS, their uncertainties and validation. In particular, it would be interesting to reach better stability of the CCI L4 SSS time series.
In particular, seasonal latitudinal differences remain with respect to Argo salinities, L-band in cold water has also been shown to remain an issue . This will be corrected in the future CCI L4 SSS release. Another source of uncertainty comes from the varying versions of ECMWF forecast fields used as priors (wind speed, SST) in the SMOS SSS retrieval. The use of the same model version for the reanalyzed fields, such as reanalyses like ERA5, could help stabilize the time series. Last, an ongoing SMOS reprocessing with a revised calibration is more stable (preliminary results) and the level 1 revised algorithm corrects part of the sun contamination at the high northern latitudes. For Aquarius SSS, remaining systematic differences have been found between ascending and descending orbits (Kao et al., 2018;Meissner et al., 2018). We have not considered correcting Aquarius SSS for systematic latitudinal seasonal biases before merging with SMOS SSS, but this should be envisioned in the future. Future studies should also pay more attention to systematic differences arising from the components of the radiative transfer models and of the prior datasets which differ between the processing chains of each of the three sensors.
The development of this CCI L4 version 2 algorithm focused on low to mid latitudes regions where satellite SSS datasets were the most mature. Recent products (Brucker et al., 2014;Olmedo et al., 2018;Supply et al., 2020b;Tang et al., 2018) are nevertheless achieving a useful accuracy for detecting Arctic Ocean freshwater changes (Fournier et al., 2020), changes in river plumes extent (Vazquez-Cuervo et al., 2021) and their relations to wind atmospheric forcing (Tarasenko et al., 2021). Hence keeping the best potentialities demonstrated with the existing datasets associated with the three satellite missions at high latitudes will be another remaining main challenge for the future research and developments of CCI+SSS merging algorithms. In this CCI product, no correction has been applied for rain effects on SMOS SSS products, because reliable rain estimates at hourly resolution as provided by IMERG, and as required by correction or sorting methods (Supply et al., 2020a;Supply et al., 2018), were not available before 2014, i.e., at the time of this CCI+SSS version 2 development. This effect might be non-negligible, even on monthly SSS estimates in the rainiest regions of the globe and is very likely responsible for zonal r lower than 0.95 in the tropics (Figure 9). In very rainy regions such as the Pacific ITCZ, Boutin et al. (2014) estimated that vertical representativeness mismatch might lead to differences between monthly salinity at a few meters depth and in the firstcentimeter depth of up to 0.5 at satellite pixel level (50x50 km 2 ). Nevertheless, this effect is very patchy, and when averaged over all longitudes, it is mostly less than 0. Validation with standard statistical indicators such as the ones used in this paper has inherent limitations. For instance, high-resolution fields might appear as having a higher RMS difference with respect to reference fields than lower resolution fields when representativity errors between reference fields and high-resolution fields are important. This effect contributes to some lower RMS difference computed with Aquarius than with CCI L4 SSS. We identified it by looking at r 2 , but methodologies such as the ones developed in the high resolution modeling community (Crocker et al., 2020) could be investigated to better quantify the accuracy of highresolution fields relative to lower resolution fields. Wavenumber spectral analysis, such as the one performed on sea surface height by (Dufau et al., 2016), should also be studied in order to validate dynamical features of SSS at various spatio-temporal scales.
The characterization of SSS variability remains challenging as, on one hand, the combination of in-situ and satellite information remains to be improved (Stammer et al., 2021); and on the other hand, regions with high SSS variability such as the river plumes or strong surface currents regions are the ones benefiting the most from the satellite information (Tranchant et al., 2019). The statistical distribution of SSS is not expected to be Gaussian (Bingham et al., 2002), especially in regions affected by fresh-water inputs, so that vertical, temporal and spatial representativity errors between in-situ and satellite measurements are not expected to be Gaussian. In particular, the SSS distributions are expected to be skewed towards low SSS values while the higher salinity parts of the SSS distributions are expected to vary much less. This leads us to adopt an adjustment of the full time series of CCI L4 SSS and ISAS SSS in fresh and very variable regions based on a high quantile of their statistical distributions.
Nevertheless, the OI assumption of Gaussian errors used here might lead to some drawbacks in fresh regions such as river plumes or rainy areas, e.g., an artificial increase (decrease) of the uncertainty during periods with decreased (increased) variability that are very difficult to quantify given the sparseness of existing in-situ measurements. For the reasons outlined above, the validation of the SSS and its uncertainty estimate is very tricky and requires extended research to go beyond the relatively crude validation presented in this paper.
Finally, we plan future CCI+SSS updates typically once every 18 months. In future versions of the CCI products, in addition to a global product, polar products are foreseen. We also plan to develop several regional products with longer time series than the one presented in of Bengal. For this, we will complement the observations provided by the suite of L-band sensors using AMSR-E lowest microwave frequency channel data (at 6.9 GHz=C-band and 10.7 GHz=X-band) acquired in warm and strongly contrasted dynamical river plume regions. In such conditions, the small SSS signal contained in C-band radiometer data is improved by differentiating the vertical polarization surface reflectance between the C and X band, minimizing SST and wind effects on the data. Monthly-averaged SSS retrievals using such approaches have been already demonstrated from AMSR-E data for the Amazon plume region (Reul et al., 2009) and HY2-A data for the freshwater runoff near the Yangtze Delta (Song and Wang, 2017). In the future, new missions such as the Copernicus Imaging Microwave Radiometer (Donlon, 2020)  All the material presented here is not essential to the comprehension of the article but provides more detailed information to the reader.

S1. SSS data requirements for ocean and climate studies
To create an SSS data set that satisfies the needs of climate users, both modeling and Earthobserving scientists groups, users of satellite SSS data were consulted through various approaches: personally, via e-mail, mailing lists, or at meetings. They were invited to participate in web surveys and to specify their requirements for satellite SSS data. In our survey, we asked specific questions to find out the user's priorities (typically higher resolution or improved uncertainty estimates).
The survey (available on https://forms.gle/BVDroYrNpVvpxFJu9) gathered 54 answers from various countries of origin/fields (Table S1). Most responses were from the USA (28%), followed by Germany (19%) and the UK (17%). The questionnaire had four different parts, which were 1) User profile information, 2) User dataset requirements, 3) User dataset quality information, 4) Other user requirements or suggestions.
The majority of users requires global spatial coverage and temporal coverage from at least 9 years. The resolution requirements vary according to the studied phenomena. About 33% of respondents require data with a temporal resolution of 1-3 days, while, for 35% (28%) of respondents, weekly (monthly) averaged data are sufficient ( Figure S1a). In terms of spatial resolution, 39% of respondents require data on a 0.25° spatial grid, while 28% of respondents require data on 1° spatial grid ( Figure S1b). The majority of respondents would prefer a data product with high spatial and temporal resolution (weekly, 0.25°) on a regular latitude-longitude grid. Interestingly, a majority of users would prefer a product with high temporal and spatial resolution and a lower accuracy rather than working with a product with high accuracy but a lower resolution ( Figure S1c). It was also found that the participants are aware of the data set limitations and have realistic expectations. According to the survey, data should be combined to overcome the weaknesses of individual datasets. 50% prefer a combination of satellite and in-situ measurements, whereas 39% require the combination of data from different satellite sensors. However, in the CCI L4 SSS products described here, information from in-situ measurements was restricted to a minimum in order to work with measurements having homogeneous spatio-temporal resolution and sampling.
By making available the multiple-sensor datasets on different spatial-temporal grids, the needs of different users can be met. The most common requirement is for L4 data (43%), directly followed by requirements for L3 (37%). Some potential users, mainly modelers or scientists investigating rapid SSS changes, require L2 (20%). L3 and L2 data are already available from the original data centers. L2 and L3 datasets including the CCI+SSS systematic corrections are kept as an internal CCI+SSS product.
Uncertainty information for each SSS grid point has to be fully characterized, including random noise and systematic uncertainties of the applied adjustments. Information about bias (systematic uncertainty) correction is most commonly required by respondents. 46% of the respondents would prefer a quality information easy to use, such as a good/bad flag or the probability that a value is good/bad.
User Requirement Survey results show the importance of contacting users and promoting communication between the users and potential users of CCI L4 SSS fields. Users will be regularly contacted to refine requirements, as well as to check their satisfaction with the CCI L4 SSS product. The recommendations regarding resolution, format, quality, and additional information derived from the user consultation are summarized in the User Requirement Document (URD available on https://climate.esa.int/en/projects/sea-surface-salinity/keydocuments/).

S2. SMOS SSS data pre-processing
We only consider SMOS SSS satisfying the following criteria (same notations as in ): normalized  of the retrieval,  N < 3, SSS random uncertainty, ESSS_L2 < 3, pixel within +/-400 km from the center of the swath, with small number of Tb outliers (level 2 fg_outlier flag), uncontaminated by ice (level 2 Dg _suspect_ice=0; this flag removes pixels in cold waters (SST<2°C) in which at least one Tb differs by more than 20K from modeled Tb. This is a very stringent filtering that is likely to be removed in future versions), with moderate to strong RFI and ice contamination as detected using SMOS retrieved pseudo-dielectric constant, Acard (|Acard smos -Acard mod|<2 and Acard>42; see more in (Supply et al., 2020b)), wind speed less than 16 m/s, SSS between 2 and 45.
Deficiencies in the dielectric constant model leads us to adjust SMOS SSS with a polynomial SST function derived from comparisons between Aquarius SSS (retrieved with similar dielectric constant model and atmospheric model as SMOS processing models), and Argo SSS (blue dotted curve in Figure 16 of (Dinnat et al., 2019)). A correction for seasonal latitudinal varying biases is also applied, similar to what is described in : SSSobs(t, , ,xswath, xorb)=SSSsmosref (, ,m)-blat (,xswath,xorb,m) where SSSobs is the observed SSS, t is the time of the measurement, , and , are respectively the latitude and the longitude of the considered pixel over the ocean, xswath corresponds to the pixel location across the swath, xorb indicates the satellite orbit direction (ascending or descending), blat is a latitudinal correction that varies seasonally as a function of the month, m, and SSSsmosref is a reference SMOS SSS taken at a given xswath and xorb, chosen so that SSSsmosref interannual variability for the considered month and the corresponding pixel is the closest with that of in situ interpolated (In-situ Analysis System, ISAS) SSS after 5° latitudinal smoothing. blat is estimated through a least square minimization approach, and through a series of iterations. In order to avoid land-sea contamination, blat is derived from SMOS measurements further than 1200 km from coast except at the high northern latitudes where the distance to coast is reduced to 600km in order to get enough measurements. It is computed over 2012-2018 to 6 avoid large RFIs in the North Atlantic in 2010 and 2011. blat is then removed from all SMOS SSS whatever their distance to coast and before estimating the land-sea contamination correction.

S3. Generation of level 4 SSS fields: detailed algorithm
The algorithm is searching for solutions SSS(t) and bc that both minimize the cost function. Each grid node is processed separately. All available SSS data associated with the grid node considered are used by the algorithm. The problem is linear, so that to minimize the cost function, a classic Raphson-Newton descent is used.
This matrix is calculated on the observation points.
The covariance matrices used are defined as follows: -Co the error matrix with data uncertainties derived, -Cm the matrix of SSS variability and a priori uncertainty on bc, -Cr the matrix of representativity uncertainties.

Co= [
CSSS is a time smoothing operator that contains the expected variability that is provided as auxiliary data. Thus, the covariance of the SSS that links two times t1 and t2 (either between two observational times or between an observational time and a sampled time of the OI) is written: with ξ=25 days and 6 days for monthly and weekly products respectively. "sigSSS" is interpolated temporally to t1 and t2 from seasonal variability.
"Cbc" is a diagonal matrix that contains the a priori standard deviation of biases. This standard deviation is set at 4 pss.
In addition to measurement uncertainties, representativity uncertainties are added:

Ct=Co+Cr
Representativity uncertainties are reported monthly. They are interpolated temporally to the acquisition times.

F(m)=SSS-bc
We look for SSS_est and bc_est that minimizes C (SSS,bc). The solution of minimization is written: where "T" indicates the transpose operator, Cm is the matrix of variability operating in the observational space and Cm' the matrix of variability between observational time and regular sampled time of the OI.

Estimation of monthly SSS
In order to estimate the monthly SSS, we proceed in 3 steps: 1) a first estimation of the biases and time series of SSS is performed spatial grid node by spatial grid node, 2) a 3-sigma filtering of the observed SSS in comparison with the estimated SSS is done.
The aim here is to identify any outliers against the returned SSS field. Outliers can be linked to intermittent RFIs. It is assumed here that stable RFI contamination can be corrected.
3) a second estimate of SSS biases and time series after removing outliers.
The relative biases used to derive monthly SSS are estimated taking the averaged SSS from the SMOS central across swath location as a priori.

Estimation of weekly SSS
To estimate the weekly SSS, the biases calculated at the monthly SSS generation step are frozen (it is assumed that the biases will not be better estimated from a weekly smoothing). We start from the monthly SSS as a priori value. We estimate the weekly fluctuations around this a priori, taking into account the acceptable SSS variability between weekly and monthly fields that was derived as a monthly climatology from the Mercator model. A 3-sigma filter is used in order to eliminate outliers that deviate too far from what is expected. Here, = √ _ 2 2 + 2 . The weekly SSS field estimate is done in a single step. 9 Absolute SSS correction At the end of the inter-sensor bias correction step, the salinities are obtained in relative values, i.e.
they are known within one additive constant. This is corrected by adjusting a quantile of the CCI and ISAS SSS statistical distributions in each grid node over the period considered. The dynamics of SSS variability are not affected by this adjustment as only one constant value, grid node per grid node, is added for the entire period. In regions where SSS variability is low, we assume that high frequency variability better sampled by CCI than by ISAS does not affect significantly the median of the SSS and we therefore adjust the SSS median (50% quantile). In regions with larger variability, given that intermittent freshening is much more frequent than intermittent oversalting, we expect the high part of the SSS distribution to be less affected by the higher frequency sampling by satellite than by ISAS. Hence in case of high weekly variability, we perform the calibration of CCI SSS on ISAS SSS, not by using the median but a higher quantile, in order to promote the calibration on the high SSS values. A high quantile is not used everywhere as in case the SSS error is greater than the variability, the high quantile of the satellite SSS is expected to differ (be higher) from the one of ISAS.
If the variability is greater than 0.8, the quantile is taken as 80%. If the variability is between 0.6 and 0.8, we take a quantile intermediate between 50% and 80% that varies linearly with the SSS variability. The map of quantiles used for the absolute calibration of the SSS is given in Figure   S4.

Estimation of SSS uncertainties
The computation of theoretical uncertainties is obtained directly from the pseudo Hessian matrix.

Cpost=Cm''-Cm'•H T • (H•Cm•H T +Ct) -1 • H • Cm′ T
Where single apostrophe (') indicates covariance defined between observational time and regular sampled time of the OI. Double apostrophe ('') indicates covariance acting over regular sampled time of the OI.
The problem becomes into inverting the "H·Cm·H T +Ct" matrix over the entire period, which is rather computationally heavy. We therefore prefer to take a sliding window over a large time interval and invert the matrix on this time domain (the computation being similar to the one we could perform over the entire period).
Note that the a posteriori uncertainty is necessarily lower than the variability introduced in the operator Cm. In the monthly case, this variability corresponds to the expected monthly fluctuations with respect to the whole time series. In the weekly case, the variability is calculated relative to the monthly field. The latter is generally lower than the monthly variability. The a 11 posteriori uncertainty obtained on the weekly fields should therefore be lower than that obtained on the monthly fields. However, this is only true if the weekly fields are derived from noisecorrected monthly fields, which is not the case. The propagation of uncertainties on the weekly fields must therefore take into account uncertainties on the monthly field. Thus, for the monthly fields, we have: with "Cmmonth", the monthly variability and "Cmweek" the weekly variability relative to the monthly variability.
The a posteriori uncertainties on the monthly and weekly fields are therefore obtained as follows: σSSS month =√diag(Cpost month ) σSSS week =√diag(Cpost week ) The number of outliers is also calculated on this same basis as well as the number of data available. The window sizes used are respectively +/-30 days and +/-10 days for monthly and weekly products respectively.

12
S4. Climatology of monthly SSS variability relative to the whole period SSS mean Figure S4. Examples of climatological maps of monthly SSS variability relative to the whole period SSS mean. a) February; b) May; c) August; d) November.
13 S5. Climatology of weekly SSS variability relative to monthly mean: Figure S5. Examples of variability of weekly SSS relative to monthly mean SSS taken as input information in weekly OI , from GLORYS reanalysis. A) February; b) May; c) August; d) November.

S6. SSS random uncertainties of L2 satellite SSS
The SMAP random uncertainties are derived from the std of the difference between SSS retrieved from fore and aft acquisitions ( Figure S6.1, red). They are very close to a modeled error with a 0.45K radiometric noise.
The Aquarius random uncertainties are derived from comparisons of successive (7-days apart) Aquarius SSS measurements and are fitted with an SST dependency.
The SMOS random uncertainties are taken from the theoretical error multiplied by the Chi of the retrieval provided in SMOS L2 files which provide a reasonable estimate ( Figure S6.2).
In the above estimates, only pixels further than 800 km from coast have been considered in order to avoid land-sea contamination and very large representativity uncertainties.
14 Figure S6.1. SSS uncertainties derived for SMAP (red) and Aquarius (blue) as a function of SST.
We check the reasonable behavior of estimated random errors, σ , by considering the statistical distribution of the centered reduced SSS, SSSc: where SSSobs is the retrieved SSS possibly corrected from systematic uncertainties, SSSref is a reference SSS. Figure S6.2 shows an example obtained with SMOS data further than 800 km from coast compared with a 20-day SSS average. In that case SSSc is quite close to the expected Gaussian law. The slightly large value of std(SSSc) (1.25 instead of 1) is partly due to the presence of outliers. Closer to the coast, std(SSSc) deviates more significantly from 1. Part of this difference can be associated with the variability of salinity. In order to verify this, we sought to quantify std(SSSc) in regions with low variability. For grid nodes with variability lower than 0.2 on SMOS-CCI SSS rmsd, we compute a robust std, and we observe it to increase towards the coast. We apply the same process to SMAP and Aquarius data. We then derive multiplicative factors which are function to the distance to the coast, f(dcoast), that will be applied to σ of each instrument ( Figure S6. 3), so that the reduced random variables normalized with the L2 random uncertainties multiplied by these factors, have a standard deviation equal to 1.