Supermodding – A special footprint operator for mesoscale data assimilation using scatterometer winds

Satellite observation footprints may extend over many grid points of a high-resolution limited-area model, while small and fast model scales may not be traceable by the observing system. We discuss the spatial representation of scatterometer ocean surface winds for the representation of high-resolution model state variables. A prototype observation operator called supermodding is studied in the variational assimilation framework in order to avoid correcting unconstrained small scales during the assimilation procedure. The challenges connected to small scales that are represented by the mesoscale model with respect to the application of the supermodding operator are discussed through idealised experiments. These results show that the application of the supermodding operator is able to avoid correcting unconstrained small scales, putting focus on the large scales only during data assimilation. Departure-based diagnostics show that the footprint representation helps to reduce the standard deviation of observation minus background departures (4–5% reduction) while the statistics for the supermodding method show a further reduction (8–11%). The impact of the supermodding approach is discussed through a forecast sensitivity study using a moist total energy norm (MTEN)-based technique and verification of the forecasts against observations. It is shown that the impact of the supermodding method to ASCAT data assimilation on the upper-air AROME-Arctic forecasts is observed over sea and near the surface, and that it


INTRODUCTION
Today, remote-sensing observations provide essential information for the initialization of advanced numerical weather prediction (NWP) models. The continual appearance of new satellites and new observations push further the use of remote-sensing data and the corresponding developments aiming at their real-time operational application. It is clear that over high-latitude regions, where conventional observations are usually sparse and NWP forecasts are less skilful (Jung et al., 2016), accurate initial conditions are not achievable without these satellite observations, especially for a limited-area NWP model (LAM). However, the satellite data are mostly assimilated in a conservative way, which means the data assimilation (DA) system is not taking into account all properties of the satellite measurements for the best corresponding estimate of the atmosphere. As an example, satellite observations are often used as point measurements in DA while their footprints may extend over many model grid points, in particular for LAM. Among the biggest DA challenges, the representation of satellite measurements for their model counterpart might act as one of the most important obstacles in the success of LAM DA. The total observation error consists of different error contributions in which the instrument error is generally assumed to follow a Gaussian distribution with zero mean, as well as the observation representativeness. Daley (1991;1993) explains that the representativeness error mainly derives from the resolution mismatch between the observation and the NWP model background. For instance, in a global model, this error comes from the unresolved scales of the model, which are well represented in the measurement. Other authors considered a combined forward model (Harris and Kelly, 2001;Sherlock et al., 2003) or forward interpolation error (Lorenc, 1986), which incorporates the errors of the observation representativeness and the observation operator. The representation error (defined by Janjić et al., 2017) encompasses error due to unresolved scales and processes, forward model or observation-operator error and pre-processing or quality-control error. In a perfect DA system, all components of the total observation error have to be appropriately described by the employed observation-error covariance matrix.
For atmospheric motion vectors (AMVs) and radiance data in a global DA system, Bormann et al. (2002; and  show that correlated observation errors exist, which can be partly explained by the presence of representation error. Liu and Rabier (2002) demonstrated in a simplified framework that more observations can lead to better analyses, but only when the assimilation system is optimal and uncorrelated errors are ensured unless correctly specified in the observation-error covariance matrix. In a later study, Liu and Rabier (2003) verified this further in a more practical four-dimensional variational (4D-Var) assimilation scheme with simulated (that is, perfectly known) errors. Therefore, when observation-error correlation exists, but is ignored, more advanced error statistics or different practical solutions are required to improve the sub-optimal DA system. For instance, Bédard and Buehner (2020) showed in a 1D simplified model that observations with spatially correlated errors can still be employed to extract more small-scale information by the inflation of observation errors. In the area of ocean DA and altimetry, Brankart et al. (2009) investigated a parametrization of observation-error covariance matrix in Ensemble Kalman Filters and Ruggiero et al. (2016) examined another way to account for observation-error correlations, which consists of the transformation of the observation vector allowing us to use errors that are spatially correlated.
Operational global NWP DA systems (Parrish and Derber, 1992;Lorenc et al., 2000;Rabier et al., 2000) have for a long time been using satellite observations. For global models a diagonal observation-error covariance matrix is generally employed and correlated errors are eliminated by observation thinning (Järvinen and Undén, 1997). Nowadays interchannel error correlations (e.g.,  or the representation error by the use of superobbing (e.g., Lin et al., 2017) are more and more commonly taken into account, but the application of the full observation-error covariance matrix is still considered to be computationally too expensive. Recently, the superobbing of Advanced Scatterometer (ASCAT) winds was also studied at the European Centre for Medium-Range Weather Forecasts (ECMWF). The experiments using superobbed ASCAT observations show reduced analysis and background departures in the global 4D-Var system (De Chiara et al., 2019). The use of satellite measurements in LAMs has been investigated and implemented operationally following the success of global model applications. However, showing the benefit and the added value of such observations in LAMs appears to be even more challenging. The representation error and unwanted observation error correlations are tackled with similar approaches (i.e., thinning and superobbing as in global DA, e.g., Scheck et al., 2020 assimilating visible satellite images). For very-high-resolution observations like RADAR, same procedures were successfully employed as well (Seko et al., 2004). Additionally, for the synoptic scale, i.e., coarser resolution LAMs, the superobbing of ASCAT was also applied, e.g., in the Weather Research and Forecasting Data Assimilation (WRFDA) system on a 15 km horizontal grid (Zhang et al., 2018).
The impact of scatterometer data in a high-resolution DA system was studied as well by Valkonen et al. (2017) showing positive impact, but only after precise observation-error tuning and data selection. However, as the grid size of LAMs is nowadays gradually increased towards mesoscales and even sub-kilometric scales, the remote-sensing observations do not actually measure or represent the small-scale spectrum of very-high-resolution models. In order to eliminate this type of error, the so-called footprint operator has been developed, e.g., in the Environment Canada regional ice analysis system (Buehner et al., 2013;Carrieres et al., 2017) and at Météo-France for making an assimilation study of satellite infrared radiances (Duffourg et al., 2010). The footprint operator aims to average gridded model information over an area equal to the footprint of the assimilated satellite measurement, i.e., rather than averaging in observation space to best represent the model resolution. For very-high-resolution LAM we need to average in model space to best represent the observation footprint.
This paper addresses spatial structures simulated by models which cannot be observed by the observing system. Belmonte and Stoffelen (2019) demonstrate the relatively large mesoscale errors in global NWP, and furthermore Trindade et al. (2020) show that some of the model errors are persistant. Without a strong ocean coupling, LAM may not do better over the ocean, since no information is available for the initialization of the small scales . The effect of unconstrained small scales in mesoscale DA was discussed by Marseille and Stoffelen (2017) in an idealized framework with scatterometer observations. They showed that averaging in model space improves observation minus background statistics for scatterometer winds for larger scales than the scatterometer resolves, suggesting an improved representation of these observations to their model equivalents. In this paper, we further elaborate on the footprint operator proposed by Marseille and Stoffelen (2017) in a mesoscale DA system with focus on scatterometer winds and taking into account the small spatial scales of the NWP model. Our hypothesis is that the representation of the observation footprint is only one aspect, and we argue that lack of adequate forcing like 4D observations, high-resolution boundary conditions and physical constraints, also determine the scales one can analyse deterministically in a limited-area domain over the open ocean. To test this hypothesis, we introduce the concept of supermodding, which can be considered as a generalisation of the footprint operator. It also implies that predefined weights in DA have to be revised, but currently we do not aim at tuning these errors, which will be the subject of a future study. Furthermore, the definition of representation error is used and considered as Janjić et al. (2017) defined, but it does not cover the spatial representativeness issue we are aiming to study.
The paper is divided into the following sections. In Section 2, the mesoscale AROME-Arctic (Application of Research to Operations at Mesoscale -Arctic) model, scatterometer ocean surface wind data, and the applied methods are presented. The challenges of small-scale DA, the results of a single-observation experiment and the concept of supermodding through an idealised case-study are introduced in Section 3. In Section 4, departure-based diagnostics and verification results of an observing system experiment are presented to assess the impact of the supermodding method on a large data sample and on LAM forecasts. The last Section 5 discusses the different results and draws the conclusion for future studies.

2.1
The AROME-Arctic model AROME-based models (Seity et al., 2011;Bengtsson et al., 2017;Termonia et al., 2018) are developed in a collaborative effort of the ALADIN-LACE-HIRLAM (Aire Limitée Adaptation Dynamique Développement International; Limited-Area modeling in Central Europe; High-Resolution Local-Area Modelling) consortia. The AROME-Arctic spectral model makes use of a semiimplicit two-time-level scheme and a semi-Lagrangian advection scheme with the non-hydrostatic ALADIN dynamical core (Bubnová et al., 1995;Bénard et al., 2010). The applied physical parametrization schemes and the surface scheme called SURFEX (Externalized Surface) are described by Bengtsson et al. (2017) and Masson et al. (2013) respectively. In this study, the AROME-Arctic operational settings of cycle 40h1.1 were used to set up the experimental system, which was operational from 2016 until early 2019 (Müller et al., 2017a). This configuration is running with 2.5 km horizontal grid spacing and using 65 vertical levels up to 10 hPa over the high-latitude model domain, which is displayed in Figure 1. The lateral boundary conditions (LBCs) in the AROME-Arctic model are provided by the global ECMWF/IFS (Integrated Forecasting System) model at TCo1279 spectral resolution (around 9 km grid distance). Regarding the observation set, the operational AROME-Arctic DA system (Müller et al., 2017b) uses all available conventional and numerous satellite observations including ASCAT scatterometer data. For a given date, the numbers of active observations in the AROME-Arctic DA system are summarized in Table 1. Furthermore, the whole configuration and more details of the DA system are described by Randriamampianina et al. (2019).

The ASCAT coastal product
The scatterometer instrument on board polar-orbitting satellites is designed to measure wind speed and direction by backscattered radar signals from ocean waves. The ASCAT instrument is carried on EUMETSAT (European Organisation for the Exploitation of Meteorological Satellites) MetOp (Meteorological Operational Satellite) satellite series MetOp-A, MetOp-B, and recently also on MetOp-C. The scatterometer wind measurements are determined on a grid of Wind Vector Cells (WVCs). Each WVC has two, three, or four ambiguous wind solutions depending on the results of the applied wind retrieval algorithm (Stoffelen and Anderson, 1997;Portabella and Stoffelen, 2004). The wind ambiguities have to be removed to provide a unique solution for active DA. This is generally achieved by a comparison with NWP prior information choosing the closest wind ambiguity to the estimated wind field in a 2D-Var procedure (e.g., Vogelzang et al., 2009;Vogelzang and Stoffelen 2018). The observation footprint of scatterometer measurements corresponds to the size of the WVCs used for actual wind calculation. In this study, the ASCAT coastal product with 12.5 km distance between adjacent observations along and across the swath is considered. The effective resolution of scatterometer wind products is typically twice the sampling distance, i.e., around 25 km for the coastal product (Marseille and Stoffelen, 2017).

The data assimilation scheme
In the ARPEGE/IFS (Action de Recherche Petite Echelle Grande Echelle/Integrated Forecasting System) model family including the AROME model, the variational technique (4D-Var and 3D-Var) became paramount for atmospheric DA purposes. One advantage of this framework compared to, for example, optimal interpolation is that remote-sensing measurements can be more optimally used. The implementation of the variational method incorporates the use of an incremental formulation, a change of variable, and iterative minimization algorithms to speed up the assimilation procedure (Courtier et al., 1994). Following the Ide et al. (1997) notation, the variational analysis can be obtained by searching for the minimum of a cost function (J) described by Equation (1)  gradient Equation (2).
where B and R are the background-error and the observation-error covariance matrices, respectively. H is the linearized observation operator and H T stands for its adjoint version. x is the increment and d represents the innovation, i.e., the difference between the observation vector y and its model counterpart projected to observation space by the nonlinear observation operator H(x b ): In Equation (3), x b stands for the background model state. More details about the variational assimilation method can be found in Bouttier and Courtier (2002). H includes spatial horizontal and vertical interpolation, physical conversion, and a radiative transfer model in the case of radiance DA. In 4D-Var, H involves the model propagator as well to take into account observations at the appropriate time.
Considering that wind components are analysed variables (via vorticity and divergence) in the variational framework, scatterometer observations do not require a complex (i.e., nonlinear) observation operator. In its recent implementation, the operator does bilinear horizontal interpolation using 4 or 12 neighbouring grid-points, depending on the user's decision to compute the horizontal part of H(x b ). For this reason, the observation-operator error is, in fact, an interpolation error for this observation type. This interpolation approach is considered as an accurate solution for most of the conventional observations, but scatterometer observations are representative of an area about three times the size of a WVC, and at very similar spatial resolution to the model.

The supermodding observation operator
Supermodding aims to improve the spatial representation of the observation by averaging the model counterpart and therefore ignoring the correction of small unobserved and unconstrained scales in DA. This approach was implemented for scatterometer data in the observation operator of the variational assimilation scheme (schematic Figure 2 illustrates the method). For the purpose of this study, the size of averaging in model space (called the supermodding size hereafter) can be easily changed to test the impact of Additionally, for the reason that the scatterometer measures surface wind over sea, the observation operator has to avoid taking into account NWP information over land, which might occur for large supermodding sizes. In this case, grid-points over land are omitted in the observation model-equivalent computation, using the land-sea mask model information.
This solution enables us to compare fairly equal observation datasets even with different supermodding sizes. For further constraints to avoid H(x b ) computational conflicts both at the domain border and close to sea ice, scatterometer data were blacklisted 250 km away from the LAM domain border and over sea surface temperatures lower than 2 • C. This was done in all experimental runs to ensure comparable datasets. Furthermore, the selected grid points of the corresponding supermodding size are averaged with equal weight. One approximation in the definition of J is that the observation operator can be linearised around the background state, that is H(x) ∼ H(x b ) + H( x). In this study, the H supermodding operator used for initial departures is linear, therefore it is used for the tangent-linear operator (H) as well. The corresponding conjugate transpose of H, that is the adjoint, is also needed for solving the variational problem, which is made of the transpose of each line of the averaging (supermodding) operator code in reverse order. This implemented prototype of the supermodding operator replaces only the horizontal part of the default scatterometer observation operator and keeps the vertical (interpolation) part unchanged. Additionally, the multi-incremental variational formulation was not taken into account and an experimental 4D-Var has been used in our study. Regarding computational aspects of the supermodding operator, the larger supermodding sizes require large numbers of neighbouring grid points, which need to be accessible during the computation of innovations and increments. In the ARPEGE/IFS model family, the semi-Lagrangian (SL) advection requires flexible communication between a processor and the neighbouring points for the computation of a trajectory, therefore a computation domain (SL halo) is defined to ensure the access of necessary grid points. In DA configurations, the SL halo is also defined in the set-up phase after the distribution of grid blocks to processors has been defined. For the supermodding operator, the width of the SL halo is extended proportional to the requested averaging size in the set-up level; this allows only one calculation of SL communication tables, but needs more memory in comparison with horizontal interpolation or an on-demand halo extension scheme, which is considered as a future development for the supermodding operator.

Diagnostic tools
In order to analyse the characteristics of the AROME model and the impact of the supermodding operator, the following diagnostics are utilised in this study. The kinetic energy spectra of the AROME analysis and forecast, and the LBCs on AROME domain are computed in spectral space using ellipses referenced by the model wavenumbers. This tool aims to determine the amplitude spectrum as a function of atmospheric scales simulated by the NWP model. More details about the spectra computation and the AROME model in general can be seen in Ricard et al. (2012).
A forecast sensitivity of initial conditions can be assessed by a technique called Moist Total Energy Norm (MTEN; Storto and Randriamampianina 2010). It is defined by the change of the energy norm between a reference (in our case an AROME forecast without assimilation) and an experiment at a given forecast time t of a selected case. The impact of an observation (scatterometer in our case) or a change in the observation operator (supermodding in our case) can be described through a cost function J as where M t is the (fully nonlinear) model propagator at the forecast t and x exp a and x ref a are the experimental and the reference analyses respectively. The ⟨ … , … ⟩ is the moist total energy norm operator applied for evaluating the impact. Basically, MTEN can indicate how and where the studied observations or changes in the observation operator have impact on AROME forecasts, however the MTEN does not give information about the quality of the impact (positive or negative).

Verification techniques
Forecast verification against independent observations is challenging over the AROME-Arctic domain because there are not many available observations to achieve this. For instance, the coverage of observations from polar-orbiting satellites is one issue and the temporal availability of the remote-sensing data is another in the verification procedure. Therefore, the verification against radiosonde observations was performed using only eight radiosondes, which are very well-distributed inside the NWP model domain (Randriamampianina et al., 2019). Common error statistics -the root mean square error (RMSE) and normalised differences in RMSE -are calculated by comparison of the model data to radiosondes and comparing different experiments. Using a two-month trial period between 01 February and 31 March 2018, AROME-Arctic forecasts up to 12 hr initialised at 0000 and 1200 UTC were performed and verified against radiosondes excluding the first 5 days, which serve as a warm-up period.

3.1
Small-scale data assimilation Wikle and Berliner (2006) define DA as an approach for fusing data (observations) with prior knowledge (e.g., mathematical representations of physical laws; model output) to obtain an estimate of the distribution of the true state of a process. In global NWP models, the data coverage provided by the global observing system largely determines the atmospheric scales, which can be analysed deterministically over the oceans and in the upper air . The available observations provide a relatively stable 4D data sample in each global atmospheric analysis, while inside a LAM domain the available data sample is non-uniform and largely variable (e.g., Table 1). Another substantial difference between global models and LAMs is the atmospheric scales resolved by the numerical model. The effective resolution of the model can be determined by the kinetic energy (KE) spectrum, which is plotted in Figure 3 for the lowest model level, which is at about 12 m altitude above the surface. It shows that the AROME model on this horizontal resolution is able to reproduce the transition to and part of the k −5/3 mesoscale spectrum (Ricard et al., 2012), in agreement with 2D/3D turbulence theory. At the highest wavenumbers, the tail of the model KE spectrum deviates from the theoretical k −5/3 slope due to the energy dissipation mechanism of F I G U R E 3 The kinetic energy spectra for AROME 3 hr forecast (red curve) and for downscaled ECMWF IFS model (green curve) at the lowest AROME model level (around 12 m elevation). Dashed straight lines correspond to the theoretical slopes of kinetic energy at large-(k −3 ) and mesoscales (k −5/3 ) on the log-log scale the numerical model. At the scale from which the model spectrum departs from the k −5/3 slope, the effective resolution can be defined (Skamarock, 2004). In the case of AROME-Arctic, this scale corresponds to a roughly 25 km effective resolution, i.e., ten times the model grid size. In current DA practice, the ASCAT coastal product in AROME-Arctic is assimilated as a point observation, which appears compatible with the model resolution and the ASCAT spatial resolution. In Figure 3, the energy spectrum of the downscaled LBCs can be seen as well. It shows that the downscaled LBC spectrum deviates from the theoretical slope of energy cascade and also from the AROME spectrum above wave number 20, which corresponds to spatial scales smaller than 90-100 km. This is explained by the global model lacking variability on these scales in order to avoid spurious energy accumulation at the highest wavenumbers. Additionally, the LBC energy spectrum is rather negligible above wave number 50 (i.e., below 35-40 km spatial scales), where the global model does not hold physical information.
The high-resolution LAM simulates smaller scales well and the amplitude spectrum is reasonably reproduced (Figure 3), but the phase spectrum of the small-scale waves cannot be accurately determined due to the fact that physical forcing (e.g., orography) and 4D high-resolution observations are missing over the ocean. Lacking such forcing terms, mesoscale models generate realistic, but not real, small-scale processes and rely more on the LBC forcing. Hence, lacking additional 4D observations over the ocean, the small mesoscale structures cannot be constrained. The objective of supermodding is to project well the observation input in model space by constraining those atmospheric scales which can be adequately determined by the DA system. In practice, this means that small-scale 3D spatial features simulated by the model and partially (2D) observed by the observing system, but which cannot be well initialized in the model, should be neglected through averaging in model space.

A single-observation experiment
A single scatterometer observation was assimilated with 3D-Var (and also with experimental 4D-Var) in order to assess how analysis increments are projected by the supermodding operator in the variational scheme. Figure 4 shows analysis increments for the u-component of the wind vector with 3D-Var using horizontal interpolation ( Figure 4a) and 100 km supermodding size (Figure 4b). Figure 4c shows 3D-Var increments using also 100 km supermodding size but the assimilated single observation was adjusted to get the same first-guess departure as horizontal interpolation (Figure 4a). Figure  It can be seen that for large (100 km) supermodding size, the analysis increment gets smoother and corresponding length-scales become slightly larger as well. The magnitude of the analysis increment becomes smaller at the observation location, but larger in the surrounding area, spreading scatterometer observation information over the area of the supermodding size. Therefore, the supermodding operator has the capability to influence those grid-points which correspond to the observation footprint using the predefined structure functions of the variational scheme. Both the size of the first-guess departure and the spatial characteristics of the observation operator affect the magnitude and shape of the increments. One may wonder which one is the dominant mechanism in terms of reducing the size of the increments at the observation location. We tested this by setting the same first-guess departure in the experiment using 100 km supermodding size. In Figure 4c, it can be seen that the supermodding operator reduces the analysis increments at the observation location while the first-guess departure is the same as with horizontal interpolation. It is clear from Figures 4a  and 5a that the horizontal interpolation provides sharp and localised increments in the AROME-Arctic analysis. This sharp analysis increment is what one can expect for a point-based measurement. These results demonstrate how supermodding alters the information of a single observation, however an idealised case-study with more ASCAT observations can explain the concept of supermodding, which was carried out and discussed in Section 3.3. The experimental AROME-Arctic 4D-Var was set up for the single-observation experiment only, therefore experiments discussed in the following sections make use of the operational set-up of AROME-Arctic, i.e., the use of 3D-Var and ASCAT observations from MetOp-A and MetOp-B satellites.

An idealised assimilation case-study with only ASCAT observations
The concept of supermodding can be best explained through a case study where all available ASCAT observations were assimilated without other operationally used observations in AROME-Arctic. The case study analysed here is 1200 UTC, 7th of March, 2018, where the AROME-Arctic model shows small-scale phenomena (convection in this case) which are not observed by the ASCAT instrument. Figure 6 displays a subdomain near Jan Mayen island where wind vectors of AROME-Arctic (Figure 6a) can be seen in full resolution. Figure 6b shows the corresponding AVHRR (Advanced Very High Resolution Radiometer) satellite image indicating that convection was present in the area of interest, however, AROME-Arctic was out of phase and produced too strong convection that is, wind downbursts. At first, a small area of interest is chosen, which is highlighted in Figure 6a for better illustration of resolution differences. In Figure 7, a closer look of AROME-Arctic wind vectors ( Figure 7a) and ASCAT wind barbs (Figure 7b) in full resolution can be compared over this highlighted zone. The AROME-Arctic wind vectors with light grey colour can be seen in Figure 7b as well.
To diagnose the supermodding and the point observation (horizontal interpolation) ASCAT assimilation, Figure 8 illustrates an intermediate step of the assimilation procedure when model information is projected to observation space (i.e., the H(x b )). Figure 8a shows the interpolated AROME-Arctic wind barbs (pink), Figure 8b the supermodded AROME-Arctic wind barbs with 12.5 km averaging size that is, the ASCAT footprint size (green), Figure 8c the supermodded AROME-Arctic wind barbs with 30 km averaging size that is, the ASCAT effective resolution, and Figure 8d demonstrates the supermodded AROME-Arctic wind barbs with 60 km averaging size. The AROME-Arctic wind vectors on model 2.5 km grid can be seen in the background of Figure 8 with light grey colour as well. From the figure, it is clear that default ASCAT assimilation as a point observation compares and corrects small-scale model information with ASCAT measurements, but which cannot be observed by the satellite instrument. This is still the case when representing the The interpolated AROME-Arctic background to ASCAT location (magenta wind barbs), (b) the 12.5 km supermodding of AROME-Arctic background to ASCAT location (green wind barbs), (c) the 30 km supermodding of AROME-Arctic background to ASCAT location (dark red wind barbs), and (d) the 60 km supermodding of AROME-Arctic background to ASCAT location (blue wind barbs) at 1200 UTC on 07 March 2018. The AROME-Arctic on its 2.5 km model grid (grey wind barbs) is shown as background in all figures ASCAT footprint (12.5 km supermodding), while 30 km and more apparently 60 km supermodding constrains the larger scales of the AROME-Arctic background and avoids any correction on the smallest scales, by construction. Figure 9 shows wind u-component analysis increments for a slightly larger subdomain comparing the same four assimilation configurations, namely point observation (Figure 9a), 12.5 km supermodding (Figure 9b), 30 km supermodding (Figure 9c), and 60 km supermodding (Figure 9d). It shows how increments are reshaped by the supermodding observation operator in order to correct the larger scales by the use of ASCAT observations. Similarly to the single observation experiment, it can be seen that correlation length-scales are usually larger and analysis increments are smaller for the supermodding operator.
For this particular case-study, the KE spectra for the above studied AROME-Arctic analyses were computed and compared with the AROME-Arctic background spectrum. This diagnostic was computed for the complete NWP domain. The absolute difference of KE spectra between AROME-Arctic background and analyses is plotted in Figure 10. It can be seen that the absolute difference between horizontal interpolation analysis and the background (purple colour) is the largest for small scales and the smallest for large scales meaning that default ASCAT assimilation impacts the smallest scales substantially. When supermodding is activated and the ASCAT sampling grid is represented (12.5 km averaging size), the smallest scales are effected less (green curve) while below wave number 50 (∼35-40 km), the analysis spectrum overlaps the default assimilation spectrum. A similar transition can be seen with 30 km supermodding, but for scales between wave number 20 and 50 (yellow curve). In the case of 60 km supermodding (blue curve), the assimilation of ASCAT observations has minimal impact on the smallest scales and maximum impact on the larger scales ( Figure 10). This case study exemplifies that supermodding can constrain impact to those atmospheric scales, which are actually measured by the remote sensing instrument and does not touch the smallest scales in the assimilation procedure. This implies that the resulting supermodding analysis is similar to the background at

Departure-based statistics
An informative and low cost assimilation diagnostic used by many operational NWP centres is based on the statistics of observation minus background departures. Moreover, appropriate statistics of observation and background errors are not perfectly known, thus the departure-based diagnostics are essential information, for example, to check for model and/or observation biases, to determine an upper bound for the true observation error, or to get information on combined observation and background (kg•m 2 •s -2 ) F I G U R E 10 Absolute difference of kinetic energy between the AROME-Arctic background and the analyses of the default ASCAT assimilation (purple), the 12.5 km supermodding (green), the 30 km supermodding (yellow), and the 60 km supermodding (blue) errors. Analysis and background departures enable also to separate contributions from background and observation errors if various assumptions are valid (Desroziers et al., 2005). Regarding predefined errors, a diagonal R matrix with assumed spatially uncorrelated errors is employed in AROME-Arctic. To ensure this assumption, for example, scatterometer observations are assimilated with 50 km thinning distance. The pre-defined observation errors for ASCAT data are given for zonal u and meridional v wind components to 1.39 and 1.54 m⋅s −1 respectively. The background errors and its covariance matrix (B) are simulated by downscaling the ECMWF ensemble data assimilation (EDA) results into the AROME-Arctic resolution. The horizontal correlations are homogeneous and isotropic for temperature, specific humidity, vorticity, and divergence analysis parameters and depending on the vertical elevation. The non-diagonal B matrix spreads out the observation information in space by its spatial (and also multivariate) covariances of the background errors. In this paper, we do not modify the abovementioned (operational) settings when only checking the impact of the supermodding method on the AROME-Arctic DA system. In this study, the observation minus background (d o   b hereafter) and observation minus analysis (d o a hereafter) departures are used to evaluate the impact of the supermodding operator using various supermodding sizes with the assimilation of ASCAT observations. For this reason, we express the covariance of the innovation vector (d in Equation (3), which is hereafter d o b ) as follows: where E is the instrumental error covariance matrix, which may include error correlations between different observations. F is the observation representation error covariance matrix and thus the total observation error covariance matrix (R) is the sum of E and F as a full error covariance matrix, but, in practice, a diagonal matrix is used. The B matrix can be expressed as the sum of the following two covariance matrices: B d is the background-error covariance matrix, which contains the model error on deterministic large scales. Deterministic means that atmospheric scales are adequately constrained by the available forcing terms which include 4D observations, boundary conditions, etc. B n is the error covariance matrix related to unconstrained smaller scales. The concept of splitting background-error statistics is motivated by Stoffelen et al. (2020). In Equation (5), E is the statistical expectation. When the supermodding observation operator (H s hereafter) is activated, the covariance of innovations can be written as In Equation (6), d o bs stands for the departure using the supermodded background state. When the supermodding is appropriately applied, the last term in Equation (6) decreases and eventually vanishes .  Table 2) and with different supermodding sizes are shown for both meridional and zonal wind components.
Regarding the d o b standard deviation, a reasonable reduction in the order of 8% for the u-component and 11% for the v-component can be gained by the supermodding operator. The supermodding operator with 30 km size (i.e., with ASCAT effective resolution) achieves 4-5% reduction, which is in accordance with the results of Marseille and Stoffelen (2017). The largest reduction is obtained by the 100 and 135 km averaging in model space for uand v-component respectively. However, the d o b standard deviation starts to increase with even larger supermodding sizes (i.e., with 135, 175, and 225 km sizes). These results suggest that the supermodding removes error variances (essentially from B n ) and, therefore, decreases d o b departures. However, the supermodding may remove true TA B L E 2 ASCAT observations assimilated in the AROME-Arctic 3D-Var with default scatterometer assimilation, and with supermodding method. The observation minus background and observation minus analysis statistics were computed over the period 15-30 November 2013

STDV(d o b ) S T D V ( d o a ) S T D V ( d o b ) S T D V ( d o a ) Size (km)
No. of obs. for u 10 m for u 10 m for v 10 m for v 10 m  , which may relate to detrimental effects due to the small-scale analysis increment, a closer look at the observation weights might be needed.
In order to explain the increased standard deviation of d o a , let us consider the so-called BLUE (Best Linear Unbiased Estimation) analysis equation by taking the minimum of J( x) when its gradient with respect to the increment x is zero (Equations (1) and (2)).
where x a is the analysis increment, K s stands for the Kalman gain or weighting matrix utilising supermodding which can be expressed as Equation (8) using the dual formulation of the assimilation scheme: Equations (7) and (8) show that analysis increments are determined by the weighted innovations where the K weighting matrix consists of an innovation covariance matrix (HBH T + R) −1 and a representer matrix BH T (El Akkraoui et al., 2008). First, the innovation covariance matrix filters the observation information, i.e., the analysis increment is generated in observation space,

F I G U R E 12
The diagnosed observation error correlations for default scatterometer data assimilation (pink), and supermodding at 30 km (red), 60 km (cyan), 100 km (brown), 135 km (lime green), 175 km (green), and 225 km (purple) sizes. The diagnostic was run over the period 15-30 November 2013. A local polynomial regression method (LOESS) was used to fit a smooth curve then the BH T representer matrix spreads the analysis increment from observation space to the model grid by using the background-error correlations. The subscript s means that the supermodding operator is employed in the assimilation scheme. In Equation (8) therefore, H s stands for the supermodding observation operator for scatterometer data, which not only impact the innovations, but the properties of the weighting matrix as well. It means that the supermodding makes the spatial filtering properties of the variational scheme broader (i.e., the correction towards the observation is smaller at the observation location; Figures 4b and 5b show a single scatterometer increment). On the other hand, the impact area increases with larger supermodding size. We note that the d o a standard deviation increases even further when the standard deviation of d o b goes up with large supermodding sizes. In conclusion, reducing H s BH T s without changing R will result in analyses closer to the background, i.e., contributing to larger values of the d o a standard deviation. Another interpretation of departure-based statistics is by the scatterplot of analysis and background departures ( Figure 11). It shows a density scatterplot for default ASCAT assimilation and the fitted linear regression line.
The regression lines are also plotted for the use of the supermodding operator using various sizes, but without indicating their density scatterplots. The line that shows the operational ASCAT assimilation (Def) results in a closer fit to the observations than the use of the supermodding method ( Figure 11). In correspondence with the results on Table 2, one can expect that the larger the supermodding size, the smaller analysis increments of the ASCAT assimilation. Undesirable adjustments in observation weight due to H may be compensated for by changing the observation error. Additionally, the observation-error correlations can be assessed by the departure-based Desroziers et al. (2005) technique providing error correlations as a function of a separation distance (i.e., observations are paired in 5 km bins). This can be seen in Figure 12 for default and supermodding ASCAT assimilation with various supermodding sizes. The spatial correlation structures for diagnosed observation-error correlations show increased error correlations for large supermodding sizes. The diagnosed error correlations with 30 and 60 km supermodding sizes are not particularly increased (even slightly reduced in the first bin), but the correlations become considerably larger above 60 km size. It is important to note that error correlation diagnostics were done without the thinning procedure of scatterometer data in order to evaluate error correlations at short distances, however a minimum distance was set at 2.5 km.
The Desroziers method can provide diagnostics for observation errors and a posteriori tuning of observationand background-error statistics. However, it requires basic assumptions that the assimilation process can be adequately described by linear estimation theory and the error statistics are correctly specified in the analysis. If those basic assumptions are not ensured, the diagnostics might provide misleading results (Bathmann 2018). Additionally, such diagnostic methods can be applied effectively if correlation structures are fairly different in background-and observation-error covariance matrices, which might be less the case for high-resolution LAMs. For that reason, the impact of supermodding on AROME-Arctic analyses and forecasts was carried out without any tuning or modifications of predefined errors. This has to be taken into account during the interpretation of the results in Section 4.2.

4.2
The impact of the supermodding method on AROME-Arctic forecasts In this section, AROME-Arctic experiments were carried out with all operationally used observations, and the only difference between them was the applied observation operator (horizontal interpolation for use as point observation or supermodding with various sizes). The normalised differences in RMSE scores are shown to focus on the statistically significant results in the comparison between the default ASCAT assimilation and the experiments using the supermodding approach.
On Figure 13, normalized differences in RMS error as a function of vertical pressure levels can be seen for wind speed forecasts initialised at 1200 UTC showing the differences between the default ASCAT assimilation and 30 km (Figure 13a), 60 km (Figure 13b), 100 km (Figure 13c),

F I G U R E 16
The Moist Total Energy Norm of the AROME-Arctic case-study computing the diagnostic from 1200 UTC on 25 February 2018. The MTEN score is shown for (a) analysis time and for (b) 3 hr, (c) 6 hr, (d) 9 hr, and (e) 12 hr AROME-Arctic forecasts. AROME-Arctic forecasts are shown using ASCAT assimilation with horizontal interpolation (purple), with 30 km supermodding (yellow), with 60 km supermodding (blue), with 100 km supermodding (orange), and with 225 km supermodding (violet) and 225 km (Figure 13d) supermodding approaches. The use of 30 km supermodding has positive impact below 700 hPa with statistical significance at 700 hPa, however it degrades wind speed at 500 hPa compared to default ASCAT assimilation during the studied period. The 60 km supermodding improves more clearly the wind speed forecasts, with statistically significant positive impact at 850 hPa. Larger supermodding sizes (100 and 225 km) have mostly neutral impact versus default assimilation. It is worth mentioning that supermodding experiments with AROME forecasts initialised at 0000 UTC show similar signals as Figure 13, but without statistically significant scores, which can be explained by the fact that ASCAT observations are not available at 0000 UTC inside the AROME-Arctic domain (Table 1). On Figure 14, normalized differences in RMSE are shown for temperature forecasts initialised at 1200 UTC showing the differences between the default ASCAT assimilation and 30 km (Figure 14a), 60 km (Figure 14b), 100 km (Figure 14c), and 225 km (Figure 14d) supermodding runs. Similarly to wind speed verification scores, the 30 km supermodding shows slightly positive impact in the mid-troposphere with statistical significance at 850 hPa. The 60 km supermodding improves slightly further temperature forecasts compared to the default ASCAT assimilation, with statistically significant positive impact at 850 and 700 hPa levels. 100 and 225 km supermodding sizes have no added value in comparison with default scatterometer assimilation.
On Figure 15, the same normalized differences are shown for the geopotential height parameter from AROME forecasts initialised at 1200 UTC showing the differences between the default ASCAT assimilation and 30 km (Figure 15a impact with no statistical significance in the lower atmosphere, and the 60 km supermodding has slightly better scores than default assimilation with statistically significant positive impact at 500 hPa. The 100 km supermodding provides comparable forecast skill for this parameter to 60 km supermodding with slightly more positive scores and for the largest averaging size, i.e., 225 km supermodding neutral/negative impact is found in the lower troposphere but positive impact at 500 hPa versus default scatterometer assimilation. Other forecast parameters during the trial period showed mainly neutral impact of the ASCAT supermodding assimilation (not shown).
In order to track the impact of the assimilated scatterometer data and the changes in the employed observation operator, the MTEN diagnostic is used. It was computed for two different and distant cases within the two-month trial period to verify the sensitivity of the forecast model to the scatterometer observations and to the observation operators (default and supermodding with different sizes) at forecast length from 0 to 12 hr. In Figures 16 and 17, the MTEN scores of AROME-Arctic forecasts are shown for 25 February and for 7 March 2018, respectively. Other cases from the two-month period show similar sensitivity pattern with MTEN diagnostics (not shown). It can be seen that the scatterometer impact is propagating from near-surface to mid-troposphere during the studied forecast range in all experimental runs. Figures 16e and  17e indicate that the forecast sensitivity diagnostic shows the largest impact of scatterometer observations and the supermodding operator in the lower atmosphere below 400-500 hPa with the maximum at around 700-800 hPa. This corresponds well with the levels with significant differences in the verification against observations.

DISCUSSION AND CONCLUSIONS
A special footprint operator called supermodding was implemented in the variational framework of an AROME-based model according to the idea of Marseille and Stoffelen (2017). The supermodding prototype is first used to improve the spatial representation of scatterometer observations in a high-resolution DA system by averaging in model space to best represent the scatterometer effective resolution. A second objective of supermodding is to constrain the scales in DA over the ocean, which can be well determined by the available observing system. The smaller scales are at best realistically described by the model, but poorly and, in the worst case, ambiguously initialised in NWP DA. Additionally these poorly constrained scales may detrimentally interfere with and affect the initialisation of the larger scales that are supported by the observing system .
First, data and methods were presented. The subject of small-scale DA and the impact of the new observation operator were discussed through a single-observation experiment and a case-study. Then, the examination of the assimilation of ASCAT observations in AROME-Arctic 3D-Var was carried out using departure-based diagnostics. Finally, the impact of the supermodding technique on AROME-Arctic forecasts was evaluated in two-month period experiments, where various supermodding sizes applied to assimilate the ASCAT data were compared to the default ASCAT assimilation procedure. The results are discussed and summarised as follows.
• The KE spectrum of AROME-Arctic shows that the amplitude spectrum of the small scales is reasonably resolved in the model, however the phase spectrum cannot be determined over the ocean since high-resolution and accurate 4D observations (Table 1), physical constraints, and small-scale boundary conditions (Figure 3) are not available or missing.
• To analyse 4D structures deterministically by an assimilation method, sufficient data samples are required.
In the case of insufficient sampling, distorted dynamical structure and the problem called aliasing (Nyquist, 1928) occur, leading to corrupted analysis structures and short-range forecasts. Nyquist (1928) suggests oversampling by a factor of two in all dimensions to prevent the generation of artificial waves (Moiré effect). By the use of ASCAT coastal products in the IFS global DA, De Chiara et al. (2019) arrived at an optimum of 62.5 km superobbing of ASCAT winds, which comes close to the Nyquist criterion, i.e., half the distance of the deterministic resolution (which is ∼150 km). Following the same logic, but in high-resolution DA, the supermodding method can be utilised to neglect the correction on unconstrained small scales larger than the effective resolution of the ASCAT winds.
• Based on single-observation experiments, the spatial filtering properties of the variational scheme become broader at the observation location when using the supermodding method. It implies that the supermodding technique spreads the information content of the observation from the observation location to the neighbouring grid-points depending on the chosen supermodding size. Thus, it suggests that the implemented supermodding approach can adequately represent the observation footprint in the observation operator.
• Figures 6 and 7 illustrate that the AROME-Arctic model simulates small-scale phenomena which are not necessarily observed by the ASCAT coastal product, which is due to phenomena being likely out of phase (at the wrong location and/or time) and/or because the ASCAT resolution is too coarse to sample these.
• The departure-based diagnostics provide information on the combined observation and background error in which we take into account further error components (namely representation error and model error related to unconstrained small scales) to explain in particular observation and model spatial representation in a high-resolution LAM. It is confirmed that supermodding can reduce d o b standard deviation by removing the (background) error variance at small scales. The 30 km supermodding size can provide 4-5% reduction in the standard deviation of d o b and further decrease can be obtained with larger supermodding sizes up to 100-135 km.
• We note that scatterometer wind component accuracy is estimated in triple collocation on the 30 km scale globally, where values of around 0.7 m⋅s −1 are found (Vogelzang et al., 2011). AROME supermodding to 30 km should reduce the spatial representativeness error to zero. However, inflated wind component errors of around 1.4 m⋅s −1 are used in NWP analyses generally. This accounts for the oversampling of the background-error covariances, such that the weight of all scatterometer winds in a 100 km box is similar to a superob at the 100 km scale with wind component error of about 1 m⋅s −1 . Note that using observations in the box that oversample the background structure functions, but with such inflated error, allows improved spatial sampling with respect to thinning or superobbing (De Chiara et al., 2019).
• The standard deviation of d o a for various supermodding sizes shows that the method results generally in smaller analysis increments due to the altered spatial filtering properties of the assimilation scheme and the analysis is gradually shifted towards the first-guess locally when large supermodding sizes are applied. This can be seen in Table 2 and in Figures 11 and 12 as well. In line with expectation, supermodding results in spatially more extensive increments, updating the large-amplitude large scales, as seen in Figures 4, 5 and 12.
• Regarding the diagnosed observation-error correlations (Figure 12), statistics showed increased correlation for short separation distances when large supermodding sizes (larger than 60 km) were applied. For 30 and 60 km supermodding sizes at the first bin (i.e., observation pairs collected for less than 5 km distances), the diagnosed observation-error correlation is even slightly decreased. According to the increased observation-error correlations, one can conclude that the supermodding introduces representation error (Janjić et al., 2017) when the supermodding size is too large. In that case, true variance resolved by the scatterometer is not used in DA and hence a correlation on those scales appears in the observations.
• In the verification against radiosonde observations, supermodding showed both improvement and degradation depending on the chosen supermodding sizes. The use of 30 km supermodding size has mostly positive impact on wind speed and temperature forecasts initialised at 1200 UTC in the lower troposphere. The use of 60 km supermodding showed statistically significant positive impact in wind speed at the 850 hPa level, in temperature at 700 and 850 hPa levels, and in geopotential height at 500 hPa without observed degradation of the forecasts initialised at 1200 UTC compared to the default ASCAT assimilation. Further increasing supermodding size in the assimilation system seems more detrimental than beneficial.
In conclusion, the DA of scatterometer observations in the AROME-Arctic model can be improved by the use of the supermodding method. However, the supermodding method with too large averaging sizes can introduce representation error and can remove true model variance as well. In order to efficiently remove the undesired representation error and to ignore unconstrained small scales in scatterometer DA, a combination of scatterometer superobbing and supermodding is foreseen, which can be the subject of future studies.