Estimating Forecast Error Covariances for Strongly Coupled Atmosphere–Ocean 4D-Var Data Assimilation

Strongly coupled data assimilation emulates the real-world pairing of the atmosphere and ocean by solving the assimilation problem in terms of a single combined atmosphere–ocean state. A significant challenge in strongly coupled variational atmosphere–ocean data assimilation is a priori specification of the cross covariances between the errors in the atmosphere and ocean model forecasts. These covariances must capture the correct physical structure of interactions across the air–sea interface as well as the different scales of evolution in the atmosphere and ocean; if prescribed correctly, they will allow observations in one medium to improve the analysis in the other. Here, the nature and structure of atmosphere–ocean forecast error cross correlations are investigated using an idealized strongly coupled single-column atmosphere–ocean 4D-Var assimilation system. Results are presented from a set of identical twin–type experiments that use an ensemble of coupled 4D-Var assimilations to derive estimates of the atmosphere–ocean error cross correlations. The results show significant variation in the strength and structure of cross correlations in the atmosphere–ocean boundary layer between summer and winter and between day and night. These differences provide a valuable insight into the nature of coupled atmosphere–ocean correlations for different seasons and points in the diurnal cycle.


1. Introduction
Strongly coupled atmosphere-ocean data assimilation treats the atmosphere and ocean as a single coherent system, applying a single assimilation scheme to a fully coupled model. Interest in the potential use of coupled data assimilation techniques for generating initial conditions for medium- to long-range coupled forecasting and in coupled model reanalysis has grown in recent years, and coupled data assimilation is now an increasingly active area of research (Laloyaux et al. 2016; Lea et al. 2015). Strongly coupled variational atmosphere-ocean assimilation systems require specification of the relationship between the errors in the atmosphere and ocean model forecasts. Unfortunately, the characterization of the statistics of these errors is nontrivial; the atmosphere-ocean error cross-covariance information must capture the correct physical structure of processes occurring across the air-sea interface as well as the different scales of evolution in the atmosphere and ocean. The purpose of this study is to investigate the nature and structure of the coupled atmosphere-ocean forecast error correlations with a view to developing new methods for incorporating this information within four-dimensional variational (4D-Var) coupled data assimilation schemes. If done correctly, a priori prescription of atmosphere-ocean cross-covariance information in the 4D-Var background error covariance matrix will allow observations in one fluid to improve the analysis increments in the other, so that both fluids are adjusted consistently. This is expected to lead to better use of near-surface observations and generation of more physically balanced analysis states, which should in turn lead to more reliable coupled model forecasts and reanalyses, and to better prediction of coupled atmosphere-ocean phenomena (Smith et al. 2015).
Traditionally, in uncoupled variational assimilation systems, the background error covariance matrix is held fixed for each assimilation cycle, but more recently methods have been developed to include ensemble-derived information within the variational framework. Ensemble data assimilation methods, such as the ensemble Kalman filter (EnKF), capture the flow dependence of the uncertainty in the background errors by evolving the covariance matrix according to the underlying model dynamics. The advantages and disadvantages of the variational and ensemble methods are widely discussed in the literature (e.g., Lorenc 2003; Kalnay et al. 2007; Whitaker et al. 2008). Schemes that aim to exploit the merits of both the variational and ensemble approaches by using a combination of the two are known as "hybrid" assimilation schemes. For example, Météo-France and the European Centre for Medium-Range Weather Forecasts (ECMWF) use the statistics from an ensemble of 4D-Var assimilations to diagnose the error variances for the background error covariance matrix in their deterministic operational 4D-Var systems (Bonavita et al. 2012; Raynaud et al. 2011). The methodology has recently been extended to incorporate flow-dependent ensemble information into the modeled error covariance structures (Bonavita et al. 2016). Another option is to use an ensemble to compute a sample estimate of the full forecast error covariance matrix and then apply localization and filtering techniques to remedy problems with low rank and spurious correlations arising due to sampling error (Buehner et al. 2010a,b). A further, and potentially more robust and flexible, approach is to use a weighted linear combination of the static (climatological) and ensemble-based covariance formulations as is done, for example, in the Met Office global hybrid ensemble-variational assimilation system (Clayton et al. 2013). Kuhl et al. (2013) have also investigated using a similar approach in observation space within the framework of the Naval Research Laboratory Atmospheric Variational Data Assimilation System-Accelerated Representer (NAVDAS-AR) dual-form 4D-Var scheme.
The relative infancy of coupled atmosphere-ocean data assimilation means that there has, to date, only been a limited number of published studies exploring the estimation and implementation of coupled error covariances, and the majority of these have employed ensemble-based assimilation methods (e.g., Han et al. 2013). For coupled assimilation it is generally recognized that hybrid ensemble-variational-based schemes would offer the required flexibility in terms of time and space scales, allowing the blend of static and flow-dependent components of the forecast error covariance matrix to be adjusted for different types of application (e.g., resolving large- versus small-scale processes and flows, large versus small ensembles, long versus short assimilation and forecast window length), and thus enabling the coupled system to make the most of the available observations (Lawless 2012). Frolov et al. (2016) have recently begun to explore this idea in a 3D-Var framework by experimenting with the use of hybrid static-ensemble error covariances in their interface solver: a 3D-Var-based system that solves an approximation to the strongly coupled atmosphere-ocean assimilation problem. In this study, we use ensembles of cycled 4D-Var data assimilations to gain insight into the characteristics of atmosphere-ocean forecast error cross correlations in strongly coupled systems; the methodology is based on the approach described in Žagar et al. (2005) and uses the statistics of differences between pairs of forecast ensemble members to derive estimates of the forecast error covariance matrix. Experiments are performed within an idealized 1D single-column coupled atmosphere-ocean model framework. The system employs the incremental 4D-Var algorithm and was previously used in a systematic comparison of the uncoupled, weakly coupled, and strongly coupled approaches to treating the coupled model initialization problem (Smith et al. 2015; Fowler and Lawless 2016).
Our results show that the strongest error cross correlations occur within the near-surface atmosphere-ocean boundary layer between atmosphere and ocean model variables that are directly related via surface boundary conditions. This broad finding was foreseen, but the detail, including notable variation in the strength and structure of the atmosphere-ocean correlations between summer and winter, and between day and night, has provided valuable new knowledge that is now being used to inform the development of a full hybrid ensemble-variational framework for our idealized system; this study therefore represents an important step in the advancement of coupled atmosphere-ocean data assimilation methods.
The scientific and technical challenges of strongly coupled data assimilation mean that most operational centers are focusing their initial efforts on developing intermediate, or weakly coupled, assimilation systems that do not include explicit atmosphere-ocean error cross covariances. Nevertheless, the increased understanding of the type and significance of error correlations arising from strong atmosphere-ocean coupling that has been gained from this study will aid the design of innovative methodologies for incorporating cross-fluid error covariance information into both weakly and strongly coupled assimilation systems of the future. This paper is organized as follows: in section 2 we introduce our coupled atmosphere-ocean system, briefly describing the nonlinear model and incremental 4D-Var algorithm upon which our strongly coupled assimilation system is based, and explaining the method used to compute the ensemble forecast error correlations. Details of the experimental design are given in section 3, and the results are presented in section 4. Finally, in section 5 we summarize the conclusions from this work.

2. The coupled 4D-Var system
We begin with an overview of our idealized 1D coupled atmosphere-ocean model and the strongly coupled incremental 4D-Var algorithm, and then introduce the forecast error covariance estimation methodology. The assimilation system and dynamical model are the same as those described in Smith et al. (2015).

a. The model
The coupled atmosphere-ocean model was built by coupling a stripped-back version of the ECMWF single-column atmospheric model (SCM), which is based on an early version of the Integrated Forecasting System (IFS) code, to a single-column K-profile parameterization (KPP) ocean mixed layer model, which is based on the scheme of Large et al. (1994).
The atmosphere component solves the primitive equations for temperature T, specific humidity q, and the zonal and meridional wind components u and v, using a hybrid vertical coordinate system (Simmons and Burridge 1981) that extends over 60 levels from the surface to around 0.1 hPa with finest resolution in the planetary boundary layer. The original (ECMWF) version of the SCM code includes the parameterization of physical processes such as radiation, turbulent mixing, moist convection, and clouds. Our reduced version includes tendencies due to vertical advection and turbulent diffusion only, but is still able to produce a good approximation to the evolution of the atmosphere when compared to the full version, and is therefore adequate for the purposes of this study. The surface pressure, vertical velocity, tendencies due to horizontal advection, and geostrophic wind components are prescribed externally.
The ocean component describes the evolution of the mean values of temperature, salinity, and zonal and meridional currents on a fixed vertical grid that stretches from a depth of 1 to 250 m. The grid resolution is increased near the surface to allow simulation of upper-ocean diurnal variability. There are 35 levels in total, with 25 in the top 50 m. The time evolution of each field is expressed as the vertical divergence of its kinematic fluxes. In the ocean surface boundary layer, the kinematic fluxes are parameterized using K profiles; mixing in the ocean interior is assumed to be governed by a combination of shear instability and internal wave activity. Terms describing the effects of nonlocal transport and double diffusion were omitted. Shortwave and longwave radiation forcing at the ocean surface and the geostrophic component of the currents are prescribed externally.
The atmosphere and ocean model components exchange information at every time step. The surface boundary conditions for atmospheric temperature and specific humidity depend on the sea surface temperature (SST; a no-slip condition is used for the u and v wind components), and the ocean surface boundary conditions depend on the near-surface atmospheric state. The surface boundary condition for ocean temperature has turbulent and nonturbulent (radiative) elements, which combine to give the net heat flux; its components are the surface shortwave radiation $Q_{SW}$, the net surface longwave radiation $Q_{LW}$, and the (turbulent) latent and sensible heat fluxes $Q_E$ and $Q_H$. The surface boundary condition for salinity depends on the turbulent freshwater flux, which involves the latent heat of evaporation $L_v$, and the surface boundary conditions for the ocean currents depend on the zonal and meridional components of the surface wind stress, $\tau_x$ and $\tau_y$. The latent and sensible heat and momentum fluxes are all computed within the atmosphere model component using bulk formulas, including

$$\tau_x = \rho_a C_D |U_n| u_n, \qquad (3)$$

$$\tau_y = \rho_a C_D |U_n| v_n, \qquad (4)$$

$$Q_H = \rho_a C_H |U_n| (T_n - \mathrm{SST}),$$

where the subscript $n$ represents the lowest atmosphere model level, $|U_n| = (u_n^2 + v_n^2)^{1/2}$ is the approximate 10-m wind speed, and $\rho_a$ is the density of air; the surface saturation specific humidity $q_{sat}(\mathrm{SST})$ enters the corresponding formula for $Q_E$. The drag coefficient $C_D$ and the transfer coefficients for heat $C_H$ and moisture $C_E$ are computed using the method of Louis et al. (1982). A complete description of the system, including the model equations, is given in the appendix of Smith et al. (2015). Despite the modifications outlined, the SCM's description of the air-sea exchange processes is sufficiently realistic for our results to be relevant to full 3D systems. The validation of the model is described in section 3.1 of Smith et al. (2015); it generally compares well against the original ECMWF version of the code and also with ERA-Interim and Mercator Ocean reanalysis data for forecasts of up to around 5 days, but beyond this its performance is hindered by the simplified physics and lack of horizontal processes.
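As a minimal sketch of the bulk formulas for the wind stress and sensible heat flux, the following Python function mirrors the expressions as they are written in the text. The function name `bulk_fluxes` and the constant values for the air density and the transfer coefficients are illustrative assumptions only; in the model the coefficients are computed with the method of Louis et al. (1982) rather than held fixed.

```python
import math


def bulk_fluxes(u_n, v_n, t_n, sst, rho_a=1.2, c_d=1.3e-3, c_h=1.1e-3):
    """Wind stress components and sensible heat flux from lowest-level values.

    rho_a, c_d and c_h are placeholder constants (assumptions of this sketch),
    not values taken from the paper.
    """
    speed = math.hypot(u_n, v_n)              # |U_n| = (u_n^2 + v_n^2)^(1/2)
    tau_x = rho_a * c_d * speed * u_n         # zonal wind stress
    tau_y = rho_a * c_d * speed * v_n         # meridional wind stress
    q_h = rho_a * c_h * speed * (t_n - sst)   # sensible heat flux as written in the text
    return tau_x, tau_y, q_h


# Example: a 5 m/s wind over a sea surface 1 K warmer than the air.
tau_x, tau_y, q_h = bulk_fluxes(u_n=3.0, v_n=4.0, t_n=299.0, sst=300.0)
```

With this sign convention the sensible heat flux is negative when the sea surface is warmer than the lowest atmospheric level.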

b. Strongly coupled incremental 4D-Var
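The problem of variational data assimilation is to find the initial state such that the model forecast best fits the available observations over a given time window, subject to the initial state remaining close to a given a priori, or background, estimate and allowing for the errors in each. Rather than searching for the initial state directly, the incremental 4D-Var algorithm (e.g., Courtier et al. 1994; Lawless et al. 2005) seeks increments $\delta x_0$ to the initial background state estimate by solving a sequence of linearized inner-loop least squares cost function minimizations and outer-loop nonlinear update steps. At outer-loop iteration $\ell$, the inner loop minimizes

$$J^{(\ell)}[\delta x_0^{(\ell)}] = \tfrac{1}{2} [\delta x_0^{(\ell)} - (x_0^b - x_0^{(\ell)})]^T B_0^{-1} [\delta x_0^{(\ell)} - (x_0^b - x_0^{(\ell)})] + \tfrac{1}{2} \sum_i (H_i \delta x_i^{(\ell)} - d_i^{(\ell)})^T R_i^{-1} (H_i \delta x_i^{(\ell)} - d_i^{(\ell)}), \qquad (8)$$

subject to $\delta x_i^{(\ell)} = \mathbf{M}(t_i, t_0, x^{(\ell)}) \delta x_0^{(\ell)}$, where $d_i^{(\ell)} = y_i - h_i(x_i^{(\ell)})$; $x_0^b \in \mathbb{R}^m$ is the background model state, used as a first guess at $t_0$; $x_0^{(\ell)} \in \mathbb{R}^m$ is the estimate of the initial model state at outer-loop iteration $\ell$; $y_i \in \mathbb{R}^{r_i}$ is a vector of $r_i$ imperfect observations at time $t_i$; the operator $H_i \in \mathbb{R}^{r_i \times m}$ is the tangent linear of the nonlinear observation operator $h_i: \mathbb{R}^m \to \mathbb{R}^{r_i}$; $\mathbf{M}$ is the tangent linear of the nonlinear model operator $\mathcal{M}$; and $B_0 \in \mathbb{R}^{m \times m}$ and $R_i \in \mathbb{R}^{r_i \times r_i}$ are the background and observation error covariance matrices.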
The strongly (or fully) coupled 4D-Var approach treats the atmosphere and ocean as a single coherent system; the incremental 4D-Var control vector, $\delta x$ in (8), consists of both the atmosphere and ocean prognostic variables, and the coupled model is used in both the inner and outer loops.
The background (or forecast) error covariance matrix $B_0$ should contain information on the statistics of the errors in the background state. Note that since the initial background state is typically a model forecast from a previous analysis, the terms "background" and "forecast" are used interchangeably. For a coupled system with $\delta x = (\delta x_A^T, \delta x_O^T)^T$, where $\delta x_A$ represents the atmosphere increment and $\delta x_O$ the ocean increment, the matrix $B_0$ can be decomposed as

$$B_0 = \begin{pmatrix} B_{AA} & B_{AO} \\ B_{AO}^T & B_{OO} \end{pmatrix}.$$

Here $B_{AA}$ and $B_{OO}$ represent the background error covariances for the atmosphere and ocean state variables, respectively, and $B_{AO}$ represents the cross covariance between background errors in the atmosphere and ocean states. The inclusion of cross covariances between the atmosphere and ocean means that atmosphere observations can influence the ocean analysis (and vice versa). Background error cross covariances are implicitly generated by the incremental 4D-Var algorithm, so even if we assume that the errors in the atmosphere and ocean fields are uncorrelated at $t_0$ (i.e., we set $B_{AO}$ to zero), nonzero cross covariances will be produced throughout the rest of the assimilation window. However, we can also explicitly prescribe nonzero cross covariances a priori by including them in $B_0$.
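The role of the cross-covariance block can be illustrated with a toy two-variable state (one atmosphere variable, one ocean variable) and a single-time linear analysis update. This is a deliberately simplified sketch, not the 4D-Var system used in this study: the update formula is the standard best linear unbiased estimate, and the helper names, variances, and correlation value are hypothetical.

```python
import numpy as np


def analysis_increment(B, H, R, innovation):
    """Single-time linear update: B H^T (H B H^T + R)^{-1} d."""
    S = H @ B @ H.T + R
    return B @ H.T @ np.linalg.solve(S, innovation)


def make_B(sigma_a, sigma_o, rho):
    """2x2 background covariance for [atmosphere, ocean] with correlation rho."""
    cross = rho * sigma_a * sigma_o
    return np.array([[sigma_a**2, cross],
                     [cross, sigma_o**2]])


H = np.array([[1.0, 0.0]])   # only the atmosphere variable is observed
R = np.array([[0.25]])       # observation error variance
d = np.array([1.0])          # innovation (observation minus background)

inc_uncoupled = analysis_increment(make_B(1.0, 0.5, 0.0), H, R, d)  # B_AO = 0
inc_coupled = analysis_increment(make_B(1.0, 0.5, 0.6), H, R, d)    # B_AO != 0
```

With the cross covariance set to zero the ocean component of the increment vanishes; with a nonzero cross covariance the atmosphere observation also adjusts the ocean variable, while the atmosphere increment is unchanged.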

c. Ensemble error covariances
Formulation of the 4D-Var algorithm assumes that the background errors $\varepsilon^b$ are random and unbiased with Gaussian probability distribution functions. The matrix $B_0$ is then defined as

$$B_0 = E[\varepsilon_0^b (\varepsilon_0^b)^T], \qquad \varepsilon_0^b = x_0^b - x_0^t,$$

where $x_0^t$ is the true system state at $t_0$ and $\varepsilon_0^b$ represents the error in the background state at $t_0$. In practice, the true error statistics are unknown and so must be approximated in some manner; the accuracy of their description is crucial to the success of the assimilation process. Variational methods prescribe a static matrix $B_0$ at the start of each assimilation window, whereas sequential, Kalman-filter-based methods evolve the background covariance matrix according to the underlying model dynamics and thus attempt to capture the flow dependence of the uncertainty in the forecast errors. In the filtering case, the forecast error covariance matrix is denoted $P_k^b$:

$$P_k^b = E[\varepsilon_k^b (\varepsilon_k^b)^T],$$

where the subscript $k$ indicates time dependency, and $\varepsilon_k^b = x_k^b - x_k^t$. The standard approach in ensemble methods is to use the covariance statistics of the differences between each ensemble member and the forecast ensemble mean as a proxy for $P_k^b$. The ensemble estimate, at a given time $t_k$, is constructed as

$$P_k^b \approx \frac{1}{N-1} \sum_{j=1}^{N} (x_k^{b,j} - \bar{x}_k^b)(x_k^{b,j} - \bar{x}_k^b)^T, \qquad (14)$$

where $\bar{x}_k^b$ is the mean of the forecast ensemble, $N$ is the ensemble size, and $x_k^{b,j}$ ($j = 1, \ldots, N$) denotes the $j$th forecast ensemble member. The division by $(N - 1)$ in (14) ensures that the ensemble covariance matrix is an unbiased estimate of the true covariance matrix. Averaging at time $t_k$ provides an estimate of the instantaneous $P_k^b$ matrix; alternatively, averaging may be performed over multiple assimilation cycles for an estimate of the climatological covariance matrix.
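The mean-based ensemble estimate described above can be sketched in a few lines of NumPy. The function name and the array layout (one ensemble member per row) are assumptions made for illustration.

```python
import numpy as np


def ensemble_covariance(members):
    """Unbiased sample covariance from deviations about the ensemble mean.

    members: array of shape (N, m), one forecast ensemble member per row.
    Returns an (m, m) matrix.
    """
    members = np.asarray(members, dtype=float)
    n_members = members.shape[0]
    deviations = members - members.mean(axis=0)       # x^{b,j} minus ensemble mean
    return deviations.T @ deviations / (n_members - 1)  # divide by N - 1
```

For a check, the result should agree with `np.cov(members, rowvar=False)`, which uses the same N − 1 normalization by default.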
In (14) the ensemble mean $\bar{x}^b$ represents the best estimate of the "truth," but, since it is computed from the sum of the individual ensemble members, it is unlikely to itself be a valid realization of the state and may actually lie outside the model attractor. An alternative method that avoids reference to the mean is to use the statistics of differences between pairs of forecast ensemble members (Berre et al. 2006; Fisher 2003; Žagar et al. 2005).
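Let

$$\tilde{x}_k^b = \mathcal{M}(t_k, t_0, \tilde{x}_0^a) \qquad (15)$$

denote a forecast at time $t_k$ from an analysis $\tilde{x}_0^a$ at $t_0$, made by adding Gaussian random perturbations $\tilde{\varepsilon}_0^b$ and $\tilde{\eta}$ to an unperturbed initial background state $x_0^b$ and set of observations $y$. We can write this analysis state as

$$\tilde{x}_0^a = f(x_0^b + \tilde{\varepsilon}_0^b, \; y + \tilde{\eta}),$$

where $f$ represents the assimilation system. If we define the error in the forecast $\tilde{x}_k^b$ relative to an unperturbed (or control) forecast $x_k^b$ as

$$\tilde{\varepsilon}_k^b = \tilde{x}_k^b - x_k^b,$$

and consider the difference between (15) and a second forecast $\hat{x}_k^b$ from an analysis made with different initial background and observation perturbations $\hat{\varepsilon}_0^b$ and $\hat{\eta}$, we obtain

$$\tilde{x}_k^b - \hat{x}_k^b = \tilde{\varepsilon}_k^b - \hat{\varepsilon}_k^b. \qquad (18)$$

If we assume that the perturbations $\tilde{\varepsilon}_k^b$ and $\hat{\varepsilon}_k^b$ are uncorrelated and have the same statistics as the unperturbed forecast errors $\varepsilon_k^b$, that is,

$$E[\tilde{\varepsilon}_k^b (\tilde{\varepsilon}_k^b)^T] = E[\hat{\varepsilon}_k^b (\hat{\varepsilon}_k^b)^T] = E[\varepsilon_k^b (\varepsilon_k^b)^T], \qquad E[\tilde{\varepsilon}_k^b (\hat{\varepsilon}_k^b)^T] = 0,$$

then it can be shown (e.g., Berre et al. 2006) that the covariance of the difference (18) is equal to twice the error covariance matrix of the unperturbed forecast errors:

$$E[(\tilde{\varepsilon}_k^b - \hat{\varepsilon}_k^b)(\tilde{\varepsilon}_k^b - \hat{\varepsilon}_k^b)^T] = 2 E[\varepsilon_k^b (\varepsilon_k^b)^T] = 2 P_k^b.$$

We can approximate this using an ensemble of perturbed forecasts as

$$P_k^b \approx \frac{1}{2(N-1)} \sum_{j=1}^{N-1} (x_k^{b,j} - x_k^{b,j+1})(x_k^{b,j} - x_k^{b,j+1})^T. \qquad (20)$$

For an ensemble of $N$ members there will be $N - 1$ independent pairs of members. The full theoretical justification for this approach is given in Žagar et al. (2005). In a cycled 4D-Var system, the forecast from the analysis is used to provide the background state for the start of the next assimilation cycle, so an initial background guess only needs to be explicitly specified at the start of the first cycle. Similarly, if we perform an ensemble of cycled 4D-Var analyses, we only have to explicitly generate an ensemble of perturbed background states for the first cycle and thus can generate a series of perturbed analysis and forecast states from a single set of initial background perturbations; the schematic shown in Fig. 1a illustrates this idea and summarizes how the 4D-Var cycling is implemented in this study (the full details of the experimental design are given in the next section).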
For each assimilation cycle, the error covariance matrix formed from (20) after forecasting the analysis ensemble forward to the end of the current window will represent both an approximation of the forecast error covariance matrix at the end of the current cycle and an approximation of the initial background error covariance matrix for the next cycle. If we use pairs of forecast ensemble members collected over several assimilation cycles in the computation of (20), we can increase the effective ensemble size and thus confidence in the reliability of this estimate. Note that in practice this matrix is not actually implemented in the 4D-Var algorithm; instead, it is conventional to use the same predefined matrix $B_0$ in the assimilation step for all cycles, as is the case for the experiments described in the next section.
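The pair-difference estimate can be sketched as follows. Because the covariance of the difference between two independent, identically distributed perturbed forecasts is twice the forecast error covariance, the sum of outer products of the differences is divided by twice the number of pairs. This sketch assumes that the N − 1 pairs are formed from consecutive members of the ensemble array; the function name and array layout are likewise illustrative assumptions.

```python
import numpy as np


def pair_difference_covariance(members):
    """Covariance estimate from differences between pairs of ensemble members.

    members: array of shape (N, m), one forecast ensemble member per row.
    Pairs are formed from consecutive rows (an assumption of this sketch),
    giving N - 1 differences. No reference to the ensemble mean is needed.
    """
    members = np.asarray(members, dtype=float)
    diffs = members[1:] - members[:-1]          # N - 1 pairwise differences
    n_pairs = diffs.shape[0]
    return diffs.T @ diffs / (2 * n_pairs)      # factor 2: cov(difference) = 2 P


# Differences gathered over several cycles can be pooled by stacking the
# per-cycle difference arrays before forming the outer-product sum.
```

Pooling differences from several cycles in this way increases the sample size without requiring a larger ensemble in any single cycle.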

3. Experimental design
In these experiments, we use an ensemble of strongly coupled 4D-Var assimilations with N = 500 members to derive estimates of the atmosphere-ocean forecast error covariance matrix for a summer and a winter test case. Using such a high ratio of ensemble members to the dimension of the coupled state vector (which is 380 in this case) would not be computationally practical in many operational-scale systems, but the relative simplicity and small dimension of our idealized system means that we are able to run large ensembles comparatively cheaply. Using a large ensemble size (relative to the dimension of the system) reduces the potential for contamination with sampling noise and increases confidence that the estimates of the atmosphere-ocean forecast error correlation structures we obtain are real. In more complex systems, small ensembles are typically used in combination with methods designed to alleviate issues associated with undersampling, such as covariance localization and inflation. The potential for using a limited ensemble size together with vertical localization has been examined for our system and will be reported in a separate publication.
The experiments are identical twin-type; the coupled nonlinear model is assumed to be perfect and is used to forecast the truth or reference trajectory from which observations are then generated at 3-hourly intervals. The true initial state for the summer case is given by a 24-h coupled model forecast valid at 0000 UTC on 2 June 2013; the true initial state for the winter case is given by a 24-h coupled model forecast valid at 0000 UTC 2 December 2013. The initial background control state $x_0^b$ for each case is given by a second 24-h coupled model forecast initialized from a perturbed initial state. The initial atmosphere and ocean states and forcing data for these forecasts are derived from the ERA-Interim (Dee et al. 2011) and Mercator Ocean reanalyses (Lellouche et al. 2013), and a model time step of 15 min is used in all cases. A complete description of the design of this setup is given in Smith et al. (2015).

FIG. 1. (a) For cycle 2 onward, the initial background ensemble is produced by forecasting the $t_0$ analysis ensemble from the previous cycle forward 12 h. For each cycle, observations $y_k$ are generated at forecast lead times $t_k$ = 3, 6, 9, and 12 h. A different set of perturbed observations is produced for each ensemble member by adding random perturbations $\varepsilon_0^{o,j} \sim N(0, R)$ to $y_k$. The same background and observation error covariance matrices $B_0$ and $R$ are used for all cycles. (b) Each cycle starts at either 0000 UTC (local day) or 1200 UTC (local night) and uses a 12-h assimilation window; eight cycles are run, giving a total period of 4 days. The 1200 UTC error correlations are computed after forecasting the $t_0$ analysis ensembles from cycles 1, 3, 5, and 7 to the end of their respective assimilation window, and the 0000 UTC error correlations are computed after forecasting the $t_0$ analysis ensembles from cycles 2, 4, 6, and 8 to the end of their respective assimilation window.
A total of eight 4D-Var cycles are run for each experiment. Each cycle uses a 12-h assimilation window with three outer loops and starts at either 1200 or 0000 UTC. This gives us a sample of 499 × 4 = 1996 differences between 12-h forecasts valid at 0000 UTC and 499 × 4 = 1996 differences between 12-h forecasts valid at 1200 UTC, that is, two daily analysis times (this is illustrated schematically in Fig. 1b). Our point is located at 25°N, 188.75°E in the northwest Pacific Ocean and was chosen for consistency with previous studies using the same system (Smith et al. 2015; Fowler and Lawless 2016). This location has a UTC offset of approximately 11 h, which means that 1200 UTC corresponds to the early hours of the morning local time (~0100 LT) and 0000 UTC corresponds to the early afternoon (~1300 LT). This enables us to also compare the structure of the error correlations between day and night.
The ensemble is generated by perturbing both the initial background state and the observations. At the start of the first assimilation cycle, an ensemble of initial background states is generated by adding random perturbations to the control state x b 0 ; these are drawn from a Gaussian distribution with zero mean and standard deviation consistent with the 4D-Var background error covariance matrix B 0 (see next paragraph). For the second cycle onward, the initial background ensemble is given by the analysis ensemble from the end of the previous assimilation window, as illustrated in Fig. 1a. The observations are randomly perturbed across all cycles according to the observation error statistics in R. A different random perturbation is added to each observation at each different observation time and model level, but the standard deviation of the perturbations is fixed for a given observation type (see Table 1); thus, each ensemble member assimilates a different set of observations for every cycle.
The 4D-Var background error covariance matrix is assumed to be diagonal and is fixed for all cycles. It is standard practice in 4D-Var to use a static background error covariance matrix, and starting each new cycle from the same diagonal $B_0$ allows us to better understand the type of flow-dependent covariance and cross-covariance structures that are generated by the implicit propagation of $B_0$ across the assimilation window by the 4D-Var algorithm (see, e.g., Bannister 2008). The background error variances are calculated from a 24-h coupled model forecast time series as described in section 4.2 of Smith et al. (2015); they vary for each model variable and are different for the June and December cases, as illustrated in Figs. 2 and 3. Although the prescribed ocean background error variances are smaller than would normally be used in a full-scale uncoupled ocean assimilation system, they are appropriate for our model system and reflect the fact that the ocean evolves more slowly than the atmosphere over the time scales we consider. The scales of variability represented by the background error variances should be consistent with the model and the length of the assimilation-forecast window. For example, in the weakly coupled assimilation system developed at ECMWF (Laloyaux et al. 2016), the ocean background error variances were reduced to a third of the values used in the uncoupled ocean assimilation system to account for the fact that the coupled system uses a much shorter assimilation window length (24 h compared to 10 days). If the prescribed background errors are too large, the assimilation will overfit to the observations, and this will negatively impact the analysis. We discuss the effect of larger-amplitude ocean initial perturbations and background error variances on our system in section 4.
The observation error covariance matrix is also taken to be diagonal, with a fixed error variance for each observation type. The observations are generated by adding uncorrelated random Gaussian errors, consistent with the prescribed statistics (see Table 1), to the truth trajectory. Because the observations are direct, the prescribed error variances represent measurement error only; hence, the values we use are smaller than would ordinarily be used in an operational setting (where the observation errors will vary with instrument type and will also incorporate representativity error). In practice, it is the relative weighting of the background and observation errors that is important in an assimilation system rather than their actual magnitudes. If the observation errors are large relative to the variability of the model, the observations will not bring any additional information to the system. For example, the variability of the ocean salinity in our 1D system is limited on the time scales we consider; in order to enable the salinity observations to have some impact in the assimilation, the variance of their errors was set at a value lower than that of a typical salinity data source.
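The point about relative weighting can be made concrete with a scalar example. For a single directly observed variable, the standard minimum-variance analysis weights the observation by the ratio of the background error variance to the sum of the background and observation error variances. The function below is a hypothetical illustration of that textbook formula, not part of the assimilation system used here.

```python
def scalar_analysis(x_b, y, var_b, var_o):
    """Minimum-variance analysis for one directly observed scalar variable."""
    weight = var_b / (var_b + var_o)   # depends only on the variance ratio
    return x_b + weight * (y - x_b)


# Scaling both variances by the same factor leaves the analysis unchanged.
same_ratio_small = scalar_analysis(x_b=0.0, y=1.0, var_b=1.0, var_o=1.0)
same_ratio_large = scalar_analysis(x_b=0.0, y=1.0, var_b=10.0, var_o=10.0)

# A large observation error relative to the background gives the
# observation almost no influence.
weak_obs = scalar_analysis(x_b=0.0, y=1.0, var_b=1.0, var_o=99.0)
```

In the last case the analysis moves only 1% of the way from the background toward the observation, which is the situation the reduced salinity observation error variance is intended to avoid.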
Observations of atmospheric temperature and u and v wind components are assimilated at 17 of the 60 atmosphere model levels, selected to approximately correspond to the standard pressure levels (which range from 10 to 1000 hPa). Since the atmospheric model does not include the parameterization of processes such as moist convection, clouds, and precipitation, we do not assimilate observations of specific humidity q. Observations of ocean temperature, salinity, and zonal and meridional currents are assimilated at 23 of the 35 ocean model levels; these are irregularly spaced at depths ranging from 1 to 250 m. In the upper ocean, where the model grid is finest, the observation locations are chosen to approximate the resolution of a typical ocean observation profile; below this the vertical frequency of the observations is limited by the relative coarseness of the model grid. Although insufficient spatial-temporal resolution means that observations of ocean currents are not routinely assimilated into uncoupled operational assimilation systems, this is an idealized study and we are not attempting to emulate a real-world observing system. Assimilating observations of all ocean variables will provide guidance on the type of error covariance information that can be generated by (and should be incorporated into) an ideal coupled assimilation system. The same observation network is used for all cycles, so the same number of observations is assimilated at each observation time and the observation error covariance matrix R is constant. Note that the 3-hourly observation frequency excludes the start of each 12-h assimilation window, that is, observations are at forecast lead times $t_k$ = 3, 6, 9, and 12 h.

4. Results
Since the aim of this study is to understand the relationships between the errors in the atmosphere and ocean forecasts, we focus our discussion on the coupled atmosphere-ocean error cross correlations. We consider cross correlations rather than cross covariances because different components of the coupled state vector have very different levels of variability; standardizing prevents variables with large error variances from dominating the structure of the covariance matrix. Using a 12-h assimilation window enables us to compute one set of error correlations from 12-h forecast ensembles from day to night (valid at 1200 UTC) and one set of error correlations from 12-h forecast ensembles from night to day (valid at 0000 UTC).
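Standardizing a covariance matrix into a correlation matrix and extracting the atmosphere-ocean block can be sketched as below. The sketch assumes that the coupled state vector is ordered with the atmosphere variables first, followed by the ocean variables, and that all error variances are nonzero; the function name and argument `n_atm` are hypothetical.

```python
import numpy as np


def cross_correlation_block(P, n_atm):
    """Atmosphere-ocean block of the correlation matrix derived from P.

    P: (m, m) error covariance matrix with strictly positive diagonal.
    n_atm: number of atmosphere entries at the start of the state vector
           (ordering is an assumption of this sketch).
    Returns an (n_atm, m - n_atm) array of cross correlations.
    """
    P = np.asarray(P, dtype=float)
    std = np.sqrt(np.diag(P))          # error standard deviations
    C = P / np.outer(std, std)         # C_ij = P_ij / (sigma_i * sigma_j)
    return C[:n_atm, n_atm:]


# Example: variances 4 and 1 with covariance 1 give a correlation of 0.5.
block = cross_correlation_block([[4.0, 1.0], [1.0, 1.0]], n_atm=1)
```

Dividing by the product of standard deviations removes the dependence on each variable's error magnitude, so that entries for variables with very different variability can be compared directly.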
Before beginning our discussion, we reemphasize that we are examining the nature of the correlations between the errors in different atmosphere and ocean forecast fields rather than between the forecast fields themselves. The errors in two different variables will not necessarily interact in the same way as the model variables themselves, and this interaction may not be linear, especially when there are multiple variables at play and the relationships between them are strongly nonlinear. A positive correlation between the errors in two fields means that an increase (decrease) in the error in one field will be associated with an increase (decrease) in the error in the other, and a negative correlation means that an increase (decrease) in the errors in one field will be associated with a decrease (increase) in the error in the other. If an error has negative sign, ''increase'' means that its value moves toward zero, and so its magnitude will actually decrease. Similarly, if a negative error value ''decreases,'' then it becomes more negative and its magnitude increases.
Selected results for the June and December 500-member ensembles are shown in Figs. 4, 6, 11, and 12; there is significant variation in the atmosphere-ocean error cross correlation structures between summer and winter, and also between day and night. The strongest cross correlations are seen in the lower part of the atmosphere and upper portion of the ocean column; beyond this the atmosphere-ocean errors appear to be mostly uncorrelated. This is consistent with what we would expect, as the atmosphere-ocean boundary layer is the region directly influenced by air-sea exchange processes and therefore the area where errors in one fluid are likely to have the greatest impact on the other. In the following discussion, we explain the various correlation patterns we observe by considering knowledge of the underlying coupled model physics, external forcing, and known atmosphere-ocean feedback mechanisms.

a. June ensemble
In the summer, solar insolation is strong (the prescribed radiation forcing assumes a clear sky) and the mean net heat flux is positive (i.e., into the ocean); the ocean mixed layer is shallow (maximum depth of approximately 25 m), which implies that the upper ocean is thermally stratified. The atmosphere-ocean surface temperature difference and hence the magnitudes of the turbulent heat fluxes are small relative to the winter case, implying less air-sea heat exchange and weaker coupling. Consequently, the atmosphere-ocean error correlations are generally fairly small and concentrated in the top few meters of the ocean and bottom 100 hPa or so of the atmosphere. The exceptions to this are the correlations between errors in the upper-ocean currents and near-surface winds (Fig. 4) and between the errors in the near-surface ocean salinity and atmosphere temperature and humidity (Figs. 6f,h).

1) WIND-CURRENT ERROR CROSS CORRELATIONS
The errors in the near-surface $u$-wind and $u$-current components have strong positive correlation, as do the near-surface $v$-wind and $v$-current components. The $u_o$ and $v_o$ components of the ocean velocity depend on the zonal and meridional components of the surface wind stress $\tau_x$ and $\tau_y$ through the ocean surface boundary condition (21), where $K_u$ and $K_v$ are turbulent exchange coefficients; these stresses act to transfer momentum from the atmosphere to the ocean and drive the ocean surface currents, and they are a function of wind speed and direction [(3) and (4)]. Equation (21) tells us that, at the ocean surface, the vertical shear of the $u_o$ ($v_o$) current is proportional to the zonal (meridional) wind stress. In the absence of rotation, the ocean surface currents will accelerate in the direction of the force of the wind stress; therefore, the effect of a positive perturbation in $\tau_x$ ($\tau_y$) will be to increase momentum (and thus velocity) in the direction of the $u_o$ ($v_o$) current, and the converse will apply for a negative $\tau_x$ ($\tau_y$) perturbation. Now, if we consider a small perturbation $\delta u_n$ to the zonal component of the surface wind $u_n$ and take the tangent linear of (3) for $\tau_x$ (assuming $\rho_a$ and $C_D$ are unperturbed), we find

$$\delta\tau_x \approx \rho_a C_D \left[ u_n^2 \left(u_n^2 + v_n^2\right)^{-1/2} + \left(u_n^2 + v_n^2\right)^{1/2} \right] \delta u_n, \qquad (22)$$

where $\delta\tau_x$ is the resultant perturbation in $\tau_x$. The drag coefficient $C_D$, air density $\rho_a$, and terms inside the square brackets of (22) are positive, so this tells us that errors in $\tau_x$ and $u_n$ will, to first order, be positively correlated; we can use a similar argument to show that the same holds true for errors in $\tau_y$ and $v_n$. We therefore expect errors in the near-surface $u$ wind and $u$ current, and $v$ wind and $v$ current, to be positively correlated. Below the ocean surface, mixing diffuses the wind-induced momentum downward so that the influence of the wind forcing (and wind forcing errors) on the ocean currents decays with depth; instead, Coriolis and horizontal pressure gradient forces dominate.
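The sign argument for the linearized wind stress can be checked numerically. The sketch below assumes the bulk form $\tau_x = \rho_a C_D (u_n^2 + v_n^2)^{1/2} u_n$, which is the form implied by the tangent linear expression in the text; the values of `RHO_A` and `C_D` and the sample wind components are illustrative and are not taken from the model.

```python
import numpy as np

RHO_A = 1.2    # air density (kg m^-3), illustrative value
C_D = 1.3e-3   # drag coefficient, illustrative value

def tau_x(u, v):
    """Zonal wind stress under the assumed bulk formula."""
    return RHO_A * C_D * np.hypot(u, v) * u

def dtau_x_du(u, v):
    """Tangent linear coefficient multiplying the perturbation in u."""
    speed = np.hypot(u, v)
    return RHO_A * C_D * (u**2 / speed + speed)

# Compare with a centered finite difference, here for an easterly wind (u < 0).
u_n, v_n, du = -6.0, 3.0, 1e-6
fd = (tau_x(u_n + du, v_n) - tau_x(u_n - du, v_n)) / (2 * du)
tl = dtau_x_du(u_n, v_n)
```

The coefficient `tl` is positive whatever the sign of `u_n`, since both bracketed terms are nonnegative and the wind speed is nonzero, which is the property the argument relies on.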
We note that the idealizing assumptions made by Ekman (e.g., Stewart 2008, chapter 9) do not typically hold for our model, and so it does not consistently simulate the spiraling vertical flow predicted by the theory. In addition, the difference between two Ekman velocity profiles formed from different surface stresses and vertical eddy viscosities will not necessarily also be an Ekman spiral. Therefore, even when the structure of the ageostrophic component of the model forecast flow is close to the classical Ekman spiral, the (truth $-$ forecast) error vectors do not exhibit the same regular pattern of rotation with depth; instead, they fluctuate in both direction and magnitude. This is illustrated for an example case in Fig. 5.

2) ATMOSPHERE-OCEAN TEMPERATURE ERROR CROSS CORRELATIONS
The correlations between errors in the atmosphere and ocean temperature are overall weak for the summer case, with minimum and maximum values of $-0.29$ and $+0.5$, respectively (Figs. 6a,b). Intuitively, we might expect the errors in the near-surface region to be negatively correlated (atmosphere gaining too much heat, implying the ocean losing too much heat), but they appear small and positive both day and night.
During daylight hours, the temperature of the upper ocean is essentially being driven by the strong summer solar insolation. The atmosphere temperature field is gaining heat from the ocean via the sensible heat flux $Q_H$, but this loss of heat from the ocean to the atmosphere is small relative to the magnitude of the shortwave radiation flux, and so the net heat flux $Q_{\mathrm{net}}$ [(1)] into the ocean is positive, meaning that the ocean is also gaining heat, as illustrated in Fig. 7. For a given ensemble member, the forecast ocean surface temperature will (i) become too warm if the ocean gains too much heat relative to the truth, that is, $Q_{\mathrm{net}} > 0$ is overestimated, or (ii) become too cold if the ocean is not gaining enough heat relative to the truth, that is, $Q_{\mathrm{net}} > 0$ is underestimated.
Similarly, the forecast atmosphere surface temperature will (iii) become too warm if the atmosphere gains too much heat relative to the truth, that is, $|Q_H|$ is overestimated with $Q_H < 0$, or (iv) become too cold if the atmosphere is not gaining enough heat relative to the truth, that is, $|Q_H|$ is underestimated with $Q_H < 0$.
During the night, the atmosphere temperature field is still gaining heat from the ocean via $Q_H$, but the net heat flux $Q_{\mathrm{net}}$ becomes negative and so the ocean will be losing heat (see Fig. 7). For a given ensemble member, the forecast ocean surface temperature will (v) become too warm if the ocean loses too little heat relative to the truth, that is, $|Q_{\mathrm{net}}|$ is underestimated with $Q_{\mathrm{net}} < 0$, or (vi) become too cold if the ocean loses too much heat, that is, $|Q_{\mathrm{net}}|$ is overestimated with $Q_{\mathrm{net}} < 0$.
The relationship between error in predicted heat gain or loss and error in predicted temperature is illustrated graphically in Fig. 8. To summarize, a positive atmosphere-ocean surface temperature error correlation during the day implies that the atmosphere and ocean are 1) both gaining too much heat relative to the truth [cases (i) and (iii)] or 2) both gaining too little heat relative to the truth [cases (ii) and (iv)], and a positive atmosphere-ocean surface temperature error correlation during the night implies that 3) the atmosphere is gaining too much heat and the ocean is losing too little heat [cases (iii) and (v)] or 4) the atmosphere is not gaining enough heat and the ocean is losing too much heat [cases (iv) and (vi)].
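The link between a heating-rate error and the growth of the temperature error can be made concrete with a toy calculation. This sketch assumes that both the true and estimated temperatures evolve linearly in time, which is a simplification of the schematic relationship rather than the coupled model; the function `temperature_error` and all rates and initial errors are hypothetical.

```python
import numpy as np

def temperature_error(hours, init_error, rate_true, rate_est):
    """(truth - estimate) error when both temperatures change linearly in time.

    init_error is (truth - estimate) at the initial time (K);
    rates are in K per hour, positive for heat gain and negative for heat loss.
    """
    return init_error + (rate_true - rate_est) * hours

hours = np.arange(0, 13, 3)  # 0, 3, 6, 9, 12 h

# Heat gain (positive rates): underestimated, then overestimated.
gain_under = temperature_error(hours, -0.2, 0.10, 0.05)
gain_over = temperature_error(hours, 0.2, 0.10, 0.15)

# Heat loss (negative rates): magnitude underestimated, then overestimated.
loss_under = temperature_error(hours, 0.2, -0.10, -0.05)
loss_over = temperature_error(hours, -0.2, -0.10, -0.15)
```

In this linear setting the sign of the error trend depends only on the difference between the true and estimated rates, not on the sign of the initial error, which is why cases (i) to (vi) can be stated without reference to the initial state.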
Since $Q_{\mathrm{SW}}$ and $Q_{\mathrm{LW}}$ are prescribed, errors in the magnitude of $Q_{\mathrm{net}}$, and in turn the amount of heat lost or gained by the ocean, will come from the combination of errors in the magnitude of the latent and sensible heat fluxes $Q_E$ and $Q_H$. Assuming both $Q_H < 0$ and $Q_E < 0$, for each of the combinations 1-4 above to hold, $|Q_E|$ must be underestimated when $|Q_H|$ is overestimated and vice versa, suggesting that errors in $Q_E$ and $Q_H$ are negatively correlated for this case. These ideas are illustrated schematically in Fig. 9. Figure 10a shows a scatterplot of the errors in the sensible and latent heat flux for every ensemble member at the end of each assimilation window; there is a moderate negative trend between them, with positive errors in $Q_E$ associated with negative errors in $Q_H$ and vice versa, thus agreeing with our conjecture. The errors in $Q_E$ are larger than those in $Q_H$, which would suggest that the errors in $Q_{\mathrm{net}}$ are being driven by errors in $Q_E$; scatterplots of the errors in $Q_E$ versus $Q_{\mathrm{net}}$ and $Q_H$ versus $Q_{\mathrm{net}}$ (Figs. 10b,c) confirm this to be the case. The model equation for $Q_E$ [(6)] does not directly depend on the atmosphere temperature; rather, the errors in ocean heat gain/loss appear to be primarily coming from errors in the exchange of moisture. This would help to explain why the correlations between the near-surface atmosphere and ocean temperature errors are overall quite weak.
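The reasoning that larger latent heat flux errors will dominate the net heat flux error can be illustrated with synthetic numbers. The sketch assumes that $Q_{\mathrm{net}}$ in (1) is a simple sum of the four flux terms, so that with prescribed radiation the error in $Q_{\mathrm{net}}$ is the sum of the errors in $Q_H$ and $Q_E$; the error amplitudes and the strength of the negative relationship are invented for the example and are not the values behind Fig. 10.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Synthetic flux errors (W m^-2): Q_E errors are larger, and Q_H errors
# are moderately negatively related to them (illustrative choices).
err_qe = 20.0 * rng.standard_normal(n)
err_qh = -0.15 * err_qe + 4.0 * rng.standard_normal(n)

# Prescribed Q_SW and Q_LW contribute no error (assumed additive Q_net).
err_qnet = err_qh + err_qe

def corr(a, b):
    """Sample correlation between two error samples."""
    return np.corrcoef(a, b)[0, 1]

corr_h_e = corr(err_qh, err_qe)        # negative by construction
corr_e_net = corr(err_qe, err_qnet)    # close to +1: Q_E drives Q_net
corr_h_net = corr(err_qh, err_qnet)    # weaker in magnitude
```

With these choices the error in the net flux tracks the error in $Q_E$ almost exactly, while its relationship with the error in $Q_H$ is much weaker.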

3) ATMOSPHERE TEMPERATURE AND HUMIDITY-OCEAN SALINITY ERROR CROSS CORRELATIONS
Errors in ocean salinity and atmosphere temperature in the atmosphere-ocean boundary layer show strong correlation for the forecast valid at 1200 UTC (close to midnight local time, Fig. 6f). The correlations are strong and positive in the near-surface region and switch to negative around the height of the atmospheric boundary layer (ABL). The potential origin of this relationship is not immediately obvious from the model equations as neither is explicitly included in the surface boundary condition of the other. However, it may be explained via the relationship between the errors in the ocean salinity and specific humidity, which show an almost equal but opposite (negative) correlation in the same region (Fig. 6h). The surface boundary condition for salinity depends on the latent heat flux $Q_E$. From (6) we expect a positive linear association between errors in the surface specific humidity $q_n$ and errors in $Q_E$, and scatterplots of the errors in these two fields for each ensemble member confirm this (Fig. 10d). An overestimate (underestimate) of the surface specific humidity will therefore be associated with an overestimate (underestimate) of the magnitude of $Q_E$ (relative to the truth). Assuming $Q_E < 0$, an overestimate of the magnitude of $Q_E$ will result in an overestimate of evaporation from the ocean surface, and this will cause errors in the surface salinity to decrease [remember that since error = (truth $-$ forecast), "decrease" can also mean become more negative and thus increase in magnitude]. Conversely, an underestimate of the magnitude of $Q_E$ will lead to too little evaporation and an increase in the surface salinity error. By this reasoning, we expect the near-surface salinity and specific humidity errors to be negatively correlated, and this holds for our results (Fig. 6h).
Next, we consider the relationship between the near-surface atmosphere temperature and specific humidity errors; from (5) we expect errors in the atmosphere surface temperature and the sensible heat flux $Q_H$ to be positively correlated to first order, and, as previously stated, (6) tells us that errors in the surface specific humidity and the latent heat flux $Q_E$ will be positively correlated too. From our explanation of the mechanism for the observed positive near-surface atmosphere-ocean temperature error correlations, we know that, for this case, an overestimate of the magnitude of $Q_H$ will generally be associated with an underestimate of the magnitude of $Q_E$ (and vice versa, see Fig. 10a); given this, we would expect errors in the near-surface atmosphere temperature and specific humidity fields to be negatively correlated. Although weak, this relationship is seen for both the 1200 and 0000 UTC forecast error correlations (not shown). Together, the negative relationship between errors in near-surface salinity and specific humidity, and between errors in near-surface atmosphere temperature and specific humidity, would imply a positive relationship between errors in the near-surface salinity and atmosphere temperature, hence explaining the correlations seen in Fig. 6f.
More generally, errors in the heat and moisture content of the lower atmosphere will lead to errors in the model-predicted ABL height and misplacement of the position and gradient of the temperature inversion capping the ABL; this will introduce something akin to a phase error and hence explain the change in sign of the near-surface salinity-atmosphere temperature and salinity-humidity error correlations at around 900 hPa.
The same strength of correlation between the errors in the near-surface ocean salinity and atmosphere temperature and humidity is not seen for the forecast valid at 0000 UTC (local day; Figs. 6e,g). During the day, strong solar insolation warms and stabilizes the upper ocean; because the turbulent heat fluxes, and their errors, are relatively small in magnitude, the structure of the ocean is dominated by this solar heating. At night, the ocean is losing heat and the stratification of the water column is weaker; the ocean is more responsive to perturbations in the turbulent fluxes, thus enabling stronger error cross correlations to develop.

4) WIND SPEED-OCEAN TEMPERATURE ERROR CROSS CORRELATIONS
The structure of the near-surface ocean temperature and wind speed error cross correlations almost mirrors that of the atmosphere-ocean temperature errors (cf. Figs. 6a and 6c and Figs. 6b and 6d). Stronger surface winds act to draw heat from the ocean and induce ocean mixing. If the surface wind speed is persistently overestimated (underestimated), turbulent heat exchange will be enhanced (reduced) and ocean heat loss will also be overestimated (underestimated); this will cause the ocean temperature to become underestimated (overestimated) relative to the truth, that is, the ocean will become too cold (warm), or even colder (warmer) if it is already too cold (warm). Therefore, a negative correlation between errors in near-surface ocean temperature and wind speed is consistent with what we would expect physically. Analogous to the misplacement of the ABL, errors in ocean-atmosphere heat exchange will result in errors in the vertical structure of the ocean and the modeled mixed layer depth, thus explaining the change in sign of the wind speed-ocean temperature error correlations around the bottom of the mixed layer.

FIG. 7. Schematic illustrating how the shortwave radiation flux $Q_{\mathrm{SW}}$ affects the net heat flux $Q_{\mathrm{net}}$ and in turn ocean heat gain or loss between day and night. In our model, all fluxes are positive downward; this simplified representation assumes that $\mathrm{SST} > T_{\mathrm{surf}}$ so that both $Q_E < 0$ and $Q_H < 0$.

b. December ensemble
In the winter, the strength of the incoming solar radiation is reduced and the mean net heat flux is negative. The surface winds and air-sea surface temperature differences are large compared to the summer case, leading to turbulent heat fluxes of greater magnitude and more heat exchange. The upper ocean is less stable and the nighttime ocean mixed layer is deeper (maximum depth of approximately 80 m). Greater air-sea coupling means that we see much stronger atmosphere-ocean error cross correlations in this case (Figs. 11, 12), and the influence of errors at the air-sea boundary spreads higher into the atmosphere and deeper into the ocean, as deep as approximately 50 m in the ocean and as high as approximately 500 hPa in the atmosphere. For the summer case, the differences between the day and night error cross correlations are relatively small, whereas for the winter case there are clear changes in correlation magnitude and sign between day and night, with stronger error correlations in the 0000 UTC (local day) forecast. In parallel with the near-surface salinity-atmosphere temperature and salinity-specific humidity error correlations in the summer case, the cross correlations typically seem to switch sign near the top of the ABL. We now discuss the December atmosphere-ocean error cross correlation patterns in more detail; the various physical processes and atmosphere-ocean feedback mechanisms we describe are illustrated schematically in Fig. 13.

FIG. 8. Schematic illustrating the relationship between error in predicted heat gain or loss and error in predicted temperature, for example, due to error in $Q_{\mathrm{net}}$ (ocean) or $Q_H$ (atmosphere). (a),(b) The solid black line denotes the truth and the red and blue lines represent realizations of temperatures for different initial values and different rates of heat gain/loss. (c),(d) The evolution of the (truth $-$ estimate) errors for each temperature estimate. (left) Temperature errors increase if heat gain is underestimated (red lines) and decrease if heat gain is overestimated (blue lines) regardless of whether the initial temperature is overestimated (solid lines) or underestimated (dashed lines). (right) Temperature errors decrease if heat loss is underestimated (red lines) and increase if heat loss is overestimated (blue lines) regardless of whether the initial temperature is overestimated (solid lines) or underestimated (dashed lines).

FIG. 9. Schematic illustrating how different combinations of errors in the magnitude of the latent and sensible heat fluxes $Q_E$ and $Q_H$ affect the correlation between errors in the atmosphere and ocean surface temperature: (a) during the day when the net heat flux $Q_{\mathrm{net}} > 0$ and (b) during the night when $Q_{\mathrm{net}} < 0$. Blue text indicates a negative atmosphere-ocean temperature error correlation and red text indicates a positive atmosphere-ocean temperature error correlation. In our model, all fluxes are positive downward; this simplified representation assumes that $\mathrm{SST} > T_{\mathrm{surf}}$ so that both $Q_E < 0$ and $Q_H < 0$.

OCTOBER 2017 SMITH ET AL.

1) ATMOSPHERE-OCEAN TEMPERATURE ERROR CROSS CORRELATIONS
In contrast to the summer case, the winter atmosphere-ocean temperature error cross correlations are reasonably strong and clearly structured (Figs. 11a,b). The correlations for the 1200 UTC (local day to night) forecast are relatively strong and negative within the atmosphere-ocean boundary layer (Fig. 11b), whereas the 0000 UTC (local day) correlations (Fig. 11a) are relatively weak and positive within the boundary layer but become stronger around the top of the ABL and then switch sign to negative; again, this indicates that errors in the heat and moisture exchange at the atmosphere-ocean boundary induce errors in the lower-atmosphere mixing scheme and in turn errors in the lower-atmosphere temperature and specific humidity profiles, particularly across the boundary-layer capping inversion.
Following the logic given for the positive June atmosphere-ocean temperature error correlations in the previous section, we expect one of scenarios 1 or 2 to hold at 0000 UTC (local day). This implies that the errors in the latent and sensible heat fluxes are negatively correlated, and a scatterplot of the errors in $Q_E$ and $Q_H$ for the 0000 UTC forecast ensembles confirms this to be the case (Fig. 14a). A negative atmosphere-ocean temperature error correlation during the night (1200 UTC) would imply that either the atmosphere temperature field is gaining too much heat and the ocean is losing too much heat, or that the atmosphere temperature field is not gaining enough heat and the ocean is losing too little heat (relative to the truth). For this relationship to hold, we would expect the errors in $Q_E$ and $Q_H$ at 1200 UTC to show a positive association; again, a scatterplot of the errors in $Q_E$ and $Q_H$ for the 1200 UTC forecast ensembles confirms this to be true here (Fig. 14b).

To explain why the correlation between errors in the sensible and latent heat fluxes changes sign between day and night, we need to consider where these errors may originate from. Equations (5) and (6) tell us that errors in $Q_H$ or $Q_E$ will come from error in the magnitude of the transfer coefficient $C_H$ or $C_E$, error in the magnitude of the surface wind speed $|U_n|$, and/or error in the air-sea temperature difference $\Delta T$ or air-sea humidity difference $\Delta q$. Considering the effect of a small perturbation to $\Delta T$ in (5) and to $\Delta q$ in (6) tells us that errors in $Q_H$ and $\Delta T$, and errors in $Q_E$ and $\Delta q$, will be positively correlated. Similarly, errors in $\Delta T$ and the atmosphere surface temperature will be positively correlated, as will errors in $\Delta q$ and the surface specific humidity. However, errors in near-surface atmosphere temperature and specific humidity are negatively correlated; thus, we expect errors in $\Delta T$ and $\Delta q$ to also be negatively correlated. The negative association between the errors in $Q_H$ and $Q_E$ at 0000 UTC therefore implies that they are primarily being driven by errors in $\Delta T$ and $\Delta q$, respectively. The positive correlation between the errors in $Q_H$ and $Q_E$ at 1200 UTC shown in Fig. 14b indicates that they are being dominated by a different source at night. Again using (5) and (6), we know that an error in the estimated surface wind speed will affect the estimated magnitude of $Q_H$ and $Q_E$ in the same way. Assuming $Q_E, Q_H < 0$, the effect of an increase (decrease) in the magnitude of the wind speed will be a decrease (increase) in error for both $Q_H$ and $Q_E$, thus leading to a positive error correlation between them. This suggests that, rather than temperature and specific humidity errors, it is errors in the wind speed that are mainly influencing the errors in $Q_H$ and $Q_E$ at night.
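The change of sign described above can be reproduced with a small Monte Carlo experiment. The sketch assumes generic bulk formulas in which each flux is a lumped coefficient times wind speed times an air-minus-sea difference; this is a stand-in for (5) and (6), not the model's exact equations. The coefficients `K_H` and `K_E`, the "true" state, the error amplitudes, and the assumed correlation of $-0.8$ between errors in $\Delta T$ and $\Delta q$ are all illustrative.

```python
import numpy as np

K_H = 1.45    # lumped rho_a * c_p * C_H, illustrative
K_E = 3600.0  # lumped rho_a * L_v * C_E, illustrative

def bulk_fluxes(speed, delta_t, delta_q):
    """Assumed bulk formulas, positive downward, differences are air minus sea."""
    return K_H * speed * delta_t, K_E * speed * delta_q

def flux_error_correlation(sd_speed, sd_dt, sd_dq, rho_tq=-0.8, n=5000, seed=0):
    """Correlation between (truth - forecast) errors in Q_H and Q_E."""
    rng = np.random.default_rng(seed)
    # "Truth": sea warmer and moister than air, so Q_H < 0 and Q_E < 0.
    speed_t, dt_t, dq_t = 8.0, -2.0, -3.0e-3
    e_speed = sd_speed * rng.standard_normal(n)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    e_dt = sd_dt * z1
    # Errors in delta_q are correlated with errors in delta_t (rho_tq < 0).
    e_dq = sd_dq * (rho_tq * z1 + np.sqrt(1.0 - rho_tq**2) * z2)
    qh_t, qe_t = bulk_fluxes(speed_t, dt_t, dq_t)
    qh_f, qe_f = bulk_fluxes(speed_t + e_speed, dt_t + e_dt, dq_t + e_dq)
    return np.corrcoef(qh_t - qh_f, qe_t - qe_f)[0, 1]

# "Day": errors dominated by the temperature and humidity differences.
corr_day = flux_error_correlation(sd_speed=0.05, sd_dt=0.5, sd_dq=1.0e-3)
# "Night": errors dominated by the wind speed.
corr_night = flux_error_correlation(sd_speed=1.5, sd_dt=0.05, sd_dq=1.0e-4)
```

When the difference errors dominate, the flux errors inherit their negative correlation; when the shared wind speed error dominates, both fluxes are perturbed in the same direction and the correlation becomes positive.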

2) WIND SPEED-OCEAN TEMPERATURE AND SALINITY ERROR CROSS CORRELATIONS
Unlike the summer ocean temperature-wind speed error correlations, the winter ocean temperature-wind speed correlations (Figs. 11c,d) do not mirror the atmosphere-ocean temperature correlations in the near-surface layer (Figs. 11a,b). In this case, it is the ocean temperature-wind speed and salinity-wind speed error correlations that reflect one another (Figs. 11e,f); errors in temperature and salinity will have opposing effects on the ocean density profile, and so this pattern of behavior is expected.
For the 1200 UTC (local night) forecast ensemble, the ocean temperature-wind speed error correlations are negative between the ABL and near-surface ocean but then become positive further into the ocean mixed layer, whereas the salinity-wind speed error correlations are positive between the ABL and near-surface ocean and become negative within the mixed layer. The explanation for a positive relationship between the near-surface salinity and wind speed errors follows from that given in the discussion of the June results for the negative correlation between the near-surface ocean temperature-wind speed errors (since errors in the near-surface ocean temperature and salinity are negatively correlated); overestimation (underestimation) of the surface wind speed will enhance (reduce) the evaporation of moisture from the ocean surface and thus cause the near-surface ocean salinity to become overestimated (underestimated) relative to the truth.
During the night, the ocean is typically less buoyant and less stable, and this leads to greater vertical turbulent mixing and a deepening of the ocean mixed layer. Errors in the extent of this mixing will produce errors in the ocean temperature and salinity profiles. One potential source of mixing errors is error in the vertical shear of the ocean velocity, which can in turn be generated by errors in surface wind stress caused by errors in the lower atmosphere wind profile. Figure 12b shows that the structure of the 1200 UTC correlations between errors in the magnitude of the atmosphere wind and ocean velocity is consistent with this argument: it is almost identical to that of the wind speed-ocean temperature and wind speed-salinity error correlations (Figs. 11d,f), although stronger in magnitude (and of opposite sign to those for wind speed-ocean temperature). The change in sign of the error correlations within the mixed layer suggests that, in order to restore uniformity of density, the wind-driven errors near to the surface are counterbalanced by reversing the direction of errors from below.

FIG. 13. Schematic illustrating how the correlation between errors in the sensible and latent heat fluxes affects the air-sea error cross correlations in the December case: (a) during the day when errors are driven by errors in the air-sea temperature and air-sea humidity difference and (b) during the night when errors are driven by errors in the near-surface wind speed. Note that, although these assume that the wind speed is overestimated, when the wind speed is underestimated the sign of everything is reversed and so the pattern of error cross correlations stays the same.
The 0000 UTC (local day) wind speed-ocean temperature and wind speed-salinity error cross correlations (Figs. 11c,e) are stronger than the 1200 UTC (local night) correlations and extend higher into the atmosphere. The wind speed-ocean temperature correlations are strong and positive throughout the lower atmosphere-upper ocean, and strong and negative around the top of the ABL; the wind speed-salinity errors display the opposite pattern, as in the 1200 UTC (local night) case. Figure 12a shows that these error correlations again have a similar structure to those between the atmosphere wind-ocean velocity errors but are reduced in magnitude, although in this case it is the wind speed-ocean temperature error correlations that have the same sign. During the day the absorption of solar radiation has a stabilizing effect and so the ocean column is typically more stratified and buoyant. This buoyancy acts as a barrier to vertical mixing, and so changes in the turbulent surface fluxes will not have the same effect on the ocean as at night (also recall that the errors in the daytime sensible and latent heat fluxes show a negative correlation which suggests that they are being driven by errors in the air-sea temperature and humidity differences rather than the errors in the wind speed). Because mixing is limited, the response of the ocean model to a perturbation in the near-surface wind profile will be more linear, that is, the ocean simply shifts from one stable density profile to another, hence the uniform structure of the correlations between the errors. Indeed, the weaker appearance of the 1200 UTC error cross correlations is perhaps a consequence of the nonlinearity of the vertical mixing process. 
The 1200 UTC ensemble error standard deviation profiles for ocean temperature and salinity both show sharp increases around the bottom of the mixed layer, which is consistent with the idea that the nighttime ocean is less stable, more turbulent, and thus more nonlinear in its response to perturbations in the near-surface wind.
c. Sensitivity to amplitude of errors in the ocean model
In these experiments, the amplitude of the initial background error perturbations and background error variances were chosen to be consistent with the variability of our 1D model system. In particular, the errors in the ocean model are much smaller in amplitude than would be expected in a full-scale 3D coupled model system. To investigate the effect of initial ocean errors comparable to those in a full-scale 3D system on the atmosphere-ocean error cross-correlation patterns, we reran our December experiment using an initial background state with larger ocean errors; this state was generated by inflating the size of the perturbations added to the ocean variables at the start of the initial 24-h spinup forecast. The prescribed background error standard deviations for the ocean variables were increased accordingly, as shown in Fig. 15. Initially, the prescribed observation error standard deviations were kept unchanged, but this led to overfitting of the ocean observations. Although the errors in the ocean state are initially more realistic in that they reflect the amplitude of errors that would be found in a 3D ocean model, they become damped by the dynamics of the idealized model and return to levels closer to those in the original experiment; this means that although the prescribed error variances in $\mathbf{B}_0$ are consistent with the size of the initial background perturbations, they are much larger than, and inconsistent with, the actual forecast variability of the 1D model. The 4D-Var algorithm then assumes that the model forecast state is far less accurate than it actually is and erroneously draws the analysis more closely to the observations. The analysis increments are therefore dominated by the observations, and we lose information about the error cross-correlation structures of the background flow.
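The overfitting mechanism can be seen in the scalar analogue of the analysis update, in which a single background value is combined with a single direct observation. This is a standard textbook simplification, not the coupled 4D-Var system used here, and the error standard deviations below are illustrative rather than the values prescribed in the experiments.

```python
import math

def analysis_weight(sigma_b, sigma_o):
    """Weight given to the observation in a scalar analysis update."""
    return sigma_b**2 / (sigma_b**2 + sigma_o**2)

def scalar_analysis(x_b, y, sigma_b, sigma_o):
    """Analysis = background + weight * (observation - background)."""
    return x_b + analysis_weight(sigma_b, sigma_o) * (y - x_b)

# Balanced case: background and observation errors of equal size.
w_original = analysis_weight(sigma_b=0.01, sigma_o=0.01)

# Background error inflated tenfold, observation error unchanged:
# the analysis is pulled almost entirely onto the observation.
w_inflated_b = analysis_weight(sigma_b=0.1, sigma_o=0.01)

# Observation error inflated by the same factor: the ratio is restored.
w_rescaled = analysis_weight(sigma_b=0.1, sigma_o=0.1)

x_a = scalar_analysis(x_b=10.0, y=10.2, sigma_b=0.1, sigma_o=0.01)
```

Only the ratio of the two error variances enters the weight, so inflating the background error alone moves the weight toward one, while inflating both by the same factor leaves it unchanged.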
To prevent this overfitting and maintain similar background to observation error ratios as in the original experiment, the error standard deviations of the ocean temperature, salinity, and current observations were increased to 0.05 K, 0.015 psu, and 0.05 m s$^{-1}$, respectively. The initial (truth $-$ background) errors and prescribed background and observation error standard deviations for this case are shown in Fig. 15. The large errors in the initial ocean background state lead to poor-quality analyses in the first two assimilation cycles (i.e., the analysis ensemble mean is far from the true state), so these are excluded from the computation of the error correlations via (20). A selection of the resulting ensemble error cross correlations is shown in Fig. 16. The 0000 UTC (local day) error cross correlation structures are broadly similar to the original case (Figs. 11, 12), except that they do not extend as deep into the upper ocean; the atmosphere-ocean temperature and wind speed-ocean salinity error cross correlations (Figs. 16a,e) are also weaker in magnitude. The corresponding 1200 UTC (local night) atmosphere-ocean error cross correlations match the original case less closely but are generally of the same sign in the near-surface region.
We would expect to see some differences in the detail of error correlation patterns for this new case as the initial background state has a different structure, and all ensemble estimates will inevitably contain sampling errors. Further, we reduced the effective sample size by excluding the results of the first two assimilation cycles. Nonetheless, these results are a useful reminder of how it is the relative weighting of the background and observation errors that is important in an assimilation system and demonstrate the reliance of the ensemble methodology on an appropriate choice of these weights for the model system being studied.

Summary
To fully realize the potential of coupled atmosphere-ocean data assimilation, proper representation of the relationship between the errors in the atmosphere and ocean model forecasts is needed. We have been using an idealized 1D coupled atmosphere-ocean model to explore ensembles of 4D-Var data assimilation as a means of capturing the characteristics and structure of the atmosphere-ocean forecast error cross correlations in coupled systems.
The strongest error cross correlations are seen in the near-surface atmosphere-ocean boundary layer, but beyond this the atmosphere and ocean errors appear to be mostly uncorrelated. Within the boundary region there is notable variation in the strength and structure of the error cross correlations between summer and winter, and also between day and night. These differences provide a valuable insight into the nature of coupled atmosphere-ocean forecast error correlations for different seasons and points in the diurnal cycle. They are most distinct in the winter case when the effect of solar insolation on ocean stability is reduced, surface winds are high, and the atmosphere-ocean surface temperature difference is large; these combine to produce turbulent heat fluxes of greater magnitude so that air-sea coupling is strong. The observed forecast error correlations can be explained by a careful consideration of the underlying model physics, forcing, and known atmosphere-ocean feedback mechanisms.
Introducing improved cross-covariance information between the two fluids in coupled assimilation will enable greater use of near-surface observations and should in turn produce more accurate and balanced atmosphere-ocean analysis states and more reliable coupled model forecasts and reanalyses. In addition to offering an indication of the type of error correlation structures that could be expected from coupled systems, this study has highlighted the fact that atmosphere-ocean forecast error cross correlations are very state and model dependent; they will naturally vary depending on factors such as location and time of day and year, but will also depend on features of the model and assimilation system design, such as window length. So, although it is expected that the inclusion of cross-covariance information in the 4D-Var forecast error covariance matrix will have a positive impact on the coupled assimilation, the static $\mathbf{B}_0$ formulation assumed in traditional 4D-Var may not be sufficient; rather, it will be important to introduce an element of flow dependence.
The knowledge gained from this study is now being used to develop new methods for approximating the statistics of the atmosphere-ocean forecast errors and for incorporating this information within a variational data assimilation framework. Longer term, this will include the development of a strongly coupled hybrid ensemble-4D-Var system. The next stage in this process is to incorporate the ensemble atmosphere-ocean forecast error cross-covariance information into our 1D strongly and weakly coupled 4D-Var assimilation systems and assess whether they do in fact help to generate more accurate and/or balanced analysis states.