Inflating transform matrices to mitigate assimilation errors with robust-filtering-based ensemble Kalman filters

The error covariance matrix plays an important role in maintaining the statistical properties of the ensemble in an ensemble Kalman filter. However, filter divergence may occur in data assimilation when the covariance matrix is estimated inaccurately. In this study, based on the ensemble time-local H-infinity filter, which inflates the eigenvalues of the analysis error covariance matrix, a new robust ensemble data assimilation method is proposed, referred to as inflation of the transform matrix eigenvalues. By design, the new filter may be preferred over traditional ensemble filters when model performance is not well known or changes unpredictably. The primary aim is to improve the performance of the assimilation system within the framework of ensemble filtering, according to the minimum/maximum rule of robust filtering. The proposed estimation method is tested on the well-known Lorenz-96 model, in order to investigate how inflating the transform matrix within the ensemble time-local H-infinity filter affects the robustness of the assimilation system under selected conditions, such as different assimilation steps, forcing parameters, ensemble sizes, and observation information. The experiments show that the proposed inflation transform matrix method is robust to changes in the system's parameters. Moreover, compared with traditional filtering methods, this robust filtering method improves the assimilation performance.


Introduction
In data assimilation, the ensemble Kalman filter (EnKF) (Evensen, 2003) provides advantages over the traditional Kalman filter (KF) by overcoming several of its shortcomings at reduced computational cost. However, due to model errors, inaccurate initial background conditions, and limited ensemble sizes, the EnKF generally produces an inaccurate estimate of the error covariance matrix, which degrades the filtering performance and can eventually result in filter divergence. To compensate for these defects, Anderson and Anderson (1999) proposed a method referred to as covariance inflation; Sakov and Bertino (2011) introduced localization to address spurious background error correlations; and Hoteit (2013, 2014) applied residual nudging. However, in practical atmospheric problems, the model and observation errors are often non-Gaussian and/or biased, and their statistical properties are often unknown, or only partially known. Therefore, there is a need to study assimilation methods in a more general setting.
Motivated by robust control concepts developed in the control engineering field over many years, robust filtering applies the H-infinity performance-index norm to filters; it has also been used to address uncertainty in data assimilation systems (Luo and Hoteit, 2011), where it is referred to as the H-infinity filter (HF) (Simon, 2006). Wang and Cai (2008) compared the HF and the KF, and showed that the HF assumes deterministic disturbances with unknown but finite energy, instead of a state-space system driven by white noise. For convenience in applying H-infinity filtering theory to sequential data assimilation, and in order to handle high-dimensional data assimilation systems, Luo and Hoteit (2011) proposed a variant of the HF, known as the time-local HF (TLHF), and then applied the ensemble idea to the TLHF. Altaf et al. (2013) improved the accuracy of ensemble Kalman prediction using a robust adaptive inflation method.
For the purpose of constructing an improved error covariance matrix, Luo and Hoteit (2011) directly inflated the eigenvalues of the analysis error covariance matrix (hereafter referred to as IA). In this study, based on the ensemble time-local H-infinity filter (EnTLHF), a new robust data assimilation method is proposed that inflates the transform matrix eigenvalues (hereafter referred to as IT). The essence of the method is to conduct a singular value decomposition (SVD) and obtain the eigenvalues of the transform matrix. Subject to the positive semi-definiteness condition on the inverse of the analysis error covariance matrix, the eigenvalues of the transform matrix are inflated, which indirectly inflates the analysis error covariance matrix. This approach can potentially avoid the costly SVD of the full analysis error covariance matrix.
In this article, a robust EnTLHF is developed and implemented in the Lorenz-96 model for state forecasting, based on the ensemble transform Kalman filter (ETKF) (Bishop et al., 2001). Similar experiments were reported for the IA method (Luo and Hoteit, 2011). Although the IA method was shown to improve the filter performance by significant amounts, it loses accuracy when the parameters change, and filter divergence may even occur, which motivates this study. The new method proposed in this study is compared with the ETKF and IA under selected conditions. The results suggest that the IT, implemented in the framework of the EnTLHF, strongly mitigates assimilation errors compared with the ETKF and IA.

Time-local H-infinity filter
The HF (Simon, 2006) is a robust filter that does not rely on the assumptions about the nature of the model and observation errors on which the KF depends. No matter what the size of the model and observation errors, if the H-infinity norm of the estimation error is less than a preset positive value 1/γ, then the performance of the HF estimator can be guaranteed. The minimum/maximum rule is used to compute the estimates under uncertainty in the background, model errors, and observation errors.
To satisfy the sequential data assimilation in certain circumstances, Luo and Hoteit (2011) proposed that a TLHF would be more flexible than the HF.
Define a local cost function of the TLHF:

J_i = ||x_i^a − x_i||²_{S_i} / ( ||x_i^b − x_i||²_{Q_i^{−1}} + ||v_i||²_{R_i^{−1}} ), (2)

where Q_i^{−1} and R_i^{−1} are the corresponding information matrices. The weight matrices S_i, Q_i, and R_i can be chosen freely by the designer, and γ_i is a suitable local performance level, which satisfies

sup J_i < 1/γ_i, (3)

where sup denotes the supremum taken over the uncertainties in the background, the model errors, and the observation errors.

EnTLHF-IA (IA)
It is assumed that the system in the assimilation is nonlinear, with M_{i,i−1} possibly being nonlinear, while H_i is a linear operator. Concretely, let {x_{i,j}^b : j = 1, …, n} be the n-member background ensemble at time instant i, which is the prediction of the analysis ensemble from the previous cycle. The steps of the EnTLHF can be derived as follows.

Prediction step:

x_{i,j}^b = M_{i,i−1}(x_{i−1,j}^a), j = 1, …, n. (4)

Filtering step:

x_i^a = x_i^b + G_i (y_i − H_i x_i^b), (5)

G_i = Δ_i^a H_i^T R_i^{−1}, (6)

where Δ_i denotes the uncertainty matrix; the superscripts 'a' and 'b' denote the analysis and background, respectively; and G_i represents the gain matrix, subject to the constraint

(Δ_i^a)^{−1} = (Δ_i^b)^{−1} + H_i^T R_i^{−1} H_i − γ_i S_i > 0. (7)

It should be noted that, without its last term, Equation (7) corresponds to the inverse of the analysis covariance matrix obtained in the KF (Luo and Hoteit, 2011). Therefore, the EnTLHF mainly differs from the KF in that it introduces the extra term −γ_i S_i into the inverse covariance update formula, Equation (7); the KF can thus be considered a special case of the EnTLHF with performance level γ_i = 0.
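As a concrete illustration, the inverse-covariance update of Equation (7) can be sketched numerically. The toy matrices, dimensions, and function name below are illustrative assumptions, not part of the original derivation; the sketch only demonstrates that the extra −γS term inflates the analysis covariance relative to the KF case.

```python
import numpy as np

def entlhf_analysis_cov(P_b, H, R, gamma, S):
    """Sketch of the inverse-covariance update of Equation (7):
    (P_a)^-1 = (P_b)^-1 + H^T R^-1 H - gamma * S.
    Setting gamma = 0 recovers the standard Kalman filter update."""
    P_a_inv = np.linalg.inv(P_b) + H.T @ np.linalg.inv(R) @ H - gamma * S
    return np.linalg.inv(P_a_inv)

# Toy 2-state, 1-observation system (all values illustrative).
P_b = np.diag([1.0, 2.0])            # background error covariance
H = np.array([[1.0, 0.0]])           # linear observation operator
R = np.array([[0.1]])                # observation error covariance
S = np.eye(2)                        # weight matrix chosen by the designer

P_kf = entlhf_analysis_cov(P_b, H, R, gamma=0.0, S=S)  # KF special case
P_hf = entlhf_analysis_cov(P_b, H, R, gamma=0.3, S=S)  # robust update

# The extra -gamma*S term lowers the inverse covariance,
# i.e. it inflates the analysis error covariance.
assert np.all(np.linalg.eigvalsh(P_hf) > np.linalg.eigvalsh(P_kf))
```

The assertion holds because subtracting the positive semi-definite term γS from the inverse covariance enlarges every eigenvalue of the analysis covariance itself.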
From the above it can be concluded that, when γ_i = 0, the bound in Equation (3) only guarantees that the estimation error 'energy' (weighted by S_i) is less than infinity. Therefore, such an ensemble filtering algorithm does not guarantee that the estimation error 'energy' is bounded during the assimilation; in other words, the algorithm is not robust to the errors. The TLHF, however, can guarantee the minimum energy of the estimation error when the system is disturbed, even though the filtering accuracy may not be high for highly nonlinear systems. Therefore, when the ensemble idea is applied to the TLHF, a robust framework for ensemble filtering can be formed. In Equation (7), if γ_i > 0, then −γ_i S_i ≤ 0, and thereby the uncertainty matrix Δ_i^a is larger than that obtained in the EnKF (γ_i = 0). The presence of the extra term −γ_i S_i in Equation (7) therefore introduces inflation into the covariance matrix obtained by the EnKF. The EnTLHF thus provides a general mathematical mechanism, rooted in robust control theory, for conducting covariance inflation.

EnTLHF-IT (IT)
Similarly, a more 'sophisticated' inflation scheme based on Equation (7) can also be derived. Suppose that the inverse of the analysis error covariance matrix in its ensemble form is equal to that of Equation (7). In the standard formulas of the ETKF: (1) the analysis perturbations are X_i^a = X_i^b T_i U, where the centering matrix U (Julier, 2003) satisfies UU^T = I and preserves the zero mean of the perturbations (Livings et al., 2007), with 1_m being the m-dimensional vector whose elements are all equal to 1; (2) the transform matrix is T_i = C_i (Γ_i + I)^{−0.5}, where C_i and Γ_i are the eigenvectors and eigenvalues obtained by conducting an SVD on (H_i X_i^b)^T R_i^{−1} (H_i X_i^b), and X_i^b represents the square root of the sample covariance matrix of the background ensemble; (3) the analysis error covariance matrix is P_i^a = X_i^b T_i T_i^T (X_i^b)^T. The scalar α is named the performance level coefficient (PLC). With this choice, the following is obtained: in Equation (11), the eigenvalues Γ_{i,j} are modified by using Γ_{i,j} − α Γ_{i,n−1} instead of Γ_{i,j}, which yields the inflated transform matrix T_i = C_i (Γ̃_i + I)^{−0.5}, where Γ̃_i denotes the modified eigenvalues and (Γ̃_i + I)^{−0.5} contains the eigenvalues of the transform matrix T_i. That is to say, the transform matrix is inflated as α increases, and this scheme is referred to as the inflation transform matrix (IT) form of the EnTLHF. The essence of this method is to inflate the analysis error covariance matrix by means of inflating the eigenvalues of the transform matrix.
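A minimal numerical sketch of an IT-style transform follows. The precise eigenvalue shift used here (α times the smallest eigenvalue of the decomposed matrix) is an illustrative reading of the scheme, and the toy perturbation matrix is not mean-centred; the function name and dimensions are my own. The sketch only demonstrates that shifting the eigenvalues downward before forming T inflates the analysis spread.

```python
import numpy as np

def it_transform(Xb, H, R_inv, alpha=0.0):
    """Sketch of the ETKF transform matrix with IT-style inflation.
    alpha = 0 gives the standard transform T = C (Gamma + I)^(-1/2);
    alpha > 0 shifts the eigenvalues downward (here, by alpha times the
    smallest eigenvalue -- an illustrative choice), which inflates the
    analysis perturbations Xa = Xb @ T."""
    A = (H @ Xb).T @ R_inv @ (H @ Xb)      # n x n matrix to decompose
    Gamma, C = np.linalg.eigh(A)           # eigenvalues in ascending order
    shift = alpha * Gamma[0]               # assumed inflation shift
    return C @ np.diag((Gamma - shift + 1.0) ** -0.5)

# Toy setup: m = 5 state components, n = 4 ensemble members.
rng = np.random.default_rng(0)
Xb = rng.standard_normal((5, 4))           # toy perturbation matrix
H = np.eye(5)
R_inv = np.eye(5) / 0.1

T0 = it_transform(Xb, H, R_inv, alpha=0.0)
T6 = it_transform(Xb, H, R_inv, alpha=0.6)
Pa0 = Xb @ T0 @ T0.T @ Xb.T                # analysis covariance, no inflation
Pa6 = Xb @ T6 @ T6.T @ Xb.T                # analysis covariance, inflated
assert np.trace(Pa6) > np.trace(Pa0)       # IT increases the analysis spread
```

Note that only an n × n matrix is decomposed, which is the source of the computational advantage discussed below.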
This inflation mechanism is similar to that introduced by Luo and Hoteit (2011), although the two techniques differ in essence. The basic principle of the earlier scheme is as follows. First, an SVD is conducted on the analysis error covariance matrix. Second, the eigenvalues λ_{i,j} (j = 1, …, m) are inflated; in other words, inflated eigenvalues are used in place of λ_{i,j} in the analysis error covariance matrix, where α > 0 is the PLC. The analysis error covariance matrix is thus inflated as α grows. This scheme is referred to as the EnTLHF-IA, which directly inflates the eigenvalues of the analysis error covariance matrix.
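For comparison, the IA scheme can be sketched as follows. The specific inflation rule (adding α times the smallest eigenvalue to every mode) and the toy covariance are illustrative assumptions; the point of the sketch is the direct decomposition and eigenvalue inflation of the full analysis covariance matrix.

```python
import numpy as np

def ia_inflate(P_a, alpha):
    """Sketch of IA-style inflation: conduct an SVD of the analysis error
    covariance matrix and directly inflate its eigenvalues. The exact
    inflation rule is illustrative (each eigenvalue is increased by alpha
    times the smallest one); the key point is the decomposition of the
    full m x m covariance matrix, which costs O(m^3)."""
    U, s, Vt = np.linalg.svd(P_a)       # full decomposition of P_a
    s_inflated = s + alpha * s[-1]      # inflate every eigenvalue
    return (U * s_inflated) @ Vt        # rebuild the inflated covariance

P_a = np.diag([2.0, 1.0, 0.5])          # toy analysis covariance
P_inf = ia_inflate(P_a, alpha=0.6)
assert np.trace(P_inf) > np.trace(P_a)  # the covariance is inflated
```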
Although both the IT and the IA conduct an SVD, the IA directly modifies the eigenvalues of the analysis covariance matrix, which is expensive for large-scale problems. The advantage of instead modifying the eigenvalues of the transform matrix T is that it is more economical than the IA.
In the conventional update scheme, the inflation factor is invariant in time. In contrast, in the new inflation scheme the effective inflation is not only adaptive in time (with i), but also differs for each mode of the system (each column of X_i^b C_i), even for a fixed value of α. Alternative adaptive inflation schemes have also been proposed, for example by Hoteit et al. (2002).

Lorenz-96 model
The Lorenz-96 model (Lorenz and Emanuel, 1998) is a strongly nonlinear dynamical system with quadratic nonlinearity, governed by

dX_k/dt = (X_{k+1} − X_{k−2}) X_{k−1} − X_k + F, k = 1, 2, …, 40,

where, for consistency, the cyclic boundary conditions X_{−1} = X_{39}, X_0 = X_{40}, and X_{41} = X_1 are defined. The true value of the forcing parameter F is taken to be 8; however, other values of F may be chosen in the assimilation, representing different degrees of model error. In this study, a fourth-order Runge-Kutta method is used to integrate (and discretize) the system from time 0 to 125, with a constant integration step of 0.05 (2500 integration steps overall). The first 500 steps are discarded to avoid transient effects, and the remaining 2000 steps are used for the data assimilation.
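The model and integration setup described above can be sketched in a few lines (variable names and the choice of initial perturbation are my own):

```python
import numpy as np

K, F = 40, 8.0          # state dimension and true forcing used in the paper
DT = 0.05               # constant integration step

def lorenz96_rhs(x, F=F):
    """Lorenz-96 tendencies: dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F,
    with the cyclic boundary conditions handled by np.roll."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=DT):
    """One fourth-order Runge-Kutta step, as used to integrate the model."""
    k1 = lorenz96_rhs(x)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2)
    k4 = lorenz96_rhs(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Integrate 2500 steps (time 0 to 125) and discard the first 500 as spin-up.
x = F * np.ones(K)
x[0] += 0.01                      # small perturbation to leave the fixed point
traj = []
for step in range(2500):
    x = rk4_step(x)
    if step >= 500:
        traj.append(x.copy())
truth = np.array(traj)            # 2000 states used for the assimilation
```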

Observation system
The observation system records observations of the state vector x_i at the i-th time step as

y_i = H_i x_i + v_i,

where v_i is the observation error, which follows a Gaussian distribution with zero mean and covariance R_i.
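Synthetic observations of this kind can be generated as below; the function name is my own, and the value R = 0.1 matches the experiments reported later. The selection of every other state element illustrates one of the observation scenarios considered.

```python
import numpy as np

def observe(x_true, H, R_var=0.1, rng=None):
    """Generate a synthetic observation y = H x + v, where v is drawn from
    a zero-mean Gaussian with variance R_var (R = 0.1 in the experiments)."""
    if rng is None:
        rng = np.random.default_rng()
    y_clean = H @ x_true
    return y_clean + rng.normal(0.0, np.sqrt(R_var), size=y_clean.shape)

# Observe every other element of the 40-dimensional state
# (one of the scenarios considered in the experiments).
x_true = np.linspace(-1.0, 1.0, 40)
H_half = np.eye(40)[::2]                 # 20 x 40 observation operator
y = observe(x_true, H_half, rng=np.random.default_rng(1))
```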

Validation statistics
Given a set of m-dimensional state vectors, with i_max being the maximum time index, the root mean square error (RMSE, a function of the PLC α) at time step i is defined as

e_i = ||x_i^a − x_i^t|| / √m, (14)

where x_i^a and x_i^t are the analysis mean and the true state, respectively, and || · || denotes the Euclidean norm.
The average RMSE over the i_max time steps, repeated L = 20 times, is defined as in Hoteit et al. (2015):

RMSE = (1/(L · i_max)) Σ_{l=1}^{L} Σ_{i=1}^{i_max} e_i^{(l)}. (15)

Also, to further assess the behaviour of the filters, the time evolution of the average ensemble spread (AES) of each filter is monitored, computed at every filtering step as (Hoteit et al., 2015)

AES_i = ( (1/m) Σ_{k=1}^{m} σ_{i,k}² )^{1/2},

where σ_{i,k}² is the ensemble variance of x_{i,k}.
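The two diagnostics can be sketched as follows; the normalization by √m in the RMSE and the sample-variance convention (ddof = 1) in the spread are common choices that I assume here rather than details confirmed by the original.

```python
import numpy as np

def rmse(xa_mean, x_truth):
    """Equation (14)-style RMSE at one step: the Euclidean distance between
    the analysis mean and the truth, normalized by sqrt(m)."""
    return np.linalg.norm(xa_mean - x_truth) / np.sqrt(x_truth.size)

def aes(ensemble):
    """Average ensemble spread at one step: the square root of the mean
    ensemble variance over the m state components.
    `ensemble` has shape (n_members, m)."""
    var = ensemble.var(axis=0, ddof=1)   # sigma^2_{i,k} for each component k
    return np.sqrt(var.mean())

xa = np.ones(4)
xt = np.zeros(4)
ens = np.array([[0.0, 0.0, 0.0, 0.0],
                [2.0, 2.0, 2.0, 2.0]])
```

Averaging `rmse` over time steps and over L repeated runs then gives the Equation (15)-style score used in the experiments.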

Experimental results and analysis
In this section, the ETKF, IA, and IT are compared under changes in the assimilation steps, forcing parameter, ensemble size, and observation information, and the robustness and accuracy of the filters are investigated. In practice, in order to prevent the cross covariance from becoming unbounded, the SVD matrix is multiplied on the left by (1 − α). Figure 1 shows the time mean RMSEs (using Equation (15)) of the IA and IT in assimilating the Lorenz-96 model, with the parameter F set to 9, 10, and 11, respectively. It can be seen from the figure that: (1) after applying the EnTLHF data assimilation methods based on robust ensemble filter theory, which inflate the eigenvalues of the analysis error covariance matrix and of the transform matrix, the time mean RMSEs for t = 20, 25, and 30 are monotonically decreasing functions of α. When t = 20, the IT estimation error is less than that of the IA for α ∈ [0.3, 0.8]; when t = 25, the IT RMSE is similarly less than that of the IA for α ∈ [0.2, 0.9]; and when t = 30, the IT estimation error is lower than that of the IA for α ∈ [0.3, 0.7]. (2) For the same PLC α, the estimation errors of the IT and IA do not change considerably across the three assimilation time spans, which shows that changes in the short-time assimilation steps do not have much influence on the estimation errors of the IT and IA. (3) In all three cases, the time mean RMSEs with α > 0 are lower than those of the ETKF (α = 0). These results indicate that the IA and IT are more robust and more accurate than the ETKF, with the IT found to be superior to the IA. Figure 2 illustrates the RMSEs (using Equation (14)) of the ETKF, IA, and IT when the F parameter takes the values 9, 10, and 11. In this experiment, steps 500-1500 of the remaining series are selected for the data assimilation, R = 0.1, N = 20, and the PLC α = 0.6.

Influence of the forcing parameter F
From the results shown in Figure 2, the following can be concluded: (1) When F = 9, the RMSE of the ETKF remains large as the assimilation time increases, whereas the IA and IT have almost indistinguishable, small RMSEs. Although the difference is not large in the Lorenz-96 model, the SVD matrices of the IT and IA are n × n and m × n, respectively (where n is the ensemble size and m is the dimension of the state vector). In this experiment, the Lorenz-96 state dimension is 40 and the ensemble size is 20, so the IA has a larger SVD matrix than the IT (40 × 20 versus 20 × 20); in terms of calculation cost and matrix dimension, the IT is cheaper than the IA. (2) In the case of F = 10, the RMSE of the ETKF is also large, and the RMSE of the IA remains small until step 1000, after which it rapidly becomes larger; the RMSE of the IT, however, remains small as the assimilation time increases. (3) For F = 11, the RMSEs of the IA and ETKF become large with increasing assimilation time, whereas the RMSE of the IT remains small. These results indicate that filter divergence occurs for the IA, while the fluctuation of the IT estimation error is small; in other words, the IT is robust to changes in the forcing parameter. One possible reason for the stability of the IT is that the IA directly modifies the eigenvectors of the analysis covariance, with each eigenvector receiving a different degree of inflation. Certain eigenvectors may originally not be very influential on the dynamics of the Lorenz-96 system, but through inflation their impacts are over-amplified, especially in the presence of model errors. The IT, in contrast, modifies a linear combination of the eigenvectors of the analysis covariance, which may make it slightly less likely to over-amplify the impacts of individual eigenvectors that are not originally influential.
That is, the IT changes the directions of the eigenvectors, or in other words, the subspace spanned by them. This is possibly a phenomenon similar to the transform X^a = X^b T U in the ETKF: Tödter and Ahrens (2015) report that using a random centering matrix U is better than using a deterministic U, because a random U tends to rotate into different directions (hence changing the directions of the eigenvectors of the analysis covariance). Figure 3 shows the RMSEs (using Equation (14)) of the ETKF, IA, and IT as functions of assimilation time for ensemble sizes N = 15, 18, and 22. In this experiment, steps 500-1500 are selected for the data assimilation, R = 0.1, F = 8, and the PLC α = 0.6. Figure 3 shows that in all cases the ETKF has large RMSEs over the assimilation steps. In the case of N = 15, the RMSEs of the IA and IT schemes grow as the assimilation step increases; however, the RMSE of the IT is smaller than those of the IA and ETKF schemes. When N = 18, the RMSE of the IA is smaller than or equal to that of the IT until the assimilation step reaches 1100; after that point, the RMSE of the IT is smaller than that of the IA, although both are large. When N = 22, the IA and IT have almost indistinguishable RMSEs, which remain small. The filtering performances of the IT and IA are better than that of the ETKF, and the IT is superior to the IA, as verified across the increasing ensemble sizes.
The results for the different observational scenarios are as follows: (1) when each element of the state vector is observed and observations are assimilated at every model time step, the AES of the IT is less than that of the IA; (2) when each odd-numbered element of the state vector is observed with assimilation of observations at every model time step, the AES of the IT is likewise less than that of the IA; (3) correspondingly, when each element of the state vector is observed and observations are assimilated at every three model time steps, the AES of the IT is less than that of the IA throughout the assimilation; (4) overall, when each odd-numbered element of the state vector is observed and observations are assimilated at every three model time steps, the AES of the IT is again less than that of the IA. This demonstrates that the degree of convergence of the IT is better than that of the IA regardless of the available observations; that is, the IT has better robustness than the IA. To analyse the filters' spreads in relation to their corresponding RMSEs, the time evolutions of the spreads and RMSEs (resulting from the same single run) are plotted for the best performance of each filter under the different observational scenarios, as shown in Figure 5. Figure 5 shows that for the IA all AES curves except for the half-observation and sparse-time scenarios merge together, whereas for the IT they are more distinguishable through the iterations.

Influence of the observational information
On the other hand, the ensemble spreads of the IT and the IA continuously exhibit U-turn behaviour through the iterations, with the spreads of the IA higher than those of the IT. The spreads and RMSEs are well matched for the IT, whereas the IA spread appears to be overestimated in comparison with the corresponding RMSE. One possible reason for this overestimation is that the different inflation methods of the IA and IT have different impacts on the background ensemble at the next assimilation cycle: the IA may produce a more dispersed background ensemble, leading to a larger analysis ensemble spread, while the RMSEs are determined by the distances between the analysis mean and the truth, and thus are not necessarily as large as the corresponding ensemble spreads. Table 1 shows the time average RMSEs (using Equation (15), L = 1) and AESs of the IT and the IA with the same observation information as in Figure 5.

Discussion and conclusions
Using numerical experiments, the relative robustness of the specific forms of the IT is verified in comparison with the IA and with the ETKF without covariance inflation. The following conclusions are reached: (1) The time mean RMSE of the IT is a monotonically decreasing function of α for the three assimilation time spans. Overall, the IT is found to be better than the IA, improving both the accuracy and the robustness of the filter. In the short-time assimilation process, as the PLC α increases, inflating the eigenvalues of the transform matrix is indirectly equivalent to inflating the analysis error covariance matrix (the IT indirectly inflates the eigenvalues of the analysis covariance matrix). (2) The estimation error of the ETKF becomes large as the model error increases, and filter divergence occurs for the IA method; for the IT, however, the error remains small, which indicates that the IT has strong robustness to variations in the model error. (3) With few ensemble members (such as 15), the IT has smaller estimation errors than the other two methods, and the IA and IT schemes have almost indistinguishable estimation errors as the ensemble size increases. Therefore, the IT is superior to the IA (except that Figure 3(b) shows the IA yielding smaller RMSE up to around step 1100). (4) The observation information plays an important role in the assimilation process: the AESs of both the IT and IA increase as the observational information is reduced at fixed assimilation steps, but the AES of the IT is less than that of the IA. The spreads and RMSEs match well for the IT filter under the different observational scenarios, whereas the IA spread appears to overestimate the corresponding RMSEs.
Overall, the IT method shows good robustness under the different conditions when compared with the ETKF and IA methods. In this study, a robust filtering theory is introduced which does not require assumptions about the statistical properties of the model and observation errors, and, through a bound on the growth of the estimation error, a new data assimilation method based on inflating the transform matrix eigenvalues is proposed. Owing to this robustness, varying the parameters does not degrade the performance of the filter. The method can potentially be widely used in the data assimilation of other nonlinear systems.