Tuning of length‐scale and observation‐error for radar data assimilation using four dimensional variational (4D‐Var) method

The effects of tuning of length‐scale and observation‐error on heavy rainfall forecasts are investigated. Length scale and observation error are tuned based on observation minus background (O − B) covariances and theoretically expected cost function values, respectively. Tuned length scale and observation error are applied to radar data assimilation using the Four Dimensional Variational (4D‐Var) method. Length‐scale tuning leads to improved Quantitative Precipitation Forecast (QPF) skill for heavy precipitation, better analyses, and reduced errors of wind, temperature, humidity, and hydrometeor forecasts. The effects of observation‐error tuning are not as significant as those of length‐scale tuning, and they are limited to improvements in QPF skill. This is because tuned observation errors are close to pre‐assumed values. Proper tuning of length‐scale and observation‐error is essential for radar data assimilation using the 4D‐Var method.


Introduction
With the increase in resolution of numerical weather prediction models, the importance of radar data assimilation has been emphasized in recent years, especially for forecasting high-impact weather events (Sun, 2006). Various sophisticated data assimilation methods such as variational, ensemble-based, and hybrid methods have been used for assimilating radar observations. Xiao et al. (2005) examined the impact of assimilating radar radial velocity on the prediction of a heavy rainfall event by implementing the observation operator and the Richardson balance equation within the fifth-generation Pennsylvania State University-National Center for Atmospheric Research Mesoscale Model (MM5) Three Dimensional Variational (3D-Var) system. Wang et al. (2013a) assimilated retrieved rainwater and estimated in-cloud water vapor instead of assimilating radar reflectivity directly to avoid the linearization error of the reflectivity observation operator by using the Weather Research and Forecasting (WRF) 3D-Var system. In Wang et al. (2013b), the WRF 4D-Var radar data assimilation system was introduced by developing tangent-linear and adjoint of a Kessler warm-rain microphysics scheme, and by including cloud water, rainwater, and vertical velocity as new control variables.
Background error covariance determines how the observation information spreads horizontally, vertically, and among variables. Therefore, accurate estimation and proper tuning of background error statistics are essential for successful data assimilation. There have been previous studies about tuning of the length scale of background error correlation within the 3D-Var framework by using the method of Hollingsworth and Lönnberg (1986) (e.g. Lee et al., 2010;Ha and Lee, 2012). Determining the observation error covariance is one of the challenging issues in data assimilation, and only observation error variance is considered in general. Because the ratio between background and observation error variances determines relative weights given to the background and observation, a precise estimation of background and observation error variances is of critical importance in data assimilation. Several approaches have been tested to tune observation error variance in previous studies (e.g. Desroziers and Ivanov, 2001;Desroziers et al., 2005;Chapnik et al., 2006;Lupu et al., 2015). The objective of this study is to investigate the effects of tuning of background-error correlation length-scale and observation-error variance on forecasts of heavy rainfall cases when assimilating radar observations using the 4D-Var method.

Theoretical backgrounds
2.1. Length-scale tuning Following Hollingsworth and Lönnberg (1986), the length scale of a background error correlation can be tuned using O − B values. For this, two assumptions are necessary: (1) there is no correlation between background and observation errors and (2) observation 442 Y. Choi et al. errors are spatially uncorrelated. In this study, thinning of original radar observations to 6-km mesh was adopted to minimize a potential violation of the second assumption. The process for tuning of length scale can be summarized as follows.
(1) For a specific level, calculate O − B covariances between two observation points and make a histogram of O − B covariances using the distance between the two points as a bin.
(2) Find a Gaussian function that best fits O − B covariances. (1) where B is O − B covariance, r is distance, B 0 is background error variance, and s is length scale.
(3) After taking the natural logarithm of both sides of Equation (1), find the slope and intercept values for a linear relationship using the least-square regression method.

Observation-error tuning
From Desroziers and Ivanov (2001), expectation values of background and observation cost functions at the minimum can be given as follows.
where J b is the background cost function, J o is the observation cost function, K is the Kalman gain matrix, H is an observation operator, n obs is the number of assimilated observations, B is the background error covariance, R is the observation error covariance, and Tr (A) is the trace of a matrix A.
In practice, the computation of Tr (KH) or Tr (HK) can be done by using a randomized method (Girard, 1989). A randomized estimation of Tr (HK) or Tr (KH) is given by (Desroziers and Ivanov, 2001): where is a vector of random numbers with a standard Gaussian distribution (i.e. zero mean and unit variance), and x a (y o +R 1∕2 ) and x a (y o ) are analysis increments obtained with perturbed and unperturbed observations, respectively.
Background (or observation) error variance can be tuned by using the expectation and actually computed values of the background (or observation) cost function. A cost function with tunable weighting parameters can be written as follows.
where s b and s o are the background and observation error tuning parameters, respectively. The error tuning parameters can be determined iteratively.
where the subscript i denotes the ith iteration, J b and J o are actually computed values of the background and observation cost functions, respectively. Desroziers and Ivanov's method has been used in literatures for tuning the observation and/or background errors (e.g. Chapnik et al., 2004;Buehner et al., 2005;Desroziers et al., 2005;Chapnik et al., 2006;Lee et al., 2010), and they showed the convergence of error tuning parameters within several (less than ten) iterations.

Experimental design
The WRF Model version 3.7.1 (Skamarock et al., 2008) is used as the forecasting model in this study. Triply-nested domains with horizontal resolutions of 54, 18, and 6 km, and with grid points of 120 × 102, 121 × 103, and 121 × 127 are employed (Figure 1(a)). All domains have 35 vertical levels and the model-top pressure is 50 hPa. The 6-hourly European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim data with resolution of about 80 km are used for the initial and boundary conditions. The following physical parameterization schemes are chosen: the Kain-Fritsch cumulus scheme (Kain, 2004), the WRF Single Moment 6-class (WSM6) microphysics scheme (Hong and Lim, 2006), the Yonsei University (YSU) boundary layer scheme , the Rapid Radiative Transfer Model (RRTM) longwave radiation scheme (Mlawer et al., 1997), and the Dudhia shortwave radiation scheme (Dudhia, 1989). Note that Kessler warm-rain microphysics scheme is used for tangent linear and adjoint model runs.
Radar data assimilation experiments are conducted only for the innermost domain using the WRF Data Assimilation (WRFDA)'s 4D-Var method (Barker et al., 2012). Background error covariance is calculated using the National Meteorological Center (NMC) method (Parrish and Derber, 1992), where background error statistics are derived from the differences between 24 and 12-h forecasts for a 1-month period. Radar radial velocity and reflectivity observations from 19 stations (Figure 1(b)) over the Korean Peninsula are assimilated. All reflectivity observations greater than 0 dBZ are assimilated in this study. Before assimilation, radar data are preprocessed, including quality control, interpolation, and thinning. Details of the preprocessing of radar data can be found in Park and Lee (2009). Preprocessed radar data have horizontal, vertical, and temporal resolutions of 6 km, 0.5 km, and 10 min, respectively. Observation errors for radial velocity and reflectivity are assumed to be 2 m s −1 and 5 dBZ, respectively.
A total of 11 heavy rainfall cases over the Korean Peninsula, which occurred in 2006, 2008, and 2010 are selected. Selected heavy rainfall cases can be classified as isolated thunderstorm (IS), convection band (CB), cloud cluster (CC), or squall line (SL) according to Lee and Kim (2007). For each case, six experiments are conducted: NoDA, Rv, Rf, Rv + Rf, LS, and ERR. In the NoDA experiment, no radar data assimilation is carried out. In the Rv (Rf) experiment, only radial velocity (reflectivity) observation is assimilated, and both radial velocity and reflectivity observations are assimilated in the Rv + Rf experiment. The LS experiment is the same as the Rv + Rf experiment except for employing tuned length scale, and the ERR experiment is the same as the LS experiment except for using tuned observation error variance. Radar radial velocity and reflectivity observations are assimilated using the methods of Xiao et al. (2005) and Wang et al. (2013a), respectively. The length of the assimilation window is 30 min, and radar observations are available every 10 min within the assimilation window.

Results and discussions
For each case and experiment, a 24-h forecast is conducted, and the forecast starts 3 h before a heavy rainfall system affects the Korean Peninsula. Because radar data assimilation is done only for the 6-km domain, the following analyses focus on the results of the 6-km domain.  Figures 2(a) and (c). CSI and BIAS values of data assimilation experiments are better than those of the NoDA experiment, and this indicates positive effects of radar data assimilation on QPF skill. Regardless of threshold values, CSI of the Rf experiment is slightly greater than that of the Rv experiment. CSI values of the Rv + Rf experiment are better than those of the Rf experiment, and this implies that both kinematic and hydrometeor information is important for improving QPF skill of heavy rainfall cases. Overall, the same conclusion can be drawn from the analyses of BIAS values. However, for 50-mm threshold, BIAS values of the Rf and Rv + Rf experiments are approximately 1.5, and this indicates that heavy precipitation is overpredicted in the Rf and Rv + Rf experiments.  Note that length-scale values for stream function and velocity potential from the NMC-based statistics are about 90 and 70 km, respectively, and length-scales for rainwater, snow, and graupel are specified as a value of 6 km in the WRFDA system. Although length scale for radial velocity is estimated, length scales of control variables associated with wind (i.e. stream function and velocity potential) are tuned because data assimilation is done in control variable space in WRFDA.

Effects of length-scale tuning
In order to examine the effects of length-scale tuning, CSI and BIAS values of the Rv + Rf and LS experiments are compared in Figures 2(b) and (d). For thresholds of 5, 10, and 20 mm, both CSI and BIAS of the Rv + Rf experiment are better than those of the LS experiment. However, for a threshold value of 50 mm, CSI and BIAS values of the LS experiment are better than those of the Rv + Rf experiment. In the Rv + Rf experiment, larger length scales than those in the LS experiment are used for the assimilation. Larger length scales in the Rv + Rf experiment spread the impact of radar observations to more distant grid points than the LS experiment, and this generally broadens simulated rainfall area. A broader rainfall area tends to give a better CSI value for small thresholds (i.e. weak rainfall), but it also produces erroneously wider area of weak rainfall [which leads to a higher Probability of False Detection (POFD) value]. POFD values (of 24-h accumulated rainfall) of the Rv + Rf experiment are greater than those of the LS experiment for all threshold values (see Figure S2). This stands out particularly at earlier forecast ranges. For earlier forecast ranges (1-9 h), BIAS values (of 1-h accumulated rainfall for 1-mm threshold) of the Rv + Rf experiment are much greater than those of the LS experiment (see Figure S2). Weak rainfall over wider areas simulated in the Rv + Rf experiment makes CSI values higher for smaller thresholds, but it also leads to higher POFD values. This does not necessarily mean better forecasts of heavy rainfall. Although QPF scores of the LS experiment are better than those of the Rv + Rf experiment only for the 50-mm threshold, this is still meaningful for weather forecasting given the difficulties in forecasting of severe weather events like heavy rainfall.
Kinetic energy (KE) spectra of analyses of the Rv + Rf and LS experiments are computed, and KE spectrum of the NoDA experiment is also calculated as a reference (Figure 3). For each case and experiment, the KE spectra from 950 to 100 hPa levels (with an interval of 50 hPa) are calculated and they are vertically averaged. And for each experiment, a total of 11 KE spectra from all heavy rainfall cases are averaged. The Discrete Cosine Transform (DCT) is used to compute KE spectra as in Denis et al. (2002). In the Rv + Rf experiment, energy at wavelengths between 120 and 720 km (i.e. mainly, meso-scale) is increased through radar data assimilation, compared with the NoDA experiment. In contrast, KE at wavelengths between 12 and 360 km (i.e. mainly, meso-scale) is enhanced in the LS experiment compared with the NoDA experiment. Because selected heavy rainfall cases are related to meso-or meso-scale phenomena (e.g. squall line, mesoscale convective complex, convection band, and thunderstorm), the increase in KE at meso-scale rather than meso-scale is more reasonable.
To compare forecasts of the Rv + Rf and LS experiments, Root Mean Square Errors (RMSEs) of radial velocity and reflectivity are computed using 1-h forecasts of the Rv + Rf or LS experiment and radar observations (Figure 4). For each case and experiment, hourly RMSEs of radial velocity and reflectivity are calculated for the forecast range of 0-24 h. Then, hourly RMSEs of radial velocity and reflectivity from (a) (b) Figure 4. Root Mean Square Errors (RMSEs) of (a) radial velocity (m s −1 ) and (b) reflectivity (dBZ) for forecasts of Rv + Rf (orange), LS (red), and ERR (purple) experiments. Error is defined as the difference between 1-h forecast and the corresponding radar observations. Temporally averaged (0-24 h) RMSE value of each experiment is shown next to the experiment name. 11 cases are averaged for each experiment. From 0 to 11 h, RMSEs of radial velocity of the LS experiment are less than those of the Rv + Rf experiment. Similarly, RMSEs of reflectivity of the LS experiment are smaller than those of the Rv + Rf experiment for the forecast range of 0-12 h. In the LS experiment, by using the tuned length scales for radial velocity, rainwater, snow, and graupel when assimilating radar observations, the analyses fit better to the observations than the Rv + Rf experiment, and this improvement lasts for about 12 h.
Finally, vertical distributions of RMS Differences (RMSDs) of zonal wind, meridional wind, temperature, and relative humidity are calculated using a 6-h forecast of the Rv + Rf or LS experiment ( Figure 5). The forecast is verified against the ECMWF ERA-Interim data, and RMSDs from 11 cases are averaged. RMSDs of zonal and meridional winds of the LS experiment are smaller than those of the Rv + Rf experiment, especially at lower-mid levels (i.e. 800-500 hPa). Similarly, RMSDs of relative humidity of the LS experiment are less than those of the Rv + Rf experiment at lower-mid levels. Although no temperature observations are assimilated, RMSDs of temperature are reduced in the LS experiment compared with the Rv + Rf experiment.

Effects of observation-error tuning
Observation errors for radial velocity and reflectivity are assumed to be 2 m s −1 and 5 dBZ, respectively, and observation error for rainwater, snow, and graupel has a value between 0.0005 and 0.001 kg kg −1 , depending on the corresponding mixing ratio. Tuning parameter for observation error can be obtained from the iterative process presented in Section 2.2. In order to help the observation error tuning parameter to converge, the background error tuning parameter is kept constant during the iterations. Table S1 shows observation error tuning parameters as a function of iteration number. The observation error tuning parameters for radial velocity, rainwater, snow, and graupel converge after 5, 4, 3, and 4 iterations, respectively. Converged tuning parameters for radial velocity, rainwater, snow, and graupel are 0.97, 0.94, 0.91, and 0.91, respectively, and this implies a slight overestimation (3, 6, 9, and 9%) of observation error for all observed variables. Although the difference is not large, CSI and BIAS values of the ERR experiment are better than those of the LS experiment for thresholds of 5, 10, and 50 mm (Figures 2(b) and (d)). RMSEs of radial velocity and reflectivity of the ERR experiment are similar to those of the LS experiment ( Figure 4). Similarly, vertical distributions of RMSDs of zonal wind, meridional wind, temperature, and relative humidity for the ERR experiment are close to those for the LS experiment ( Figure 5). Overall, effects of observation-error tuning on QPF and wind/temperature/humidity forecasts are slightly positive and neutral, respectively. This may be because the computed tuning parameters for radial velocity, rainwater, snow, and graupel are close to one, and hence tuned observation errors are similar to the assumed ones. The improvement of QPF skill resulted from slightly changing observation errors implies that tuning of observation error can contribute to forecast improvements when the assumed observation error is not an optimal value.

Summary and conclusions
A total of 11 heavy rainfall cases over the Korean Peninsula are selected to investigate effects of tuning of length-scale and observation-error on heavy rainfall forecasts. Radar radial velocity and reflectivity

Supporting information
The following supporting information is available:   Table S1. Observation error tuning parameters for radial velocity, rainwater, snow, and graupel.