Assimilating GPM hydrometeor retrievals in HWRF: choice of observation operators

A recent study investigated the capability to assimilate hydrometeor retrievals in the National Oceanic and Atmospheric Administration (NOAA) Hurricane Weather Research and Forecasting (HWRF) system. Hydrometeor retrievals were obtained from a hurricane‐specific retrieval that utilizes data from the Global Measurement Mission Microwave Imager. Data assimilation system used in HWRF is the hybrid ensemble‐variational Gridpoint Statistical Interpolation (GSI). As a first attempt to assimilate hydrometeor retrievals in HWRF, observation operators for solid and liquid integrated water content were developed using the GSI standard control variables, such as temperature, pressure, and specific humidity, under the assumption that all water vapor in excess of saturation is condensed out. To improve the usefulness of assimilating hydrometeor retrievals in HWRF, this study extends this previous work by introducing new observation operators that (1) directly use hydrometeor species estimated from HWRF microphysics and (2) include total cloud condensate as a control variable. Hurricane Gonzalo (2014) was used to examine the performance of the old and new observation operators on HWRF analysis and subsequent forecasts with three pairs of experiments. Results suggest that when new observation operators are used the analysis is an improved fit to observations and realistic adjustments to control variables are evident as seen in analysis increment. In addition, the forecasts also indicate some improvements in hurricane intensity.


Introduction
Hurricane intensity and structure are fundamentally related to clouds. Among all the available observations, the use of all-sky satellite radiances and satellite-retrieved hydrometeors (e.g. Boukabara et al., 2013;Kummerow et al., 2015) are especially important for improving hurricane forecasting, because they contain valuable information about cloud and precipitation that often occur in sensitive regions in terms of forecast impact (Bauer et al., 2011). While techniques to assimilate these observations (Bauer et al., 2010;Zhu et al., 2016) are currently used on a global scale (Global Forecast System; GFS), implementation on a regional scale is yet to be done. In this study, the National Oceanic and Atmospheric Administration (NOAA) operational Hurricane Weather Research and Forecasting (HWRF) system (Tallapragada et al., 2015) is used as a representation of such regional scale model that currently assimilates neither all-sky satellite radiances nor hydrometeor retrievals.
As a preliminary attempt toward assimilating hydrometeor retrievals in HWRF, techniques were developed. In Wu et al. (2016), two observation operators to assimilate integrated solid-water content (SWC) and liquid-water content (LWC) are implemented in the hybrid Gridpoint Statistical Interpolation (GSI; Wu et al., 2002;Kleist et al., 2009;Wang, 2010). The integrated SWC and integrated LWC were obtained from a hurricane-specific microwave retrievals that utilizes data from the Global Measurement Mission (GPM) Microwave Imager (GMI), referred to as Hurricane Goddard PROFiling algorithm (GPROF) . These two observation operators were built based on the assumption that super-saturated water vapor will condense out; the one for integrated SWC, referred to as h s_noHydro , is, and the one for integrated LWC, referred to h l_noHydro , is, where T is temperature, P is pressure, q is specific humidity, e s is saturation vapor pressure, the superscript k denotes the model level index, k 0 is the vertical level where temperature is T 0 = 273.15 K, k mix is the vertical level where temperature is T mix = 253.15 K, k max is the index for the top model level, ΔP k is pressure difference between two vertical levels k and k + 1 (i.e.

239
ΔP k = P(k) − P(k + 1)), and g is the acceleration due to gravity. Due to the super-saturation assumption that only depends on temperature, specific humidity, and temperature, there was no need to include cloud condensate as an additional control variable. However, as pointed out by Wu et al. (2016), using h s_noHydro and h l_noHydro may potentially create a negative bias of the guess due to the use of already saturated-then-condensed water vapor. Because of the potential negative bias and the lack of cloud condensate updates, new techniques have been developed that include new observation operators, which directly use cloud microphysical variables. This is done by including total cloud condensate mass (CWM) as control variable and adding individual hydrometeor species (cloud water, rain, ice, snow, graupel, and hail) as state variables; a technique that follows the GSI all-sky implementation developed by Zhu et al. (2016).

Preparation of background hydrometeor species
The HWRF v3.7a release (Tallapragada et al., 2015) is employed in this study. Among the various HWRF physics packages, cloud microphysics scheme is most directly related to the assimilation of hydrometeor retrievals because microphysics explicitly handles the behavior of hydrometeor species. The Ferrier-Aligo microphysics scheme (Aligo et al., 2014) employed by HWRF predicts changes in water vapor mixing ratio (q v ) and CWM, which is the combined sum of individual hydrometeor species that include cloud water (q l ), rain (q r ), cloud ice (q i ), snow (q s ), graupel (q g ), and hail (q h ). Since individual hydrometeor species are not prognostic variables, they are diagnosed from CWM with the use of partition parameters that include fraction of ice (F_ICE), fraction of rain (F_RAIN), and values of riming rate (F_RIMEF).

CWM partition
There exist one formula in GSI that can be used to partition CWM into individual hydrometeor species (q l , q i , q r , q s , q g , q h ) with the use of F_ICE, F_RAIN, and F_RIMEF. This formula, referred to as P6, is part of a routine called cloud_efr_mod.f90. This routine takes CWM from Ferrier-Aligo microphysics and provides mass mixing ratios and particle sizes of individual hydrometeor species as input to the Community Radiative Transfer Model (CRTM; Han et al., 2006), which is used by GSI to compute top-of-the-atmosphere radiances for a given HWRF background.
The first step of P6 formula uses F_ICE to determine the solid and liquid phases of CWM, that is, the solid phase of CWM = F_ICE · CWM (3) and the liquid phase of CWM = (1 − F_ICE) · CWM (4) Then, F_RAIN is used to partition the liquid phase of CWM into q l and q r as and Similarly, the solid phase of CWM is decomposed into q i and precip_ice as and where precip_ice is precipitating ice as opposed to q i being the nonprecipitating ice, using an empirical weighting coefficient w where T is temperature, T 1 = 243.15 K, and T 2 = 233.15 K. Finally, depending on the values of F_RIMEF, precip_ice will be equal to either q s , q g , or q h as follows After the partition, individual hydrometeor species (q l , q i , q r , q s , q g , and q h ) are used by the new observation operators, which are discussed next.

New observation operators
The new observation operator for integrated SWC, referred to as h s , is defined as a vertical integration of the solid hydrometeor species including cloud ice, snow, graupel, and hail, that is, Similarly, the new observation operator for integrated LWC, referred to as h l , is defined as a vertical integration of the liquid hydrometeor species including cloud water and rain, that is, A horizontal smoothing procedure is carried out prior to the vertical integration in Equations (11) and (12). This horizontal smoothing is done by applying a formulation of recursive filter (Hayden and Purser, 1995) to the individual hydrometeor species. Such smoothing procedure was necessary because values of integrated SWC and integrated LWC computed by h s and h l from a given HWRF background were found much larger compared to observations during the preliminary development. As a result, large values of innovation were created and observations were rejected. The horizontal smoothing was used to reduce large innovations by smoothing individual hydrometeor fields. Reasons causing the large innovation values are left for future investigation, because they may be related to the cloud microphysics scheme and the P6 formula. In addition, the spatial resolution of Hurricane GPROF retrievals (5-10 km) is incomparable to the HWRF grid that has a grid spacing of 2 km. Due to the different resolutions, fine features that are associated with clouds and precipitations, which may be inferred from an HWRF background, may not be detected in the retrievals due to a coarse footprint size.

Experiments
Hurricane Gonzalo (2014) is used in this study. In order to have a large portion of Gonzalo covered by Hurricane GPROF data, only three analysis times are selected and they are (1) 0600 UTC 13 October, (2) 1200 UTC 16 October, and (3) (2) and (3), which is indicative a mature hurricane.

Prepare for assimilation
Currently, HWRF only assimilates radiances under clear-sky condition. Therefore, there is no preparation of microphysical variables for a background field. Prior to data assimilation, vortex initialization is carried out to improve the HWRF hurricane vortex through a series of adjustments. After vortex initialization, a background field that contains an improved vortex is then provided and used by GSI. However, currently, adjustment of cloud microphysical variables was not considered by vortex initialization, hence, values of CWM and partition parameters are set to zero. In order to include cloud condensate variables in HWRF assimilation, techniques are developed in HWRF, which can be summarized in three procedures: (1) reinitialize CWM and F_ICE, F_RAIN, and F_RIMEF in the background prior to data assimilation, (2) add CWM as control variable and add individual hydrometeor species as state variables, and (3) include tangent linear and adjoint parts of P6 formula (Equations (3)-(10)) in the minimization of the cost function.
Specifically, in (1), a possible solution to avoid the zero-cloud-condensate situation was proposed. That is, values of CWM, F_ICE, F_RAIN, and F_RIMEF in the 6-h HWRF forecast from a previous HWRF cycle are mapped onto a background (same grid spacing) provided by vortex initialization. Since the center of a hurricane in the 6-h HWRF forecast may be different from the center of the observed one, three-dimensional fields of CWM and partition parameters from the 6-h HWRF forecast are horizontally shifted to the true center before mapping to the background. In (2), the cross-variable correlations between CWM and other variables are described by the hybrid background error covariance, in which 20% comes from a static component embedded in GSI (Wu et al., 2002) and 80% comes from an ensemble component. For the static component that is based on the National Meteorological Center (NMC), the former name of National Centers for Environmental Prediction (NCEP) method (Parrish and Derber, 1992), the same correlations for q v are used to describe the correlations for CWM. For the ensemble component, the 80-member GFS ensemble forecasts are used. Since Ferrier-Aligo microphysics scheme is employed by both GFS and HWRF, CWM is also a prognostic variable to GFS. By adding CWM as an additional control variable, correlations between CWM and other control variables estimated by both static and ensemble components will be included in the hybrid background error covariance. Consequently, the resulting analysis will contain a consistent update of CWM and other control variables. This is important as pointed out by Huang (1996) that a consistent treatment of cloud and other control variables can avoid imbalanced initial condition. In (3), by including tangent linear and adjoint parts of Equations (3)-(6) in the cost function minimization, conversions between individual hydrometeor state variables and CWM control variable are carried out during each iteration.
As mentioned above, the 80-member GFS ensemble is used by HWRF. There exists a potential undesirable impact on the HWRF analysis and forecast when non-native ensemble is used (e.g. vortex spindown as discussed by Pu et al., 2016). Due to the different scales and architectures of the two models, background error covariance estimated by the GFS ensemble may be unable to accurately describe the cross-variable correlations of HWRF.

Experimental design
Two experiments are conducted that use two different pairs of observation operators and they are (1) CTL, which uses the 2015 HWRF operational configuration (Tallapragada et al., 2015) and assimilates integrated SWC and integrated LWC retrievals using h s_noHydro and h l_noHydro while conventional observations are also assimilated in the innermost domain of HWRF, and (2) USEP6, the same as CTL, except that h s and h l are used instead. Both CTL and USEP6 experiments are conducted for Hurricane Gonzalo (2014) at the three analysis times.

Observed versus simulated
A scatterplot is produced to summarize the assimilation statistics from the three pairs of CTL (dots) and USEP6 (triangles) experiments. In Figure 2, statistics are presented in terms of the absolute values of background innovation (observation minus background), |O − B|, plotted on the y-axis and the absolute values of analysis innovation (observation minus analysis), |O − A|, plotted on the x-axis. A diagonal solid line will be referred to as the 1-1 line.
In general, there are more data points from USEP6 located above the 1-1 line. In Figure 2, 42% (45%) of the data points of integrated SWC (integrated LWC) from the CTL experiment (circles) are above the 1-1 line, while 55% (77%) of the data points of integrated SWC (integrated LWC) from the USEP6 experiment are above the 1-1 line. One may also notice that there exhibits a wider spread of USEP6 data points, which  suggest that there are many larger values of background innovation of integrated SWC produced when using h s . In contrast, a narrower spread of USEP6 data points are seen in Figure 2(b) when using h l is used. Nevertheless, USEP6 analysis was an improvement of fit to observations over the background (i.e. |O−A| < |O−B|) when compared to that of CTL, which is encouraging.

Analysis increments
A vertical cross-section of analysis increments (analysis minus background) for CWM, specific humidity, and temperature is produced. In Figure 3, analysis increments from both CTL and USEP6 experiments valid at 1200 UTC 16 October is displayed along 26.5 ∘ N (Figures 1(c) and (d)). Nonzero CWM increments in the USEP6 experiments are evident in the region where observations are located (Figure 3(a)), while no CWM increments are displayed in CTL experiment (Figure 3(d)), as expected. Focusing on the USEP6 experiment, the increase of condensate between 600 and 900 hPa may be related to the liquid component of CWM, while the reduction between 200 and 500 hPa is likely due to the solid   (Figure 3(a)). These responses in CWM appear to coincide with the corresponding increments in specific humidity and temperature where warming and moistening is evident in lower troposphere and cooling and drying is more pronounced in upper troposphere (Figures 3(b) and (c)). On the other hand, the adjustments of specific humidity and temperature in the CTL experiment (Figures 3(e) and (f)) show a tendency to reach super-saturation by moistening and cooling the air column. Such adjustments are considerably different from the USEP6 experiment with additional constraints from CWM.

Hurricane track and intensity forecasts
Three pairs of 72-h HWRF forecasts initialized with CTL and USEP6 analyses are conducted. Forecast track error and intensity (minimum sea-level pressure; MSLP) from CTL and USEP6 are compared with best-track data from National Hurricane Center (NHC). Results suggest that USEP6 forecasts have slightly larger track errors during the first 48 h, but the errors become smaller compared to CTL forecasts in the last 24 h (Figures 4(a), (c) and (e)). In general, the differences between the track errors from CTL and USEP6 forecasts are small. The intensity forecast as measured by MSLP from USEP6 is found closer to the NHC best-track values than CTL for the two forecasts initialized at 1200 UTC 16 October and 0600 UTC 17 October (Figures 4(d) and (f)). There is no obvious difference between the intensity forecasts initialized by CTL and USEP6 analyses valid at 0600 UTC 13 October when Gonzalo was a tropical storm (Figure 4(b)).

Summary and discussion
New observation operators are developed to extend upon the work by Wu et al. (2016). Due to the limitations that include a potential negative bias and the lack of cloud condensate updates from using observation operators based on super-saturation (h s_noHydro and h l_noHydro ), new observation operators are developed by directly using hydrometeor species estimated from cloud microphysical variables. Since the Ferrier-Aligo microphysics scheme predicts CWM rather than the individual hydrometeors, an empirical formula embedded in GSI is used to partition CWM into individual hydrometeor species. Then, the new observation operators, h s and h l , are defined as a vertical integration of solid and liquid hydrometeor species, respectively.
Hurricane Gonzalo (2014) is selected to perform three pairs of CTL and USEP6 experiments to examine the impact of assimilating retrieved integrated SWC and integrated LWC using h s_noHydro and h l_noHydro (CTL) and h s and h l (USEP6) on HWRF analysis and subsequent forecast. For USEP6 experiments, encouraging results are obtained, in which the analysis was an improvement of fit to observations over the background, when compared to CTL. Including CWM as a control variable is found to have a clear impact on other variables, such as specific humidity and temperature. In general, USEP6 experiment appears to result in similar forecast track to that of the CTL experiment. Nevertheless, the MSLP forecast from the USEP6 experiment shows some improvement over the CTL forecast in two of the three cases.
As stated earlier, cloud microphysics scheme is essential to the development of the new observation operators, h s and h l . As an operational model, HWRF is currently restricted to use a rather simple but computationally efficient scheme that predicts total cloud condensate instead of individual hydrometeor species. If switched to a more sophisticated microphysics scheme that predicts individual hydrometeor species, the partition formula may be avoided. As a result, individual hydrometeors are more accurately accounted for during data assimilation, thus, improved assimilation of the retrieved integrated SWC and integrated LWC is anticipated. In addition, the resulting analysis may be further improved by using the HWRF-based ensemble, which is a more optimal choice over the GFS ensemble.