Model moist bias in the middle and upper troposphere during DEEPWAVE

Data from 279 dropsonde profiles collected during the Deep Propagating Gravity Wave Experiment (DEEPWAVE) over New Zealand between 4 June and 20 July 2014 were used to verify the tropospheric relative humidity (RH) fields simulated by regional configurations of the UK Met Office Unified Model (MetUM). Significant RH biases (predictions up to 28% too high) were found in the middle and upper troposphere during this period. This RH bias was found to be caused mainly by errors in the simulated specific humidity. It is demonstrated here that evaporation from the lower boundary (mainly the sea surface) is not a factor leading to the moist bias. A moist bias of similar magnitude was also found in the Global UM (the global configuration of the MetUM) and, from a preliminary inspection, is also very likely to occur in the ERA-Interim and NCEP-GFS reanalyses. This study suggests that the moist bias is very likely neither a regional nor a model-specific issue.


Introduction
Relative humidity (RH) is an important field in determining the distribution and occurrence of clouds and precipitation (e.g. Price and Wood, 2002; Derbyshire et al., 2004). Changes in RH also affect (1) the distribution of latent heating, and thereby the atmospheric circulation (Schneider et al., 2010), and (2) the level at which deep convective detrainment occurs (Hartmann and Larson, 2002). It has been shown that the size and intensity of tropical cyclones are influenced by the environmental RH (Holloway and Neelin, 2009) and atmospheric moisture profiles (Dunion and Velden, 2004). Cumulus parameterization schemes based on moisture convergence are widely used in Numerical Weather Prediction (NWP) models with a grid-length of approximately 8 km or coarser. If these models are too moist or too dry at levels where large-scale convergence occurs, convective precipitation forecasts will be affected.
Water vapour plays a key role in the budget of radiation in the troposphere and affects atmospheric radiative transfer (e.g. Held and Soden, 2000). Biases in forecasted large-scale humidity may result in biases in diagnosed clouds, leading to biases in radiation computations. The radiative effects and phase changes of water vapour play a key role in atmospheric processes, and the land-air and sea-air interactions (e.g. Shine and Sinha, 1991).
The significant impact of RH on precipitation, winds, and atmospheric processes means that a reliable analysis and forecast of RH is desirable. In NWP, initial conditions of moisture are provided through assimilation of direct or indirect measurements of moisture or water vapour from rawinsondes, surface stations and satellites. However, because of the limitations and errors in the measurements and analysis methods, deficiencies in moisture analyses and forecasts have been found both temporally and spatially (e.g. Bock and Nuret, 2009; Newman et al., 2015). Thus, verification of RH forecasts and analyses using not only routine RH observations but also field campaign RH observations is necessary to understand how well RH is analysed (initialised) and forecast within NWP systems.
DEEPWAVE studied the dynamics of gravity waves from the surface of the Earth to the upper levels of the atmosphere (Fritts et al., 2015). It was conducted over and around New Zealand from 4 June to 20 July 2014, with extensive in situ observations from research aircraft along with surface, airborne, radar wind profiler, dropsonde and radiosonde observations. These highly valuable, spatially and temporally dense data can be used to verify model simulations in a region that normally has few in situ observations. During initial DEEPWAVE investigations, simulations from two NWP models, the New Zealand Limited Area Model (NZLAM) and the New Zealand Convective Scale Model (NZCSM), were compared with profiles from eight dropsondes over the South Island of New Zealand for a flight on 24 June 2014 (Figure S1, Supporting information). Significant differences were found over the South Island: errors in the specific humidity led to moist errors of up to 50% in RH. This raised three questions: (1) is the moist bias a case-dependent issue or a robust feature during DEEPWAVE, not only over land but also over the open sea? (2) do the Global UM, from which NZLAM derives its lateral boundary conditions, and other global models have this issue? and (3) if the moist bias is more generally confirmed, what are the possible factors leading to the bias? This paper reports on our further investigation of these questions.

Description of models and data
The MetUM is a non-hydrostatic and fully compressible model system (see Section 5 of Webster et al., 2003 and Davies et al., 2005 for detailed descriptions). The NZLAM is a regional configuration of the MetUM. Its domain has 324 × 324 horizontal grid points (Figure 1(a)) with a grid-length of approximately 12 km and 70 levels in the vertical, with the model top at 80 km. NZLAM performs a 3-hourly data assimilation cycle using an incremental 3DVAR FGAT (first guess at appropriate time) analysis scheme (Lorenc et al., 2000). The lateral boundary conditions are derived from the operational Global UM run at the Met Office (see Yang et al., 2012 for further details of NZLAM and its data assimilation scheme).
During DEEPWAVE, a total of 282 dropsondes were launched from the flight level (approximately 12 km) of the NSF/NCAR HIAPER Gulfstream-V research aircraft. Of these, 279 dropsondes were within the NZLAM domain (Figure 1(a)). The observed RH ranges from 0 to 100%, with an absolute accuracy of ±5% and a resolution of 1.0%. The vertical resolution of the dropsonde data is approximately 10 m. Moist biases in the dropsonde observations during DEEPWAVE were found and corrected recently. In this study, the corrected dropsonde data were used. For more information on the NCAR Dropsonde System, see http://www.eol.ucar.edu/instrumentation/sounding/dropsonde. Further, these dropsondes were not incorporated in the data assimilation schemes of the Global UM or NZLAM, so they can be regarded as an independent validation dataset.
Figure 1. (a) [...] Table S1). The sub-area outlined by the black lines is used for area mean RH profiles shown in Figure 5. The arrows from A and B point to the beginning and end of the cross-sections in Figure S1, respectively. (b) The vertical profile of mean RH for Group T + 6 from the dropsondes, the NZLAM simulations and the recalculated RH (NZLAM_UMQ) using the simulated specific humidity from the Global UM. (c) The same as (b) but for Group T + 30. Along each of the RH profiles, the length of the horizontal line is twice the standard error of the mean.
For the verification, the simulations from the operational Global UM and NZLAM were used. The corresponding profiles (i.e. simulated dropsondes) were created from model level data using linear interpolation allowing for the drift of the sondes. Hourly outputs from the models were used for this interpolation. Two groups of simulated dropsondes were made to investigate the effect of forecast range on model moist bias. For the first group, the forecasts started about 6-13 h before the observation time (Group T + 6). In the second group, the forecasts started 30-37 h before the observations (Group T + 30). The difference in forecast range is 24 h between the two groups. Nine days of Global UM data were not available for the purposes of this study. This led to only 192 simulated dropsondes in Group T + 6 and 193 simulated dropsondes in Group T + 30. Between the two groups, 124 dropsondes overlapped. All the dropsondes were launched in a time window of a few hours before midnight to a few hours after midnight local time (NZST). From the surface simulations and analysis of NZLAM around midnight, five distinct weather situations were classified for the 279 dropsondes (Table S1).
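The construction of a simulated dropsonde can be illustrated with a minimal sketch. It assumes, purely for illustration, that a model field is stored on a regular grid as `field[time, height, lat, lon]` with fixed level heights, whereas real MetUM levels are terrain-following. The function name `simulate_dropsonde` and all numbers are hypothetical and are not taken from the processing used in this study.

```python
import numpy as np

def _bracket(axis, x):
    """Lower grid index i and weight w so that value = (1 - w) * f[i] + w * f[i + 1]."""
    i = np.clip(np.searchsorted(axis, x) - 1, 0, len(axis) - 2)
    w = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, w

def simulate_dropsonde(field, times, heights, lats, lons, t, z, lat, lon):
    """Linearly interpolate field[time, height, lat, lon] to every sonde sample.

    t, z, lat and lon are 1-D arrays giving the time and drifting position of
    each sample, so the sonde drift is followed sample by sample.
    """
    brackets = [_bracket(ax, x) for ax, x in
                ((times, t), (heights, z), (lats, lat), (lons, lon))]
    out = np.zeros(len(t))
    # Sum the weighted contributions of the 16 corners of the enclosing 4-D cell.
    for corner in range(16):
        weight = np.ones(len(t))
        idx = []
        for dim, (i, w) in enumerate(brackets):
            upper = (corner >> dim) & 1
            idx.append(i + upper)
            weight = weight * (w if upper else 1.0 - w)
        out += weight * field[tuple(idx)]
    return out

# Hypothetical hourly model output on a small regular grid.
times = np.array([0.0, 1.0, 2.0])            # hours since forecast output start
heights = np.array([0.0, 5000.0, 10000.0])   # m
lats = np.array([-46.0, -45.0, -44.0])
lons = np.array([168.0, 169.0, 170.0])
T, Z, LA, LO = np.meshgrid(times, heights, lats, lons, indexing="ij")
rh_model = 80.0 - 0.005 * Z + 2.0 * T + (LA + 45.0) - 0.5 * (LO - 169.0)

# Hypothetical sonde samples: falling and drifting over about six minutes.
sonde_t = np.array([0.50, 0.55, 0.60])
sonde_z = np.array([9000.0, 5000.0, 1000.0])
sonde_lat = np.array([-45.20, -45.25, -45.30])
sonde_lon = np.array([168.90, 169.00, 169.10])

rh_sim = simulate_dropsonde(rh_model, times, heights, lats, lons,
                            sonde_t, sonde_z, sonde_lat, sonde_lon)
```

Because each sample carries its own time and position, the simulated profile follows the slanted sonde trajectory rather than a fixed vertical column above the launch point.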

Relative humidity
The mean RH profiles for both the observations and NZLAM simulations are shown in Figures 1(b) and (c). Large differences in RH (10-28%) were found between 4 and 10 km for the two groups. Obvious moist biases (simulations − observations) were found from 4 km upward, with the maximum moist bias of 28% for Group T + 30 and 22% for Group T + 6 at 10 km (not shown). These biases in simulated RH by NZLAM from 4 to 10 km were significantly larger than the accuracy of ± 5.0% in humidity observations and the standard error of the mean. This bias in RH is referred to as model moist bias in this paper. Mean profiles of RH for five different synoptic weather situations observed during DEEPWAVE were analysed and a robust moist bias from 4 to 10 km was found in NZLAM for each weather situation (Table S1 and Figure 2(a)). Some differences in the magnitude of the moist bias were found, but these were attributed to the difference in weather situations and circulations, and also to the difference in sample sizes and locations of the dropsondes (Figure 1(a)).
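The comparison of the mean bias with the observation accuracy and the standard error of the mean can be sketched as follows. The profile values are synthetic, the helper name `mean_and_standard_error` is hypothetical, and the test that the bias exceeds both the ±5% accuracy and the summed standard errors is an illustrative criterion assumed here, not the procedure documented in this paper.

```python
import numpy as np

def mean_and_standard_error(profiles):
    """Mean and standard error of the mean per level; NaN marks missing samples."""
    profiles = np.asarray(profiles, dtype=float)
    count = np.sum(~np.isnan(profiles), axis=0)
    mean = np.nanmean(profiles, axis=0)
    sem = np.nanstd(profiles, axis=0, ddof=1) / np.sqrt(count)
    return mean, sem

# Synthetic RH (%): rows are sondes, columns are heights of 2, 6 and 10 km.
rh_obs = np.array([[70.0, 30.0, 20.0],
                   [80.0, 40.0, 30.0],
                   [75.0, 35.0, 25.0],
                   [85.0, np.nan, 35.0]])
rh_model = np.array([[72.0, 45.0, 45.0],
                     [79.0, 55.0, 50.0],
                     [77.0, 50.0, 48.0],
                     [84.0, np.nan, 60.0]])

obs_mean, obs_sem = mean_and_standard_error(rh_obs)
mod_mean, mod_sem = mean_and_standard_error(rh_model)

bias = mod_mean - obs_mean   # simulations minus observations
OBS_ACCURACY = 5.0           # absolute RH accuracy of the dropsondes (%)

# Assumed criterion: the bias must exceed both the sensor accuracy and the
# combined standard errors of the two mean profiles.
exceeds = np.abs(bias) > np.maximum(OBS_ACCURACY, obs_sem + mod_sem)
```

In this synthetic case the 2 km level fails the criterion while the 6 km and 10 km levels pass it, mirroring the qualitative pattern described above.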
RH can be affected by pressure, temperature and specific humidity. To estimate the relative contribution each of these three fields makes to the moist bias, simulated RH was recalculated by replacing simulated fields with, in turn, observations of pressure (NZLAM_OBSP), air temperature (NZLAM_OBST) and specific humidity (NZLAM_OBSQ) from the dropsondes. Using the observed pressure, the biases of the recalculated RH were almost the same as those of the simulated RH (NZLAM, Figures 2(b) and (c)), indicating that errors in the simulated pressure did not contribute to the moist RH bias. Using the observed air temperature to replace the NZLAM simulated air temperature, the moist bias decreased by only approximately 5% (Figures 2(b) and (c)). Using the observed specific humidity, the bias of the recalculated RH (NZLAM_OBSQ) was very close to zero, especially from 4 to 10 km. These results indicate that error in the simulated specific humidity is the dominant factor leading to the moist bias in the middle and upper troposphere.
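The substitution experiment can be sketched for a single level as below. The RH formulation is an assumption: saturation vapour pressure is taken over liquid water from the Bolton (1980) approximation, which is not necessarily the formulation used by the MetUM or in this study. The input values are invented to resemble conditions near 6 km and are not data from DEEPWAVE.

```python
import numpy as np

EPSILON = 0.622  # ratio of the gas constants of dry air and water vapour

def saturation_vapour_pressure_hpa(temp_k):
    """Saturation vapour pressure over liquid water (Bolton, 1980), in hPa."""
    t_c = temp_k - 273.15
    return 6.112 * np.exp(17.67 * t_c / (t_c + 243.5))

def relative_humidity(p_hpa, temp_k, q):
    """RH (%) from pressure (hPa), temperature (K) and specific humidity (kg/kg)."""
    vapour_pressure = q * p_hpa / (EPSILON + (1.0 - EPSILON) * q)
    return 100.0 * vapour_pressure / saturation_vapour_pressure_hpa(temp_k)

# Invented single-level values: the model is slightly cold and too moist.
p_mod, t_mod, q_mod = 470.0, 250.0, 0.0008
p_obs, t_obs, q_obs = 472.0, 251.0, 0.0005

rh_model = relative_humidity(p_mod, t_mod, q_mod)  # all simulated fields
rh_obsp = relative_humidity(p_obs, t_mod, q_mod)   # observed pressure swapped in
rh_obst = relative_humidity(p_mod, t_obs, q_mod)   # observed temperature swapped in
rh_obsq = relative_humidity(p_mod, t_mod, q_obs)   # observed humidity swapped in

for name, value in [("NZLAM", rh_model), ("NZLAM_OBSP", rh_obsp),
                    ("NZLAM_OBST", rh_obst), ("NZLAM_OBSQ", rh_obsq)]:
    print(f"{name:11s} RH = {value:5.1f} %")
```

Swapping one field at a time isolates its contribution: with these invented numbers the pressure swap barely changes RH, the temperature swap changes it by a few percent, and the specific humidity swap changes it the most.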
Figure 2. (a) [...] Table S1). (b) Bias of RH for Group T + 6 simulated by NZLAM and bias of the recalculated RH using observed pressure (NZLAM_OBSP), observed mixing ratio (NZLAM_OBSQ) and observed air temperature (NZLAM_OBST) from the dropsondes. (c) The same as (b) but for Group T + 30.
The simulated model level data of the Global UM available for this study contain only wind, potential temperature and specific humidity, so RH at model levels cannot be calculated. To determine whether the Global UM also has the moist bias found for NZLAM, the specific humidity simulated by the Global UM was used to replace the specific humidity simulated by NZLAM to recalculate RH (NZLAM_UMQ, Figures 1(b) and (c)). For Group T + 6 (Figure 1(b)), below 2 km, the RH for NZLAM_UMQ was 2-7% higher than NZLAM. From 2 to 5 km aloft, both were very close. Above 5 km, the RH for NZLAM_UMQ was 5-10% higher than NZLAM. For Group T + 30 (Figure 1(c)), below 7 km, the RH for NZLAM_UMQ was either very close to or slightly higher (1-2%) than NZLAM. Above 7 km, the RH for NZLAM_UMQ was 3-5% higher than NZLAM. Because the errors in specific humidity are the main factor leading to the moist bias, these results indicate that a moist bias of similar magnitude is also present in the Global UM in the middle and upper troposphere. Noh et al. (2016) used one year of radiosonde observations from the Global Climate Observing System Reference Upper-Air Network (GRUAN) to verify the Global UM analysis. A moist bias (approximately 3% in RH) was found only in the upper troposphere, much smaller than that in this study. These differences may be due to the fact that the radiosonde observations had also been used for the analysis. In addition, the drift of the radiosondes was not considered.

Mixing ratio
For profiles of the mean specific humidity during DEEPWAVE (Figure 3), the exponential decrease of mixing ratio with altitude was consistent between the observations and the simulations. NZLAM was very close to the Global UM with respect to vertical variations in the specific humidity. Major differences between observations and simulations were found between 3 km and 6 km aloft. In the lower boundary layer for both groups (Figures 3(b) and (d)), observed specific humidity was greater than that forecast by the Global UM and NZLAM below 500 m. The maximum difference between observations and simulations was found at the surface. These features in the lower boundary layer corresponded to the negative bias (simulations − observations) of the simulated specific humidity (Figures 4(b) and (d)), indicating that evaporation at the lower boundary (mainly the sea surface) simulated by the models was generally less than observed. Because the Earth's surface is the source of atmospheric moisture and the models were too dry, not too moist, near the surface, the simulated moist bias in the middle and upper troposphere was not caused by the lower boundary conditions.
For the specific humidity (Figure 4(a)), large errors (mean absolute differences, MAD) were found below 5 km, with NZLAM slightly higher than the Global UM for both groups. Above 5 km, the MAD of the specific humidity decreased dramatically with altitude. For the bias (Figure 4(b)), positive values (moist bias) were found above 2 km, with the maximum near an altitude of 4 km. Large negative values (dry bias) were found at the surface. Figure 4(c) shows the relative errors of the specific humidity. Relatively large values (approximately 40% or higher) were found from approximately 4 km upward. Thus, the magnitude of the model moist bias in the middle and upper troposphere is determined by the relative errors of specific humidity, not its absolute errors.
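The distinction between absolute and relative errors can be made concrete with a small sketch. The definition of the relative error is not recoverable from the text above, so the code assumes it to be the MAD divided by the mean observed specific humidity at each level. The function name `humidity_error_stats` and the values are hypothetical.

```python
import numpy as np

def humidity_error_stats(q_model, q_obs):
    """Bias, MAD and relative error (%) per level from [n_sondes, n_levels] arrays."""
    diff = q_model - q_obs                  # simulations minus observations
    bias = diff.mean(axis=0)
    mad = np.abs(diff).mean(axis=0)         # mean absolute difference
    # Assumed definition: MAD normalised by the mean observed specific humidity.
    relative_error = 100.0 * mad / q_obs.mean(axis=0)
    return bias, mad, relative_error

# Invented specific humidity (g/kg); columns are near-surface and about 8 km.
q_obs = np.array([[6.0, 0.2],
                  [6.0, 0.2]])
q_model = np.array([[5.5, 0.3],
                    [5.5, 0.3]])

bias, mad, relative_error = humidity_error_stats(q_model, q_obs)
```

With these invented values the near-surface level has the larger absolute error (0.5 g/kg against 0.1 g/kg) but the smaller relative error (about 8% against 50%), which is the sense in which a small absolute error aloft can still produce a large RH bias.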

Moist bias in ERA-Interim and NCEP-GFS analyses
It is of interest to determine whether the moist bias observed in the UM is a feature of this particular model or is also manifest in other NWP systems. Simulated hourly RH on standard pressure levels from both ERA-Interim (Dee et al., 2011) and NCEP-GFS has been compared with NZLAM over the sub-area from 150.0 to 182.5°E and 32.5 to 55.0°S (Figure 1(a)). This region corresponds with the DEEPWAVE experiment and encompasses 263 DEEPWAVE dropsonde locations. These dropsonde data were not used to create the ERA-Interim (0.25° resolution) and NCEP-GFS (0.5° resolution) reanalyses. Figure 5 shows the mean RH profiles for the sub-area at pressure levels for ERA-Interim and NCEP-GFS analyses at 1200 UTC and the corresponding mean RH profile from the NZLAM analyses from 6 June to 20 June 2014. RH analyses at 1200 UTC were chosen because the dropsondes were launched around 1200 UTC. For ERA-Interim (Figure 5(a)), the mean RH from 650 to 500 hPa was 2-5% lower than that of NZLAM, but 1-2% higher from 400 to 200 hPa. Overall, the mean RH profile of ERA-Interim was very close to NZLAM from 650 to 200 hPa, indicating that ERA-Interim also had a moist bias of similar magnitude in the middle and upper troposphere. These results are consistent with a study made during the Tibetan Plateau Experiment (Bao and Zhang, 2013), which also showed a moist bias (5-25% in RH in their Figures 2 and 3) from 500 to 200 hPa for ERA-Interim. This also suggests that the moist bias may not be specific to the New Zealand region. For NCEP-GFS, from 700 to 200 hPa, the mean RH was 4-10% higher than NZLAM, indicating that NCEP-GFS also had a moist bias of at least the same magnitude as NZLAM, or even larger, in the middle and upper troposphere.
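An area mean RH profile over the sub-area can be sketched as follows. The array layout `rh[level, lat, lon]`, the 0-360° longitude convention and the cosine-of-latitude weighting are all assumptions made for illustration; the text does not state whether the area means in Figure 5 were weighted. The function name `subarea_mean_profile` is hypothetical.

```python
import numpy as np

def subarea_mean_profile(rh, lats, lons,
                         lat_range=(-55.0, -32.5), lon_range=(150.0, 182.5)):
    """Mean RH per pressure level over a latitude-longitude box.

    rh has shape [level, lat, lon]; longitudes are assumed to run from 0 to 360
    so that the box can extend east of the dateline to 182.5 degrees.
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    box = rh[:, lat_mask][:, :, lon_mask]
    zonal_mean = box.mean(axis=2)                       # [level, lat]
    # Assumed weighting: grid cells shrink with the cosine of latitude.
    weights = np.cos(np.deg2rad(lats[lat_mask]))
    return np.average(zonal_mean, axis=1, weights=weights)

# Hypothetical 0.5 degree grid covering more than the sub-area.
lats = np.arange(-60.0, -29.75, 0.5)
lons = np.arange(140.0, 190.25, 0.5)
level_values = np.array([30.0, 50.0, 70.0])             # invented RH (%) per level
rh = level_values[:, None, None] * np.ones((3, lats.size, lons.size))

profile = subarea_mean_profile(rh, lats, lons)
```

Applying the same function to each analysis on a common set of pressure levels gives profiles that can be differenced directly, for example ERA-Interim minus NZLAM.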

Conclusion
Dropsonde descents made during the DEEPWAVE campaign over New Zealand and the surrounding open sea have been used to verify the tropospheric RH simulated by a regional configuration of the MetUM. Significant moist biases (up to 28%) in simulated RH were found in the middle and upper troposphere. The moist bias was robust for different weather situations during DEEPWAVE. This moist bias was mainly caused by the moist bias in the simulated specific humidity. The magnitude of the moist bias was determined by the relative errors of the specific humidity rather than its absolute errors. A moist bias of similar magnitude was also found in the Global UM, and very likely in the ERA-Interim and NCEP-GFS analyses too. Analysis showed that for both the Global UM and NZLAM the lower boundary conditions had a dry bias, so evaporation from the lower boundary (mainly the sea surface) is not a factor leading to the moist bias. Other possible factors that could cause the moist bias are the data assimilation configuration, precipitation and cloud processes, and the advection of water vapour. We briefly survey these mechanisms below.
Water vapour advection is determined by both the local winds and water vapour gradient, which are significantly affected by different weather systems. The continued presence of the moist bias during DEEPWAVE for the five weather situations suggests that water vapour advection may not be the main factor.
Given reliable water vapour in the initial conditions, a moist bias may result from too little precipitation and cloud. However, validation of NZLAM during the DEEPWAVE period shows both wet and dry biases in precipitation over New Zealand (Figure S2). This complicated distribution of precipitation bias is partly a result of the orographic effect on precipitation over New Zealand's mountainous regions. This issue and other errors in the parameterization and modelling of rainfall make it difficult to make a direct link between biases in upper tropospheric RH and surface precipitation. In addition, the only slight increase in moist bias with forecast range, commensurate with typical forecast degradation with lead time, suggests that this is an initial condition problem. However, further research could be conducted to see whether precipitation and cloud processes are a factor contributing to the moist bias.
The fact that the moist bias is observed from the very beginning of each model simulation, and that all three data-assimilating models investigated in this study show similar middle to upper tropospheric RH profiles, suggests that data assimilation may be making an important contribution to the bias. We think this is an area where more research is warranted.
Given a moist bias in the initial conditions, more clouds would be simulated. This would produce more precipitation with more latent heat release, and affect the surface energy budget by decreasing shortwave radiation and increasing downward longwave radiation reaching the surface. How significant these effects and the possible feedbacks are is left for future study.

Supporting information
The following supporting information is available:
Table S1. Weather situations and dominant incident wind directions (DIWD) to the NZLAM domain (Figure 1(a)) from NZLAM surface simulations around midnight, and the number of dropsondes associated with each weather situation during DEEPWAVE.
Figure S1. Cross-sections of RH (%) and potential temperature (K) across the South Island along A and B in Figure 1 [...]
Figure S2. [...] DEEPWAVE for NZLAM and NZCSM when compared against rain-gauge observations. Both models show significant biases, but apart from a consistent over-prediction in the south and east of the South Island the distribution of errors is quite different. The differences (and largest errors) are most obvious in areas of complex terrain where the higher resolution NZCSM tends to perform better.