Assimilation of humidity and temperature observations retrieved from ground-based microwave radiometers into a convective-scale NWP model

The results showed that the impact was generally very limited on all verified parameters, except for precipitation. The impact was found to be generally beneficial in terms of most verification metrics for about 18 h, especially for larger accumulations. Two additional data-denial experiments showed that even more positive impact could be obtained when MWR data were assimilated without other redundant observations. The conclusion of the study points to possible ways of enhancing the impact of the assimilation of MWR data in convective-scale NWP systems.


Introduction
Nowadays, ground-based microwave radiometers (MWRs) are robust instruments providing continuous unattended operations and real-time accurate atmospheric observations under nearly all weather conditions (Cimini et al., 2011; Löhnert and Maier, 2012). MWR products are used for a variety of applications including, but not limited to, operational meteorology, air-quality monitoring, wave-propagation studies, as well as site climatology characterization (Westwater, 1993; Cimini et al., 2011, 2015; Löhnert and Maier, 2012). At the same time, numerical weather prediction (NWP) systems have increasing needs for high-resolution observations both in time and space as their own resolutions increase. Typically, current limited-area operational NWP systems run at the kilometre scale and provide forecasts taking into account the latest observations every couple of hours. For instance, both the Met Office's UKV (Tang et al., 2013) and DWD's COSMO-DE (Baldauf et al., 2011) models are run in rapid update cycles with new forecasts initialized every 3 h and data assimilation being carried out at horizontal resolutions of 3 and 2.8 km, respectively. Over recent years, substantial efforts have been devoted to the assimilation of observations specific to the convective scale, such as radar data (e.g. Sun, 2005; Stephan et al., 2008; Montmerle and Faccani, 2009; Caumont et al., 2010; Schwitalla and Wulfmeyer, 2014), which provide information on dynamics and microphysics, and also Global Positioning System (GPS) Zenith Total Delay (ZTD) data (e.g. Yan et al., 2009a, 2009b), or Slant Total Delay data (e.g. Kawabata et al., 2013), which mainly provide information on the integrated water content. The assimilation of such data has even become operational at some national weather centres. Other observations which have been considered recently include lightning flash rate from ground-based networks (e.g. Fierro et al., 2014), whose relationship to model parameters is far from straightforward, lidar water vapour (e.g. Bielli et al., 2012), and lidar wind (e.g. Kawabata et al., 2014), among others. Even so, humidity and temperature are still acknowledged as poorly observed parameters at the kilometre scale, especially in the planetary boundary layer.
© 2016 Royal Meteorological Society
In spite of all these considerations, the use of MWR data for assimilation into NWP systems has been limited to a few sporadic attempts. For example, three- and four-dimensional variational (3D-Var and 4D-Var, respectively) assimilation of temperature and humidity data from a single ground-based MWR has been attempted for a winter fog event by Vandenberghe and Ware (2002). The data were assimilated at a horizontal resolution of 10 km with the Fifth-Generation Pennsylvania State University/National Center for Atmospheric Research Mesoscale Model (MM5; Grell et al., 1994) over a common time window of 3 h. Both 3D-Var and 4D-Var assimilation experiments were able to forecast the fog to some extent, whereas a control run without data assimilation could not. However, MWR data were assimilated along with integrated water vapour content from GPS ground receivers and radar wind profiles; this prevented the attribution of the impact of data assimilation solely to MWR data.
More recently, an Observing System Simulation Experiment (OSSE) considering a simulated network of 140 MWRs was carried out for a winter storm case (Hartung et al., 2011). The Weather Research and Forecasting (WRF) model (Skamarock et al., 2005) and an ensemble Kalman filter (EnKF) algorithm were used to assimilate simulated MWR temperature and humidity profiles at a horizontal resolution of 18 km every hour during a period of 24 h. Overall, the authors found that the assimilation of MWR data had a positive impact on temperature and humidity analyses. The impact on forecasts up to a range of 12 h was found to be positive with respect to 850 hPa moisture flux, but more variable regarding precipitation accumulations.
These studies all showed a promising impact of the assimilation of MWR data into NWP, though results were limited to single case-studies and deep convection was parametrized in the chosen NWP systems. The novel purpose of this article is to assimilate data from a real network of ground-based MWRs in a convective-scale NWP system, and study its impact on heavy precipitation forecasts. This study was carried out in the framework of the international Hydrological cycle in the Mediterranean Experiment (HyMeX; Drobinski et al., 2014), in preparation for the HyMeX First Special Observing Period (SOP 1; Ducrocq et al., 2014). One of the goals of HyMeX SOP 1 was to improve our understanding and predictive skills of heavy precipitation and related flash floods around the Western Mediterranean (WMed) basin. Temperature and humidity retrievals from an international continental-scale network of MWRs have been collected for a period of 41 days and assimilated into a version of the operationally used convective-scale NWP system Arome.
Section 2 presents the characteristics of the period under investigation, the NWP system, and the microwave radiometer observations used in the study. Section 3 reports upon the observation-minus-background statistics computed prior to the assimilation experiments. Section 4 describes the assimilation experiments performed in this study and the impact of assimilating MWR temperature and/or humidity on various atmospheric parameters. Section 5 summarizes the findings and discusses the results as well as paths of improvement.

Experimental environment
This study has been carried out in preparation for HyMeX's SOP 1, which was dedicated to the study of heavy precipitation and flash flooding in the northwestern Mediterranean. Such extreme events most often occur in autumn (Ricard et al., 2012). This is the reason why SOP 1 was held from 5 September to 5 November 2012. The experimental design that is described in the following was motivated by the occurrence of many heavy precipitation events (HPEs) during the autumn of 2011 and the concurrent availability of the Arome-WMed prototype and MWR data over the same period.

Period under investigation
The northwestern Mediterranean coastal areas were particularly affected by HPEs in the autumn of 2011. The period considered in this study extends from 15 October to 25 November 2011 and encompasses most of the HPEs which occurred that year. Time series of 24 h accumulated precipitation averaged over the western Mediterranean area show when these HPEs occurred (Figure 1). The two periods with the largest daily accumulations, i.e. around 25 October and 5 November, have received a lot of attention from hydrometeorologists. For instance, Pulvirenti et al. (2014) studied the severe weather event that hit northwestern Italy from 3 November to 8 November 2011. Silvestro et al. (2012), Fiori et al. (2014), and Hally et al. (2015) studied the Genoa case of 4 November, which occurred in Liguria, Italy. Rebora et al. (2013) and Buzzi et al. (2014) also studied this latter case along with the Cinque Terre case of 25 October, which also occurred in Liguria. The Cévennes, France, case of 1-4 November has been dealt with by Hally et al. (2013). This period of time (41 days) has also been chosen to be long enough to yield robust statistics.

NWP system
The NWP system used in this study is Arome-WMed, a particular version of the Arome system (Seity et al., 2011) covering the western part of the Mediterranean Sea (Figure 2). As such, Arome-WMed is more suited to study Mediterranean HPEs than its operational counterpart over the 'France' domain. Another advantage is that its domain is larger than the operational one, which allows it to include more MWRs. Like its operational counterpart until 13 April 2015, Arome-WMed has a nonhydrostatic dynamical core with a horizontal resolution of 2.5 km and 60 vertical levels which follow the terrain in the lowest layers and isobars in the upper atmosphere. The detailed physics of Arome-WMed are inherited from the research Meso-NH model (Lafore et al., 1998). Deep convection is assumed to be resolved explicitly, but shallow convection is parametrized following Pergaud et al. (2009). A bulk one-moment microphysical scheme (Pinty and Jabouille, 1998) governs the equations of the specific contents of six water species (humidity, cloud liquid water, precipitating liquid water, pristine ice, snow, and graupel).
Figure 2 caption (fragment): ... Figure 9. Locations of MWR sites are also shown (large circles; + and × indicate respectively humidity and/or temperature retrievals whenever available). Locations of radiosonde launch sites used in section 4.1 are represented as small circles.
Arome-WMed has a 3D-Var data assimilation system with background covariances specially computed for the 'WMed' domain. 3D-Var analyses are performed every 3 h and provide new initial states for subsequent forecasts. Such assimilation cycles are usually referred to as rapid update cycles. Data assimilated by the Arome data assimilation system include observations from radiosondes, wind profilers, aircraft, ships, buoys, automatic weather stations, satellites, GPS stations (Mahfouf et al., 2015), and both radar reflectivity (Wattrelot et al., 2014) and Doppler radar wind velocity (Montmerle and Faccani, 2009). Figure 3 details the average numbers of observations which are used to produce a single analysis. In total, nearly 34 000 observations are assimilated in each data assimilation cycle. In addition, 30 h forecasts are performed starting from the 0000 UTC analysis every day.
The lateral boundary conditions (LBCs) that are needed to compute Arome-WMed forecasts are updated hourly. They are provided by the global Arpege NWP system. Arpege has a horizontal resolution of approximately 15 km over the WMed domain. Arpege forecasts are initialized every 6 h, i.e. at 0000, 0600 UTC, etc. In the Arome-WMed rapid update cycle, the most recent Arpege analyses or forecasts are used every hour as LBCs. For the 30 h Arome-WMed forecasts starting at 0000 UTC, Arpege forecasts also initialized at 0000 UTC are used as LBCs, i.e. with the same ranges as Arome-WMed. With a similar configuration, the Arome-WMed system was run in real time during HyMeX SOP 1 and contributed to guide the deployment of dedicated, mobile observing platforms such as research aircraft and boundary-layer pressurized balloons (Fourrié et al., 2015, give more details about Arome-WMed).

Microwave radiometer observations
The microwave radiometer observations considered here consist of atmospheric temperature and humidity profiles (from surface up to 10 km altitude) retrieved by 13 ground-based MWRs. Microwave radiometry is a passive technique that has been used for several decades to observe thermodynamic profiles in the troposphere (Westwater, 1993; Westwater et al., 2005). Ground-based MWRs measure the downwelling radiance coming from the atmosphere. The radiance is usually expressed in terms of brightness temperature, Tb. Most common MWRs measure Tb at selected channels in the 20-60 GHz frequency range (0.5-1.5 cm wavelength). MWR humidity profilers exploit 20-30 GHz channels, while MWR temperature profilers exploit 50-60 GHz channels. MWR full (i.e. temperature and humidity) profilers have channels in the 20-60 GHz range. In principle, Tb can be assimilated directly as is now commonly done for satellite sounders (e.g. Andersson et al., 1994). However, direct assimilation requires a fast radiative transfer model and its adjoint, which were not available at the time of this analysis. Such tools are currently being implemented (De Angelis et al., 2016). Instead, we consider here atmospheric temperature and/or humidity profiles which are inferred by processing the measured Tb together with some a priori knowledge within an inverse method. Different inverse methods may be used, including multivariate regression, optimal estimation, and neural networks (Westwater, 1993; Cimini et al., 2006). The a priori information is generally obtained from radiosonde climatology, though methods exploiting model analyses or forecasts have also been demonstrated (Cimini et al., 2011; Güldner, 2013).
Temperature and humidity profiles retrieved from MWR observations are usually validated against simultaneous radiosonde measurements. Statistics (mean and standard deviation) of the difference between MWR retrievals and radiosonde measurements are used to quantify the accuracy of MWR retrievals. For temperature profiles, the standard deviation is typically of the order of 0.5 K near the surface, increasing to about 1.5-2 K in the middle of the troposphere (Güldner and Spänkuch, 2001; Liljegren et al., 2005). The best performance of MWR temperature profiling is thus expected in the boundary layer (Crewell and Löhnert, 2007). For profiles of absolute humidity, the standard deviation is typically less than 0.5 g m−3 near the surface, increasing to 1.5 g m−3 in the first 2 km and decreasing exponentially above that level due to the decrease of humidity with height (Güldner and Spänkuch, 2001; Liljegren et al., 2005). Typically, these values are only weakly dependent on the inversion method. Differences from site to site are largely due to specific MWR characteristics and a priori data.
The statistics above include the radiosonde sensor uncertainties as well as the representativeness errors caused by balloon drifting, and thus indicate the envelope for the retrieval uncertainty expected from well-calibrated and maintained MWRs. Sources contributing to the total retrieval uncertainty include instrument calibration, microwave absorption model, a priori climatology, and smoothing error.
The smoothing error is due to the relatively low-to-moderate vertical resolution attainable by MWR retrieved profiles. Although MWR retrievals are given on fixed vertical grids depending upon instrument settings (e.g. every 50 m from the surface to 1 km, then every 250 m up to 10 km), the true vertical resolution depends on many factors, including the elevation scanning strategy and the atmospheric conditions. One method commonly used to quantify the true vertical resolution of MWR retrievals is the inter-level covariance (ILC; Güldner and Spänkuch, 2001; Liljegren et al., 2005; Cimini et al., 2006). For a generic MWR operating in the 20-60 GHz range, the vertical resolution decreases almost linearly with height, z, from relatively small ILC values (higher resolution) near the surface to larger ILC values (lower resolution) in the upper troposphere. Following Liljegren et al. (2005), one can estimate the true vertical resolution of MWR retrievals as: where both ILC and z are expressed in km.
Additional systematic retrieval error can be caused by environmental conditions (e.g. wetting of the radiometer radome) and faulty calibration. For example, Löhnert and Maier (2012) show that a significant bias (∼0.5-1 K) in temperature may occur as a function of height throughout the troposphere, mainly due to systematic calibration offset in Tb. These may be corrected for by continuous comparisons with radiosondes (if available) or within the assimilation procedure itself using the model as mean reference.
In this study we consider 13 MWR units falling within the domain of Arome-WMed, as shown in Figure 2. These instruments belong to different European institutions and are members of MWRnet, an International Network of Ground-based Microwave Radiometers (http://cetemps.aquila.infn.it/mwrnet/; accessed 7 July 2016). MWRnet aims at defining the best practice for obtaining good-quality MWR observations and retrievals, ultimately increasing the use of MWR data in NWP and other applications. Details on each considered MWR are summarized in Table 1. These include one humidity profiler, three temperature profilers, and nine full (temperature and humidity) profilers. The humidity profiler is MIAWARA, a prototype built and maintained at the Institute of Applied Physics (IAP) of the University of Bern (Deuber et al., 2004; Straub et al., 2010). MIAWARA is a spectroradiometer with 16 000 channels around the water-vapour absorption line at 22.235 GHz (from 21.734 to 22.735 GHz). The humidity profile retrievals are obtained using an optimal estimation inversion method. The three temperature profilers are one TEMPRO and two MTP5-HEs. TEMPRO is a seven-channel (from 51.26 to 58.0 GHz) radiometer manufactured by Radiometer Physics GmbH (RPG). Temperature profile retrievals are obtained using a multivariate linear regression inversion method. MTP5-HE is a single-channel (56.60 GHz) radiometer manufactured by RPO ATTEX. MTP5-HE performs rapid and dense elevation scans to provide temperature retrievals up to 1 km altitude. The nine full (temperature and humidity) profilers are multichannel radiometers manufactured by either RPG (HATPRO) or Radiometrics Inc. (MP3000). Temperature and humidity profile retrievals (from surface up to 10 km) are obtained using multivariate regression and neural network inversion methods, respectively.
For the period under analysis, the operating institution of each of the 13 MWRs in Table 1 provided original non-bias-corrected temperature and/or humidity profiles as obtained using the operational inverse method. Nominal uncertainties within the troposphere are approximated to be between 0.5 and 2 K for temperature and between 0.5 and 1.5 g m−3 for absolute humidity, independent of inversion method. The complete dataset has been down-sampled to 3 h intervals to match the assimilation scheme of Arome-WMed, i.e. observed data have been averaged over a 1 h window centred on the analysis time and assimilated at 3 h intervals.
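The down-sampling described above can be sketched as follows. This is only an illustration under stated assumptions: the 10 min sampling, the temperature values, and the function name `average_around_analysis` are hypothetical and are not taken from the study; only the 1 h window centred on analysis times every 3 h comes from the text.

```python
from datetime import datetime, timedelta

def average_around_analysis(times, values, analysis_time,
                            half_window=timedelta(minutes=30)):
    """Mean of the values observed within +/- half_window of analysis_time.

    Returns None when no observation falls inside the window.
    """
    selected = [v for t, v in zip(times, values)
                if abs(t - analysis_time) <= half_window]
    return sum(selected) / len(selected) if selected else None

# Hypothetical retrievals at one level every 10 min from 0000 to 0600 UTC (K).
start = datetime(2011, 10, 15, 0, 0)
times = [start + timedelta(minutes=10 * i) for i in range(37)]
values = [280.0 + 0.1 * i for i in range(37)]

# Analysis times every 3 h: 0000, 0300 and 0600 UTC.
analysis_times = [start + timedelta(hours=3 * k) for k in range(3)]
downsampled = [average_around_analysis(times, values, t) for t in analysis_times]
```

At the ends of the record the window is only partly filled, so the first and last averages rely on fewer samples than the one at 0300 UTC.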
Table 1. Station name, operating institution, location (latitude and longitude), altitude (Alt) above mean sea level, available products (Prod), type, number of channels, frequency range and retrieval method for the 13 MWRs considered in this study.
Absolute humidity retrieved from the MWR data was converted to specific humidity, the latter quantity being used in the monitoring and assimilated in data assimilation experiments. Temperature is needed to convert absolute humidity to specific humidity. Tests were performed that compared specific humidity obtained by using either temperature retrieved from the MWR data or temperature from Arome-WMed 3 h forecasts. Differences in resulting specific humidities were found to be negligible (not shown). For monitoring and data assimilation purposes (sections 3 and 4), temperatures were retrieved from the MWR data when they were available, and from Arome-WMed 3 h forecasts otherwise.
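A minimal sketch of such a conversion is given below, assuming that water vapour behaves as an ideal gas and that a pressure value is available at the retrieval level. The text does not state how pressure was obtained, so the pressure argument, the constants, and the function name `specific_humidity` are assumptions made for illustration, not the procedure used in the study.

```python
R_V = 461.5      # specific gas constant of water vapour (J kg-1 K-1)
EPSILON = 0.622  # ratio of the gas constants of dry air and water vapour

def specific_humidity(abs_humidity_g_m3, temperature_k, pressure_pa):
    """Convert absolute humidity (g m-3) to specific humidity (g kg-1)."""
    # Vapour pressure (Pa) from the ideal gas law applied to water vapour.
    vapour_pressure = abs_humidity_g_m3 * 1e-3 * R_V * temperature_k
    # Specific humidity (kg kg-1) from vapour pressure and total pressure.
    q = EPSILON * vapour_pressure / (pressure_pa - (1.0 - EPSILON) * vapour_pressure)
    return 1e3 * q

# Hypothetical near-surface values: 10 g m-3 at 288.15 K and 1013.25 hPa.
q_ref = specific_humidity(10.0, 288.15, 101325.0)
# Same absolute humidity with a temperature that differs by 2 K.
q_shifted = specific_humidity(10.0, 290.15, 101325.0)
relative_change = abs(q_shifted - q_ref) / q_ref
```

In this example a 2 K change in temperature alters the specific humidity by less than 1%, which is in line with the weak sensitivity to the temperature source reported above.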

Monitoring of observations
Before assimilating MWR products, observation-minus-background (O−B) values have been computed for temperature and specific humidity profiles at each site in order to check the consistency between MWR products and 3 h model forecasts. A control (CTRL) experiment, which assimilated all observations described in section 2.2 except MWR products, was used for that purpose. The 3 h forecasts of its rapid update cycle provided temperature and specific humidity profiles that were interpolated at the observation locations. Thus, a total of 8 × 41 = 328 background forecasts were produced for the O−B statistics. An example of O−B temperature differences for one site and the whole period is shown in Figure 4. The daily cycle as well as the longer-scale modulation measured by the MWR and simulated by the model are consistent. For temperature, differences seem well centred and are usually within 2 K, though they can exceptionally exceed 5 K. For specific humidity, sharp transitions can be noticed in the O−B time series, which are caused by the more variable nature of tropospheric humidity.
O−B statistics were computed from O−B differences at each site. The mean and the standard deviation of the differences in temperature are shown in Figure 5. The biases in temperature can be quite large and reach ±4 K. However, biases are always lower than ±2 K below 2 km amsl, except for Madrid. The biases vary substantially from one site to another. However, the average bias for all MWR data is remarkably close to zero and always within ±1 K (not shown). As expected, standard deviations increase with altitude for temperature from approximately 1 to 2-3 K, because of the intrinsic lower resolution of retrievals at high altitudes compared to that of the model. Compared to the biases, the standard deviations do not depend much on the station, except for Schneefernerhaus, which provides slightly better O−B standard deviations than the average.
The O−B statistics for specific humidity are presented in Figure 6. Both biases and standard deviations decrease with altitude for specific humidity. This is caused by the depletion of specific humidity with altitude. On the other hand, standard deviations increase with altitude for relative humidity for the same reason as for temperature (not shown). In the lower altitudes, the biases for specific humidity are within ±1.5 g kg−1, while the standard deviations are within 0.5-1.5 g kg−1. These values are more or less the same for all stations. However, it may be noticed that the Madrid station provides the worst bias between 2.5 and 4 km amsl, the one at Potenza provides the worst standard deviation between 1 and 3 km amsl, and the station at Schneefernerhaus provides the lowest standard deviation above 3 km amsl.
The larger biases seen in both temperature and specific humidity retrievals are due to a combination of model bias, instrument bias, and retrieval bias. Methods to produce weakly biased MWR retrievals are already available (Löhnert and Maier, 2012; Güldner, 2013), though they were not used operationally at the sites considered here. Although the biases of MWR retrievals are generally larger than those of radiosondes, the standard deviations are of the same order.
Based on these statistics, further quality control was performed such as removing retrieved temperature profiles when the profile bias minus the average profile bias exceeded 2 K, or removing retrieved temperature from the Kloten station above 3 km amsl (the latter data are not shown in Figure 6). However, this additional quality control did not discard much data and should be more strict in an operational context. In addition, all observations were bias-corrected against the model prior to their assimilation: for each observation, the average O−B value for the given station and altitude was subtracted from the original value. Figures 5 and 6 plotted with the corrected values would result in all curves drawn on the x = 0 axis for the bias, but unchanged for the standard deviation.
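The bias correction described above can be illustrated with a short sketch. The observation and background values below are invented for the example, and the function name `bias_correct` is hypothetical; the only element taken from the text is the subtraction of the mean O−B value computed for one station and one altitude.

```python
from statistics import mean, pstdev

def bias_correct(observations, backgrounds):
    """Subtract the mean O-B departure (one station, one altitude) from each observation."""
    bias = mean(o - b for o, b in zip(observations, backgrounds))
    return [o - bias for o in observations], bias

# Hypothetical temperatures (K) at one station and one altitude.
obs = [281.0, 283.5, 279.0, 284.5]
bkg = [280.0, 282.0, 279.5, 282.5]

corrected, bias = bias_correct(obs, bkg)
departures_before = [o - b for o, b in zip(obs, bkg)]
departures_after = [o - b for o, b in zip(corrected, bkg)]
```

After correction the mean departure is zero while the standard deviation of the departures is unchanged, as stated for Figures 5 and 6.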

Data assimilation experiments
In order to investigate the impact of MWR data assimilation, three Arome-WMed experiments were run in addition to the CTRL experiment described in section 3. These three experiments assimilated, in addition to operational data, MWR products described in section 2.3: temperature profiles only (DA T), humidity profiles only (DA Q), and both temperature and humidity profiles (DA TQ). With the 3D-Var rapid update cycle used here, the information brought by MWR observations at a given time was indirectly propagated throughout the experiment through the 3 h forecasts. For instance, when computing initial conditions at 1200 UTC, operational observations and MWR products (temperature, humidity or temperature and humidity, depending upon experimental run) were merged with a 3 h forecast which started at 0900 UTC from an analysis that took into account operational and MWR observations. An estimate of the observation error is needed to run variational data assimilation. In this study, the observation-error standard deviation for specific humidity was set to 12% of specific humidity at saturation, while for temperature it was within 1-1.5 K, depending on the model level. These values are typically used to assimilate radiosonde data.
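To give an order of magnitude for the humidity observation error, the sketch below computes 12% of the saturation specific humidity. The saturation vapour pressure formula (a Bolton-type approximation over liquid water) and the example temperature and pressure are assumptions chosen for illustration; the text does not say which formulation the Arome assimilation system uses, and the function names are hypothetical.

```python
import math

def saturation_specific_humidity(temperature_k, pressure_pa):
    """Saturation specific humidity (kg kg-1) over liquid water."""
    t_c = temperature_k - 273.15
    # Assumed approximation of the saturation vapour pressure (Pa).
    e_sat = 611.2 * math.exp(17.67 * t_c / (t_c + 243.5))
    return 0.622 * e_sat / (pressure_pa - 0.378 * e_sat)

def humidity_obs_error(temperature_k, pressure_pa, fraction=0.12):
    """Observation-error standard deviation as a fraction of saturation."""
    return fraction * saturation_specific_humidity(temperature_k, pressure_pa)

# Hypothetical level at 900 hPa, for two temperatures.
sigma_cool = humidity_obs_error(283.15, 90000.0)
sigma_warm = humidity_obs_error(293.15, 90000.0)
```

Under these assumptions the error is about 1 g kg−1 at 10 °C and 900 hPa, and it grows with temperature because the saturation value does.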

Verification against upper-air observations
The direct impact of the assimilation of MWR data on upper-air temperature and humidity, as well as the indirect impact on upper-air wind, was verified against radiosonde measurements. The locations of the 30 available radiosonde launch sites are plotted as small circles in Figure 2. The bias and root-mean-square error were computed for all 3 h forecasts and all radiosoundings valid at the same times. A total of 798 available soundings were used. It was found that all experiments yielded very similar results, and the differences among them were not found to be statistically significant for any of the considered parameters, i.e. temperature, relative humidity, wind speed, and wind direction. Several reasons may be hypothesized to explain these similar statistics. First, the radiosonde data which are used for verification are also assimilated in all experiments. However, this is done 12-24 h before the verification, so it is likely that most of their impact is lost after such a period of time. It is all the more likely since many other upper-air data such as satellite radiances (about 21 900 in each analysis) are assimilated in the assimilation cycles between the radiosonde data assimilation time and the verification time. Second, given the relative sparseness of the radiosounding network, the impact of assimilating MWR data may not be propagated to radiosonde launch sites. Third, the information provided by MWR data may be redundant with that of other observing systems such as radiosondes. This latter aspect is addressed in section 4.3.

Verification against surface observations
The impact of the assimilation of MWR data was also assessed by comparing the experiments with observations from ground weather stations. Through the use of the HyMeX database, it was possible to get dense data over Spain, France, and Italy, while synoptic surface measurements available over the Global Telecommunication System (GTS) were used over the rest of the domain. These ground stations usually measure surface pressure, 2 m temperature and humidity, 10 m wind, and precipitation accumulations over various time periods. Verification statistics for all these parameters were extensively computed for all forecast terms of the runs which started at 0000 UTC. The impact of the assimilation of MWR data is generally very limited, whatever the forecast term considered, and the differences between all experiments were not found to be statistically significant for surface pressure, 2 m temperature and humidity, and 10 m wind.
These similarities in verification statistics hold for all verified parameters, except those related to precipitation. Indeed, more pronounced, statistically significant differences are obtained for quantitative precipitation forecasts (QPFs) as verified by rain-gauge observations. For instance, Figure 7 shows continuous verification statistics for 6 h accumulated precipitation forecasts as a function of the forecast range.
While bias in 6 h accumulated precipitation is degraded by the assimilation of MWR products, root-mean-square error and correlation coefficient consistently show some positive impact of the assimilation of MWR products in the first forecast terms. The root-mean-square errors of all data assimilation experiments outperform that of CTRL until about 18 h into the forecast. Correlation coefficients essentially convey the same information as the root-mean-square errors: the assimilation of MWR products has generally a beneficial impact on forecasts up to about 18 h. The larger differences in precipitation forecasts may be explained by the fact that the verifying observations are not assimilated in any of the four experiments. Also, precipitation is highly nonlinear and small differences in other model variables in the model initial state may lead to large differences in QPF. Positive impact is not expected at long-term ranges because the impact of the LBCs increases with forecast range.
Continuous verification statistics for precipitation accumulations may reflect contrasted behaviours as they treat equally small and large accumulations, whereas, for instance, a substantial bias in small accumulations may be considered as negligible for large accumulations. Categorical scores alleviate this issue by considering the ability of a model to forecast events that are defined as functions of given thresholds in accumulation values. The frequency bias (FBIAS) is the ratio of forecast events to observed events. A perfect model would yield an FBIAS of 1, while a model that forecasts too many (too few) events with respect to their natural occurrence would yield an FBIAS larger (smaller) than 1. The equitable threat score (ETS) reflects the number of correct forecasts in excess to those that would verify by chance. A perfect model would yield an ETS of 1, an unskilled model 0, and a poor model −1/3. Since positive impact can be noticed up to about 18 h, the FBIAS and ETS have been computed for 0 to 18 h accumulated precipitation forecasts and are plotted in Figure 8.
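The two categorical scores can be computed from a 2 × 2 contingency table, as in the sketch below. The formulas are the standard definitions of FBIAS and ETS; the sample accumulations, the use of "greater than or equal to" for the threshold test, and the function names are assumptions made for illustration. The sketch does not guard against thresholds for which no event is observed.

```python
def contingency(forecasts, observations, threshold):
    """Count hits, misses, false alarms and correct negatives for one threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecasts, observations):
        forecast_event = f >= threshold
        observed_event = o >= threshold
        if forecast_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif forecast_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def fbias(hits, misses, false_alarms):
    """Frequency bias: number of forecast events over number of observed events."""
    return (hits + false_alarms) / (hits + misses)

def ets(hits, misses, false_alarms, correct_negatives):
    """Equitable threat score: hits in excess of those expected by chance."""
    total = hits + misses + false_alarms + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / total
    return (hits - hits_random) / (hits + misses + false_alarms - hits_random)

# Hypothetical accumulations (mm) at eight rain gauges, 10 mm threshold.
forecast_mm = [12, 0, 35, 60, 5, 0, 22, 8]
observed_mm = [15, 0, 20, 70, 12, 0, 3, 2]
table = contingency(forecast_mm, observed_mm, threshold=10)
score_fbias = fbias(*table[:3])
score_ets = ets(*table)
```

With these sample values there are three hits, one miss, one false alarm, and three correct negatives, which gives an FBIAS of 1 and an ETS of 1/3.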
All models share the same general features. The FBIAS decreases with a linear trend from 1.1-1.2 for lower thresholds down to about 1 for 100 mm. This means that the model tends to predict more precipitation than observed up to 100 mm. Similarly, the ETS decreases from values of 0.5-0.6 for lower thresholds to values of 0.2-0.3 for 100 mm. This shows that the model is the most skilful for lower precipitation accumulations and has more difficulty predicting large precipitation amounts. A closer inspection of Figure 8 shows that DA Q and DA TQ are generally close to CTRL below 10 mm. Above 10 mm, they are worse than CTRL up to 100 mm (40-50 mm) in terms of FBIAS (ETS). They are better above 40-50 mm in terms of ETS. DA T behaves slightly differently. It is worse than CTRL up to 70 mm and better above in terms of FBIAS. For the ETS, DA T is generally worse than all the other experiments below 10-20 mm and behaves like DA Q and DA TQ above. It is thus concluded that the assimilation of MWR data is moderately beneficial for larger precipitation amounts and moderately detrimental for small precipitation amounts.
In order to illustrate how these statistics relate to the resulting impact on specific events, a heavy-precipitation event was chosen. It occurred in the first days of November 2011 and was characterized by long-lasting heavy precipitation over the Cévennes mountains, which are located in the southeastern part of the Massif Central, France. On these days, the large-scale situation was dominated by a deep, cold upper-level trough over the North Atlantic Ocean. This configuration was associated with a mid-tropospheric southwesterly diffluent wind over southeastern France. Near the surface, a cold front was approaching southeastern France from the west. Over southeastern France, ahead of the front, a south-southeasterly warm, moist, and strong wind prevailed with gusts in excess of 100 km h−1. Hally et al. (2013) give some more details about the meteorological situation of this case. The largest daily rainfall accumulations occurred over the Cévennes on 3 November. Figure 9 shows the distribution of these accumulations as observed and simulated by the four experiments.
The largest observed accumulations are located over the Cévennes along the south-southwest to north-northwestern direction. A secondary peak is observed east of the largest rainfall pattern. The largest daily accumulations exceed 350 mm locally (one rain-gauge observation) and 250 mm over wider areas. The secondary rainfall pattern is estimated to reach 100-150 mm by radar observations, but this cannot be corroborated by rain-gauge observations due to the sparseness of the surface network. All four of the experiments succeed in simulating the main rainfall pattern, albeit with different smaller-scale structures. The most notable differences regarding this primary rainfall pattern appear at its southern tip. The best experiments seem to be DA T and DA Q, while CTRL and DA TQ seem to underestimate rainfall amounts in this area, where the maximum accumulation of more than 350 mm was recorded by a rain gauge. The secondary pattern is best forecast by CTRL and DA Q. However, there is too much precipitation between the two patterns in CTRL. In this respect, DA Q seems to perform best, with accumulations very close to those measured by rain gauges.

Radiosonde data-denial experiments
MWR retrievals of temperature and humidity give essentially the same information as radiosonde data (excluding wind direction and speed). The main differences between the two data sources are the time availability (radiosondes are launched only once or twice a day), distribution of stations (30 radiosonde launch sites versus 13 MWRs at possibly different locations in this study), and quality (in situ data are of better quality than remote-sensing data). As shown in Figure 2, several MWRs are close to or colocated with radiosonde launch sites. In this section, the possible redundancy of these two kinds of profile data is investigated. For this, two additional experiments are performed. The experiment hereafter referred to as CTRL−RS is the same as CTRL, except that radiosonde data are not assimilated. Likewise, DA TQ−RS is the same as DA TQ, but without assimilating radiosonde data. Extensive verification statistics have been computed for these two additional experiments. As found above, there is only very limited, statistically insignificant, impact on the short-term forecasts versus assimilated observations such as those from automatic weather stations (not shown). This result also holds true for radiosonde data. This means that the proximity between all experiments cannot solely be attributed to the fact that the verifying observations are assimilated, as proposed above.
As in the initial set of experiments, the greatest, statistically significant, differences are found in QPF. Figure 10 shows that removing radiosonde data improves the bias in 6 h accumulated precipitation in the first 12 h, but then degrades it (compare CTRL and CTRL−RS in (a)). In all other cases, assimilating MWR products degrades the bias of 6 h accumulated precipitation forecasts for all ranges. Regarding the root-mean-square error and the correlation coefficient, removing radiosonde data degrades these verification metrics for all forecast ranges. The benefit of assimilating MWR data when radiosonde data are not assimilated is more marked than when radiosonde data are assimilated, and lasts for more than 18 h.
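As a reminder of how the three continuous metrics discussed above are usually computed, a minimal Python sketch is given below. It assumes that the bias is defined as the mean of forecast minus observation and that the correlation is the Pearson coefficient; the function name `continuous_scores` and the sample values are hypothetical and do not come from the verification package used in this study.

```python
import numpy as np

def continuous_scores(forecast, observed):
    """Return (bias, RMSE, correlation) for paired accumulations (mm).

    Assumed conventions: bias = mean(forecast - observed);
    correlation = Pearson correlation coefficient.
    """
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    diff = f - o

    bias = diff.mean()                  # systematic over- or underestimation
    rmse = np.sqrt(np.mean(diff ** 2))  # overall error magnitude
    corr = np.corrcoef(f, o)[0, 1]      # linear association, insensitive to bias
    return float(bias), float(rmse), float(corr)

# Illustrative (made-up) 6 h accumulations in mm
bias, rmse, corr = continuous_scores([1.0, 2.0, 3.0, 4.0],
                                     [2.0, 4.0, 6.0, 8.0])
```

Because the correlation coefficient is insensitive to a systematic offset or scaling, a forecast can have a degraded bias and yet an improved correlation, which is consistent with the mixed behaviour of the metrics reported here.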
FBIAS and ETS are plotted in Figure 11 for 18 h accumulated precipitation. Assimilating MWR data in experiments without radiosonde data generally degrades the FBIAS for all thresholds. The same result was obtained when radiosonde data were assimilated. It is also worth noting that assimilating radiosonde data (compare CTRL with CTRL−RS) degrades the FBIAS beyond 50 mm. The consistent degradation in bias observed when either radiosonde or MWR data are assimilated could be attributed to errors in the model physics which are better compensated when these data are not assimilated.
More conclusive results are obtained for the ETS. When radiosonde data are not assimilated, the ETS is improved for all thresholds when MWR data are assimilated, and the gain increases with larger thresholds.
In short, the benefit of assimilating MWR data when radiosonde data are not assimilated can be seen on QPF up to forecast ranges of about 18 h in the root-mean-square error, correlation coefficient, and ETS, but not in the bias and frequency bias whose degradation may result from initial errors balanced with model errors in the reference experiments.

Summary and discussion
Temperature and humidity retrievals from an international network of 13 ground-based microwave radiometers were assimilated in a kilometre-scale NWP system. The study spanned a period of 41 days encompassing a significant number of heavy-precipitation events in the western Mediterranean region. Comparisons between the retrieved temperature and humidity and the model's counterparts from a control run showed reasonable agreement in general. Standard deviations of differences between MWR products and the model were comparable to those obtained with radiosondes. However, the biases were found to be much larger. The retrieved data were assimilated with a 3D-Var technique in a 3 h rapid update cycle configuration. Three experiments were performed which assessed the respective impact of the assimilation of temperature retrievals, humidity retrievals, and temperature and humidity retrievals. Extensive verification was carried out against upper-air observations as well as surface observations. On average, little impact was found on analysed and predicted upper-air and surface parameters, except on precipitation forecasts. Continuous and categorical verification statistics on precipitation forecasts yielded mixed results. While the bias was found to be degraded up to 28 forecast hours when MWR data were assimilated, the root-mean-square error and the correlation coefficient were mainly improved up to about 18 forecast hours. When MWR data were assimilated, categorical scores on 0-18 h accumulations generally showed a deterioration for the FBIAS. The assimilation of MWR data generally degraded the ETS below 40 mm, but improved it above. A case of heavy precipitation illustrated to what extent the patterns of QPF vary when MWR data are assimilated or not, and revealed distinctive differences in these patterns compared to a radar quantitative precipitation estimate.
In two additional experiments based both on the control experiment and on the experiment assimilating MWR data, radiosonde data were not assimilated to avoid redundancy with nearby or colocated MWR stations. Generally, a more marked improvement was noticed when MWR data were assimilated. In particular, the ETS was improved for all thresholds, and even more markedly for larger thresholds. However, as when radiosonde data were assimilated, the results were mixed in terms of bias and frequency bias, which possibly points towards deficiencies in the model physics.
Overall, this study has demonstrated that: (i) MWR network data can be safely assimilated into convective-scale NWP systems, and (ii) the impact is generally neutral, although some improvements have been noticed on QPFs up to forecast ranges of 18 h and for larger rainfall accumulations.
The relatively low impact obtained in this study can be attributed to the following reasons. First, the network used in this study was quite sparse and inhomogeneous. For instance, the Eumetnet GPS water vapour programme (E-GVAP; http://egvap.dmi.dk/; accessed 11 July 2016) used in GPS ZTD data assimilation studies is much denser (e.g. Yan et al., 2009b). The MWR network was denser around the central Alps (more than half of the stations), but, given the typical meteorological situations associated with flash-flooding in the Mediterranean coastal areas, those stations probably did not strongly impact the forecasts of such events. It is expected that increasing the density of the MWR network would increase the impact. Similar conclusions were drawn from other data assimilation studies of sparse observations (e.g. Hamill et al., 2013). The novel use of scanning strategies for MWRs as proposed by Themens and Fabry (2014) might also be useful to get more impact. Also, many observations were assimilated in addition to the MWR data, including those from nearby or colocated radiosonde launch sites. As this study showed that more positive impact was obtained when MWR data were assimilated alone, the design of an MWR network should aim at complementing the existing operational radiosounding network to maximize its impact for data assimilation purposes. Second, this study has pointed out some errors in the retrieved humidity and temperature data. Removing these errors, and, more generally, improving the quality of the data should also help obtain more impact. The errors in MWR temperature and humidity products are partly caused by indispensable, yet not necessarily valid, assumptions in retrieval methods. The use of such assumptions can be avoided by using a different approach in which brightness temperature would be assimilated directly. However, this necessitates a radiative transfer model calibrated for such a use with its tangent-linear operator and its adjoint.
These tools were not yet available at the time of this study, but are under active development, in particular in the framework of the EU COST Action TOPROF (TOwards operational ground-based PROFiling with ceilometers, Doppler lidars and microwave radiometers for improving weather forecasts, http://www.toprof.eu/; accessed 11 July 2016), a continuation of EG-CLIMET (European Ground-based Observations of Essential Variables for Climate and Operational Meteorology; Illingworth et al., 2015), which aims at improving the quality of MWR data, among others. 1D-Var retrievals of temperature from MWR brightness temperature were found to outperform Arome very-short-term forecasts, and thus demonstrated the potential benefit of such data assimilation techniques (Martinet et al., 2015). In this study, the bias has been computed over the whole period under investigation and removed prior to the assimilation. Although the bias appeared quite constant during the period under study, such an approach would not be feasible in an operational context, and the bias should be monitored to make sure that it is actually constant. Finally, this work has focussed on the impact of MWR data on deep-convection events, but the assimilation of such data could be even more useful in other situations such as fog events.
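The static bias removal mentioned in this section can be sketched as follows in Python. The sketch assumes that the bias is estimated per vertical level as the mean departure between the MWR retrievals and a reference (for instance the model counterparts from the control run) over the whole period; the function name `remove_static_bias`, the array layout, and the sample values are hypothetical and do not describe the actual processing chain of this study.

```python
import numpy as np

def remove_static_bias(retrievals, reference):
    """Subtract a period-mean bias from MWR retrievals.

    retrievals, reference: arrays of shape (n_times, n_levels) holding the
    retrieved profiles and the collocated reference profiles (assumed here
    to be model counterparts; this choice is an assumption of the sketch).
    Returns (corrected_retrievals, bias_per_level).
    """
    r = np.asarray(retrievals, dtype=float)
    b = np.asarray(reference, dtype=float)

    # Mean departure per level over the whole period, ignoring missing data
    bias = np.nanmean(r - b, axis=0)
    return r - bias, bias

# Illustrative (made-up) temperatures in K: 2 times, 2 levels
corrected, bias = remove_static_bias([[281.0, 275.0], [283.0, 277.0]],
                                     [[280.0, 275.0], [282.0, 276.0]])
```

Such a scheme requires the full period to be known in advance, which is why it cannot be applied as such in an operational context. There, the bias estimate would have to be built from past data only and monitored over time.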