Variation of multimodality in rainfall drop size distribution with wind speeds and rain rates

In the coming years, there will be more usage of the millimetre/sub-millimetre frequencies due to congestion of the lower frequencies. At these frequencies, precipitation greatly affects the quality of service, by attenuating signals, hence the need for a thorough study and understanding in order to design mitigation techniques for improved signal quality. Previous studies modelled rainfall data using mostly unimodal statistical distributions, which may not fit multimodality encountered in the data. This paper looks at the prediction of the number of modes, given rain rates and wind speeds by looking at the occurrence of multimodality in rainfall data captured at Chilbolton Observatory, southern England from 2003 to 2009. From the drop size distributions, it develops a novel model based on the Gaussian mixture model. This enables the multimodal distributions observed by various researchers to be modelled. It provides expressions for calculating model parameters as a function of rain rate, R (mm/h) and wind speed,W (m/s). The model parameters include number of modes Nm, standard deviation σ1–σm of each mode along with corresponding means, μ1–μm. The study concludes that multimodality exists, and the average number of modes tends to increase with increasing wind speeds and rain rates.


Introduction
Modern communication systems rely heavily on frequencies above 10 GHz as the lower bands are now congested. These also provide for higher data rate and larger transmission bandwidths as well as reduced interference potential and smaller equipment sizes [1]. At these higher frequencies, precipitation, especially rain, plays a great role as raindrops absorb and scatter radiowave energy leading to a degradation in the quality of signals. The systems designers need a thorough understanding of the rainfall rate and drop size distribution (DSD) if they want to make optimal use of available bandwidth.
The designer needs a proper estimate of the attenuation as overestimating leads to a waste in resources while under-estimating may lead to system outages. To design reliable systems, the engineers need a reliable prediction of how precipitation will affect transmitted signals. Modelling of the rainfall DSD gives the designer a reliable prediction of the rain attenuation and allows providers to design mitigation techniques to counter attenuation due to rain events. Recent work suggests multimodal distributions may be needed to accurately model the rainfall DSD. This paper examines how the prevalence of multiple modes varies with weather parameters such as wind speed and rain rate, and presents a model for the prediction of the number of modes based on the rain rate and the wind speed.
As discussed in the next section, there seems to be evidence of multimodality in the captured rainfall data. Ekerete et al [2] propose using a Gaussian mixture model (GMM) with multiple modes as a possible multimodal model. However, they do not recommend a set number of modes and this paper investigates how the number of modes varies with wind speed and rain rate.

Standard statistical models
The rainfall DSD, N(D) [in cubic metres per millimetres (m −3 mm −1 )] is defined as the number of raindrops per unit volume per unit diameter centred on D (in mm). N(D) dD, expressed in m −3 , is the number of such drops per unit volume having diameters in the infinitesimal range (D-dD/2, D + dD/2) [2].
Several standard classical statistical distributions have been suggested as models that will best describe N(D) (the drop densities), given that the drop diameter is a continuous, non-negative quantity. Marshall and Palmer [3] suggested that N(D) can be represented as where D max is the biggest measured drop diameter, and that where Λ (in mm −1 ) is a function of the rainfall rate R (mm/h). Their data gave N 0 = 8000 mm −1 m −3 , in (1), and α = 4.1, β = −0.21 in (2). They however go on to state that the relationship fails for small rainfall drop diameters (D < 1.5 mm). Further studies however show that these small sized drop diameters do contribute to rain attenuation in the millimetre and submillimetre radio wave transmissions [4]; hence, some other DSD model is needed that accurately describes the relative count of these small drops.
Other later researchers modelled rainfall data using the lognormal distribution [5][6][7][8]. The lognormal distribution for the number of drops in a given volume is given in the general form where θ is the offset value and generally taken to be zero, and σ g , μ g are the geometric standard deviation and geometric mean, respectively. Ulbrich and Atlas [9] showed that a gamma distribution yielded better rainfall rate computations when combined with radar data. They used a gamma distribution described in the form given as with Λ, µ and N T as the slope, shape and scaling parameters, respectively, and these allow for the characterisation of a wide range of rainfall scenarios. The exponential distribution is a special case of the gamma distribution with µ = 0. Ulbrich and Atlas [9] do not actually claim the DSD is a gamma distribution, but simply that a gamma distribution yields more accurate rainfall rate computations. They accept that other distributions might serve equally well. It should be noted that these three standard models for DSDs are all unimodal. Whilst these standard unimodal distributions discussed may give accurate rainfall predictions derived from the DSD, studies show that these distributions do not give satisfactory (with 95% confidence) fits for the data [2]. There rises the need for a distribution that will cater for not just the unimodal data, but also for the multimodal data encountered in the data analysis.

Multimodality in the DSD
Multimodality has been observed in the DSD by previous workers [10][11][12]. Sauvageot and Koffi [11] attribute the presence of multimodality in DSDs to the overlapping of different rain shafts resulting from cloud volumes at different heights, and they also show that the number of peaks, N m , of a DSD depends on the rain rate variations, and not on the mean rain rate, but this was for rain with D i > 2 mm, where large N m are inversely related to values of the slope parameter, λ and with large values of the intercept, N 0 of the exponential distribution. Steiner and Waldvogel [10] also studied multimodal behaviours in DSDs and reported that these modes existed for different drop size diameters in convective rain regimes. Radhakrishna and Narayana Rao's [12] study indicated that the appearance of multimodal distributions in the DSDs are dependent on the height, and varies with different rain systems, with multimodal distributions frequently encountered in convective rain systems. They classified the rain systems as convection, stratiform and transition. This paper however classes rain regimes as light, moderate, heavy and very heavy.
McFarquhar [13] says that multimodal peaks are observed in computed models, but not systematically in physical data. Åsen and Gibbins [14] argue about the existence of multimodal peaks in observed rainfall, and say that the presence of multimodal peaks may be due to an error in the calibration of the bins. McFarquhar and List [15] recompute the bin boundaries based on the recalibration of the RD-69 disdrometer done at the Laboratory of Atmospheric Physics at the Eidgenössische Technische Hochschule (ETH) in Zurich, Switzerland [14]. However, Ekerete et al [16] showed that even with the ETH calibration there was still the presence of multimodality, and not just a consequence of the instrumental artefacts as it compared the Distromet ® (manufacturers of the disdrometer) calibration with the ETH's recalibration. This paper, however relies on the ETH recalibration of the bins boundaries.
Steiner and Waldvogel [10] define a frequency as a mode if '… the concentration of raindrops per unit volume and unit diameter interval of a given interval was significantly larger than the concentrations of the two neighbouring diameter intervals'. Sauvageot and Koffi [11] and Radhakrishna and Narayana Rao [12] similarly treat is the density of drops with diameter D i , and D i are the diameters measured by the disdrometer.
This definition, while valid overlooks the issue of sampling from a distribution. A random sample drawn from a distribution may be multimodal even though the underlying distribution is unimodal. This is just a consequence of the 'noisy' nature of the sample. We can reduce this effect by merging neighbouring bins.
This paper however employs a more robust method to determine the number of modes in a multimodal distribution by identifying each individual mode from the troughs surrounding it. N(D i ) is defined as a trough, when This ensures a steady rise to determine the beginning of the next cluster.
To establish that the reported multimodality in DSD is not merely an artefact of the impact disdrometer employed at Chilbolton, England, this paper also examined 1587 1 min DSDs recorded at Graz, Austria (47.1°N, 15.6°E) using a two-dimensional (2D) video disdrometer over the period 22-25th May 2015. This dataset was treated the same way as the Chilbolton dataset. Even when five contiguous bins were merged, the result showed that 440 (i.e. more than a quarter) of these DSDs were multimodal. A sample DSD from this dataset with a GMM fit is shown in Fig. 1.
3 Data and procedure used in this paper

Data collection
This paper utilised data captured by the RD-69 Joss-Waldgovel impact disdrometer located at the Chilbolton Observatory in southern England (51.1°N, 1.4°W) between April 2003 and December 2009. The disdrometer works by converting the vertical momentum of an impacting raindrop into an electrical pulse, and estimating the diameter of the raindrop from the amplitude of the pulse. The disdrometer has a surface area of 50 cm 2 and measures raindrop diameters from 0.3 to 5.1 mm in 127 gradations, or bins, sampling at 10 s intervals. The lower limits of the 127 size classes are distributed exponentially over the range of drop diameters and the accuracy rate of the readings is ±5% of measured drop diameter [17]. The 127th bin measures drops with diameters above a certain size, rather than within a small interval so is not used in this paper.
The volume drops distribution was estimated as [18,19] where at a discrete time instant t, D i is the central drop diameter of the ith channel, n i (t) is the total drop count in that channel, A is the exposed area of the disdrometer's sensor, Δt is the time interval, We however note that different workers [18,19,23] have used the equation Both equations are derived from the measurements of Gunn and Kinzer [24]. This paper uses (7a). The largest recorded drop diameter for the period under consideration was 4.79 mm, while the mean temperature was 9.7°C (proving it was not frozen). Our study shows that using (7b) as the reference, there is an 8.46% difference between both equations for the drop size range (0.3-4.79 mm) measured in this dataset. The difference is biggest with small drops with diameters below 0.6 mm and with large drops with diameters above 4.1 mm. Since most of our analysed data fall within the range of 0.6 and 4.0 mm, we may conclude that there is not much difference between the two equations for our dataset.

Possible errors
There are a number of possible sources of error in the JWD disdrometer measurement [25]. The effect of wind on an impact disdrometer is an issue, and this may affect both volume size and number concentration [25,26]. The horizontal component of the wind is not a problem as it does not directly affect fall velocity, but the vertical turbulence also increases with wind speed. The up-and down-draughts caused by turbulence will affect the actual fall speed of the particle, so the assumption that it is falling at terminal velocity is invalid. To some extent, the velocity differences will work to both increase and decrease the measured momentum, and hence the assigned size bin of the drops, but some errors are likely to remain. The effect will be worst for small drops which already have a small terminal velocity.
The authors considered the effect of large and small drops falling simultaneously on the disdrometer and the breaking of large drops into smaller drops by re-splashing after the first impact had been recorded, thereby increasing the count of smaller drops. Recall that the disdrometer converts the momentum of the drops to a predetermined voltage rating to determine the diameter of the drops. The smaller drop may be masked by the larger drop, depending on the time delay. The effect is likely to be negligible except at very high rain rates. The smaller secondary drops are also unlikely to be included in this analysis as they are not falling at terminal velocity and so would be assigned to a smaller bin than their true size.
'Dead time', the time it takes for the disdrometer to reset itself after the impact of a drop, may also be considered a problem with the disdrometer. The dead time is longer for larger drops. This effect will be more significant at higher rain rates. It would tend to produce a nearly constant error across the DSD, and therefore should not affect its multimodality.

Procedure
This paper aggregated six 10 s samples into a 1 min sample to achieve a larger sample size. This implicitly assumes that the underlying distribution is approximately stationary over a 1 min timescale. This is the approach taken by previous workers [18,19,27].
Furthermore, three adjoining bins were merged to smooth the data. Merging too many bins will result in the loss of the data's resolution. It should be noted that Radhakrishna and Narayana Rao [12] worked with a 5 min distribution interval, rather than our 3-bin merge within 1 min distribution intervals.  Given the unreliability of small diameter drops [2,28], even with more sophisticated instruments than the impact disdrometer [28], all bins with drop diameters <0.6 mm were eliminated from the data under consideration.
For each of the 1 min samples, the drop concentrations were derived using (6) with the terminal drop speed as in (7a). On the basis of the definition of a mode given in (5), the number of modes in each 1 min distribution was determined, with the maximum number of modes capped at four.
The paper determined the average number of modes amongst the different wind type and rain rates. The rain types were classed as light (with rain rates from the threshold 0.1 to 2.0 mm/h), moderate (2 to 10 mm/h) and the merge of heavy and very heavy rain due to very little data in the latter (rain rates above 10 mm/h). The wind speeds were classified from calm (<1 m/s) to severe gale (21-24 m/s) as shown in Table 1. The average of the number of modes in each delineated group was derived.
A correlation of the average rain rate and the average of the number of modes in each rain type for all wind speeds were computed, with a multiple linear regression derived to determine the relationship between the number of modes and the rain rates and wind speeds.
A multiple linear regression was then fitted between the variables, where each range of rainfall rates was represented by its average value.
For each 1 min DSD, the number of modes determined was taken as the observed number of modes, whereas the number of modes given by (9) was taken as the predicted number of modes. Each of the 1 min DSDs was fitted with a GMM with the number of clusters determined by the predicted and observed number of modes.
The probability density function for a GMM is of the form [2] where μ i and σ i are the mean and standard deviation of the ith mode, respectively, and the weights w i have the property k i=1 w i = 1. The GMM was modelled in the log domain. For each rain regime (i.e. light, moderate and heavy/very heavy) the average values of μ i , σ i and w i were computed.

Results and interpretations
On the basis of the data and methodology described in Section 3 above, the average number of modes was derived, and the results are as shown in Table 1. Fig. 2 shows a graphic of the same information.
Results show that the average number of modes for the entire data is 1.549, and that the number of modes tends to increase with increasing rain rates and wind speeds.
It is noteworthy that the result showed the lowest average as 1.424, slightly over unimodality, giving credence to the assertion that multimodality is prevalent in the data.
The multiple linear regression showed that given the rain rate, R, and the wind speed, W, the number of modes, N m , can be predicted as It should be pointed out that even though the computations in this paper assumes a non-integral number of modes, the practical applications would however require that the usable number of modes be rounded to the nearest integer. Using non-integral modes in this paper however ensures that much information is not lost by rounding the number of modes to the nearest integer. From the results as presented, a correlation was drawn between the rain rates and the number of modes as well as the wind speeds and the number of modes. Results show a strong correlation (0.9) each between the average number of modes and wind speeds as well as the rain rates.
Equation (9) shows the individual contribution of each of the quantities to the total number of modes. Fig. 3 shows the DSD for 26th July 2007 at 14:10 with a fit of the GMM using the observed and predicted number of modes. This figure shows an observation of one mode, whereas two modes are predicted from (9). The top panel shows the individual Gaussians that make up the GMM (one for observation and two for the prediction), whereas the bottom panel compares the GMMs for both observed and predicted number of modes.
To determine the agreement of the prediction with the observed data, a root mean square percentage error (rmspe) was computed between the observed number of modes and the predicted number where M p and M o represent predicted and observed number of modes, respectively, and N is the sample size. Table 2 shows the average values of μ i , σ i and w i in each of the rain rate regimes. It can be seen that the spread of cluster centres (i. e. the range from μ 1 to μ 4 , for example) increases with rain rate. Furthermore, the clusters are approximately equally weighted.
It should be noted that though Table 1 shows a trend of increasing number of modes with increasing rain rate, there were only 622   four-mode DSDs at high rain rate, which represents only ∼3% of the dataset, and may therefore be considered too small a sample size for a reliable conclusion to be drawn. On the basis of the results in Table 2, regression analyses were carried out to derive linear relationships between rain rate and the GMM model parameters μ i , σ i and w i . The resulting model equations are presented in Table 3.
The fundamental result here is that multimodality does occur significantly often, particularly at higher rain rates and increases with rain rates and wind speeds, and the number of modes can be predicted by (9). The set of equations given in Table 3 serves as a tool in the prediction of the basic parameters of the GMM model.

Conclusions
This paper has looked at statistical models with a need to modelling the rainfall DSDs for proper systems planning. In the measurement of the DSDs, multiple modes were observed, and since standard statistical models do not make room for these multiple modes, this paper has suggested the GMM, with a hope of providing better estimates for the DSDs. Here, modes are redefined, with a trough taken as the end of a mode, a departure from the traditional definition of a peak determining a mode.
Data used are from Chilbolton Observatory from 2003 to 2009, and multimodality was established as present, and not just an instrumental artefact. Bins were merged for better results, and the effect of the loss of resolution due to the merge minimised by the number of bins merged.
Whilst the limitation of the measuring instrument is acknowledged, the effect of wind on the disdrometer is an issue that needs to be explored further. This paper eliminated drop diameters of <0.6 mm, as the drop counts for these sizes may be unreliable.
The results show that the number of modes increases with both the wind speeds and rain rates, and a strong correlation (0.9) was found between the average number of modes and rain rates as well as between the average number of modes and wind speeds. The work goes on to define a model (9) determining the expected number of modes, given the rain rate and the wind speed. On the basis of these, the parameters μ, σ and weight of the GMM can be derived.