Uncertainty in Measured Raindrop Size Distributions from Four Types of Collocated Instruments

Sixteen instruments of four types (2D-video disdrometer, 2DVD; precipitation occurrence sensor system, POSS; micro-rain radar, MRR; and Joss–Waldvogel disdrometer, JWD) were collocated within a square area of 400 m² from 16 April to 8 May 2008 for an intercomparison of the drop size distribution (DSD) of rain. This unique dataset was used to study the inherent measurement uncertainty arising from the different measuring principles and sampling sizes of the four types of instruments. The DSD intercomparison shows generally good agreement among them, except that the POSS and MRR had higher concentrations of small raindrops (<1.0 mm) and offered a better chance to observe big raindrops (>5.2 mm). The measurement uncertainty (σ) was obtained quantitatively after considering zero or non-zero measurement error covariance between two instruments of the same type. The results indicate that the measurement uncertainties are neither independent nor identical among instruments of the same type. The MRR is relatively accurate (lower σ) owing to its large sampling volume and accurate measurement of the Doppler power spectrum. The JWD is the least accurate owing to its small sampling volume. The σ decreases rapidly with increasing time-averaging window. The 2DVD shows the best accuracy for R at longer averaging times, but this is not true for Z because of its small sampling volume. The MRR outperformed the other instruments for Z over the entire range of averaging times owing to its measuring principle.

However, DSD suffers from instrumental uncertainty, and significant discrepancies can exist between DSDs obtained from different types of disdrometers. Joss and Waldvogel [22] and Lee and

Table 1. Identification number (ID), type (affiliation, code name), rainy minutes, time lag (positive values indicate the data are behind 2DVD), and highest value of correlation coefficient to 2DVD for the 16 disdrometers' network data. NCU represents the National Central University, Taiwan. McGill represents McGill University, Canada. PKNU represents Pukyung National University, Korea. ECCC represents Environment and Climate Change Canada. CCU represents the Chinese Culture University, Taiwan. KNU represents the Kyungpook National University, Korea.

The data were visually examined to remove suspicious data. The time offset represents the time difference of each disdrometer relative to the 2D-video disdrometer, and a positive value represents ahead of the 2D-video disdrometer. However, we admit the site is not ideal and some site-dependent issues may remain. Thus, the biases of each instrument were further removed (described later).
The intercomparison period fell in the late spring and early Mei-Yu seasons [17,34]. The climatology shows that this period lies within the rainy season in Northern Taiwan and precedes the first rainfall peak in June (Figure 2a). The average daily rainfall is about 6 mm day⁻¹ [34]. The DSD characteristics can vary significantly during this period (e.g., the standard deviation of the characteristic diameter of the DSD is 0.42 mm) [17]. About 1900-3600 one-minute DSDs were collected (Table 1), and the total rainfall accumulation was about 110 mm, which is smaller than the monthly mean rainfall amount during the experimental period (see Figure 2a). The distributions of rainfall rate and mass-weighted diameter (Dm) from the measured DSDs of the sixteen disdrometers show sufficiently wide ranges of rainfall rate and DSD characteristics for intercomparison (Figure 2b,c).
The POSS is a bistatic, X-band continuous-wave radar that was originally developed to detect present weather at airports [32]. It measures an average Doppler power spectrum every minute and inverts it into a DSD in 34 diameter bins, whose mean diameters increase from 0.34 to 5.34 mm. The mean diameter (mm), diameter interval (mm), fall velocity (m s⁻¹), and sampling volume (m³ s⁻¹) for the 34 bins are listed in Table 2. However, the measured Doppler spectrum is not the same as that from a vertically pointing monostatic pulse Doppler radar. Because of the bistatic principle and the slanted beam axes of the transmitting and receiving antennas, the POSS measures the velocity component perpendicular to the elliptical surfaces of constant phase, whose two focal points are the centers of the receiver and transmitter. This hardware setting has a significant advantage in its large sampling volume, which provides better measurements of bigger drops. However, the proper inversion of the power spectrum into a DSD requires accurate measurement of the beam pattern. The beam pattern is measured by collecting power levels at 91 points while dropping a single water drop in the laboratory as part of the standard calibration procedure. Sheppard and Joe [31] and Campos and Zawadzki [35] provided a detailed validation of the POSS as a disdrometer by comparing it with other instruments.
The micro rain radar (MRR) is a vertically pointing, low-cost, frequency-modulated continuous-wave (FMCW) Doppler radar [36] that provides vertical profiles of reflectivity and mean Doppler velocity. The wavelength is 12.5 mm (24 GHz). The 3-dB beam width is 3°. The DSD from the manufacturer's product is given in 46 bins. The corresponding fall velocities (v_i) of the bins vary from 0.7 m s⁻¹ to 9.3 m s⁻¹ with a fixed interval of 0.2 m s⁻¹. The fall velocity is converted into diameter with the fall velocity-diameter relation of Atlas [37] and Gunn and Kinzer [38], in which H represents the height of the retrieved DSD in meters. The MRR was operated with a vertical resolution of 10 m. The DSD data at the 3rd gate (30 m) were used to represent the DSD at the lowest height while avoiding ground clutter contamination. Since the MRR directly measures the Doppler power spectrum in the vertical direction, the inversion into DSDs is more accurate, and the sampling volume is relatively large compared with those of the JWD and 2DVD. However, the MRR is affected by precipitation attenuation because of its short wavelength (1.25 cm). No site-specific calibration is applied.

Table 2. Corresponding mean diameter D_i (mm), diameter interval ∆D_i (mm), fall velocity v_i (m s⁻¹), and sampling volume V_i (m³ s⁻¹) for 34 channels of precipitation occurrence sensor system (POSS).

The JWD is an impact-type device and the first automatic disdrometer. It measures the impact of falling drops and converts it into an electric voltage and then into a drop size. The JWD has a fixed sampling cross-sectional area of 50 cm² and 20 size bins ranging from 0.35 to about 5.37 mm (Table 3). The diameter intervals (∆D) increase with drop size from 0.1 mm at 0.35 mm to about 0.5 mm at 4.8 mm. The calibration of each unit determines the exact bin boundaries [39]. The mean diameter D_i (mm), diameter interval ∆D_i (mm), and fall velocity v_i (m s⁻¹) are listed in Table 3. The sampling volume is about three orders of magnitude smaller than that of the POSS, resulting in some deficiency in measuring bigger drops. The JWD is calibrated by dropping known-sized drops [35]. It measures the drop size with about 5% accuracy [25]. The JWD cannot measure the size distribution of snow and is affected by wind. In addition, the so-called dead time is a significant shortcoming, because the instrument loses its measuring capability while responding to a falling drop. The JWD also cannot separate two simultaneously falling drops.

Table 3. Mean diameter D_i (mm), diameter interval ∆D_i (mm), fall velocity v_i (m s⁻¹), and sampling volume V_i (m³ s⁻¹) for 20 channels of JWD disdrometer.

The 2DVD is an electronic optical device equipped with two fast line-scan charge-coupled devices (CCDs) [28]. It descends from earlier photographic methods, in which taking pictures of falling precipitation particles yields more precise information on size and shape. A similar instrument is the hydrometeor size detector (HSD), and later came the hydrometeor velocity and size detector (HVSD) [40]. Unlike the HVSD, the 2DVD has two perpendicular line CCDs that are separated by about 0.6 cm and provide two digital images from which size, shape, and axis ratio are derived in two dimensions. The two images are matched to derive the fall velocity of individual particles and to correct shape distortion independently. The DSD data from the 2DVD were divided into 100 bins with a 0.1 mm diameter interval in the range of mean diameter of 0.1-9.9 mm. Although the sampling volume of the 2DVD is small, it measures the exact shape and size of precipitation particles directly. However, stable operation is sometimes problematic because of frequent changes in the light paths and in the performance of the optical elements. The calibration is done by dropping metal balls of different sizes. Thus, the particle size and the distance between the perpendicular planes are calibrated.

Preprocessing of Disdrometer Data
Standard preprocessing procedures for the four types of disdrometer are introduced in this section. For the JWD-measured DSDs, the dead-time correction suggested by Sheppard and Joe [31] was applied to the one-minute data. For the 2DVD data, the over/under sampling of small raindrops due to wind-induced turbulence near the optical camera [41] and the contamination by break-up and splashes were filtered as suggested by Kruger and Krajewski [42]. The sampling error can be removed by a fall velocity-based filter as follows:

$$ \left| V_{\mathrm{measured}} - V_{\mathrm{ideal}} \right| < C \, V_{\mathrm{ideal}} $$

where V_measured represents the observed fall velocity, V_ideal represents the ideal fall velocity derived from the quartic polynomial formula at a given observed raindrop size [43], and the coefficient C was set to 0.4 [2,42,44]. Because small raindrops are the most strongly influenced by the airflow around the instruments, most of the disdrometers had a lowest detectable drop size of around 0.35 mm [14,30,45,46], except 0.01 mm for the 2DVD. The DSD data with drop sizes less than 0.35 mm were removed for all disdrometers to minimize the noise effect in smaller raindrops. No limitation was applied to large raindrops. The one-minute DSD data with rainfall rates less than 0.1 mm h⁻¹ were not considered in the analysis for any disdrometer.
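The fall velocity-based filter can be sketched as follows. This is a minimal illustration, assuming that a drop passes the filter when its measured velocity lies within a fraction C of the ideal velocity. The quartic polynomial of [43] is not reproduced in the text, so the exponential fit of Atlas et al. (1973) is used here as a stand-in for V_ideal; the function names are hypothetical.

```python
import numpy as np

def ideal_fall_velocity(d_mm):
    """Terminal fall velocity (m/s) for drop diameter in mm.

    Stand-in only: the Atlas et al. (1973) exponential fit, not the
    quartic polynomial cited in the text.
    """
    d_mm = np.asarray(d_mm, dtype=float)
    return 9.65 - 10.3 * np.exp(-0.6 * d_mm)

def velocity_filter(d_mm, v_measured, c=0.4):
    """Boolean mask, True where |v_measured - v_ideal| < c * v_ideal."""
    v_ideal = ideal_fall_velocity(d_mm)
    v_measured = np.asarray(v_measured, dtype=float)
    return np.abs(v_measured - v_ideal) < c * v_ideal

# Three example drops: the second one falls far too slowly for its size
# (as a splash fragment might) and is flagged for removal.
diameters = np.array([1.0, 2.0, 3.0])   # mm
velocities = np.array([4.0, 1.0, 8.0])  # m/s
keep = velocity_filter(diameters, velocities)
```

Only the drops with `keep == True` would then contribute to the one-minute DSD.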
After the standard processing of each disdrometer, the time lags among the disdrometers due to synchronization issues were also investigated. The rainfall rates of the POSS, MRR, and JWD units were cross-correlated with the rainfall rate of the 2DVD at different lag times (±5 min). The time lag was taken as the lag that gave the highest cross-correlation. The time lags and the highest cross-correlations are listed in Table 1. The results show that the time lags were all within one minute. The following analysis of the DSD data was performed after applying the derived time lag correction.
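The lag search described above can be illustrated with a short sketch. It assumes two one-minute rainfall rate series of equal length on a common time axis; the function name and the sign convention (a positive lag means the second series is delayed relative to the reference) are assumptions made for this example.

```python
import numpy as np

def best_lag(ref, other, max_lag=5):
    """Return (lag, correlation) that maximizes corr(ref[t], other[t + lag]).

    Lags are in samples (minutes for one-minute data). A positive lag
    means `other` is delayed relative to `ref`.
    """
    ref = np.asarray(ref, dtype=float)
    other = np.asarray(other, dtype=float)
    best = (0, -np.inf)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = ref[: len(ref) - lag], other[lag:]
        else:
            a, b = ref[-lag:], other[: len(other) + lag]
        r = np.corrcoef(a, b)[0, 1]
        if r > best[1]:
            best = (lag, r)
    return best
```

Once the lag is known, the delayed series can be shifted by that number of minutes before any pairwise comparison.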
Considering that the sixteen disdrometers had different technical issues (mainly computer failures), all the data were visually examined to remove suspicious data. Rainy periods were carefully determined by manually examining the time series of rainfall intensity from the sixteen collocated disdrometers. The small numbers of DSD measurements from M3 and M4 are due to changes in the vertical sampling resolution; data from M3 and M4 were not available from 1 May to 8 May. The total numbers of one-minute rainy DSDs are listed in Table 1 and were used in the following analysis.

Methodology of Estimating Measurement Bias and Uncertainty
The rainfall rate (R, mm h⁻¹) and the reflectivity (Z, mm⁶ m⁻³) calculated from the DSDs were investigated to understand the disdrometric measurement uncertainty. R and Z were calculated using the following equations:

$$ R = 6\pi \times 10^{-4} \int v(D)\, D^{3}\, N(D)\, dD $$

$$ Z = \int D^{6}\, N(D)\, dD $$

where D is the drop diameter (mm), v(D) represents the fall velocity (m s⁻¹) of the raindrops according to Gunn and Kinzer [38], and N(D) represents the number concentration of the raindrops (m⁻³ mm⁻¹). dBR (10 log₁₀ R) and dBZ (10 log₁₀ Z) were used instead of their linear values. Hereafter, R and Z represent dBR and dBZ. A given measurement (M_i^obs, e.g., R or Z) from a disdrometer unit "i" can be decomposed into the true value (M^true), bias (ε_i^bias), and measurement noise (ε_i^noise) as follows:

$$ M_i^{\mathrm{obs}} = M^{\mathrm{true}} + \varepsilon_i^{\mathrm{bias}} + \varepsilon_i^{\mathrm{noise}} $$

In this study, all sixteen disdrometers were collocated (as shown in Figure 1) and operated synchronously during the intercomparison experiment. The natural variability of the DSD within this fine-scale network is postulated to be sufficiently small [20,21] and was therefore assumed negligible. Hence, the sixteen disdrometers were considered collocated and the true value (M^true) should ideally be identical, so that the discrepancy among measurements comes from two major sources: the bias (ε_i^bias) between different disdrometers and the instrumental measurement noise (ε_i^noise).
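As a worked illustration of the R and Z calculation, the sketch below evaluates both moments for a binned DSD. It assumes diameters in mm, N(D) in m⁻³ mm⁻¹, and uses the Atlas et al. (1973) exponential fit as a stand-in for the Gunn and Kinzer fall velocities; the function names are hypothetical.

```python
import numpy as np

def fall_velocity(d_mm):
    """Fall velocity (m/s): Atlas et al. (1973) fit, used here as a stand-in."""
    return 9.65 - 10.3 * np.exp(-0.6 * np.asarray(d_mm, dtype=float))

def rain_rate_and_reflectivity(d_mm, dd_mm, n_d):
    """Rainfall rate (mm/h) and reflectivity (mm^6 m^-3) from a binned DSD.

    d_mm: bin mean diameters (mm); dd_mm: bin widths (mm);
    n_d: number concentration N(D) (m^-3 mm^-1).
    """
    d_mm = np.asarray(d_mm, dtype=float)
    dd_mm = np.asarray(dd_mm, dtype=float)
    n_d = np.asarray(n_d, dtype=float)
    r = 6.0e-4 * np.pi * np.sum(fall_velocity(d_mm) * d_mm**3 * n_d * dd_mm)
    z = np.sum(d_mm**6 * n_d * dd_mm)
    return r, z

def to_db(x):
    """Convert a linear quantity to decibels (10 log10 x)."""
    return 10.0 * np.log10(x)

# Single 1-mm-wide bin centered at 2 mm holding 100 drops m^-3 mm^-1.
r_lin, z_lin = rain_rate_and_reflectivity([2.0], [1.0], [100.0])
dbr, dbz = to_db(r_lin), to_db(z_lin)
```

The factor 6π × 10⁻⁴ follows from converting the volume flux (π/6) D³ v N from mm³ m s⁻¹ m⁻³ to mm h⁻¹.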

Bias Estimation and Correction
Because all sixteen disdrometers were considered collocated in this study, the biases between disdrometers could be obtained and removed by comparison with a reference. The bias values of individual disdrometers were obtained in two steps. First, the reference disdrometer was selected by comparison with data from a nearby tipping bucket rain gauge that is maintained and calibrated regularly by the Central Weather Bureau (CWB). The size of the tip was 0.5 mm. Owing to the limitations of tipping bucket measurements, the automatic weather station (AWS) has lower reliability for one-minute rainfall rate [47] and provides no DSD measurements for calculating Z. Second, the bias values of each disdrometer were therefore obtained by comparison with the selected reference disdrometer. The bias is calculated as:

$$ \mathrm{bias} = E\left[ X - X^{\mathrm{ref}} \right] $$

where E represents the expected value, X represents either R or Z from the one-minute and hourly averaged DSDs, and the superscript ref indicates the reference values.

Remote Sens. 2020, 12, 1167 8 of 23

The differences among disdrometers can thus be diminished by removing their biases. The biases were applied to the DSDs of the different disdrometers through Equation (8), in which the one-minute DSD is corrected by a multiplication factor (bias_ID) for each disdrometer. Instead of a diameter-dependent correction factor [26], a single multiplication factor was applied to the entire DSD. This procedure removes the possible bias in DSD moments but preserves the shape of the DSDs.
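A minimal sketch of the two bias steps follows. It assumes that the bias is the mean difference in dB between a disdrometer and the reference, and that the correction multiplies N(D) by 10^(−bias/10) so that every DSD moment shifts by the same number of decibels. The exact form of the multiplication factor in Equation (8) is not reproduced here, so this form and the function names are assumptions.

```python
import numpy as np

def mean_bias_db(x_db, x_ref_db):
    """Mean difference (dB) between a disdrometer and the reference."""
    diff = np.asarray(x_db, dtype=float) - np.asarray(x_ref_db, dtype=float)
    return float(np.mean(diff))

def correct_dsd(n_d, bias_db):
    """Scale the whole DSD by one factor; the DSD shape is unchanged."""
    return np.asarray(n_d, dtype=float) * 10.0 ** (-bias_db / 10.0)

# A unit reading 1 dB low relative to the reference.
bias = mean_bias_db([9.0, 19.0], [10.0, 20.0])
original = np.array([100.0, 10.0])          # N(D) in two bins
corrected = correct_dsd(original, bias)
```

Because the same factor multiplies every bin, both R and Z computed from `corrected` increase by 1 dB while the ratio between bins stays the same.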

Estimation of Measurement Noise
The instrumental measurement noise arises from the distinct measuring principle of each disdrometer. For example, JWD measurements carry the uncertainty of over- or under-correction of the dead-time effect caused by the recovery time of the system [31]. The 2DVD suffers from over- or under-sampling of small raindrops due to wind-induced turbulence near the optical camera [41]. Both the POSS [32] and the MRR [48] suffer from an unknown degree of wet radome/antenna attenuation. In addition, POSS and MRR measurements contain noise due to the limited number of samples of the electromagnetic signal. Moreover, different splashing effects due to the distinct layout of each disdrometer cause a variety of error characteristics. Finally, the representativeness difference among instruments due to their different sampling volumes (from 10⁻² to 10³ m³ s⁻¹) was considered part of the measurement noise. The separation of instrumental measurement noise and representativeness difference is not addressed explicitly in this study; for simplicity, the measurement noise (ε_i^noise) also includes the representativeness difference in the following analysis. With bias-corrected measurements (hereafter, M^obs denotes the bias-corrected measurement), the variance of the measurement difference (σ²_{Mi−Mj}) can be estimated quantitatively by taking the difference between two collocated measurements, which removes the natural DSD variability:

$$ \sigma^2_{M_i - M_j} = \mathrm{var}\left( M_i^{\mathrm{obs}} - M_j^{\mathrm{obs}} \right) = \mathrm{var}\left( \varepsilon_i^{\mathrm{noise}} - \varepsilon_j^{\mathrm{noise}} \right) \quad (9) $$

where var indicates the variance and the subscripts "i" and "j" represent different disdrometers. Equation (9) can be expanded as follows:

$$ \sigma^2_{M_i - M_j} = \mathrm{var}\left( \varepsilon_i^{\mathrm{noise}} \right) + \mathrm{var}\left( \varepsilon_j^{\mathrm{noise}} \right) - 2\, \mathrm{cov}\left( \varepsilon_i^{\mathrm{noise}}, \varepsilon_j^{\mathrm{noise}} \right) \quad (10) $$

The first two terms on the right-hand side are the variances of the measurement noise. The third term contains the covariance between the measurement noises of disdrometers i and j.
The measurements from the sixteen disdrometers were collected synchronously, and the bias of each disdrometer was corrected by comparison with the reference measurement. Since the multiplicative bias factor is applied only to the measured DSD, the measurement noise estimated below is nearly independent of the choice of reference.
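The role of the covariance term in the variance of the measurement difference can be checked numerically. The sketch below uses synthetic data only: two units share a common "true" signal and a common noise component, and all standard deviations are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
truth = rng.normal(10.0, 5.0, n)    # shared true signal (dB), cancels in the difference
shared = rng.normal(0.0, 0.5, n)    # noise component common to both units
m_i = truth + shared + rng.normal(0.0, 0.8, n)
m_j = truth + shared + rng.normal(0.0, 0.6, n)

eps_i = m_i - truth                 # total noise of unit i
eps_j = m_j - truth                 # total noise of unit j

# Variance of the measurement difference, computed directly and from
# the two noise variances minus twice their covariance.
lhs = np.var(m_i - m_j)
rhs = np.var(eps_i) + np.var(eps_j) - 2.0 * np.cov(eps_i, eps_j, ddof=0)[0, 1]

# Ignoring the covariance overstates the difference variance, so a noise
# estimate built from the difference alone understates the true noise.
naive = np.var(eps_i) + np.var(eps_j)
```

Here `lhs` and `rhs` agree to rounding error, while `naive` is larger because the shared component is positively correlated between the two units.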

Standard Deviation of Measurement Noise Estimated from Paired Measurements (σ_PM)
The standard deviation of measurement noise estimated from paired measurements (PM) has been broadly used in various works [6,12,24,49-52]. It assumes that the instrumental measurement noise is random white noise and is independent between two disdrometers of the same type (i.e., cov(ε_i^noise, ε_j^noise) = 0). Thus, the third term on the right-hand side of Equation (10) is negligible. Furthermore, assuming the measurement noise is identical for "i" and "j", two disdrometers of the same type, the standard deviation of the measurement noise (σ_PM) of disdrometers "i" and "j" can be derived from the relationship:

$$ \sigma_{\mathrm{PM}} = \sqrt{ \sigma^2_{M_i - M_j} / 2 } \quad (11) $$

In this unique intercomparison dataset, there are 10 combinations of paired measurements from the 5 collocated POSS units, and likewise for the MRR and JWD. The assumption of identical noise for disdrometers of the same type is therefore no longer necessary. A modified technique, the measurement noise from "multiple" paired measurements (MPM), is proposed in this study. The variances of measurement noise (σ²_MPM) and the variances of measurement difference (σ²_{Mi−Mj}) between pairs can be written in matrix form:

$$ C = F\,V, \qquad C = \left[ \sigma^2_{M_1 - M_2},\ \sigma^2_{M_1 - M_3},\ \ldots,\ \sigma^2_{M_4 - M_5} \right]^{T}, \qquad V = \left[ \sigma^2_{\mathrm{MPM},1},\ \ldots,\ \sigma^2_{\mathrm{MPM},5} \right]^{T} $$

where C represents the matrix of the variances of the measurement difference (σ²_{Mi−Mj}) for the 10 combinations, F indicates these pairs (the row for pair (i, j) contains ones in columns i and j and zeros elsewhere), and V is the variance of measurement noise of the individual disdrometers. The subscripts "1" to "5" of V and C represent the identification numbers (ID) of the disdrometers as listed in Table 1. Subsequently, the variances of measurement noise (σ²_MPM) of the individual disdrometers can be obtained by least-squares fitting.
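The PM and MPM estimates can be sketched in a few lines. The example assumes zero noise covariance between units, builds a pair-indicator matrix with ones in the two columns of each pair, and solves for the per-unit variances with ordinary least squares; the function names and array layout are hypothetical.

```python
import itertools
import numpy as np

def sigma_pm(m_i, m_j):
    """Paired-measurement noise: sqrt(var(M_i - M_j) / 2), identical noise assumed."""
    diff = np.asarray(m_i, dtype=float) - np.asarray(m_j, dtype=float)
    return float(np.sqrt(np.var(diff) / 2.0))

def mpm_variances(measurements):
    """Per-unit noise variances from all pairs of same-type units.

    measurements: array of shape (n_units, n_times) holding bias-corrected
    values in dB. Returns (variances, pairs).
    """
    m = np.asarray(measurements, dtype=float)
    n_units = m.shape[0]
    pairs = list(itertools.combinations(range(n_units), 2))
    f = np.zeros((len(pairs), n_units))   # pair-indicator matrix
    c = np.empty(len(pairs))              # variances of pair differences
    for row, (i, j) in enumerate(pairs):
        f[row, i] = f[row, j] = 1.0
        c[row] = np.var(m[i] - m[j])
    v, *_ = np.linalg.lstsq(f, c, rcond=None)
    return v, pairs
```

With five units the system has 10 equations and 5 unknowns, so the least-squares fit is overdetermined and each unit receives its own noise estimate.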

Standard Deviation of Measurement Noise from Paired Cross-Type Measurements (σ_CTM)
It should be noted that σ²_PM and σ²_MPM are calculated from disdrometers of the same type, whose measurement noises were assumed to be independent of each other. However, disdrometers with the same measuring principle would most likely over- or underestimate the DSD simultaneously under different environmental conditions (e.g., wind direction, rain intensity). It is therefore probable that the measurement noises (ε^noise) are correlated among disdrometers of the same type, so that the covariance of measurement noise (cov(ε_i^noise, ε_j^noise)) is non-zero. Consequently, the variance of the measurement difference (σ²_{Mi−Mj}) in Equation (10) is no longer the sum of the two noise variances. The covariance of measurement noise was assumed to be reduced by choosing two disdrometers of different types, because of their different measuring principles. A modified MPM, namely the measurement noise estimated from paired "cross-type" measurements (CTM), is proposed. The measurements from different types of disdrometers are purposely chosen in Equations (12)-(15). The CTM reduces the potential correlation between measurement noises in Equation (10) by including only "cross-type" pairs of disdrometers. The sixteen available disdrometers provide 90 cross-type sensor pairs. In contrast to the conventional approaches (i.e., PM and MPM), the CTM should estimate the instrumental uncertainty more realistically and accurately.
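The cross-type pairing can be illustrated with the following sketch, which enumerates the pairs and solves the same kind of least-squares system using cross-type pairs only. The unit labels P1-P5, M1-M5, and J1-J5 follow the text; the label "V1" for the single 2DVD, the function names, and the dictionary layout are assumptions made for this example.

```python
import itertools
import numpy as np

# Map each unit label to its instrument type ("V1" is a hypothetical label).
UNIT_TYPES = {"V1": "2DVD"}
for prefix, kind in (("P", "POSS"), ("M", "MRR"), ("J", "JWD")):
    for k in range(1, 6):
        UNIT_TYPES[f"{prefix}{k}"] = kind

def cross_type_pairs(unit_types):
    """All unordered pairs of units whose instrument types differ."""
    return [(a, b) for a, b in itertools.combinations(unit_types, 2)
            if unit_types[a] != unit_types[b]]

def ctm_variances(measurements, unit_types):
    """Least-squares noise variance per unit from cross-type pairs only.

    measurements: dict mapping unit label -> 1-D array of bias-corrected
    values in dB, all on the same time axis.
    """
    units = list(unit_types)
    index = {u: k for k, u in enumerate(units)}
    pairs = cross_type_pairs(unit_types)
    f = np.zeros((len(pairs), len(units)))
    c = np.empty(len(pairs))
    for row, (a, b) in enumerate(pairs):
        f[row, index[a]] = f[row, index[b]] = 1.0
        c[row] = np.var(measurements[a] - measurements[b])
    v, *_ = np.linalg.lstsq(f, c, rcond=None)
    return dict(zip(units, v))

PAIRS = cross_type_pairs(UNIT_TYPES)  # 120 possible pairs minus 30 same-type pairs
```

Of the 120 possible pairs among sixteen units, 30 are same-type pairs (10 each for POSS, MRR, and JWD), which leaves the 90 cross-type pairs quoted in the text. If units of one type share a common noise component, that component no longer cancels in a cross-type difference, so it is counted in the estimated variance instead of being hidden.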

Comparison of R and Z
The hourly R from the Gong-Guan automatic weather station (AWS), located at 25.0164°N, 121.5311°E (304 m away from the disdrometers), is compared to the R from the hourly mean DSD of each type of disdrometer. In Figure 3, pronounced underestimation by the POSS and MRR can be noted. The average mean biases were −1.17 and −1.15 dB for POSS and MRR, respectively. The underestimation may be attributed to a wet radome, the path attenuation effect, and calibration issues. On the other hand, the 2DVD and the JWD disdrometers showed good agreement with the AWS, and the average mean biases were low, −0.20 and 0.04 dB for 2DVD and JWD, respectively. Although the 2DVD exhibited inherent measuring issues (e.g., oversampling of small raindrops due to wind-induced turbulence and splashes near the optical camera), it directly measured raindrop sizes and concentrations with digital images. The 2DVD was consequently selected as the reference disdrometer. Selecting another disdrometer as the reference would change the amounts of bias but would not affect the measurement noise.
The hourly R and Z calculated from the hourly mean DSDs were compared between the 2DVD and the other disdrometers. The biases of hourly R were about −0.74 to −1.52 dB for POSS, −0.31 to −2.90 dB for MRR, and 0.33 to −0.03 dB for JWD (Figure 4). The biases of hourly Z were about −0.77 to −1.71 dB for POSS, −0.82 to −3.44 dB for MRR, and 0.47 to −0.15 dB for JWD (Figure 5). The mean biases varied among the different types of disdrometer and among different disdrometers of the same type (Table 4). The biases of one-minute R and Z were also calculated, as listed in Table 4.
The results show that the biases were slightly different for different parameters (R and Z) and averaging windows (hourly and one-minute). However, their trends were consistent. For example, M3 had the largest biases, of about −2.9 to −3.5 dB. Thus, the mean biases for individual disdrometers (except 2DVD) were determined by averaging the one-minute and hourly R and Z biases (summarized in Table 4). The underestimation was noticeable for the POSS and MRR, but no significant bias was evident for the JWD, as shown in the comparison with the gauge.

Averaged DSD
To further investigate the possible bias of the observed DSDs from different types of disdrometer, the averaged DSDs of each type were derived over the same observation period (Figure 6a). In general, the shapes of the DSDs from the different types of disdrometer showed good agreement except for the smaller (<1.2 mm) and larger (>3.6 mm) raindrops. The average DSDs from POSS and MRR showed higher concentrations of small raindrops (<0.6 mm), while those from 2DVD and JWD showed lower concentrations with decreasing size. The 2DVD tended to underestimate the concentration of raindrops smaller than 0.6 mm, as reported by [14,30]. Meanwhile, POSS and MRR measured bigger raindrops (>5.2 mm) owing to their larger sampling volumes. Both are radar-based instruments with sampling volumes three orders of magnitude larger than those of 2DVD and JWD, and therefore offer better opportunities to collect bigger raindrops [53].

Bias Correction
The mean biases (Table 4) were then applied through Equation (8) to the one-minute DSDs of each disdrometer. The mean DSDs were derived again after applying the bias correction (Figure 6b). The bias-corrected DSDs show better agreement in the size range D = 0.8-3.6 mm. The averaged DSDs are nearly identical at sizes of 1 to 3 mm. The underestimation at sizes of 1.4 to 3.8 mm by POSS and MRR was removed. However, the discrepancy for small (<1.1 mm) and big (>3.8 mm) raindrops remained. This inherent characteristic of the DSDs measured by different disdrometers can be attributed to their different measuring principles. For example, the JWD underestimates the concentration of raindrops larger than 4.2 mm, probably because of its small sampling size. Insufficient "dead-time" correction may cause low number concentrations at smaller sizes [31].

Measurement Uncertainty
All the discrepancies of DSD among disdrometers have been minimized by taking the 2DVD as the reference and applying the multiplication factor correction in Equation (8). The remaining discrepancies, i.e., the instrumental measurement noise due to the diverse measuring principles of the individual disdrometers, are analyzed quantitatively with the proposed algorithm introduced in Section 3. The standard deviation of measurement noise in logarithmic scale (σ[P(dB)]) can be expressed in linear scale (σ(P)) following Bringi and Chandrasekar [9].
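For orientation, a first-order conversion between the two scales can be sketched as below. It follows from differentiating P(dB) = 10 log₁₀ P, which gives σ(P)/P ≈ (ln 10 / 10) σ[P(dB)] for small fluctuations. This is a generic approximation and may differ from the exact expression in [9]; the function name is hypothetical.

```python
import math

def fractional_std_from_db(sigma_db):
    """First-order fractional standard deviation sigma(P)/P for a given sigma in dB."""
    return math.log(10.0) / 10.0 * sigma_db

# A noise level of 1 dB corresponds to roughly 23% of the mean in linear units.
one_db_fraction = fractional_std_from_db(1.0)
```

The approximation degrades as σ[P(dB)] grows, because the logarithm is then no longer well described by its tangent.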

Measurement Uncertainties of R and Z from Paired Measurements (σ_PM) and Multiple Paired Measurements (σ_MPM)
The measurement noise was first assumed to be independent and identical (the PM method) to quantify the measurement noise (σ_PM) of R from the POSS, MRR, and JWD. The values of σ_PM from the various pairs of POSS are shown at the top of each panel in Figure 7a-j together with the correlation coefficient (ρ). The values of ρ were between 0.94 and 0.98. The values of σ_PM varied from 0.68 to 1.28 dB. The lowest value of σ_PM was from the pair of P3 and P4 (0.68 dB in Figure 7f). The highest value was from the pair of P1 and P4 (1.28 dB in Figure 7d). The degree of scatter of the one-minute R from the 5 POSSs also varied among pairs; larger scatter indicates a higher value of σ_PM. The pairs (P1, P2), (P1, P3), and (P1, P4) showed relatively more scatter than the pair (P3, P4). The varying values of σ_PM contradict the assumption that the measurement noises were identical in Equations (9)-(10). Consequently, the σ_MPM of R for each disdrometer was calculated assuming non-identical measurement noises (the MPM method) and is shown in Figure 7k. The values of σ_MPM varied slightly among the POSS units. P1 had the highest value, followed by P2. P3, P4, and P5 showed the lowest. The result is consistent with the previous analysis, in which the three highest values of σ_PM are associated with P1.

The same analysis was applied to the MRR, as shown in Figure 8. The values of σ_PM of R were from 0.38 to 0.74 dB and the values of ρ were 0.98 to 1.0. The σ_MPM was highest for M5, followed by M3 and M4 (Figure 8k). The MRRs were less scattered and less noisy (lower values of σ_PM and σ_MPM) than the POSSs, which suggests that the MRRs offered more consistent measurements. Meanwhile, for the JWDs, the σ_PM of R varied from 0.69 to 1.03 dB and the values of ρ from 0.97 to 0.99 (Figure 9a-j). The highest σ_MPM of R was that of J2 (Figure 9k).

A similar analysis was performed for Z. The measurement uncertainties of Z from paired measurements (σ_PM) and multiple paired measurements (σ_MPM) for the POSS, MRR, and JWD are summarized in Tables 5 and 6. The results for the POSSs show that the values of σ_PM varied from 0.81 to 1.84 dB, and the values of ρ were from 0.94 to 0.99. As for R, the highest σ_MPM appeared for P1 (Table 6). All other POSSs showed similar σ_MPM. The values of σ_MPM of both R and Z indicate that P1 had higher measurement uncertainty, while P3, P4, and P5 showed relatively lower measurement noise (Table 5). For the MRRs, the σ_PM of Z varied from 0.33 to 0.71 dB and the values of ρ were from 0.99 to 1.0 (Table 6).
The values of σ MPM of Z also suggest that M5 is the noisiest (Table 5). JWD had the σ PM of Z from 1.07 to 1.54 dB and values of ρ of 0.96 to 0.98 (Table 6). J2 also showed a relatively higher σ MPM of Z. These analyses indicate that the JWDs are noisier than MRR, but less noisy than POSS (Table 5). Table 5. Standard deviation (σ) of measurement noise of rainfall rate (R) and reflectivity (Z) estimated from multiple paired measurements (MPM) and paired "cross-type" measurements (CTM).  The values of σ PM and σ MPM of R and Z are summarized in Figure 10. The values of σ PM of MRR are relatively smaller than that of POSS and JWD as shown by their mean values in both R and Z. The smaller range of σ PM of R from MRRs and JWDs suggests more consistent measurements than POSSs. In terms of Z, the MRRs have better agreement to each other. The values of σ MPM indicate that the M1 has the lowest measurement noise and the P1 has the highest values in R. For higher moment of Z, the M4 has the lowest σ MPM and P1 has highest measurement noise. The measurement noise (σ MPM ) of R and Z for each disdrometer revealed by MPM are summarized in Table 5  Nevertheless, obtaining σ PM and σ MPM via MP and MPM methods from the same type of disdrometers neglects the possibility of correlated measurement noises from the same type of disdrometers due to the same measuring principles. As shown in Figure 10, the mean values of σ PM and σ MPM of JWDs are slightly lower than those of POSSs. This result contradicts the conclusions from Lee and Zawadzki [23], who suggest that the POSS has lower measurement noise than JWD due to the larger sample volume. The measurement noise (σ PM and σ MPM ) may be underestimated without considering the correlation of the measurement error between disdrometers with the same type as shown in Equation (8).
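The PM and MPM estimates described above can be illustrated with a minimal sketch. It assumes that each instrument reports a common true value plus its own noise, so that the variance of a pairwise difference equals the sum of the two noise variances. Equations (9)-(10) are not reproduced here, so the details may differ from the formulation actually used; the function names `sigma_pm` and `sigma_mpm` and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def sigma_pm(x_i, x_j):
    """Paired estimate assuming independent, identical noise:
    var(x_i - x_j) = 2 * sigma**2."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.sqrt(np.var(diff, ddof=1) / 2.0))

def sigma_mpm(series):
    """Per-instrument estimate assuming independent, non-identical noise.

    Solves var(x_i - x_j) = sigma_i**2 + sigma_j**2 over all pairs by
    least squares. `series` has shape (n_instruments, n_samples) and
    needs at least three instruments.
    """
    series = np.asarray(series, dtype=float)
    n = series.shape[0]
    pairs = list(itertools.combinations(range(n), 2))
    design = np.zeros((len(pairs), n))
    pair_var = np.empty(len(pairs))
    for row, (i, j) in enumerate(pairs):
        design[row, i] = design[row, j] = 1.0
        pair_var[row] = np.var(series[i] - series[j], ddof=1)
    var_hat, *_ = np.linalg.lstsq(design, pair_var, rcond=None)
    # Sampling error can push a variance slightly below zero; clip it.
    return np.sqrt(np.clip(var_hat, 0.0, None))

# Synthetic check with five hypothetical units (values in dB, not real data).
rng = np.random.default_rng(0)
truth = rng.normal(0.0, 5.0, size=20000)
true_sigma = np.array([1.2, 1.0, 0.7, 0.7, 0.7])
obs = truth + rng.normal(0.0, 1.0, size=(5, truth.size)) * true_sigma[:, None]
est = sigma_mpm(obs)
```

With five units there are ten pairs, which matches the ten panels (a-j) of pairwise comparisons, and the least-squares system has one unknown variance per unit.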

Measurement Uncertainties of R and Z from Paired "Cross-Type" Disdrometer Measurements (σ CTM )
The CTM method was applied to examine the covariance of measurement noises, cov(ε_i^noise, ε_j^noise), and its influence on estimating the measurement noise. The standard deviations (σ CTM ) calculated from 90 "cross-type" combinations of sixteen disdrometers are shown in Figure 10 and Table 5. The values of σ CTM are in general higher than those of σ MPM , as shown by their mean values. These results are reasonably attributed to correlated measurement noise among instruments of the same type.
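The effect of correlated noise on same-type estimates can be made explicit with a standard variance identity. The following is a sketch under the assumption that each measurement is the sum of a common true value and an instrument noise term, x_i = T + ε_i^noise; the symbols x_i, T, and the hatted estimate are introduced here for illustration and may differ from the notation of Equation (8).

```latex
\begin{aligned}
\operatorname{var}(x_i - x_j)
  &= \sigma_i^2 + \sigma_j^2
     - 2\operatorname{cov}\!\left(\varepsilon_i^{\mathrm{noise}}, \varepsilon_j^{\mathrm{noise}}\right), \\
\hat{\sigma}^2 = \tfrac{1}{2}\operatorname{var}(x_i - x_j)
  &= \sigma^2 - \operatorname{cov}\!\left(\varepsilon_i^{\mathrm{noise}}, \varepsilon_j^{\mathrm{noise}}\right)
  \qquad (\sigma_i = \sigma_j = \sigma).
\end{aligned}
```

Under this assumption, a positive covariance between two instruments of the same type makes the estimate smaller than the true variance whenever the covariance term is neglected, which is in line with σ CTM being generally higher than σ MPM .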
The difference between σ MPM and σ CTM is most pronounced in JWD, with a maximum value of 0.5 dB. This suggests that the correlated measurement noise is non-negligible in JWD due to its smaller sampling volume. The values of σ CTM of MRR are relatively lower. The MRR and POSS have comparable differences between σ MPM and σ CTM , which may be due to the larger sampling volumes of these two instruments. The values of σ CTM of JWD are higher than those of POSS and MRR. These results are consistent with those of Lee and Zawadzki [23], who suggest that the POSS has lower measurement errors than the JWD because of its larger sampling volume.
Although the average difference between σ MPM and σ CTM is similar in POSS and MRR, the absolute values are considerably larger in POSS. In fact, MRR showed the smallest measurement noise in R and Z. In addition, the variation of σ CTM among instruments of the same type is largest in POSS. This may be explained by the different measuring principles of the two instruments. MRR is a monostatic radar that directly measures the power spectrum in vertically pointing mode, which is subsequently converted into DSDs. POSS, however, is a bistatic radar, and the Doppler velocity measured in its power spectrum is not exactly in the vertical direction. In addition, its sensitivity depends on the location of falling raindrops relative to the center of the sampling volume [25]. Thus, the proper conversion of the power spectrum into DSDs requires an accurate beam pattern for each individual POSS, and the larger measurement noise relative to MRR is possibly attributable to the accuracy of this conversion.
In general, the individual values of σ CTM are higher than those of σ MPM , except for M5 in R and M5, P3, and P4 in Z. These decreases might be attributed to remaining correlated measurement noise among different types of disdrometer and/or an insufficient statistical sample size. Moreover, the measurement noise of 2DVD in R was higher than that of MRR and smaller than those of JWD and POSS. The noise of 2DVD in Z is higher than that of MRR but comparable with those of JWD and POSS. Note that JWD has a higher σ CTM than 2DVD, particularly in R, due to the dead-time issue, even though both have similar sampling volumes. In addition, the result for 2DVD may change slightly because only a single 2DVD unit was available.

Measurement Errors of Raindrop Concentration (N(D))
The measurement uncertainties of the DSD arising from different disdrometers, namely σ MPM , were explored by applying MPM to log10 N(D) for each raindrop size. The CTM was not considered because the sampling diameters are inconsistent among the different types of disdrometer; this also avoids the question of which disdrometer should be used as a reference. As shown in Figure 11, all the disdrometers showed higher measurement noise at raindrop sizes smaller than 0.5 mm, and the noise decreased with increasing raindrop size. Small raindrops are subject to turbulent motion that can mask the true terminal velocity of the drops, which leads to the higher measurement noise. The POSSs have the lowest measurement noise at a raindrop size of 1.0 mm. The lowest measurement noise of the JWDs occurs over the size range of 1.0 to 2.0 mm. P1 and P2 have higher σ MPM between raindrop sizes of 2.5 and 4.0 mm compared with other POSSs, and M5 has a higher σ MPM at all sizes compared with other MRRs. J2 has a higher σ MPM between 1.5 and 3.0 mm compared with other JWDs. These features of relatively higher σ MPM are consistent with the higher values of σ MPM of R and Z discussed in Figure 10 and Section 4.2.1.
The measurement noise of both POSSs and MRRs increased with increasing raindrop size from 1.0 to 3 mm. This increasing trend persisted for MRR at larger diameters, which may be explained by the measuring principle of POSS and MRR, both of which rely on the theoretical relationship between drop size and terminal fall velocity. The size dependence of the terminal fall velocity gradually saturates with increasing size, thereby leading to higher uncertainties at larger sizes. JWDs show slightly increasing measurement noise between 1 and 4 mm. The JWD is an impact-type device that utilizes the size dependence of the kinetic energy of individual raindrops. This leads to less ambiguity in the size determination than in the terminal fall velocity approach.

Measurement Uncertainty as a Function of Temporal Integration
It is common to integrate DSDs over a temporal averaging window (one minute to one hour) to reduce the measurement noise of disdrometers [23,54]. In this part of the study, the mean values of σ MPM and σ CTM were calculated for each type of disdrometer as functions of time averaging windows from 1 to 60 minutes to examine how the measurement noise changes as the time integration increases.
As seen in Figure 12a, the values of σ MPM in R decreased with increasing time averaging window. POSS had the highest value of σ MPM for all averaging windows. MRR had the lowest values of σ MPM in R for averaging windows shorter than 15 minutes, whereas JWD outperformed MRR for time integration longer than 15 minutes. After considering the covariance of measurement noise between same-type disdrometers, the values of σ CTM in R were higher than those of σ MPM . MRR showed the smallest noise up to 4 minutes, and 2DVD outperformed the other disdrometers for window sizes larger than 4 min. The σ CTM of MRR and 2DVD became constant for integration of 15 and 20 min. The σ CTM of POSS decreased rapidly with time. The σ CTM of JWD remained the highest for all averaging windows due to undersampling [25] and dead-time correction issues. In addition, the discrepancies between σ MPM and σ CTM suggest that the covariance of measurement errors is most significant in JWD.
The σ MPM and σ CTM in Z are shown in Figure 12b. The MRR shows the smallest measurement noise. The JWD has the highest σ CTM and the strongest correlation of measurement noises, as shown by the significant discrepancy between σ MPM and σ CTM over the entire integration time. The correlation of measurement errors is negligible in POSS, which has similar values of σ MPM and σ CTM . The σ CTM of reflectivity decreased rapidly with increasing averaging window and became constant after 10 minutes for MRR and 20 min for JWD and POSS, which is a shorter integration time than for R. Unlike R, the σ CTM of Z from 2DVD is larger than that of MRR but comparable with that of POSS after 20 min of averaging. The overall results are closely linked to the measuring principles. That is, MRR showed the lowest measurement noise in Z because it measures the power spectrum and radar reflectivity, which are then converted into DSDs. The 2DVD directly measures the size and number concentration, which should provide the most accurate R, but its small sampling volume led to larger measurement noise in Z than for MRR.
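The dependence of the noise estimate on the averaging window can be sketched as follows. This is a simplified illustration, not the procedure of the study: it block-averages two synthetic time series directly instead of integrating DSDs and recomputing R or Z, and it uses the paired estimate std(x_i - x_j)/√2 with independent white noise. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def block_average(x, window):
    """Average consecutive, non-overlapping blocks of `window` samples."""
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // window
    return x[: n_blocks * window].reshape(n_blocks, window).mean(axis=1)

def sigma_vs_window(x_i, x_j, windows):
    """Paired noise estimate std(x_i - x_j) / sqrt(2) per averaging window."""
    result = {}
    for w in windows:
        diff = block_average(x_i, w) - block_average(x_j, w)
        result[w] = float(np.sqrt(np.var(diff, ddof=1) / 2.0))
    return result

# Two hypothetical one-minute series sharing a slowly varying "truth".
rng = np.random.default_rng(1)
truth = np.cumsum(rng.normal(0.0, 0.3, size=6000))
x_i = truth + rng.normal(0.0, 1.0, size=truth.size)
x_j = truth + rng.normal(0.0, 1.0, size=truth.size)
result = sigma_vs_window(x_i, x_j, [1, 5, 10, 20, 60])
```

For purely white noise the estimate keeps falling roughly as one over the square root of the window length. The plateau reported after 10-20 min would therefore point to a noise component that averaging does not remove, which this sketch does not model.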

Discussion
The systematic biases of individual disdrometers were derived by considering the 2DVD as a reference. The biases for each disdrometer were obtained by averaging the biases derived from one-minute and hourly R and Z (summarized in Table 4). These biases varied from −3.2 dB for M3 to 0.2 dB for J5, suggesting that the bias correction of sixteen disdrometers was required for further quantitative applications. A multiplicative correction factor was consequently applied to remove these biases. This procedure removed the bias of each disdrometer and preserved the shape of original DSDs. Hence, the averaged DSDs from different types of disdrometer were nearly identical at sizes of 1 to 3 mm after applying this correction.
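The multiplicative correction can be sketched in a few lines. The sketch assumes that the bias is expressed in dB as the mean of 10·log10 of the ratio to the reference, and that one factor is applied to the whole N(D) spectrum; the exact averaging over one-minute and hourly R and Z used in the study is not reproduced. The function names and the example spectrum are hypothetical.

```python
import numpy as np

def bias_db(value, reference):
    """Mean bias in dB of a positive quantity (R or Z in linear units)
    relative to a reference instrument."""
    value = np.asarray(value, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(10.0 * np.log10(value / reference)))

def correct_dsd(n_d, bias):
    """Scale N(D) by a single factor that removes a bias of `bias` dB."""
    return np.asarray(n_d, dtype=float) * 10.0 ** (-bias / 10.0)

# Hypothetical spectrum (m^-3 mm^-1); -3.2 dB is the bias reported for M3.
n_d = np.array([800.0, 400.0, 120.0, 20.0, 2.0])
corrected = correct_dsd(n_d, -3.2)
```

Because R and Z are both linear in N(D), a single factor shifts them by the same number of dB and leaves the relative shape of the spectrum unchanged.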
However, the discrepancy at small (<1.1 mm) and big (>3.8 mm) raindrop sizes remained due to the distinct measuring principles. POSS and MRR showed higher concentrations of small raindrops, while 2DVD and JWD showed concentrations that decreased with decreasing size. The POSS and MRR measured bigger raindrops (>5.2 mm) because their larger sampling volumes offer a better opportunity to collect big raindrops. JWD measured lower concentrations of both smaller and bigger raindrops compared to 2DVD. The insufficient "dead-time" correction of raindrops may cause the low number concentrations at smaller sizes. These results are in agreement with previous studies [27,30,31].
The measurement uncertainties of sixteen disdrometers were investigated quantitatively via three methods with different hypotheses of measurement noise characteristics. These methods estimated the standard deviation of measurement noise from paired measurements (σ PM ) by assuming independent and identical measurement noises, multiple paired measurements (σ MPM ) by assuming independent and non-identical measurement noises, and cross-type measurements (σ CTM ) by assuming dependent and non-identical measurement noises.
MRR showed lower values of σ PM , indicating that it is more accurate than POSS and JWD. Furthermore, the variation of σ PM among different pairs of disdrometers confirmed that the measurement uncertainty was not identical. The values of σ MPM showed that P1 and P2 have higher measurement noise than the rest of the POSSs. The σ MPM of M5 and J2 was the highest among the MRRs and JWDs, respectively. On the other hand, the values of σ PM and σ MPM of JWDs were slightly lower than those of POSSs. This finding differs from the conclusions reached by Lee and Zawadzki [23], who suggested that the POSS should have lower measurement errors than the JWD. The correlated measurement noise among disdrometers of the same type is the probable reason for the underestimation of σ PM and σ MPM from JWD.
Cross-type measurements were therefore used to reduce the possible correlated measurement noise among disdrometers of the same type. In general, the values of σ CTM were higher than those of σ MPM , indicating correlated measurement noise among disdrometers of the same type. MRR had the smallest values of σ CTM (≈ 0.36-0.82 dB), whereas JWD showed the largest (σ CTM ≈ 1.24-1.75 dB). The maximum difference between σ MPM and σ CTM of JWDs reached 0.5 dB due to more strongly correlated noise. On the other hand, the difference between σ MPM and σ CTM was small in POSSs, suggesting relatively low correlation of the measurement noise. The overall results indicated that the measurement noises are neither independent nor identical among disdrometers of the same type. The σ CTM of 2DVD in R was higher than that of MRR and smaller than those of JWD and POSS. Although the σ CTM of 2DVD in Z was higher than that of MRR, it is comparable with those of JWD and POSS.
The examinations of σ MPM in terms of raindrop number concentrations show that all of the disdrometers had higher measurement noise for number concentration at sizes smaller than 0.5 mm, suggesting the effect of turbulent motions in smaller drops and the limitations of measuring such small drops with the disdrometers. The noise of both POSSs and MRRs increased with increasing raindrop size from 1.0 mm because the size dependence of the terminal fall velocity became weak with the increasing size. JWDs had the highest noise at sizes smaller than 1.3 mm due to the dead-time effect.
The reduction of σ MPM and σ CTM in R and Z was investigated as a function of increasing integration time. Both decreased rapidly and became nearly constant after 10-20 min of integration. This result is consistent with Tokay et al. [30], Tapiador et al. [3], Jaffrain and Berne [4,5] and Jaffrain et al. [6]. However, this integration time can be reduced to 5-10 min by averaging the uncorrelated DSDs, as described by [23]. In R, JWD had the highest σ CTM , MRR had the lowest for windows up to 4 minutes, and 2DVD outperformed the other disdrometers at integration times longer than 4 minutes. In Z, however, MRR had the smallest measurement noise (σ CTM ) for the entire integration time. These differences in the measurement noise of 2DVD and MRR are attributed to the measuring principles of the two instruments.

Conclusions
The measurement characteristics of simultaneously observed DSD data collected from sixteen collocated disdrometers (four different types) during an intercomparison experiment prior to the SoWMEX/TiMREX field project were analyzed. Four major findings emerge from this study: (1) the measurement bias of each individual disdrometer was properly estimated and reduced by applying a multiplicative factor; (2) the DSDs measured by the four types of disdrometers are consistent in the size range of 1.1-3.8 mm; (3) the measurement noises in R and Z of individual disdrometers, when estimated by allowing for correlated measurement noise among instruments of the same type, agree better with previous research; and (4) the measurement noises are dramatically reduced and level off when DSDs are integrated over an averaging window of 10-20 min.