Using Markov switching models to infer dry and rainy periods from telecommunication microwave link signals

A Markov switching algorithm is introduced to classify attenuation measurements from telecommunication microwave links into dry and rainy periods. It is based on a simple state-space model and has the advantage of not relying on empirically estimated threshold parameters. The algorithm is applied to data collected using a new and original experimental set-up in the vicinity of Zurich, Switzerland. The false dry and false rain detection rates of the algorithm are evaluated and compared to 3 other algorithms from the literature. The results show that, on average, the Markov switching model outperforms the other algorithms. It is also shown that the classification performance can be further improved if redundant information from multiple channels is used.


Introduction
Precipitation is an important component of the Earth's water cycle and needs to be accurately measured. So far, several techniques have been proposed to measure rainfall with different spatial and temporal resolutions, ranging from traditional point measurements from rain gauges to observations from weather radars and satellites. Each of these techniques has its advantages, but also its limitations (Sevruk, 1999; Upton et al., 2005; Germann et al., 2006).
Recently, microwave links (MWL), which are commonly used in telecommunication networks for wireless data transmission, have been suggested as a novel tool to monitor rainfall in urban areas (Messer et al., 2006; Leijnse et al., 2007c). The main idea behind this technique is to relate the rain-induced signal attenuation to the path-averaged rain rate along the considered link. The potential of this technique has been demonstrated using microwave links specifically designed for rainfall estimation (Ruf et al., 1996; Rahimi et al., 2003; Holt et al., 2003; Upton et al., 2005; Krämer et al., 2005) and commercial microwave links operated by telecommunication companies (Messer et al., 2006; Zinevich et al., 2009). Note that, in addition to estimating rain rates, MWL can also be used to measure evaporation (Leijnse et al., 2007a) and water vapour (David et al., 2009). In fact, MWL nicely complement traditional rainfall sensors because they provide rain rate measurements (near the ground level) at an intermediate scale between point measurements from rain gauges and weather radars with sampling volumes up to several km³. The fact that MWL networks can be very dense can also be used to improve rain rate estimates using spatial interpolation techniques (Zinevich et al., 2008).
A very important issue that needs to be addressed prior to rainfall estimation using MWL is the so-called baseline estimation problem (Rahimi et al., 2003; Leijnse et al., 2007c). It consists of identifying and separating the attenuation which occurs during dry periods from the rain-induced attenuation (which is the quantity of interest in most applications). Depending on the link characteristics, this problem can be very difficult (Upton et al., 2005). Dry-weather signal attenuations can exhibit significant variability caused, for example, by changes in water vapour, wind effects on the antennas, birds or insects crossing the beam, losses during transmission or reception, interferences, wet-antenna and multi-path effects (Zinevich et al., 2010). Moreover, attenuation measurements are often quantized, which introduces additional variability in the process. It is only after the attenuation baseline has

Existing algorithms
Three popular classification methods have been chosen: the simple threshold method (ST), the moving window method (MW) and the Factor Graph (FG). The simple threshold algorithm (Leijnse et al., 2007b) is straightforward and computationally efficient. It uses a global threshold on the path-integrated attenuation (PIA) to distinguish between the dry and the rainy periods. Each time period for which the PIA is above the threshold is classified as rainy, and vice versa.
Decision rule for ST:

\[
\text{period } t \text{ is }
\begin{cases}
\text{rainy} & \text{if } A_t > a_0 \\
\text{dry} & \text{otherwise}
\end{cases}
\qquad (1)
\]

where $A_t$ [dB] denotes the path-integrated attenuation at time $t$ and $a_0$ [dB] is a given threshold value. The method has been shown to produce good results in practical applications and can be applied in real-time. Finding the optimal detection threshold $a_0$ is, however, difficult. Moreover, the performance (in terms of false dry and rain detections) of this algorithm can be very sensitive to the value of the threshold. Finally, this method is only appropriate for datasets for which the dry-weather attenuation is more or less constant. This is not always the case, as can be seen in the right panel of Fig. 2. In some situations, the dry-weather attenuation exhibits clear daily cycles and a strong temporal drift in the PIA, possibly due to changes in temperature between day and night and hardware instabilities. Obviously, the simple threshold is not appropriate for such types of signals and alternative classification methods have been suggested.

A slightly more complex approach which may be better suited for non-stationary dry-weather attenuations has been proposed by Schleiss and Berne (2010). Their method, hereinafter referred to as the moving window algorithm, is based on the assumption that the temporal variability of the PIA is small and bounded during dry weather. Rainy periods, on the other hand, are characterised by larger signal fluctuations. Hence, each time period is classified according to the following decision rule:

Decision rule for MW:

\[
\text{period } t \text{ is }
\begin{cases}
\text{rainy} & \text{if } S_{W_t} > \sigma_0 \\
\text{dry} & \text{otherwise}
\end{cases}
\qquad (2)
\]

where $S_{W_t}$ [dB] represents the local (temporal) variability of the signal attenuation for a moving window $W_t = [t - w, t]$ and $\sigma_0$ [dB] is a rain detection threshold estimated using one of the two approaches described in Schleiss and Berne (2010).
The moving window algorithm is also computationally efficient and can be applied in real-time to non-stationary time series of attenuations. However, finding the optimal detection threshold $\sigma_0$ can be very difficult without appropriate calibration data over extended periods of time. Moreover, one of the main disadvantages of the moving window algorithm is its inability to separate light rain from dry periods because both signals exhibit similar variability.
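The two threshold-based rules described above can be illustrated with a minimal Python sketch. It is not the implementation used in the cited studies: the function names, the toy attenuation series and the threshold values are hypothetical, and the local variability of the moving window is assumed here to be the population standard deviation of the samples in the window.

```python
from statistics import pstdev


def simple_threshold(attenuation, a0):
    """ST rule: a sample is rainy (1) if its attenuation exceeds a0, else dry (0)."""
    return [1 if a > a0 else 0 for a in attenuation]


def moving_window(attenuation, w, sigma0):
    """MW rule: a sample is rainy (1) if the variability over [t - w, t] exceeds sigma0.

    The variability measure (population standard deviation) is an assumption
    made for this sketch.
    """
    states = []
    for t in range(len(attenuation)):
        window = attenuation[max(0, t - w): t + 1]
        s_w = pstdev(window) if len(window) > 1 else 0.0
        states.append(1 if s_w > sigma0 else 0)
    return states


# Hypothetical attenuation series [dB]: flat baseline with one rain event.
pia = [48.0, 48.1, 48.0, 48.1, 50.5, 53.0, 51.2, 48.2, 48.1, 48.0]
st_states = simple_threshold(pia, a0=49.0)
mw_states = moving_window(pia, w=2, sigma0=0.5)
print(st_states)
print(mw_states)
```

On this toy series the moving window keeps flagging rain for two samples after the attenuation has returned to the baseline, because the window still contains rainy samples.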
The Factor Graph algorithm proposed by Reller et al. (2011) can also be applied to non-stationary MWL signals, but does not require large datasets for model calibration. A Factor Graph is a particular type of graphical model, with applications in Bayesian inference, which computes marginal distributions through the sum-product message passing algorithm (Kschischang et al., 2001). More specifically, the Factor Graph algorithm models the attenuation baseline during dry weather using a line model whose parameters can vary slowly over time together with periodicity constraints. In this, it assumes a smoothly varying baseline and, where the signal exceeds a certain threshold, the algorithm identifies that the system enters another state. The Factor Graph algorithm possesses several advantages, as it can deal with irregular time series and not only identifies dry and rainy periods, but simultaneously estimates the baseline and, thus, delivers the rain-induced attenuation. However, it also relies on several tuning parameters that need to be estimated subjectively prior to the classification into dry and rainy periods.
In the following, a new algorithm for the identification of dry and rainy periods based on MWL attenuation measurements is introduced. It uses Markov switching models to estimate the state of the system (i.e., dry or rainy).

Univariate Markov switching model (MSU)
A Markov switching model combines dynamic linear system behaviour with a Markov process, which models the transitions between different states. It belongs, similarly to the Factor Graph, to a very general class of so-called state-space models. Such models are commonly used to model a change in behaviour with respect to different regimes. The regimes themselves can be related to certain events, often stochastic, such as a financial crisis or changes in government policy. Practical applications of such models can be found (among others) in the fields of Economics (Hamilton, 1989) and Physics (Yue and Han, 2005; Metzner et al., 2007). Markov switching models have also been used in weather generators to model rainfall patterns (Weiss, 1964).
For simplicity, the details of the algorithm are only given for the univariate case, i.e., a single channel input. The multivariate case is briefly described at the end of this section. For more details on Markov switching models, the reader is referred to Hamilton (1989, 1990) and Kim (1994).
Note that Rayitsfeld et al. (2011) proposed a similar approach based on a hidden Markov model with a slightly different implementation. They did not, however, compare their method with previously proposed classification techniques.
The underlying assumption of the Markov switching algorithm is that the magnitude and the variability of the PIA are fundamentally different during dry and rainy periods. During dry periods, the PIA fluctuates mildly around a given value, while during rainy periods it is much larger and more variable. This additional variability is caused by the scattering and absorption of the transmitted signal by the raindrops along the path of the link. Hence, it should be possible to identify two fundamentally different states of the system (dry/rainy) from the different behaviour of the PIA. For example, the following, very simple model can be used to describe the data:

\[
A_t =
\begin{cases}
\mu_0 + \varepsilon_0 & \text{for every dry period} \\
\mu_1 + \varepsilon_1 & \text{for every rainy period}
\end{cases}
\qquad (3)
\]

where $A_t$ [dB] represents the path-integrated attenuation at time $t$, and $\mu_0$ [dB] and $\mu_1$ [dB] represent the average value of the attenuation during dry and rainy periods, respectively. The noise terms $\varepsilon_0$ [dB] and $\varepsilon_1$ [dB] are assumed to be independent Gaussian random variables with zero mean and standard deviations given by $\sigma_0$ [dB] and $\sigma_1$ [dB]. The transitions between the dry and the rainy periods are modelled using a stationary hidden random variable $S_t \in \{0, 1\}$ where

\[
S_t =
\begin{cases}
0 & \text{for every dry period} \\
1 & \text{for every rainy period}
\end{cases}
\qquad (4)
\]

The unconditional probability of the system being in the dry state is denoted by $p_0 = \Pr(S_t = 0) = 1 - p_1$. Combining Eqs. (3) and (4), it is possible to write $A_t$ using a single expression given by

\[
A_t = (1 - S_t)(\mu_0 + \varepsilon_0) + S_t(\mu_1 + \varepsilon_1)
\qquad (5)
\]

with 5 model parameters $\theta = (\mu_0, \mu_1, \sigma_0, \sigma_1, p_0)$. The maximum likelihood technique is then used to infer the optimal model parameters for a given set of observations $\{A_t = a_t\}$:

\[
\hat{\theta} = \arg\max_{\theta} \, l(\theta)
\qquad (6)
\]

where the log-likelihood function $l(\theta)$ is given by

\[
l(\theta) = \sum_{t} \log\left[ p_0 f_0(a_t, \theta) + p_1 f_1(a_t, \theta) \right]
\qquad (7)
\]

with $f_k(a_t, \theta)$ the Gaussian density of the attenuation in state $k$ (mean $\mu_k$, standard deviation $\sigma_k$) for a given set of model parameters $\theta$. The maximization of $l(\theta)$ is performed using a standard Newton-type algorithm. In order to be valid, the solution must satisfy some simple conditions. Specifically, one must have $1 > p_0 > 0$, $\sigma_1 > \sigma_0 > 0$ and $\mu_1 > \mu_0 > 0$. Once $\hat{\theta}$ has been estimated, the classification into dry and rainy periods can be easily derived from the estimated state probabilities $p_0(a_t, \hat{\theta}) = 1 - p_1(a_t, \hat{\theta})$.
In the absence of any prior information, and with only two states (dry and rainy), this leads to a classification threshold of 1/2: a period is classified as rainy if $p_1(a_t, \hat{\theta}) > 1/2$ and as dry otherwise. Note that it is also possible to choose another threshold depending on the relative cost associated with each of the classification errors.

One of the advantages of the Markov switching model is that it can be easily generalized to include multivariate inputs from different channels or frequencies.
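As an illustration of the univariate model, the following Python sketch fits the five parameters to synthetic data and classifies two attenuation values with the 1/2 threshold. Several assumptions are made. The fit uses the expectation-maximization (EM) algorithm instead of the Newton-type optimizer mentioned in the text. The observations are treated as independent draws from a two-component Gaussian mixture. The function names, starting values and synthetic data are hypothetical.

```python
import math
import random


def gauss_pdf(x, mu, sigma):
    """Gaussian probability density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def rain_probability(a, theta):
    """State probability of the rainy state given attenuation a [dB]."""
    mu0, mu1, s0, s1, p0 = theta
    dry = p0 * gauss_pdf(a, mu0, s0)
    wet = (1.0 - p0) * gauss_pdf(a, mu1, s1)
    return wet / (dry + wet)


def fit_em(data, n_iter=200):
    """Estimate theta = (mu0, mu1, s0, s1, p0) by EM (assumed here, not Newton)."""
    srt = sorted(data)
    # Crude start: dry mean near the lower quartile, rainy mean in the upper tail.
    mu0, mu1 = srt[len(srt) // 4], srt[(9 * len(srt)) // 10]
    s0 = s1 = max(1e-3, (srt[-1] - srt[0]) / 4.0)
    p0 = 0.8
    for _ in range(n_iter):
        theta = (mu0, mu1, s0, s1, p0)
        r = [rain_probability(a, theta) for a in data]  # E-step
        n1 = sum(r)
        n0 = len(data) - n1
        # M-step: weighted means, standard deviations and dry-state probability.
        mu0 = sum((1.0 - ri) * a for ri, a in zip(r, data)) / n0
        mu1 = sum(ri * a for ri, a in zip(r, data)) / n1
        var0 = sum((1.0 - ri) * (a - mu0) ** 2 for ri, a in zip(r, data)) / n0
        var1 = sum(ri * (a - mu1) ** 2 for ri, a in zip(r, data)) / n1
        s0, s1 = max(1e-3, math.sqrt(var0)), max(1e-3, math.sqrt(var1))
        p0 = n0 / len(data)
    return (mu0, mu1, s0, s1, p0)


# Synthetic attenuations [dB]: 80 % dry samples, 20 % rainy samples.
random.seed(0)
data = [random.gauss(48.0, 0.1) for _ in range(800)]
data += [random.gauss(51.0, 1.5) for _ in range(200)]
theta_hat = fit_em(data)

# Classification with the 1/2 threshold on the rainy-state probability.
for a in (48.0, 52.0):
    state = "rainy" if rain_probability(a, theta_hat) > 0.5 else "dry"
    print(a, state)
```

In practice, the validity conditions stated above ($1 > p_0 > 0$, $\sigma_1 > \sigma_0 > 0$, $\mu_1 > \mu_0 > 0$) should be checked on the returned estimate before it is used for classification.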

Multivariate Markov switching model (MSM)
Telecommunication microwave links are usually operated using multiple channels such as two directions, frequencies or polarizations.This redundant information can be used to improve the classification performance.
In the multivariate case with $N$ channels, the attenuation at time step $t$ is given by a vector

\[
\mathbf{A}_t = \left(A_t^{1}, \dots, A_t^{N}\right)
\qquad (9)
\]

Note that the vector of model parameters is now significantly longer and given by $\theta = (\mu_0^{1}, \dots, \mu_0^{N}, \mu_1^{1}, \dots, \mu_1^{N}, \sigma_0^{1}, \dots, \sigma_0^{N}, \sigma_1^{1}, \dots, \sigma_1^{N}, p_0)$, that is, $4N + 1$ variables to estimate. The major difference with respect to the univariate case concerns the difficulty of estimating the joint densities $f_0(\mathbf{A}_t, \theta)$ and $f_1(\mathbf{A}_t, \theta)$, although significant simplifications occur if the channels are assumed independent. While this is certainly not the case for rainy periods, it is, at least, reasonable during dry periods (which usually represent the majority of all the periods). In the absence of any further information, a pragmatic solution, therefore, consists in assuming that all channels are independent and that the log-likelihood function is given by

\[
l(\theta) = \sum_{t} \log\left[ p_0 f_0(\mathbf{a}_t, \theta) + p_1 f_1(\mathbf{a}_t, \theta) \right]
\qquad (10)
\]

where

\[
f_k(\mathbf{a}_t, \theta) = \prod_{i=1}^{N} f_k^{i}\left(a_t^{i}, \theta\right), \quad k \in \{0, 1\},
\]

and $f_k^{i}$ denotes the Gaussian density of channel $i$ in state $k$. Maximizing $l(\theta)$ yields, similarly to the univariate case, the maximum likelihood estimate $\hat{\theta}$. The classification into dry and rainy periods can then be derived from the estimated state probabilities $p_k(\mathbf{a}_t, \hat{\theta})$. Possible extensions to correlated attenuation values, at least during rainy periods, and more general expressions for the joint density $f_1(\mathbf{A}_t, \theta)$ will not be discussed. For simplicity, only the independent case is presented in Sect. 4.
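The effect of combining independent channels can be sketched in Python as follows. This is an illustrative example only: the function names and all parameter values are hypothetical, and the per-channel densities are assumed Gaussian and independent, as in the pragmatic solution described above.

```python
import math


def log_gauss(x, mu, sigma):
    """Logarithm of the Gaussian probability density."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def rain_probability_multi(a_vec, mu0, mu1, s0, s1, p0):
    """Rainy-state probability from N channels assumed independent.

    a_vec, mu0, mu1, s0, s1 are sequences of length N; p0 is the
    unconditional probability of the dry state.
    """
    log_dry = math.log(p0) + sum(
        log_gauss(a, m, s) for a, m, s in zip(a_vec, mu0, s0))
    log_wet = math.log(1.0 - p0) + sum(
        log_gauss(a, m, s) for a, m, s in zip(a_vec, mu1, s1))
    diff = log_dry - log_wet
    if diff > 700.0:  # avoid overflow in exp; the probability is ~0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


# Hypothetical parameters: both channels read 0.8 dB above their dry mean.
p_single = rain_probability_multi([48.8], [48.0], [51.0], [0.3], [1.5], 0.8)
p_multi = rain_probability_multi([48.8, 47.8], [48.0, 47.0], [51.0, 50.0],
                                 [0.3, 0.3], [1.5, 1.5], 0.8)
print(round(p_single, 3), round(p_multi, 3))
```

With these illustrative numbers, a single channel leaves the period classified as dry (probability below 1/2), whereas two channels showing the same excess attenuation push the rainy-state probability above 1/2.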

Experimental set-up
The experimental site is located in Dübendorf, near the city of Zürich, Switzerland (see Fig. 1). It consists of a 1.85 km long commercial dual-polarization microwave link, 5 disdrometers and 3 rain gauges placed at approximately equal distances along the path of the link. The dataset is complemented by climatic and meteorological data from two weather stations. The experiment is designed to investigate different aspects of rainfall monitoring using microwave links in the context of a humid continental climate, such as the retrieval of path-averaged rain rates, the influence of the drop size distribution, the characteristics of dry-weather attenuation and wet-antenna effects. In particular, the horizontally and vertically polarized signals could be used to retrieve the effective drop size distribution along the link path.

Microwave link
The installed microwave link is an "Ericsson Mini-link TN ETSI", a widely used system in commercial telecommunication applications. The MWL is operated at about 38 GHz in a dual-polarization set-up with the specific characteristics given in Table 1. For more redundancy, the link provides measurements on 4 different channels (2 polarizations and 2 directions). In its original configuration, the link only records the transmitted and received powers every 15 min. This is clearly not sufficient for accurate rainfall monitoring at scales relevant for modern hydrological and meteorological applications. Therefore, a stand-alone data logging application using the SNMP protocol has been developed and implemented to record the power measurements at much shorter intervals (see Appendix). For the purposes of this project, a 4 s temporal resolution has been chosen because it was a good trade-off between a high sampling resolution and a limited amount of missing data (caused by occasional long response times of the radio equipment).

The experiment started in March 2011 and is continuing into the first half of 2012. It is divided into two parts. During the first part of the project, i.e., until 10 October 2011, the antennas of the link were fully exposed to the rain. Consequently, during rainy periods, a thin film of water formed on the surface of the antennas, causing additional attenuation in the order of several dB (Kharadly and Ross, 2001; Leijnse et al., 2008). Preliminary data analysis suggests that the antennas remain wet for some time after the rain has stopped. Antenna drying seems to depend fundamentally on the weather conditions and lasted up to several hours in some cases. In the second part of the experiment, i.e., after 10 October 2011, the antennas were shielded from rain using plastic shields specifically designed for this experiment (see 5.A in Fig. 1). Visual inspection of the antennas proved that these shields effectively protect the surface of the antennas, even during strong rainfall and moderate wind speeds.
In addition to wet-antenna effects, the experimental set-up also revealed unexpected fluctuations in the transmitted power levels. According to the manufacturer, the received power is measured with an accuracy of 0.1 dB and the transmitted power with an accuracy of 1 dB. Additional measurements of the transmitted power using a power meter showed that the transmitted power was accurate within a range of approximately 0.35 dB over a period of 11 days, for temperatures between 7 °C and 23 °C and relative humidities between 37 % and 100 %. This is confirmed by independent measurements collected in the laboratory, with (more or less) constant temperatures and humidities and for which the uncertainty in the transmitted power was found to be 0.3 dB.

Disdrometers and rain gauges
Five optical disdrometers of type Parsivel (1st generation, manufactured by OTT) have been deployed at 4 different sampling locations (sites 2-5) along the 1.85 km path of the link (see Fig. 1). For more details on the principle of these optical disdrometers, see Löffler-Mang and Joss (2000). All the disdrometers are designed to be autonomous in terms of power supply and data transmission (Jaffrain et al., 2011). They provide measurements of particle sizes and velocities at a 30-s temporal resolution. Note that sampling point 2 is equipped with two collocated disdrometers in order to quantify the measurement uncertainty associated with Parsivel disdrometers (Jaffrain and Berne, 2011). The 4 sampling locations have been chosen as a trade-off between a regular distribution of the instruments, the distance to the path of the link, line of sight for data transmission between the different instruments and minimum probability of disturbance and vandalism.
In addition to the 5 disdrometers, 3 tipping-bucket rain gauges from Précis Mécanique (model 3029) have been deployed at sampling locations 2, 4 and 5. The tipping-bucket rain gauges have a catching area of 400 cm² and are connected to data loggers that record the tipping time with an accuracy of 0.1 s. One tip corresponds to 0.1 mm of rain. Note that the 3 rain gauges are not transmitting the data in real time. The collected data are used to check the calibration of the disdrometers and to identify possible biases between the sensors.

Additional data
The rainfall measurement network is complemented by operational radar data provided by MeteoSwiss. Processed maps of rain rate and radar reflectivities are available at a spatial resolution of 1 × 1 km² and a temporal resolution of 5 min. In addition, meteorological and climatic data (e.g., temperature, relative humidity, pressure, wind speed and wind direction) are collected using a MIDAS IV weather station (manufactured by Vaisala) located at the airport in Dübendorf. The MIDAS IV system collects data from two sensors situated at both ends of the runway. The temporal resolution depends on the considered parameter and can vary between 3 and 60 s.

Originality
Several other studies involving simultaneous measurements of microwave links, rain gauges, disdrometers and weather radar can be found in the literature. Rincon and Lang (2002) proposed a method to estimate the drop size distribution from the measurements of a dedicated 2.3 km, dual-frequency research link and validated their results using 6 rain gauges and a single 2D video disdrometer placed along the path of the link. Rahimi et al. (2003) 2010) compared the rain estimates from 23 commercial microwave links with 5 nearby rain gauges. The experimental set-up presented above is original because it combines attenuation measurements from a dual-polarization commercial microwave link with a sufficiently dense network of disdrometers to accurately estimate the path-averaged drop size distribution (DSD). This provides a platform to develop and validate new methods for rainfall retrieval using MWL and to evaluate their respective performances as outlined above. In particular, it can be used to investigate if the redundancy between the different channels and polarizations can be used to improve the rain rate estimates. Furthermore, it might also be of interest to radio engineers concerned with better predictions of rain-induced attenuation and MWL simulation methods (Paulson, 2002; Callaghan et al., 2008). It is intended to make the data publicly available for download from a web-platform after the end of the experiment.

Selected datasets
Two datasets have been selected from the experimental observational record to evaluate the performance of the algorithms described in Sect. 2 under fundamentally different conditions. A visual illustration of these datasets is given in Fig. 2. Note that, for better visibility, the attenuation measurements are only shown for one channel. The first dataset covers the period between 17 May 2011 and 12 June 2011 and is representative of a (more or less) constant dry-weather attenuation baseline (hereinafter referred to as the stationary case). This period is also characterised by small variations in the PIA during dry weather. The second dataset covers the period between 17 March 2011 and 26 April 2011 and illustrates a very different behaviour (hereinafter referred to as the non-stationary case). This period is characterised by a highly variable attenuation baseline with a strong temporal drift and daily cycles in the PIA, due to changes in temperature and humidity. A preliminary analysis of the current observational record suggests that the non-stationary cases represent a non-negligible amount (about 10-20 %) of all the time periods and must therefore be considered carefully.

False rain and dry detections
The performances of the algorithms described in Sect. 2 are evaluated and compared using two criteria:

\[
\text{type I error} = \frac{\#\,\text{dry periods classified as rainy}}{\#\,\text{dry periods}}
\]

\[
\text{type II error} = \frac{\#\,\text{rainy periods classified as dry}}{\#\,\text{rainy periods}}
\]

In other words, type I errors correspond to false rain detections and type II errors to false dry detections. A perfect classification algorithm has 0 type I error and 0 type II error. In practical applications, however, both types of errors are usually competing against each other, i.e., if the type I error decreases, the type II error increases and vice versa. Finding an optimal trade-off between both errors is difficult and depends on the underlying application and the cost associated with each type of error. However, this is far beyond the scope of this paper and will not be addressed here.
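The two criteria translate directly into code. The following Python sketch is only an illustration; the function name and the example sequences are hypothetical, and states are encoded as 0 (dry) and 1 (rainy).

```python
def error_rates(truth, predicted):
    """Return (type I, type II) error rates as fractions.

    truth and predicted are equally long sequences of 0 (dry) and 1 (rainy).
    Type I: dry periods classified as rainy, divided by the number of dry periods.
    Type II: rainy periods classified as dry, divided by the number of rainy periods.
    """
    n_dry = sum(1 for s in truth if s == 0)
    n_rainy = sum(1 for s in truth if s == 1)
    false_rain = sum(1 for s, p in zip(truth, predicted) if s == 0 and p == 1)
    false_dry = sum(1 for s, p in zip(truth, predicted) if s == 1 and p == 0)
    type1 = false_rain / n_dry if n_dry else float("nan")
    type2 = false_dry / n_rainy if n_rainy else float("nan")
    return type1, type2


# Hypothetical reference states and two hypothetical classifications.
truth = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
late_stop = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]   # rain flagged too long
mixed = [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]       # one false rain, one false dry
print(error_rates(truth, late_stop))
print(error_rates(truth, mixed))
```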
For comparison purposes, it is assumed that the path-averaged rain rate measured by the 5 disdrometers along the path of the link (see Sect. 3) is representative of the "true" weather state. If the path-averaged rain rate is greater than zero, the period is considered rainy. Otherwise, it is supposed to be dry. In order to analyse the sensitivity of the results with respect to this rain detection threshold, a slightly higher rain detection threshold of 0.1 mm h⁻¹ is also considered. All periods for which the path-averaged rain rate is smaller than 0.1 mm h⁻¹ are considered dry and vice versa. The value of 0.1 mm h⁻¹ was chosen as a threshold because it corresponds approximately to the hardware-induced measurement uncertainty of 0.1 dB in the path-integrated attenuation (ITU-R P. 2005). In other words, rainy periods with rain rates smaller than 0.1 mm h⁻¹ cannot be distinguished from dry periods because of the uncertainty in the power measurements. Finally, note that because the disdrometer data are provided at a 30-s temporal resolution, the corresponding MWL data (at a 4-s temporal resolution) are averaged to 30 s prior to the analysis. Periods for which one of the instruments was not working are not considered for the comparison.
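The preparation of the comparison data can be sketched as follows in Python. The sketch rests on assumptions not stated in the text: the function names are hypothetical, the 4-s samples are assigned to 30-s bins by their timestamp, missing samples are represented by `None`, and a period is labelled rainy when the rain rate is strictly greater than the threshold.

```python
def average_to_bins(times_s, values, bin_s=30):
    """Average samples into consecutive bins of bin_s seconds.

    times_s: sample times in seconds; values: attenuations [dB], None if missing.
    Returns a dict {bin start time: mean value}; empty bins are omitted.
    """
    sums, counts = {}, {}
    for t, v in zip(times_s, values):
        if v is None:  # skip missing samples
            continue
        b = int(t // bin_s) * bin_s
        sums[b] = sums.get(b, 0.0) + v
        counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


def reference_state(rain_rate, threshold=0.0):
    """Reference state from the path-averaged rain rate [mm/h]: 1 rainy, 0 dry."""
    return 1 if rain_rate > threshold else 0


# One minute of hypothetical 4-s samples: 8 fall in the first bin, 7 in the second.
times = list(range(0, 60, 4))
values = [48.0] * 8 + [50.0] * 7
binned = average_to_bins(times, values)
print(binned)
print(reference_state(0.05), reference_state(0.05, threshold=0.1))
```

Because 30 s is not a multiple of 4 s, the bins do not all contain the same number of samples, which is why the sketch averages by timestamp rather than by a fixed sample count.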

Stationary dry-weather attenuation baseline
The results for the first dataset (stationary case) are shown in Table 2. To better illustrate important details, a small subset of dataset 1 (a 5-day period between 8 and 12 June 2011) is plotted in Fig. 3. The univariate Markov switching model (MSU) clearly produced the best classification performances among the univariate models, closely followed by the simple threshold method. The good performance of the simple threshold algorithm is explained by the fact that the dry-weather attenuations over this time period are (more or less) constant with very low fluctuations. The moving window and the Factor Graph, on the other hand, have significantly higher values of type I and type II errors. This can be partially explained by the fact that these models rely on predefined threshold parameters which were not necessarily optimal over the considered time period. For example, it is possible to decrease the type II error rate in the moving window algorithm by decreasing the value of $\sigma_0$, so that more periods are classified as rainy. This will, however, also result in an increased type I error rate. Additional tests with different threshold parameters confirmed that the moving window algorithm produces, on average, less reliable classifications than the threshold and the Markov switching algorithms.

Not surprisingly, the multivariate Markov switching model outperformed all the univariate models in terms of type I and type II errors. Its false rain/dry detection rates are 2.46 % and 21.97 % for a rain detection threshold of 0 mm h⁻¹ and 4.27 % and 11.82 % for a rain detection threshold of 0.1 mm h⁻¹ (not shown). This confirms the intuitive idea that the state of the system can be estimated more accurately using 4 channels rather than 1. The improvement is, however, only minor because the univariate Markov switching model already produced good and similar classifications for all the considered channels (except for channel 2 for which no valid model parameters could be fitted).

The fact that the univariate Markov switching model provides realistic classifications can also be seen in Fig. 3, which shows the estimated states (dry/rainy) for all the considered algorithms. A qualitative evaluation suggests that the best classifications are obtained for the threshold method and the univariate Markov switching model (MSU). The classifications obtained using the Factor Graph and the moving window algorithm are not satisfactory. Both the Factor Graph and the moving window produce considerable false dry detections. The moving window algorithm also produces some false rain detections at the beginning of the period. Clearly, the threshold parameters (which were subjectively estimated for the entire dataset) are not optimal for this period.

Table 2. Classification performances (in percentages) for the simple threshold (ST), the moving window (MW), the Factor Graph (FG), the univariate Markov switching (MSU) and the multivariate Markov switching (MSM) algorithms for dataset 1 (stationary case). For the univariate algorithms, the value given in the table corresponds to the average classification performance for all 4 channels, with the associated standard deviation in parentheses. Note that no model parameters could be fitted for the MSU algorithm on channel 2. Columns: models; rain detection threshold 0 mm h⁻¹; rain detection threshold 0.1 mm h⁻¹.

Non-stationary dry-weather attenuation baseline
The results for the second dataset are shown in Table 3. As for the first dataset, the classification performance is illustrated in Fig. 4, where the results are plotted for an 11-day subset from 27 March 2011 to 7 April 2011. The first point to notice is that all the considered models have a very high rate of type II errors (about 50-60 % for the first rain detection threshold and 20-35 % for the second rain detection threshold). This is due to the large variability of the attenuation baseline during dry periods, which makes it difficult for the models to separate dry periods from light rainfall. Consequently, more rainy periods are classified as dry. This is also confirmed by the low type I error rates, meaning that very few dry periods are actually classified as rainy. The "best" average performance (among the univariate algorithms) is again obtained for the univariate Markov switching model and the simple threshold method, although these two models do not have the same type I and type II error rates. As can be seen in Table 3, the threshold method produces fewer false dry detections but more false rain detections. The moving window algorithm has the highest rate of type II errors (65.74 % on average for a rain detection threshold of 0 mm h⁻¹), but most of these false dry detections correspond to very light rain rates. This is indicated by the fact that, for a rain detection threshold of 0.1 mm h⁻¹, which essentially removes very light rainfall, a much lower type II error rate of 17.54 % is obtained. In fact, for the higher rain detection threshold, the moving window algorithm performs similarly to the simple threshold and the univariate Markov switching model. Again, the multivariate Markov switching algorithm outperformed (on average) the univariate algorithms in terms of false dry and rain detections. In particular, it is worth mentioning that no valid model parameters could be fitted for the univariate Markov switching model for channels 2 and 4, whereas the multivariate Markov switching model (using all 4 channels) was still able to provide valid parameter estimates for all channels.
The threshold method and the multivariate Markov switching algorithm (MSM) produce very good and similar results for this time period. The classifications obtained using the Factor Graph and the moving window algorithm are less satisfactory. In particular, the strong variability in the attenuation baseline causes the moving window algorithm to produce a large number of false rain detections. This problem could be (partially) solved by considering a higher detection threshold $\sigma_0$ for this time period, but there is currently no easy way of doing this automatically in the absence of any control data from nearby weather stations.

Discussion and possible developments
The Markov switching model proposed in Sect. 2 already provides good results at a reasonable computational cost. It remains, however, very simple in its formulation and does not exploit the full potential of state-space models. As a possible extension, the performance of an autoregressive state-space model of order 1 was also investigated. Although more elaborate, the autoregressive model of order 1 showed only little improvement in performance compared to the much simpler AR(0) model. Because autoregressive state-space models are computationally more expensive and more difficult to fit, the AR(0) model was preferred for practical applications. Next, the authors investigated how the Gaussian error assumption in Eq. (3) affects the dry/wet classification performance. It is well known that the distribution of rain rate values (and consequently path-integrated attenuation) is skewed and closer to a log-normal distribution than to a Gaussian distribution.
An alternative model formulation with non-Gaussian error structure was therefore considered:

\[
A_t =
\begin{cases}
\mu_0 + \varepsilon_0 & \text{for every dry period} \\
\mu_0 + \varepsilon_0 + \varepsilon_1 & \text{for every rainy period}
\end{cases}
\qquad (11)
\]

where $\varepsilon_0$ is a Gaussian random variable with zero mean and standard deviation $\sigma_0$ and $\varepsilon_1$ is a positive random variable with log-normal distribution representing the rain-induced attenuation. The major drawback of such a formulation is that it has no analytical expression for the conditional density of $A_t$ knowing $S_t = 1$. It is, however, still possible to fit this model using numerical approximations, at the expense of additional computational cost. Surprisingly, the more complex and more physical error structure did not improve the classification performance significantly. The reason for this can be seen in Fig. 5, which shows the probability density functions of attenuation values for dry and rainy periods. The sample distributions are not exactly Gaussian, but the fact that the tails of the distributions are not correctly reproduced is not critical with respect to the classification problem. In fact, the optimal classification threshold, which is at the intersection between the two empirical probability density functions (i.e., about 49 dB), is very close to the threshold derived from the Gaussian model (i.e., the intersection between the two Gaussian densities). Similar results are obtained for all channels and all considered datasets and show that the Gaussian error assumption is not critical with respect to the classification problem.
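The threshold implied by the Gaussian model can be computed in closed form, since equating two weighted Gaussian densities leads to a quadratic equation in the attenuation. The Python sketch below illustrates this under assumptions: the function name and the parameter values are hypothetical, and the densities are weighted by the state probabilities $p_0$ and $1 - p_0$, which corresponds to a state probability of 1/2 (equal weights reproduce the plain intersection of the two densities).

```python
import math


def gaussian_threshold(mu0, s0, mu1, s1, p0=0.5):
    """Attenuation above mu0 at which p0*f0(a) equals (1 - p0)*f1(a).

    Works with x = a - mu0 and solves a_q*x**2 + b_q*x + c_q = 0, obtained by
    taking the logarithm of both weighted Gaussian densities.
    """
    d = mu1 - mu0
    a_q = 0.5 / s1 ** 2 - 0.5 / s0 ** 2
    b_q = -d / s1 ** 2
    c_q = 0.5 * d ** 2 / s1 ** 2 + math.log(p0 * s1 / ((1.0 - p0) * s0))
    if abs(a_q) < 1e-12:  # equal standard deviations: linear equation
        return mu0 - c_q / b_q
    disc = b_q ** 2 - 4.0 * a_q * c_q
    if disc < 0.0:
        raise ValueError("the weighted densities do not intersect")
    roots = [(-b_q + sign * math.sqrt(disc)) / (2.0 * a_q) for sign in (1.0, -1.0)]
    above = [x for x in roots if x > 0.0]
    if not above:
        raise ValueError("no intersection above the dry-weather mean")
    return mu0 + min(above)


# Hypothetical parameters [dB]; not the values fitted in the study.
thr = gaussian_threshold(48.0, 0.3, 51.0, 1.5, p0=0.8)
print(round(thr, 2))
```

The same function could be applied to parameters fitted on real data in order to compare the model-based threshold with the intersection of the empirical densities.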
It is important to note that the Markov switching models suggested here are general switching autoregressive models, which might not perfectly represent the structural patterns observed in each and every MWL dataset. This also holds true for the other models. However, as these models are applied in a classification context, there is usually more concern about overfitting. Overfitting arises when the model follows the data too closely and reproduces irrelevant details, which impairs its capability to predict future observations. In our case, however, we do not use the ground truth on dry and wet periods from the disdrometers for training and only compare the performance after classification.
Another fundamental problem that needs to be addressed in future studies is the influence of wet antenna effects on the classification into dry and wet periods. Most commercial microwave links do not have shielded antennas. Consequently, they experience some additional attenuation due to a thin water film formed on the surface of the antennas. This effect can be in the order of several dB and must be taken into account when estimating dry and rainy periods, especially during and immediately after a rain event, when the antenna can stay wet for several hours. Future investigations could consider two different states for dry periods, depending on the state of the antenna:

$$
S_t = \begin{cases} 0 & \text{for every dry period with dry antenna} \\ 1 & \text{for every dry period with wet antenna} \\ 2 & \text{for every rainy period (with wet antenna)} \end{cases} \tag{12}
$$

In this case, a possible attenuation model could be given by

$$
A_t = \begin{cases} \mu_0 + \varepsilon_0 & \text{for every dry period with dry antenna} \\ \mu_1 + \varepsilon_1 & \text{for every dry period with wet antenna} \\ \mu_2 + \varepsilon_2 & \text{for every rainy period} \end{cases} \tag{13}
$$

It must be noted, however, that such a model might be poorly identifiable, i.e., the parameters and states cannot be identified without ambiguity because of the uncertainty affecting the power measurements and because of the strong dependence between the model parameters.
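To give an impression of how the three-state formulation of Eqs. (12) and (13) behaves, the following Python sketch simulates a state sequence and the corresponding attenuation values. It is only an illustration under stated assumptions: the means, standard deviations and transition probabilities are hypothetical and not fitted to any dataset, Gaussian errors are assumed for all three states, and the transition matrix encodes the additional assumption that the antenna can only become wet through rain (no direct transition from state 0 to state 1) and only dries after the rain has stopped (no direct transition from state 2 to state 0).

```python
import random

# Hypothetical parameters, chosen only for illustration (not fitted values).
MU = {0: 45.0, 1: 47.0, 2: 52.0}    # mean attenuation per state (dB)
SIGMA = {0: 0.3, 1: 0.8, 2: 3.0}    # error standard deviation per state (dB)

# Row i lists P(S_t = j | S_{t-1} = i) for j = 0, 1, 2.
# Assumption: transitions 0 -> 1 and 2 -> 0 are not allowed.
P = {
    0: [0.98, 0.00, 0.02],
    1: [0.05, 0.90, 0.05],
    2: [0.00, 0.10, 0.90],
}


def simulate(n_steps, seed=0):
    """Simulate states S_t and attenuations A_t of the three-state model."""
    rng = random.Random(seed)
    state = 0  # start in a dry period with dry antenna
    states, attenuation = [], []
    for _ in range(n_steps):
        states.append(state)
        attenuation.append(rng.gauss(MU[state], SIGMA[state]))
        # Draw the next state from the row of the current state.
        state = rng.choices([0, 1, 2], weights=P[state])[0]
    return states, attenuation


states, attenuation = simulate(1000)
```

With the values assumed here, the distributions of states 0 and 1 overlap considerably compared to the measurement noise, which illustrates why the two dry states may be difficult to separate in practice.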

Conclusions
In this article, a new algorithm based on a Markov switching model has been introduced to classify attenuation measurements from commercial microwave links into dry and rainy periods. The performance of the algorithm has been evaluated using real data from a new and original experimental set-up and compared to 3 existing classification methods. The results show that the Markov switching algorithm performs well and that its classification performance can be increased if multiple channel inputs are considered. This is a clear advantage compared to other univariate algorithms from the literature, which cannot be generalized easily to the multivariate case. The fact that the Markov switching model does not require any empirically estimated threshold parameters is a further advantage.
The experimental set-up described in Sect. 3 provides a unique platform from which various aspects of rainfall retrieval using MWL can be investigated. For example, it is now possible to rigorously evaluate and compare the classification performances of the different algorithms presented in Sect. 2, a task that is difficult with single rain gauges, which are usually not located directly under the MWL beam. The potential applications and scientific value of this experiment go, however, far beyond the simple application presented in this article. Future studies will, for example, investigate the effect of wet antenna bias on retrieved rain rates, explore how the attenuation of orthogonal polarizations can be used to retrieve the effective drop size distribution (DSD) along the link path, and examine the possibility of using multiple channels to improve the accuracy of the rain rate estimates.

Appendix A SNMP Programming on Mini-Link
The microwave link used in this experiment is an "Ericsson Mini-link TN ETSI", a widely used platform in commercial telecommunication applications. However, it is not a dedicated remote sensing device, which required the development of custom software for high-frequency data logging. As the Mini-link is relatively inexpensive and widely used, details of our solution might be of use to others and facilitate future studies.
The Mini-link can provide some management information through software called Mini-link Craft. In the case of rainfall estimation, the major parameters of interest are the transmitted and received powers. However, in its initial configuration, these values are only provided at a 15 min temporal resolution. This is clearly not enough considering the temporal and spatial dynamics of rainfall. Consequently, a data acquisition routine based on the Simple Network Management Protocol (SNMP) has been implemented to query the transmitted and received powers at a much higher temporal resolution.
SNMP is an Internet-standard protocol for managing devices on IP networks. It allows network management systems to monitor the conditions of network devices. Three SNMP versions can be distinguished: the initial implementation (SNMP v1) and its revised versions, SNMP v2 and SNMP v3, which offer enhanced security for Internet communications. In our case, a Windows/C++ library called SNMP++ (Mellquist, 1997) was used to query the MWL data. The SNMP implementation for the Mini-link uses three key software components, as shown in Fig. A1.
SNMP manager: the client software running on the administrator's computer.
SNMP agent: the server software running on the Mini-link.
Management information base (MIB): the MIB is a virtual database, i.e., a hierarchically arranged collection of information that lists all objects that can be accessed via SNMP reading/writing operations. Each object has a unique object identifier (OID).
Communication is usually initiated by the manager, which sends a request to the agent, for example via the GetRequest command for read operations (write operations use the corresponding SetRequest command). The agent reads/writes the desired values from/to the local information base and sends a response via the Response command, together with status information. If the manager and the agent are connected through the Internet, IP addresses need to be assigned for remote communications. The SNMP manager and agent can also be connected locally through a USB connection.
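The request/response exchange described above can be illustrated with a small in-memory model written in Python. This is a conceptual sketch only: it does not perform any network communication, it is not the SNMP++ implementation used in this study, and the class names and OIDs are hypothetical placeholders (the actual OIDs of the Mini-link are not reproduced here). The status names "noError" and "noSuchName" follow the error-status values defined for SNMP v1.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Toy counterpart of the SNMP Response message."""
    oid: str
    value: object
    error_status: str  # "noError" or "noSuchName"


class ToyAgent:
    """Stands in for the agent on the device; the MIB is modelled as a dict keyed by OID."""

    def __init__(self, mib):
        self.mib = dict(mib)

    def get_request(self, oid):
        # Look up the requested object and report the status of the operation.
        if oid in self.mib:
            return Response(oid, self.mib[oid], "noError")
        return Response(oid, None, "noSuchName")


class ToyManager:
    """Stands in for the manager on the administrator's computer."""

    def __init__(self, agent):
        self.agent = agent

    def read(self, oid):
        response = self.agent.get_request(oid)
        if response.error_status != "noError":
            raise KeyError(f"{oid}: {response.error_status}")
        return response.value


# Hypothetical OIDs for the transmitted and received powers of one channel.
TX_OID = "1.3.6.1.4.1.99999.1.1"
RX_OID = "1.3.6.1.4.1.99999.1.2"

agent = ToyAgent({TX_OID: 10.0, RX_OID: -38.5})
manager = ToyManager(agent)
tx_power = manager.read(TX_OID)
rx_power = manager.read(RX_OID)
```

In a real deployment, the `get_request` call would be replaced by a GetRequest sent over the network by an SNMP library, while the lookup by OID and the status check would remain conceptually the same.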

Z. Wang et al.: Markov switching models to infer dry/rainy periods
Once the OIDs for each parameter and channel have been identified, the implementation of the data acquisition and logging is straightforward. A flowchart illustrating the procedure is shown in Fig. A2. For our application, the retrieved data are organised into daily files. Each file contains the date (dd/mm/yy), time (HH:MM:SS) in UTC and the transmitted/received powers (in dB) for all the considered channels. The path-integrated attenuation is then derived by subtracting the received power from the transmitted one. The data acquisition software can be made available on request.
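As an illustration of the last processing step, the following Python sketch parses one record of a daily file and derives the path-integrated attenuation for each channel. The exact layout of the files is not specified here, so the sketch assumes whitespace-separated columns in the order date, time, and then one transmitted/received power pair per channel; the function name and the sample record are hypothetical.

```python
from datetime import datetime, timezone


def parse_record(line):
    """Parse one log record and return (UTC timestamp, attenuation per channel).

    Assumed layout: dd/mm/yy HH:MM:SS tx_1 rx_1 tx_2 rx_2 ...
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("record too short")
    timestamp = datetime.strptime(
        f"{fields[0]} {fields[1]}", "%d/%m/%y %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    powers = [float(value) for value in fields[2:]]
    if len(powers) % 2 != 0:
        raise ValueError("expected one transmitted/received pair per channel")
    # Attenuation = transmitted power minus received power, channel by channel.
    attenuation = [tx - rx for tx, rx in zip(powers[0::2], powers[1::2])]
    return timestamp, attenuation


# Hypothetical record with two channels.
timestamp, attenuation = parse_record("05/06/09 14:30:10 10.0 -38.5 10.0 -40.0")
```

The resulting attenuation series, one per channel, is the input expected by the classification algorithms.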
used a 23.3 km, dual-frequency research link with 22 rain gauges and radar data. However, only 4 or 5 rain gauges were reasonably close to the considered link. More recently, Leijnse et al. (2007c) used a 4.89 km, 27 GHz research link with 6 rain gauges placed along the path of the link. Finally, Zinevich et al. (

Fig. 3. Illustration of the classification performances of the univariate Markov switching algorithm (MSU), the simple threshold, the Factor Graph and the moving window algorithms on a subset of dataset 1 (stationary case). Displayed are the observations from channel 1. The time is given in UTC.

Fig. 4. Illustration of the classification performances of the algorithms on a subset of dataset 2 (non-stationary case). Displayed are the observations from channel 4. Note that the results of the MSU, which did not converge for this channel, have been replaced by the results of the MSM.

Fig. 5. Empirical probability density functions of attenuation values for dry and rainy periods (for dataset 1). The dashed lines represent the fitted densities of a Gaussian distribution with the same mean and variance as the samples. The dry and rainy periods are derived from the disdrometer data.

Fig. A1. SNMP client/server concept to communicate with and manage the Mini-Link.

Table 1. Longitude, latitude, height and frequencies of the installed microwave link.

Table 3. Classification performances for dataset 2 (non-stationary case). Same format as in Table 2. Note that no model parameters could be fitted for the MSU algorithm on channels 2 and 4.