Tables

Abstract. The paper introduces a stochastic technique for forecasting rainfall in space-time domain: the PRAISEST Model (Prediction of Rainfall Amount Inside Storm Events: Space and Time). The model is based on the assumption that the rainfall height H accumulated on an interval Δ t between the instants i Δ t and (i+1)Δ t and on a spatial cell of size Δ x Δ y is correlated either with a variable Z , representing antecedent precipitation at the same point, either with a variable W , representing simultaneous rainfall at neighbour cells. The mathematical background is given by a joined probability density f H,W,Z ( h,w,z ) in which the variables have a mixed nature, that is a finite probability for null value and infinitesimal probabilities for the positive values. As study area, the Calabria region, in Southern Italy, has been selected. The region has been discretised by 10 km×10 km cell grid, according to the raingauge network density in this area. Storm events belonging to 1990–2004 period were analyzed to test performances of the PRAISEST model.


Introduction
The risk mitigation in landslide or flood prone areas is one of the most important topics in environmental sciences.In the next future this relevance will increase owing to the development of mitigation and adaptation policies related to climatic change (Stern, 2006;IPCC, 2007).In this scenario the non structural measures, mainly based on early warning system, will play, increasingly, a relevant role.So all the related hydrological topics like rainfall-runoff modelling or rainfall-landslide relationships will be also developed, with the general scientific aim of realizing an accurate simulation of the real phenomena, but also in order to forecast landslide Correspondence to: D. L. De Luca (davideluca@dds.unical.it)and flood events with a lag time large enough for activating civil protection measures.
Indeed, in all the cases where the phenomenon rapidly evolves, like flash floods or shallow landslides, the lag time between observed rainfall and flood or landslide occurrence results too short and must be extended by rainfall fields forecasting.This is the main reason for the rise of the interest in this topic (Reed et al., 2007;Bloschl et al., 2008).
In the technical literature rainfall forecasting models can be classified in time stochastic models, space-time stochastic models and meteorological ones.
In the first class, we consider the "discrete time-series models", that include AutoRegressive Stochastic Models (Box and Jenkins, 1976;Toth et al., 2000).They describe the rainfall process at discrete time steps, are not intermittent and therefore can be applied for describing the "within storm" rainfall.Space-temporal stochastic models can be classified in Multivariate models and Multidimensional ones.The former consider several rain gauges simultaneously and are intended to preserve the covariance structure of the historical rainfall data existing in the network points.On the temporal axis, forecasting is made by autoregressive scheme (STARMA and CARMA models, Cliff et al., 1975;Burlando et al., 1996).
The latter models attempt to characterize the rainfall phenomenon at every point over the area of interest (Bras and Rodriguez-Iturbe, 1984;Meiring et al., 1997).
Finally, meteorological models (Untch et al., 2006) solve in numerical way partial differential equations of atmosphere thermodynamics: they can be classified in GCM (Global Circulation Models) and LAM (Limited Area Models).
Meteorological models are useful qualitative-quantitative rainfall forecasting tools on 24-72 h interval and on large spatial scale.In such cases, indeed, absolute precision is not required for practical application, then the precision of forecasting model is quite enough.When both the forecasting lag time and spatial scale decrease the effectiveness and the precision of this kind of models also decrease (Koussis et al., 2003;Bartholmes and Todini, 2005;Sharma et al., 2007).Unfortunately this is the time space scale of the fast phenomena (flash floods and shallow landslide) that require rainfall forecast for civil protection measures.Then it is not plenty profitable to rely on meteorological models for quantitative rainfall forecasting, as the probability of both missed and false alarms may be too large.
Consequently, in order to perform short term real-time rainfall forecasts for small basins (i.e. with size ranging 100-1000 km 2 ), stochastic models appear to be competitive, as they take account of the hydrological characteristics of the investigated area.
Nevertheless, stochastic models input is only constituted by antecedent rainfalls, so they provide the same prevision, whether meteorological models forecast a wet period or a dry one.For these reasons, coupling stochastic and meteorological models appears a very interesting topic for rainfall forecasting in the small time space scale (Di Tria et al., 1999;Sirangelo et al., 2006).
This work introduces a space-temporal models to forecast rainfall fields named PRAISEST (Prediction of Rainfall Amount Inside Storm Events: Space and Time).It is a multidimensional space-time model, that can be considered like the generalization of the at-site model PRAISE proposed by the authors (Sirangelo et al., 2007).
PRAISEST is based on the assumption that the evaluation rainfall height H , accumulated over an interval t and on a spatial cell of size x y, depends on antecedent precipitation at the same site, and on rainfalls of neighbour cells.
In the following Sections the theoretical bases of the proposed model (Sect.2), fitting techniques (Sect.3) and rainfall generation algorithm (Sect.4) are discussed.The application of the model to the case study of Calabria region, in Southern Italy, is reported in Sect. 5.

Identification of random variables
In the PRAISEST model there are three fundamental random variables: a) H i+1 , defined as the rainfall height on the forecasting interval t between the instants i t and (i+1) t for a cell having area x y.
b) for a fixed lag in time, ν, and for a given pixel where the future rainfall H i+1 must be estimated, construct the autoregression of rainfall heights (1) where α j are autoregressive coefficients to be estimated for each lag, with the conditions 0<α j ≤1, for j =0, 1, ..., ν−1, c) W i+1 , defined as a weighted average of rainfall heights over the four closest neighbouring pixels: where the superscript indicates the pixel in the neighbourhood, as shown in Fig. 1.
Considering larger neighbourhoods or a different weighting procedure, to allow for either anisotropy or influence from a larger region, implies a greater number of parameters and gives us similar results.
As regards Z (ν) i , the linear stochastic dependence between the random variable H i+1 and the generic antecedent random variables H i−j , j =0, 1, 2, ..., i.e. the extension of the temporal "memory" ν of the rain field, for every cell, can be assumed equal to the minimum value of ν for which the sample maximum absolute scattering χ r (ν) (Sirangelo et al., 2007) results less than a fixed critical value χ r,cr .
The coefficients α j can be estimated by maximization of the coefficient of linear correlation ρ H i+1 Z (ν) i .If ν is large the number of parameters may be too high, then a technique of linear filtering results convenient, as the coefficients α j depend on a reduce number of parameters.
In this paper, the gamma-power function is used as filter (De Luca, 2005), depending on three parameters, that provides a good fitting of the estimated values of α j .
Obviously if the field can be considered locally isotropic, the coefficients β j assume the same value equal to 1/4.
For each pixel the aim is then to predict the rainfall total H i+1 at a subsequent timestep considering the triple , where W i+1 is a weighted average at the forecast timestep and this results in a implicit scheme.Section 4 outlines a methodology for using this scheme in the forecasting of rainfall fields.
The PRAISEST Model can be used considering several temporal and spatial scales, according to the available rainfall data.In the case of high-resolution timestep, for example hourly or sub-hourly, storm advection may be reproduced, in stochastic way, by the correlation structure referred to the spatial neighbourhood approach.
In the following, for notation simplicity, the subscripts of random variables H , W and Z will be removed where possible.

Structure of the joint and conditional probability density
Starting from the joint probability density f H,W,Z (h, w, z) is important for deriving the expression of the conditional one f H |W,Z (h|w, z), used to forecast rainfall fields.
To identify f H,W,Z (h, w, z) it is necessary to consider the mixed nature of random variables H , W and Z.All the three variables are non-negative and characterized by a finite probability in correspondence of the null value and by infinitesimal probabilities in correspondence of the positive values.
The conditional distribution f H |W, Z (h|w, z), necessary to perfom the forecasting, is characterized by four separate cases, due to W null or positive and Z null or positive.
Considering the cumulative distribution function (CDF) F H |W,Z (h|w, z), we obtain the following expressions: where G H,0,0 (h) is the CDF of g H,0,0 (h) and the probability referred to the event H =0|W =0 ∩ Z=0 is: -if W >0 and Z=0 then where G H |W,0 (h|w) is the CDF of the conditional density g H |W,0 (h|w), g W (w) is the marginal density with the respect to W , derived from g H,W,0 (h, w), and the probability referred to the event H =0|W >0 ∩ Z=0 is: where G H |0,Z (h|z) is the CDF of the conditional density g H |0,Z (h|z), g Z (z) is the marginal density with the respect to Z, derived from g H,0,Z (h, z), and the probability referred to the event H =0|W =0 ∩ Z>0 is: -if W >0 and Z>0 then where G H |W,Z (h|w, z) is the CDF of the conditional density g H |W,Z (h |w, z), g W,Z (w, z) is the marginal density with the respect to W and Z, derived from g H,W,Z (h, w, z), and the probability referred to the event H =0|W >0 ∩ Z>0 is: In each case reported in Eqs.(6-12) the marginal distribution of the positive variables H , W and Z will be assumed to follow a Weibull distribution F X (x) =1− exp (−λx η ), where the parameters λ and η can be estimated via the method of moments.
It is important to emphasise that each conditioning case will have different power transformation parameters, as the distribution of one variable differs depending on whether the other variables are positive or zero.Moreover, the whole ensemble of parameters varies from a pixel to each other.After performing a power transformation, to model the joint density of the variables it suffices to consider multivariate exponential distributions (Kotz et al., 2000).The Moran-Dowton multivariate exponential distribution for a vector X having p unit mean marginals is written as: where θ is an association parameter that is the same between all dimensions and where The Eq. ( 17) is characterized by exponential marginal density functions.The association parameter respect the condition θ ≥ 1 and it is related to the linear correlation coefficient between any two variables X i and X j as ρ i,j =1−1 θ.Consequently, if θ=1 then ρ i,j = 0 and the Eq. ( 17) can be written as: that implies the independence among the random variables.From the Eq. ( 17) the conditional density g H |W,Z (h|w, z) can be written as: where (λ h , η h ), (λ w , η w ), (λ z , η z ) are the power transformation parameters, respectively, for h, w, z and θ hwz denotes that the association parameter is estimated from the observations (h>0, w>0, z>0).As θ hwz is the same in each dimension it is evaluated from all of the pairs within this group (h, w), (h, z), and (w, z).
The conditional density g H |W,0 (h|w) can be written as: where the power transformation parameters (λ h , η h ), (λ w , η w ) are different from those of Eq. ( 20) because they are evaluated from the observations (h>0, w>0, z=0); the same ensemble of observations is used for the θ hw estimation.
A similar expression for g H |0,Z (h|z) can be developed requiring a separate θ hz that is estimated from the recordered data (h>0, w=0, z>0).Starting from Eq. ( 17), it is easily to determine the structure of the probability density function g 0,W,Z (w, z), while g H,0,0 (h), g 0,W,0 (w) and g 0,0,Z (z) are simply the univariate Weibull distributions and do not require any additional parameters apart from the power transformation ones.

Model calibration
The trivariate probability distribution function f H,W,Z (h, w, z) presents 42 parameters for every pixel.This number is not too large, if we consider that in 10 years there are 87 600 hourly data (supposing that the rainfall process is stationary in time during the whole year) related to every cell, i.e. the d/p (data/parameters in a generic cell) ratio is approximatively equal to 2050, and remains high enough (about 150) also if positive rainfall data are only considered.Using raingauge data and cells domain the ratio does not change if the number of stations and cells are similar.
This d/p ratio value allows consistent evaluations of PRAISEST parameters referred to the whole spatial domain.
The probabilities p H,W,Z , p H,W,0 , p H,0,Z , p 0,W,Z , p H,0,0 , p 0,W,0 and p 0,0,Z can be estimated by the frequencies of the corresponding events.
The power transformation parameters referred to the probability densities of the Eq. ( 5) can be estimated by using classic expressions of Weibull distribution parameters estimation, referred to the method of moments.
The estimation of the association parameter θ hwz , is performed minimizing the following function: where r H W , r H Z , r W Z are the sample linear correlation coefficients, and the sum of the weights ω 1 , ω 2 and ω 3 is unitary.The Eq. ( 22) depends only on the parameter θ hwz , since the remaining parameters have been evaluated in a previous step.
With similar procedures, the association parameters of the remaining density functions of Eq. ( 5) can be evaluated.

Rain fields generation algorithms
The generation of the rainfall heights on the entire domain differs from the standard Monte Carlo approach.In fact, at the current time i, the values of the random variable Z i in every cell are known, but values of W i+1 and H i+1 on the entire domain must be generated.Such generations cannot be carried out independently cell by cell, because the variables are linked by congruence equations.This problem has been solved using the following "Chess-Board" algorithm (Fig. 2): 1. for every "0" cells of the spatial domain, knowing the value of Z i , generation is made using the random number R (0) U , by the formula h i =z (0) i , i.e. the variable H i+1 is generated supposing zero as lower bound for W i+1 .The formulation of the density F H |Z (h|z), marginal with respect to W and conditional with respect to Z, is easily obtained from the Eq. ( 5).This type of generation is justified because, in the model, rainfall heights in the "0" cells are each other independent.

As regards the "1" cells, knowing the value of
i+1 is set equal to the linear combination of the H (0) i+1 in the neighbours "0" cells.
Consequently, generation is made by the formula h ing the random number R (1) U .The equations referred to h (0) i+1 and h (1) i+1 are numerically solved by using bracketing and regula falsi techniques (Press et al., 1988).

Parameters estimation
PRAISEST model has been applied using the hourly rain heights database of the tele-metering raingauge network of the "Centro Funzionale Meteorologico Idrografico Mareografico" of the Calabria region.The network, extended all over the Calabria and Basilicata regions has 92 stations for the period 1990-2001, and 126 stations from 2002 to 2004 (Fig. 3).Approximately, 13 million of hourly rainfalls form the database, of which about 7% are rainy.
The region was discretized by 10 km×10 km cell grid, according to the raingauge network density.
In order to respect the hypothesis of stationary process, only the data measured during the "rainy season", 1 October-31 May have been used (De Luca, 2005).In this period, correlation structure, mean and variance of the sample appear significantly homogeneous (Sirangelo et al., 2007).So the d/p ratio is equal to about 2050 and about 150 considering only rainy intervals.
The historical series do not refer to a regular mesh, so the model parameters have been estimated for every raingauge and then mapped on the regular discretized domain by using a surface spline technique (Yu, 2001).
The extension of the "temporal memory", i.e. the parameter ν, has been determined for every raingauge starting from sample partial autocorrelogram, and using the technique cited at point 2.1 (Sirangelo et al., 2007).The value of χ r,cr has been fixed equal to 0.025, and the estimate of ν has been ν=8, as depicted in Fig. 4, where the mean value of the scattering χ r (ν), considering all the telemeter-raingauges, is represented.
The coefficients α j have been calculated, for every raingauge, applying the gamma-power function as filter (Eqs.2-3).
For the evaluation of coefficients β j defining W i+1 (Eq.4), sample directional spatial correlograms have been analysed, using simultaneous rainfall, and considering, on abscissa, distance between raingauges over the range [5 km; 15 km], representing spatial resolution of the region, discretized by 10 km×10 km cell grid.Four circular sectors of 90 degrees, centred in the NE, SE, SW, NW directions have been considered.Sample correlation values appear to be independent on the directions, as depicted in Fig. 5, so the rain fields can be considered as local isotropic, and then β j =1/4, j =1, 2, 3, 4.
Table 1 reports a set of estimated parameters, evaluated as described in Sect.3. Figure 6 shows an example of parameter mapping in Calabria region, referred to θ hwz , for H >0 ∩ W >0 ∩ Z>0 event.
Greater values are located in the Southern Calabria, so in this area the variables H , W and Z appear more strongly correlated.

Model validation
The hourly scattered data, referred to raingauges network, have been interpolated, by a surface spline technique, on regular mesh, of 10×10 km square cells.
Each rain field simulation requires the knowledge of the rainfalls during the eight previous hours, equal to the temporal memory.The rain field simulations can be carried out for the successive hours, but the temporal extension of the forecasting should not exceed six hours, to avoid rainfall estimations based on only simulated precipitations.Beyond this limit the uncertainty in rainfall evaluation increases, as the influence of recorded rainfall decreases.
In the application here described, 10 000 simulations of the process have been carried out, by Monte Carlo technique described in Sect.4, in order to obtain a large synthetic sample, necessary to determine probabilistic distribution of rainfall height for every pixel and for each hour.The Monte Carlo technique is adopted because of the complexity of determining analytical probabilistic distributions for forecast rainfall during the hours successive to the first one.For these distributions, convolution operations are required.
Firstly, the validation is carried out focusing our attention on the ensemble of events inside the storm development characterized by Z 0 >1 mm (where the subscript "0" is related the eight previous hours of recorded data), which are of major interest for our approach, and then testing the model performance on reproducing, for each of the successive six hours of forecasting, the following sample moments which are not fitted in the parameters estimation: -mean m H |Z 0 >1 and standard deviations s H |Z 0 >1 ; -spatial correlation r H W |Z 0 >1 and autocorrelation of lag   -dry ratios f 00|Z 0 >1 , wet-to-dry f 10|Z 0 >1 , dry-to wet f 01|Z 0 >1 and wet-to-wet f 11|Z 0 >1 ratios for two subsequent images.
The number of events is about 2000 and the results are reported in Figs.7-8, in which the histograms represent the sample values averaged on the whole spatial domain.Moreover for each index, mean value and 95% band of PRAISEST simulations are reported.The figures demonstrate the overall capability of the method to reproduce the rainfall fields properties.
Moreover, as examples of model output, the applications relative to 1 February 1998 and 24 November 1999 events in the Calabria region are illustrated in Figs.9-10.
For the first event the rain fields from 07:00 p.m. of 31 January to 03:00 a.m. of 1 February have been used as model memory, and the simulation period starts from 03:00 a.m and finishes at 09:00 a.m. of 1 February.The second event is characterized by a rain field memory from 08:00 p.m. of 23 November to 04:00 a.m. of 24 November , while the simulation period starts from 04:00 a.m and finishes at 10:00 a.m. of 24 November.On the abscissa, the cells are sorted from left to right, and from North to South.In the figures, besides the rain histograms effectively occurred, for every cell, percentiles 90% and 95% of the simulated fields are reported.Following the axis of the abscissas, the chief towns of  Cosenza (CS), Crotone (KR), Catanzaro (CZ), Vibo Valentia (VV) and Reggio Calabria (RC) are met in this order.One tail significance test, at 5% and 10% significance level, has been performed.The diagrams show that observed rainfall for all, but one, cells are inferior to percentiles 95% of forecast values.In the most cases observed values are also inferior to percentiles 90% of forecast ones.Then in all cases the results obtained by PRAISEST model seem in agreement with observed data.

Conclusions
The PRAISEST model, presented herein, is a multidimensional space-time model for forecasting rainfall fields.Mathematical background is characterized by a trivariate proba-bility distribution, referred to the random variables H , Z and W , representing rainfall to forecast at the generic cell, antecedent precipitation in the same cell and rainfall in the adjacent cells.
The results in simulation at regional scale show the capability of the model to reproduce, for each forecasting hour, mean and standard deviations values, autocorrelations and spatial correlations, dry ratios, wet-to-dry, dry-to-wet and wet-to-wet ratios for two subsequent images.
Consequently it is able to provide distribution functions of the rainfall in the successive hours that are in agreement with the observed values, at least at 10% significance level.
PRAISEST therefore can be easily coupled with other models like rainfall-runoff and rainfall-landslide ones for nowcasting of fast phenomena, characterized by short lag time, like flash floods and shallow landslides.
Moreover the model is highly flexible and can be usefully adopted with any cell grid, within the limit of the raingauge density.Then it is very suitable for the analysis of spatial rainfall data like the radar derived ones, which give a finer spatial description of the precipitation fields.Radar data are, in fact, compatible with the cell organization of the PRAIS-EST spatial domain.
The proposed model also seems appropriate for coupling with meteorogical models in order to realize a Bayesian approach to rainfall nowcasting.

Table 1 .
Example of estimated parameters set.