1 Introduction

Point process theory has been widely utilised to develop stochastic models for rainfall collected at daily, hourly or sub-hourly aggregations. Rainfall can be thought of as a random process evolving continuously over time, which is usually recorded as cumulative amounts over disjoint time intervals of constant length, such as hours or days. One way to apply point process theory to modelling rainfall is to assume the existence of an underlying continuous-time rainfall generating mechanism which evolves randomly over time and whose outcome is only observed as the integral of the continuous process over the given sampling interval. There has been a substantial amount of work over the years on point process models for rainfall, see for example Onof et al. (2000) and Kaczmarska et al. (2014) amongst others.

Amongst the literature on stochastic models for rainfall, clustered point process models have featured heavily as they preserve the clustering properties of the rain generating mechanism. For example, summer rainfall often occurs in showers, i.e. heavy rainfall of short duration whereas the winter rainfall tends to be frontal. Most cluster-based models considered for this purpose are based on either Neyman–Scott or Bartlett–Lewis cluster processes (Cox and Isham 1980; Rodriguez-Iturbe et al. 1987), whereby these two are equivalent up to the second-order level (Cowpertwait 1998).

Rodriguez-Iturbe et al. (1987) developed stochastic models with rectangular pulses for rainfall at a single site based on Poisson and Poisson-clustered point processes. These original models were then followed by spatial-temporal extensions (e.g. Cox and Isham 1988). Focusing here upon the purely temporal models, the following important developments can be singled out:

  1. 1.

    the parameters describing the temporal structure of a storm were randomised so that each storm could have a different frequency of cells arrivals, its own cell duration and storm duration distributions (Rodriguez-Iturbe et al. 1988) so as to improve the reproduction of dry periods at a range of time-scales; this work was taken further (Kaczmarska et al. 2014) to include the cell intensity distribution parameter into the randomisation;

  2. 2.

    other distributions were used for the cell intensity (Onof and Wheater 1994), and in particular a dependence between cell intensity and duration was introduced (Evin and Favre 2008);

  3. 3.

    two cell-types were considered for each season (Cowpertwait 1994);

  4. 4.

    work on the most useful fitting statistics was carried out (Khaliq and Cunane 1996);

  5. 5.

    the models were regionalised (Kim et al. 2016);

  6. 6.

    the models’ ability to reproduce extremes was examined/improved (Verhoest et al. 1997; Cameron et al. 2000);

  7. 7.

    non-stationarity was introduced to reproduce temporally evolving rainfall properties (Burton et al. 2010; Evin and Favre 2013; Kaczmarska et al. 2015);

  8. 8.

    other types of pulses were considered, e.g. by adding a jitter to a rectangular shape (Onof and Wheater 1994) or replacing the pulse by the clustering of a sequence of instantaneous pulses (Cowpertwait et al. 2007);

  9. 9.

    Bartlett-Lewis rectangular pulse models were used in a disaggregation framework (Koutsoyiannis and Onof 2001; Kossieris et al. 2016).

Under the penultimate heading listed above, while some work was carried out on examining different pulse shapes, this was not published (Samuel 1999). However, this work pointed to the usefulness of considering exponentially decaying cell shapes, in particular when it comes to reproducing the properties of rainfall at the fine temporal scale.

Given the encouraging results obtained by using doubly stochastic Poisson point processes rather than Poisson cluster processes as the driving point process (Ramesh et al. 2012, 2013) it follows that attaching an exponentially decaying pulse to each point of such a process is worth exploring, in particular if the focus is on the reproduction of the properties of fine-scale rainfall.

The aim of this paper is to develop a class of doubly stochastic Poisson process (DSPP) models with exponentially decaying pulses to describe the probabilistic structure of the rainfall at a single rain-gauge. The model we consider here is similar in structure to those described in Thayakaran and Ramesh (2017), Ramesh and Thayakaran (2012) and Ramesh (1998) but is different in its form due to exponential pulses. The proposed model is applied to a set of sub-hourly rainfall data from Bracknell in England, obtained from the U.K. Meteorological Office, to illustrate its application in the modelling of temporal rainfall.

Mathematical formulation of the proposed doubly stochastic exponential pulse model is described in Sect. 2. Second-moment characteristics of the rainfall intensity are studied and expressions for the aggregated rainfall processes are derived in this section. A data analysis, which employed two different versions of the model, using 15 years of rainfall data from England is presented in Sects. 3 and 4. Section 5 compares the performance of the proposed model with that of a doubly stochastic rectangular pulse model. A Gamma distribution for the initial pulse depth is considered in Sect. 6. Conclusions and suggestions for further work are reported in Sect. 7.

2 DSPP exponential pulse model

2.1 Model description

Suppose that the rainfall bursts at a location occur according to a stationary doubly stochastic Poisson process \(\{N(t)\}\) whose arrival rate is controlled by a continuous time Markov chain on two states representing environmental conditions giving rise to light (or dry) and heavy (wet) rainfall episodes. Let \(\lambda\) and \(\mu\) be the transition rates of the Markov chain and \(\phi _{1}, \phi _{2}\) be the arrival rates of bursts in the two states. Associated with each ‘burst’ of the process \(\{N(t)\}\) is an exponential pulse of random ‘depth’ X which decays exponentially at a rate \(\beta\). The pulses terminate after a fixed duration d. We will leave the distribution of the initial depth X of the pulses unspecified with a mean of \(\mu _X\). Therefore, the rainfall intensity, Y(t), at time t is the sum of all the ‘active’ pulses at t. It is assumed that the pulses are mutually independent, and also independent of the point process \(\{N(t)\}\). Figure 1 provides a schematic description of the pulse process. The rainfall intensity Y(t), at time t, may be written as

$$\begin{aligned} Y(t) = \sum _{t_i \in [t-d,t]} X_{t_i} = \int _{u=0}^{d} X_{t-u}(u) dN(t-u), \end{aligned}$$
(1)

where \(X_{u}(\tau )\) is the random depth of the pulse originating at time u measured a time \(\tau\) later and N(t) counts the occurrences in the DSPP of pulse origins.

Fig. 1
figure 1

Schematic description of the exponential pulse model: a The arrival process of rainfall bursts based on a two-state DSPP. b The pulse process which originates with each burst and lasts for a fixed duration of d

2.2 Second-order properties of the intensity process

We first consider the rainfall intensity process Y(t) at time t. Since a pulse is only active for a constant duration d, for the model under consideration, we have

$$\begin{aligned} X_{t-u}(u) = \left\{ \begin{array}{ll} X e^{-\beta u} & \text{ with } \text{ probability } \text{1, } \text{ if } u \le d \\ 0 & \text{ with } \text{ probability } \text{1, } \text{ if } u>d. \end{array} \right. \end{aligned}$$

The second-order moment properties of Y(t) are related to the properties of the point process \(\{N(t)\}\). Taking expectations on both sides of (1) gives the mean of the rainfall intensity process as

$$\begin{aligned} E \left\{ Y(t) \right\}= & \int _{0}^{\infty } E \left\{ X e^{-\beta u} \right\} dN(t-u)\\= & \int _{0}^{d} E(X) e^{-\beta u} dN(t-u)\\=\, & m \mu _{X}(1-e^{-\beta d})/\beta \end{aligned}$$

where \(m=\frac{\lambda \phi _2+\mu \phi _1}{\lambda +\mu }\) is the mean intensity of the point process \(\{N(t)\}\) and \(\mu _{X}\) is the mean depth of a single pulse at its origin. The autocovariance of the rainfall intensity Y(t) at lag \(\tau\) is given by

$$\begin{aligned} C_{Y}(\tau )= & \text{ Cov } \left\{ Y(t), Y(t+\tau ) \right\} \\= & \int _{0}^{\infty } \int _{0}^{\infty } E \left\{ X_{t-u}(u) X_{t+\tau -v}(v) \right\} \text{ Cov } \left\{ dN(t-u), dN(t+\tau -v) \right\} , \end{aligned}$$

where \(\text{ Cov } \left\{ dN(t), dN(t+u) \right\}\) is the covariance density of the point process N(t) which is given by Ramesh (1995)

$$\begin{aligned} \text{ Cov } \left\{ dN(t), dN(t+u) \right\} = m \delta (u) + A e^{-(\lambda + \mu ) u} \end{aligned}$$

where \(\delta (.)\) is the Dirac delta function and the constant A is given by \(A=\frac{\lambda \mu (\phi _1-\phi _2)^2}{(\lambda +\mu )}.\) Substituting this in the above expression gives the autocovariance of the rainfall intensity process as

$$\begin{aligned} C_{Y}(\tau )= & \int _{0}^{\infty } \int _{0}^{\infty } E \left\{ X_{t-u}(u) X_{t+\tau -v}(v) \right\} \left\{ m \delta (u+\tau -v) + A e^{-(\lambda + \mu ) (u+\tau -v)} \right\} du dv\\= & m E(X^2) \int _{0}^{d} e^{-\beta (\tau +u)} du + A \mu _{X}^2 \int _{0}^{d} \int _{0}^{d} e^{-\beta (u+v)} e^{-(\lambda + \mu ) (u+\tau -v)} du dv. \end{aligned}$$

Computing this integral, after some algebra, gives us

$$\begin{aligned} C_{Y}(\tau ) =\left[ \frac{mE[X^2](1-e^{-\beta d})}{\beta } \right] e^{-\beta \tau } + \left[ \frac{A\mu _X^2\left( 1+e^{-2\beta d}-e^{-(\beta +\lambda +\mu )d}-e^{-(\beta -\lambda -\mu )d}\right) }{\left( \beta ^2-(\lambda +\mu )^2 \right) }\right] e^{-(\lambda + \mu )\tau }. \end{aligned}$$
(2)

We can obtain the variance of the rainfall intensity process by setting \(\tau =0\) in the above expression, i.e.

$$\begin{aligned} \text{ Var } \left\{ Y(t)\right\} =\left[ \frac{mE[X^2](1-e^{-\beta d})}{\beta } \right] + \left[ \frac{A\mu _X^2\left( 1+e^{-2\beta d}-e^{-(\beta +\lambda +\mu )d}-e^{-(\beta -\lambda -\mu )d}\right) }{\left( \beta ^2-(\lambda +\mu )^2 \right) }\right] . \end{aligned}$$
(3)

It is worth noticing that the above expressions for the mean, variance and autocovariance reduce to those of the Poisson exponential pulse model when \(\phi _{1}=\phi _{2}\).

2.3 Second-order properties of aggregated process

Since the interest primarily lies in studying the properties of rainfall in aggregated form, and much of the rainfall data is available in this form, we now derive the moment properties of the cumulative rainfall in disjoint time intervals of fixed length h. Let us define the cumulative rainfall totals in disjoint intervals of length h, for \(i=1,2,\ldots ,\) as

$$\begin{aligned} Y_{i}^{(h)} = \int _{(i-1)h}^{ih} Y(u) du. \end{aligned}$$

To derive the second-moment properties of this aggregated rainfall process we shall make use of the following general expressions from Rodriguez-Iturbe et al. (1987).

$$\begin{aligned} E \left\{ Y_{i}^{(h)} \right\}=\, & h E \left\{ Y(t) \right\} , \\ \text{ Var } \left\{ Y_{i}^{(h)} \right\}=\, & 2 \int _{0}^{h} (h-u) C_{Y} (u) du, \\ \text{ Cov } \left\{ Y_{i}^{(h)}, Y_{i+k}^{(h)} \right\}=\, & \int _{-h}^{h} C_{Y} (kh+u) (h-\mid u \mid )du. \end{aligned}$$

For our DSPP exponential pulse model, using equations (1), (2) and (3) and the above expressions we find that that the mean, variance and covariance of the aggregated rainfall process may be written as

$$\begin{aligned} \mu (h)&= E\left\{ Y_i^{(h)}\right\} =\frac{(1-e^{-\beta d})m\mu _X h}{ \beta } \end{aligned}$$
(4)
$$\begin{aligned} \sigma ^2(h)&= \text{ Var } \left\{ Y_{i}^{(h)} \right\} =2K_1\left( \frac{h}{\beta }-\frac{1}{\beta ^2}+{\frac{e^{-\beta h}}{\beta ^2}}\right) +2K_2\left( \frac{h}{\lambda +\mu } - \frac{1}{(\lambda +\mu )^2} + \frac{e^{-(\lambda +\mu )h}}{(\lambda +\mu )^2}\right) \end{aligned}$$
(5)

and for \(k=1,2,\ldots ,\)

$$\begin{aligned} \text{ Cov } \left\{ Y_{i}^{(h)}, Y_{i+k}^{(h)} \right\} =K_1\left( \frac{1-e^{-\beta h}}{\beta }\right) ^2e^{-\beta (k-1)h} + K_2\left( \frac{1-e^{-(\lambda +\mu )*h}}{\lambda +\mu }\right) ^2e^{-(\lambda +\mu )(k-1)h} \end{aligned}$$
(6)

where \(K_1=\frac{mE(X^2)(1-e^{-\beta d})}{\beta }\) and \(K_2=\frac{A\mu _X^2\left( 1+e^{-2\beta d}-e^{-(\beta +\lambda +\mu )d}-e^{-(\beta -\lambda -\mu )d}\right) }{\left( \beta ^2-(\lambda +\mu )^2\right) }.\)

The above expression shows that the rate of decay of the autocorrelation function of the aggregated process is influenced not only by the transition rates, \(\lambda\) and \(\mu\), of the underlying Markov process but also by the decay rate \(\beta\) of the pulses. One convenient form for this model is to assume an exponential distribution for the initial pulse depth X. Other distributions like Gamma or Pareto can also be applied easily.

3 Model with variable pulse duration

In this section, we start off our analysis with a model that allows the pulse lifetime to vary, while keeping the exponential distribution for initial pulse depth. One way to do this is to take the pulse lifetime d as a random variable with a specified distribution. Another approach is to take d as a parameter of the model and seek to estimate it from the data, along with other parameters. We take this second approach in this paper. When d is taken as a parameter, the expressions (4) to (6) for mean, variance and covariance are still valid and we treat them as functions of one additional parameter.

In our application, we take the initial pulse depth \(X_{i}\) at the pulse origins as independent random variables with an exponential distribution which has mean \(\mu _X\). Hence, \(E(X)=\mu _X\) and \(E(X^2)=2\mu ^2_X\). Our model then has seven parameters and we estimate them by the method of moments approach using the observed and theoretical values of the second-order properties of the rainfall accumulations. The parameter \(\mu _{X}\) can also be estimated separately for each month using the sample mean and the following equation, which follows from (4),

$$\begin{aligned} \mu _{X}=\left( \frac{\hat{\beta }}{\hat{m} \left( 1-e^{-\hat{\beta } d}\right) }\right) \bar{x } \,\, \end{aligned}$$
(7)

where \(\bar{x}\) is the estimated average of hourly rainfall for each month.

We use the proposed exponentially decaying pulse model to analyse 15 years of sub-hourly rainfall data from Bracknell, England for the period 1986 to 2000 collected by the Meteorological Office. Table 1 gives the summary statistics for the mean, standard deviation, lag 1 autocorrelation, coefficient of variation, proportion of dry and wet period for aggregated rainfall at \(h=1/6, 1\) h for each month.

Table 1 Empirical summary statistics of Bracknell rainfall data

The mean \(\mu (h)\), standard deviation \(\sigma (h)\) and the lag one autocorrelation \(\rho (h)\) of the aggregated rainfall process are used to estimate all seven parameters of the model where

$$\begin{aligned} \rho (h)=\text {Corr}\left[ Y_{i}^{(h)},Y_{i+1}^{(h)}\right] . \end{aligned}$$

The above properties of the rainfall at three different aggregation levels (at h=10, 20, 60 min) are used in our estimation. The estimates of the functions from the empirical data, denoted by \(\hat{\mu }(h), \, \,\hat{\sigma }(h)\) and \(\hat{\rho }(h)\), are calculated for each month using 15 years of sub-hourly rainfall series accumulated at scales \(h=1/6, 1/3, 1\) h. The estimated values of the model parameters \(\hat{\lambda }\), \(\hat{\mu }\), \(\hat{\phi _{1}}\), \(\hat{\phi _{2}}\), \(\hat{\beta }\), \(\hat{d}\) and \(\hat{\mu }_X\) for each month can be obtained by minimizing the sum of squares of differences in the observed and fitted values of the functions, as given below using standard routines.

$$\begin{aligned} \sum _{h=\frac{1}{6},\frac{1}{3},1} \left[ \left( \hat{\mu }(h)-{\mu (h)} \right) ^2 + \left( \hat{\sigma }(h)-{\sigma (h)} \right) ^2 + \left( \hat{\rho }(h)-{\rho (h)} \right) ^2 \right] . \end{aligned}$$

Alternatively, Jesus and Chandler (2011) provide a useful generalised method of moment approach, using weighted estimation of the objective function, to estimate parameters. However, since we are only using a small data set of 15 years of rainfall, we follow the method used by Cowpertwait et al. (2007) and employ the objective function given below.

$$\begin{aligned} \sum _{h=\frac{1}{6},\frac{1}{3},1} \left[ \left( 1-\frac{\hat{\mu }(h)}{\mu (h)} \right) ^2 + \left( 1-\frac{{\mu }(h)}{\hat{\mu }(h)} \right) ^2 + \left( 1-\frac{\hat{\sigma }(h)}{\sigma (h)} \right) ^2 + \left( 1-\frac{{\sigma }(h)}{\hat{\sigma }(h)} \right) ^2 +\left( 1-\frac{\hat{\rho }(h)}{\rho (h)} \right) ^2 + \left( 1-\frac{{\rho }(h)}{\hat{\rho }(h)} \right) ^2 \right] . \end{aligned}$$
(8)

We utilise the above objective function to estimate the parameters of our model, where the pulse lifetime d is allowed to vary from one month to the next. This objective function is minimised numerically, using R routines (R Core Team 2017) that employ function evaluations as well as derivatives, separately for each month to obtain estimates of the model parameters. To avoid difficulties in optimisation, the approach we used was to employ an initial search algorithm that uses the Nelder-Mead downhill simplex method (which uses function evaluations only) to find a good region of optimal parameter values in the parameter space. A derivative based algorithm is then employed to find refined estimates. The parameter estimates for this model, when d is allowed to vary, are given in Table 2 for all 12 months.

Table 2 Parameter estimates for the exponential pulse model with variable lifetime

The estimates show that the mean sojourn times \(1/\mu\) of the wet state (State 2) are shorter in summer months than those of the winter months, with an average duration that varies from around 25 to 53 min. The rain pulse arrivals occur at a rate of about 40–60 per hour when the Markov chain is in the wet state. In addition, the values of \(\hat{\mu }_X\) are again larger for summer months showing higher initial rainfall intensity for the pulses. In the light rain state, the estimates of the pulse arrival rate \(\hat{\phi }_1\) are the smallest for summer months which contributes towards longer dry period. The estimates \(\hat{\beta }\) are also high, in general, which shows that the rain cells decay fast and deposit the rain quickly. The estimates of d suggest that the average duration of the pulse lifetime is between 1 and 2 h, except for the summer months July, August and also September which have shorter duration.

Fig. 2
figure 2

Observed (red) and fitted (blue) values of the mean rainfall at h = 1/12 h aggregation for the exponential initial pulse depth model with variable pulse duration d, along with a simulation band (black) from 1000 simulations

To assess how well our model performs, fitted values of the theoretical properties were calculated, using the estimated parameter values, at various aggregation levels and compared with the corresponding empirical values. Simulation bands were constructed based upon 1000 simulated realisations, each of length 15 years, from the fitted model. The simulation bands, taken as the maximum and minimum of the 1000 values of the statistics from the simulations, are also displayed in the figures throughout this section. In all the plots, the red line represents the empirical values, the blue line shows the fitted values and the dark dashed lines show the simulation bands.

The empirical and fitted means of the aggregated rainfall are in near perfect agreement in Fig. 2 showing that the mean rainfall has been reproduced well by the fitted model at \(h=1/12\) h. The same is true at all the other higher aggregations, as the mean was just scaled up by the values of h, and the plots (not shown here) displayed identical patterns. Figure 3 and 4 display the empirical and fitted values of the standard deviation of the accumulated rainfall at sub-hourly and higher aggregations, along with simulation bands. Here again both observed and fitted curves are in excellent agreement, at all aggregations. The simulation bands suggest that the sampling distribution of the standard deviation is skewed at sub-hourly aggregations. At higher aggregations that are not used in the fitting both the observed and fitted values are in near perfect agreement for all months and fall well within the simulation band.

Fig. 3
figure 3

Observed (red) and fitted (blue) values of the standard deviation of the aggregated rainfall at h = 1/12, 1/6, 1/3, 1/2 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

Fig. 4
figure 4

Observed (red) and fitted (blue) values of the standard deviation of the aggregated rainfall at h = 1, 6, 12, 24 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

The observed and fitted values of the lag 1 autocorrelation of the aggregated rainfall are in excellent agreement in Fig. 5 for sub-hourly accumulations as low as \(h=10\) min. Although the model performs poorly for \(h=5\) min, it does well at other larger values of h at sub-hourly scales. The model performs very well at \(h=1\), in Fig. 6, but shows slight differences between the empirical and fitted values for larger h. Nevertheless, the differences fall well within the simulation bands. Hence the fitted model appears to capture the pattern of autocorrelations well, at both lower and higher aggregations.

Fig. 5
figure 5

Observed (red) and fitted (blue) values of the autocorrelation of the aggregated rainfall at h = 1/12, 1/6, 1/3, 1/2 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

Fig. 6
figure 6

Observed (red) and fitted (blue) values of the autocorrelation of the aggregated rainfall at h = 1, 6, 12, 24 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

The model performs well for coefficient of variation at sub-hourly aggregations showing good agreement between the observed and fitted curves. However, both curves fall at the edge of the simulation band as shown in the top panel of Fig. 7 for \(h=1/2\). This indicates that the sampling distribution of the simulated values is highly skewed at sub-hourly aggregations. One point to bear in mind here is that the simulated values of the coefficient of variation have two sources of variation, the mean and the standard deviation, and their joint sampling distribution will certainly be one of the reasons for this behaviour. The model fits reasonably well at larger aggregations as shown in the other panels of Fig. 7 where the observed and fitted values are in good agreement. The skewness of the sampling distribution of the coefficient of variation also becomes less prominent when h becomes large.

Fig. 7
figure 7

Observed (red) and fitted (blue) values of the coefficient of variation of the aggregated rainfall at h = 1/2, 6, 12, 24 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

The empirical values of the proportion of dry periods are displayed in Fig. 8 together with simulation bands from the fitted model at finer aggregations. The model appears to reproduce these reasonably well and capture their pattern across the year quite well at \(h=1/12\), but not at other h values. In general, our model underestimates the proportion of dry periods (and overestimates the proportion of wet periods) at larger aggregations. This may have resulted from the occasional arrival of light rain pulses in state 1, generated by the model, which will impact more on the proportion of dry and wet values at larger values of h. These are, however, minor discrepancies given that these statistics are not used in the fitting and dependent more on the scale of measurement and affected by occasional arrival of rain pulses in state 1.

Fig. 8
figure 8

Observed (red) values of the proportion of dry periods of the aggregated rainfall at h = 1/12, 1/6 h for the exponential initial pulse depth model with variable pulse duration d, along with simulation bands (black) from 1000 simulations

4 Analysis with fixed pulse duration

The estimates of the pulse duration d in the previous section varied across months from 0.25 to 1.78 h. Nevertheless, the estimated values of the exponential decay parameter \(\beta\) for this dataset suggest that most of the rain cells deposit almost 95% of their rain within about 10–15 min from their origin. Hence, although dependent on the application in hand, experimenting with a fixed duration for the pulses is a worthwhile exercise. We explored this for our data with d = 1, 2 h and present the results here for the case \(d=1\).

We shall take the pulse duration as 1 h (\(d=1\)) in this section, as opposed to the variable pulse lifetime model explored in the earlier section. We can use the same objective function (8) to estimate the parameters, as before, but the difference is that the pulse lifetime d is taken as fixed at \(d=1\). Therefore this model now has one fewer parameters. The objective function is minimised, using the R routines used earlier, separately for each month to obtain estimates of the model parameters. Parameter estimates for this model are displayed in Table 3. They suggest that the arrival rates \(\phi _2\) in the heavy rain state (State 2) are, in general, higher in the summer months than those of the winter months. In addition, the mean sojourn times \(1/\mu\) of the heavy rain state are shorter in summer months, mostly between about 25 to 55 min. Values of \(\hat{\mu }_X\) are generally larger for summer months showing higher initial rainfall intensity for the pulses. All this falls in line with the fact that summer rainfall often occurs in showers, i.e. heavy intensity rainfall of short duration whereas the winter rainfall tends to be more of a frontal type. The arrival rates \(\phi _1\) in the light rain state or dry state (State 1) are also smallest in the summer months. The estimates \(\hat{\beta }\) seem to suggest that, in general, the rain cells decay fast and deposit much of the rain quickly within minutes, although their lifetime is taken to be \(d=1\) h in this model.

When compared with the parameter estimates of the earlier model given in Table 2, there appears to be no specific pattern in the changes realised in the estimates. Estimates of \(\mu _X\) show little changes and the \(\beta\) shows slight changes with no noticeable pattern. Estimates of \(\lambda\) show some changes with an increase for the months May, June and November and a drop in months February to April. Not much change is observed in the estimates of \(\mu\) and \(\phi _1\). The estimates of \(\phi _2\) show slight changes across the months, with no special pattern, but gone up for May and June.

Table 3 Parameter estimates for the exponential pulse model with fixed lifetime \(d=1\)

Fitted values of the theoretical properties were calculated, using the estimated parameter values, at various aggregation levels and plotted along with their corresponding empirical values. Simulation bands, based upon 1000 simulations of the process from the fitted exponential pulse model with fixed pulse duration, are also displayed at all aggregations. Each of the simulated realisation is of length 15-years and the simulation bands are taken as the maximum and minimum values of the statistics over the 1000 simulations. In all the plots in this section, the red line represents the empirical values, the blue line shows the fitted values and the dark dashed lines show the simulation bands. The pink line in Figs. 9, 10, 11, 12, 13 and 14 is for the model described in the next section, given here for comparison and will be discussed later in Section 5.

Figure 9 shows the empirical and fitted mean rainfall at sub-hourly aggregation \(h=1/12\) h. It is clear from the plot that the mean rainfall has been reproduced perfectly by the fitted model. The same is true at all other higher aggregations, and the plots (not shown) displayed identical pattern as the mean was just scaled up by the values of h.

Figures 10 and 11 show the empirical and fitted values of the standard deviation of the accumulated rainfall at sub-hourly and higher aggregations, along with simulation bands from the fitted model. Again the observed and fitted curves are in excellent agreement at all aggregations, especially at those that are not used in the fitting. Both curves appear to fall closer to the upper simulation band consistently at sub-hourly scales. This seems to indicate that the sampling distribution of the rainfall standard deviation is slightly skewed at these sub-hourly aggregations. However, this skewness decreases as h increases and the sampling distribution does not appear to be skewed at higher aggregations.

Fig. 9
figure 9

Observed (red) and fitted (blue) values of the mean rainfall at h = 1/12 h aggregation for the exponential initial pulse depth model, along with a simulation band (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

Fig. 10
figure 10

Observed (red) and fitted (blue) values of the standard deviation of the aggregated rainfall at h = 1/12, 1/6, 1/3, 1/2 h for the exponential initial pulse depth model, along with simulation bands (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

Fig. 11
figure 11

Observed (red) and fitted (blue) values of the standard deviation of the aggregated rainfall at h = 1, 6, 12, 24 h for the exponential initial pulse depth model, along with simulation bands (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

The observed and fitted values of the lag 1 autocorrelation of the aggregated rainfall are displayed in Fig. 12. The observed and fitted curves for the autocorrelation are in very good agreement at lower aggregations, except for \(h=1/12\) h and perhaps for the month September at \(h=1/3, 1/2\). The model performs well again at \(h=1\) h in Fig. 13. At larger values of h, however, there appear to be slight differences between the empirical and fitted values. Nevertheless, the differences mostly fall inside the simulation bands and also these larger values of h are not used in the fitting. Hence the fitted model appears to capture the pattern of autocorrelations well, at both lower and higher aggregations.

Fig. 12
figure 12

Observed (red) and fitted (blue) values of the autocorrelation of the aggregated rainfall at h = 1/12, 1/6, 1/3, 1/2 h for the exponential initial pulse depth model, along with simulation bands (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

Fig. 13
figure 13

Observed (red) and fitted (blue) values of the autocorrelation of the aggregated rainfall at h = 1, 6, 12, 24 h for the exponential initial pulse depth model, along with simulation bands (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

The model performs very well for the coefficient of variation at small aggregations showing good agreement between the observed and fitted curves. However, they fall closer to the edge of the simulation bands at \(h=1\) h as shown in the top panel of Fig. 14. Again this shows that the sampling distributions of the simulated values are highly skewed at smaller aggregations. This skewness issue is overcome at larger aggregations and the model fits reasonably well as shown in the other panels of Fig. 14 where the observed and fitted values are in good agreement with minor differences falling within the simulation bands.

The observed values of the proportion of dry periods for this model are displayed in the top panel of Fig. 15 together with simulation bands from the fitted model at finer aggregation \(h=1/12\). The model appears to reproduce these reasonably well and capture their pattern across the year quite well at \(h=1/12\). At larger values of h, our model underestimates the proportion of dry periods. This may have resulted from the occasional arrival of light rain pulses in state 1, generated by the model, which will impact more on the proportion of dry values at larger values of h.

Fig. 14
figure 14

Observed (red) and fitted (blue) values of the coefficient of variation of the aggregated rainfall at h = 1, 6, 12, 24 h for the exponential initial pulse depth model, along with simulation bands (black) from 1000 simulations. The pink line is for the rectangular pulse model described in Sect. 5

Fig. 15
figure 15

Observed (red) values of the proportion of dry periods of the aggregated rainfall at h = 1/12 h aggregation for the exponential (top panel) and rectangular (bottom panel) initial pulse depth models, along with simulation bands (black) from 1000 simulations of the two models

Comparison of the results of the two models shows that the variable pulse duration model has performed slightly better than the fixed duration model, in general, in reproducing some of the rainfall properties, such as standard deviation and correlation at certain levels of aggregation. The improvement, however, is not substantial and both models appear to reproduce the rainfall characteristics equally well for the most part.

5 Model comparison with a rectangular pulse model

To assess how well the proposed exponentially decaying pulse model performs, when compared with existing point process models for rainfall, we use a doubly stochastic rectangular pulse model (Ramesh 1998) as these two models have the same structure with regard to the cell arrivals. The main difference in the two models comes from the mechanism for the pulse shape and its duration. We shall give a brief description of this model first. Suppose that the rain cells \(\{N(t)\}\) occur according to a stationary two-state doubly stochastic Poisson process, as before, with parameters \(\lambda , \mu , \phi _1, \phi _2\). Associated with each event of the process \(\{N(t)\}\) is a rectangular pulse of random duration L, having an exponential distribution with parameter \(\eta\), and a constant but random depth X representing rainfall intensity. Second-moment properties of the depth process and the aggregated rainfall process are described by Ramesh (1998). We make use of the expressions for the mean, standard deviation and lag 1 autocorrelation of the aggregated rainfall process and their observed values to fit this model to the same 15-years of sub-hourly rainfall data from Bracknell.

Table 4 Parameter estimates for the Rectangular pulse model

These second-moment properties of the aggregated process at three different aggregation levels (at \(h=10, 20\) min and 12 h) were used in our estimation. The estimates of the rectangular pulse model parameters were obtained, separately for each month, by minimizing the objective function of the type in equation (8). The parameter estimates are displayed in Table 4 and they show similar features to the earlier model estimates. The average sojourn times in the dry state are higher during summer months, than those of the other months, and the sojourn times in the wet state are in general shorter during summer months. The pulse duration is also shorter during summer months with the life time of about 45–55 min and about 1–1.6 h in the winter. The depth of the rectangular pulses in summer tends to be larger than that of the winter months, suggesting heavy intensity rain for short duration in summer months.

The fitted mean rainfall of this rectangular pulse model at \(h=1/12\) h aggregation is displayed in Fig. 9 along with the fitted mean from the exponential pulse model and the empirical mean rainfall. The pink line shows the fitted values of this rectangular pulse model in all the plots in this section and the other lines of the plots are as described earlier in Section 4. It is clear from Fig. 9 that the mean rainfall has been reproduced well by both models. The same is true at all other aggregations.

Figures 10 and 11 compare the fitted values of the standard deviation of the accumulated rainfall from the two models at sub-hourly and higher aggregations, respectively, along with the empirical values. Both models seem to do equally well at \(h=1/2, 1\) h as the observed and fitted curves are in excellent agreement. However, the exponential pulse model demonstrates a better alignment between the observed and fitted curves at all other values of h. The largest discrepancy between the observed and fitted values for the rectangular pulse model is observed at sub-hourly aggregations.

The observed and fitted values of the lag 1 autocorrelation of the aggregated rainfall for the two models are compared in Figs. 12 and 13. They suggest that the rectangular pulse model vastly overestimates the autocorrelation at sub-hourly and hourly aggregations whereas the exponential pulse model clearly performs better at these aggregations. At higher aggregations, however, the rectangular pulse model performs better and provides good alignment with empirical values.

Figure 14 compares the observed and fitted values of the coefficient of variation of the two models. Both models seem to do equally well at \(h=1\) h as the observed and fitted curves are in excellent agreement in both cases, although the sampling distribution of the coefficient of variation appears to be highly skewed as discussed earlier. However, the exponential pulse model seems to perform well and demonstrates a better alignment between the observed and fitted curves at all values of h at higher aggregations. The proportion of dry periods is compared in Fig. 15 at finer aggregation \(h=1/12\). Both models appear to reproduce these reasonably well and capture their pattern across the year quite well at \(h=1/12\).

6 Gamma distribution for initial pulse depth

To introduce greater variability in the initial depth of the pulses, we now take the distribution of \(X_{i}\) at the pulse origins as independent random variables with a Gamma distribution. Let \(\theta\) be the scale parameter and \(\alpha\) be the shape parameter of the distribution. Our model then has seven parameters and we estimate them by the method of moments approach as before. The mean \(\mu (h)\), standard deviation \(\sigma (h)\) and the lag one autocorrelation \(\rho (h)\) of the aggregated rainfall process are used to estimate all the seven parameters of the model. The above properties of the aggregated process at three different aggregation levels (at h = 10, 20, 60 min) are used in our estimation. The estimates of these functions from the empirical data, denoted by \(\hat{\mu }(h), \, \,\hat{\sigma }(h)\) and \(\hat{\rho }(h)\), are calculated for each month using 15 years of sub-hourly rainfall series accumulated at scales \(h=1/6, 1/3, 1\) h. The estimated values of the model parameters \(\hat{\lambda }\), \(\hat{\mu }\), \(\hat{\phi _{1}}\), \(\hat{\phi _{2}}\), \(\hat{\theta }\), \(\hat{\alpha }\) and \(\hat{\beta }\) were obtained, separately for each month, by minimizing equation (8) using standard routines.

Although the parameter estimates of this initial Gamma pulse model showed similar properties to those of the exponential initial pulse model, there was not much improvement in the reproduction of the statistical properties studied. The only improvement came from the reproduction of the extreme value properties of the rainfall. This is illustrated by Fig. 16 where we compare the extreme values of the 15 years of observed rainfall data with those generated by the two models. The annual maxima of the empirical data were ordered and plotted against the corresponding Gumbel reduced variates at each aggregation level. One thousand copies of the rainfall series, each 15 years long, were then simulated from the two fitted models. The annual maxima of each of the 1000 simulated series, at each aggregation level, were extracted and ordered to make up the interval plots against the corresponding Gumbel reduced variates. These interval plots, generated separately for the two models, were superimposed on the corresponding Gumbel reduced variate plot of the empirical data for comparison.

Figure 16 shows the results of the exponential and Gamma initial pulse depth models for \(h=1/12, 1, 24\) h. The red solid line shows the ordered empirical annual maximum values and the interval plots show the variability of the simulated ordered maxima from the two fitted models. The circles in the interval plots show the mean of the 1000 simulated ordered maxima corresponding to that plotting position. At \(h = 1/12\) h aggregation level, both models consistently underestimate the extremes. When \(h = 1\) h there was evidence of underestimation at the upper end of the reduced variates. Despite this, there was notable improvement resulting in good agreement at \(h = 24\) h aggregation, as all of the empirical values fell within the range of the simulated values. The interval plots suggest that the Gamma initial pulse model demonstrates clear improvement, over the exponential initial pulse model, as the discrepancy between the observed and simulated extremes becomes smaller at both \(h=1/12\) and \(h=1\) h aggregations. Gamma initial pulse model also tends to do better at daily aggregation level.

Therefore, the proposed models, although underestimating the extremes at sub-hourly aggregations, appear to capture the extremes well at daily aggregations. In addition, the Gamma initial pulse model shows a clear improvement, over the exponential initial pulse model, in reproducing extremes. The estimation of extreme values at sub-hourly scales is a common problem for most stochastic models for rainfall, and our results concur with the findings of previous published studies, see for example Cowpertwait et al. (2007) or Verhoest et al. (1997).

Fig. 16
figure 16

Ordered annual maxima of the aggregated rainfall at h = 1/12, 1, 24 h, plotted against the reduced Gumbel variates. Empirical annual maximum values are shown by a red solid line. Interval plots based on annual maxima of 1000 simulations from Exponential (blue) and Gamma (black) initial pulse depth models are also shown. The circles in the interval plots are the mean of the 1000 simulated maxima. Plotting positions of the interval plots for the Gamma model are moved to the right by a small distance to aid comparison. The return periods are specified at the foot of the plot above the x-axis

7 Conclusions and potential future improvement

We developed a class of doubly stochastic point process models with exponentially decaying pulses to describe the statistical properties of the rainfall intensity process. Second-order moment characteristics of the rainfall intensity and aggregated rainfall processes are studied. The data analysis, using sub-hourly rainfall data from England, showed that the proposed models fit most of the empirical rainfall properties well at various aggregation levels. Models with fixed duration for the pulse lifetime as well as variable pulse lifetime were used in the analysis which revealed that the latter provides little improvement. A Gamma distribution for the initial pulse depth showed no improvement in reproducing the second-moment properties. However, it showed notable improvement in reproducing extremes.

The performance of the proposed model, in terms of reproducing statistical properties of the aggregated rainfall, was compared with that of a doubly stochastic rectangular pulse model that had the same underlying structure for cell arrivals. Both models performed equally well in reproducing the mean rainfall. The proposed exponential pulse model seemed to do better at sub-hourly aggregations for most of the other properties considered and even at higher aggregations for some of them. The only instance where the rectangular pulse model seemed to do better was the lag 1 autocorrelation at higher aggregations.

In terms of weakness, the proposed models found it difficult to reproduce the coefficient of variation at lower aggregation levels, although the observed and fitted values were in good agreement. A closer look at the simulated values of the statistics from the fitted model indicates that this may be due to the skewed nature of the sampling distribution of the simulated values of the coefficient of variation. Another drawback is the underestimation of extremes at hourly and sub-hourly aggregations. There is, however, scope to expand this model in different ways which is discussed below.

One possibility for future work would be to generalise the distribution of the initial pulse depth to consider two different distributions for the two states. This might introduce more variability in the pulse process. Another possibility would be to consider other distributions for the initial pulse depth, such as Pareto for example. Alternatively, to make substantial structural changes to the cell arrival process, one could explore the possibility of employing a three state Markov chain for the underlying process.

Although the proposed model has produced good results in reproducing fine-scale rainfall properties, at present the model only deals with point rainfall at a single site. Therefore, another direction of research would be to explore the extension of this model to a multi-site framework to model rainfall data from multiple stations in a catchment area.

In terms of model calibration, it would be useful to explore the recent developments in the parameter estimation process, such as that of Jesus and Chandler (2011), for our models and also the use of “momfit” software (Chandler et al. 2010) which will allow us to quantify parameter uncertainty. These improvement will form part of our future work in this area.