Modelling rainfall trends in England and Wales

The monthly England and Wales precipitation (EWP) series (once power transformed to induce symmetry and to stabilise variance) may be characterised as having linear seasonal trends with a white noise error process superimposed. However, these trends are not stable, for they are interrupted by four regime shifts occurring in 1828, 1871, 1917 and 1976. If these shifts are ignored then the series is consistent with a trend pattern in which winters are becoming increasingly wet and summers drier. If only the last regime from 1976 is considered, then summers are still becoming drier but winters have no trend, with spring becoming wetter. The unusually wet winter of 2014 is seen to have been a consequence of very high January and February rainfall relative to that predicted, the conjunction of which is unprecedented during the two and a half centuries over which the EWP series has been available, during which time such pairs of values have been essentially uncorrelated.
Subjects: Geostatistics; Mathematical Statistics; Statistics; Statistics & Computing; Statistics & Probability


Introduction
Mills (2005) modelled the monthly England and Wales precipitation series (EWP) from 1766 to 2002 using a variety of trend extraction techniques and provided evidence of a trend towards wetter winters, particularly in December and January, and drier summers, concentrated in June and July, although he found that it was difficult to come to firm conclusions about trends extracted from such a noisy data series. In general, however, Mills concluded that his findings were consistent with the earlier, essentially descriptive, analysis of Alexander and Jones (2001).

PUBLIC INTEREST STATEMENT
Monthly rainfall in England and Wales may be characterised as having linear seasonal trends with an uncorrelated (white noise) error superimposed. However, these trends are not stable, for they are interrupted by four shifts occurring in 1828, 1871, 1917 and 1976. If these shifts are ignored then the series is consistent with a trend pattern in which winters are becoming increasingly wet and summers drier. If only the last regime from 1976 is considered, then summers are still becoming drier but winters have no trend, with spring becoming wetter. The unusually wet winter of 2014 is seen to have been a consequence of very high January and February rainfall relative to that predicted, the conjunction of which is unprecedented during the two and a half centuries over which rainfall data has been available, during which time such pairs of values have been essentially uncorrelated.
In the decade since this publication, attention has focused on modelling high-frequency extreme "precipitation events" in England and Wales using simulation models: see Pearson, Shaffrey, Methven, and Hodges (2015), Rhodes, Shaffrey, and Gray (2015) and Otto et al. (2015). No further analyses of the monthly EWP series have been reported, but since the original Mills study was published, various techniques have become available for detecting the possible presence of breaking trends in time series, a feature that could well have an impact on a series as long as the EWP. Furthermore, the unusually wet winter of 2014, which contained the wettest January on record, provides a further impetus to reanalyse the EWP series using these new techniques and an additional (almost) 12 years of observations. Consequently, Section 2 revisits and updates Mills' parametric model of monthly EWP for the period 1766-2014, extending it to allow for both deterministic and stochastic seasonal and trend behaviour. Section 3 subjects the model to various structural break tests to ascertain the presence of breaking trends and estimates the implied regime break points and associated trends. The unusually high winter rainfall of 2014 is investigated in Section 4, in which the monthly observations for December, January and February are compared to their forecast values and the extent of their "unusualness" is calibrated. A summary of the results is offered in Section 5.

A model for monthly precipitation
Following Mills (2005), the basic model for the monthly EWP series takes the form
$$x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i s_{i,t} + \beta_i s_{i,t}t\right) + u_t \qquad (1)$$
In Equation (1), rainfall $x_t$ is transformed by the Box and Cox (1964) power transformation, $x_t^{(\lambda)} = (x_t^{\lambda} - 1)/\lambda$, to ameliorate the skewness in the raw data, a consequence of $x_t$ being bounded below at zero, and to induce linearity and constancy of variance. The estimate of 0.6 for the transformation parameter $\lambda$ obtained by Mills (2005) is again used here.
The $s_{i,t}$, $t = 1, 2, \ldots, T$, are "dummy" variables defined to take the value 1 in month $i$ and 0 elsewhere (where $i = 1$ signifies January, etc.). Their inclusion allows a deterministic monthly pattern to be modelled. The presence of the $s_{i,t}t$ "interaction" variables allows for the possibility of different monthly linear time trends. The $\alpha_i$ and $\beta_i$ parameters measure the intercept and slope of these trends, so that if $\beta_i \neq 0$ then the seasonal pattern for month $i$ evolves linearly over time. As the sample period is January 1766-December 2014, there is a total of $T = 2{,}988$ observations: the data are taken from the UK Met Office (Hadley Centre) website at http://www.metoffice.gov.uk/hadobs/hadukp/data/monthly/HadEWP_monthly_qg.txt.
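The regression in Equation (1) can be illustrated with a minimal Python sketch. It assumes NumPy, a series that starts in January, and a synthetic stand-in for the rainfall data; the helper names (`box_cox`, `inverse_box_cox`, `design_matrix`) and the coefficient values used to simulate the data are hypothetical and are not taken from the paper.

```python
import numpy as np

LAMBDA = 0.6  # transformation parameter reported in the text


def box_cox(x, lam=LAMBDA):
    """Box-Cox power transform (x**lam - 1) / lam, for lam != 0."""
    x = np.asarray(x, dtype=float)
    return (x**lam - 1.0) / lam


def inverse_box_cox(y, lam=LAMBDA):
    """Invert the transform: (lam * y + 1) ** (1 / lam)."""
    y = np.asarray(y, dtype=float)
    return (lam * y + 1.0) ** (1.0 / lam)


def design_matrix(n_obs):
    """Columns 0-11 hold the monthly dummies s_{i,t}; columns 12-23 hold s_{i,t} * t."""
    t = np.arange(1, n_obs + 1)
    month = (t - 1) % 12  # 0 = January (assumes the sample starts in January)
    dummies = np.zeros((n_obs, 12))
    dummies[np.arange(n_obs), month] = 1.0
    return np.hstack([dummies, dummies * t[:, None]])


# Synthetic data on the transformed scale (NOT the EWP series).
rng = np.random.default_rng(0)
n_obs = 2988  # January 1766 to December 2014
X = design_matrix(n_obs)
true_coef = np.concatenate([np.full(12, 20.0), np.full(12, 0.001)])  # hypothetical values
y = X @ true_coef + rng.normal(scale=2.0, size=n_obs)

# Ordinary least squares estimates of the 12 intercepts and 12 slopes.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta_hat = coef[:12], coef[12:]
```

With real data, `y` would be `box_cox(rainfall)` and fitted values would be mapped back to millimetres with `inverse_box_cox`.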
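The error $u_t$ can, in general, follow a seasonal ARMA process (see, e.g. Mills, 2014):
$$\phi(B)\Phi(B^{12})u_t = \theta(B)\Theta(B^{12})a_t \qquad (2)$$
Here $\phi(B)$, $\Phi(B^{12})$, $\theta(B)$ and $\Theta(B^{12})$, of orders $p$, $P$, $q$ and $Q$, respectively, are polynomials in the lag operator $B$ defined such that $B^j a_t \equiv a_{t-j}$, and $a_t$ is zero mean white noise with variance $\sigma_a^2$. The presence of the polynomials $\Phi(B^{12})$ and $\Theta(B^{12})$ allows the error to be seasonally (as well as non-seasonally) autocorrelated.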

Deterministic and stochastic trends and seasonality
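More general models result if unit roots are allowed in the $\phi(B)$ and $\Phi(B^{12})$ polynomials. If the nonseasonal autoregressive polynomial contains a unit root, so that it can be factorised as $\phi(B) = (1 - B)\phi_{p-1}(B) = \nabla\phi_{p-1}(B)$, then Equation (1) becomes
$$\nabla x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i \nabla s_{i,t} + \beta_i \nabla s_{i,t}t\right) + \nabla u_t \qquad (3)$$
where $\nabla u_t$ follows the seasonal ARMA process (2) with $\phi(B)$ replaced by $\phi_{p-1}(B)$. Taking $\nabla s_{12,t} = s_{12,t} - s_{1,t+1}$, Equation (3) can be rewritten as Equation (4), so that $x_t^{(\lambda)}$ contains a stochastic trend.
Alternatively, suppose that the seasonal autoregressive polynomial contains a unit root, $\Phi(B^{12}) = (1 - B^{12})\Phi_{P-1}(B^{12})$. Equation (1) then becomes, with $\nabla_{12} = (1 - B^{12})$,
$$\nabla_{12} x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i \nabla_{12} s_{i,t} + \beta_i \nabla_{12} s_{i,t}t\right) + \nabla_{12} u_t \qquad (5)$$
Now $\nabla_{12} s_{i,t} = 0$ and $\nabla_{12} s_{i,t}t = 12 s_{i,t}$, so that (5) becomes
$$\nabla_{12} x_t^{(\lambda)} = 12\sum_{i=1}^{12}\beta_i s_{i,t} + \nabla_{12} u_t \qquad (6)$$
and $x_t^{(\lambda)}$ now contains a stochastic seasonal random walk with differing seasonal drifts.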
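The two seasonal-differencing identities used in this derivation, $\nabla_{12} s_{i,t} = 0$ and $\nabla_{12} s_{i,t}t = 12 s_{i,t}$, can be checked numerically. The sketch below assumes NumPy and an arbitrary four-year span starting in January; the array names are illustrative only.

```python
import numpy as np

n_obs = 48  # four years of monthly observations (arbitrary illustrative length)
t = np.arange(1, n_obs + 1)
month = (t - 1) % 12  # 0 = January

# s[t-1, i-1] holds the dummy s_{i,t}; st holds the interaction s_{i,t} * t.
s = np.zeros((n_obs, 12))
s[np.arange(n_obs), month] = 1.0
st = s * t[:, None]

# Seasonal differences: value at time t minus value at time t - 12.
d12_s = s[12:] - s[:-12]
d12_st = st[12:] - st[:-12]

dummies_vanish = np.array_equal(d12_s, np.zeros_like(d12_s))
trend_becomes_drift = np.array_equal(d12_st, 12 * s[12:])
```

Both flags evaluate to `True`: seasonal differencing removes the monthly intercepts and turns each monthly trend into a constant drift of twelve times its slope.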

Estimating and testing the model
To determine the most appropriate form of the combined model (1) and (2), initial analysis, using the information from the sample autocorrelation and partial autocorrelation functions along with residual diagnostic checks from fitted models, established that the polynomial orders could be set at $p = P = Q = 1$ and $q = 0$, leading to the model
$$x_t^{(0.6)} = \sum_{i=1}^{12}\left(\alpha_i s_{i,t} + \beta_i s_{i,t}t\right) + u_t, \qquad (1 - \phi B)(1 - \Phi B^{12})u_t = (1 + \Theta B^{12})a_t \qquad (7)$$
The estimates of the error specification are $\hat{\phi} = 0.034 \pm 0.020$, $\hat{\Phi} = -0.951 \pm 0.006$ and $\hat{\Theta} = 0.989 \pm 0.002$, clearly showing the absence of both a stochastic trend (since $\hat{\phi} < 1$) and a stochastic seasonal pattern (since $\hat{\Phi} < 1$). The residuals from this model exhibit no autocorrelation and little evidence of non-normality. The question of whether the deterministic seasonal model contained a non-linear component was addressed by including additional quadratic trends, taking the form $s_{i,t}t^2$, but these were found to be insignificant.
Since the estimate of $\phi$ is small and only significant at approximately the 8% level, this parameter is set to zero in subsequent models, along with the restriction $\Phi = -\Theta$: the models thus have a white noise error $u_t = a_t$. Table 1 reports the ordinary least squares (OLS) estimates of Equation (7) under this error specification, along with a restricted model in which seven insignificant trends have been excluded. A likelihood ratio test of these seven zero restrictions has a marginal significance level of 0.75, and imposing them has no effect on the remaining trend estimates but slightly improves the precision of the seasonal coefficients associated with the deleted trends. Although all the remaining estimated coefficients are highly significant, the $R^2$ statistic shows that the model explains just 13.6% of the variation in (transformed) rainfall, with an accompanying regression standard error of 23.7, which is some 28% of the mean of the series.
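Why the restrictions $\phi = 0$ and $\Phi = -\Theta$ leave a white noise error can be seen in a small simulation. The sketch assumes that the seasonal autoregressive factor is written as $(1 - \Phi B^{12})$ and the seasonal moving average factor as $(1 + \Theta B^{12})$, the sign convention under which $\Phi = -\Theta$ makes the two factors cancel. The function name `seasonal_arma_errors` and the zero pre-sample values are illustrative choices.

```python
import numpy as np


def seasonal_arma_errors(a, phi, Phi, Theta, s=12):
    """Generate u_t from (1 - phi B)(1 - Phi B^s) u_t = (1 + Theta B^s) a_t.

    Pre-sample values of u and a are set to zero (an assumption of this sketch).
    """
    # Coefficients of the product AR polynomial in increasing powers of B.
    ar = np.convolve([1.0, -phi], np.r_[1.0, np.zeros(s - 1), -Phi])
    u = np.zeros(len(a))
    for t in range(len(a)):
        value = a[t] + (Theta * a[t - s] if t >= s else 0.0)
        for k in range(1, len(ar)):
            if t >= k:
                value -= ar[k] * u[t - k]
        u[t] = value
    return u


rng = np.random.default_rng(42)
a = rng.normal(size=600)  # simulated white noise innovations

# Restricted case: phi = 0 and Phi = -Theta, so the seasonal factors cancel.
u_restricted = seasonal_arma_errors(a, phi=0.0, Phi=-0.989, Theta=0.989)
factors_cancel = np.allclose(u_restricted, a)

# Unrestricted case using the reported point estimates.
u_estimated = seasonal_arma_errors(a, phi=0.034, Phi=-0.951, Theta=0.989)
```

In the restricted case the simulated error reproduces the innovations, which is the sense in which the fitted model reduces to $u_t = a_t$.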

Table 1. OLS estimates of Equation (7)
Note: Standard errors are robust to both autocorrelation and heteroskedasticity; rainfall scaled as mm × 10.
The trend estimates are significantly positive for January, March and December and significantly negative for July and September (the first four at marginal significance levels of 0.5% and smaller, the last at a level of 2.2%). Using these estimates, monthly trends may be calculated as
$$\hat{x}_{i,t} = \left(0.6\left(\hat{\alpha}_i + 12\hat{\beta}_i t\right) + 1\right)^{1.6667} \qquad (8)$$
where the exponent $1.6667 = 1/0.6$ inverts the power transformation. The average trend increases for January, March and December are 0.131, 0.070 and 0.121 mm, respectively, while the trend decreases for July and September are −0.095 and −0.070 mm. Thus, consistent with the findings of Mills (2005) and the earlier study of Alexander and Jones (2001), the evidence points to a tendency for increasingly wet winters and increasingly dry summers.

Testing for structural breaks
The trends calculated from Equation (8), when they are statistically significant, are essentially linear, so that the annual changes in trends for each month have been effectively constant (and possibly zero) throughout the almost two and a half centuries of the sample period. The estimation of Equation (7) does, however, assume that the model has remained stable throughout this very long period, and this is an assumption that certainly needs checking. Bai (1997), Bai and Perron (1998, 2003a, 2003b), and Perron (2006) develop and survey a variety of tests for examining the regression constancy assumption against the alternative that the model has m potential breaks (and hence m + 1 regimes), perhaps occurring at unknown times. A selection of these tests is available in EViews 8 (2013), which provides a convenient description of them. They were applied to Equation (7) with a maximum of five breaks and a "trimming percentage" of 15%, which sets the minimum length of a regime to be approximately 37 years. All tests produce a common set of four break points (defined as the start date of the new regime) at March 1828, January 1871, December 1917 and May 1976.
A set of restricted models was then developed for each of the five regimes. Rather than report each estimated set of coefficients, the implied trends computed from Equation (8) are shown in Figure 1, aggregated into the four seasons and overlain on the "global seasonal trend" estimated from the coefficients of Table 1. As might be expected, the global seasonal trends average out the trends in each of the five regimes. However, the trends do shift across the regimes in interesting ways. Spring and autumn have had the most stable trends, with annual changes of 0.024 and −0.023 mm across the whole sample period. The spring trend had an essentially constant slope (0.024 ± 0.001 mm) until the final break in 1976, whereupon there has been a declining trend with a slope of −0.418 mm. The autumn trends were declining over the first three regimes, but for the last 100 years the trend has been flat. Winter has shown the largest trend increase of 0.084 mm per annum, with the regime trends being positive for the first three regimes but again essentially flat for the last 100 years. It is summer that has shown the most volatility in trend movements. An overall trend decline of −0.032 mm per annum masks much larger declines, of −1.124 and −0.526 mm, in the second and third regimes, and an increasing trend of 0.765 mm in the regime since 1976. Thus, if attention is only focused on this last regime, then winter and autumn rainfall trends have been constant, the spring trend has been negative and the summer trend has been positive.
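The role of the trimming percentage in a break search can be illustrated with a deliberately simplified sketch. It is not the Bai and Perron procedure nor the EViews implementation: it looks for a single break only, by minimising the combined sum of squared residuals of two separately fitted regressions, and it runs on synthetic data. The function names and all simulated values are hypothetical.

```python
import numpy as np


def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def single_break_search(X, y, trim=0.15):
    """Return the index of the first observation of the second regime.

    Only break dates leaving at least trim * n observations in each
    regime are considered.
    """
    n = len(y)
    h = int(np.floor(trim * n))  # minimum regime length
    best_k, best_ssr = None, np.inf
    for k in range(h, n - h + 1):
        total = ssr(X[:k], y[:k]) + ssr(X[k:], y[k:])
        if total < best_ssr:
            best_k, best_ssr = k, total
    return best_k, best_ssr


# With T = 2,988 months, 15% trimming gives the minimum regime length.
min_months = int(np.floor(0.15 * 2988))
min_years = min_months / 12  # roughly 37 years

# Synthetic series with a shift in level and slope at observation 180.
rng = np.random.default_rng(1)
n, true_break = 300, 180
t = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), t])
y = np.where(t < true_break, 1.0 + 0.02 * t, 8.0 - 0.03 * (t - true_break))
y = y + rng.normal(scale=0.2, size=n)

k_hat, _ = single_break_search(X, y)
```

Extending the idea to several breaks requires searching over combinations of break dates, which is what the dynamic programming approach of the cited tests addresses.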

How unusual was the winter of 2014?
The rainfall amounts for December 2013, January 2014 and February 2014 were 134.2, 184.6 and 136.7 mm, respectively, and during these months a great deal of flooding, with associated damage, occurred in parts of England and Wales. How unusual are these amounts? The December and February values have been exceeded in 7% and 6.25%, respectively, of the months in the complete sample, while the January value has been exceeded in just 0.5% (16 of the 2,988 months for which data are available). But how extreme are these values given the models fitted to the data?
For the (restricted) global model of Table 1, the standardised residuals ($\hat{u}_t/\hat{\sigma}_a$) for these three months are 0.864, 2.105 and 2.031. Since these may be considered to be (independent) drawings from a standard normal distribution, the probabilities of observing values at least as large as these are 0.194, 0.018 and 0.021, so that the January and February values, although relatively large, are by no means unusually extreme. For the model fitted to the last regime, the standardised residuals are 0.831, 2.209 and 1.855 with associated probabilities of 0.203, 0.014 and 0.032, reasonably consistent with the global model. The correlation between the standardised January and February residuals over the entire period from 1766 to 2014 is just 0.02, so that the conjunction of two such large residuals in any single year, as has occurred in 2014, is unprecedented. This can be seen clearly from Figure 2, which presents a scatterplot of this pair of residuals for each year of the sample period.
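The tail probabilities quoted for the standardised residuals can be reproduced with the standard normal upper-tail formula. The sketch below uses only the Python standard library; the dictionary keys and the function name `upper_tail` are illustrative, and treating the January and February residuals as independent follows the near-zero correlation reported in the text.

```python
import math


def upper_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Standardised residuals reported for the global model and the last regime.
global_resid = {"Dec 2013": 0.864, "Jan 2014": 2.105, "Feb 2014": 2.031}
regime_resid = {"Dec 2013": 0.831, "Jan 2014": 2.209, "Feb 2014": 1.855}

global_probs = {m: upper_tail(z) for m, z in global_resid.items()}
regime_probs = {m: upper_tail(z) for m, z in regime_resid.items()}

# Joint probability of both exceedances, assuming independence.
joint_jan_feb = global_probs["Jan 2014"] * global_probs["Feb 2014"]
```

Under independence the joint tail probability is simply the product of the two marginal values, which is why a pair of individually unremarkable residuals can still form a very rare conjunction.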
It would thus appear that the extreme rainfall during the winter of 2014 is a consequence of unusually high, relative to predicted, January and February rainfall, the conjunction of which has been unprecedented in the last 250 years.

Summary
The monthly EWP series (once power transformed to induce symmetry and to stabilise variance) may be characterised as having linear seasonal trends with a white noise error process superimposed. However, these trends are not stable, for they are interrupted by four regime shifts occurring in 1828, 1871, 1917 and 1976. If these shifts are ignored, then the series is consistent with a trend pattern in which winters are becoming increasingly wet and summers drier. If only the last regime from 1976 is considered, then summers are still becoming drier but winters have no trend, with spring becoming wetter.
The unusually wet winter of 2014 is seen to have been a consequence of very high January and February rainfall relative to that predicted, the conjunction of which is unprecedented during the two and a half centuries over which the EWP series has been available, during which such pairs of values have been essentially uncorrelated.

Funding
The author received no direct funding for this research.