Parametric Linear Stochastic Modelling of Benue River Flow Process

The dynamics and accurate forecasting of a river's streamflow processes are important in the management of extreme events such as floods and droughts and in the optimal design of water storage structures and drainage networks. In this study, an attempt was made to investigate the appropriateness of stochastic modelling of the streamflow process of the Benue River using data-driven models based on univariate streamflow series. To this end, a multiplicative seasonal Autoregressive Integrated Moving Average (ARIMA) model was developed for the logarithmically transformed monthly flows. The seasonal ARIMA model's performance was compared with forecasts from the traditional Thomas-Fiering model. The results show that the multiplicative seasonal ARIMA model was able to forecast flow logarithms. However, it could not adequately account for the seasonal variability in the monthly standard deviations. The forecast flow logarithms therefore cannot readily be transformed into natural flows; hence the need for cautious optimism in its adoption, though it could be used as a basis for the development of an Integrated Riverflow Forecasting System (IRFS). Since forecasting can be a highly “noisy” application because of the complexity of the river flow system, a distributed hydrological model is recommended for real-time forecasting of the river flow regime, especially for purposes of sustainable water resources management.


Introduction
Inherent in the principles of water resources management is the judicious utilization and conservation of the available water resources. One of the ways to enhance this is the proper estimation of water demand, both quantitatively and qualitatively. Within this overall management system, the hydrologist is often required to estimate the magnitude of extreme events, whereas the operation of some of the design works often depends on reliable estimates of flow for an ensuing period of time. Since a river is an essential component of the hydrologic cycle, forecasting its flow provides basic information on a wide range of problems related to the design and operation of the entire river system. A very common constraint in water resources planning is the inadequacy of streamflow records. The available streamflows, known as historical records, are often quite short, frequently less than a quarter of a century in length. Thus, a system designed on the basis of the historical record alone faces a chance of being inadequate for the unknown flow sequence that it might experience. A historical record comprising a single short series does not cover the full range of low-flow and high-flow sequences. Hence, the reliability of a system has to be evaluated under conditions that cannot be reproduced with historical records alone.
Statistically, the historical record is a sample from the population of the natural streamflow process. Synthetic flows generated from a stochastic model fitted to this sample are therefore neither historical flows nor a prediction of future flows; rather, they are representative of likely flows in a stream or river. Streamflow, being a natural phenomenon, has a random component, though it is not fully random, since it has been observed to exhibit a heteroscedastic pattern. Forecasting river flow, in general or after a heavy rainfall event, is important for public safety, environmental issues, and water management. For these purposes, mathematical models have been developed based either on physical considerations [1][2][3][4] or on statistical analysis [5][6][7]. Conventional models for streamflow forecasting typically involve a number of physical variables that function as inputs. A physical variable that is not very useful for forecasting on its own can often be useful when used in conjunction with other variables. Given the number of physical variables that could be considered potentially relevant, a very large number of combinations of variables, and of mathematical relationships linking them, is available when developing a streamflow forecasting model. Determining an appropriate model structure by trial and error is therefore not always practical [8].
The difficulty of determining a model structure for streamflow forecasting can be better appreciated in a wider context, considering that river flow is usually treated as a purely stochastic (random) process. The justification is that river flow is a function of precipitation and other processes which, at the present level of knowledge, seem to evolve randomly in time and space. Even if the underlying phenomena and their interactions were thoroughly understood, it would not be possible to describe mathematically the rate of discharge in a natural water course without involving unsystematic or unknown effects [9]. Considering the issues involved in river flow studies within a wider hydrological horizon, it is pertinent to consider the following two seemingly paradoxical questions:
1) In the face of the dearth of long and continuous data, can realistic generalizations be made from forecasting the dynamics of the Benue River?
2) Considering the complex nature of river flow and the significant variability it exhibits in both time and space, how appropriate is a stochastic method for modelling the Benue River flow process?
To this end, the objective of this study is to model the streamflow process of the Benue River with Autoregressive Integrated Moving Average (ARIMA) models, focusing on short-term forecasting, in order to evaluate the suitability of a particular model type as a preliminary step towards developing an enhanced "River Flow Forecasting System" for the river.

1) Hydrology of the Benue River
The Benue River is the major tributary of the Niger River. It is approximately 1 400 km long and almost entirely navigable during the rainy season (between July and October). Hence, it is an important transportation route in the regions through which it flows. Its headwaters rise in the Adamawa Plateau of northern Cameroon; the river flows into Nigeria south of the Mandara Mountains and through the east-central part of Nigeria before entering the Niger River at Lokoja (Figure 1a). The wide flood plain is used for agriculture, the main crops being sugar cane and rice. There is only one high-water season because of the river's southerly location; this normally occurs from May to October, while the low-water period is from December to June. Figure 1b illustrates the hydrological flow regime of the Benue River in line with the general climatic pattern. There are definite wet and dry seasons, which give rise to changes in river flow and salinity regimes. The flood of the Benue River (upper, middle, and downstream) lasts from July to October, and sometimes up to early November.

2) Data Base Management
In this study, the historical time series for the gauging station at Makurdi (7°44′ N, 8°32′ E), at the base of the Benue River (i.e., the Lower Benue River Basin), was used. A total of 26 years of water stage and discharge data were collected and used. The daily flow data were aggregated to monthly and annual series by averaging the daily flows over each month and each calendar year. Similarly, the annual maximum and minimum daily average discharges were obtained according to the water year, i.e., April to March for the streamflow process.
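The aggregation step described above can be sketched in Python as follows. This is a minimal illustration only: the function names and the synthetic daily values are assumptions introduced here, not the study's code or the Makurdi record.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def monthly_means(daily):
    """Average the daily discharges within each (year, month)."""
    buckets = defaultdict(list)
    for day, q in daily:
        buckets[(day.year, day.month)].append(q)
    return {key: mean(vals) for key, vals in sorted(buckets.items())}


def water_year(day, start_month=4):
    """Label a date by the April-to-March water year in which it falls."""
    return day.year if day.month >= start_month else day.year - 1


def annual_extremes(daily, start_month=4):
    """Return (maximum, minimum) daily discharge for each water year."""
    buckets = defaultdict(list)
    for day, q in daily:
        buckets[water_year(day, start_month)].append(q)
    return {wy: (max(v), min(v)) for wy, v in sorted(buckets.items())}


# Synthetic example (hypothetical values in m^3/s): March and April 2000.
start = date(2000, 3, 1)
daily = [(start + timedelta(days=i), 100.0 + i) for i in range(61)]
monthly = monthly_means(daily)
extremes = annual_extremes(daily)
```

With an April start, March 2000 is assigned to water year 1999 and April 2000 to water year 2000, which is why the two months fall into different annual buckets.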

3) Model Formulation and Forecast Strategy
The possibility of fitting a multiplicative seasonal ARIMA model to the logarithms of the monthly flows was examined. The forecasts from this model were compared with forecasts from a conventional Thomas-Fiering model. Forecast errors were also compared to establish the suitability of either model for forecasting the streamflow process of the river. Model formulation and development were patterned after Box and Jenkins [10], Carlson et al. [11] and McKerchar and Delleur [12].

a) Thomas-Fiering Model
Thomas and Fiering [13] described a linear stochastic model for simulating synthetic flow data. On a monthly basis, the model represents the means, standard deviations, serial correlations between successive flows, and the skewness. It uses a linear regression relationship to relate the flow $Q_{t+1}$ in the (t+1)th month (t being counted from the start of the generated sequence) to the flow $Q_t$ in the tth month. Letting $\bar{Q}_j$ and $\bar{Q}_{j+1}$ be the mean monthly discharges during months j and j+1, respectively, within a repetitive annual cycle of 12 months, $b_j$ be the regression coefficient for estimating the flow in the (j+1)th month from the jth month, $\sigma_j$ and $\sigma_{j+1}$ be the standard deviations of the flows in months j and j+1, $r_j$ be the correlation coefficient between the flows in months j and j+1, and $t_t$ be a normal deviate with zero mean and unit variance, the Thomas-Fiering equation will be
$$Q_{t+1} = \bar{Q}_{j+1} + b_j \left( Q_t - \bar{Q}_j \right) + t_t \, \sigma_{j+1} \sqrt{1 - r_j^2} \qquad (1)$$
If an average first-order serial correlation coefficient $r_1$ is used to replace the 12 monthly $r_j$ values, it can easily be shown using the relationship
$$b_j = r_j \, \frac{\sigma_{j+1}}{\sigma_j}$$
that
$$y_{t+1} = r_1 \, y_t + t_t \sqrt{1 - r_1^2}, \qquad y_t = \frac{Q_t - \bar{Q}_j}{\sigma_j}$$
That is, model (1) is the first-order case of the general non-seasonal autoregressive model
$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + a_t$$
where $y_t$ is the transformed (i.e., standardized) series, $y_{t-1}, \dots, y_{t-p}$ are the transformed series at previous time steps, $\sigma_j$ is the standard deviation of each month (used in the standardization), $\phi_1, \dots, \phi_p$ are the autoregressive parameters (with $\phi_1 = r_1$ in the first-order case), and $a_t$ is a random shock. In the application of the Thomas-Fiering model, negative values are sometimes generated. It is recommended that these values be retained and used to derive the subsequent values in the sequence and that, once the generated sequence is completed, all negative values in it be replaced by zero. Similarly, if no flow occurs in a particular month, flow generation for that month may be omitted. Since there is flow all year round in the Benue River, this procedure was not required.
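The generation procedure can be sketched in Python as follows. The sketch assumes the standard Thomas-Fiering recursion, $Q_{t+1} = \bar{Q}_{j+1} + b_j (Q_t - \bar{Q}_j) + t_t \sigma_{j+1} \sqrt{1 - r_j^2}$ with $b_j = r_j \sigma_{j+1} / \sigma_j$; the function name, its arguments, and all numerical values are hypothetical and are not taken from the Benue River data.

```python
import math
import random


def thomas_fiering(means, sds, r, n_months, q0, start_month=0, seed=None):
    """Generate n_months synthetic monthly flows.

    means, sds: 12 monthly means and standard deviations.
    r[j]: correlation between flows in month j and month j+1 (cyclic).
    q0: flow assigned to the starting month `start_month` (0 to 11).
    """
    rng = random.Random(seed)
    flows = [q0]
    j = start_month
    for _ in range(n_months - 1):
        k = (j + 1) % 12
        b = r[j] * sds[k] / sds[j]        # regression coefficient b_j
        deviate = rng.gauss(0.0, 1.0)     # normal deviate, zero mean, unit variance
        q_next = (means[k] + b * (flows[-1] - means[j])
                  + deviate * sds[k] * math.sqrt(1.0 - r[j] ** 2))
        flows.append(q_next)              # negative values stay in the recursion
        j = k
    # Negative values are set to zero only once the sequence is complete.
    return [max(q, 0.0) for q in flows]


# Hypothetical monthly statistics, for illustration only.
demo_means = [50.0, 40.0, 35.0, 60.0, 150.0, 400.0,
              900.0, 1500.0, 1800.0, 1200.0, 400.0, 120.0]
demo_sds = [0.3 * m for m in demo_means]
demo_r = [0.7] * 12
synthetic = thomas_fiering(demo_means, demo_sds, demo_r, 24,
                           q0=demo_means[0], seed=42)
```

Keeping the unclipped values inside the loop and clipping only in the final list follows the recommendation in the text; clipping inside the loop would alter the subsequent values of the sequence.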

b) ARIMA Analysis of the Monthly flow data
To identify the most suitable model for the flow series, serial correlations were calculated for the possible differencing schemes d = 0, 1, 2 and D = 0, 1, 2, where d and D denote the orders of non-seasonal and seasonal differencing, respectively. Figure 2 shows the autocorrelation function plots for these differencing schemes.
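To account for the runoff phenomenon in the streamflow data, seasonal differencing seems more promising, since seasonality cannot really be accounted for by non-seasonal differencing, nor is an integrated moving average scheme expected to account for the non-seasonal autoregressive behaviour. Considering this, a multiplicative ARIMA $(1,0,2)\times(1,1,1)_{12}$ model was examined. This model has the form
$$(1 - \phi_1 B)(1 - \Phi_1 B^{12})(1 - B^{12}) \, z_t = (1 - \theta_1 B - \theta_2 B^2)(1 - \Theta_1 B^{12}) \, a_t$$
where $B$ is the backward shift operator ($B z_t = z_{t-1}$); $\phi_1$; $\theta_1$ and $\theta_2$; $\Phi_1$; and $\Theta_1$ are the non-seasonal autoregressive, non-seasonal moving average, seasonal autoregressive, and seasonal moving average parameters, respectively; while $z_t$ and $a_t$ are the logarithmically transformed series and the model random shocks, respectively.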
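The screening of differencing schemes can be sketched in Python as follows. This is an illustration under stated assumptions: the series is synthetic (a sinusoidal seasonal cycle plus Gaussian noise standing in for the flow logarithms), the seasonal period is 12 months, and the function names are hypothetical.

```python
import math
import random


def difference(x, lag=1, times=1):
    """Apply the differencing operator (1 - B^lag) to x, `times` times."""
    for _ in range(times):
        x = [x[i] - x[i - lag] for i in range(lag, len(x))]
    return x


def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag (biased estimator)."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x) / n
    return [sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / n / c0
            for k in range(max_lag + 1)]


# Synthetic 24-year monthly series; not the Benue River record.
rng = random.Random(1)
z = [10.0 + 3.0 * math.sin(2 * math.pi * t / 12) + rng.gauss(0.0, 0.5)
     for t in range(288)]

# Autocorrelation functions for every scheme d = 0, 1, 2 and D = 0, 1, 2.
acf_table = {}
for d in (0, 1, 2):
    for D in (0, 1, 2):
        w = difference(difference(z, lag=1, times=d), lag=12, times=D)
        acf_table[(d, D)] = acf(w, max_lag=24)
```

For a strongly seasonal series, the undifferenced autocorrelations stay large at lags 12 and 24, whereas one seasonal difference removes that persistence; this is the kind of pattern the autocorrelation plots are inspected for.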

c) Flow Forecasting
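The ARIMA model was used to forecast flows from 1 to 24 months ahead. With reference to an origin at time t (here, t = 288), the model was used to make minimum mean square error forecasts of $z_{t+L}$ for $L \geq 1$, where L is the lead time. The value forecast for $z_{t+L}$ from an origin at t with lead time L will be written as $\hat{z}_t(L)$. Results for the ARIMA $(1,0,2)\times(1,1,1)_{12}$ model are presented in respect of parameter estimation (Table 1) and the final parameter values (Table 2), as well as the corresponding diagnostic check for model adequacy (Table 3). At the 5% level of significance, the autocorrelation plot of the model residuals indicates that the residual series may be considered random (Figure 3).
In terms of the forecasting function, the general ARIMA model can be written in three alternative forms: as a difference equation, as an infinite sum of the current and weighted previous values of the shocks $a_t$, and as an infinite sum of weighted previous observations plus the current value of $a_t$. The conditional expectation of any of these forms supplies a forecasting function. In this regard, the difference equation was used. By recalling that $\hat{z}_t(L) = [z_{t+L}]$, using square brackets to signify conditional expectation at origin t, noting that
$$[a_{t+j}] = 0 \;\; (j > 0), \qquad [a_{t-j}] = a_{t-j} \;\; (j \geq 0), \qquad [z_{t-j}] = z_{t-j} \;\; (j \geq 0), \qquad [z_{t+j}] = \hat{z}_t(j) \;\; (j > 0) \qquad (5)$$
and taking expectation of the model, which has the general form
$$z_t = \phi_1 z_{t-1} + (1 + \Phi_1) z_{t-12} - \phi_1 (1 + \Phi_1) z_{t-13} - \Phi_1 z_{t-24} + \phi_1 \Phi_1 z_{t-25} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \Theta_1 a_{t-12} + \theta_1 \Theta_1 a_{t-13} + \theta_2 \Theta_1 a_{t-14}$$
with $\phi_1$, $\theta_1$ and $\theta_2$ the non-seasonal, and $\Phi_1$ and $\Theta_1$ the seasonal, autoregressive and moving average parameters, the forecasting function can be obtained as:
$$\hat{z}_t(L) = \phi_1 [z_{t+L-1}] + (1 + \Phi_1) [z_{t+L-12}] - \phi_1 (1 + \Phi_1) [z_{t+L-13}] - \Phi_1 [z_{t+L-24}] + \phi_1 \Phi_1 [z_{t+L-25}] + [a_{t+L}] - \theta_1 [a_{t+L-1}] - \theta_2 [a_{t+L-2}] - \Theta_1 [a_{t+L-12}] + \theta_1 \Theta_1 [a_{t+L-13}] + \theta_2 \Theta_1 [a_{t+L-14}] \qquad (6)$$
Equation (6) can be expanded for each lead time L to make the forecasts, with $z_{t+L}$ as the dependent forecast variable as a function of L.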
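The difference-equation forecasting recursion can be sketched in Python as follows. The sketch assumes an ARIMA $(1,0,2)\times(1,1,1)_{12}$ model with no constant term and the usual Box-Jenkins sign convention, so that $z_t = \phi_1 z_{t-1} + (1+\Phi_1) z_{t-12} - \phi_1(1+\Phi_1) z_{t-13} - \Phi_1 z_{t-24} + \phi_1\Phi_1 z_{t-25}$ plus the moving average terms. The function name is hypothetical and the parameter values are arbitrary; they are not the estimates of Table 2.

```python
import math


def sarima_forecast(z, a, phi1, theta1, theta2, Phi1, Theta1, horizon, s=12):
    """Minimum mean square error forecasts from the difference equation.

    z: series observed up to the forecast origin (needs at least 2*s + 1 values).
    a: one-step residuals (shocks) aligned with z.
    Returns the forecasts for lead times 1 .. horizon.
    """
    n = len(z)
    zext = list(z)  # observed values, then forecasts as they are produced

    def shock(idx):
        # Known shocks keep their value; future shocks have expectation zero.
        return a[idx] if idx < n else 0.0

    for lead in range(1, horizon + 1):
        t = n + lead - 1  # position of the value being forecast
        ar_part = (phi1 * zext[t - 1]
                   + (1 + Phi1) * zext[t - s]
                   - phi1 * (1 + Phi1) * zext[t - s - 1]
                   - Phi1 * zext[t - 2 * s]
                   + phi1 * Phi1 * zext[t - 2 * s - 1])
        ma_part = (-theta1 * shock(t - 1)
                   - theta2 * shock(t - 2)
                   - Theta1 * shock(t - s)
                   + theta1 * Theta1 * shock(t - s - 1)
                   + theta2 * Theta1 * shock(t - s - 2))
        zext.append(ar_part + ma_part)
    return zext[n:]


# Illustration: an exactly periodic series with zero shocks.
pattern = [math.sin(2 * math.pi * m / 12) for m in range(12)]
z_demo = pattern * 3
a_demo = [0.0] * len(z_demo)
f_demo = sarima_forecast(z_demo, a_demo, 0.6, 0.2, 0.1, 0.3, 0.5, horizon=24)
```

Because forecasts replace unknown future values and future shocks are set to zero, the moving average terms drop out once the lead time exceeds 14 months, after which the forecasts are driven by the autoregressive and seasonal difference terms alone.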
Both the Thomas-Fiering and ARIMA models were used to make forecasts of the monthly flow series. Subsequently, the forecasts from the models were compared with the actual flows. Because the last 2 years of flow data were used for the comparison, the parameters of both models were re-estimated from the flow series shortened by 2 years (i.e., the models were fitted to the first 24 years, or 288 months, of flow data). The flow forecasts were considered from the standpoint of choosing a particular time origin and examining the behaviour of the forecast function as the lead time L increases; the long-term behaviour of the forecast function should be a useful theoretical check on the fit of a model. Taking the origin t = 288, forecasts for the logarithms of the flow were made using both models.
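A comparison of forecasts with withheld observations can be sketched in Python as follows. The text does not state which error measures were used, so the bias, mean absolute error, and root mean square error below are assumptions chosen for illustration, as are the function name and the numbers.

```python
import math


def forecast_errors(actual, forecast):
    """Summarize forecast errors (forecast minus actual) over a holdout period."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Hypothetical flow logarithms for three withheld months.
holdout = [1.0, 2.0, 3.0]
model_a = [2.0, 2.0, 2.0]
summary = forecast_errors(holdout, model_a)
```

Computing the same summary for each model over the same 24 withheld months gives a like-for-like basis for the comparison.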

Results and Discussion
Figure 4 shows the behaviour of the ARIMA model forecast function; the forecasts are quite close to the monthly means. Barring data quality problems, stationarity issues, and model over-fitting, this behaviour is to be expected of an ideal forecast function: forecasts in the distant future for a trend-free series should be the unconditional estimates of the means. Figure 5 indicates that, with respect to the actual flows, the forecasts all lie within the bounds. Based on this, and taking into consideration the data size for model fitting, the ARIMA model has reproduced the monthly means well. Figure 6 illustrates the standard errors of the forecasts. This figure compares the monthly standard deviations of the logarithms of the monthly flows with the standard errors of the forecasts of the two models under discussion, as a function of the lead time L. As L becomes large (say, greater than 4), the standard error of a forecast from the Thomas-Fiering model tends closely to that of the historic flow, whereas that of the ARIMA model deviates significantly. The behaviour of the Thomas-Fiering model in this regard is further explained by Figure 7, where it was used to simulate the flow regime for 26 years. It was able to reproduce the flow dynamics well. This attribute reinforces its suitability for long-term flow forecasting of the Benue River. The failure of the ARIMA model to account for the seasonal pattern in the standard deviations is a major limitation of the model. In particular, it leads to problems in transforming forecast flow logarithms into natural flows.
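The dependence of the back-transformation on the forecast standard error can be illustrated with a short Python sketch. It assumes natural logarithms and normally distributed forecast errors, neither of which is stated in the text; under those assumptions the forecast in natural units is lognormal, with median $e^{\hat{z}}$ and mean $e^{\hat{z} + s^2/2}$, where $s$ is the standard error of the forecast logarithm. The function name and the numbers are hypothetical.

```python
import math


def back_transform(z_hat, se, z_crit=1.96):
    """Convert a forecast of ln(Q) and its standard error to natural flow units."""
    return {
        "median": math.exp(z_hat),
        "mean": math.exp(z_hat + 0.5 * se ** 2),
        "lower": math.exp(z_hat - z_crit * se),
        "upper": math.exp(z_hat + z_crit * se),
    }


# The same forecast logarithm with two different standard errors.
with_small_se = back_transform(7.0, 0.2)
with_large_se = back_transform(7.0, 0.6)
```

The median is the same in both cases, but the mean flow and the limits change with the standard error, so a standard error that misses the seasonal pattern carries directly into the forecasts expressed as natural flows.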

Conclusions
Based on the results of the analysis, it is evident that autoregressive and ARIMA models have an important place in stochastic hydrology. Specifically, logarithms of monthly flows may be represented either with a low-order autoregressive model (if the series are first standardized) or with a multiplicative seasonal ARIMA model of order $(1,0,2)\times(1,1,1)_{12}$. Either model may be used for forecasting the Benue River monthly flow, though the former performed relatively better than the latter. The ARIMA model was able to forecast flow logarithms, but because it did not adequately account for the seasonal variability in the monthly standard deviations, the standard errors associated with the forecasts may not be physically correct. Also, the logarithms cannot be correctly transformed into natural flows; hence the need for cautious optimism. However, the stochastic modelling does show that ARMA-type models could be used as preliminary models, which may form the basis for understanding the dynamics of the streamflow process. For the purposes of developing a real-time Integrated Riverflow Forecasting System for the Benue River within the overall context of a water resources management strategy, consideration should be given to distributed hydrological models of river flow that incorporate hydroclimatic forcing. It should also be noted that the appropriateness of a stochastic process for every flow series may be debated in the context of nonlinear determinism and chaos, according to which seemingly complex and irregular behaviour could be the outcome of simple deterministic systems with only a few nonlinear interdependent variables and sensitive dependence on initial conditions. On this basis, nonlinear deterministic methods could be a viable complement to linear stochastic ones for studying river flow dynamics, if sufficient caution is exercised in their implementation and in the interpretation of results.
The adequacy of the model was checked by evaluating the autocorrelation function of the residuals and by modifying the model to take into account any non-random features. Figure 3 shows the residual autocorrelation function for the model.
Figure 3. Residual autocorrelation function for the ARIMA $(1,0,2)\times(1,1,1)_{12}$ model.