Development of Rainfall Empirical Models for Osun Watershed, Nigeria

The aim of this study was to conduct a sequential detection of possible trends in seasonal rainfall data series using various statistical packages. Farmers in the Osun basin lack adequate planning information for maximizing agricultural productivity, and rainfall trends are often cited as one of the causes of socio-economic problems such as food insecurity. The available rainfall records provide information on a monthly basis, as is obtainable from other meteorological stations nationwide. Rainfall records from 1980 to 2009, the most consistent records available, were used to develop the rainfall models. Time series analysis was used to develop the various models. The volume of rainfall within the study location was observed to fluctuate, with December having the lowest amount of rainfall at 1.10 mm. The average rainfall was observed to be on the increase from February through to November, with the highest amount recorded in September at an average of 556.60 mm. A third-order polynomial was used to develop a line of best fit, $y = 1.4227x^3 - 29.502x^2 + 154.72x + 9.2348$, with an R-squared value of 0.7541. A double maximum in June and September was observed, while April, which used to be the beginning of the rainy season, was observed to tend towards a dry month, and October, which used to be the beginning of the dry season, was observed to tend towards a rainy month. Linear, exponential growth and S-curve models were developed. The data set was further decomposed, and a seasonal effect was observed. The seasonal indices show an average downward movement within the first 3 months and the last 2 months of the season and average upward movements from the 4th to the 10th month. In conclusion, the Linear model from this study proved to be the best for predicting rainfall events for the Osun watershed.
It is therefore recommended that the Linear model be used to predict rainfall for the Osun watershed.


INTRODUCTION
Over recent decades, the study of climate change has gained much importance because of its dynamic, complex nature and its influence on various sectors of the environment, including the global threat it poses.
[1] stated in their findings that a change in any component or sector of the environment leads to change in the whole system, and that climate is a fundamental component of the environment. Thus, a change in the climate of an environment will affect every other component of that environment. Climate change can therefore no longer be described as a thing of the future but as a process that is currently ongoing. It can be said unequivocally that climate change is now a reality, and the adversities of this transformation pose one of the greatest challenges facing the world today. To this end, scientists have shown that, due to the increase in the concentration of greenhouse gases in the atmosphere, the climate is changing, which affects its major components such as temperature, relative humidity and the amount and distribution of rainfall. The reports of [2,3] stated that over the last 100 years (1906-2005), the linear trend of average global surface temperature was in the range of 0.56 to 0.92°C, which is larger than the corresponding trend range of 0.4 to 0.8°C for the years between 1901 and 2000. [4] stated that between 1900 and 2005, precipitation was found to be either increasing or decreasing in different parts of the world.
For accurate and good quality results, the study of change in time and space requires long-term trend analysis, which depends on the availability of homogeneous data. Such data sets are also affected by non-climatic factors, such as changes in instruments, station location and station environment, which make climate data unrepresentative of temporal climate variability [5].
There is growing awareness across the world of global warming and of the associated changes in various hydrological parameters, as temperatures are increasing and are expected to continue to increase over the next century. Different parts of the world are currently experiencing this change and respond to it differently; the Osun watershed is no exception.
Weather forecasting over the last two decades has shown a strong shift towards probabilistic forecasts, which take the form of probability distributions over future weather conditions [6]. Changes in weather conditions (such as temperature, quantity of rainfall and relative humidity) have made weather forecasting paramount to every sector. Studies on the use of multimodel analysis in probabilistic climate projections state that several scientists have developed complex models that represent nonlinear climate systems through computer simulations, in order to forecast climate variables such as temperature, relative humidity and precipitation over a given location or country [7,8,9,10]. Such models are known to have different levels of uncertainty, which fall into three major groups: initial conditions, boundary conditions and parameter uncertainties. Addressing these uncertainties can be complex and almost impossible.
Among soft computing and statistical techniques, some researchers have identified that Multiple Linear Regression (MLR) can be used to develop models that forecast the weather conditions of a particular area using data collected from that area [11]. Statistical information is extracted from the time series of such data. Inputs to the models are chosen based on the correlation of the statistically analyzed data, from which the regression equations are developed. The data set is usually divided into two parts: one is used to develop the MLR equations and the other is used to test the developed model.
With the current shift of the Nigerian government from crude oil production to agriculture as a source of revenue, the success or failure of harvests and water scarcity in any period of any future year must be given great consideration. Rainfall is important to the economic development of any country whose economy depends mostly on agriculture. A large proportion of the rural populace relies on rain for its agricultural activities. In the Osun basin, whose economy is heavily dependent on productive rain-fed agriculture, rainfall trends are often cited as one of the causes of socio-economic problems such as food insecurity. Irrigation is not prominent in this part of the world, as most agricultural activities depend mainly on rain. A dry period, which may be long or short, is usually experienced after a period of good rainfall and may affect crop yield, which in turn takes its toll on the economy of the nation.
The aim of this study is to conduct a sequential detection of possible trends in seasonal rainfall data series using various statistical packages. It also seeks to statistically establish the trend and distribution pattern of the annual rainfall regime for the study area, and to develop an empirical model that can be used for predicting rainfall and other hydro-meteorological information in the study area.

Study Location
Osun State is located in the tropical rain forest zone of Nigeria. It has a total land mass of approximately 14,875 sq km and lies between latitudes 7°30′0″ and 7°50′0″ N and longitudes 4°30′0″ and 4°50′0″ E, at an altitude of 353 meters above sea level. Though a landlocked State, it has many rivers and streams which run through it and serve the water needs of the State. In most cases the area experiences rainfall from late March to November, while the dry season runs from January to February. These conditions make the people in the area predominantly farmers.

Data Used
The most consistent available rainfall records, covering 1980 to 2009, were collected from the Osun meteorological station. These records provide information on monthly rainfall amounts only, as is obtainable in most of the other meteorological stations nationwide. Records from before 1980 were not used, as most of those held by the Meteorological Services Department of the Federal Ministry of Aviation, Oshodi, Lagos State, had missing data.

Rainfall Analysis
The 30-year Osun rainfall data were sorted on a monthly basis. The sorting was based on the identification and selection of all the annual rainfall values for the various selected durations (months). Monthly rainfall amounts were calculated in millimeters.

Linear Regression (LR) Model
The linear regression (LR) model treats the variable Y as a linear combination of one or more variables measured in the same unit [12]. The simple linear regression model is mostly of the form $Y = b_0 + b_1 X$, where $X$ is the predictor variable and $b_0$ and $b_1$ are the unknown constants, while the multilinear regression model is mostly of the form $Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + e$. This equation is also called the mathematical model for linear regression.
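As an illustration of simple linear regression with an intercept b0 and a slope b1, the following minimal Python sketch estimates both constants by ordinary least squares. The function name `fit_simple_lr` and the demonstration values are hypothetical placeholders, not the Osun records, and the study itself used Minitab rather than this code.

```python
import numpy as np

def fit_simple_lr(x, y):
    """Return (b0, b1) minimizing the squared error of y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the fitted line through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Placeholder data (NOT the Osun records), lying exactly on y = 100 + 5x.
x_demo = [1, 2, 3, 4, 5]
y_demo = [105, 110, 115, 120, 125]
b0, b1 = fit_simple_lr(x_demo, y_demo)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

With real rainfall totals the points would not lie exactly on a line, and the same function would return the least-squares estimates of the two constants.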

Multi-linear Regression (MLR) Model
The MLR model is used to develop empirical equations for forecasting weather and other parameters for which observed data sets are available. The developed equation is capable of forecasting the conditions for which the data set is provided. The data is first cleaned, and statistical indicators are then used to extract the hidden information present in the time series. [13] stated that this hidden information includes the moving average (MA), exponential moving average (EMA), rate of change (ROC), oscillator (OSC), moments, and coefficients of skewness and kurtosis, which can be determined over a certain period of time. The empirical equations obtained for the observed data set are then used to forecast the expected target of the modeler. The data is usually divided into two parts, with the first used to obtain the equation and the remainder used to test the developed model. The MLR model consists of predictors expressed in powers of the first, second and third orders, forming a third-order polynomial model in the predictor variable.
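Three of the time-series indicators named above (MA, EMA and ROC) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the window length, the smoothing constant `alpha` and the sample values are assumptions, since the text does not specify how [13] parameterized these indicators.

```python
import numpy as np

def moving_average(series, window):
    """Simple moving average over a sliding window (output is shorter than input)."""
    s = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(s, kernel, mode="valid")

def exponential_moving_average(series, alpha):
    """EMA with smoothing constant alpha in (0, 1]; seeded with the first value."""
    s = np.asarray(series, dtype=float)
    ema = np.empty_like(s)
    ema[0] = s[0]
    for t in range(1, len(s)):
        ema[t] = alpha * s[t] + (1.0 - alpha) * ema[t - 1]
    return ema

def rate_of_change(series, lag):
    """Relative change over `lag` steps; undefined where the earlier value is zero."""
    s = np.asarray(series, dtype=float)
    return (s[lag:] - s[:-lag]) / s[:-lag]

# Placeholder values (NOT observed rainfall).
demo = [10.0, 20.0, 30.0, 40.0]
ma = moving_average(demo, window=2)
ema = exponential_moving_average(demo, alpha=0.5)
roc = rate_of_change(demo, lag=1)
```

For monthly rainfall, dry-season months with zero rainfall would make the rate of change undefined, so such months would need separate handling.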
The MLR model is mostly of the form $Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + e$, where $b_0$, $b_1$, $b_2$, $b_3$ are the regression coefficients, $X_1$, $X_2$, $X_3$ are the predictor (independent) variables, and $e$ is the unexplained part of the dependent variable, with zero mean and constant variance, which is also called the error term of the equation.
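A regression with an intercept b0 and three coefficients b1, b2, b3 can be estimated by least squares as sketched below. The helper name `fit_mlr` and the demonstration data are hypothetical; the data are constructed to lie exactly on a known plane so that the recovered coefficients can be checked by eye.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares estimate of [b0, b1, ..., bk] for y = b0 + sum(b_i * X_i) + e."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that the first coefficient is the intercept b0.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Placeholder predictors (NOT study data); y = 2 + 1*X1 + 0.5*X2 - 3*X3 exactly.
X_demo = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [2.0, 1.0, 0.0],
])
y_demo = 2.0 + 1.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1] - 3.0 * X_demo[:, 2]
coeffs = fit_mlr(X_demo, y_demo)
print(coeffs)
```

For a third-order polynomial model, the three predictor columns would simply be x, x squared and x cubed.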
Following [14], a preliminary regression analysis of the 30-year data was carried out to determine characteristics of the data set such as moments and dependence structure. This was done in order to evaluate randomness and trend pattern. In this regard, the time series plot was examined to establish whether there is any relationship, as well as seasonal characteristics such as trend and moments. The objective here is to evaluate seasonality in the moments. The dependence structure was analyzed in the time and frequency domains, through autocorrelation and spectral density, respectively. Building an MLR model is an iterative process which involves finding effective independent variables to explain the process being modeled.
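The time-domain and frequency-domain checks mentioned above can be sketched as a sample autocorrelation function and a raw periodogram. This is a minimal sketch on a synthetic 12-month cycle, not the Osun series; the function names are hypothetical and the raw periodogram is only one simple estimator of spectral density.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    s = np.asarray(series, dtype=float)
    d = s - s.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[: len(d) - k] * d[k:]) / denom
                     for k in range(max_lag + 1)])

def periodogram(series):
    """Raw periodogram: frequencies (cycles per time step) and their power."""
    s = np.asarray(series, dtype=float)
    d = s - s.mean()
    power = np.abs(np.fft.rfft(d)) ** 2 / len(d)
    freqs = np.fft.rfftfreq(len(d), d=1.0)
    return freqs, power

# Synthetic monthly series with an exact 12-month cycle (5 years, NOT real data).
t = np.arange(60)
synthetic = np.sin(2.0 * np.pi * t / 12.0)

acf = autocorrelation(synthetic, max_lag=12)
freqs, power = periodogram(synthetic)
peak = np.argmax(power)
dominant_period = 1.0 / freqs[peak]  # in months
print(f"dominant period: {dominant_period:.1f} months")
```

A strongly positive autocorrelation at lag 12 together with a periodogram peak at a 12-month period is the kind of evidence that indicates an annual seasonal cycle.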

Model Validation
The evaluation of the computer model involved carrying out the statistical analyses recommended by [15] to validate the model. These validations are:

Model Development
The rainfall data collected covered 1980 to 2009. The data were used to generate linear, quadratic, exponential growth and S-curve models using the trend analysis modeler. The models were used to forecast values for the years 2000 to 2009, which were compared with the observed data. The model whose forecasts compared best with the observed data was used to forecast to the year 2030. The various models were developed using Minitab 16.0 software, in line with the study of [11].
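The develop-then-forecast workflow described above can be sketched as follows for the linear trend case. The rainfall values here are synthetic numbers generated only to make the example runnable; they are not the Osun records, and the time index (1 for 1980 up to 30 for 2009) is an assumption about how the trend variable was coded.

```python
import numpy as np

years = np.arange(1980, 2010)              # 30 years, 1980..2009
t = years - 1979                           # assumed time index 1..30

# Synthetic annual totals in mm (NOT observed data): mild trend plus noise.
rng = np.random.default_rng(0)
rainfall = 1300.0 + 4.0 * t + rng.normal(0.0, 150.0, size=t.size)

# Fit on 1980-1999, hold out 2000-2009 for comparison with observations.
train = years <= 1999
b1, b0 = np.polyfit(t[train], rainfall[train], deg=1)  # slope first, then intercept

forecast = b0 + b1 * t[~train]             # forecasts for 2000..2009
holdout = rainfall[~train]                 # values the forecasts are compared with

for yr, f, o in zip(years[~train], forecast, holdout):
    print(f"{yr}: forecast {f:8.1f} mm, observed {o:8.1f} mm")
```

The same split applies to the other trend forms; only the fitted equation changes.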

Rainfall Data Analysis
In this region of Nigeria, although rain falls almost throughout the year, agricultural activities have been observed to be relatively slow and unrewarding. To help make agriculture a profitable venture, rainfall data for the period 1980 to 2009 for the Osun watershed were obtained from the meteorological station at Oshogbo. The average annual rainfall for the period of study ranged between 926.33 mm and 1995.17 mm. Table 1 presents the average monthly and yearly rainfall for the 30-year period. It can be seen from the table that the rainy period each year spans February to November, with the heavy downpours running from April to October. This is typical of the zone and similar to the observations made by [16]. Within the years of observation, however, some of the months had missing values. The rainfall pattern is influenced by the proximity of the study area to the Atlantic Ocean, which is believed to bring the rains into the hinterland of Nigeria. The further a location is from the ocean, the smaller the rainfall impact.
The table also shows that the wet season runs from February to November, while the dry season runs from December to January of the following year. This also shows the seasonal cycle of the series, which is not stationary. This is similar to the works of [17,18], who used rainfall data from Jordan as a case study for time series analysis. In their study, they fitted an ARIMA model to data made stationary in both variance and mean. [19], in a statistical study of annual and monthly rainfall patterns in Ekiti State, Nigeria, showed that monthly rainfall increased progressively from February and decreased in November and December, paving the way for the dry season to set in. A double maximum in June and September was observed, while April, which used to be the beginning of the rainy season, is tending towards a dry month, and October, which used to be the beginning of the dry season, is tending towards a rainy month. This was also observed in this study for the Osun basin, especially towards the later years of the study period. According to [19], this shift is known to have a significant impact on the ecosystem of the area and on its agricultural activities. The table also shows that from April there was a gradual increase in the amount of rainfall until it reached its peak, after which it declined gradually from October. This can be stated as the general trend of the rainfall pattern in the study area. Table 1 also shows that, over the 30-year data period, the volume of rainfall within the study location fluctuated.
In some of the years, rainfall was recorded during the early part of the year. Such years include 1982, 1986, 1988, 1990, 1994, 1997, 2001, 2003, 2006, and 2009. This is similar to the works of [20]. Though not heavy, some amount of rainfall was also recorded in January in some of the years. The year 1982 recorded the lowest average rainfall of 926.33 mm, while 1985 had the highest of 1995.17 mm. The average rainfall increased from February through to the peak months between July and September, after which a recession was observed. A third-order polynomial was used to develop a line of best fit, $y = 1.4227x^3 - 29.502x^2 + 154.72x + 9.2348$, with an R-squared value of 0.7541. This result is similar to the works of [21]. Fig. 1 shows the line of best fit for the average monthly rainfall data for the 30-year study period.
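The reported third-order polynomial can be evaluated directly, as in the short Python sketch below. The coefficients are those reported in the text; treating x as the month number (1 for January to 12 for December) is an assumption, since the text does not state what x denotes. The helper `r_squared` shows how a coefficient of determination would be computed if the observed monthly averages from Table 1 were supplied; those values are not reproduced here.

```python
import numpy as np

# Coefficients reported in the text, highest power first.
coeffs = [1.4227, -29.502, 154.72, 9.2348]

# ASSUMPTION: x is the month number, 1 (January) to 12 (December).
months = np.arange(1, 13)
fitted = np.polyval(coeffs, months)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

for m, y in zip(months, fitted):
    print(f"month {m:2d}: fitted value {y:8.2f}")
```

Calling `r_squared(observed_monthly_means, fitted)` with the Table 1 averages would give a value to compare with the reported 0.7541, where `observed_monthly_means` is a hypothetical name for that column of the table.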

Statistical Analysis
Several statistical methods are currently employed by researchers to extract basic information from data sets. Some researchers have used such methods to forecast several hydrological parameters, including [22,23,16,24,25,26,20,14].
The observed total annual rainfall data (Table 1) were statistically analyzed using Minitab 16.0, which was used to develop the Linear, Exponential Growth and S-curve models. The models were developed using the first twenty years of data, 1980 to 1999, as the training period. Table 2 shows the developed empirical models for the average yearly rainfall.
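For readers without Minitab, an exponential growth trend can be approximated by fitting a straight line to the logarithm of the series. The sketch below assumes the common form Y_t = b0 * b1^t; the equations actually reported in Table 2 are not reproduced in this text, so both this form and the demonstration values are assumptions, and the function name is hypothetical.

```python
import numpy as np

def fit_exponential_growth(t, y):
    """Fit y = b0 * b1**t by linear regression on log(y); requires y > 0."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(t, np.log(y), deg=1)
    return np.exp(intercept), np.exp(slope)

# Placeholder series (NOT study data), lying exactly on y = 1000 * 1.02**t.
t_demo = np.arange(1, 6)
y_demo = 1000.0 * 1.02 ** t_demo
b0, b1 = fit_exponential_growth(t_demo, y_demo)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}")
```

A value of b1 above 1 corresponds to growth and a value below 1 to decay. Fitting on the log scale minimizes relative rather than absolute errors, so the coefficients can differ slightly from those of a direct nonlinear fit.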
The developed empirical models were used to generate forecasts for the ten years from 2000 to 2009, and the forecast values were compared with the observed data. Figures 2 and 3 show a comparison of the values produced by the developed models for both average annual rainfall and average monthly rainfall. The data set was further decomposed, and a seasonal effect was observed. This was evident from the cyclic nature of the graph presented in the appendix. The de-trended data and seasonally adjusted data were thus observed to differ from the original observations. This is in conformity with the works of [14].
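One way to make the comparison between forecast and observed values quantitative is to compute summary error measures. The sketch below uses MAPE, MAD and MSD, which Minitab's trend analysis reports; whether these are the measures the authors relied on is an assumption, and the numbers shown are placeholders rather than values from Tables 3 and 4.

```python
import numpy as np

def accuracy_measures(observed, forecast):
    """Return MAPE (%), MAD and MSD for paired observed and forecast values."""
    o = np.asarray(observed, dtype=float)
    f = np.asarray(forecast, dtype=float)
    err = o - f
    return {
        "MAPE": float(np.mean(np.abs(err / o)) * 100.0),  # requires o != 0
        "MAD": float(np.mean(np.abs(err))),
        "MSD": float(np.mean(err ** 2)),
    }

# Placeholder values in mm (NOT from the study).
observed_demo = [100.0, 200.0, 400.0]
forecast_demo = [110.0, 190.0, 400.0]
scores = accuracy_measures(observed_demo, forecast_demo)
print(scores)
```

Computing these measures for each candidate model over 2000 to 2009 gives a like-for-like ranking, with smaller values indicating a closer fit.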
For the rainfall data, the chart of the seasonal indices shows an average downward movement within the first 3 months and the last 2 months of the season and average upward movements from the 4th to the 10th month. The chart of percent variation by season shows that the 1st month has the least variation and the 3rd month has the most. This is similar to the works of [1]. When the data were de-trended by season, the absolute value of the seasonal effect was largest in the 1st, 2nd, 6th, 9th, 11th and 12th months, while the 3rd, 4th, 5th, 7th and 8th months had less variation.
Because the detrended data and the seasonally adjusted data looked slightly different from the original observations, it can be concluded that both a trend component and a seasonal component were present in the data. The residuals graph shows that the fitted values under-predict in parts of the second and last annual cycles (the graph exhibits large positive residuals in these regions). This is similar to the works of [27].
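The idea of separating a trend component from a seasonal component can be sketched with a simplified additive decomposition: remove a fitted linear trend, then average the detrended values month by month. This is a simplification and not Minitab's decomposition procedure, and the input series is synthetic rather than the Osun record.

```python
import numpy as np

def seasonal_indices(monthly, period=12):
    """Additive seasonal indices from a linearly detrended series (sum to ~0)."""
    y = np.asarray(monthly, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)
    detrended = y - (intercept + slope * t)
    # Use only whole cycles, one row per year and one column per month.
    n_cycles = len(y) // period
    by_season = detrended[: n_cycles * period].reshape(n_cycles, period)
    indices = by_season.mean(axis=0)
    return indices - indices.mean()

# Synthetic monthly series (NOT observed rainfall): 3 years with a 12-month cycle.
t_demo = np.arange(36)
synthetic = 100.0 + 50.0 * np.sin(2.0 * np.pi * t_demo / 12.0)
indices = seasonal_indices(synthetic)
print(np.round(indices, 1))
```

Positive indices mark months that sit above the trend and negative indices mark months below it. Subtracting the index for each month from the original series gives a seasonally adjusted series.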
The linear model was observed to have steadily increasing predicted values, which may not be realistic, as rainfall values do not increase geometrically. The S-curve model gave an almost constant amount of rainfall for the study area. The exponential growth model gave predicted values for the thirty years that were cyclical in nature, which is a clear reflection of the study area. Tables 3 and 4 show the comparison of the average measured and forecasted values for the various empirical models developed, over a period of ten years.