Statistical Analysis of 30 Years Rainfall Data: A Case Study

Rainfall is a primary input for the design of engineering works such as hydraulic structures, bridges and culverts, canals, storm water sewers and road drainage systems. A detailed statistical analysis of the rainfall of each region is essential to estimate the relevant input values for the design and analysis of engineering structures and also for crop planning. A rain gauge station located in Trichy district, where agriculture is the prime occupation, is selected for statistical analysis. The daily rainfall data for a period of 30 years are used to understand the normal rainfall, deficit rainfall, excess rainfall and seasonal rainfall of the selected circle headquarters. Further, various plotting position formulae are used to evaluate the return period of monthly, seasonal and annual rainfall. This analysis will provide useful information for water resources planners, farmers and urban engineers to assess the availability of water and create storage accordingly. The mean, standard deviation and coefficient of variation of monthly and annual rainfall were calculated to check the rainfall variability. From the calculated results, the rainfall pattern is found to be erratic. The best-fit probability distribution was identified based on the minimum deviation between actual and estimated values. The results of the analysis help to determine the proper onset and withdrawal of the monsoon, which can be used for planning land preparation and sowing.


Introduction
Water is vital for any life process and there can be no substitute for it. Water is also used for transportation, is a source of power and serves many other useful purposes in domestic consumption, agriculture and industry. The most important source of water in any area is rain, and it has a dramatic effect on agriculture. Plants get their water supply from natural sources and through irrigation. The yield of crops, particularly in rain-fed areas, depends on the rainfall pattern, which makes it important to predict the probability of occurrence of rainfall from past records of hydrological data using statistical analysis. A frequency or probability distribution helps to relate the magnitude of extreme events such as floods, droughts and severe storms to their number of occurrences, so that their chance of occurrence with time can be predicted. By fitting a frequency distribution to a set of hydrological data, the probability of occurrence of a random variable can be calculated. To fit the distribution, the hydrological data are analyzed and the variability in the data is studied from the statistical parameters. Suchit Kumar Rai et al. [1] studied the change, variability and rainfall probability for crop planning in a few districts of Central India. Nyatuame et al. [2] performed a statistical analysis and studied the variability in the distribution of rainfall. Rajendran et al. [3] carried out a frequency analysis of rainy days and studied the rainfall variation.
The present study is carried out for Musiri town (Tiruchirapalli district, Fig. 1), situated at a distance of 29 km from Tiruchirapalli city. The region has a latitude and longitude of 10.9549°N and 78.4439°E respectively. Agriculture is the main occupation in this town, which is situated on the northern bank of the Cavery river. The crops include paddy, sugarcane, banana and vegetables.
Musiri receives rainfall from both the southwest and northeast monsoons, averaging 245.49 mm and 352.62 mm respectively over the 30-year record. The daily rainfall data were collected from the India Meteorological Department (IMD), Musiri station, for a period of 30 years. These data are used for the yearly, monthly and seasonal rainfall-probability analysis. Figure 2 presents the historical annual rainfall for the station.

Methodology
The methodology adopted in this study comprises rainfall statistics (Table 1) and probability analysis using plotting position and probabilistic methods (Table 2). From the preliminary study and analysis, the variation in results among the plotting position methods is found to be insignificant.
Table 1. Rainfall statistics
Mean: $X_{avg} = \frac{1}{n} \sum_{i=1}^{n} X_i$, where $X_i$ is the rainfall magnitude in mm, i = 1, 2, ..., n, and n is the length of the sample.
Standard deviation: $\sigma = \left[ \frac{\sum_{i=1}^{n} (X_i - X_{avg})^2}{n - 1} \right]^{1/2}$, where $X_i$ is the rainfall magnitude in mm, i = 1, 2, ..., n, and n is the length of the sample.
Co-efficient of variation: $C_v = 100 \times (\sigma / X_{avg})$, where $X_{avg}$ is the mean and $\sigma$ is the standard deviation.
Co-efficient of skewness: expressed in terms of $\sigma$ (the standard deviation), N (the total number of years), $X_{avg}$ (the mean) and $X_i$ (the rainfall magnitude in mm, i = 1, 2, ..., n).
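The mean, standard deviation and coefficient of variation defined in Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function name `rainfall_statistics` and the sample values are hypothetical.

```python
import math


def rainfall_statistics(rainfall_mm):
    """Return (mean, sample standard deviation, coefficient of variation in %).

    Follows the Table 1 definitions: the standard deviation uses the
    (n - 1) divisor and Cv = 100 * sigma / mean.
    """
    n = len(rainfall_mm)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(rainfall_mm) / n
    variance = sum((x - mean) ** 2 for x in rainfall_mm) / (n - 1)
    std = math.sqrt(variance)
    cv = 100.0 * std / mean
    return mean, std, cv


if __name__ == "__main__":
    # Hypothetical annual totals in mm, for illustration only.
    sample = [10.0, 20.0, 30.0]
    print(rainfall_statistics(sample))
```

A large coefficient of variation relative to the mean is what the study reads as an erratic rainfall pattern.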

Annual Rainfall Analysis
The annual rainfall data are analyzed and the variation in their distribution over the area is studied with the statistical parameters. The best-fit distribution is identified using the plotting position and probabilistic methods listed in Table 2, where m is the rank of the data and N is the length of the sample (number of years).
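As an illustration of how a plotting position turns a rank m into a return period, the sketch below ranks the data in descending order and applies four formulae named in this study. The expressions used are the standard textbook forms (Weibull m/(N+1), California m/N, Hazen (m-0.5)/N, Chegodayev (m-0.3)/(N+0.4)); it is an assumption that they match the entries of Table 2. The function name and sample values are hypothetical.

```python
def plotting_positions(rainfall_mm):
    """Rank rainfall in descending order and compute return periods.

    For each rank m (1 = largest value) the exceedance probability P is
    computed with each plotting position formula and the return period
    is T = 1 / P.
    """
    ranked = sorted(rainfall_mm, reverse=True)
    n = len(ranked)
    # Standard textbook forms; assumed to match the paper's Table 2.
    formulae = {
        "Weibull": lambda m: m / (n + 1),
        "California": lambda m: m / n,
        "Hazen": lambda m: (m - 0.5) / n,
        "Chegodayev": lambda m: (m - 0.3) / (n + 0.4),
    }
    rows = []
    for m, value in enumerate(ranked, start=1):
        row = {"rank": m, "rainfall_mm": value}
        for name, formula in formulae.items():
            row[name] = 1.0 / formula(m)  # return period in years
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical annual totals in mm, for illustration only.
    for row in plotting_positions([500.0, 900.0, 700.0, 600.0]):
        print(row)
```

Plotting rainfall against the return period from any one column gives the rainfall-return period curve from which design values can be read.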

Monthly Rainfall Analysis
From the preliminary study and analysis, the variation in results among the plotting position methods is found to be insignificant; hence, only the Weibull method is adopted among them. From the probabilistic methods, the Gumbel and Normal distributions are used. The rainfall data are arranged into a number of intervals with definite ranges, and the mean and standard deviation are found for the grouped data. Chi-square values are calculated for the above methods from the obtained probabilities. The method that gives the least chi-square value is taken to best fit the distribution.
In the Weibull method, rainfall amounts are assigned a rank and the corresponding probability is found from $P = \frac{m}{N + 1}$, where m is the rank of the data and N is the length of the sample.
The probability density function of the Normal distribution is $f(X) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(X - X_{avg})^2}{2\sigma^2} \right]$, where $X_{avg}$ is the mean and $\sigma$ is the standard deviation.
Goodness of fit is a test used to find the best-fit probability distribution. The best-fit distribution varies for different time periods. The chi-squared test is used to determine the best-fit distribution for weekly and seasonal rainfall in this study. The chi-squared test is used for continuously sampled data only and determines whether a sample comes from a population with a specific distribution:
$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$
where O is the observed frequency, E is the expected frequency, i is the index of the observation and k is the total number of data used. The chi-square formula adopted in the study is expressed in terms of $X_c$, the chi-squared value, and P(X), the probability density function.
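The chi-square comparison between the Normal and Gumbel distributions can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: expected class frequencies are taken from differences of the cumulative distribution function (the study itself refers to the probability density function), and the Gumbel parameters are estimated by the method of moments. All function names, class edges and frequencies are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of the Normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def gumbel_cdf(x, mean, std):
    """Gumbel (Extreme Value Type I) CDF with method-of-moments parameters."""
    alpha = math.sqrt(6.0) * std / math.pi   # scale parameter
    u = mean - 0.5772 * alpha                # location parameter
    z = (x - u) / alpha
    if z < -30.0:                            # avoid overflow; CDF is ~0 here
        return 0.0
    return math.exp(-math.exp(-z))


def expected_frequencies(edges, cdf, mean, std, n_total):
    """Expected count in each class [edges[i], edges[i+1])."""
    return [
        n_total * (cdf(hi, mean, std) - cdf(lo, mean, std))
        for lo, hi in zip(edges[:-1], edges[1:])
    ]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over classes with a positive expected count."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical 25 mm classes and observed frequencies, for illustration.
    edges = [0.0, 25.0, 50.0, 75.0, 100.0, 125.0]
    observed = [6, 9, 8, 5, 2]
    mean, std, n_total = 55.0, 30.0, sum(observed)
    for name, cdf in (("Normal", normal_cdf), ("Gumbel", gumbel_cdf)):
        expected = expected_frequencies(edges, cdf, mean, std, n_total)
        print(name, round(chi_square(observed, expected), 3))
```

The distribution with the smaller chi-square value would be taken as the better fit, which is the selection rule used in this study.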

Seasonal Rainfall Analysis
In this analysis, the variation in the distribution of seasonal rainfall is studied with the statistical parameters, using the formulae given in Table 1.

Effective Rainfall
The water requirements of various crops are determined and related to the effective rainfall, which is calculated from the rainfall data. Effective rainfall is the amount of rainfall effectively used by the crops. The effective rainfall in the study area is calculated using the formulae:
$R_e = 0.8P - 25$, if $P \geq 75$ mm (7)
$R_e = 0.6P - 10$, if $P < 75$ mm (8)
where $R_e$ is the effective rainfall (mm) and P is the total monthly rainfall (mm).
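Equations (7) and (8) translate directly into a small function. The sketch below is illustrative only; the function name is hypothetical, and clamping negative results to zero is an assumption (equation (8) becomes negative for very small monthly totals, and a negative effective rainfall has no physical meaning).

```python
def effective_rainfall(p_mm):
    """Effective monthly rainfall (mm) from total monthly rainfall P (mm).

    Uses Re = 0.8*P - 25 for P >= 75 mm and Re = 0.6*P - 10 for P < 75 mm.
    Negative results are clamped to zero (an assumption of this sketch).
    """
    if p_mm >= 75.0:
        re_mm = 0.8 * p_mm - 25.0
    else:
        re_mm = 0.6 * p_mm - 10.0
    return max(re_mm, 0.0)


if __name__ == "__main__":
    # Hypothetical monthly totals in mm, for illustration only.
    for p in (10.0, 50.0, 75.0, 100.0):
        print(p, effective_rainfall(p))
```

With this clamping, dry months with only a few millimetres of rain yield an effective rainfall of zero.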

Annual Rainfall Analysis
The rainfall data are ranked in descending order and various plotting position and probabilistic methods are applied to determine the return period. Rainfall magnitudes were calculated for different return periods using the rainfall-return period equation obtained from the graphs for all plotting position methods. For Musiri, the California method gives the maximum rainfall for the different return periods, while the Hazen method gives the least value and is hence not acceptable for the analysis. The Chegodayev method gives a maximum rainfall that is approximately 99.9% of the average maximum rainfall, unlike the other methods, and is hence the best fit among the plotting position methods for the annual rainfall data. It is also seen that the rainfall amount increases whenever the return period increases, and vice versa; that is, rainfall magnitude rises with return period. Taking the results of the plotting position methods as the actual values, the Gumbel distribution (Extreme Value Type-I) gives a value that is closer to the actual value and is hence the best method of fit for the annual rainfall data (Table 4).
Table 5 shows that the standard deviation is considerably large, which indicates a large variation in the rainfall pattern. Skewness describes the distribution of the data about the mean and is equal to zero for a normal distribution. When the peak of the sample lies towards the right of a plotted graph, the data are said to be negatively skewed, and when the peak lies towards the left, they are said to be positively skewed. From Table 5, it is clear that both the normal series and the log-transformed series are positively skewed.
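One common way to estimate a Gumbel rainfall magnitude for a given return period is the frequency-factor form $X_T = X_{avg} + K_T \sigma$. The sketch below uses the large-sample frequency factor $K_T = -\frac{\sqrt{6}}{\pi}\left[0.5772 + \ln \ln \frac{T}{T-1}\right]$; it is an assumption that this matches the study's computation, which may instead have used sample-size-dependent reduced mean and reduced standard deviation values. The function names and the mean and standard deviation in the example are hypothetical.

```python
import math


def gumbel_frequency_factor(return_period):
    """Large-sample Gumbel frequency factor K_T for return period T (years)."""
    if return_period <= 1.0:
        raise ValueError("return period must be greater than 1 year")
    t = return_period
    return -(math.sqrt(6.0) / math.pi) * (
        0.5772 + math.log(math.log(t / (t - 1.0)))
    )


def gumbel_rainfall(mean, std, return_period):
    """Rainfall magnitude X_T = mean + K_T * std for return period T."""
    return mean + gumbel_frequency_factor(return_period) * std


if __name__ == "__main__":
    # Hypothetical annual mean and standard deviation in mm.
    for t in (2, 5, 10, 25, 50):
        print(t, round(gumbel_rainfall(800.0, 200.0, t), 1))
```

The estimates grow with the return period, in line with the trend reported for the plotting position methods.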

Monthly Rainfall Analysis
The rainfall data are first arranged into intervals of 25 mm and the frequency of occurrence in each interval is found, converting the raw data into grouped data. The mean and standard deviation of the grouped data were then found to check the variation in rainfall. The chi-square values obtained for all the methods are compared in Table 6; the least chi-square value in all cases is given by the Gumbel distribution, showing that it best fits the monthly rainfall data.
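The grouping step can be sketched as below: values are binned into 25 mm classes and the grouped mean and standard deviation are computed from class midpoints. This is an illustrative sketch; the function names and sample values are hypothetical, and the use of the (n - 1) divisor for the grouped standard deviation is an assumption carried over from Table 1.

```python
import math
from collections import Counter


def group_rainfall(values_mm, width=25.0):
    """Return a list of (lower, upper, frequency) for non-empty classes."""
    counts = Counter(int(v // width) for v in values_mm)
    return [(k * width, (k + 1) * width, counts[k]) for k in sorted(counts)]


def grouped_mean_std(classes):
    """Mean and sample standard deviation from class midpoints."""
    n = sum(freq for _, _, freq in classes)
    if n < 2:
        raise ValueError("at least two observations are required")
    mids = [((lo + hi) / 2.0, freq) for lo, hi, freq in classes]
    mean = sum(mid * freq for mid, freq in mids) / n
    variance = sum(freq * (mid - mean) ** 2 for mid, freq in mids) / (n - 1)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical monthly totals in mm, for illustration only.
    classes = group_rainfall([5.0, 10.0, 30.0, 40.0, 60.0])
    print(classes)
    print(grouped_mean_std(classes))
```

The class frequencies produced here are the observed frequencies that enter the chi-square comparison.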

Seasonal Rainfall Analysis
Table 8 shows that Musiri receives, on average, 245.49 mm and 352.62 mm of seasonal rainfall during the south-west (SW) and north-east (NE) monsoons respectively, and also shows the extent to which the distribution of rainfall varies over the study area during the respective monsoons. The monthly rainfall series of both monsoons are positively skewed.

Crop Planning
The average effective rainfall for 30 years was found and tabulated. From Table 8, it is observed that the average effective rainfall is higher in the months of August, September, October and November, whereas it is zero in the months of January, February and March. Crop planning based on the effective rainfall is shown in Table 9. Using the effective rainfall data, the water requirements of various crops were identified and segregated into rain-fed and irrigated components. Table 8 above shows the best sowing and harvesting times for the cultivation of various crops and the water requirement to be met through rain and irrigation.

Conclusion
From the rainfall probability analysis of the annual and monthly rainfall for the Musiri region, the Gumbel distribution (Extreme Value Type-I) is found to be the best-fit distribution, as it gives the least chi-square value among all the methods of analysis. Among the plotting position methods, the Chegodayev method is found to best fit the annual rainfall data. The present statistical analysis provides a clear picture of the rainfall data, and it is found that the rainfall available in the region is insufficient to raise a wet crop. Conjunctive use of surface water, available rainfall and ground water is essential for better agricultural and irrigation management in this area. Thus, the analysis helps in understanding the rainfall pattern of the Musiri region and supports efficient crop planning and assessment of water availability in the region.