Modelling Rainfall Duration And Severity Using Copula

Malaysia experiences two monsoon periods which are usually accompanied by floods, and such occurrences have been quite frequent over the past two decades. The Muda river, which is located in the northern part of Malaysia, has experienced major flood events which are linked to rainfall duration and severity. Copula modelling has become popular in various fields such as insurance, finance, geostatistics and hydrology. In Malaysia, copula modelling of floods and rainfall has not been undertaken yet; therefore the objective of this study is to fit copulas to the rainfall duration and severity data of a selected station in Malaysia. We found that both the rainfall duration and severity follow Gamma distributions and that the Gumbel-Hougaard copula appears to be a suitable model for rainfall duration and severity.


Introduction
The monsoon seasons in Malaysia [Ref. 10] are the southwest monsoon, which occurs from June to September each year, and the northeast monsoon, which stretches from November to March. During these periods, floods are a frequent occurrence and on some occasions they are costly and life threatening. The Stormwater Management and Road Tunnel (SMART) in the capital city of Kuala Lumpur is one of the mitigation projects constructed by the government to reduce the impact of flash floods during heavy downpours. Another mitigation project in northern Malaysia (along the Muda river) is the Muda River Mitigation Plan [Ref. 3]. The Muda river lies on the border that separates the States of Kedah and Penang. The area in the vicinity of the Muda river has experienced numerous floods over the past 20 years, and major flood events were recorded in the years 1988, 1995, 1998 and 2003. The occurrences of floods are usually linked to the duration and severity of rainfall. Therefore, it is essential to model these variables so that good mitigation plans and proper water resources management can be carried out.
ISBN-1391-4987 © IASSL
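In Malaysia, Tekolla (2010) studied rainfall and flood frequency analysis for the Pahang river basin by applying Principal Component Analysis and the Log-Pearson Type III distribution respectively. Shafie (2009) did a case study on the floods of 2006 and 2007 in the State of Johor, Malaysia, analysing and discussing the floods from the viewpoint of maximum precipitation and their causes. Bustami et al. (2011) modelled flood mitigation structures for the Sarawak river sub-basin using InfoWorks River Simulation (RS), and the results showed that the most appropriate structure is retention ponds. Razi et al. (2010) estimated the flood of the Johor river using the Hydrologic Modeling System (HEC-HMS) and found that models developed by HEC-HMS are useful tools to predict flood levels. A study on flood frequency analysis of annual stream flow using L-Moments and TL-Moments was done by Ahmad et al. (2011) for the stations in the State of Negeri Sembilan, Malaysia.
Copula modelling has become more popular in recent times and has been used in various fields, for instance, insurance and finance [Ref. 1], geostatistics [Ref. 4], hydrology [Ref. 8] and drought studies [Ref. 15, 20, 21, 23]. There have been numerous uses of copulas in the analysis of floods. Hu et al. (2010) applied copula functions to construct a joint distribution for typhoon and plum rain (of the East Asian rainy season), while Grimaldi and Serinaldi (2006) applied asymmetric copulas in multivariate flood frequency analysis. Wang et al. (2009) and Shiau et al. (2006) studied a copula-based flood frequency (COFF) analysis at the confluences of river systems and bivariate frequency analysis of floods using copulas respectively. In a recent study, Rauf and Zeephongsekul (2011) modelled rainfall duration and severity of North-Eastern Victoria using copulas. In that study, the rainfall duration and severity were defined by the Standardised Precipitation Index (SPI).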
Although flood and rainfall analyses in Malaysia have been done by many methods, copula modelling has not been undertaken. Therefore the objective of this study is to fit copulas to the rainfall duration and severity data of a selected station in Malaysia, based on the maximised loglikelihood function. In Section 2, a brief description of the SPI, bivariate copulas and parameter estimation is given. Results are presented in Section 3 and finally Section 4 contains the conclusions.

The Data Set
In this study, we considered the Klinik Jeniang station (station no.: 5806066), which is situated upstream on the Muda river in the State of Kedah, Malaysia. A map of this station is shown in Figure 1 and its exact location is 05°48′N and 100°37′E. The station has a relatively high annual mean rainfall of 2325 millimetres (mm). The Department of Irrigation and Drainage (DID), Malaysia collects data at this station in three different ways. The methods of data collection are (i) weekly collection using a rain recorder, (ii) quarterly card collection using logger charts, and (iii) daily manual gauge readings. The rainfall data used in this study consist of 504 monthly precipitation values from the year 1970 to 2011, which were provided by the DID Malaysia.

Standardised Precipitation Index (SPI)
The Standardised Precipitation Index (SPI) was first developed by McKee et al. (1993). For details on the calculation of the SPI, one can refer to Guttman (1998). However, for the sake of easy reference, we illustrate the computation of the 3-month SPI (SPI-3) by the following example:

Example
The monthly precipitation values (January 1970 to December 2011) of the Klinik Jeniang station are shown in Table 1.
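The details of the computation are as follows: First, we obtain the 3-month cumulative precipitation values for each month. For example, the value for March 1970 is the sum of the values for January, February and March 1970; similarly, the value for April 1970 is the sum of the values for February, March and April 1970. However, no values are available for January and February 1970, as the precipitation values of November and December 1969 are not available. The values are tabulated in Table 2. Next, let $x_{ij}$ be the entries of Table 2, where $i$ indexes the rows and $j$ the columns. As zeros are omitted in the computation of the mean of the $j$th column, that mean is taken over the nonzero entries only. As the Gamma distribution is defined for $x > 0$, the cumulative probability is therefore $H(x) = q + (1 - q)G(x)$, where $G$ is the fitted Gamma distribution function and $q$ is the probability of $x = 0$.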
Finally, the cumulative probabilities found are transformed to a standard normal distribution with mean $0$ and variance $1$. These transformed values are the SPI-3 values, which are displayed in Table 3. For the purpose of this study, the rainfall duration, $D$, is defined as the length of time where SPI $\geq 1$. For a given duration period, $D$, the corresponding rainfall severity, $S$, is defined as the sum of the SPI values $\geq 1$ observed during that period.
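The steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names (`rolling_sum`, `probability_to_spi`, `wet_events`) and the sample numbers are hypothetical and are not the Klinik Jeniang data, a wet event is assumed to be a run of consecutive months with SPI at or above 1, and the Gamma fitting step is left out, so the cumulative probability is taken as given.

```python
from statistics import NormalDist


def rolling_sum(values, window=3):
    """k-month cumulative totals; the first window-1 entries are None
    because the preceding months are not available."""
    out = [None] * (window - 1)
    for i in range(window - 1, len(values)):
        out.append(sum(values[i - window + 1 : i + 1]))
    return out


def probability_to_spi(p):
    """Map a cumulative probability to the matching standard normal quantile."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(p)


def wet_events(spi, threshold=1.0):
    """Return (duration, severity) for each run of consecutive SPI >= threshold.
    Duration is the run length; severity is the sum of SPI over the run."""
    events, duration, severity = [], 0, 0.0
    for value in spi:
        if value is not None and value >= threshold:
            duration += 1
            severity += value
        elif duration > 0:
            events.append((duration, severity))
            duration, severity = 0, 0.0
    if duration > 0:  # close a run that reaches the end of the series
        events.append((duration, severity))
    return events


# Hypothetical monthly totals in mm (not the station data).
print(rolling_sum([100.0, 50.0, 25.0, 0.0, 75.0]))
# -> [None, None, 175.0, 75.0, 100.0]

# Hypothetical SPI-3 series.
print(wet_events([0.2, 1.5, 1.25, 0.4, None, 1.0, -0.3, 2.0]))
# -> [(2, 2.75), (1, 1.0), (1, 2.0)]
```

Each pair returned by `wet_events` is one observation of duration and severity, which is the form of data that a bivariate copula is later fitted to.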

Brief Discussion of Bivariate Copula
Sklar (1956) developed copula functions, which are able to join two univariate marginal functions that are uniformly distributed on $[0, 1]$ to form bivariate functions.
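Sklar's theorem states that if $F_{X,Y}$ is a joint distribution with marginal distributions given by $F_X$ and $F_Y$, then there exists a copula $C$ such that
$$F_{X,Y}(x, y) = C(F_X(x), F_Y(y)).$$
If $f_{X,Y}$, $f_X$ and $f_Y$ are the density functions of $F_{X,Y}$, $F_X$ and $F_Y$ respectively, then
$$f_{X,Y}(x, y) = c(F_X(x), F_Y(y))\, f_X(x)\, f_Y(y)$$
and
$$c(u, v) = \frac{\partial^2 C(u, v)}{\partial u\, \partial v}.$$
For a more comprehensive study of copulas, see Nelsen (2006).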
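As a small worked check of Sklar's theorem, consider the independence copula. The symbols $\Pi$, $F_{X,Y}$, $F_X$, $F_Y$, $f_{X,Y}$, $f_X$ and $f_Y$ below are notation chosen for this illustration: $F_{X,Y}$ is the joint distribution, $F_X$ and $F_Y$ are its marginals, and the lower-case letters are the corresponding densities.

```latex
\begin{align*}
\Pi(u, v) &= uv, \\
c(u, v) &= \frac{\partial^2 (uv)}{\partial u\, \partial v} = 1, \\
F_{X,Y}(x, y) &= \Pi\bigl(F_X(x), F_Y(y)\bigr) = F_X(x)\, F_Y(y), \\
f_{X,Y}(x, y) &= c\bigl(F_X(x), F_Y(y)\bigr)\, f_X(x)\, f_Y(y) = f_X(x)\, f_Y(y).
\end{align*}
```

The copula density equals 1 everywhere, so the joint density factorises into the product of the marginal densities, as expected for independent variables. Any departure of $c$ from 1 therefore describes the dependence between the two variables.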
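In this study, we considered two one-parameter Archimedean copulas, namely the Gumbel-Hougaard copula and the Clayton copula, which were also considered in a recent study by Rauf and Zeephongsekul (2011). The Gumbel-Hougaard copula is given by
$$C(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \quad \theta \geq 1,$$
while the Clayton copula is given by
$$C(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \quad \theta > 0,$$
[Ref. 17, 23].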
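To make the two copulas concrete, the following Python sketch evaluates them directly from their standard closed forms. The function names `gumbel_hougaard` and `clayton` are hypothetical, and this is not the R "copula" package used in the study; it is only meant to show how the formulas behave.

```python
import math


def gumbel_hougaard(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for theta >= 1 and u, v in (0, 1]."""
    if theta < 1.0:
        raise ValueError("Gumbel-Hougaard requires theta >= 1")
    w = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(w ** (1.0 / theta)))


def clayton(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    if theta <= 0.0:
        raise ValueError("Clayton requires theta > 0")
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


# With theta = 1 the Gumbel-Hougaard copula reduces to independence, C = u * v.
print(gumbel_hougaard(0.3, 0.6, 1.0))

# A copula has uniform margins, so C(u, 1) returns u for both families.
print(gumbel_hougaard(0.3, 1.0, 2.5), clayton(0.3, 1.0, 2.5))
```

Larger values of `theta` mean stronger positive dependence in both families. The Gumbel-Hougaard copula concentrates that dependence in the upper tail (large values occurring together), whereas the Clayton copula concentrates it in the lower tail.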

Parameter Estimation
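Estimation of parameters can be done in various ways. The method that we employed in this study is the maximum likelihood method. By assuming that the rainfall duration follows an Exponential distribution and the rainfall severity follows a Gamma distribution [Ref. 12, 15], the parameters of the two marginal distributions were estimated separately. If $\hat{\lambda}$ is the estimated parameter of the Exponential distribution, and $\hat{\alpha}$ and $\hat{\beta}$ are the estimated parameters of the Gamma distribution, then $\hat{u}_i = F_D(d_i; \hat{\lambda})$ and $\hat{v}_i = F_S(s_i; \hat{\alpha}, \hat{\beta})$ for the observed pairs $(d_i, s_i)$, $i = 1, \dots, n$, of rainfall duration and severity. The parameter of the copula, $\theta$, would then be estimated by the loglikelihood function [Ref. 22] given as
$$\ell(\theta) = \sum_{i=1}^{n} \ln c(\hat{u}_i, \hat{v}_i; \theta),$$
where $c(u, v; \theta) = \partial^2 C(u, v; \theta) / \partial u\, \partial v$ is the copula density.
The maximum loglikelihood estimator of $\theta$ is $\hat{\theta} = \arg\max_{\theta} \ell(\theta)$.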
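The second estimation stage can be sketched as follows. This is a minimal illustration, not the routine of the R "copula" package: it assumes the marginal distributions have already been fitted and used to transform each observation to a pair (u, v) in (0, 1), it uses the standard closed-form log-densities of the two copulas, and it maximises the loglikelihood by a plain grid search. The names `gumbel_log_density`, `clayton_log_density`, `log_likelihood` and `grid_mle`, and the sample pairs, are hypothetical.

```python
import math


def gumbel_log_density(u, v, theta):
    """Log of the Gumbel-Hougaard copula density, theta >= 1, 0 < u, v < 1."""
    x, y = -math.log(u), -math.log(v)
    w = x ** theta + y ** theta
    a = w ** (1.0 / theta)  # so that C(u, v) = exp(-a)
    return (-a - math.log(u * v) + (theta - 1.0) * math.log(x * y)
            + (1.0 / theta - 2.0) * math.log(w) + math.log(a + theta - 1.0))


def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density, theta > 0, 0 < u, v < 1."""
    s = u ** -theta + v ** -theta - 1.0
    return (math.log(1.0 + theta) - (1.0 + theta) * math.log(u * v)
            - (2.0 + 1.0 / theta) * math.log(s))


def log_likelihood(log_density, pairs, theta):
    """Sum of log copula densities over the transformed observations."""
    return sum(log_density(u, v, theta) for u, v in pairs)


def grid_mle(log_density, pairs, lower, upper, steps=2000):
    """Return (theta, loglikelihood) maximising over an evenly spaced grid."""
    best_theta, best_ll = None, -math.inf
    for k in range(steps + 1):
        theta = lower + (upper - lower) * k / steps
        ll = log_likelihood(log_density, pairs, theta)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta, best_ll


# Hypothetical transformed observations (not the station data).
pairs = [(0.1, 0.15), (0.3, 0.25), (0.5, 0.55), (0.7, 0.65), (0.9, 0.95)]
print("Gumbel-Hougaard:", grid_mle(gumbel_log_density, pairs, 1.0, 20.0))
print("Clayton:", grid_mle(clayton_log_density, pairs, 0.01, 20.0))
```

The copula with the larger maximised loglikelihood is preferred. A grid search is used here only because it is easy to verify; a numerical optimiser would normally replace it.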

Results
Based on these data, the 3-month SPI values were computed and a plot of them is shown in Figure 2. Of these SPI values, 36 have SPI $\geq 1.00$, which correspond to wet conditions. The summary statistics for the rainfall duration and severity are tabulated in Table 5, while the Kendall's tau of rainfall duration and severity is found to be 0.848. The density plots of rainfall duration and severity are shown in Figure 3. It is usual to fit an Exponential distribution to rainfall duration [Ref. 12]. Following this, we also fitted an Exponential distribution ( ). However, a chi-square goodness of fit test rejected the null hypothesis that the rainfall duration follows an Exponential distribution at the 5% level of significance (exact p-value ). Therefore we fitted a Gamma distribution ( , ) to the rainfall duration. A chi-square goodness of fit test was then conducted, and the null hypothesis that the rainfall duration follows a Gamma distribution was not rejected. Similarly, rainfall severity also follows a Gamma distribution ( , ). Hence, in our study, both the rainfall duration and severity follow Gamma distributions.
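The Kendall's tau reported above can be related to the copula parameters through the standard identities for these two families: tau = 1 - 1/theta for the Gumbel-Hougaard copula and tau = theta/(theta + 2) for the Clayton copula. The Python sketch below computes Kendall's tau and inverts these identities. It is offered only as a rough cross-check and is not the maximum likelihood fit used in this study; the function names are hypothetical, and ties are counted as neither concordant nor discordant (the tau-a convention), which is an assumption since the study does not state how ties were handled.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (prod > 0) - (prod < 0)  # tied pairs contribute 0
    return score / (n * (n - 1) / 2)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1/theta (Gumbel-Hougaard, 0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) (Clayton, 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)


# Hypothetical duration and severity values (not the station data).
durations = [1, 2, 3, 5, 4]
severities = [1.2, 2.9, 4.1, 8.0, 5.5]
print(kendall_tau(durations, severities))  # perfectly concordant, so 1.0

# Parameter values implied by the reported tau of 0.848.
print(gumbel_theta_from_tau(0.848), clayton_theta_from_tau(0.848))
```

A tau of 0.848 implies a Gumbel-Hougaard parameter of about 6.58 and a Clayton parameter of about 11.16 under these identities. Maximum likelihood estimates need not coincide with these values.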
We then used the package "copula" in R [Ref. 22] to fit the joint distribution of the rainfall duration and rainfall severity with the copulas given in Section 2.3. The fitted parameter values of the copulas, together with their maximised loglikelihood values, are presented in Table 6. The maximised loglikelihood is used as the criterion to choose the better joint distribution, and the Gumbel-Hougaard copula emerged as the better model.

Table 6: Fitted parameter values and maximised loglikelihood values of the Gumbel-Hougaard and Clayton copulas.
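The suitable joint distribution for the rainfall duration $D$ and rainfall severity $S$ for the Klinik Jeniang station is therefore given by:
$$F_{D,S}(d, s) = \exp\left\{-\left[(-\ln u)^{\hat{\theta}} + (-\ln v)^{\hat{\theta}}\right]^{1/\hat{\theta}}\right\},$$
where $u = F_D(d)$ and $v = F_S(s)$ are the fitted Gamma distribution functions of the rainfall duration and severity respectively, and $\hat{\theta}$ is the fitted parameter of the Gumbel-Hougaard copula.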

Conclusions
The objective of this study was to fit copulas to rainfall duration and severity for the selected station, the Klinik Jeniang station, as this station is located in the vicinity of the Muda river, which has experienced numerous floods over the past two decades. We found that both rainfall duration and severity follow Gamma distributions and that the Gumbel-Hougaard copula appears to be an appropriate model based on the maximised loglikelihood. This model will be useful for making probability statements regarding return periods in the future. The authors are currently considering different lengths of SPI, and the results will be reported in a future paper.
