Seasonal and Non-Seasonal Generalized Pareto Distribution to Estimate Extreme Significant Wave Height in the Banda Sea

Information on extreme wave height return levels is required for maritime planning and management. The recommended method for analyzing extreme waves models them with the Generalized Pareto Distribution (GPD). Seasonal variation is often considered in extreme wave models. This research aims to identify the best GPD model by considering the seasonal variation of extreme waves. Using the 95th percentile as the threshold for extreme significant wave height, seasonal and non-seasonal GPD models were fitted. The Kolmogorov-Smirnov test was applied to assess the goodness of fit of the GPD models. The return values from the seasonal and non-seasonal GPD were compared, using the definition of return value as the criterion. The Kolmogorov-Smirnov test shows that the GPD fits the data very well for both the seasonal and the non-seasonal model. The seasonal return value gives better information about the wave height characteristics.


Introduction
Information on extreme waves is necessary for marine planning, management, and evaluation. The occurrence of high waves may disrupt transportation activities [1] and damage conservation areas of mangroves and coral reefs [2]. Maritime structures in coastal and offshore regions must be designed for extreme conditions such as high waves [3]. These structures are designed to withstand extreme wave conditions over their expected lifespan. Marine planning therefore usually needs information on the maximum waves that may occur within certain periods (return periods) such as 5, 20, or even 100 years. For the analysis of extreme wave events, the Peaks Over Threshold (POT) method was recommended by the IAHR Working Group on Extreme Wave Analysis [4]. POT treats data that exceed a threshold as extreme values. POT is naturally described by the Generalized Pareto Distribution (GPD) [5], and this method has been used widely to predict extreme waves [6], [7]. Wave seasonality, that is, variation that repeats periodically (weekly, monthly, yearly, etc.), is often examined carefully in modeling extreme waves [9], but other researchers only consider the location of the site [6], [10]. According to [11], seasonal and non-seasonal extreme wave models based on another distribution (GEV) gave different results. The primary objective of this research is to identify the difference between the seasonal and non-seasonal GPD models in estimating extreme wave return values for maritime planning and management.
In the Banda Sea, there are inter-island transportation and fishing activities. Good planning and management are required to support these activities. Information on extreme waves needs to be estimated to determine the risk of disasters that could interfere with them. The wave characteristics in the Banda Sea are associated with the monsoon and vary each month, with the highest peak occurring during the Australian monsoon period in June-July-August (JJA) [12].

Peaks Over Threshold
The POT method uses the series of significant wave heights (Hs) above a defined threshold level. If the threshold value (u) is too low, the exceedances produce a biased estimator. On the other hand, if the selected value of u is too high, there will not be enough data to fit the model, resulting in large variance. One frequently used method for determining the threshold is the percentile method. The percentile method is simple and practical, and the resulting threshold is still accurate [13]. The 95th percentile is used in this study as the limit for extreme Hs. The 95th percentile was also used by [14] and produced a GPD model that fit the Hs data used.
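The percentile-based threshold selection described above can be sketched in a few lines. This is a minimal illustration, not the study's actual workflow: the helper names `percentile` and `peaks_over_threshold` are hypothetical, the interpolation rule is an assumption (the text does not say which percentile definition was used), and the wave record is synthetic.

```python
import random


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def peaks_over_threshold(hs, q=95.0):
    """Return the threshold u and the values of hs strictly above it."""
    u = percentile(hs, q)
    return u, [h for h in hs if h > u]


# Synthetic stand-in for an Hs record in metres (NOT the Banda Sea data).
rng = random.Random(0)
hs_demo = [rng.weibullvariate(1.0, 1.8) for _ in range(7300)]

u_demo, exceed_demo = peaks_over_threshold(hs_demo)
print(f"threshold u = {u_demo:.2f} m, exceedances = {len(exceed_demo)}")
```

Lowering `q` yields more exceedances at the cost of the bias mentioned above, while raising it leaves fewer points for the fit.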

Generalized Pareto Distribution
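In general, the parameters of the GPD are the scale parameter σ (σ > 0), the shape parameter k (k ∈ ℝ), and the location parameter μ (μ ∈ ℝ). For a sufficiently high threshold u, the distribution of the exceedances converges to a GPD with the following cumulative distribution function (CDF):
$$F(x) = 1 - \left(1 + k\,\frac{x-\mu}{\sigma}\right)^{-1/k} \quad (k \neq 0), \qquad F(x) = 1 - \exp\left(-\frac{x-\mu}{\sigma}\right) \quad (k = 0)$$
and the following probability density function (pdf):
$$f(x) = \frac{1}{\sigma}\left(1 + k\,\frac{x-\mu}{\sigma}\right)^{-1-1/k} \quad (k \neq 0), \qquad f(x) = \frac{1}{\sigma}\exp\left(-\frac{x-\mu}{\sigma}\right) \quad (k = 0)$$
The parameters were estimated using the L-moment method in Easyfit software. The non-seasonal GPD uses all data that exceed the threshold value. The seasonal GPD was built by splitting the data according to its seasonal variation; in this study we use monthly variation, following the study of Indonesian wave characteristics by [12]. The seasonal GPD therefore consists of twelve models representing the monthly variation at the study site. Threshold values and parameters were determined with the same methods as for the non-seasonal GPD model.
The goodness of fit of each model was assessed with the Kolmogorov-Smirnov test, where H0 states that the data follow the GPD. H0 is rejected if D > D1-α/2, using a 95% confidence level (α = 5%), with the value of D derived from:
$$D = \sup_x \left| F_n(x) - F(x) \right|$$
where F_n is the empirical CDF of the exceedances and F is the CDF of the fitted GPD.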
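To make the fitting and testing steps concrete, the sketch below estimates GPD parameters by L-moments and computes the Kolmogorov-Smirnov statistic D. It rests on several assumptions that are not taken from the study: the location is fixed at the threshold u (Easyfit may estimate μ separately), positive k denotes a heavier tail, the critical value is the large-sample approximation 1.358/√n for α = 5%, and the sample is synthetic. The function names are hypothetical.

```python
import math
import random


def gpd_cdf(x, k, sigma, mu):
    """GPD CDF; positive k means a heavier tail (assumed sign convention)."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if abs(k) < 1e-12:
        return 1.0 - math.exp(-z)
    base = 1.0 + k * z
    if base <= 0:  # beyond the upper end point when k < 0
        return 1.0
    return 1.0 - base ** (-1.0 / k)


def fit_gpd_lmoments(exceedances, u):
    """Estimate (k, sigma) by L-moments with the location fixed at u."""
    x = sorted(exceedances)
    n = len(x)
    l1 = sum(x) / n                                    # first L-moment (mean)
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    l2 = 2.0 * b1 - l1                                 # second L-moment
    k = 2.0 - (l1 - u) / l2
    sigma = (1.0 - k) * (l1 - u)
    return k, sigma


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and the model CDF."""
    x = sorted(sample)
    n = len(x)
    d = 0.0
    for i, xi in enumerate(x):
        f = cdf(xi)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic exceedances drawn from a GPD with hypothetical parameters.
rng = random.Random(1)
u, k_true, sigma_true = 1.76, -0.1, 0.4
sample = [u + sigma_true / k_true * ((1.0 - rng.random()) ** (-k_true) - 1.0)
          for _ in range(365)]

k_hat, sigma_hat = fit_gpd_lmoments(sample, u)
d = ks_statistic(sample, lambda x: gpd_cdf(x, k_hat, sigma_hat, u))
d_crit = 1.358 / math.sqrt(len(sample))  # approximate 5% critical value
print(f"k = {k_hat:.3f}, sigma = {sigma_hat:.3f}, D = {d:.4f}, D_crit = {d_crit:.4f}")
```

For the seasonal model, the same two calls would simply be repeated on each month's exceedances with that month's threshold.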

Return Value
The return value (x_m) is the extreme wave height above the threshold (u) that is exceeded on average once in every m observations:
$$x_m = u + \frac{\sigma}{k}\left[(m\,\zeta_u)^{k} - 1\right]$$
where ζ_u is the probability that an observation exceeds the threshold u.
For the return value in N years, let n_y be the number of observations in a year; then m = N × n_y. The N-year return value is:
$$x_N = u + \frac{\sigma}{k}\left[(N\,n_y\,\zeta_u)^{k} - 1\right]$$
The return values were validated against the definition of return value, that is, the maximum Hs that occurs at least once during the return period. A return value greater than the maximum Hs of the corresponding period was identified as incorrect, and a return value smaller than that maximum was identified as correct. The difference between the maximum Hs and the return value is also considered in model validation.
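As a rough illustration of the return value calculation and the validation rule, the sketch below assumes the standard peaks-over-threshold return level x_N = u + (σ/k)[(N n_y ζ_u)^k − 1], with the logarithmic limit for k = 0. The function names, the parameter values σ = 0.4 and k = −0.1, the daily sampling rate, and the observed maxima are all hypothetical; only the threshold of 1.76 m and the 5% exceedance probability come from the text.

```python
import math


def return_value(n_years, u, sigma, k, zeta_u, n_per_year):
    """N-year return value for a GPD fitted to exceedances of u."""
    m_zeta = n_years * n_per_year * zeta_u  # expected exceedances in N years
    if abs(k) < 1e-12:
        return u + sigma * math.log(m_zeta)
    return u + sigma / k * (m_zeta ** k - 1.0)


def is_consistent(x_n, max_hs_in_period):
    """Validation rule from the text: a return value above the observed
    maximum Hs of the period counts as incorrect."""
    return x_n <= max_hs_in_period


# Hypothetical inputs: daily data, 95th percentile threshold.
u, sigma, k = 1.76, 0.4, -0.1
zeta_u, n_per_year = 0.05, 365

# Hypothetical observed maxima (m) for 1-, 2- and 5-year periods.
observed_max = {1: 2.9, 2: 3.0, 5: 3.2}

for n_years, max_hs in observed_max.items():
    x_n = return_value(n_years, u, sigma, k, zeta_u, n_per_year)
    status = "correct" if is_consistent(x_n, max_hs) else "incorrect"
    print(f"{n_years}-year return value: {x_n:.2f} m ({status})")
```

With a negative k the return values level off toward the upper end point u − σ/k, whereas a positive k makes them keep growing with N.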

Result and Discussion
The threshold value was specified using the 95th percentile. Data greater than the threshold are considered extreme values and are used to estimate the GPD parameters. The number of extreme data used to estimate the non-seasonal GPD parameters was 365, with a threshold value of 1.76 m. The threshold in the seasonal GPD varies, with the lowest threshold in November (0.74 m) and the highest in July (2.15 m). At the peak of the Australian monsoon (JJA), the threshold value is higher than in other months, which means that high waves occur in those months. The results of parameter estimation for the non-seasonal and seasonal GPD using the L-moment method in Easyfit software are shown in Table 1.
The deviations in the non-seasonal GPD (Figure 2) look significant at high values of Hs, whereas in the seasonal GPD significant deviations also occur at low values of Hs. The QQ plots for the seasonal GPD (Figure 3) lie around the diagonal lines, but considerable deviations can be seen in the seasonal models for certain months. These results can be taken into consideration in the selection of the best model. Generally, the QQ plots for the non-seasonal and seasonal GPD lie close to the diagonal lines. The Kolmogorov-Smirnov goodness-of-fit test shows that the value of D1-α/2 is greater than the value of D (Table 2), so H0 is not rejected. This means that the extreme significant wave height follows the GPD at the 95% confidence level. According to the QQ plots and the Kolmogorov-Smirnov test, both the non-seasonal and the seasonal GPD fit the extreme significant wave height at the location.
Figure 3. QQ plot for seasonal GPD
The seasonal return value varies each month, while the non-seasonal return value has only one value. This monthly variation of the seasonal return value gives better information about the characteristics of the study location. The model gives results similar to another model [11] in terms of variation but differs in other characteristics.
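The month-by-month thresholds behind the seasonal model can be illustrated as follows. This is only a sketch under stated assumptions: the helper names are hypothetical, the percentile uses linear interpolation, and the records are synthetic with an artificial seasonal cycle peaking in July, so the printed values are not those of the study.

```python
import math
import random
from collections import defaultdict


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def monthly_thresholds(records, q=95.0):
    """records: iterable of (month, hs) pairs; returns {month: threshold}."""
    by_month = defaultdict(list)
    for month, hs in records:
        by_month[month].append(hs)
    return {m: percentile(v, q) for m, v in sorted(by_month.items())}


# Synthetic (month, Hs) records with a made-up seasonal scale, peak in July.
rng = random.Random(2)
records = []
for month in range(1, 13):
    scale = 1.0 + 0.6 * math.cos(2.0 * math.pi * (month - 7) / 12.0)
    records += [(month, rng.weibullvariate(scale, 1.8)) for _ in range(600)]

thresholds = monthly_thresholds(records)
for month, u_month in thresholds.items():
    print(f"month {month:2d}: threshold = {u_month:.2f} m")
```

Each monthly threshold would then feed its own GPD fit, giving the twelve seasonal models compared against the single non-seasonal one.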
The maximum seasonal return value is lower in the initial periods and then increases steadily until it exceeds the non-seasonal return value. The difference between the non-seasonal and the maximum seasonal return value is very small (less than 1 m). Return values from models with positive k (January, June, and November) increase more with each period than return values from models with negative k. Compared with the maximum Hs at the study location, the non-seasonal and the maximum seasonal return values in the early periods differ only slightly (< 0.2 m). The non-seasonal return value has one incorrect value, for the 1-year period, but for the 2-4 year periods the non-seasonal return value is closer to the maximum Hs than the seasonal return value.