A Monte Carlo simulation of multivariate general Pareto distribution and its application

L. Yao, W. Dongxiao, Z. Zhenwei, H. Weihong, and S. Hui South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, China State Key Laboratory of Tropical Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou, China School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, China China Water Resources Pearl River Planning Surveying & Designing Co., Ltd., Guangzhou, China


Introduction
Statistical modeling of extreme values (EV) plays a crucial role in design and risk evaluation in ocean engineering, and multivariate extreme-value distributions have been extensively developed over the last decades, as shown by several studies (Morton and Bowers, 1996; Sheng, 2001; Yang and Zhang, 2013). Problems concerning ocean environmental extremes are often multivariate in character. For example, ocean environments (including waves, wind and currents) all contribute to the forces experienced by offshore systems during typhoons. Thus the severity of such a typhoon event may be described by a function of both the peak wind speed and the concomitant wave height. When the force on a system is dominated by both wind and concomitant waves, it may be sufficient to employ a 50-year return wave and a 50-year return wind as a design criterion. However, the 50-year return wind and the 50-year return wave do not occur at the same time. Therefore, any simple analysis assuming a perfect correlation between wind and waves is likely to overestimate the design value (Morton and Bowers, 1996). Analyzing the encounter probability among the ocean environments by means of a multivariate distribution can thus offer a useful reference in evaluating a project's safety and cost.
In multivariate EV theory, two sampling methods have been developed: the block maxima method and the peaks-over-threshold (POT) method. These methods correspond to two natural distributions, respectively: the multivariate EV distribution (MEVD) and the multivariate general Pareto distribution (MGPD). The MEVD is the natural distribution of the block maxima of all components. A typical example is a block of one year, for which the block maxima are the annual maxima. The MGPD is the theoretical distribution of the multivariate peaks-over-threshold (MPOT) method, in which the sample includes all values that are larger than a suitable threshold. Rootzén and Tajvidi (2006), based on the research of Tajvidi (1996), suggest that the MGPD should be characterized by the following properties: (i) exceedances (of suitably coordinated levels) asymptotically have a multivariate GP distribution if and only if componentwise maxima asymptotically are EV distributed, and (ii) the multivariate GP distribution is the only one which is preserved under (a suitably coordinated) change in exceedance levels. The MPOT method makes fuller use of the raw data and gives more stable results; as a result it has recently become widely used. The study of Morton and Bowers (1996) is based on the response function of an anchored semi-submersible platform to wave height and wind speed, and analyzes the extreme anchorage force and the corresponding wave height and wind speed by means of a logistic extreme-value distribution. In that study, the authors do not fit the samples with the natural distribution of the MPOT method, the MGPD, but instead fit the POT samples with a bivariate extreme-value distribution. Coles and Tawn (1994) follow the same idea. MGPD theory has improved greatly in recent years, but the definition of the MGPD still needs further research. Bivariate threshold methods were developed by Joe et al. (1992) and Smith (1994) based on point process theory.

The MGPD has been the focus of several studies, and further detail can be found in Rootzén and Tajvidi (2005), Tajvidi (1996), Beirlant et al. (2005) and Falk et al. (2004). However, because the MGPD is difficult to solve (in general, the amount and complexity of computation increase rapidly with the dimension), its application in ocean engineering has been restricted. Monte Carlo simulation is a feasible way to solve these problems because only the inner-product operation changes and the complexity of the algorithm does not increase as the dimension increases. Liu et al. (1990) use Monte Carlo simulation for the design of offshore platforms, and practical examples demonstrate its fast calculation speed and high precision for the compound extreme-value distribution. Philippe (2000) presents a new parameter estimation method for the bivariate extreme-value distribution that uses Monte Carlo simulation. Shi (1999) presents a Monte Carlo method for a simple trivariate nested logistic model. Stephenson (2003) gives methods for simulating from symmetric and asymmetric versions of the multivariate logistic distribution, and compares many of the Monte Carlo simulation methods for multidimensional extreme distributions.
We have developed a procedure to handle the application of the MGPD in marine engineering design. This paper uses Monte Carlo simulation to solve the MGPD equation, and is structured as follows. The Monte Carlo method is introduced in Sect. 2. Fundamental to the application of the MGPD are the choice of the optimal joint threshold and the estimation of the joint density; these aspects, including a case study, are discussed in Sect. 3. Finally, in Sect. 4, the advantages of the MGPD and its Monte Carlo simulation are outlined.

MGPD theory
The MGPD method is based on extreme-value theory and has been widely used. The generalized extreme value distribution (GEVD) is the theoretical distribution of the block maxima of all variables. The GPD describes the extremes of all variables over a threshold after declustering, the so-called POT distribution. Based on the relationship between the GPD and the GEVD, $H(x) = 1 + \log(G(x))$, $\log(G(x)) > -1$, the distribution function of the MGPD can be deduced:
$$W(x_1, \dots, x_d) = 1 + \log G(x_1, \dots, x_d), \qquad \log G(x_1, \dots, x_d) > -1,$$
where $G(x_1, \dots, x_d)$ is a multivariate extreme-value distribution function whose marginal distributions are negative exponential (details in René, 2007). Thus the MGPD has a variety of different types of distribution functions (Coles et al., 1991), corresponding to different Pickands dependence functions. The logistic dependence function is simple to use and has favorable statistical properties, and it is widely used in hydrology, finance and other fields. The bivariate logistic GPD is
$$W_r(x, y) = 1 - \left((-x)^r + (-y)^r\right)^{1/r},$$
where $r$ is the correlation parameter of the dependence function and $r > 1$; $x$ and $y$ are standardized variables in the interval $(-1, 0)$. The density function, obtained as the mixed partial derivative of $W_r$, is
$$w_r(x, y) = \frac{\partial^2 W_r}{\partial x\,\partial y} = (r - 1)\,(xy)^{r-1}\left((-x)^r + (-y)^r\right)^{1/r - 2}.$$
In this paper, we will always consider the MGPD $W$ with uniform margins. The marginal distributions of $x$ and $y$ are always transformed into a GPD with uniform margins by a suitable marginal transformation (René, 2007). Before the transformation, the distributions of $x$ and $y$ are $F(x; r_1)$ and $F(y; r_2)$, where the parameters $r_1$ and $r_2$ also have to be estimated. The parameters $r$, $r_1$ and $r_2$ can be estimated in two steps: first $r_1$ and $r_2$ are estimated, and then they are substituted into the MGPD $W$ to estimate $r$. Alternatively, all three can be estimated by a global method, namely maximum likelihood applied to the density function $w(x, y; r, r_1, r_2)$. The global method gives more reliable results because it works with the final functional form, but the estimation procedure is more complex. The maximum-likelihood function is
$$L(r) = \sum_{i=1}^{n} \ln\left(w_r(x_i, y_i)\right). \tag{5}$$
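As an illustration of the maximum-likelihood step, the following is a minimal sketch rather than the authors' implementation. It assumes the bivariate logistic form $W_r(x, y) = 1 - ((-x)^r + (-y)^r)^{1/r}$, whose mixed partial derivative gives the density $w_r(x, y) = (r-1)(xy)^{r-1}((-x)^r + (-y)^r)^{1/r-2}$, and it evaluates the sum of $\ln w_r(x_i, y_i)$ exactly as written. The function names, the grid search and the grid limits are hypothetical choices made only for this example.

```python
import numpy as np

def log_density(x, y, r):
    """Log of w_r(x, y) for standardized x, y < 0 and r > 1."""
    s = (-x) ** r + (-y) ** r
    return np.log(r - 1.0) + (r - 1.0) * np.log(x * y) + (1.0 / r - 2.0) * np.log(s)

def log_likelihood(r, x, y):
    """Sum of ln w_r(x_i, y_i) over the over-threshold sample."""
    return float(np.sum(log_density(x, y, r)))

def fit_r(x, y, grid=None):
    """Grid search for the r that maximizes the log-likelihood.

    The grid (1.01 to 10) is an assumption for illustration, not a
    recommendation from the paper.
    """
    grid = np.linspace(1.01, 10.0, 900) if grid is None else np.asarray(grid)
    values = [log_likelihood(r, x, y) for r in grid]
    return float(grid[int(np.argmax(values))])
```

A grid search is used here only because it needs no optimizer and makes the shape of the likelihood easy to inspect; any one-dimensional optimizer could be substituted.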

Simulation method
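Polar coordinates are used to describe the simulation method of the MGPD:
$$T_p(x_1, \dots, x_d) = \left(\frac{x_1}{x_1 + \dots + x_d}, \dots, \frac{x_{d-1}}{x_1 + \dots + x_d},\; x_1 + \dots + x_d\right) =: (z, c),$$
where $T_p$ is the transformation of the vector $(x_1, \dots, x_d)$ into polar coordinates. The constant $c_0$ used in the simulation procedure is the joint threshold in the MGPD method. This paper determines the threshold following the principle of Coles and Tawn (1994).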

Joint probability distribution
With the development of offshore engineering, the study of joint probabilities for extreme sea environments such as wind, waves, tides and currents is receiving much more attention. Organizations such as the American Petroleum Institute (API) and Det Norske Veritas (DNV) have issued some relevant rules but have not proposed an explicit method as a design criterion for marine structures. API (1995) suggests three options, one of which is "Any 'reasonable' combination of wind speed, wave height, and current speed that results in the 100-year return period combined platform load". To assess the probability of encounter between two variables, their joint return period needs to be considered. Conditional probability can represent the probability of encounter between the extreme value of the main marine environmental element and the extreme value of the simultaneous marine environmental elements. For example, the probability of a 50-year return wave and a 50-year return wind speed occurring simultaneously at the same place is very small. Therefore, it is critical to use conditional probability to describe the probability of their joint occurrence and to analyze the effect of the various marine environmental elements on engineering.

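The joint distribution of the bivariate Pareto distribution function $W(x, y)$ is $W(x, y) = \Pr(X < x, Y < y)$.
$W_x(x)$ and $W_y(y)$ are the marginal distributions of $x$ and $y$, respectively. The conditional extreme-value distributions are as follows. Conditional probability 1:
$$\Pr(X > x \mid Y > y) = \frac{1 - W_x(x) - W_y(y) + W(x, y)}{1 - W_y(y)};$$
conditional probabilities 2 and 3, the two mixed cases:
$$\Pr(X > x \mid Y < y) = \frac{W_y(y) - W(x, y)}{W_y(y)}, \qquad \Pr(X < x \mid Y > y) = \frac{W_x(x) - W(x, y)}{1 - W_y(y)};$$
conditional probability 4:
$$\Pr(X < x \mid Y < y) = \frac{W(x, y)}{W_y(y)}.$$
Another four conditional probability distributions can be deduced by swapping the two variables.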
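The conditional probabilities can be evaluated in closed form once a joint model is chosen. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the bivariate logistic GPD $W_r(x, y) = 1 - ((-x)^r + (-y)^r)^{1/r}$ with uniform margins on $(-1, 0)$, so that $W_x(x) = W_r(x, 0) = 1 + x$ and $W_y(y) = 1 + y$. The function names and dictionary keys are hypothetical.

```python
def logistic_gpd_cdf(x, y, r):
    """Bivariate logistic GPD W_r(x, y) for standardized x, y in (-1, 0)."""
    return 1.0 - ((-x) ** r + (-y) ** r) ** (1.0 / r)

def conditional_probabilities(x, y, r):
    """Conditional probabilities of X given Y under the logistic GPD.

    Assumes uniform margins, W_x(x) = 1 + x and W_y(y) = 1 + y, and a
    point (x, y) close enough to the origin that W_r(x, y) >= 0.
    """
    w = logistic_gpd_cdf(x, y, r)
    wx = 1.0 + x
    wy = 1.0 + y
    return {
        "X>x|Y>y": (1.0 - wx - wy + w) / (1.0 - wy),
        "X>x|Y<y": (wy - w) / wy,
        "X<x|Y>y": (wx - w) / (1.0 - wy),
        "X<x|Y<y": w / wy,
    }
```

For a fixed conditioning event the two complementary probabilities sum to one, which gives a quick consistency check on any implementation.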
3 Case study

Sample selection of over-threshold values and marginal distribution
The raw data of the paper are wave height and synchronous wind speed observed four times a day over 23 years at an ocean hydrological station in the South China Sea (SCS). In the sample, the maximum wind speed reached 40 m s$^{-1}$ and the maximum wave height was 8.50 m. The extreme wind speed and its corresponding wave height are selected as research samples. The sample for the over-threshold method is taken from block extremes, and the purpose of declustering is to maintain sample independence. In the SCS, typhoons occur frequently and are the cause of almost all extreme wind speeds and wave heights. Generally, a typhoon may last several days or 1 week in the SCS, and so this paper declusters by 5-day intervals, taking the maximum value of each block. If the interval between two extremes is less than 2 days, the smaller value is deleted from the sample in order to keep independence. Except for a few individual storms that last a long time, most of the data meet the requirements of independence.
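The declustering rule described above can be sketched as follows. This is a minimal illustration for a single series, assuming observation times expressed in days; the function name, argument names and the handling of block boundaries are hypothetical and are not taken from the paper. In the paper's setting the retained indices of the wind-speed peaks would also be used to pick the corresponding wave heights.

```python
import numpy as np

def decluster(times, values, block_days=5.0, min_gap_days=2.0):
    """Keep one maximum per block, then drop the smaller of two
    neighbouring maxima that are closer than min_gap_days."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)
    times, values = times[order], values[order]

    # Step 1: index of the maximum within each consecutive block.
    blocks = np.floor((times - times[0]) / block_days).astype(int)
    peaks = []
    for b in np.unique(blocks):
        idx = np.flatnonzero(blocks == b)
        peaks.append(int(idx[np.argmax(values[idx])]))

    # Step 2: enforce the minimum separation between retained peaks.
    kept = []
    for i in peaks:
        if kept and times[i] - times[kept[-1]] < min_gap_days:
            if values[i] > values[kept[-1]]:
                kept[-1] = i  # the later peak is larger, so it replaces the earlier one
        else:
            kept.append(i)
    return times[kept], values[kept]
```

With 5-day blocks, two block maxima can fall on either side of a block boundary and belong to the same storm, which is why the second pass with the 2-day separation is needed.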
After declustering according to the requirements of independence, 1436 pairs of extreme wind speed and corresponding wave height are selected. Their marginal distributions can be described by means of the GEVD. The GEVD includes three types of extreme-value distribution, and both marginal distributions use the three-parameter GEVD in this paper:
$$G(x) = \exp\left\{-\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]^{-1/\xi}\right\},$$
where $\xi$, $\sigma$ and $\mu$ are the three parameters of the GEVD (shape, scale and location, respectively); these are estimated by means of maximum likelihood. Figure 1 shows the probability plot of the marginal distributions. For the annual maxima of wind speed and wave height, a Pearson type III distribution is used to obtain return period values of wind speed and wave height in one dimension (see Fig. 2). The Pearson type III distribution is


The joint probability distribution
The bivariate logistic generalized Pareto distribution was selected, and the data were converted to a negative exponential distribution in the interval $(-1, 0)$ because the method is defined on the interval $(-\infty, 0)$. The MGPD model of the paper is based on the multivariate extreme-value distribution; the joint threshold can be calculated using the method developed by Coles and Tawn (1994). The joint threshold is $c_0 = -0.7$, and there are 450 pairs of wind speed and wave height over $c_0$. Figure 3a shows the samples of over-threshold values. In the left-hand panel of Fig. 3a, $c_0 = -0.7$ is a curve, and the right side of the curve shows the values over the threshold. In the right-hand panel of Fig. 3a, $c_0 = -0.7$ is a line, and the area to the top right of the line represents the over-threshold values of the converted data in polar coordinates. The joint distribution is shown in Fig. 3b.

Comparison of stochastic simulation results
Figure 4 shows the over-threshold values and the stochastically simulated data for N = 50 000 and N = 100 000, respectively.
The simulation results agree with the observations, showing that the MGPD method was successful. The scatter diagrams show the results directly, but further quantitative analysis is required to assess the differences between them objectively. Two of the conditional probabilities mentioned above are used in this paper: (1) $P(H > h \mid V > v)$ and (4) $P(H < h \mid V < v)$, which are (1) the probability of the wave height exceeding $h$ given that the wind speed exceeds $v$ and (4) the probability of the wave height being less than $h$ given that the wind speed is less than $v$, respectively. Both of these correspond to the probability of the joint occurrence of an extreme wave height and its corresponding wind speed.
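Given a set of simulated pairs, conditional probabilities (1) and (4) can be estimated directly from their definitions by counting. The sketch below is an illustration only: it assumes the simulated wave heights and wind speeds have already been transformed back to physical units, and the function names are hypothetical.

```python
import numpy as np

def mc_prob_exceed(h_sim, v_sim, h, v):
    """Estimate P(H > h | V > v) as a ratio of counts in the simulated sample."""
    h_sim = np.asarray(h_sim, dtype=float)
    v_sim = np.asarray(v_sim, dtype=float)
    cond = v_sim > v
    if not cond.any():
        return float("nan")  # no simulated point satisfies the condition
    return float(np.mean(h_sim[cond] > h))

def mc_prob_below(h_sim, v_sim, h, v):
    """Estimate P(H < h | V < v) as a ratio of counts in the simulated sample."""
    h_sim = np.asarray(h_sim, dtype=float)
    v_sim = np.asarray(v_sim, dtype=float)
    cond = v_sim < v
    if not cond.any():
        return float("nan")
    return float(np.mean(h_sim[cond] < h))
```

The precision of such an estimate depends on how many simulated points satisfy the conditioning event, which is one reason to compare results for different sample sizes N.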
Figure 5 shows the calculation of the conditional probability $P(H > h \mid V > v)$ for $h = 7.99$ m and $v = 37$ m s$^{-1}$. The Monte Carlo method is used to calculate this conditional probability from the result of the simulation based on the definition of con-

theory of extreme values, which is well founded, and the intrinsic properties of all extreme variables are taken into consideration.
2. Through analysis of conditional probability, the Monte Carlo method of MPOT has only small errors, and provides a solution for the analysis of multivariate and complex cases, and thus the technique shows promise for future use.
3. Conditional probability includes the probability of extreme events being encountered, and provides the theoretical basis for finding the best balance point between engineering cost and risk.
4. The MGPD model is able to describe the probability of multivariate extreme events occurring at the same time. A larger sample size than in traditional annual extreme-value methods allows the extreme features of the raw data to be retained as fully as possible.

$C = x_1 + \dots + x_d$ and $Z = (x_1/C, \dots, x_{d-1}/C)$ are the radial component and the angular component, respectively; together they are referred to as the Pickands polar coordinates. In Pickands polar coordinates, $W(X)$ presents different properties. Let us assume that $(X_1, \dots, X_d)$ follows the MGPD $W(X)$ and that its Pickands dependence function $D$ is $d$ times differentiable. We define the Pickands density of $H(X)$ as
$$\phi(z, c) = |c|^{d-1}\left(\frac{\partial^d}{\partial x_1 \cdots \partial x_d} H\right)\left(T_p^{-1}(z, c)\right).$$
If we assume that $\mu = \int_{R_{d-1}} \varphi(z)\,dz > 0$, where $\varphi(z)$ denotes the Pickands density $\phi(z, c)$ for $c$ in a neighborhood $(c_0, 0)$ of zero with a constant $c_0 < 0$, then the simulation method of the MGPD is as follows: (1) generate uniform random numbers on the unit simplex $R_{d-1}$; (2) generate a random vector $(z_1, \dots, z_{d-1})$ from the density function $f(z) = \varphi(z)/\mu$ of $Z = (z_1, \dots, z_{d-1})$ in Pickands polar coordinates by means of the acceptance-rejection method; (3) generate uniform random numbers $c$ on $(c_0, 0)$; and (4) calculate the vector $\left(cz_1, \dots, cz_{d-1},\; c - c\sum_{i=1}^{d-1} z_i\right)$, which is a random vector satisfying the multivariate over-threshold distribution.
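The four steps can be sketched for the bivariate case ($d = 2$) as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the bivariate logistic GPD $W_r(x, y) = 1 - ((-x)^r + (-y)^r)^{1/r}$; substituting $x = cz$, $y = c(1 - z)$ into its density gives a Pickands density $\varphi(z) = (r-1)\,(z(1-z))^{r-1}\,(z^r + (1-z)^r)^{1/r-2}$ that does not depend on $c$ and is largest at $z = 1/2$, which provides the constant envelope for acceptance-rejection. The function names, the value $r = 2$ in the usage note and the batch size are hypothetical.

```python
import numpy as np

def pickands_density(z, r):
    """Pickands density of the bivariate logistic GPD (independent of c)."""
    return (r - 1.0) * (z * (1.0 - z)) ** (r - 1.0) * (z ** r + (1.0 - z) ** r) ** (1.0 / r - 2.0)

def simulate_logistic_gpd(n, r, c0, rng=None):
    """Draw n pairs (x, y) with x + y in (c0, 0) from the logistic GPD."""
    rng = np.random.default_rng() if rng is None else rng
    bound = pickands_density(0.5, r)  # maximum of the density on [0, 1]
    accepted = np.empty(0)
    while accepted.size < n:
        # Steps (1)-(2): uniform candidates on the unit simplex [0, 1],
        # accepted with probability density / bound.
        cand = rng.uniform(0.0, 1.0, size=2 * n)
        u = rng.uniform(0.0, bound, size=2 * n)
        accepted = np.concatenate([accepted, cand[u <= pickands_density(cand, r)]])
    z = accepted[:n]
    # Step (3): radial component, uniform on (c0, 0).
    c = rng.uniform(c0, 0.0, size=n)
    # Step (4): back-transform to (x, y) = (c z, c - c z).
    return np.column_stack([c * z, c - c * z])
```

For example, `simulate_logistic_gpd(50_000, 2.0, -0.7)` would produce 50 000 standardized pairs above a joint threshold of $-0.7$; mapping them back to wind speed and wave height requires the inverse of the marginal transformation, which is not shown here.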

Table 1. Parameters of marginal distribution.

Table 2. Comparison of the results of conditional probability 1.

Table 3. Comparison of the results of conditional probability 4.