A Copula Based Bayesian Approach for Paid-Incurred Claims Models for Non-Life Insurance Reserving

Our article considers the class of recently developed stochastic models that combine claims payments and incurred losses information into a coherent reserving methodology. In particular, we develop a family of Heirarchical Bayesian Paid-Incurred-Claims models, combining the claims reserving models of Hertig et al. (1985) and Gogol et al. (1993). In the process we extend the independent log-normal model of Merz et al. (2010) by incorporating different dependence structures using a Data-Augmented mixture Copula Paid-Incurred claims model. The utility and influence of incorporating both payment and incurred losses into estimating of the full predictive distribution of the outstanding loss liabilities and the resulting reserves is demonstrated in the following cases: (i) an independent payment (P) data model; (ii) the independent Payment-Incurred Claims (PIC) data model of Merz et al. (2010); (iii) a novel dependent lag-year telescoping block diagonal Gaussian Copula PIC data model incorporating conjugacy via transformation; (iv) a novel data-augmented mixture Archimedean copula dependent PIC data model. Inference in such models is developed via a class of adaptive Markov chain Monte Carlo sampling algorithms. These incorporate a data-augmentation framework utilized to efficiently evaluate the likelihood for the copula based PIC model in the loss reserving triangles. The adaptation strategy is based on representing a positive definite covariance matrix by the exponential of a symmetric matrix as proposed by Leonard et al. (1992).


Introduction
As discussed in Merz and Wüthrich [16] the main task of reserving actuaries is to predict ultimate loss ratios and outstanding loss liabilities. In general such predictions are based on past information that comes from a variety of sources. Under a credibility based framework, the weighting of different data sources and their relative contribution to the estimated reserve is difficult to determine. Therefore, it is important to consider developing a unified prediction framework for the outstanding loss liabilities, known as the paid-incurred-claims (PIC) class of models. However, to date only simple dependence structures have been considered, with three parameters for the correlations which were not incorporated into the formal Bayesian estimation approach, and instead fixed deterministically a priori. There are two technical difficulties in extending the current restrictive assumptions within a Bayesian framework. The first is being able to generate the positive definite matrices; the second is evaluating the joint likelihood of the mixture copula defined over the observed payments and incurred losses in each accident year row of the reserving matrix. Our article significantly extends the dependence structure of current PIC models by solving these two problems. The first problem is resolved through utilisation of a class of matrix-variate Inverse-Wishart priors coupled with an adaptive Markov chain sampler that restricts the proposed Markov chain states to remain on the manifold of such matrices. The second problem is solved by using a data augmentation strategy which treats the unobserved parts of the loss triangle as missing data so that one can perform evaluation of the copula based likelihood required for inference on the model parameters.
In order to ensure the financial security of an insurance company, it is important to predict future claims liabilities and obtain the corresponding prediction intervals which take into account parameter uncertainty. The PIC model is a claims reserving method which statistically combines information about claims payments and incurred losses.
It allows actuaries to best utilise the available information for loss reserves. The Munich chain ladder method introduced by Quarg and Mack [24] is one of the first claims reserving approaches in the actuarial literature to unify outstanding loss liability prediction based on both sources of information. This method aims to reduce the gap between the two chain ladder predictions that are based on claims payments and incurred losses data, respectively. It is achieved by adjusting the chain ladder factors with paid-incurred ratios to reduce the gap between the two predictions. The main drawback with the Munich chain ladder method is that it involves several parameter estimates whose precisions are difficult to quantify within a stochastic model framework.
Merz and Wüthrich [16] recently introduced a log-normal PIC chain model and used Bayesian methods to estimate the missing (future) part of the claims reserving triangles based on both payment and loss incurred information.
Its major advantage is that the full predictive distribution of the outstanding loss liabilities can be quantified.
One important limitation of the model of Merz and Wüthrich [16] is that it does not develop the dependence properties of the PIC model that will be applicable to loss reserving data observed in practice. Our article extends the proposed Bayesian PIC models of Merz and Wüthrich [16] to capture additional dependence structures.

Brief Background
Dependence within payment data, within incurred loss data, and even between payment and incurred loss data commonly exists due to the nature of the loss process. Payment and incurred loss ratios in the previous development period are likely to impact that of the next development period. Hence, correlation between development periods is practically appealing in claims reserving practice. Moreover, incurred loss is essentially payment data plus case estimates which are projections foreseen by case managers to estimate the remaining payments. Correlation between payment and incurred loss data is also found. Happ and Wuthrich [11] propose a fixed covariance structure to describe the correlation between payment and incurred loss, assuming that the correlations between different development periods are identical. In reality, correlations differ across development periods for various reasons, such as different stages of the life cycle for a claim and internal policy changes. In order to fully incorporate the actual correlations, we introduce a block covariance structure to allow for the variation between different development periods within payment and incurred losses. We also develop a second class of hierarchical mixture of copulas models.
The estimation challenge involves constructing and sampling from the resulting Bayesian models for PIC with flexible dependence structures. To specify the model, we vectorize the triangular random structures for payments and incurred loss and, applying appropriate permutations, we then assume a copula dependence structure on the vectorized data. We use a Gaussian copula with an unknown correlation matrix, which is restricted to be block diagonal for parsimony, or a mixture Archimedian copulas across development periods.
We estimate the Bayesian models by MCMC methods, using data augmentation to generate missing data values in the loss triangle and use an adaptive Metropolis algorithm to generate the unknown parameters. Bayesian simulation methodology is used to carry out inference on all aspects of the models considered and to obtain predictive distributions for reserves.

Contributions
We design a novel class of PIC models and illustrate it with two examples. The first involves a mixture of Clayton and Gumbel copulas for upper and lower tail dependence features in the development years for payments and incurred losses. The second example involves a Gaussian copula model in which the covariance structure is a telescoping block diagonal form representation which captures dependence between development lag years in the payments and incurred losses. By a telescoping block diagonal matrix we mean one in which the main diagonal is comprised of sub-blocks for which each incremental sub-block contains one less row and column compared to the previous. In constructing these models we consider hierarchical Bayesian models with hyperparameters on the priors for development factors and specially developed matrix-variate priors on the covariance structures which preserves the conjugacy properties of the independence models developed in Merz and Wüthrich [16] and Merz and Wüthrich [15].
For the independent and Gaussian copula based PIC models we develop a class of conjugate posterior models that can be efficiently estimated via an MCMC sampler known as a block Gibbs sampler. However, the extension to general copula dependence structures requires non-conjugate priors, making it necessary to develop adaptive MCMC algorithms. Adaptive sampling uses previous iterates of the Markov chain to form more efficient Metropolis proposals for the parameters, this class of MCMC algorithm has received growing attention in the statistics literature since it was recently developed and is now recognized as an important tool for Bayesian inference. There is an increasing interest in utilizing adaptive MCMC to facilitate more efficient sampling (Andrieu and Thoms [2], Atchadé and Rosenthal [3]). The adaptive techniques that we adopt in this paper fall within the general framework of adaptive Metropolis, and employ the optimal scale factors (Roberts and Rosenthal [25]) from the Single Component Adaptive Metropolis (SCAM) algorithm (Haario et al. [10]). There have been some initial utilisations of adaptive MCMC specifically in financial modelling such as [21] and the references therein. In addition the adaption strategies we consider in this paper involve extensions of Euclidean space Adaptive Metropolis to the space of positive definite matrices, creating a class of matrix variate Markov chain adaptive proposals.
In the mixture copula based PIC models, we design data augmentation strategies which are a class of auxiliary variable methods. We modify these approaches to the PIC copula based models in order to circumvent the challenge of intractable likelihood evaluations which arise form the structure of the PIC reserving triangle. In particular we argue that the tail dependence features of the model should be consistent accross all development years for both payment and incurred loss data. This poses an evaluation challenge for the likelihood as it involves evaluation of marginal likelihood quantities given the observed data in accident year i, given by payment and incurred losses.
The integral required when utilising mixture copula structures over the accident years is intractable, therefore we introduce auxiliary variables into the Bayesian model in a data-augmentation structure to overcome this dificulty.

Review of the Merz-Wuethrich Independence Copula Paid-Incurred Claims Model
This section introduces the PIC model which involves two sources of information. The first is the claims payment data, which involves payments made for reported claims. The second source of data incorporated into the statistical estimation are the incurred losses corresponding to the reported claim amounts. The differences between the incurred losses and the claim payments are known as the case estimates for the reported claims which should be equal once a claim is settled. This imposes a set of constraints on any statistical model developed to incorporate each of these sources of data into the parameter estimation. We use the constraints proposed in Merz and Wüthrich [16] which are used to specify a model based on a claims triangle constructed from vertical columns corresponding to development years of claims and rows corresponding to accident years. This structure for the observed data is summarized in triangular form which is utlised for both the claims payments and the incurred losses, as presented in Figure 1.
Without loss of generality, we assume an equivalent number J of accident years and development years. Furthermore, we assume that all claims are settled after the J-th development year. Let P i,j be the cumulative claims payments in accident year i after j development periods and I i,j the corresponding incurred losses. Moreover, for the ultimate loss we assume the constraint discussed on the case estimates corresponds to the observation that predicted claims should satisfy P i,J = I i,J with probability 1, which means that ultimately (at time J) the claims reach the same value and therefore satisfy the required constraint.
We define (i) P 0:J,0:j = {P k,l : 0 ≤ k ≤ J, 0 ≤ l ≤ j}. (vi) Define the vectorization operator on a p × n matrix A, denoted by V ec(A), as the stacking of the columns to create a vector.
As in Merz and Wüthrich [16], we consider a Log-Normal PIC model as this facilitates comparison between existing results and the results we derive based on different dependence frameworks in extensions to this model.
We now introduce the PIC model and the statistical assumptions for the independent case, followed by remarks on the resulting marginal posterior models for the payment and incurred losses.
Model Assumptions 2.1 (Independent PIC Log-Normal (Model I)). The model assumptions for the independent model of Merz and Wüthrich [16] are: • The cumulative payments P i,j are given by the forward recursion (source Merz and Wüthrich [16]).
• It follows that Component2: Discounted final development year restricted payment and incurred Component3: incurred As noted in Merz and Wüthrich [16], there are several consequences of the model assumptions made regarding the restriction I i,J = P i,J which applies for all i ∈ {1, 2 . . . , J}. The first is that this condition is sufficient to guarantee that the ultimate loss will coincide for both claims payments and incurred loss data. The second is that this model assumes that there is no tail development factor beyond the ultimate year J. However this could be incorporated into such models, see Merz and Wüthrich [15].
Merz and Wüthrich [16] discuss the relationship between the proposed Independent Log-Normal PIC model and existing models in the literature for payment loss based reserving and incurred loss based reserving. In particular, Merz and Wüthrich [16] [Section 2.1 and 2.2] show that the resulting cumulative payments P i,j , conditional on model parameters Θ, will satisfy the model proposed in Hertig [12] and the incurred losses I i,j , conditional on model parameters Θ, will satisfy the model proposed in Gogol [8]. Lemma 2.2 summarizes their results.
Lemma 2.2. The relationships between consecutive payment development year losses in a given accident year is given conditionally according to in agreement with Hertig's model. With conditional moments given according to the Chain Ladder property as in Merz and Wüthrich [16,Lemma 5.2] by, Furthermore, conditional upon the model parameters Θ, for all 0 ≤ j < j + l ≤ J the relationships between consecutive incurred losses in a given accident year are given in Merz and Wüthrich [16] [Proposition 2.2] according to

5)
These results are consistent with the model assumptions of Gogol, and are derived using properties of multivariate normal distribution, see Lemma 2.1 in Merz and Wüthrich [16].

Dependence via Payment Loss Ratios and Incurred Loss Ratios (Model II)
This section generalizes the model by Happ and Wuthrich [11], which has a static covariance structure, see Happ and Wuthrich [11, Figure 1 Model Assumptions 3.1 (Dependent Payment-Incured Ratios: PIC Log-Normal (Model II)). The model assumptions for the Gaussian copula PIC Log-Normal model involve: • The random matrix Σ i ∈ R (2J+1)×(2J+1) representing the covariance structure for the random vector constructed from log payment ratios ξ i,j = log Pi,j Pi,j−1 and log incurred loss ratios ζ i,j = log in the i-th development year, denoted by Ξ i = (ξ i,0 , ξ i,1 , ζ i,1 , ξ i,2 , ζ i,2 , . . . , ξ i,J , ζ i,J ), is assumed distributed according to an inverse Wishart distribution prior (see definition and properties in Lemma A.2 and Lemma A.3), where Λ i is a ((2J + 1) × (2J + 1)) positive definite matrix and k i > 2J.
• Conditionally, given Θ = (Φ 0 , . . . , Φ J , Ψ 0 , . . . , Ψ J ) and the (2J + 1) × (2J + 1)-dimensional covariance matrix Σ, we have: -The random matrix, constructed from log payment ratios ξ i,j = log Pi,j Pi,j−1 and log incurred loss ratios , denoted by Ξ and comprised of columns where it is assumed in the model in Happ and Wuthrich [11] that Σ i = Cov(Ξ i ) = Σ. However, this need not be the case and it is possible to consider two extensions, the first in which Cov(Ξ i ) varied as a function of i ∈ {0, 1, . . . , J} and the second being the most general of these model structures given by the assumption -For all accident years, i ∈ {0, 1, . . . , J}, the ultimate payment losses and incurred losses are equal a.s., • The matrix Σ is positive definite and the components of Θ are independent with prior distributions

4)
and hyper-prior distributions This model extends the model developed in Happ and Wuthrich [11] which assumes that Σ is fixed and known with a tri-diagonal structure. The extension we consider generalizes the dependence structure to be unknown a priori and given an inverse Wishart prior for matrix Σ, so it forms part of the inference given the data, in the Bayesian inference. In addition, unlike in Happ and Wuthrich [11] where they assume Σ = Σ i , ∀i ∈ {0, 1, . . . , J}, we also allow for variation in Σ i across development years.
Given these model assumptions, we now consider two consequences of the proposed model structures for the dependence between the log payment ratios and the log incurred loss ratios given in Lemma 3.2 and Lemma 3.5.
Lemma 3.2. Conditional upon Λ i and k i , for all i in {0, 1, . . . , J}, and given the marginal distributions for Σ i follow Σ i ∼ IW (Λ i , k i ) with Λ i a ((2J + 1) × (2J + 1)) positive definite matrix and k i > 2J, the joint distribution for the (2J 2 + 3J + 1) × (2J 2 + 3J + 1) covariance matrix Σ for the vectorized matrix for Ξ, given by Ξ = V ec(Ξ), under the assumption of independendence between development years, results in a joint distribution given by: with degrees of freedom k = J i=0 k i > 2J 2 + 3J and scale matrix Furthermore, the joint prior mean and mode for the distribution of the random matrix Λ are Λ, and m Σ = 1 Remarks 3.4. It is noted in Happ and Wuthrich [11] and Lemma 3.2 that in formulating the likelihood structure for this dependent model it is more convenient to work with the one-to-one (invertible) transformation for the observed data defined marginally for the i-th development year according to

10)
where M i is the i-th column of matrix M and X i ∈ R 2J+1 defined by This results in the joint matrix variate Normal distribution for random matrix X = [X ′ 0 , X ′ 1 , . . . , X ′ J ] of all observed losses for payment and incurred data given after vectorisation X = V ec (X) by (3.11) Furthermore, if we consider the property of multivariate Gaussian distributions given in Lemma 3.5 we can find for the i-th accident year the required conditional distribution of the unobserved claims for payment and incurred loss data under the specified model. Furthermore, we can find the conditional distribution for unobserved claims for payment and incurred losses in the i-th accident year, given all observed claims triangles for payments and incurred losses data, see Lemma 3.5 below. This is directly relevant for specifying the resulting likelihood model. given Y (2) and the marginal distribution of Y (1) is (2) and the Schur complementΣ = Σ 1,1 − Σ 1,2 Σ −1 2,2 Σ 2,1 under the partitioning of the mean and covariance given by (3.13) Definition 3.6 below defines a family of permutation matrix operators. This permutation family allows the representation of the vectorization of the two loss triangles under different permutations that facilitate dependence specifications in the proposed models that admit conjugacy.
′ . Define the family of permutation matrix operators, denoted by P * i and indexed by p × 2, p ≤ n 2 , indices matrix (vector of tuple elements) i with j-th element [i ] j = {(k, l) ; k, l ∈ {1, 2, . . . , n}}, and defined according to the mapping P * i : V ec(Y ) → V ec(Y ) * given by where we define Y [i ] j as the element of matrix Y corresponding to the resulting tuple index location in the j-th element (column) of (tuple vector) i , P * i an n 2 × n 2 permutation matrix defined by

15)
and P i is a matrix with only non-zero identity elements at the p locations in the indices matrix tuples in i corresponding to index elements.
Using the property of the multivariate Gaussian distribution in Lemma 3.
the dependence structure Ω = I J+1 and the observed payment losses and incurred losses in the i-th accident year, denoted by X the conditional distribution for the log of the unobserved payment losses and incurred losses (X for which the first J − i elements of the permuted random vector X * correspond to all un-observed payment and incurred loss random variables, and the remaining elements are the observed payment and incurred data, denoted X * . Then, conditional on the model parameters Θ, the general dependence structre Σ = Σ ⊗ Ω with matrices Σ and Ω, and the following results hold: • The conditional distribution for the log of the unobserved payment losses and incurred losses in the i-th year, corresponding to the first J − i elements of the permuted random vector X * sub-matrix denoted below by Γ and defined by the top sublock of the permuted covariance matrix (3.20) • Given, this covariance matrix one specifies the conditional mean vector, denoted bȳ − µ (2) , according to the subblocks of the Γ covariance matrix defined with respect to the first J −i elements X * (1) and remaining elements of X * as well as µ (1) = P * i V ec(M ) 1:J−i and the second J−i to J−i+ J n=−1 (J − n) elements are given by µ (2) Having specified these statistical assumptions, we can formulate the joint likelihood from the observed data for both payments and incurred claims conditional upon the model parameters according to Equation .
We note that our proposed models also allow one to consider the dependence structures of Happ and Wuthrich [11] who assume that Σ i = Σ, ∀i ∈ {0, 1, . . . , J} and Ω = I J+1 , with the specific setting of Σ via a tri-diagonal correlation matrix with three correlation parameters which are assumed either known a priori or estimated prior to inference in the PIC model. Such an approach was motivated by the belief that a positive change in incurred loss results in an immediate payment in the same development period, and the remaining increased expectation is paid with some settlement delay. Therefore, the incurred losses increments ζ j i are assumed to be positively correlated to the claims payments increments ξ i,j , ξ i,j+1 and ξ i,j+2 with positive correlations ρ 0 , ρ 1 , ρ 2 , respectively. However, the argument for more general dependence structure that were introduced as extensions to the model of Happ and Wuthrich [11] are developed to account for the fact that these assumption may not be true, especially in long tail portfolios, such as compulsory third party. If the status of a claimant changes and requires long term medical treatment and rehabilitation, it might result in substantially high loss in the subsequent lengthy lag periods. The paper also assumes that the dependence is the same across different lag years, which is not always a realistic feature of such data. Our article aims to fill this gap and enhance the correlation structure in PIC models whilst maintaining a parsimonious model specification.

Dependence Between Development Lag Years for Payment Losses and Incurred Losses (Model III)
This section considers an alternative dependence structure motivated by the fact that dependence between lag years is practically appealing in claims reserving practice. It affects the estimation of outstanding claims the most, and is widely recognized by actuaries in claims reserving. Lag is the measure of the difference between incurred month and paid month. Depending on the nature of the portfolio, many insurance claims often have lengthy settlement periods due to various reasons such as late reported claims, judicial proceedings, or schedules of benefits for employer's liability claims. During the lengthy lag periods, large payments in the previous lag period normally follow by small payments in the subsequent lag period. There may in fact be positive correlation if all periods are equally impacted by a change in claims status, e.g. if a claim becomes litigated, resulting in a huge increase in claims cost. There may also be negative correlation if a large settlement in one period replaces a stream of payments in later periods.
The model developed in this section mainly focuses on capturing this feature of dependence between lag years. To achieve this we propose a block covariance structure for the covariance matrix, which will respect the dependence between each lag point. The model we propose is summarised in Model Assumptions 3.9 below.
Model Assumptions 3.9 (Dependent Development Lag Years: PIC Log-Normal (Model III)). The following statistical model assumptions are developed: in the i-th accident year and analogously for incurred loss data Σ I i ∈ SD + (J − i). When i = 0 we consider Σ P 0 ∈ SD + (J + 1) and for incurred loss data log I 0,0:J−1 with Σ I 0 ∈ SD + (J). Assume an inverse Wishart distribution (see Lemma A.3 and Lemma A.2) for each matrix defined according to where Λ P i and Λ I i are the inverse scale matrices for the prior for the payment and incurred loss data covariance priors respectively. Hence, the joint covariance between all observed payment and incurred loss data satisfies the telescoping diagonal block size covariance structure: (3.23) • Conditionally, given Θ = (Φ 0 , . . . , Φ J , Ψ 0 , . . . , Ψ J ) and the covariance matrix Σ, we have the following results -Consider the marginal distribution of the first J n=−1 (J − n) elements of the vectorized random matrix of observed payment and incurred losses, with i-th column X i ∈ R 2J+1 given by
-For all accident years, i ∈ {0, 1, . . . , J}, the ultimate payment losses and incurred losses are equal almost surely, P i,J = I i,J .
• The matrix Σ is positive definite and the components of Θ are independent with prior distributions and hyper-prior distributions for all i ∈ {1, . . . , J} and j ∈ {0, . . . , J}.
This proposed model is therefore another generalization of the dependence structure of the model structure proposed in Happ and Wuthrich [11]. As such, the likelihood structure is given by the multivariate Gaussian given in Equation (3.21) with the covariance matrix given by the telescoping diagonal block size covariance matrix structure in Equation (3.23).

Hierarchical Bayesian Conjugacy Under Gausian Copula Dependent PIC: Models I, II, III
Under the Gaussian copula based dependence models, the ability to obtain the observed data likelihood in the form of a multivariate Gaussian distribution means that we obtain conjugacy properties. This makes the estimation of such models by MCMC more efficient because we can us Gibbs sampling in blocks. This section presents a generic set of such conjugate models for any of the dependence structures specified in Models I, II and III.
Lemma 3.10. Conditional upon the parameters Θ and the covariance matrix Σ, the permuted data P * i (V ec(X)) can be transformed to produce the independent likelihood in Equation (2.2). This is achieved by considering the class of vector transformations T : R (d×1) → R (d×1) , such that if the initial covariance structure of random vector X was given by Σ = Cov (X), then the resulting covariance structure Cov (T (X)) = I d . The required rotation-dilation transformation is obtained by the spectral decomposition of the covariance according to a spectral decomposition (see Stoica and Moses [27] matrix of the eigenvalues of Σ. Therefore the following holds for each of the models under a transform of the vector of permuted observations T P * i (V ec(X)) : where each accident year's dependence between payments and incurred losses is given by the (2J + 1) where each of the covariance matrices Σ P i and In each case, the resulting transformed random vector T P * i (V ec(X)) , with elements P i,j and I i,j , will produce a likelihood model given for the transformed data according to the independent Model I of Merz and Wüthrich [16] as defined in Equation (2.2). Of course this is defined now with respect to components in the likelihood corresponding to the transformed components, as detailed in Equation (3.11).
Remarks 3.11. The consequence is that results in Lemma 3.10 are that the conjugacy properties derived for the independent model in Merz and Wüthrich [16] can be directly applied post-transformation. This is of direct interest for MCMC based sampling schemes.
In the models described so far, the following full conditional posterior distributions are now of relevance to the Bayesian MCMC estimation procedures developed for Models I, II and III.
with posterior mean Π and posterior covariance ∆, where the components of ∆ −1 = (a n,m ) 0≥n,m≤2J are each given by where δ n=m is the indicator of the event that index m matches n, m ∧ n is the minimum of m and n and the posterior mean is given on the transformed scale by, Given the transform vector Φ 0:J , Ψ 0:J , the parameters on the orginal scale can be expressed according to the unique solution to the system of linear equations: 1. Model II -On the untransformed scale, the solution is given by the following system of equations Model II -On the untransformed scale, the solution is given by the following system of equations for each i ∈ {0, 1, . . . , J}, where we can randomly select i or deterministically scan through i for the results, 3. Model III -On the untransformed scale, the solution is given by the following system of equations, • Conjugate Posterior Distribution for the Covariance Matrix: Given the transformed observed payment and incurred losses have a multivariate Gaussian likelihood, as specified in Equation (3.21), with covaraince matrix Σ = Σ ⊗ Ω and mean vector V ec (M ). Then the posterior for the covariance matrix is the Inverse-Wishart-Gaussian distribution detailed in Peters and Mellen [22] [Section 3] and Peters and Godsill [23] Σ|Φ 0:J , Ψ 0: In cases in which the covariance matrix Σ takes any of the block diagonal forms presented in Models II and III, we may utilise Lemma A.2 and the result in Equation (3.12) to further decompose the posterior covariance into blockwise components.
• Conjugate Posterior Distribution for the Hyper-Parameters on Development Factors: For all i we have the following Inverse Gamma-Gaussian conjugacy for the hyper parameters in Models II and III, We next present alternative tail dependence structures for the PIC model. Previous studies on claims reserving that have incorporated copula based models, such as Zhang and Dukic [29] have done so through regression based frameworks. Zhang and Dukic [29] develop a parametric copula model to account for dependence between various lines of insurance claims. Their paper considers a bivariate Gaussian copula model with marginal generalized linear models to capture the positive correlation between the two insurance lines. Our article significantly extends the dependence modelling capability of the PIC model structure remaining in the frameworks presented above.
However, to do so requires the introduction of auxiliary variables to enable computation. The approach developed involves modifying the posterior distribution by embeding the target posterior distribution for the model parameters into a much higher dimensional support comprised of the original model parameters and the additional auxiliary variables. The reason for this expansion of the posterior dimensions will be come clear below and is in general known in Bayesian statistics as an auxiliary variable framework.

Models: Model IV
This section presents an alternative parameteric approach to modelling and capturing dependence and tail dependence in the PIC model structure which involves considering copula based models within the PIC reserving framework. The dependence can be considered over the following combinations such as: 1. Independent accident years and dependence between payment losses over the development years; 2. Independent accident years and dependence between incurred losses over the development years; 3. Independent accident years and dependence jointly between payment and incurred losses over the development years via a mixture copula, hierarchical copula (HAC) as in Kurowicka  We present fundamental properties of members of the Archimedean family of copula that we consider when constructing mixture copula models in the PIC framework in the Appendix, see Lemma B.1 for the characteristics of the Archimedean family of copulas and Lemma B.2 for the required distribution and densities for three members of this family. In addition references Denuit et al. [6],Aas et al. [1],Embrechts [7], Min and Czado [17] and Patton [19] provide more detail.
In Lemma B.1 the property of associativity of Archimedean copula models is particularly useful in the PIC model framework as it allows us to obtain analytic expressions for the likelihood structure of the matrix-variate PIC model. This is particularly useful if one specifies the model as a hierarchical Archimedean Copula (HAC) construction.
We consider the following popular members of the Archimedean family of copula models, due to their analytic tractability, their non-zero tail dependence properties and their parsimonious parameterizations. In addition, generating random variates from these class of models is trivial given the generator for the member of the Archimedean family of interest. Lemma B.2 in the appendix presents the three Archimedean copulas for Clayton, Gumbel and Frank copulas that we consider and their properties. We use the following notation for copula densities we consider on [0, 1] d , see Nelsen [18, Section 4.3, Table 4.1] and Lemma B.2: the Clayton copula density is denoted by c C (u 1 , ..., u n ; ρ C ) with ρ C ∈ [0, ∞) the dependence parameter; the Gumbel copula density is denoted by c G (u 1 , ..., u n ; ρ G ) with ρ G ∈ [1, ∞) the dependence parameter; and the Frank copula density is denoted by c F (u 1 , ..., u n ; ρ F ) with ρ F ∈ R/{0} the dependence parameter.
In addition, we also note that the properties of these copulas of interest include that the Clayton copula does not have upper tail dependence, however its lower tail dependence can be expressed as λ L = 2 −1/ρ C . The Gumbel copula does not have lower tail dependence, however its upper tail dependence of the Gumbel copula can be expressed as λ U = 2 − 2 1/ρ G . The Frank copula does not have upper or lower tail dependence.
In this class of copula dependence models we consider the marginal distribution of each log payment or log incurred loss as distributed according to a Gaussian distribution and the joint distribution vector is modelled via a mixture copula comprised of the above three components from the Archimedean family. Such a copula construction will still produce a copula as shown in Lemma 4.1.

Understanding Bayesian Data Augmentation
The modeling framework of Data Augmentation in the Bayesian framework is typically invoked to deal with situations in which the likelihood evaluation is intractable to perform point-wise. This would make Bayesian inference in such a model also generally intractable. For example if one considers the generic likelihood p (y 1:n |θ) with observation random vectors Y 1:n , which can be evaluate point-wise as a function of parameter vector θ with respect to a realization of the observation process y 1:n .
In the setting we encounter in the PIC models, we can generically consider the data random vector observation is partitioned into two vector sub-components Y = Y (1) , Y (2) , of which only one component, say Y (1) , is actually observed. Then evaluation of the likelihood pointwise for θ given a realization of Y where Y (2) * 1:n are auxiliary random vectors with prior distribution p Y (2) * 1:n |θ , 'augmented' to the posterior parameter space to allow tractability of the posterior inference. This will be explained in detail for the PIC copula models below.

Data Augmentation in the Bayesian PIC Copula Models
Definition 4.2 gives some useful notation for the results that follow. Definition 4.2 (Auxiliary Data for Data Augmentation). Consider the defined loss data under the one-toone (invertible) transformation for the observed data given by the joint matrix for all observations and auxiliary variables given by X = [X ′ 0 , X ′ 1 , . . . , X ′ J ]. In this framework, the i-th accident year is defined according to, X i = [log I i,0 , log P i,0 , log I i,1 , log P i,1 , . . . , log I i,J−1 , log P i,J−1 , log I i,J ]. Consider the permutation of each vector of log payments and log incurred losses given by . Now consider the further partition by the decomposition of observed log payment losses and unobserved log payment losses as well as these quantities for the incurred losses defined for the i-th accident year by, Therefore the total data matrix of losses is given by X = X 0 , . . . , X J . Note, the introduction in this section of the notation subscripts obs and aux allows us to make explicit the fact that the upper triangle of log payment losses and the upper triangle of log incurred losses are un-observed quantities for these random variables, while the lower triangular regions for such losses are observed. We denote these random variables as auxiliary variables (augmented) to the observed data random variables to create a complete data set of all losses.
By considering the unobserved data in the lower payment and incurred loss triangles as auxiliary variables to be jointly estimated along with the model parameters, we will demonstrate below that only under this approach is consistency ensured in the copula structure of the PIC model. However, we first make the following model assumptions about the statistical features of the PIC model.
The following assumptions illustrate a choice of copula models for the mixture from the Archimedean family.
However, there are many related specifications and frameworks that can be explored in this context, be we leave that to future research.

Model Assumptions 4.3 (Data-Augmented
Mixture Copula PIC (Model IV )). The model assumptions and specifications for the copula model we develop involve: • Let the random matrix Σ i ∈ R (2J+1)×(2J+1) be the covariance for X i = X P i,obs , X P i,aux , X I i,obs , X I i,aux with X i ∈ R 2J+1 for all i = 0, . . . , J. We assume that Σ is diagonal where where α i and β i are the known hyper-parameters for shape and scale.
Here we only consider the case of Ω = I J+1 for the marginal independent case.

5)
with supper script P and I denote the components for the log payments and log incurred losses in the i-th development year respectively and the density is given by ∈ R/ {0}. Therefore the total conditional distribution corresponding to the likelihood model considered is given by,

Marginal Distribution in Data Augmented Likelihood PIC Model
(4.7) • Assume that the tail dependence features of the Data-Augmented copula PIC model are such that the dependence structure is homogeneous accross accident years, ρ P = ρ P i and ρ I = ρ I i for all i ∈ {0, 1, 2, . . . , J}. • Conditional on Σ, Φ = [Φ 0 , Φ 1 , . . . , Φ J ] and Ψ = [Ψ 0 , Ψ 1 , . . . , Ψ J ] the hierarchical prior distribution on the auxiliary payment data for the i-th accident year is given by a normal distribution, centered on the development year mean, The hierarchical prior distribution on the auxiliary incurred loss data for the i-th accident year is given by with Σ 2 the lower portion of covariance Σ corresponding to the lower triangle matrix from (J − i + 1) through to J for all i ∈ {0, 1, 2, . . . , J}.
• The matrix Σ is distributed as Σ ∼ IW (Λ, k) and the copula parameters are distributed as ρ G,P ∼ IG α G , β G , ρ C,P ∼ IG α C , β C and ρ F,P ∼ N 0, σ F Hence, we have made precise the auxilliary data scheme used in formulating the Data-Augmented-PIC model. In particular illustrating the importance of the role of the auxiliary data in evaluation of the model and estimation of the PIC claim development factors. Also we note we get indirectly via the data augmentation the distribution for the predicted payment and incurred Loss reserves.
Remarks 4.4. The following remarks provide motivation for the Data-Augmentation and resulting incorporation of auxiliary payment and incurred Losses data.
• The use of data augmentation in the above model structure is critical in the PIC model formulation, since it allows one to ensure that the dependence structure considered (in this case a HAC-Mixture) is consistent both across accident years and across development years. Note: In the case of a linear dependence structure such as with a covariance / correlation matrix under a Gaussian Copula or Independent Copula model, such as those presented previously under Models I,II, III, we have that conditional distributions and marginal distributions are Gaussian. This means that the evaluation of the likelihood is analytic without the need for auxiliary variables.
• In order to evaluate the likelihood one has two choices, to evaluate the observed data likelihood (Equation (4.12)) or to evaluate the full data likelihood (Equation (4.7)).
-The PIC copula model equivalent of Equation 4.2 is the observed data likelihood is given for the i-th accident year by p X P i,obs , X I i,obs |Θ, Σ, Ω, ρ = · · · p X P i,obs , X I i,obs |Θ, Σ, Ω, ρ, X P i,aux , the equivalent mean. -Clearly, the marginalization required to evaluate the Observed data likelihood involves intractable integration, except in special cases in which the copula models are Gaussian or independence copulas.
• The full data likelihood comprised of observed and auxiliary data involves incorporating auxiliary variables to represent the unobserved data in the lower reserve triangle for payment and incurred loss triangles. These become part of the inference procedure and are required to be estimated jointly with the model parameters in the estimation methodology.

Estimation via Adaptive Data-Augmented MCMC for Claims Reserving PIC Models
It has been shown for the Independent and Gaussian copula models that we can obtain the observed data likelihood analytically. Therefore the posterior distribution for all the model parameters can be sampled via a MCMC procedure comprised of block Gibbs sampler updates. In the case of a more general copula dependence model in which the observed data likelihood cannot be analytically evaluated pointwise, we must resort to a Data Augmentation scheme. In this case we will be able to perform sampling via a general MCMC Metropolis-Hastings sampler. In particular we will consider automating such a sampler using an adaptive MCMC scheme.

Adaptive Metropolis within Data-Augmented Copula PIC Models
This section presents the adaptive proposal we use to sample the parameters and the auxiliary variables. The advantage of an adaptive MCMC mechanism is that it automates the proposal design through consideration of a proposal distribution that learns the regions in which the posterior distribution for the static model parameters and auxiliary data has most mass. As such, the probability of acceptance under such an on-line adaptive proposal is likely to improve as the iterations progress and the generated MCMC samples will ideally have reduced autocorrelation. In such cases the variance of Monte Carlo estimators of integrals of smooth functionals formed from such samples will be reduced.
There are several classes of adaptive MCMC algorithms, see Roberts and Rosenthal [25]. The distinguishing feature of adaptive MCMC algorithms, compared to standard MCMC, is the generation of the Markov chain via a sequence of transition kernels. Adaptive algorithms utilize a combination of time or state inhomogeneous proposal kernels.
Each proposal in the sequence is allowed to depend on the past history of the Markov chain generated, resulting in many possible variants.
Haario et al. [10] develop an adaptive Metropolis algorithm with proposal covariance adapted to the history of the Markov chain was developed. Andrieu and Thoms [2] is presenting a tutorial discussion of the proof of ergodicity of adaptive MCMC under simpler conditions known as Diminishing Adaptation and Bounded Convergence. We note that when using inhomogeneous Markov kernels it is particularly important to ensure that the generated Markov chain is ergodic, with the appropriate stationary distribution. Two conditions ensuring ergodicity of adaptive MCMC are known as Diminishing Adaptation and Bounded Convergence. These two conditions are summarised by the following two results for generic Adaptive MCMC strategies on a parameter vector θ. As in Roberts and Rosenthal [25], we assume that each fixed MCMC kernel Q γ , in the sequence of adaptions, has stationary distribution P (·) which corresponds to the marginal posterior of the static parameters. Define the convergence time for kernel Q γ when starting from a state θ ∈ E, as M ǫ (θ, γ) = inf{s ≥ 1 : Q s γ (θ; ·) − P (·) ≤ ǫ. Under these assumptions, they give the following two conditions which are sufficient to guarantee that the sampler produces draws from the posterior distribution as the number of iterates tend to infinity. The two sufficient conditions are: • Diminishing Adaptation: lim n→∞ sup θ∈E Q Γs+1 (θ, ·) − Q Γs (θ, ·) tv = 0 in probability. Note, Γ s are random indices.
• Weak Law of Large Numbers: lim j→∞ In general, it is non-trivial to develop adaption schemes which can be verified to satisfy these two conditions. In this paper we use the adaptive MCMC algorithm to learn the proposal distribution for the static parameters in our posterior Φ. In particular we work with an adaptive Metropolis algorithm utilizing a mixture proposal kernel known to satisfy these two ergodicity conditions for unbounded state spaces and general classes of target posterior distribution, see Roberts and Rosenthal [25] for details.

Euclidean and Riemann-Manifold Adaptive Metropolis within Data-Augmented Copula PIC Models
This section presents the specific details of the Adaptive Metropolis algorithm that we combine with Data- and matrix Σ. At the j-th iteration of the Markov chain we have existing state Υ (j−1) and Σ (j−1) which is used to construct the proposal distribution q Υ (j−1) , Υ * q Σ (j−1) , Σ * . The choices we make for the two proposals will involve a novel development of a new adaptive proposal for positive definite matrices, required for the covariance matrix Σ should we choose not to specify it as diagonal.

Euclidean Space Adaptive Metropolis for Static Parameters:
We first detail the proposal for updating Υ using a mixture of multivariate Gaussian distributions as specified for an Adaptive Metropolis algorithm which involves sampling from the proposal where we define the sample covariance for Markov chain past history by Cov Υ (j) 0≤j≤t−1 and we note the following recursive evaluation, which significantly aids in algorithmic computational cost reduction The theoretical motivation for the recommended choices of scale factors 2.38, 0.1 and dimension d are provided in Rosenthal et al. [26].

Riemannian Manifold Adaptive Metropolis for Covariance Matrices:
Next we develop a novel proposal distribution for the sampling of the covariance matrix Σ ∈ Sym + (d) in an adaptive MCMC proposal, restricted to the Riemann manifold of symmetric, postive definite (d × d) matrices, denoted by the space Sym + (d). , are not independent. The consequence of this is that we cannot simply apply the property of closure under convolution of independent Wishart distributed random matrices to find a suitable proposal.
Therefore, we will adopt a strategy to perform adaptive moment matching of a distribution with support Sym + (d).
We detail one possibility involving an inverse Wishart distribution fitted to the sample mean of the marginal posterior for the covariance. We note that future work could also consider specifying a distribution on the superset of the Riemannian manifold of symmetric positive definite matrices, given by the Riemannian manifold of symmetric Adaptive Metropolis inverse Wishart Mixture: We note that one way to achieve this is a mixture of inverse Wishart distributions given by  4)), we note that we need to ensure that the sample average considered is restricted to the Riemann-manifold of positive definite matrices.
This is satisfied through the choice of the estimator To see this we observe that since we only form positive linear combinations of matrices on this manifold, with a scaling, such linear combinations will always remain on the manifold Sym + (d).

Real Data Analysis
To illustrate the proposed models and compare with existing models and estimation methods in the actuarial literature we consider, as in Merz and Wüthrich [16], the example presented in Dahms [4] and Dahms et al. [5] (Tables 10 and 11). As in the second analysis framework in Merz and Wüthrich [16], we treat the claim development factors, the likelihood dependence parameters and the hyperparameters on the claim development factor priors as parameters which we incorporate into the posterior inference.
We we carried out convergence diagnostics. This included the Gelman-Rubin R-statistics (all less than 1.5), the ACF plots for each parameter were checked to ensure all parameters had ACF's which were less than 10% by lag 20.
Then the first 20% of samples were discarded as burnin and the remaining samples were used in inference results presented below.

Results: Euclidean and Riemann-Manifold Adaptive Metropolis for hierarchical Bayesian Copula PIC Models
In the simualtion results, we consider a block Gibbs sampler with the following three stages performed at each iteration of the adaptive Metropolis-within-Gibbs sampler for the PIC Model III and Model IV:

23). Under this hierarchical
Bayesian model, the joint covariance between all observed payment and incurred loss data under the dependent development years assumption, satisfies a telescoping diagonal block size form covariance matrix structure.
Hence, the sampling of this structure can be performed blockwise on each covariance sub-block; (Mixture Clayton-Gumbel Copula Model IV) -Perform Euclidean space Adaptive Metropolis updates of the mixture copula parameters.

Hierarchical Bayesian Gaussian Copula (telescoping block covariance) PIC (Model III)
This section presents the estimation results for the Gaussian Copula based PIC models (Model III) on the real data. Figure 2 summarizes the dependence structure by a heatmap for the posterior distribution of the Gaussian copula covariance matrix. As mentioned in the introduction, the telescoping block covariance refers to the fact that the covariance structure is reducing in rank by 1 on each diagonal block for the payment data and then the incurred data. This model has the joint covariance between all observed payment and incurred loss data under the assumption that the development years are dependent, satisfying a telescoping diagonal block size form covariance matrix structure. Summarising the information from such posterior samples for distributions of covariance matrices is non-trivial as discussed in Tokuda et al. [28], where they develop a four layer approach. Our article adopts aspects of the ideas proposed in Tokuda et al. [28] to interpret the features of the posterior distribution samples for the dependence structures.
The posterior mean for estimated PIC covariance structure is obtained by using Monte Carlo samples from the Riemann-Manifold Adaptive Metropolis sampler and given by the estimator, is the s-th sample of the J(J − 1) × J(J − 1) covariance matrix. The estimated posterior mean covariance matrix is reported in a heatmap for the correlation matrix in Figure 2. In addition, we present examples based on posterior mean covariance for covariance sub-blocks p Σ P 4 |P , I and then for p Σ I 4 |P , I , where Σ P 4 ∈ SP + (6) and Σ I 4 ∈ SP + (5), again converted to heatmaps of the correlation. We see that although the priors selected for the dependence features in Model III in all cases favoured independence, since the scale matrices were all diagonal i.e. Λ P 5 = I 6 and Λ I 4 = I 5 , the resulting summaries of the marginal posteriors of the covariances clearly indicate non-trivial dependence patterns in the development years within the payments data and the incurred loss data. This is observed throughout each sub-block covariance matrix. Table 9   year on the ability of the development factors in the PIC model to explain variation in the observed loss data. Table 9 summarises the results for the average PCA weight (largest eigenvalue) and average posterior eigenvector.
Tokuda et al. [28] develops a framework which formalizes an approach to the summary of dependence structures.
For the running example of results that we present for distributions p Σ P 4 |P , I and p Σ I 4 |P , I , under such an approach the third and fourth layers of summary are presented in Figure 3. This involves the presentation of contour maps of these marginal posteriors that are constructed using adaptive MCMC samples of these matrices.
In Figure 4 Table 1. We note that the results in this section for the Gaussian copula models are obtained using the log ratio observational data and the restults for the Mixture Archimedian copula model are more conveniently obtained using the log observations (not ratio data).
It is also worth noting other approaches that can be adopted in the case of the Gaussian copula model. One could also included a data-augmentation stage in the analysis as was utilised in the Mixture Archimedian copula example. In addition, the covariance matrices could have been specified under different structures with more or less parsimony. The examples utilised in this section were those which provided a reasonable trade-off between parsimonious model specification, while allowing a meaningful decomposition of the results.
The results of the comparison between the Gaussian copula PIC model and the independent PIC model illustrated that whilst the posterior marginal mean development factor estimates are not affected by the dependence feature included, the marginal posterior shape is affected. This is reflected by the comparison of the posterior confidence intervals for the Gaussian copula PIC model when compared to the payment or incurred individual models where there is a significant difference present in the shapes of the marginal posterior. It is expected that this will have implications for the estimation of reserves using these different will be quantified in the next section.

Data-Augmented hierarchical Bayesian Mixture-Archimedian Copula PIC (Model IV)
This section presents the estimation results for the mixture of Clayton and Gumbel Copula based PIC models (Model IV) on the real data are presented in this section. Figure 5 presents a summary of the mixture copula dependence structure obtained from posterior samples of the copula parameters under the hierarchical Bayesian model. The results in this section are obtained using the log observational data, not ratio data. The figures summarise succinctly the estimated posterior dependence structure for the hierarchical Bayesian mixture Copula model, through plots of the dependence structure as captured by the estimatd mixture copula distribution, the scatter plots of copula parameter for the lower tail and rank correlation (Kendall's tau) and the upper tail copula parameter versus rank correlation. These results clearly demonstrate posterior evidence for non-trivial tail dependence features in the payment and incurred data, as well as potential for asymmetry in the upper and lower tail dependence. Note, uniformative prior choices were made on the copula parameters with uniform priors over [0,50] and [1,50] respectively, indicating these estimated copula parameters are data driven results.

Models
This section discuss the effect of modelling the dependence structures on the reserving estimates. First we note two important details in calculating the reserves. We need to be able to draw samples from the predictive distributions for the payment and incurred data given below, for each accident year i, using In general it is not possible to solve these integrals analytically. Howerver, for the Gaussian copula models developed  Next we consider the distributions of the outstanding loss liabilities estimated using the S samples from the MCMC obtained for the posterior PIC model. We denoted these by random variables R(P , I) (s) s=1:S where R(P , I) (s) = P i,J − P i,J−i and depending on whether payment, incurred, or both data is present we denoted by R(P ) (s) , R(I) (s) and R(P , I) (s) respectively. Figure 8 presents the MCMC estimated claims reserve marginal posterior predictive distributions for each accident year per model developed. We compared our results to those obtained in Merz and Wüthrich [16] and find good agreement between the mean reserve per accident year and each proposed model. In addition, we note the possible differences between the distributions can be attributed to the utilisation of the full versus partial hierarchical Bayesian models in this paper and the different dependence 15  structures considered. Additionally, we note that further analysis on comparisons to existing models in the literature can be obained for the models of Mack [14], Dahms [4] and Quarg and Mack [24] for this data analysis in Merz and Wüthrich [16] [ Table 4] and in the spreadsheet provided by Professor Mario Wuethrich at URL 1 .

Conclusions
This paper extends the class of PIC models to combine the two different channels of information as proposed in Merz and Wüthrich [16] by introducing several novel statistical models for the dependence features present within and between the payment and incurred loss data. This allows us to obtain a unified ultimate loss prediction which incorporates the potential for general dependence features. To achieve this we developed full hierarchical Bayesian models which incorporate several different potential forms of dependence, including generalized covariance matrix structure priors based on inverse Wishart distributions and conditional Bayesian conjugacy in the PIC independent log-normal model. This forms a general class of Gaussian copula models which extends the approach of Happ and Wuthrich [11].
Second, we develop a class of hierarchical mixture Archimedian copula models to capture potential for tail dependence in the payment and incurred loss data, again developing and demonstrating how to appropriately construct a full Bayesian model incorporating hyperpriors. In this regard, we also develop a class of models in which dataaugmentation is incorporated to both overcome challenging marginal likelihood evaluations required for the MCMC methodology to sample from the PIC Bayesian models. This had the additional feature that it also allowed for joint Bayesian inference of the reserves as part of the posterior inference. Finally, to perform inference on these approaches we developed an adaptive Markov chain Monte Carlo sampling methodology incorporating novel adaptive Riemann-manifold proposals restricted to manifold spaces (postive definite symmetric matrices) to sample efficiently the covariance matrices in the posterior marginal for the Gaussian copula dependence. We make these advanced MCMC accessible to the actuarial audience to address challenging Bayesian inference problems in Claims Reserving modelling.
The consequence of these models for actuaries is that a new extended suite of flexible dependence structures have been incorporated into the recently proposed PIC models. These can now be extended and compared to existing chain ladder methods. We perform an analysis on real payment and incurred loss data discussed in Merz and Wüthrich [16] and compare our models with the analysis for the independent PIC model (partial) and the (full) Bayesian PIC model as well as several different dependent models and the payment only model. Furthermore, we provide reference on further comparisons to the alternative models of Mack [14], Dahms [4] and Quarg and Mack [24] for this data.

Acknowledgement
We would like to thank Prof. Mario Wuethrich of ETH for his discussions on this topic and for introducing us to the family of models that we have explored. 1. C is an Archimedean copula if it can be reprsented by where ϕ is the generator of this copula and is a continous, strictly decreasing function from [0, 1] to [0, ∞] such that ϕ(1) = 0 and ϕ [−1] is the pseudo inverse of ϕ. 2. C is symmetric, C(u, v) = C(v, u) ∀(u, v) ∈ [0, 1] × [0, 1] 3. C is associative, C(C(u, v), w) = C(u, C(v, w)) ∀(u, v, w) ∈ [0, 1] 3 . 4. If c > 0 is any constant, then cϕ is a generator of C 5. According to Denuit et al. [6,Definition 4.7.6], the extension of the Archimedean copula family to ndimensions is achieved by considering the strictly monotone generator function ϕ such that ϕ : (0, 1] → R + with ϕ(1) = 0, then the resulting Archimedean copula can be expressed as, The members of the Archimedean copula family utilised in this manuscript are given below in Lemma B.2.
Lemma B.2. From the results in Nelsen [18, Section 4.3, Table 4.1] the distribution and density functions of the Clayton copula are given respectively as: where ρ C ∈ [0, ∞) is the dependence parameter. The Clayton copula does not have upper tail dependence. Its lower tail dependence is λ L = 2 −1/ρ C . The distribution function of the Gumbel copula is where ρ G ∈ [1, ∞) is the dependence parameter. The Gumbel copula does not have lower tail dependence. The upper tail dependence of the Gumbel copula is λ U = 2 − 2 1/ρ G . The distribution function of the Frank copula is satisfies the two conditions of a n-variate copula distribution given in [Definition 2.10.6] of Nelsen [18]. The first of these conditions requires that for every u = (u 1 , u 2 , . . . , u n ) ∈ [0, 1] n , one can show thatC (u) = 0 if at least one coordinate of u is 0. Clearly since we have shown thatC (u) = i=1 m w i C i (u) and given each member C i (u 1 , u 2 , . . . , u n ) ∈ A n is define to be in the family of Archimedean copulas each of which therefore satisfies this condition for all such points u, then it is trivial to see that the probability weighted sum of such points also satisfies this first condition. Secondly one must show that for every a and b in [0, 1] n , such that a ≤ b (i.e. a i < b i ∀i ∈ {1, 2, . . . , n}) the following condition on the volume for copulaC is satisfied, VC ([a, b]) ≥ 0. As in Nelsen [18] we adopt the notation for the n-box, [a, b], representing [a 1 , b 1 ] × [a 2 , b 2 ] × · · · × [a n , b n ] and we define the n-box volume for copula distributionC by [Definition 2.10.1, p.43] of Nelsen [18] giving where the domain DomC of the mixture copulaC satisfies [a, b] ⊆ DomC. In addition we note that this sum is understood to be taken over all vertices c of n-box [a, b] and sgn(c) = 1 if c k = a k for an even number of k's or sgn(c) = −1 if c k = a k for an odd number of k's. Equivalently, we consider △ b k a kC (t) =C (t 1 , t 2 , . . . , t k−1 , b k , t k+1 , . . . , t n ) − C (t 1 , t 2 , . . . , t k−1 , a k , t k+1 , . . . , t n ). In the case of the mixture copula, we can expand the volume of the n-box