Outer power transformations of hierarchical Archimedean copulas: Construction, sampling and estimation

A large number of commonly used parametric Archimedean copula (AC) families are restricted to a single parameter, connected to a concordance measure such as Kendall's tau. This often leads to poor statistical fits, particularly in the joint tails, and can sometimes even limit the ability to model concordance or tail dependence mathematically. This work suggests outer power (OP) transformations of Archimedean generators to overcome these limitations. The copulas generated by OP-transformed generators can, for example, allow one to capture both a given concordance measure and a tail dependence coefficient simultaneously. For exchangeable OP-transformed ACs, a formula for computing tail dependence coefficients is obtained, as well as two feasible OP AC estimators are proposed and their properties studied by simulation. For hierarchical extensions of OP-transformed ACs, a new construction principle, efficient sampling and parameter estimation are addressed. By simulation, convergence rate and standard errors of the proposed estimator are studied. Excellent tail fitting capabilities of OP-transformed hierarchical AC models are demonstrated in a risk management application. The results show that the OP transformation is able to improve the statistical fit of exchangeable ACs, particularly of those that cannot capture upper tail dependence or strong concordance, as well as the statistical fit of hierarchical ACs, especially in terms of tail dependence and higher dimensions. Given how comparably simple it is to include OP transformations into existing exchangeable and hierarchical AC models, this transformation provides an attractive trade-off between computational effort and statistical improvement.


Introduction
Archimedean copulas (ACs) are dependence models frequently used in finance, insurance and risk management, e.g., for stress testing. In contrast to elliptical copulas such as the prominent Gaussian and t copulas, ACs allow for asymmetry in the joint tails, which is of particular interest, e.g., in risk management [28,Chapter 5] or hydrology [7,25]. ACs are also appreciated for their simple and explicit construction, for efficient sampling techniques and for likelihood-based inference; see [18].
However, as follows from the construction, all multivariate margins of an AC are the same, which limits ACs' applicability particularly in high dimensions. Also, a vast majority of known ACs are one-parametric, which, on the one hand, enables the mentioned advantages, but on the other hand causes the following limitation. The single parameter completely determines all properties of the AC, as well as for many families it is related in a one-to-one relationship to the strength of the dependence, e.g., expressed by Kendall's tau or Spearman's rho, see Table 1 in [9]. It is thus natural that this parameter is frequently estimated in the method of moments fashion such that the model's dependency measure is close to its empirical counterpart. However, this often results in a model that fits well in its body, but not that well in its tails.
To alleviate these limitations, several approaches have been introduced in the literature. This work particularly focuses on the following two: (1) Given a one-parameter family of ACs, a way to construct its two-parameter extension called the outer 2 power AC (OPAC) family is proposed in Nelsen [31,Theorem 4.5.1]. With an additional parameter, one can fix, e.g., the model's Kendall's tau (τ ) to a desired value (to keep a good fit in the body), and then fine-tune both parameters to get a good fit in one of the tails. Such a property is crucial, e.g., in risk management applications [18,28]; (2) Joe [20, pp. 87] proposed a way how to construct hierarchical (or nested ) ACs (HACs) by nesting several ACs into each other. This allows certain multivariate margins to differ (which lead to asymmetric models) and extends the oneparameter model to allow up to (d − 1)-parameters. However, to this date, all contributions in the literature addressing HACs' estimation have been restricted to the case where all ACs nested in a HAC are one-parametric; see [10,13,32] to mention a few.
This work merges these two approaches, resulting in hierarchical outer power ACs (HOPACs), which are tail-asymmetric copulas that allow for different multivariate margins with extra flexibility added by the outer power (OP) transformation. To illustrate the flexibility, Figure 1 presents trivariate examples of HOPACs, ACs, OPACs and HACs based on Ali-Mikhail-Haq copulas that are combined with N(0, 1) margins and displayed via contour plots of bivariate marginal densities.
In the first column, the parameter of the trivariate Ali-Mikhail-Haq copula is chosen such that Kendall's tau of all bivariate margins is 0.3. Such a choice provides a good fit in the model's body if we observe Kendall's tau close to this value in data of interest. Following the nature of Ali-Mikhail-Haq copulas, the dependence is clearly tail-asymmetric, i.e., for small joint values (found in the bottom-left corners of each density plot) it is stronger than for large values (top-right corners). However, if we observe stronger dependence for large values in our data, such a model does not allow to adjust to such a situation and require one to discard the Ali-Mikhail-Haq family for modeling purposes. Involving the OP transformation, this adjustment is possible, as illustrated in the second column, where the extra parameter allows us to fine-tune the upper-tail dependence coefficient, e.g., to 0.3, while still keeping the same value of Kendall's tau. This allows one to build dependence models that fit well both in the body and tail, and, as we demonstrate in Section 4, just this transition from ACs to OPACs already yields substantial improvement in tail-dependence modeling for risk management applications.
In contrast to ACs and OPACs, which are exchangeable and thus all their multi- Figure 1.: Contour plots of the bivariate marginal densities of a random vector (X 1 , X 2 , X 3 ) with N(0, 1) margins and varying (across columns) Ali-Mikhail-Haq copulas. Note that τ and λ u denote the corresponding pairwise Kendall's tau and upper-tail dependence coefficient, respectively.
variate margins are the same, HACs and HOPACs provide asymmetry in the bivariate margins. It can be seen from the last two columns that the margin corresponding to (X 2 , X 3 ) differs (in τ ) from the remaining two. Moreover, as Ali-Mikhail-Haq ACs are limited in τ to [0, 1/3), the OP transformation allows to attain any τ ∈ [0, 1), which makes (H)OPACs based on this family more suitable for modeling purposes. An Ali-Mikhail-Haq OPAC with τ ≥ 1/3 is, e.g., the bivariate margin corresponding to (X 2 , X 3 ) in the fourth column. Similarly, as Ali-Mikhail-Haq ACs are tail independent, i.e., λ u = 0 for all parameter values, their OP transformation allows to attain λ u with any value from [0, 1), which is illustrated by the same example. This extended flexibility in tail dependence modeling enabled the Ali-Mikhail-Haq HOPACs to gain the best results in several scenarios out of all copula constructions/families considered in a risk management application provided in Section 4. Also note that in [19], a sub-class of HOPAC models based on Clayton copulas shows the best performance among all the copula models considered. The limitation of the mentioned sub-class lied in the fact that those HOPACs were restricted to one certain copula structure with two nesting levels. Moreover, they were constructed in a way that one of the parameters of each two-parameter OPAC nested in a HOPAC was arbitrarily fixed to a certain value that was the same for all of them, i.e., these nested two-parameter OPACs were in fact turned to one-parameter ACs. We relax both such limitations.
Finally, note that the OP transformation establishes a connection between several one-parameter AC families from the list in Nelsen [31, pp. 116-119]. For example, the three families denoted 4.2.1 (Clayton), 4.2.12 and 4.2.14 are special cases of the OP Clayton family. To have these families in a single HAC, it was necessary to develop sampling and estimation strategies allowing to nest different families, which was done in [15] and [13], respectively. With regard to OP transformations, sampling and estimation of HACs involving these different one-parameter families can be addressed by restricting to hierarchical models involving just one but OP-transformed AC family, i.e., Clayton HOPACs.
The paper is organized as follows. Section 2 recalls basic concepts concerning ACs, the OP transformation and their relationship to three measures of association. For the tail dependence coefficients a new result simplifying their computation is proposed. This section also recalls an efficient sampling strategy for OPACs and presents a simulation study on the viability of two OPAC estimators. Efficient sampling and estimation strategies for HOPACs are then developed in Section 3, and their excellent abilities in tail dependence modeling are demonstrated in an application from risk management reported in Section 4. Section 5 provides concluding remarks.

Outer power transformation
Theorem 4.5.1 in [31] implies that for any β ∈ [1, ∞) and any generator ψ of a 2-AC generates again a proper 2-AC. Parametric families generated in this way are referred to as outer power families, where the unintuitive use of "outer" relates to the fact that they were named with reference to generator inverses. Given a one-parameter generator ψ (a,θ) , its OP transformed version with parameter β ∈ [1, ∞) is denoted by , and the lower-and upper-tail dependence coefficients λ l := lim t↓0 C ψ (t, t)/t and λ u : ψ (a,θ,β) . A well-known example of an OPAC family is the generalized Clayton copula (also denoted BB1, see [21]), which, as mentioned before, encompasses the three oneparameter families from Nelsen [ Note that the Gumbel family has generator ψ (G,θ) = e −t 1/θ , so OP Gumbel copulas are simply Gumbel copulas with parameter θβ instead of θ since where θ ∈ [1, ∞), is again a one-parameter Gumbel with the parameter θβ. For this reason, this family is not further considered.

Outer power transformed dependence measures
As can be observed from the examples in Figure 1, the OP transformation can have an impact on measures of association such as Kendall's tau (e.g., its values reached beyond 1 3 reached for the Ali-Mikhail-Haq family) or the tail dependence coefficients (the upper-tail independent Ali-Mikhail-Haq family reached λ u > 0). We now consider these three measures of association in more detail for OPACs.
Given a one-parameter 2-AC C ψ(a,θ) , there exists a functional relationship between the parameter θ and Kendall's tau that can sometimes be expressed in a closed form, e.g., τ (C) (θ) = θ/(θ + 2) for the Clayton family. This relationship can easily be extended to OPACs. As follows from Proposition 3.7 in [15], given a 2-OPAC C ψ(a,θ,β) , its corresponding Kendall's tau τ (a) (θ, β) is We thus see how Kendall's tau of Ali-Mikhail-Haq copulas can cover the whole [0, 1), while τ (A) (θ) only covers 0, 1 3 . A similar result can be derived for the coefficients of tail dependence under additional assumptions on ψ (a,θ) (or ψ (a,θ) ) such as regular variation. Note that λ l (C) (λ u (C)) denotes the lower (upper) tail dependence coefficient of a 2-AC C.
(2) Using (2.12) from [14], If ψ is regularly varying at zero with index α 0 , using c = 2 1/β implies that 2 − 2 1/β lim s↓0 ψ (2 1/β s) ψ (s) Having the index α (α 0 ) for a one-parameter family, the proposition provides λ l (λ u ) for its OP family. It can easily be verified that α = −θ −1 for a = C, whereas ψ (a,θ) is not regularly varying at ∞ for a ∈ {A, F, J}, and that α 0 = 1 for a ∈ {A, C, F}, whereas α 0 = θ −1 −1 for a = J; see also the last two columns of Table 1. These two columns also reveal that β influences λ u for all listed families, whereas λ l only for the Clayton family. From this point of view, β is playing an important role particularly for upper-tail dependence modeling. Given a bivariate OPAC C ψ(a,θ,β) and also considering its Kendall's tau via (3), an interesting question is if, given (τ, λ u ) ∈ [0, 1] 2 , there exist values of θ and β such that τ (a) (θ, β) = τ and λ u (C ψ(a,θ,β) ) = λ u . This question is answered in Figure 2, which highlights the pairs of τ and λ u for which such θ and β exist (so which τ and λ u are attainable). We observe a larger region of attainable pairs for the upper-tail independent AC families of Ali-Mikhail-Haq, Clayton and Frank; for the Joe family, λ u > 0 even with β = 1. Hence, fixing τ to a desired value in order to obtain a good fit in an OPAC's body, the most flexibility in the upper tail is provided by, somewhat unexpected, the upper-tail independent AC family.
Finally note that all three considered measures of association are monotone with respect to β, which, for λ l and λ u , follows from Proposition 3.7 in [15]. This is also clearly visible in Figure 3, where samples of size n = 500 from a bivariate OPAC with different values of θ and β are shown for each working family.

Sampling from OPACs
The samples in Figure 3 were obtained using the well-known sampling algorithm of [27] together with Theorem 3.6 from [15]; for the sake of completeness, we recall the latter below as it is needed later in Section 3.2. Note that LS −1 [f ] denotes the inverse LaplaceStieltjes transform of a function f .
As sampling from S is a standard routine, sampling from an OPAC only requires a sampling strategy for LS −1 [ψ], which is known for many one-parameter families; see Hofert [14,

Estimating OPACs
For the one-parameter case, two AC estimators are particularly popular, 1) the maximum likelihood (ML) estimator and 2) the Kendall's tau inverse estimator; see [9]. The latter can be viewed as a generalized method of moments estimator and is statistically not as efficient as ML. In contrast, the ML estimator naturally extends to any parameter dimension and is also feasible for estimating OPACs. To complement this estimator, we also consider a distance-based estimator (based on a goodness-of-fit test statistic) in what follows.
Given a copula family a, the ML estimator for the parameters θ and β of a d-OPAC C ψ(a,θ,β) is defined by and a distance-based estimator, denoted S n , by where c ψ(a,θ,β) is the density of C ψ(a,θ,β) and C n (u) = 1 (1) Both estimators converge to the true values (θ 0 and β 0 ) with increasing n; (2) The ML estimator is unbiased and more efficient than S n , which is expected from classical statistical estimation theory; (3) The standard errors ofθ andβ increase with τ (a) (θ 0 , β 0 ), whereas for τ (a) (θ,β) they decrease; and (4) Conclusions 1.-3. are independent of the family a considered.
To summarize, both estimators are viable for OPAC estimation. We thus use these estimators also for HOPAC estimation considered in Sections 3.3 and 3.4.  3. The nested case

Hierarchical Archimedean copulas
To construct a hierarchical Archimedean copula (HAC ), one replaces some arguments of an AC by other (H)ACs, see Joe [20, pp. 87]. To obtain a proper copula, one also needs to verify certain nesting conditions. For example, given two 2-ACs C ψ1 and C ψ2 , a 3-variate HAC, denoted C ψ1,ψ2 , can be constructed by A tree representation of such a construction is given in Figure 6a. Using the language of graph theory, an undirected tree (V, E) related to this representation can be derived by enumerating all of its nodes; here V is a set of nodes {1, ..., m}, m ∈ N, and E ⊂ V × V. For example, in Figure 6b, The leaves {1, 2, 3} correspond to the HAC variables u 1 , u 2 and u 3 , whereas the nonleaf nodes {4, 5}, called forks, correspond to the ACs (uniquely determined by the corresponding generators) nested in C ψ1,ψ2 . When deriving a particular (undirected) tree for the tree representation in Figure 6a, we assume that (1) the leaves 1, 2 and 3 in Figure 6b correspond to u 1 , u 2 and u 3 in Figure 6a, respectively, and that is given by Figure 6b.
(2) the fork indices 4 and 5 are set arbitrarily (we could have also derives an undirected tree where the fork indices 4 and 5 are switched). In order to enumerate the forks uniquely, we set them according to the corresponding Kendall's tau, meaning that the root, which has always the lowest Kendall's tau, is numbered by m, the node with the second lowest Kendall's tau by m − 1, etc.
As each fork corresponds to a generator, we represent this relationship using a labeling denoted Ψ, which maps the forks to the corresponding generators. In our example, Using this notation, (6) can be rewritten as Observe that the indices of the arguments of the inner copula C Ψ [4] correspond to the set of children of fork 4, i.e., to {2, 3}, and the indices of the arguments of the copula C Ψ [5] correspond to the set of children of fork 5, i.e., to {1, 4} where the number 4 represents the copula C Ψ [4] (u 2 , u 3 ). This implies that one can express C ψ1,ψ2 (u 1 , u 2 , u 3 ) in terms of the triplet (V, E, Ψ). Following this observation, we denote a HAC in an arbitrary dimension by C (V,E,Ψ) ; see also Definition 3.1 in [13]. Finally, let Ψ[i] = ψ (ai,θi,βi) , i ∈ {4, 5} be two OPAC generators. The graphical representation depicted in Figure 6c fully determines the parametric HAC C (V,E,Ψ) = C ψ1,ψ2 given by (6) and (7), i.e., its structure, the families of its generators and its parameters.
To guarantee that a proper copula results from nesting ACs, we will use the sufficient nesting condition (SNC) proposed by Joe [20, pp. 87] and [29]. It states that if for all parent-child pairs of forks (i, j) appearing in a nested construction is a copula. This SNC has three important practical advantages, which are that 1) its expression in terms of the corresponding parameters is known, 2) this expression does not depend on the copula dimension d for all pairs for which it is known, and, most importantly, 3) efficient sampling strategies based on a stochastic representation for HACs satisfying the SNC are known; see [16]. Note that there also exists a weaker sufficient condition, see [34], which however lacks those three advantages and is thus of limited practical use.

Constructing and sampling HOPACs
Starting with a simple trivariate nested copula structure, e.g., the one depicted in Figure 6, we can sample from it according to the algorithm proposed by [29]. Note that one can easily apply the same strategy also to the general d-variate HAC (d-HAC) case with d ≥ 3. The sampling algorithm, extending the one of [27] for ACs, requires one to know how to sample the two following random variables: , where the tildes emphasize that the generators are OP transformed, say ψ 1 = ψ (a1,θ1,β1) and ψ 2 = ψ (a2,θ2,β2) , where a 1 and a 2 are labels of c.m. families of one-parameter generators, θ 1 ∈ Θ a1 , θ 2 ∈ Θ a2 and β 1 , β 2 ≥ 1.
A strategy for sampling from V 1 has been recalled in Section 2.4. It is important to note that V 1 is non-negative and, in fact, strictly positive with probability 1, as a result of Bernstein's Theorem [2] and the fact that ψ 1 is a c.m. generator.
To sample from V 12 , consider that for all t ∈ [0, ∞), Note that V 1 acts as a parameter for ψ 12 . The following proposition provides an explicit way to sample from V 12 .
Proposition 2 implies that: (1) Any OP family (β 2 ≥ 1) can be nested into a non-OP family ( , which simplifies to θ 1 ≤ θ 2 for many families when a 1 = a 2 ; see the second column of Table 2.3 in [14]. To the best of our knowledge, this nesting case has never been considered in the literature. , under β 1 = 1, can be fully translated to know-how of sampling its non-OP transformed version, i.e., to LS −1 [(ψ 12 ) V1 ], which is known for many families; see the third column of Table 2.3 in [14]. Also note that free implementations of sampling algorithms for LS −1 [(ψ 12 ) V1 ] are available, e.g., in the R package copula [17] or in the HA-Copula toolbox for MATLAB and Octave [11].
For β 1 > 1, an OP family cannot be nested into a proper OP (β 2 > 1) family in the way mentioned above except for a special case already considered in [15]. Namely if a 1 = a 2 and θ 1 = θ 2 , then which is a proper Gumbel generator provided that β 1 ≤ β 2 , with inverse Laplace- Hence, to construct a proper HOPAC of the form C ψ1 {u 1 , C ψ2 (u 2 , u 3 )} under the SNC, it must hold that either As mentioned above, these constraints can easily be translated to the general nesting case just by checking these constraints to hold for every parent-child pair of nodes in the nested copula structure.
Let us now consider the case when a 1 = a 2 and the condition on [ψ −1 (a1,θ1) {ψ (a2,θ2) }] to be c.m. is simplified to θ 1 ≤ θ 2 . Construction of HOPACs under the SNC is then similar to the one-parameter-generator HACs, where the parameters θ i involved have to be increased as one goes further down in a branch of the copula structure. In the HOPAC structure, the parameters also have to increase but one can choose which one of them. Let us illustrate this with the model depicted on the left-hand side of Figure 7. One can observe that θ or β increases as one goes down in a branch of the copula structure. More precisely, if β parent = 1, then θ child ≥ θ parent and of course β child ≥ β parent , see, e.g., the parent-child pair of forks (10,8) or (11,9). This new flexible case is enabled by Proposition 2. Or, if β parent > 1 then θ child is fixed to θ parent 's value and only β increases as one goes down in a branch, as the pair of forks (9,7) indicates. This case has already been used in [19]. For completeness, let us mention that the pair of forks (11,10) represents nesting of two one-parameter ACs. In the following section, we develop HOPAC estimators under the SNC for HOPACs mentioned above in which the condition on [ψ −1 (a1,θ1) {ψ (a2,θ2) }] to be c.m. can be simplified to θ 1 ≤ θ 2 provided a 1 = a 2 .

Estimating HOPACs
The literature provides a set of diverse HAC estimation methods; see [32] or [10,13] for those concerning estimation of both structure and parameters. However, as already mentioned in Section 1, all of them restrict to HACs involving just one-parameter ACs. On the one hand, these methods cannot be directly used for HOPAC estimation, on the other hand, they serve as a natural starting point for development of such estimators.
In general, three ingredients are necessary to get a HAC estimated, 1) its structure, 2) the AC families and 3) their parameters. 1) typically encompasses a kind of agglomerative clustering, where the structure finally results from clustering of variables under concern according to some measure of similarity between pairs of the variables, e.g., according to Kendall's tau. As the structure estimator in [10,13] does not, by contrast to [32], require any assumptions on the family of the nested ACs, it is thus immediately feasible also for HOPAC estimation. Moreover, there exist theoretical justifications for such an estimator -given a HAC, [33] show that its structure can be uniquely recovered from all its bivariate margins, and Theorem 2 in [12] show that it is possible just from all its pairwise Kendall's coefficients. Finally, as this estimator, formalized by Algorithm 1 in [13] (see also our Appendix), showed the best results in the ratio of successfully estimated true HAC structures on the basis of simulation studies, see [10] or [37], we adopt it to our HOPAC estimation approach.
To estimate the families, the mentioned references provide two different approaches. The one used in [32] and [10] arbitrarily assumes the same family for all nested ACs, whereas then one in [13] involves an extra model selection step, which chooses for each AC the best fitting family out of some predetermined pool of families, and thus allows the families in the estimated HAC to differ. Whereas the first approach substantially simplifies the parametric constraints following from the SNC (to the condition θ parent ≤ θ child for most of the cases), the latter requires an extensive analysis of the input family pool as not all families can be mixed together or since the parameter ranges of the families in the pool have to be adjusted before the estimation process. Hence, when the OP transformation comes to play, which makes the estimation process substantially more complex even under the assumption of the same family for all nested ACs, the approach allowing for different nested families becomes pretty challenging.
In this work, we focus on the case of having the same family for all ACs nested in a HOPAC. Note that this case still covers estimation of HACs with one-parameter ACs from different families, for example, the families of Clayton, 12 and 14, as stated in Section 1, where the latter two family numbers correspond to the notation used in Nelsen [31, pp. 116-119].
For estimation of parameters, we use a mixture of existing step-wise procedures. This follows from the existence of a close relationship between bivariate margins of a HAC and the location of ACs nested in this HAC. To clarify, according to Proposition 3 in [10], given two random variables U i and U j from the vector (U 1 , ..., U d ) distributed according to a d-variate HAC, the copula of (U i , U j ) is the AC allocated in the HAC structure at the node where the two branches -1) the one from the leaf u i to the root, and 2) the one from the leaf u j to the root -meet for the first time. This AC (node) is called youngest common ancestor of u i and u j . In Figure 8a, the youngest common ancestor of u 1 and u 2 is node 5 (AC C Ψ [5] ), whereas for u 1 and u 3 it is node 7 (AC C Ψ [7] ). It follows that the parameters of this copula can be estimated from the observations of U i and U j only. To this end, we use the AC maximum-likelihood (ML) estimator as in [32]. As there are often more of such pairs with the same youngest common ancestor, we use this fact in the way that we estimate the parameter(s) of the youngest common ancestor from the observations of all pairs of random variables corresponding to this ancestor, and then aggregate these estimates by using the mean; the mean performed best in the simulation study involving the mean, maximum and minimum aggregation statistics implemented in Step 2 of Algorithm 3 in [10].
The concept of our approach to the HOPAC estimation is summarized in Algorithm 1, which requires two inputs: 1) a one-parameter Archimedean family a, e.g., from Table 1, and 2) realizations u = (u ij ) ∈ [0, 1] n×d of pseudo-observations (U ij ) j∈{1,...,d} i∈{1,...,n} given by whereF n,j denotes the empirical distribution function corresponding to the jth margin, R ij denotes the rank of X ij among X 1j , ..., X nj , and (X i1 , ..., X id ), i ∈ {1, ..., n} are i.i.d. random vectors distributed according to a joint distribution function with continuous margins F j , j ∈ {1, ..., d}, and a HOPAC C. The algorithm returns a HOPAC from family a with estimated structure and parameters. Now, let us discuss several properties of the concept using the following example. Let C be the Clayton HOPAC from Figure 8a, with d = 4, and let (ū ij ) j∈{1,...,4} i∈{1,..,n} be a sample from it shown in Figure 8b. Now assume C to be unknown and let us estimate it based on the pseudo-observations (u ij ) j∈{1,...,4} i∈{1,..,n} of (ū ij ) j∈{1,...,4} i∈{1,..,n} . The algorithm estimates the structure in its first two steps using Algorithm 3, which returns a binary tree (V,Ê) and estimates of Kendall's tau (τ 5 ,τ 6 ,τ 7 ) corresponding Algorithm 1 The HOPAC estimation concept 1: compute the matrix of pairwise Kendall's tau (τ ij ) from u 2: estimate the structure using Algorithm 3 with (τ ij ) 3: for each fork k in the structure do 4: for each pair of leaves (i k , j k ) in the structure such that k is the youngest common ancestor of i k and j k do 5: use ML to estimate OPAC's θ and β for (u m,ik , u m,jk ), m = 1, ..., n 6: end for 7: aggregate these estimated θs and βs by computing their meansθ k andβ k 8: let C ψ (a,θ k ,β k ) be the OPAC estimate corresponding to the fork k in the estimated HOPAC structure 9: end for  to each fork in that tree, all shown in Figure 9a. As described above, the families of nested ACs are assumed to be from some pool of available families, e.g., a pool implemented by the software toolbox we use. Such an assumption implies that the user, when deciding which family suits best for the considered data, should repeat the whole estimation process for each of the available families and then perform some extra evaluation of their fit. For simplicity, assume the Clayton family for all the nested ACs. Recall that the HOPAC estimates will be built under the SNC in its simplified parametric form. This means that to satisfy the SNC, it must hold that for each parent-child pair of forks in the structure Let us consider two possible estimators under R 1 and R 2 following the concept in Algorithm 1.

Bottom-Up estimator
The idea of the Bottom-Up estimator lies in traversing through the forks in the estimated structure in the way that one starts at the bottom of the structure and then continues up until the root is reached, similar to [32] and [10,13]. A way to guide the traversing process consistently is, e.g., according to Kendall's tau estimates returned by Algorithm 3 -starting with the fork with the highest value, then process the fork with the second highest value, until one gets to the fork with the lowest value, the root.
In our example, fork 5 corresponds to the bottom of the structure. As it is the youngest common ancestor of leaves u 1 and u 2 , we compute the ML estimator for θ 5 and β 5 according to (4): log c ψ(C,θ,β) (u i1 , u i2 ).
As node 6 is the youngest common ancestor of u 3 and u 4 , the corresponding parameters θ 6 and β 6 can be estimated according to (13) with u i1 and u i2 being replaced by u i3 and u i4 , respectively. The estimated parameter values are shown in Figure 9b. Having the bottom level estimated, we continue to the upper levels until we reach to the root.
Here the SNC comes into play. Asθ 5 =θ 6 , it clear that R 2 is violated because it is impossible to satisfy botĥ θ 7 =θ 5 andθ 7 =θ 6 for our example. The restriction R 2 was, however, a constraint under which the model was built; see Figure 8a. It follows that turning to the restriction R 1 prevents a good fit for node 7, as R 1 requires thatβ 7 = 1 and thatθ 7 has to be trimmed to the closest value allowed by R 1 , i.e., to 0.844; see Figure 9c. A sample of 1000 observations from this HOPAC estimate is shown in Figure 9d. It is evident that choosing R 1 instead of R 2 substantially affects the fit -compare the distributions of the pairs (U 1 , U 3 ), (U 2 , U 3 ), (U 2 , U 4 ) and (U 1 , U 4 ) shown in Figure 8b (which correspond to the true HOPAC) with the corresponding ones of the estimate in Figure 9d. A way to cope with this problem could be to test if the parameters θ of the children are all equal to some aggregated value like their mean. With m children of a given fork, this would however require additional m 2 tests (and there is the problem of multiple testing), making the computation substantially more involved. An efficient solution requiring at most one test for each fork is presented in the following section.

Top-Down estimator
A solution to those problems consists of reversing the way in which the structure is traversed during the estimation, meaning starting at the root of the copula structure and then using the depth-first approach to go through all forks. This way of estimation already appeared in connection to other hierarchical copula structures; see [38] or recently [4]. Before considering our example with the Top-Down estimator, we first present it in term of pseudo-code in Algorithm 2.
Estimation of the parameters is then performed by calling the function TopDown{u, a, (V,Ê), 2d − 1, Θ a , [1, ∞), β R } presented in Algorithm 2, where a is an Archimedean family, 2d − 1 denotes the root in the binary tree (V,Ê), Θ a (see Table 1) and [1, ∞) are ranges for the parameters θ and β of the OPAC estimate CΨ [2d −1] corresponding to the root, and β R ∈ [1, ∞) is an upper bound for accepting β parent = 1 in R 1 , commented on more below. Recall that descendants of a given node refer to the set of the children of that node, the children of these children, etc.
Several aspects of Algorithm 2 are worth being addressed: • The function TopDown recursively goes through all the forks of the tree (V,Ê) in the depth-first search manner, which can be seen from steps 2, 4, 13 and 14. • The assumption for i < j in step 4 is without loss of generality as in all remaining Algorithm 2 The Top-Down HOPAC estimator Inputs: u = (u ij ) ∈ [0, 1] n×d -realizations of (U ij ) j∈{1,...,d} i∈{1,...,n} given by (12) a -a one-parameter Archimedean family, e.g., from Table 1 (V,Ê) -a binary tree (copula structure) k ∈ {d + 1, ..., 2d − 1} -a fork to estimate its parameters r θ ⊂ R -a range for parameter θ r β ⊆ [1, ∞) -a range for parameter β β R ∈ [1, ∞) -an upper bound for accepting β parent = 1 in • According to Remark 2 in [12], all pairs from the Cartesian product of l i and l j defined in steps 5 and 6 have the same youngest common ancestor k. It follows from Proposition 3 in [10] that bivariate margins corresponding to these pairs share the same copula, which motivates the mean aggregated ML estimator used in step 7. Note that such an aggregation approach in a one-parameter version has already been successfully used in Algorithm 3 in [10], see step 2 therein. Also note that the viability of such an aggregation approach is studied in Section 3.4. • As the result of the argmax in step 7 is the vector of two components (θĩj, βĩj), all the operators to its left (the two sums and the division) are considered component-wise. • The sets l i and l j are always disjoint, which follows from the fact that that node i and node j do not lie at the same branch of (V,Ê). This fact avoids that the same pair appears twice (first as (i, j) and then as (j, i)) in the first two sums in step 7.
• The if statement in step 8 decides which one of the restrictions R 1 and R 2 applies for the children i and j of node k at the recursive steps 13 and 14. As the parametersθ andβ are estimated by ML, we have asymptotic normality and the variances of these estimates. As θ child appearing in R 2 is not yet available (it is estimated in further steps that depend on the decision in step 8), it is thus convenient to test for R 1 as it requires only the value of β parent . In practice, however, testing the hypothesis β parent = 1 would slow down the computations, therefore we decided to only check whetherβ lies in the prescribed interval [1, β R ]. The involved parameter β R also allows us to emphasize one of the restrictions -we emphasize R 1 with larger values of β R , whereas R 2 with smaller ones. In the following, we use β R = 1.05 as it turned out to provide a convenient balance between R 1 and R 2 .  Figure 8a. Let a to be again the Clayton family, and recall that Θ C = (0, ∞), see Table 1. Finally, to obtain the parameter estimates, call TopDown(u, C, (V,Ê), 7, Θ C , [1, ∞), 1.05).
As the recursive steps 13 and 14 involve the estimation of bivariate OPACs for nodes 5 and 6 (l i ← {i} and l j ← {j} in both of the nested calls of TopDown), we just show the results of step 15, which areΨ [5] ← ψ (C,0.994,5.145) andΨ [6] ← ψ (C,0.994,4.858) . Finally, step 15 results inΨ [7] ← ψ (C,0.994,3.083) . The resulting estimated HOPAC C (V,Ê,Ψ) is depicted in Figure 10a. We observe that the parameters are relatively close to the true parameters (shown in Figure 8a), particularly, compared to the Bottom-Up analogue, for β of node 7. This is further reflected via distributions of samples from these HOPACs -compare the distributions and particularly the strength of the correlation in the pairs (U 1 , U 3 ), (U 2 , U 3 ), (U 2 , U 4 ) and (U 1 , U 4 ) shown in Figures 8b, 9c and 10b corresponding to the true copula, the Bottom-Up and Top-Down estimate, respectively. In contrast to the Bottom-Up estimate, the Top-Down one closely follows the true distribution.
It is clear that the arbitrary assumption of the Clayton family needs an extra attention. As suggested above, an extra criterion should be used to evaluate feasibility of such an assumption. For example, the goodness-of-fit test statistic used in the estimator defined in (5) can be used to this end; see [8]. Evaluating this criterion for the sample from Figure 8b and the Top-Down HOPAC estimate shown in Figure 10a, we obtain S n = 0.0148. For a = A we get S n = 0.0148, for a = F we observe S n = 0.0694 and for a = J we receive S n = 0.1516. It is not surprising that S n for the true family is minimal. What might be surprising is that the minimum is also obtained for the Ali-Mikhail-Haq family. However, looking at page 117 in [31], Table 4 Figure 10c, and considering that the parameters θ for both families are relatively close to 1, this result rather confirms that the presented framework works correctly.

Simulation study
To evaluate the HOPAC estimator presented in the previous section, N = 100 repetitions of the following routine for each of the families Ali-Mikhail-Haq, Clayton, Frank and Joe are performed. This routine first randomly generates a HOPAC model, then Figure 11.: Three randomly generated HOPAC models. samples from it, computes several estimates based on the sample, and finally measures certain types of discrepancy between the model and the estimate, and eventually between the sample and the estimate. More precisely, the setup is as follows: (1) Given a dimension d ∈ {5, 10, 20}, a correlation matrix of size d × d is randomly generated according to the sampling algorithm proposed in [26]. (2) This matrix is then passed to Algorithm 3, which returns a binary tree with d leaves, which serves as the structure of the randomly generated HOPAC model. The algorithm also returns the estimatesτ d+1 , ...,τ 2d−1 corresponding to the forks in that tree, which are used, in the next step, to generate the parameters of the HOPAC model. (3) The forks in the structure are traversed depth-first starting from the root (k = 2d − 1) and for each given fork k ∈ {d + 1, ..., 2d − 1}, the parameter β is set equal to 1, i.e., to the case corresponding to R 1 , with probability of 50%. Hence, if β is 1, the parameter θ is just adjusted in a way that Kendall's tau of this fork remains equal toτ k . For the case corresponding to R 2 (the remaining 50%), the parameter θ is first generated randomly and then β is adjusted to keep Kendall's tau equal toτ k ; see Figure 11 for examples. (4) Assume the same family a ∈ {A, C, F, J} for each OPAC nested in the HOPAC model and sample from it with sizes n ∈ {200, 400, ..., 1000}.
(5) Based on these samples, compute realizations of the following estimators: (a) The OPAC estimator (θ ML ,β ML ) defined by log c ψ(a,θ ij ,β ij ) (u mi , u mj ). (14) We include this estimator to the study in order to show to which level OPACs are (un)able to fit HOPACs, in other words, how important is the presence of hierarchy/structure in the copula model. Also note that accessing the density c ψ can be challenging due to need of differentiating the cumulative distribution function d-times. The estimator given by (14) is thus used instead of the standard (non-agregated) generalization of (4). It is, however, worth noting here that this simple (OPAC) estimator shows an excellent improvement/complexity trade-off in the tail-dependence modeling application reported in Section 4, which hints at feasibility of such an aggregation approach in general. (b) The HAC estimator (denoted HAC) based on one-parameter generators given by Algorithm 2, where the OP transformation is avoided simply by setting the input argument r β equal to [1,1]. Note that this estimator is included in our study in order to stress out the importance of having the OP transformation in the copula model. (c) The HOPAC Top-Down S n -based estimator (TD-Sn) given in Algorithm 2, where the OPAC ML estimator in step 7 is replaced by the distance-based estimator S n given by (5).  Figures 12 and 13, whereĈ and C n denote the estimated and empirical copulas, respectively,τ ij andλ u ij denote the Kendall's tau and upper-tail dependence coefficient corresponding to the youngest common ancestor of leaves i and j in the estimated structure, respectively, τ n ij denotes the sample version of Kendall's tau corresponding to variables U i and U j , and λ u,n,5% ij denotes the non-parametric estimate of the upper-tail dependence coefficient for variables U i and U j at the level k/n = 0.05 according to (13) in [35], where k ∈ {1, ..., n}. (b) True versus estimate. This group includes the three measures given at the top of Figures 14 and 15,where the pair (θ i , β i ) and τ i and λ u i correspond to the fork i in the copula model whereas the pair (θ i ,β i ) andτ i and λ u i correspond to the fork i in the copula estimate. Note that the lower-tail dependence coefficient is not considered as it equals 0 for all families considered except Clayton, see Table 1. Also note that these measures require that the structure of the model and the estimate match and are thus evaluated only for TD-Sn and TD-ML. To generate N = 100 estimates matching the true structure, a new sample according to the model is generated in each out of N = 100 repetition until the structure returned by Algorithm 3 equals the true structure. The ratio of true structures returned out of N = 100 tries is depicted in Figure 16a on page 29.
It can be observed that: • As n increases, all measures decrease (converge to 0) for both HOPAC estimators TD-Sn and TD-ML. • TD-ML shows lowest means and standard errors in most of the cases.
• The estimators either without hierarchy (OPAC) or with no OP transformation available (HAC) are unable to model HOPAC data, as is clear from Figures 12 and 13. • All the previous observations are independent of the underlying family.
The quality of the structure estimator (Algorithm 3) is also evaluated, see Figure 16. Note that each bar in Figure 16a shows the value N/m × 100, where m is the number of sampling repetitions until N = 100 true structures have been recovered. As such an equal-or-not criterion is too strict as it does not take into account how much the estimated structure differs from the true structure. An extra proportional criterion based on a trivariate decomposition of the full structure according to [36] is also evaluated, see Figure 16b. There, each bar shows the value r/m × 100, where r = m i=1 r j with r j being the ratio of the trivariate structures from the decomposition of the estimated structure matching the trivariate structures from the decomposition of the true structure to d 3 . Note that such a criterion has already been used, e.g., in [37]. As can be observed, the ratio of estimated true structures is: • Independent of the family underlying the sample.
• Increasing with n.
• Converging to 100 (in n); this convergence is slower as d increases. The impact of increasing d is substantially lower for the proportional equal-or-not criterion.
Finally note that other classes of copulas, e.g., elliptical or vine [5,22], could be included in this simulation study. These were however not included as: (1) It follows directly from their theoretical construction that radially symmetric elliptical copulas cannot fit asymmetric HOPACs; (2) For vine copulas, realizations of the sample versus estimate measures require either the cumulative distribution function (Ĉ), which is computationally demanding already for d = 5, or to access bivariate margins (to getτ ij orλ u ij ), which is, in general, not possible.
These two important classes of copulas are included as benchmarks in the application reported in Section 4, where they are compared to HOPACs in their ability of tail dependence modeling.

Empirical Study
Value-at-Risk (VaR) has been established as an important risk measure in Quantitative Risk Management. In our study, we consider two different datasets of daily stock prices downloaded from Alpha Vantage 3 . The first one contains the five time series of stock prices of ADI (Analog Devices, Inc.), AVB (Avalonbay Communities Inc.), EQR Ali-Mikhail-Haq    It is important here that the clustering of the companies is given by their industry  sector: ADI and TXN belong to the high-tech industry, AVB and EQR to the real estate industry and LLY to the pharmaceutical industry. Therefore, we would expect that the structure of the HOPAC or HAC used in the study will resemble these groupings and thus the copula structure will play an important role. The other dataset contains the first 10 time series of daily stock prices from the S&P500 according to their market capitalization. For both datasets we use the time span 2002-02-01-2019-02-01. The negative profit-and-loss function of the portfolio is defined as L t+1 = d j=1 b j P j,t (1 − e Rj,t+1 ), where P j,t and R j,t are the price and log-return respectively of the asset j at time point t, d is the dimension of the portfolio and is equal to 5 for the first dataset and to 10 for the second one. Weights of the assets in the portfolio are denoted by b j , for j = 1, . . . , d with d j=1 b j = 1. As the study aims at proving the general power of the HOPAC model and not the comparison between different portfolio allocation schemes we consider only the equally weighted portfolio b j = 1 d , j = 1, . . . , d, advocated by [6]. Let F L denote the distribution function of L t+1 . This leads to the VaR of the portfolio at level α, VaR(α) = F −1 L (α). We focus on α = {0.95, 0.99}. The distribution function F L is estimated by simulating the path of the asset return from the underlying multivariate process estimated in the rolling window fashion on the windows of widths w = {126, 252, 504}. This corresponds to half a year, one year and two years of data. We fit all copulas (except for the historical simulation method [28,Chapter 2]) to pseudo-observations constructed from the i.i.d. standardized residuals and the underlying temporal dependency is modeled by marginal the GARCH(1, 1) method with t-distributed innovations: R j,t = µ j,t + σ j,t Z j,t with σ 2 j,t = ω j + α j σ 2 j,t−1 + β j (R j,t−1 − µ j,t−1 ) 2 , and ω > 0, α j ≥ 0, β j ≥ 0, α j +β j < 1. Afterwards, various copula models are estimated on the standardized residuals Z j,t . Thus, the estimated VaR t,w (α) at a given time point t window width w and level α is computed as follows: a) we estimate the GARCH(1, 1) for all univariate time series of log-returns on the time interval (t − w − 1, t − 1]; b) extract standardised residuals and estimate the copula; c) simulate a sample of where t s is the first day (always set to 505; maximal window width plus 1) and t e is the last day (4279; last day in the data) of the back-testing period. The closerα is to the theoretical level α, the better the model. We thus compare the absolute deviations |α − α| for the different models. We are aware of the various tests in the spirit of [24] but as they did not give any new insights visible from pure deviations we decided not to present them in the paper.
All in all, our study considers 20 models: a) AC, OPAC, HAC and HOPAC for Ali-Mikhail-Haq, Clayton, Frank and Joe copulas; b) Gaussian and t-copulas; c) R-Vine copulas, see [3] and d) the quantile-based historical estimator (denoted "Historical") which is computed directly on the true profit-and-loss function without any underlying time-series model.
The results are summed up in Figure 17, where the first two columns of panels correspond to the 5d portfolio and the second two columns to the 10d portfolio. The first column of panels for each portfolio depicts the results for α = 0.95 and the second for α = 0.99. The rows of panels show the results for the different widths w = {126, 252, 504} of windows. Each panel represents deviations |α − α| for all the models on the log-scale. Solid lines represent all the (H)(OP)AC models based on Ali-Mikhail-Haq (red), Clayton (green), Frank (blue) and Joe (black) families and the particular model (AC, HAC, OPAC or HOPAC) is shown on the horizontal axes. All the remaining models are represented with non-solid lines (these models are neither hierarchical nor OP transformed).
We clearly see from Figure 17 that the OPAC and HOPAC estimators outperform all the remaining estimators in almost all cases, almost independently of the type of the copula generator. Moreover, this implies that the OP transformation consistently improves the non-OP (H)AC estimators. In particular, the OP transformation is crucial for families that are unable to model upper-tail dependence, such as Ali-Mikhail-Haq, Clayton or Frank, or stronger correlations (e.g., Ali-Mikhail-Haq). For the Joe family, we observe good results also for the non-OP estimators. The OP-based estimators more frequently outperform the benchmark estimators for bigger αs, particularly when compared to the historical estimator. An improvement given by considering hierarchies (i.e., from OPAC to HOPAC) is observed mainly for d = 10 what may be explained by the fact that it becomes more important to model hierarchies in higher dimensions. Surprisingly, for d = 5 where the structure has such a clear role in the selection of the stocks and α = 0.95, exchangeable OPACs provide better results than HOPACs. For the AC-based estimators, no substantial influence of the size of the time window (w) is observed and can be explained by the relative robustness of these models over time.

Conclusion
We demonstrated the improvements the OP transformations can bring to exchangeable ACs and hierarchical ACs. For the exchangeable case, a simplified way to compute the tail dependence coefficients was proposed. Also, two feasible OPAC estimators were considered via simulations. Then, a new way of construction, an efficient sampling strategy and an estimator were provided for HOPACs, including a simulation study confirming the feasibility of the proposed estimator. Excellent abilities of the (H)OPAC models were finally demonstrated on an application from risk management.
Their interpretation is, however, less clear, and the same applies to the hierarchical case.