Likelihood Inference for Factor Copula Models with Asymmetric Tail Dependence

For multivariate non-Gaussian models involving copulas, likelihood inference is dominated by data in the middle of the distribution, so fitted models might not be adequate for joint tail inference, such as assessing the strength of tail dependence. When preliminary data and likelihood analysis suggest asymmetric tail dependence, a method is proposed to improve extreme value inferences based on the joint lower and upper tails. A prior that uses previous information on tail dependence can be combined with the likelihood (which in practice has some degree of misspecification) to obtain a tilted log-likelihood. With suitably transformed parameters, inferences can be based on Bayesian computing methods or on numerical optimization of the tilted log-likelihood to obtain the posterior mode and the Hessian at this mode.


Introduction
Dependence models with multivariate copulas have had many applications in the past two decades to handle non-Gaussian dependence; in particular, for applications such as risk analysis where variables can have more dependence in the joint tails than with Gaussian dependence with the same strength of central dependence.
When pairwise scatterplots of variables suggest lower and upper tail dependence, possibly asymmetric in the strength in the joint lower tails (extreme of lower quadrant) versus the strength in the joint upper tails (extreme of upper quadrant), several different parametric copula families with tail dependence are among the best based on information criteria such as the Akaike information criterion (AIC). However, model-based bivariate lower and upper tail dependence measures can be quite different for these different parametric copulas, and the comparisons of lower and upper tail dependence measures might not match the visual comparisons on the pairwise scatterplots. This is because likelihood methods are influenced a lot by data in the middle (rather than extremes), and all simple parametric models have some degree of misspecification.
For univariate distributions, it is well known that inferences involving large quantiles should not be based on a fitted parametric distribution because extrapolation is not reliable when the data values in the middle have the most influence on the parameter estimates. There are two approaches for univariate inferences involving extremes: (a) from univariate extreme value theory with the assumption of a well-behaved tail density, the peaks-over-threshold method based on the generalized Pareto distribution [1] can be used, or (b) splicing models [2] can be used with different flexible densities for the body and tail, if inferences are also needed for non-extremes. For the joint tail region, there is a multivariate Pareto approach such as that in [3], but there is no convenient way to combine it with a density for the body.
The goal in this article is to propose a method that incorporates "prior" information on the relations of bivariate lower/upper tail dependence pairs, thereby placing more weight on joint extreme observations when estimating the dependence parameters of the multivariate copula; the splicing of densities for the body and the joint tails is avoided. This approach should lead to estimates of copula dependence parameters with more reliable inference for tail dependence and other tail-based quantities.
How different parametric copula models lead to quite different tail inferences is illustrated with some financial returns data over a few consecutive years. Consider the financial returns for different market indexes or stocks in the same sector of a market; for dependence analysis, commonly, a copula-GARCH model (see [4]) is applied to GARCH-filtered returns. Pairwise normal scores plots after rank transform to N(0, 1) show tail dependence, with the clouds of points being sharper than the elliptical shape in the extreme lower and upper quadrants. Often, there appears to be stronger dependence in the joint lower tail than in the joint upper tail.
When different flexible parametric multivariate copula families, such as vine and factor copula models, are fit to multivariate GARCH-filtered returns, the best-fitting models based on AIC imply lower and upper tail dependence for any pair of returns. This is based on the results of [5,6], which imply that if bivariate copulas, for pairs of variables in the first tree of the vine or with a variable linked to a latent variable, have lower and upper tail dependence, then the bivariate copulas of all pairs of variables have lower and upper tail dependence. Note that factor copulas are vine copulas that include latent variables.
However, if model-based tail dependence parameters are computed based on the best few fitted models, they can be quite different among the models and sometimes may not match what is seen in the normal score plots. For example, (a) sometimes a model-based lower tail dependence parameter may be closer to 0 than expected based on the plot, or (b) the model-based lower tail dependence parameter may be smaller than the model-based upper tail dependence parameter, in contrast to the visual inspection of the plot.
With the non-parametric method for empirical tail dependence measures in [7], it is possible to compare empirical and model-based lower and upper tail dependence to show quantitatively that model-based measures might not be reliable for all bivariate margins. This is because the fit of parametric multivariate models based on likelihood tends to be dominated by the data in the middle of the distribution. Inference concerning the middle (e.g., medians and non-extreme marginal orthant probabilities) can be reliable, but not necessarily inference concerning the extremes (e.g., extreme marginal orthant probabilities or multivariate quantiles of the form defined in [8]).
This article shows the use of a tilted likelihood to estimate parameters of the 1-factor copula so that inferences in the joint tails are improved. The 1-factor copula for d variables has a vector parameter ϑ_j for the bivariate linking copula of the jth variable and the latent variable (the latter explains the joint dependence of the observed variables). The tilting depends on the nature of the variables. For d GARCH-filtered stock returns for stocks in the same sector, the dependence parameters {ϑ_j : 1 ≤ j ≤ d} can be considered a sample from a super-population, so it is reasonable to assume a common prior distribution for the ϑ_j. The tilted log-likelihood is the sum of the 1-factor copula log-likelihood and the logarithm of this prior density, which is based on tail dependence summaries from some "previous data".
Section 2 contains a numerical data example as preliminaries to show explicitly why likelihood inference can be inadequate for tail inference; tail dependence parameters are defined, and examples of normal score plots are given in this section. Sections 3 and 4 contain the theory and numerical methods to develop a "prior" to help with tail inference for the 1-factor copula model with asymmetric tail-dependent copulas linking to the latent variable. Section 5 illustrates the theory for a data example with GARCH-filtered stock returns from stocks in an S&P sector to show improved tail inference. Section 6 has some simulation results to compare with the data example. Section 7 concludes with a discussion of the generality of the approach proposed for the 1-factor model; the basis is a "super-population" assumption for some bivariate margins with lower and upper tail dependence.
The background results for tail dependence, copulas, and factor models are given in Appendix A.

Numerical Data Example to Illustrate Discrepancy for Tail Inference
In this section, a numerical low-dimensional data example is used to clarify what is meant by possibly poor joint tail inference based on maximum likelihood.
Definitions of bivariate tail dependence and the copula as summaries of dependence are presented to explain concepts of dependence in joint tails.
Let F_{1:d} be an absolutely continuous d-variate distribution with univariate margins F_1, ..., F_d and copula C_{1:d} such that F_{1:d}(y) = C_{1:d}(F_1(y_1), ..., F_d(y_d)). For the bivariate margin F_{jk} = C_{jk}(F_j, F_k) with j ≠ k, the probabilistic version of the lower and upper tail dependence parameters is:

λ_{jk,L} = lim_{u→0+} C_{jk}(u, u)/u,   λ_{jk,U} = lim_{u→0+} [1 − 2(1 − u) + C_{jk}(1 − u, 1 − u)]/u.

Consider a random sample from F_{1:d} with y_i = (y_{i1}, ..., y_{id}) for i = 1, ..., n. Because λ_{jk,L} and λ_{jk,U} are limiting quantities (as u → 0+), there are no direct empirical (data) versions. For the numerical examples in this section and later sections, the sample version comes from a limit of tail-weighted dependence measures.
A general reference for concepts (in the above and in later sections) with copulas and dependence is [9], and the estimator of tail dependence from a limit of tail-weighted dependence measures is given in [7]. For the probabilistic version, the tail-weighted dependence measures are indexed by a parameter α > 1, and the limit as α → ∞ is the tail dependence parameter. After computing the empirical tail-weighted dependence measure for a grid of α values, typically in the interval [10, 20], a regression model is fit for the empirical measure versus a power of α^{-1}, and then the tail dependence parameter is estimated as the extrapolation with α^{-1} → 0.
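The regression-extrapolation mechanics can be sketched as follows. The tail-weighted measure below (a correlation of power-transformed uniform scores) is an illustrative stand-in chosen for simplicity; it is not the exact estimator of [7].

```python
import numpy as np

def tail_weighted_measure(u1, u2, alpha, lower=True):
    """Illustrative tail-weighted dependence measure: correlation of
    power-transformed uniform scores, which puts increasing weight on one
    joint tail as alpha grows. (A stand-in for the estimator of [7].)"""
    if lower:
        a, b = (1.0 - u1) ** alpha, (1.0 - u2) ** alpha
    else:
        a, b = u1 ** alpha, u2 ** alpha
    return np.corrcoef(a, b)[0, 1]

def extrapolated_tail_dependence(u1, u2, alphas=np.arange(10, 21), lower=True):
    """Compute the measure on a grid of alpha in [10, 20], regress it on
    1/alpha, and extrapolate the intercept (1/alpha -> 0)."""
    m = np.array([tail_weighted_measure(u1, u2, a, lower) for a in alphas])
    slope, intercept = np.polyfit(1.0 / alphas, m, 1)
    return intercept
```

For comonotone data the measure equals 1 for every α, so the extrapolated value is 1; for independent data it extrapolates to approximately 0.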
The data example involves GARCH-filtered stock returns with all stocks in the same sector.Appendix A.4 has some background for GARCH time series and copula-GARCH models.
The S&P 500 data set of GARCH-filtered stock returns (January 2013 to December 2015, good economic conditions) used for illustration is analyzed in [10]. The sample consists of n = 754 days. For the finance sector, some initial descriptive statistics analyses are based on 10 stocks chosen from 64; the ticker symbols of the 10 stocks are COF, RJF, SCHW, FRC, GL, FD, TROW, GS, BLK, and ICE. Normal score plots of GARCH-filtered returns for a few pairs amongst these 10 stocks are given in Figure 1 (see Appendix A.3 for the mathematical definition of the transform). They show tail dependence, with the clouds of points being sharper than the elliptical shape and having a stronger correlation in the lower quadrant than in the upper quadrant. These few stocks are used to demonstrate (in small tables) a typical situation of differences in empirical and model-based tail quantities. To check that a 1-factor dependence structure is reasonable, the non-parametric transform to normal scores is applied to GARCH-filtered returns, and factor analysis (see [11]) is applied to the resulting correlation matrix. The loadings are, respectively, 0.741, 0.802, 0.838, 0.475, 0.688, 0.821, 0.665, 0.609, 0.690, 0.830. The average absolute difference between the empirical and model-based correlation matrices is 0.03, and the maximum absolute difference is 0.21 (with two discrepancies with an absolute difference > 0.10), so the 1-factor structure is reasonable as a first-order approximation, considering that a 10 × 10 matrix with 45 correlations is approximated by a simple correlation matrix with 10 parameters. With a larger dimension (more stocks in the same sector), a 1-factor model with some weak conditional dependence (see [12]) could be a better dependence model.
Two parametric copula models are fitted to account for non-Gaussian dependence: a 1-factor copula with d = 10 linking copulas that are all BB1, or all reflected BB1 (abbreviated as BB1r). These are referred to briefly as 1-factor BB1 and 1-factor BB1r. The details of these models are summarized in Appendices A.1 and A.2; in particular, Appendix A.1 has the definition of the 2-parameter bivariate BB1 copula and some of its dependence properties, and Appendix A.2 has the definition of the 1-factor copula for d variables based on conditional independence of the observed variables given a latent variable.
Table 1 has empirical and model-based lower and upper tail dependence measures: λ̂_{jk,L}, λ_{jk,L}(ϑ̂), λ̂_{jk,U}, λ_{jk,U}(ϑ̂). Model-based values are based on maximum likelihood estimates (MLEs) with 1-factor BB1r and 1-factor BB1. Table 2 has the empirical Spearman rank correlations as a central measure of dependence: ρ̂_{jk,S}. The values of ρ_{jk,S}(ϑ̂) for the two 1-factor copula models are quite close to the empirical values, in contrast with some discrepancies for the tail dependence measures. Table 3 has summaries, averaged over d(d − 1)/2 = 45 bivariate margins. Tables 1 and 3 show that tail inferences from different models with lower and upper tail dependence can be quite different, even though the models can have similar inferences for central quantities. The tail asymmetry of financial returns, with commonly more dependence in the joint lower tail than in the joint upper tail, is explained and discussed in [13,14]. The 1-factor BB1r model has a smaller AIC value than 1-factor BB1, and it better matches the empirical property of lower tail dependence often being larger than upper tail dependence. However, the 1-factor BB1r model tends to overestimate the difference in lower and upper tail dependence, and the 1-factor BB1 model tends to underestimate it. This motivates the tilted likelihood in Section 3 with an appropriate "prior" so that model-based tail dependence measures are closer to their empirical counterparts.
It has been observed in many data examples (see [15] and Chapter 7 of [9]) that model-based assessment of tail dependence may not be accurate. The more recent development of tail-weighted dependence measures in [16] allows for a better assessment of the reliability of a parametric copula model for tail inferences, by comparing empirical and model-based directional tail-weighted measures.

Tilted Likelihood for 1-Factor Copula Model with Tail Dependence
This section has a modified log-likelihood using a prior based on previous data for tail dependence parameters in a 1-factor copula model (as given in Appendix A.2).The starting point is the copula-based log-likelihood after univariate margins have been estimated.
We consider mainly inference on dependence parameters for the data on the transformed uniform scale, considered as a realization of a random sample {U_i} from a copula cumulative distribution function (cdf) C_U(·; ϑ), where ϑ = (ϑ_1, ..., ϑ_d). The log-likelihood for a random sample of size n is:

L(ϑ) = Σ_{i=1}^n log c_U(u_i; ϑ),   (1)

where c_U is the copula density of C_U. For the 1-factor copula based on BB1r (and other) linking copulas, there are lower bounds on components of the 2-dimensional ϑ_j.
For likelihood inference, there is invariance to 1-1 transforms of ϑ_j to η_j, with the latter being functions of lower and upper tail dependence parameters. Specifically, η_j = (η_{1j}, η_{2j}) is an increasing 1-1 transform of (λ_{jL}, λ_{jU}), with λ_{jL}, λ_{jU} being the lower and upper tail dependence parameters for the bivariate copula linking variable j to the latent variable. Note that η_j is unbounded. The tilted log-likelihood or log "posterior" is:

L_tilt(η) = L(ϑ(η)) + Σ_{j=1}^d log f_H(η_j),   (2)

where L denotes the log-likelihood in (1) and the density f_H does not depend on j. The above is called the tilted log-likelihood because the goal is to obtain parameter estimates that put less weight on the middle of the data space and more weight on the tails, based on "prior" expected behavior of how the lower and upper tail dependence parameters are related.
With the appropriate transformation, the prior can be taken as multivariate normal. For bivariate BB1r or BB1, η_j is 2-dimensional, and f_H is assumed to be bivariate normal. The latter is reasonable if the form of η_j is chosen so that (2) is closer to a quadratic in a neighborhood of its mode. Asymptotic likelihood theory (see [17]) implies that the log-likelihood is quadratic in a neighborhood of the mode as n → ∞, but the adequacy of the approximation for moderate sample size n depends on the transform.
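For the BB1 copula with parameters θ > 0 and δ ≥ 1, the tail dependence parameters are λ_L = 2^{−1/(δθ)} and λ_U = 2 − 2^{1/δ} (swapped for BB1r). A sketch of an unbounded η parametrization follows, assuming a logit transform of the tail dependence parameters; the logit is one natural choice, since the text only requires η_j to be an unbounded 1-1 transform.

```python
import math

def bb1_tail_dependence(theta, delta):
    """Tail dependence of the BB1 copula (theta > 0, delta >= 1):
    lambda_L = 2^{-1/(delta*theta)}, lambda_U = 2 - 2^{1/delta}.
    For reflected BB1 (BB1r), the two values are swapped."""
    lam_L = 2.0 ** (-1.0 / (delta * theta))
    lam_U = 2.0 - 2.0 ** (1.0 / delta)
    return lam_L, lam_U

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def eta_from_bb1(theta, delta):
    """Unbounded parametrization eta = (logit lambda_L, logit lambda_U);
    requires delta > 1 so that both tail dependence values are in (0, 1)."""
    lam_L, lam_U = bb1_tail_dependence(theta, delta)
    return logit(lam_L), logit(lam_U)

def bb1_from_eta(eta1, eta2):
    """Invert: recover (theta, delta) from eta."""
    lam_L, lam_U = expit(eta1), expit(eta2)
    delta = math.log(2.0) / math.log(2.0 - lam_U)
    theta = -math.log(2.0) / (delta * math.log(lam_L))
    return theta, delta
```

With this choice, η_{1j} > 0 corresponds to lower tail dependence exceeding 0.5, matching the interpretation used later for the prior mean.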
The justification of "independent" prior densities for different variables is based on some empirical checks for 1-factor copula construction with different bivariate linking copulas (with or without tail dependence).The inverse Hessian (roughly the covariance matrix of the sampling distribution of the MLE) of the negative log-likelihood in (1) for the 1-factor copula is close to the block diagonal, with a block for each η j .The product form of the "prior" is based on an assumption of a "super-population" for the variables linked to the latent variable (e.g., stocks in a market sector).The density f H can be considered a frequency density of η j values over a large "super-population".
A method to decide on choices for f_H is described in Section 4. Similar ideas to the tilted log-likelihood have been used to obtain an adjusted log-likelihood that corrects some undesirable behavior of the MLE, given in [18,19]. There are also some connections with variational Bayes inference, such as when the posterior density is assumed to be approximated by a multivariate Gaussian density after a suitable transform so that parameters are unconstrained. However, with copula applications [20,21], parsimonious and possibly unrealistic assumptions are made for the covariance matrix (such as diagonal or factor structure) of the Gaussian density. The optimization involves a Kullback-Leibler divergence between the Gaussian approximation and the posterior. This differs from optimizing (2) with no constraints on the form of the Hessian matrix at the mode.

Numerical Optimization for Posterior Mode and Hessian at Mode
The tilted log-likelihood is analogous to a penalized log-likelihood, so standard numerical optimization methods can be used to estimate the mode and the Hessian at the mode.
The tilted log-likelihood in (2) and its log-likelihood counterpart in (1) are functions of 2d parameters for the 1-factor BB1r copula with d variables.For the log-likelihood, Ref. [22] discusses an efficient numerical procedure where the log-likelihood, gradient, and Hessian are analytically derived and coded in Fortran90, and all integrals are evaluated via Gauss-Legendre quadrature (see [23]).
The code is modified to handle the transform from the BB1 parameters (θ_j, δ_j) to (η_{1j}, η_{2j}), and this requires care in using the chain rule for partial derivatives. The code for (2) and its gradient and Hessian is input into an efficient modified Newton-Raphson algorithm, as summarized in Section 6.2 of [9]. This leads to much faster computations than coding the negative of (2) in R and using a quasi-Newton method for numerical minimization based on numerical gradients and Hessians, because many more iterations are needed compared with the modified Newton-Raphson. With the use of Fortran90 (for loops), analytic derivatives, and modified Newton-Raphson iterations, the time to compute the posterior mode is decreased by a factor larger than 20 for 2d = 40 parameters. Without the increased speed, the simulation study reported in Section 6 would take too much time. Also, numerical optimization with the quasi-Newton method performs much worse as the number of parameters increases beyond 40.
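The 1-factor copula density has the form c_{1:d}(u) = ∫_0^1 ∏_j c_{jV}(u_j, v) dv, so each likelihood term is a one-dimensional integral over the latent variable. A minimal sketch of the Gauss-Legendre evaluation follows, using Gaussian linking copulas as a simplification (the paper uses BB1/BB1r linking copulas with analytic derivatives coded in Fortran90).

```python
import numpy as np
from scipy.stats import norm

def gauss_copula_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho); v may be an array."""
    x, y = norm.ppf(u), norm.ppf(v)
    r2 = 1.0 - rho ** 2
    return np.exp((2 * rho * x * y - rho ** 2 * (x ** 2 + y ** 2)) / (2 * r2)) / np.sqrt(r2)

def one_factor_copula_density(u, rhos, nq=35):
    """1-factor copula density c(u_1,...,u_d) = int_0^1 prod_j c_jV(u_j, v) dv,
    evaluated by Gauss-Legendre quadrature mapped from [-1, 1] to [0, 1]."""
    nodes, weights = np.polynomial.legendre.leggauss(nq)
    v = 0.5 * (nodes + 1.0)   # quadrature points in (0, 1)
    w = 0.5 * weights         # weights summing to 1
    prod = np.ones_like(v)
    for uj, rj in zip(u, rhos):
        prod *= gauss_copula_density(uj, v, rj)
    return float(np.sum(w * prod))
```

Two sanity checks: with all linking correlations zero the density is identically 1 (conditional independence with independent links), and for d = 1 the integral is the uniform marginal density, so it should be close to 1 for any single u.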
With the negative Hessian at the mode of the tilted log-likelihood, the inverse Hessian can be used to obtain interval estimates for functions of the parameters.
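For example, an interval estimate for a tail dependence parameter can be formed on the unbounded η scale and mapped back through the monotone inverse transform. The sketch below assumes a logit-type η parametrization (an assumption; the text leaves the exact transform open), so the resulting interval always stays inside (0, 1).

```python
import numpy as np

def tail_dep_interval(eta_hat, inv_hessian, k, z=1.96):
    """Approximate interval for a tail dependence parameter lambda = expit(eta_k),
    from the posterior mode eta_hat and the inverse Hessian of the negative
    tilted log-likelihood. The normal interval is formed on the unbounded eta
    scale; its endpoints are mapped through the monotone expit function.
    (Assumes a logit-type eta parametrization.)"""
    se = np.sqrt(inv_hessian[k, k])
    lo, hi = eta_hat[k] - z * se, eta_hat[k] + z * se
    expit = lambda x: 1.0 / (1.0 + np.exp(-x))
    return expit(lo), expit(hi)
```

Transforming the endpoints, rather than applying the delta method on the (0, 1) scale, avoids intervals that spill outside the parameter range when the tail dependence estimate is near 0 or 1.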

Closer Match of Empirical and Model-Based Tail Dependence
Suppose diagnostic plots suggest tail dependence for all pairs of variables. Maximum likelihood estimation with a parametric copula might not provide good model-based estimates of tail dependence parameters or reliable inferences for tail-based quantities. In this section, a least squares method is used to obtain parameter estimates for the 1-factor copula that will make the empirical and model-based tail dependence parameters closer to each other. That is, there is an objective function to find copula parameters to better match model-based and empirical tail dependence parameters.
Let θ be the vector of all parameters (ϑ_1, ..., ϑ_d). The jth component is ϑ_j = (θ_j, δ_j) for the 1-factor BB1r or BB1 copula; see Appendix A.1 for the parametric BB1 family. The steps below assume that 1-factor BB1r has a lower AIC than 1-factor BB1 (empirical evidence from many applications of 1-factor copulas to GARCH-filtered stock returns).

2. Get the empirical matrix of lower tail dependence λ̂_{jk,L}, upper tail dependence λ̂_{jk,U}, and central dependence (Spearman's rho) ρ̂_{jk,S}.

3. Minimize the least squares objective, with the current estimate ϑ̃ as the starting point; let the result be ϑ̂.

6. Get the sample mean vector and covariance matrix for a sample of size d for the two transformed λ's. The mean vector and covariance matrix are used as parameters for the bivariate normal prior f_H in (2). For the tilted likelihood, use the η parametrization.

The data set mentioned in Section 2, as used in [10], has 64 stocks in the finance sector, 21 stocks in the energy sector, and 60 stocks in the health sector of the S&P 500 (years 2013-2015). The above procedure is applied to 20 random stocks from the finance sector, 10 random stocks from the energy sector, and 20 random stocks from the health sector. Below, (4) to (6) give the mean vector and covariance matrix of f_H for three cases; they are used as the parameters of three bivariate normal distributions. The three cases are used in subsequent sections to allow a sensitivity analysis of the parameters in f_H.
All three cases in (4) to (6) indicate stronger tail dependence in the joint lower tail than in the joint upper tail because of the larger value in the first component of µ. Of the three cases, the first has the strongest expected lower tail dependence because of the largest first component of µ. For the first two cases, the median lower tail dependence is larger than 0.5 because the first component of µ is positive. The median upper tail dependence is less than 0.5 for all three cases.
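Step 6 above can be sketched as follows, assuming the logit as the unbounded transform of the tail dependence estimates from the previous data set (the exact transform is an assumption here).

```python
import numpy as np

def prior_parameters(lam_L, lam_U):
    """Mean vector and covariance matrix for the bivariate normal prior f_H,
    from per-variable tail dependence estimates (lam_L[j], lam_U[j]) of a
    previous data set. (Assumes the logit as the unbounded transform.)"""
    lam_L, lam_U = np.asarray(lam_L, float), np.asarray(lam_U, float)
    eta = np.column_stack([np.log(lam_L / (1.0 - lam_L)),
                           np.log(lam_U / (1.0 - lam_U))])
    mu = eta.mean(axis=0)
    Sigma = np.cov(eta, rowvar=False)  # sample covariance (d - 1 denominator)
    return mu, Sigma
```

With lower tail dependence estimates at or above 0.5 and upper tail estimates below 0.5, the first component of µ comes out positive and larger than the second, matching the pattern described for cases (4) to (6).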

Data Example with Prior and Tilted Likelihood
This section summarizes the application of the tilted log-likelihood for GARCH-filtered stock returns. Initially, three 1-factor copula constructions, with BB1, BB1r, and BB7 bivariate linking copulas to the latent variable, were fitted with maximum likelihood for different subsets of stocks. Here, as is common in many empirical applications, the 1-factor copula based on BB1r is best, based on the AIC.
The tilted log-likelihood in (2) was then used for analysis of random subsets of stocks from the finance, energy, and health sectors; these were different subsets from those used to determine the prior parameters (4)-(6). The qualitative conclusions are similar for different random subsets, so below we report details for one case of 20 randomly chosen finance stocks, considered as one representative application of the theory in the preceding sections.
Inferences for tail dependence are compared for five cases below with summaries in Table 4.
1-factor BB7, f H based on finance sector stocks.
Table 4 shows that for BB1r, there is little sensitivity to the three priors (4)-(6). However, the worse-fitting 1-factor BB1 and 1-factor BB7 models (based on the last column of Table 4) do not lead to better matching with empirical dependence measures using the prior in (4). Overall, these latter two models fit worse in the middle of the data space, leading to smaller values of (2) at the mode. For 1-factor BB1 with the tilted log-likelihood (2), we looked at the negative inverse Hessian (covariance matrix of the normal approximation) at the posterior mode for row 2 of Table 4. There is almost zero correlation between the parameters for different variable indices j (for different stocks). The inverse Hessian is too large to show in its entirety, but an extract of some entries is converted into standard deviations and correlations in Tables 5 and 6.
Table 5. Posterior mode and standard deviation (SD) of the η_{1j}, η_{2j} parameters. Note that µ_{η_{1j}} > µ_{η_{2j}} implies stronger lower tail dependence than upper tail dependence for variable j with the latent variable. µ_{η_{1j}} > 0 means that the estimated lower tail dependence with the latent variable exceeds 0.5. The SD σ values come from the square roots of the diagonal values of the negative inverse Hessian at the mode. The correlation values for each diagonal 2 × 2 block come from converting a covariance matrix to a correlation matrix.

Bayesian Computing with STAN
Results based on the prior in (2) were also obtained via Bayesian computing with STAN (Hamiltonian Monte Carlo). Estimation for a 1-factor copula model via Hamiltonian Monte Carlo is shown in [24], but their inferences do not include asymmetric tail dependence.
In Bayesian inference, the parameter vector Θ* consists of both the (transformed) copula dependence parameters η = (η_1, η_2, ..., η_d) and the latent variables v = (v_1, v_2, ..., v_n) in (A3). We assume a joint independent uniform prior distribution for the latent variables and a (product of) bivariate normal prior for the dependence parameters of the bivariate linking copulas. The prior density is

π(Θ*) = ∏_{j=1}^d f_H(η_j) × ∏_{i=1}^n 1{0 < v_i < 1},

where the mean and covariance matrix of the bivariate normal prior f_H are given in (4)-(6). The "complete" likelihood function with the latent variables as parameters is

∏_{i=1}^n ∏_{j=1}^d c_{jV}(u_{ij}, v_i; ϑ_j(η_j)),   (7)

where c_{jV} is given in Appendix A.2. Since the Bayesian estimation treats the latent variables as additional parameters, the likelihood function consists of the conditional density function given the latent variables instead of the joint density function. The posterior density function of the parameters is, up to a constant,

π(Θ* | u) ∝ ∏_{i=1}^n ∏_{j=1}^d c_{jV}(u_{ij}, v_i; ϑ_j(η_j)) × ∏_{j=1}^d f_H(η_j).

To perform Bayesian inference on the (transformed) copula dependence parameters of the 1-factor model, we use the No-U-Turn sampler (NUTS) proposed by [25]. NUTS is an extension of the Hamiltonian Monte Carlo algorithm, implemented within the STAN framework developed by [26]. The 1-factor copula models with BB1 and reflected BB1 copulas are fitted to the GARCH-filtered returns in STAN. For the data example with results summarized in Tables 5 and 6, the posterior statistics of η (including posterior means, standard deviations, and correlation matrix) are similar to the results obtained from maximizing the tilted likelihood function in (2). In comparison with Table 5, the median and maximum absolute differences are, respectively, (a) 0.006 and 0.033 for the µ_η's, (b) 0.002 and 0.014 for the σ_η's, and (c) 0.023 and 0.059 for the ρ's.
From (7), it is seen that the log posterior is, up to a constant, equal to

Σ_{i=1}^n Σ_{j=1}^d log c_{jV}(u_{ij}, v_i; ϑ_j(η_j)) + Σ_{j=1}^d log f_H(η_j);

this is equivalent to the tilted log-likelihood in (2) after marginalizing over the latent variables v. Therefore, the two approaches should yield essentially the same result. With a flat prior on the η_j, the posterior estimates should align with the maximum likelihood estimates. However, in the case of estimating BB1 or reflected BB1 copulas, identifiability issues arise when using a flat prior. The two parameters of BB1 or reflected BB1 are negatively dependent, which can result in different combinations of parameter values producing similar likelihood values. This issue might be overlooked in maximum likelihood estimation, since it converges to one of the maxima with an appropriate starting point for numerical optimization. However, it becomes evident in Bayesian estimation, where the model struggles to distinguish between different parameter values in the posterior distributions. We found that incorporating informative priors can effectively mitigate this problem. These priors leverage tail dependence measures to provide additional information about the relationship between the parameters, thereby improving the model's ability to identify meaningful and interpretable parameter values.

Simulation Summary
This section has some simulation results for comparisons. Simulated data sets of size n = 754 and d = 20 are obtained to match the data example in Section 5; the algorithm for the simulation is Algorithm 22 of [9]. For each simulated data set, (η_{1j}, η_{2j}) for j = 1, ..., d are generated at random from (4), and then a random sample {(u_{i1}, ..., u_{id}) : i = 1, ..., n} is generated from 1-factor BB1r based on the tail dependence parameters. For each simulated data set, as a sensitivity analysis, the log posterior in (2) for all three choices of f_H based on (4)-(6) is maximized to obtain the mode and the approximate covariance matrix of the posterior density; also, the MLE based on (1) is obtained.
The MLEs of the η_{1j}, η_{2j} parameters are transformed to the estimated θ, δ parameters of BB1r. Similarly, the three sets of posterior modes for the η_{1j}, η_{2j} parameters are transformed to estimates of the θ, δ parameters. Then, the following root mean squares (rms) are computed for the four sets of estimators:

rms_θ^{(m)} = { d^{-1} Σ_{j=1}^d (θ̂_j^{(m)} − θ_j)² }^{1/2},   rms_δ^{(m)} = { d^{-1} Σ_{j=1}^d (δ̂_j^{(m)} − δ_j)² }^{1/2},   (8)

where the superscripts m = 1, 2, 3 indicate the three priors, and the superscript m = 0 indicates maximum likelihood. Over 100 simulated data sets, the rms summaries are given in Table 7. As expected, all three priors lead to estimates closer to the (θ, δ) parameters used to generate the simulated data sets than the MLEs. The three sets of {(θ̂_j^{(m)}, δ̂_j^{(m)})} for m = 1, 2, 3 are relatively much closer to each other than to the MLE. For all simulated data sets, the value of the tilted log-likelihood (2) at the posterior mode is largest for prior (4) and smallest for prior (6). Another summary in Table 8 is the closeness to the empirical λ̂_{jk,L} and λ̂_{jk,U} over the d(d − 1)/2 pairs:

D_M^{(m)} = { [d(d − 1)/2]^{-1} Σ_{j<k} (â_{jk} − a_{jk}^{(m)})² }^{1/2},   (9)

where m ∈ {0, 1, 2, 3} as above, and M ∈ {L, U, C} for lower tail dependence, upper tail dependence, and central dependence, with dependence measures a ∈ {λ_{jk,L}, λ_{jk,U}, ρ_{jk,S}}, respectively. From Table 8, there is better matching with the tilted log-likelihood for upper tail dependence but no improvement for lower tail dependence. The Spearman values ρ_{jk,S} are much closer between the empirical and model-based versions.
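Both kinds of summary are root-mean-square discrepancies: over the d variables for parameter recovery, and over the d(d − 1)/2 pairs for dependence measures. A minimal helper, assuming the rms form:

```python
import numpy as np

def rms(est, true):
    """Root mean square difference between two vectors of the same length.
    Used for parameter recovery (estimated vs. generating theta_j or delta_j)
    and for the mismatch of empirical vs. model-based dependence measures
    (lambda_{jk,L}, lambda_{jk,U}, or rho_{jk,S}) over pairs."""
    est, true = np.asarray(est, float), np.asarray(true, float)
    return float(np.sqrt(np.mean((est - true) ** 2)))
```

The summary is 0 exactly when the two vectors agree, and it penalizes large individual discrepancies more than an average absolute difference would.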
In the comparison of the simulation results with those for the stock return data in Section 5, the improvements from using (2) are smaller. This can be explained as follows: for the simulated data, the 1-factor BB1r model is correctly specified. For finance stock return data with stocks from one sector, the 1-factor structure with lower and upper tail dependence is reasonable and BB1r linking copulas can be considered good approximations, but there might also be weak conditional dependence of some stock returns given the latent variable. That is, the 1-factor BB1r copula model has some small degree of model misspecification, and this can explain why tilting so that model-based tail dependence parameters match empirical counterparts should lead to better tail inference.

Discussion
A method has been proposed for improved tail inference when preliminary data and likelihood analysis suggest asymmetric tail dependence. The approach of the tilted log-likelihood introduces a prior distribution involving lower and upper tail dependence parameters. Incorporating the prior places more weight on the behavior of the joint lower and upper tails compared with the center of the probability space, thereby improving the extreme value inference. This can account for a small degree of model misspecification in the parametric model. The prior is chosen so that model-based lower and upper tail dependence parameters can be a closer match to empirical counterparts for a previous data set that has some features similar to the data set under consideration.
For simpler exposition, the theory is applied to a 1-factor copula model that can handle non-Gaussian dependence structures with asymmetric tail dependence. The tilted log-likelihood approach can be extended to other structured factor copula models (e.g., bi-factor and 1-factor with weak residual dependence) with asymmetric tail dependence, where a super-population assumption is reasonable for how observed variables are linked to latent variables. Also, the approach can be applied to vine copula models with bivariate tail dependence for all pairs of variables by choosing bivariate copulas with lower and upper tail dependence in tree 1 of the vine. From [5], lower and upper tail dependence in the first vine tree leads to this property for all pairs of variables. By including a prior based on pairs of variables with stronger dependence and asymmetric tail dependence, there could be a better match of vine copula model-based tail dependence measures and empirical counterparts.
The skew-t copula (see [27]) can also be used for asymmetric tail dependence. However, the functional relation between copula parameters and tail dependence parameters is much more complicated than for the BB1 copula (the latter is in Appendix A.1), so the tilted log-likelihood approach would have to be implemented with a different transform of the copula parameters.
Bayesian computing methods can be used if there are latent variables. Alternatively, a tilted log-likelihood similar to (2) can be optimized via (a) a quasi-Newton method if the total number of parameters is not large (say, fewer than 40), (b) a modified Newton-Raphson method if the analytic gradient and Hessian can be obtained, or (c) sequential estimation of parameters if possible (Section 5.5 of [9]). For methods (b) and (c), numerical optimization of the tilted log-likelihood is used to obtain the (approximate) posterior mode, and then the Hessian at this mode, in order to obtain interval estimates of functions of the parameters.
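Method (a) can be sketched on a toy one-parameter problem: a bivariate Gaussian copula with Fisher-z parameter and a normal "prior" on the transformed parameter. This is a hypothetical stand-in chosen so the structure of (2) is visible, not the paper's 1-factor BB1r model.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_tilted_loglik(z, u1, u2, prior_mean, prior_sd):
    """Negative tilted log-likelihood: Gaussian copula log-likelihood with
    Fisher-z parameter z = atanh(rho), plus a normal log 'prior' on z
    (constants dropped). A toy analogue of the structure of (2)."""
    z = float(np.atleast_1d(z)[0])
    rho = np.tanh(z)
    x, y = norm.ppf(u1), norm.ppf(u2)
    r2 = 1.0 - rho ** 2
    loglik = np.sum((2 * rho * x * y - rho ** 2 * (x ** 2 + y ** 2)) / (2 * r2)
                    - 0.5 * np.log(r2))
    log_prior = -0.5 * ((z - prior_mean) / prior_sd) ** 2
    return -(loglik + log_prior)

def fit_tilted(u1, u2, prior_mean=0.5, prior_sd=0.5):
    """Quasi-Newton (BFGS) optimization of the tilted objective; returns the
    posterior mode of rho and an approximate SD of z from the inverse Hessian."""
    res = minimize(neg_tilted_loglik, x0=np.array([0.0]),
                   args=(u1, u2, prior_mean, prior_sd), method="BFGS")
    z_hat = res.x[0]
    sd_z = float(np.sqrt(res.hess_inv[0, 0]))
    return np.tanh(z_hat), sd_z
```

Working on the unbounded z scale mirrors the η parametrization of Section 3: the quasi-Newton iterations are unconstrained, and the normal prior is more nearly conjugate to a log-likelihood that is close to quadratic in the transformed parameter.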

Figure 1 .
Figure 1. Normal score plots for some pairs of GARCH-filtered stock returns. Lower and upper semi-correlations, as used in Section 2.4 of [9], show more dependence in the lower quadrant than in the upper quadrant and suggest asymmetric tail dependence.

Table 1 .
Matrices of tail dependence measures for 10 stock GARCH-filtered returns: model-based 1-factor BB1r, empirical, model-based 1-factor BB1, respectively.Lower (upper) tail dependence below (above) diagonal.Bootstrap standard errors (SEs) for estimates of lower and upper tail dependence are mostly in the range 0.04 to 0.075.

Table 2 .
Empirical Spearman rank correlation matrix for 10 GARCH-filtered stock returns.Bootstrap SEs for Spearman are in the range 0.017 to 0.036.Model-based Spearman rhos based on 1-factor BB1r and 1-factor BB1 are quite close to the respective empirical values.

Table 3 .
Summaries to indicate how well model-based tail dependence and central dependence approximate the respective empirical values. The averages and fractions are over (10 choose 2) = 45 bivariate margins.

Table 4 .
Closeness of model-based (ML or posterior modal) values to the corresponding empirical values for the lower tail dependence λ_{jk,L}, upper tail dependence λ_{jk,U}, and central dependence parameters ρ_{jk,S} of the d(d − 1)/2 pairs; d = 10 GARCH-filtered stock returns. The quantiles in columns 2 to 4 are of the average absolute difference over the d(d − 1)/2 pairs.

Table 7 .
Values of the average difference in (8) over 100 simulated data sets. Also included are the fraction of times that the posterior mode from (2) is closer to the "true" vector than the MLE, and the lower/upper quartiles of (8).

Table 8 .
Values of the average difference in (9) over 100 simulated data sets. Also included are the fraction of times that the posterior mode from (2) improves on the MLE based on (9), and the lower/upper quartiles of (9).