Estimation of conditional extreme risk measures from heavy-tailed elliptical random vectors

In this work, we focus on the estimation of some conditional extreme risk measures for elliptical random vectors. In a previous paper, we proposed a methodology to approximate extreme quantiles, based on two extremal parameters. We thus propose some estimators for these parameters, and study their asymptotic properties in the case of heavy-tailed distributions. Thereafter, from these parameters, we construct estimators of extreme conditional quantiles, and give their consistency properties. Using recent results on the asymptotic relationship between quantiles and other risk measures, we deduce estimators for extreme conditional Lp-quantiles and Haezendonck-Goovaerts risk measures. In order to test the efficiency of our estimators, we propose a simulation study. A financial data example is also proposed.


Introduction
In many fields such as finance or actuarial science, the quantile, or Value-at-Risk (see Linsmeier and Pearson (2000)), is a recognized tool for risk measurement. In Koenker and Bassett (1978), the quantile is seen as the minimizer of an asymmetric loss function. However, Value-at-Risk, or VaR, has some disadvantages, such as not being a coherent measure in the sense of Artzner et al. (1999). These limitations have led many authors to use alternative risk measures. On the basis of Koenker's approach, Newey and Powell (1987) proposed another measure called the expectile, which has since been widely studied (see for example Sobotka and Kneib (2012) or more recently Daouia et al. (2017a)) and applied (Taylor (2008) and Cai and Weng (2016)). Later, Breckling and Chambers (1988) introduced M-quantiles, a family of measures minimizing an asymmetric loss function, and Chen (1996) focused on asymmetric power functions to define Lp-quantiles. The cases p = 1 and p = 2 correspond respectively to the quantile and the expectile. Recently, Bernardi et al. (2017) provided some results concerning Lp-quantiles for Student distributions, and showed that closed-form formulas are difficult to obtain in the general case. In parallel, Artzner et al. (1999) introduced the Tail-Value-at-Risk as an alternative to Value-at-Risk, and this risk measure subsequently had many applications (see e.g. Bargès et al. (2009)). Moreover, TVaR belongs to a larger family of risk measures called Haezendonck-Goovaerts risk measures, introduced in Haezendonck and Goovaerts (1982), Goovaerts et al. (2004) and Tang and Yang (2012). As for Lp-quantiles, no explicit formula is available in the general case. However, for a heavy-tailed random variable, Daouia et al. (2017b) proved that the Lp-quantile and the L1-quantile (or quantile) are asymptotically proportional. Then, as proposed in Daouia et al. 
(2017a), an estimator of an Lp-quantile may be deduced from a suitable estimator of the quantile, for extreme levels. In the same spirit, Tang and Yang (2012) provided a similar asymptotic relationship between a subclass of Haezendonck-Goovaerts risk measures and quantiles. Thus, all the risk measures introduced above may be estimated through quantile estimation in an asymptotic setting. Extreme quantile estimation is a very active area of research. Many examples can be given from recent years: Gardes and Girard (2005) focused on Weibull tail distributions, El Methni et al. (2012) proposed a study for heavy- and light-tailed distributions, Gong et al. (2015) were interested in functions of dependent variables, and de Valk (2016) provided a methodology for high quantile estimation. The question of extreme conditional quantile estimation has also been explored in Wang et al. (2012) in a regression framework. However, Maume-Deschamps et al. (2017) and Maume-Deschamps et al. (2018) have shown that the regression setting may lead to a poor estimation of extreme measures in the case of elliptical distributions. Elliptical distributions, introduced in Kelker (1970), aim to generalize the Gaussian distribution, i.e. to define symmetric distributions with different properties, such as a heavy tail. This is why elliptical distributions are increasingly used in finance (see for example Owen and Rabinovitch (1983) or Xiao and Valdez (2015)). For all these reasons, we consider, in this paper, an elliptical random vector Z = (X, Y) with the consistency property (in the sense of Kano (1994)), where X ∈ R^N, Y ∈ R, and propose to estimate some extreme quantiles (and deduce Lp-quantiles and Haezendonck-Goovaerts risk measures) of Y|X = x, i.e. of one component conditionally on the others. In order to improve the conditional quantile estimation, we proposed in Maume-Deschamps et al. (2017) a methodology based on two extremal parameters, and the unconditional quantile of Y. 
Indeed, if we denote by F^{-1}_{Y|x}(α) the quantile of level α of Y|X = x, the latter is asymptotically equivalent to a quantile of Y (F^{-1}_Y being the quantile function of Y), in the following manner : where δ is a known function (detailed later) depending on α and two parameters η and ℓ, called extremal parameters. One can notice that Equation (1.1) may only hold under the consistency property of Z. Maume-Deschamps et al. (2017) have also shown that extremal parameters do not exist for some consistent elliptical distributions (see e.g. the Laplace distribution). In this paper, the first goal is to give a sufficient condition on Z that ensures the existence of η and ℓ. This is why a regular variation assumption is made. Once their existence is proved, estimators are proposed for the parameters η and ℓ, and therefore for extreme conditional quantiles. The paper is organized as follows. Section 2 provides some definitions and properties of elliptical distributions, including the extremal parameters introduced in Maume-Deschamps et al. (2017). Particular attention is given to consistent elliptical distributions. Section 3 is devoted to the extremal parameters η and ℓ. Under a regular variation assumption, their existence is proved, and estimators are proposed. Under some additional conditions, consistency and asymptotic normality results are given. In Section 4, we use the results of Section 3 to introduce some estimators of extreme quantiles, and give consistency and asymptotic normality results. The asymptotic relationships between Lp-quantiles and quantiles recalled in Section 5 allow us to give extreme Lp-quantile estimators. The same approach is proposed for extreme Haezendonck-Goovaerts risk measures. In order to analyze the efficiency of our estimators, we propose a simulation study in Section 6, and a real data example in Section 7.

Preliminaries
In this section, we first recall some classical results on elliptical distributions. We consider a d-dimensional vector Z from an elliptical distribution with parameters µ ∈ R^d and Σ ∈ R^{d×d}. Then the density of Z, if it exists, is given by :

$$f_Z(z) = \frac{c_d}{|\Sigma|^{1/2}}\, g_d\left((z-\mu)^T \Sigma^{-1} (z-\mu)\right). \qquad (2.1)$$

c_d and g_d will respectively be called the normalization coefficient and the generator of Z. Cambanis et al. (1981) give another way to characterize an elliptical distribution, through the following stochastic representation :

$$Z \overset{d}{=} \mu + R\, \Lambda\, U^{(d)}, \qquad (2.2)$$

where ΛΛ^T = Σ, U^(d) is a d-dimensional random vector uniformly distributed on the unit sphere of dimension d, and R is a non-negative random variable independent of U^(d). R is called the radius of Z. In the following, the radius must have a particular shape. Indeed, Huang and Cambanis (1979) and Kano (1994) propose a representation for some particular elliptical distributions. Let us consider (Z_d)_{d∈N*} a family of elliptical distributions of dimension d. Then (Z_d)_{d∈N*} possesses the consistency property if it admits the following representation for all d ∈ N* :

$$Z_d \overset{d}{=} \mu + \chi_d\, \xi\, \Lambda\, U^{(d)}, \qquad (2.3)$$

where χ_d is the square root of a χ² distribution with d degrees of freedom, ξ is a non-negative random variable which does not depend on d, and χ_d, ξ and U^(d) are mutually independent. In Kano (1994), such elliptical distributions are said to be consistent; they have the advantage of being stable under linear combinations (combining Theorem 2.16 of Fang et al. (1990) and Theorem 1 in Kano (1994)), and allow us to define elliptical random fields (see, e.g., Opitz (2016)). In the following, we focus on consistent elliptical distributions, and use the notation R_d = χ_d ξ. For the sake of clarity, we will say that a random variable with stochastic representation (2.3) is (ξ, d)-elliptical with parameters µ and Σ. Using this terminology, the purpose of the paper is as follows. Let Z = (X, Y) ∈ R^{N+1} be a (ξ, N + 1)-elliptical random vector with parameters µ and Σ, where X ∈ R^N and Y ∈ R. 
The consistency property of Z implies that X and Y are respectively (ξ, N)- and (ξ, 1)-elliptical, with parameters µ_X ∈ R^N, Σ_X ∈ R^{N×N} and µ_Y ∈ R, Σ_Y ∈ R. We also denote by Σ_XY the covariance vector between X and Y. The aim is thus to provide a predictor for the quantile of the conditional distribution Y|X = x. According to Theorem 7 of Frahm (2004), such a distribution is still elliptical, with a radius R* different from R in the general case. In particular, we have : and using the translation equivariance and positive homogeneity of elliptical quantiles (see McNeil et al. (2015)), conditional quantiles of Y|X = x may be expressed as : , where α ∈ ]0, 1[. Thus, in order to give a good prediction of q_α(Y|X = x), we need to estimate the conditional function Φ^{-1}_{R*}. Unfortunately, when we have a data set X_1, ..., X_n, we only observe the unconditional distribution of X. This is why, in Maume-Deschamps et al. (2017), we have given a predictor for conditional quantiles, based solely on the unconditional c.d.f. Φ_R(t) = P(R_1 U^(1) ≤ t). This approximation is based on two parameters η ∈ R and 0 < ℓ < +∞ such that : = . Table 1 gives some examples of coefficients η and ℓ for classical elliptical distributions. However, we have shown in Maume-Deschamps et al. (2017) that such parameters do not exist for all elliptical distributions (see, e.g., the Laplace distribution). A first question is therefore in which setting these parameters exist. We thus consider the following assumption, which ensures the existence of η and ℓ.
Assumption 1 (Second order regular variations). We assume that there exists a function A such that A(t) → 0 as t → +∞, and where γ > 0 and ρ < 0.
This assumption is widespread in the literature on extreme quantiles (see, e.g., Daouia et al. (2017a)). A first consequence is that Φ_R, or equivalently F_{R_1}, is attracted to the maximum domain of Pareto-type distributions with tail index γ. Furthermore, it entails Φ −1 as t → +∞ (see de Haan and Ferreira (2006)). As an example, the Student distribution satisfies Assumption 1.
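To make the consistent representation concrete, here is a minimal simulation sketch for the Student case. It relies on a standard fact that is not stated in the text above: a multivariate Student vector with ν degrees of freedom is obtained by taking ξ = sqrt(ν / χ²_ν) in the representation with radius χ_d ξ. The function name `sample_consistent_student` and its arguments are hypothetical, chosen for illustration only.

```python
import numpy as np


def sample_consistent_student(n, mu, sigma, nu, rng):
    """Draw n vectors Z = mu + chi_d * xi * Lambda U^(d).

    Assumption (standard fact, not taken from the text): the Student case
    corresponds to xi = sqrt(nu / chi2_nu), independent of chi_d and U^(d).
    """
    mu = np.asarray(mu, dtype=float)
    d = mu.size
    # Lambda such that Lambda Lambda^T = Sigma (Cholesky factor)
    lam = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    # U^(d): normalized Gaussian vectors are uniform on the unit sphere
    g = rng.standard_normal((n, d))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)
    # chi_d: square root of a chi-square variable with d degrees of freedom
    chi_d = np.sqrt(rng.chisquare(d, size=n))
    # xi does not depend on the dimension d
    xi = np.sqrt(nu / rng.chisquare(nu, size=n))
    radius = chi_d * xi
    # Row-wise version of mu + R * Lambda * U
    return mu + (radius[:, None] * u) @ lam.T
```

With `mu = np.zeros(4)`, `sigma = np.eye(4)` and `nu = 2`, each coordinate of the output should behave like a univariate Student variable with 2 degrees of freedom, which is the heavy-tailed setting used later in the simulation study.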
The following lemma provides some results concerning asymptotic equivalences.
Lemma 2.1 (Regular variation properties). Under Assumption 1, we get the following regular variation properties : (i) The random variable ξ satisfies (ii) For all d ∈ N*, the random variable R_d = χ_d ξ is attracted to the maximum domain of Pareto-type distributions with tail index γ, and These results will be useful throughout the paper, and especially in the following result, which proves the existence of our parameters.
One can notice that η is only related to the tail index γ, and not to the covariate vector x, while ℓ depends on c_N g_N(M(x)). In what follows, we therefore write ℓ(x), in order to emphasize the role played by the covariate vector x. We can now give the following predictor for q_α(Y|X = x) : From there, we have proved in Theorem 7 of Maume-Deschamps et al. (2017) that q_{α↑}(Y|X = x) and q_α(Y|X = x) are asymptotically equivalent as α → 1 (Equation (2.14)). A similar equivalence has been easily deduced for α → 0, using the symmetry properties of elliptical distributions. In this paper, we focus on the case α → 1, the case α → 0 being easily deduced. In Section 3, we propose some estimators for the extremal parameters η and ℓ(x). Before that, we need a small simplification. Indeed, Equation (2.13) shows that the extreme quantile estimation requires the prior estimation of the quantities µ_{Y|X} and σ_{Y|X}. These quantities may be easily estimated by the method of moments or a fixed-point algorithm (cf. p. 66 of Frahm (2004)). In a spatial setting, even if the variable Y is not observed, a stationarity assumption on the random field makes it possible to estimate these values (see Cressie (1988)). Furthermore, the rate of convergence of these methods is faster than that of the estimators we propose in this paper, and they therefore do not interfere with the asymptotic results. This is why, in the following, we suppose that µ_{Y|X}, σ_{Y|X}, and therefore µ_X, Σ_X, are known. Then, it remains to estimate η, ℓ(x) and Φ^{-1}_{R*}. Section 3 focuses on η and ℓ(x), while Section 4 deals with Φ^{-1}_{R*}.

Extremal coefficients estimation
In this section, the aim is to estimate the extremal parameters η and ℓ(x) conditionally on the covariate vector X = x. For that purpose, we consider a random sample X_1, ..., X_n, independent and identically distributed from a (ξ, N)-elliptical vector with the same distribution as X, and denote The aim is then to give two suitable estimators η̂ and ℓ̂(x), respectively for η and ℓ(x).
3.1. Estimation of η. We notice that the coefficient η is directly related to the tail index γ. Then, using a suitable estimator of γ, we easily deduce an estimator of η. Several estimators are widespread in the literature. As examples, Pickands (1975), Schultze and Steinebach (1996) or Kratz and Resnick (1996) provide some estimators for γ. In the following, we use the Hill estimator, introduced in Hill (1975) : and k_n = o(n) such that k_n → +∞ as n → +∞. In this context, the statistic W may be :
• The first (or indifferently any) component of the reduced centered covariate vector Λ_X^{-1}(X − µ_X), where Λ_X^T Λ_X = Σ_X. This approach works well, but does not use all available data.
• The Mahalanobis norm (X − µ_X)^T Σ_X^{-1} (X − µ_X). This approach has the advantage of using all available data.
Indeed, according to Theorem 2 of Hashorva (2007b), these two quantities both admit γ as tail index. In the following we will use the one-component approach, since the asymptotic results we give are valid under Assumption 1, applied to the univariate c.d.f. Φ_R. Moreover, numerical comparisons seem to show that the second approach does not significantly improve the estimation of the parameters. The main properties of γ̂_kn may be found in de Haan and Resnick (1998). Under the second order condition given in Assumption 1, de Haan and Ferreira (2006) proved the following asymptotic normality for γ̂_kn.
where λ = lim_{n→+∞} √k_n A(n/k_n) and k_n = o(n) such that k_n → +∞ as n → +∞. Then, using Proposition 2.2 and Equation (3.1), we define the following estimator for η.
Definition 3.1 (Estimator of η). We define η̂_kn as :
As an affine transformation of the Hill estimator, the asymptotic normality of η̂_kn is obvious. In order to simplify the next results, we suppose λ = 0 in what follows.
3.2. Estimation of ℓ(x). The form of ℓ(x), given in Proposition 2.2, leads to a more complicated estimation. Indeed, ℓ(x) is related to both γ and c_N g_N(M(x)). Our estimator for γ is given in Equation (3.1). Concerning c_N g_N(M(x)), we propose a kernel estimator. The class of kernel estimators, introduced in Parzen (1962), makes it possible to estimate probability densities. Then, the following lemma will be useful for the construction of our estimator. This result comes from p. 108 of Johnson (1987).
Using Lemma 3.2, we introduce a kernel estimator ĝ_hn for c_N g_N(M(x)).
Definition 3.2 (Generator estimator). We define ĝ_hn as (3.6) where the kernel K fulfills some conditions given in Parzen (1962) and the bandwidth h_n satisfies h_n → 0 and n h_n → +∞ as n → +∞. Parzen (1962) provided the asymptotic normality for kernel estimators. We first define some assumptions concerning K and g_N needed for the next results.
• (K1) : K is compactly supported on [−1, 1] and bounded. In addition,
• (K2) : In the neighborhood of M(x), g_N is bounded and twice continuously differentiable with bounded derivatives.
The following results may be found in Li and Racine (2007). Under conditions (K1) − (K2), it may be proved that : By adding the condition n h_n^5 → 0 as n → +∞, we also obtain the asymptotic normality : Using the results given above, the following asymptotic normality for ĝ_hn is easily deduced.
Proposition 3.3 (Asymptotic normality of generator estimator). Under conditions (K1) − (K2), and taking a sequence h_n such that h_n → 0, n h_n → +∞ and n h_n^5 → 0 as n → +∞, the following relationship holds : Replacing γ by γ̂_kn and c_N g_N(M(x)) by ĝ_hn in Equation (2.12), we are now able to provide an estimator ℓ̂(x) for ℓ(x), in the following definition. Furthermore, under Assumption 1, we give the asymptotic normality of ℓ̂(x).
Definition 3.3 (Estimator of ℓ(x)). We define ℓ̂_kn,hn(x) as : where γ̂_kn and ĝ_hn are respectively given in Equations (3.1) and (3.6).
Proposition 3.4. Under Assumption 1, conditions (K1) − (K2) and if lim n→+∞ √ k n A n kn = 0, the following asymptotic relationships hold : where (Ψ is the digamma function (see p.258 of Abramowitz et al. (1966) We have the asymptotic normality for our estimatorsη kn andˆ kn,hn (x). The next proposition gives the joint distribution according to the asymptotic relations between k n and h n . The proof derives from delta method.
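The estimation of η in Section 3.1 rests on the Hill estimator of γ. The sketch below uses the classical form of that estimator (the average log-spacing of the k largest observations over the (k+1)-th largest), which is assumed to match Equation (3.1). The affine map η = Nγ + 1 in `eta_from_gamma` is also an assumption, inferred from the simulation study where N = 3 and γ = 1/2 give η = 3/2 + 1. Both function names are hypothetical.

```python
import numpy as np


def hill_estimator(w, k):
    """Classical Hill estimator of the tail index, based on the k largest values of w."""
    w_desc = np.sort(np.asarray(w, dtype=float))[::-1]  # largest value first
    if not 0 < k < w_desc.size:
        raise ValueError("k must satisfy 0 < k < sample size")
    threshold = w_desc[k]  # (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    return float(np.mean(np.log(w_desc[:k] / threshold)))


def eta_from_gamma(gamma_hat, n_covariates):
    """Assumed affine relation eta = N * gamma + 1 (inferred, not quoted from the text)."""
    return n_covariates * gamma_hat + 1.0
```

In the setting of the paper, `w` would be the n realizations of one component of the reduced centered covariate vector, and `k` an intermediate sequence such as `int(n ** 0.6)`.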

Extreme quantiles estimation
In this section, we propose some estimators of extreme quantiles q_αn(Y|X = x), for a sequence α_n → 1 as n → +∞. For that purpose, we divide the study into two cases :
• Intermediate quantiles, i.e. we suppose n(1 − α_n) → +∞. It entails that the estimation of the α_n-quantile leads to an interpolation of sample results.
• High quantiles. According to de Haan and Rootzén (1993), we suppose n(1 − α_n) → 0, i.e. we need to extrapolate sample results to areas where no data are observed.
In both cases, the asymptotic results require some conditions that we will provide throughout the section. The first one brings together the assumptions of Proposition 3.5.
• .
Condition (C) will be common to both approaches, and first ensures that the Hill estimator is asymptotically unbiased, according to Equation (3.2). Moreover, k_n = o(n h_n) means that ĝ_hn converges to c_N g_N(M(x)) faster than γ̂_kn to γ. In practice, this condition seems appropriate, because k_n must not be too large for the Hill estimator to be unbiased, and h_n must be large enough to provide a good estimation of ℓ(x).
4.1. Intermediate quantiles. We consider the case where n(1 − α_n) → +∞ with α_n → 1 as n → +∞.
According to Equation (2.14), we can approximate The idea is then to estimate a quantile of level on the unconditional radius R, easier to deal with. By noticing that n v_n ∼ ℓ(x)^{-1} n(1 − α_n) → +∞ as n → +∞, we introduce the following order-statistic-based estimator q̂_αn(Y|X = x) for q_αn(Y|X = x), inspired by Theorem 2.4.1 in de Haan and Ferreira (2006).
Definition 4.1 (Intermediate quantile estimator). We define (q αn (Y |X = x)) n∈N as : ,η kn andˆ kn,hn (x) are respectively given in Definitions 3.1 and 3.3, and W is the first (or indifferently any) component of the vector Λ X −1 (X − µ X ).
In order to prove the consistency of our estimator, we need a further condition (C_int) concerning the sequences α_n and k_n, useful in the proof.
Obviously, (C_int) contains n(1 − α_n) → +∞, as mentioned above. Furthermore, ln(1 − α_n) = o(√k_n) ensures that the rate of convergence in Theorem 4.1 goes to infinity (see below), and the last relationship allows us to eliminate a term in the proof. In order to make this condition more meaningful, let us propose a simple example: we choose our sequences in polynomial forms k_n = n^b, 0 < b < 1 and α_n = 1 − n^{-a}, a > 0. It is straightforward to see that ln(1 − α_n) = o(√k_n) and ln(n(1 − α_n)) = o(√k_n), ∀a > 0, 0 < b < 1. However, We first give a result concerning the asymptotic behavior of q̂_αn(Y|X = x) with respect to q_{αn↑}(Y|X = x). Then, with Equation (2.14), we easily deduce a consistency result for q̂_αn(Y|X = x).
. Under Assumption 1, and conditions (C), (C int ) : And therefore : The same asymptotic normality with Φ −1 This condition, which seems quite simple, is difficult to prove in a general context. Indeed, we need a second order expansion of Equation (2.14). But the second order properties of the unconditional quantile Φ −1 R given by Assumption 1 are not necessarily the same as those of the conditional quantile Φ −1 R * , which makes the study complicated. However, in some simple cases, we are able to solve the problem. We thus give another assumption, stronger that Assumption 1. In the following, we refer to this assumption for results of asymptotic normality.
It is obvious that Assumption 2 implies Assumption 1. Indeed, according to Hua and Joe (2011), Equation (4.4) is equivalent to say that c 1 g 1 (t 2 ) is regularly varying of second order with indices −1−γ −1 , ρ/γ and an auxiliary function proportional to t ρ γ . Then, Proposition 6 in Hua and Joe (2011) entails Φ R (t) is second order regularly varying with −γ −1 , ρ/γ and the same kind of auxiliary function. Finally, this is equivalent (see de Haan and Ferreira (2006)) to Assumption 1 with indicated γ and ρ, and an auxiliary function A(t) proportional to t ρ . Furthermore, according to Kano (1994), the dependance on d in Equation (4.4) remains coherent with the assumption of consistent elliptical distributions, the latter having to have a function g d depending on d. As an example, the Student distribution fills Assumption 2. The latter allows us to provide a second order expansion for Equation (2.14). In order to prove the asymptotic normality ofq αn (Y |X = x), we add a technical condition C HG int that involves tail indices γ and ρ.
• C^HG_int : (C_int) holds. In addition, , and :
Condition C^HG_int means that the sequence k_n must not be too large. In view of Equation (4.5), it is obvious that if N or γ goes to infinity, C^HG_int is not fulfilled. The tail of the underlying distribution must thus not be too heavy, and the size N of the covariate not too large. Similarly, the condition no longer holds if γ or ρ goes to 0, i.e. if the underlying distribution is too light-tailed, or if its c.d.f. takes too long to behave like λ t^{-1/γ}.
Proposition 4.2 (Asymptotic normality of q̂_αn(Y|X = x)). Assume that Assumption 2 and conditions (C), C^HG_int hold. Then : We notice that the asymptotic variance in Equation (4.2) tends to 0 as the number of covariates N goes to +∞. Indeed, we observe a fast convergence of q̂_αn to q_{αn↑} when N is large. However, C^HG_int is not fulfilled if N is large, and the asymptotic normality (4.6) then no longer holds. This is explained by the fact that the larger N is, the more slowly q_αn(Y|X = x)/q_{αn↑}(Y|X = x) (see Equation (2.14)) tends to 1.
4.2. High quantiles. We now consider n(1 − α_n) → 0 as n → +∞. In the following definition, we introduce another quantile estimator q̂_αn(Y|X = x) for q_αn(Y|X = x). We first recall that the idea is to estimate an unconditional quantile of level 1 − v_n = 1 − [2 + ℓ(x)((1 − α_n)^{-1} − 2)]^{-1}. A quick calculation proves that v_n is asymptotically equivalent to ℓ(x)^{-1}(1 − α_n), and therefore n v_n → 0 as n → +∞. The use of the order statistic (at level n v_n) is then impossible in that case. According to Theorem 4.3.8 in de Haan and Ferreira (2006), a way to estimate such a quantile may be to take the order statistic at the intermediate level k_n (we recall k_n → +∞), and apply an extrapolation coefficient (k_n/(n v_n))^γ. This approach inspired the following estimator.
Definition 4.2 (High quantile estimator). We define q αn (Y |X = x) n∈N as : The aim is now to study the asymptotic properties ofq αn (Y |X = x). As for the intermediate quantile estimator, we propose a result of asymptotic normality, under a condition (C high ) (given below) which we then refine under Assumption 2.
The second statement is added in order to apply Theorem 4.3.8 in de Haan and Ferreira (2006), and the third one is a notation used in the following. Let us propose a simple example: if we choose our sequences in polynomial forms k_n = n^b, 0 < b < 1 and α_n = 1 − n^{-a}, a > 0, the first condition is fulfilled if and only if a > 1, ln(n(1 − α_n)) = o(√k_n), and the last assertion holds with a particular θ given later. The consistency result that immediately follows is given just below.
And therefore : We can emphasize that condition (C_high) is fulfilled in most common cases. Indeed, the simplest examples that do not satisfy (ii) are of the form α_n = 1 − n^{-1} ln(n)^{-κ}, κ > 0 and k_n = ln(n). But such a choice of sequences would lead to a poor estimation of γ̂_kn and η̂_kn, since k_n → +∞ very slowly, and moreover to a poor estimation of the quantile, the level α_n tending to 1 slowly. These sequences are therefore not recommended in practice. The next corollary gives the value of θ when the sequences k_n and α_n have a polynomial form.
Assumption 2 places us in a framework where it is quite simple to prove it, if we add the following condition :
• C^HG_high : (C_high) holds. In addition,
Like C^HG_int, condition C^HG_high means that the sequence k_n must be small enough. In view of Equation (4.10), we deduce that if N or γ goes to infinity, C^HG_high is not fulfilled. The tail of the underlying distribution must thus not be too heavy, and the size N of the covariate not too large. Similarly, the condition no longer holds if γ or ρ goes to 0. By combining Assumption 2 and C^HG_high, the following result is obtained.
Proposition 4.5 (Asymptotic normality of high quantile estimator). Assume that Assumption 2 and conditions (C), C HG high hold. Then : We can make the same kind of remark as in the previous subsection when N is large. In the following, we give estimators for two other classes of extreme risk measures, based on the estimators given in Equations (4.1) and (4.7). The first one generalizes quantiles.
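The extrapolation idea of Section 4.2 can be sketched on an unconditional sample as follows. This is only the generic step described in the text (order statistic at the intermediate level k_n, multiplied by (k_n/(n v_n))^γ̂ with γ̂ the Hill estimate), not the full conditional estimator of Definition 4.2, whose exact expression is not reproduced here. The function names are hypothetical, and `exceedance_level` transcribes the expression of v_n given in the text with ℓ(x) passed as a plain number.

```python
import numpy as np


def exceedance_level(alpha, ell):
    """v_n = [2 + ell * ((1 - alpha)^(-1) - 2)]^(-1), as read from Section 4.2."""
    return 1.0 / (2.0 + ell * (1.0 / (1.0 - alpha) - 2.0))


def extrapolated_quantile(w, k, v):
    """Estimate the quantile of level 1 - v from the k upper order statistics of w.

    Assumes a heavy right tail and a positive (k+1)-th largest observation.
    """
    w_desc = np.sort(np.asarray(w, dtype=float))[::-1]
    n = w_desc.size
    anchor = w_desc[k]  # order statistic at the intermediate level k
    gamma_hat = np.mean(np.log(w_desc[:k] / anchor))  # Hill estimate
    return float(anchor * (k / (n * v)) ** gamma_hat)
```

For instance, `extrapolated_quantile(w, k, exceedance_level(alpha_n, ell))` would target a level at which n v_n is below 1, that is, a region where no observation is expected. The estimate inherits the variability of the Hill estimator, amplified by the logarithm of the extrapolation factor.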

Some extreme risk measures estimators
5.1. Lp-quantiles. Let Z be a real random variable. The Lp-quantile of Z with level α ∈ ]0, 1[ and p > 0, denoted q_{p,α}(Z), is the solution of the minimization problem (see Chen (1996)) : where Z_+ = Z 1_{Z>0}. According to Koenker and Bassett (1978), the case p = 1 leads to the quantile q_{1,α}(Z) = F_Z^{-1}(α), where F_Z is the c.d.f. of Z. The case p = 2, formalized in Newey and Powell (1987), leads to more complicated calculations, and admits, with the exception of some particular cases (see, e.g., Koenker (1992)), no general formula. The general case p ≥ 1 has seen some recent advances. Bellini et al. (2014) have shown that Lp-quantiles satisfy the translation equivariance and positive homogeneity properties for p > 1. More recently, the particular case of Student distributions has, for example, been explored in Bernardi et al. (2017). However, it seems difficult to obtain a general formula. On the other hand, in the case of extreme levels α, i.e. when α tends to 1, Daouia et al. (2017b) proved that the following relationship holds, for a heavy-tailed random variable with tail index γ.
where B(., .) is the beta function. We add that for a Pareto-type distribution with tail index γ, the Lp-quantile exists if and only if the moment of order p − 1 exists, i.e. if γ < 1/p. The expectile case p = 2 leads to the result of Bellini et al. (2014). Using this result, we can estimate the conditional Lp-quantiles from the quantile estimated in Section 4. For that purpose, we need to know the tail index of the conditional radius R*, given in the following lemma.
Lemma 5.1. The conditional distribution Y |X = x is attracted to a maximum domain of Pareto-type distribution with tail index (γ −1 + N ) −1 , i.e With Lemma 5.1 and Equation (5.2), we define the following estimators for the L p −quantile of Y |X = x, according to whether if n(1 − α n ) tends to 0 or +∞.
Definition 5.1. Let (α n ) n∈N be a sequence such that α n → 1 as n → +∞. If either p ≤ N or γ < 1 p−N , we define: kn nṽn whereγ kn andṽ n are respectively given in Equation (3.1) and Theorem 4.1.
We have proved the convergence in probability ofq αn (Y |X = x) andq αn (Y |X = x). Furthermore, the convergence in probability of the asymptotic term, and consequently the empirical L p −quantile is not difficult to get, this is why we omit the proof.
Proposition 5.2 (Consistency of L p −quantile estimators). Assume that Assumption 1 and condition (C) hold. Under conditions (C int ) and (C high ) respectively,q p,αn (Y |X = x) andq p,αn (Y |X = x) are consistent, i.e. : Using the second order expansion of Equation (5.2) given in Daouia et al. (2017b), and doing some stronger assumptions, we can deduce the following asymptotic normality results. For that purpose, let us add two conditions.
• C^Lp_int : (C_int) holds. In addition, √k_n (1 − α_n) = o(ln(1 − α_n)), and :
• C^Lp_high : (C_high) holds. In addition,
These conditions will be used below. If we compare C^Lp_int and C^Lp_high with C^HG_int and C^HG_high respectively, the sequence k_n must be chosen smaller. Finally, we can draw the same conclusions as above, i.e. these conditions are applicable for regularly varying distributions with an intermediate level γ, and a small number of covariates N. To sum up, among all these conditions, we can deduce the following ordering : .
Proposition 5.3 (Asymptotic normality of Lp-quantile estimators). Assume that Assumption 2 and condition (C) hold. Under conditions C^Lp_int and C^Lp_high respectively, and if p > 1, then : An example of L2-quantile, or expectile, is provided in Section 6. The second risk measure we focus on is called the Haezendonck-Goovaerts risk measure.
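As an illustration of the definition recalled in Section 5.1, the following sketch computes an empirical Lp-quantile by direct minimization. It assumes the usual asymmetric power loss, (1 − α) E[(z − Z)_+^p] + α E[(Z − z)_+^p], which is taken here as the form of the minimization problem of Chen (1996) and is convex in z for p ≥ 1. The function name `empirical_lp_quantile` is hypothetical, and a plain golden-section search is used to avoid any dependency beyond NumPy.

```python
import numpy as np


def empirical_lp_quantile(sample, alpha, p, tol=1e-10):
    """Minimize the empirical asymmetric power loss over z (assumes p >= 1)."""
    z = np.asarray(sample, dtype=float)

    def loss(q):
        below = np.clip(q - z, 0.0, None) ** p  # (q - Z)_+^p
        above = np.clip(z - q, 0.0, None) ** p  # (Z - q)_+^p
        return np.mean((1.0 - alpha) * below + alpha * above)

    # Golden-section search on [min, max]; valid because the loss is convex
    lo, hi = float(z.min()), float(z.max())
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if loss(left) < loss(right):
            hi = right
        else:
            lo = left
    return 0.5 * (lo + hi)
```

With p = 1 the output is an empirical quantile, and with p = 2 an empirical expectile. Such a direct minimization is only meaningful at levels where the sample still contains observations; the extreme levels considered in this paper call for the extrapolation-based estimators instead.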

Haezendonck-Goovaerts risk measures.
Let Z be a real random variable, and ϕ a non-negative and convex function with ϕ(0) = 0, ϕ(1) = 1 and ϕ(+∞) = +∞. The Haezendonck-Goovaerts risk measure of Z with level α ∈ ]0, 1[ associated with ϕ is given by the following (see Tang and Yang (2012)) : where H_α(Z, z) is the unique solution h to the equation : ϕ is called a Young function. This family of risk measures was first introduced as the Orlicz risk measure in Haezendonck and Goovaerts (1982), then as the Haezendonck risk measure in Goovaerts et al. (2004), and finally as the Haezendonck-Goovaerts risk measure in Tang and Yang (2012). According to Bellini and Rosazza Gianin (2008), such a risk measure is coherent, and therefore translation equivariant and positively homogeneous. The particular case ϕ(t) = t leads to the Tail Value at Risk with level α, TVaR_α(Z), introduced in Artzner et al. (1999). In the following, we denote by H_{p,α}(Z) the Haezendonck-Goovaerts risk measure of Z with a power Young function t^p, p ≥ 1. In Tang and Yang (2012), the authors provided the following result.
Definition 5.2. Let (α n ) n∈N be a sequence such that α n → 1 as n → +∞. If either p ≤ N or γ < 1 p−N , we define : kn nṽn The condition p ≤ N or γ < 1 p−N simply ensures the existence of H p,αn (Y |X = x). Using the consistency results given in Propositions 4.1 and 4.3, the consistency of these estimators is immediate. The proof is also omitted from the appendix.
Proposition 5.5 (Consistency of H-G estimators). Assume that Assumption 1 and condition (C) hold.
Under conditions (C int ) and (C high ) respectively,Ĥ p,αn (Y |X = x) andĤ p,αn (Y |X = x) are consistent, i.e. : Proposition 5.6 (Asymptotic normality of H-G estimators). Assume that Assumption 2 and condition (C) hold. Under conditions C HG int and C HG high respectively, we have : (5.14) We can emphasize that conditions for asymptotic normality are less strong in the case of Haezendonck-Goovaerts risk measures. We propose some examples (with p = 1, i.e TVaR) in Sections 6 and 7.
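For the case p = 1 (TVaR), the idea of deducing an extreme risk measure from an extreme quantile can be illustrated with a classical fact that is assumed here rather than quoted from the results above: for a regularly varying tail with index γ < 1, the ratio TVaR_α / VaR_α tends to 1/(1 − γ) as α → 1. The sketch compares this plug-in value with an empirical TVaR on a sample; the function names are hypothetical, and `gamma` stands for the tail index of whichever distribution is under study.

```python
import numpy as np


def empirical_tvar(sample, alpha):
    """Average of the observations at or above the empirical alpha-quantile."""
    z = np.asarray(sample, dtype=float)
    var_alpha = np.quantile(z, alpha)
    return float(z[z >= var_alpha].mean())


def tvar_from_quantile(quantile, gamma):
    """Plug-in TVaR based on the asymptotic ratio 1 / (1 - gamma), for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1) for the TVaR to be finite")
    return quantile / (1.0 - gamma)
```

In practice the quantile and the tail index would both be replaced by estimates, for example an extrapolated quantile and a Hill-type estimate, so the plug-in value inherits the error of both.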

Simulation study
In this section, we apply our estimators to 100 samples of n simulations of a Student vector Z = (X, Y ) ∈ R 4 (X ∈ R 3 and Y ∈ R) with ν = 2 degrees of freedom, and compare with theoretical results. According to de Haan and Ferreira (2006), the Student distribution with ν degrees of freedom fills Assumption 1 with indices γ = 1/ν, ρ = −2/ν, and an auxiliary function A(t) proportional to t −2/ν . The latter even fills Assumption 2, and is the only heavy-tailed elliptical distribution (to our knowledge) where we can obtain closed formula for conditional quantiles. In addition, such a degree of freedom makes the tail of the distribution sufficiently heavy to easily observe the asymptotic results. We can notice that the unconditional distribution Y has tail index 1/2, then, using Lemma 5.1, the conditional distribution Y |X = x has tail index 2/7 < 1/2, and admits quantile, expectile (L 2 −quantile) and TVaR. This section beeing uniquely devoted to the performance of our estimators, we take for conveniance µ = 0 R 4 and Σ = I 4 . Let us now estimate the extreme quantiles of Y |X = x. For that purpose, we have to chose an arbitrary value of x. We thus suppose for example that the observed covariates x satisfy M (x) = 1. 6.1. Choice of parameters. As mentioned in Sections 3 and 4, the asymptotic results obtained are sensitive to the choice of sequences k n , h n , α n , and to a lesser extent to the kernel K. The latter will be the gaussian p.d.f in the following. Concerning the sequences, we propose in this section to consider the polynomial forms α n = 1 − n −a , a > 0, k n = n b , b > 0 and h n = n −c , c > 0. In order to deal with high quantiles, we fix in a first time a = 1.25. We now have to chose carefully the parameters b and c, fulfilling the conditions (C), (C high ) and C HG high . (C) imposes b < 1−c, b < 4c and b < 4/(ν +4) = 2/3, (C high ) is satisfied with θ = a/(a + b − 1) (see Corollary 4.4), C HG high entails b ≤ 2a = 2.5 and b ≤ 4a/(N + ν) = 1. 
Finally, it seems reasonable to choose $b$ as large as possible and $c$ as small as possible. The choices $b = 0.6$ and $c = 0.2$ seem to be a good compromise.
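The constraints quoted above for the Student example can be checked numerically. The following Python lines are only a minimal sketch: the function names `check_sequences` and `sequences` are hypothetical, and the inequalities are copied from the text of this subsection rather than from the general statements of the conditions.

```python
def check_sequences(a, b, c, nu, N):
    """Evaluate the inequalities quoted in Section 6.1 for the Student example.

    Hypothetical helper: returns a dict mapping each constraint to True/False.
    """
    return {
        "b < 1 - c": b < 1 - c,                    # condition (C)
        "b < 4c": b < 4 * c,                       # condition (C)
        "b < 4/(nu + 4)": b < 4 / (nu + 4),        # condition (C)
        "b <= 2a": b <= 2 * a,                     # condition (C^HG_high)
        "b <= 4a/(N + nu)": b <= 4 * a / (N + nu)  # condition (C^HG_high)
    }


def sequences(n, a, b, c):
    """Polynomial sequences alpha_n = 1 - n^(-a), k_n = n^b, h_n = n^(-c)."""
    return 1 - n ** (-a), n ** b, n ** (-c)


# Values used in the simulation study: a = 1.25, b = 0.6, c = 0.2, nu = 2, N = 3.
checks = check_sequences(a=1.25, b=0.6, c=0.2, nu=2, N=3)
theta = 1.25 / (1.25 + 0.6 - 1)  # theta = a / (a + b - 1), see Corollary 4.4
alpha_n, k_n, h_n = sequences(10_000, a=1.25, b=0.6, c=0.2)

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

With these values every constraint holds, whereas a larger exponent such as $b = 0.9$ would violate the conditions $b < 1 - c$, $b < 4c$ and $b < 2/3$.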
6.2. Extremal parameters estimation. The next step is to estimate the quantities $\eta$ and $\ell(x)$. For that purpose, we use our estimators $\hat{\eta}_{k_n}$ and $\hat{\ell}_{k_n,h_n}(x)$, respectively introduced in Equations (3.3) and (3.10). These two estimators are related to the Hill estimator $\hat{\gamma}_{k_n}$, and the asymptotic results of Section 3 hold only if the data are independent. This is why we estimate $\gamma$ using only the $n$ realizations of the first component of the vector $Z$.
Figure 1. From left to right: boxplots of 100 estimators $\hat{\eta}_{k_n}$ and $\hat{\ell}_{k_n,h_n}(x)$, for different sample sizes $n$. Theoretical values are in red. The chosen sequences are $k_n = n^{0.6}$, $h_n = n^{-0.2}$ and $\alpha_n = 1 - n^{-1.25}$.
Figure 1 shows the boxplots of our estimators $\hat{\eta}_{k_n}$ and $\hat{\ell}_{k_n,h_n}(x)$. In this example, the theoretical value of $\eta$ is $3/2 + 1 = 2.5$, and $\ell(x)$ is equal to 5.292757 (cf. Table 1).
6.3. Extreme risk measures estimation. It remains to estimate the conditional quantiles, expectiles and TVaRs of $Y|X = x$. Theoretical formulas (or algorithms) for conditional quantiles and expectiles may be found in Maume-Deschamps et al. (2017) and Maume-Deschamps et al. (2018). Furthermore, using straightforward calculations, formulas for Tail-Value-at-Risk may be obtained.
where $\Phi_\nu$ is the c.d.f. of a Student distribution with $\nu$ degrees of freedom. In order to give an idea of the performance of our estimator, we provide in Figure 2 box plots of 100 relative errors (based on sample sizes $n$ from 1 000 to 10 000 000) of our quantile estimator (4.7) with $\alpha_n = 1 - n^{-1.25}$. Finally, we would like to compare these results with other estimators already in use. The most common methods for estimating conditional quantiles and expectiles are respectively quantile and expectile regression, introduced in Koenker and Bassett (1978) and Newey and Powell (1987). In Maume-Deschamps et al. (2017) and Maume-Deschamps et al. (2018), we have shown that such an approach leads to poor estimation at extreme levels. Indeed, in this example, a quantile regression estimator will converge to $\Phi_\nu^{-1}(\alpha_n) = 1530.15$, very far from the theoretical value 7.31. Since the quantile regression estimator does not assume any structure on the underlying distribution, it is clearly less efficient than the tailored extreme quantile estimators introduced in this paper.
Figure 2. Box plots of 100 relative errors $\hat{q}_{\alpha_n}(Y|X = x)/q_{\alpha_n}(Y|X = x) - 1$ (based on sample sizes $n$ from 1 000 to 10 000 000) with $\alpha_n = 1 - n^{-1.25}$, $k_n = n^{0.6}$ and $h_n = n^{-0.2}$.
It may also be interesting to compare the empirical variance of our estimator with the asymptotic result given in Proposition 4.5. Furthermore, the latter allows us to provide confidence intervals for $q_{\alpha_n}(Y|X = x)$. We thus introduce the notation $\hat{\zeta}_n$ for the empirical variance of = 0.0005536332 in this section. In addition, we denote by $m_n$ the number of times the theoretical value $q_{\alpha_n}(Y|X = x)$ lies in the 95% confidence interval. Table 2 gives an overview of the behavior of these quantities according to $n$.
Table 2. Empirical variance $\hat{\zeta}_n$ and number $m_n$ of confidence intervals containing the theoretical value, for 100 estimates $\hat{q}_{\alpha_n}(Y|X = x)$ of $q_{\alpha_n}(Y|X = x)$, with $n$ ranging from 1 000 to 10 000 000. Chosen sequences are $\alpha_n = 1 - n^{-1.25}$, $k_n = n^{0.6}$ and $h_n = n^{-0.2}$.
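The quantities reported in Figure 2 and Table 2 (relative errors, empirical variance and coverage count $m_n$) can be computed along the following lines. This Python sketch is only illustrative: the helper names are hypothetical, and the normalizing rate `rate` and the asymptotic variance `asym_var` are assumed to be supplied by the user from Proposition 4.5, since their explicit expressions are not reproduced here.

```python
import math
import statistics


def relative_errors(estimates, true_value):
    """Relative errors q_hat / q - 1 for a list of quantile estimates."""
    return [q_hat / true_value - 1.0 for q_hat in estimates]


def empirical_variance(estimates, true_value, rate):
    """Sample variance of the normalized relative errors rate * (q_hat / q - 1)."""
    scaled = [rate * err for err in relative_errors(estimates, true_value)]
    return statistics.variance(scaled)


def coverage_count(estimates, true_value, rate, asym_var, z=1.96):
    """Number of estimates whose asymptotic 95% interval contains the true value.

    Assumes rate * (q_hat / q - 1) is approximately N(0, asym_var), so the true
    value is covered when |q_hat / q - 1| <= z * sqrt(asym_var) / rate.
    """
    half_width = z * math.sqrt(asym_var) / rate
    return sum(abs(err) <= half_width
               for err in relative_errors(estimates, true_value))


# Toy illustration with made-up estimates around a true value of 7.31.
toy_estimates = [6.5, 7.31, 7.4]
m = coverage_count(toy_estimates, true_value=7.31, rate=10.0, asym_var=0.04)
print(m)
```

The toy estimates are invented for the illustration and are not results from the simulation study.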
Finally, based on these quantile estimates, we deduce, using Definitions 5.1 and 5.2, $L_2$-quantile (or expectile) and Tail-Value-at-Risk estimates.
In the previous figures, only the first component of the vector is used to estimate the tail index. There is therefore some loss of information. We have suggested another approach in Section 3. Furthermore, Resnick and Stărică (1995) and Hsing (1991) proved that the Hill estimator may also work with dependent data. Thus it would be possible to improve the estimator $\hat{\gamma}_{k_n}$ by adding the other components of the vector in Equation (3.1), but in that case the asymptotic results of Propositions 3.1 and 3.3 would no longer hold.
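For illustration, the tail index estimation from a single component can be sketched in Python as follows. This is a minimal sketch using the classical Hill estimator on the $k_n$ largest observations of a univariate Student sample; the function names, the seed and the sample size are assumptions made for the example, and the code is not the exact procedure of Equation (3.1).

```python
import math
import random


def student_sample(n, nu, rng):
    """Draw n i.i.d. Student variates with nu degrees of freedom.

    Uses the representation Z / sqrt(chi2_nu / nu), with chi2_nu ~ Gamma(nu/2, scale 2).
    """
    return [rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)
            for _ in range(n)]


def hill_estimator(data, k):
    """Classical Hill estimator based on the k largest observations."""
    top = sorted(data, reverse=True)[: k + 1]
    threshold = top[k]  # (k+1)-th largest value, used as random threshold
    if threshold <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    return sum(math.log(x / threshold) for x in top[:k]) / k


rng = random.Random(0)      # fixed seed, for reproducibility of the sketch
n, nu = 100_000, 2
k_n = round(n ** 0.6)       # k_n = n^b with b = 0.6
gamma_hat = hill_estimator(student_sample(n, nu, rng), k_n)
print(gamma_hat)            # should be close to the theoretical value 1/nu = 0.5
```

The threshold is the $(k_n + 1)$-th largest observation, so only the upper tail of the component contributes to the estimate.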

Real data example
As an application, we use the daily market returns (computed from the closing prices) of financial assets from 2006 to 2016, available at http://stanford.edu/class/ee103/portfolio.html. We focus on the first four assets, i.e. iShares Core U.S. Aggregate Bond ETF, PowerShares DB Commodity Index Tracking Fund, WisdomTree Europe SmallCap Dividend Fund and SPDR Dow Jones Industrial Average ETF, which will be our covariate $X$. Figure 4 represents the daily return for each day. The reason for focusing solely on the value of these assets could be, for example, that they are the first available every day. The aim would be to anticipate the behavior of another asset on another market. We thus consider the return of WisdomTree Japan Hedged Equity Fund as the random variable $Y$. The size of the sample is 2520. The first 2519 days (from January 3, 2007 to December 5, 2016) will be our learning sample, and we focus on the 2520th day, when the covariate $X$ is $x = (-0.0185\%, -0.4464\%, 0.9614\%, 0.1405\%)$. Pending the opening of the second market, let us estimate the quantile of the return $Y$ given $X = x$. After a brief study of the autocorrelation functions, we consider that the daily returns can be treated as independent. Concerning the shape of the data, histograms of the marginals seem symmetrical. Furthermore, the measured tail index is approximately the same for the 4 marginals. This is why we suppose that the data are elliptical. After having estimated $\mu$ and $\Sigma$ by the method of moments, we get $M(x) = 1.072952$. We apply our estimators $\hat{\eta}_{k_n}$ and $\hat{\ell}_{k_n,h_n}(x)$ given in Equations (3.3) and (3.10). We take as sequences $k_n = n^{0.6}$ ($b = 0.6$) and $h_n = n^{-0.2}$ ($c = 0.2$), and as kernel $K$ the Gaussian p.d.f., and we deduce the asymptotic confidence bounds from Equation (3.14). We then obtain $\hat{\eta}_{k_n} = 2.617846$ and $\hat{\ell}_{k_n,h_n}(x) = 6.44334$. Let us now estimate the high quantile $q_{\alpha_n}(Y|X = x)$ with level $\alpha_n = 1 - n^{-a}$, $a > 1$.
In order to minimize the asymptotic variance of Equation (4.11), we choose $a = (1 - b)(\hat{\gamma}_{k_n} + 1) = 1.047146$. By applying estimator (4.7), we get a quantile of level 0.9997256 close to 3.744985% for $Y|X = x$. In other words, before the opening of the second market, we consider that, given the returns of our first four assets, the return of WisdomTree Japan Hedged Equity Fund has a probability 0.9997256 of being less than 3.744985%. For information, the true return that day was 0.7141%.
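The level quoted above follows directly from $n = 2519$ and $a = 1.047146$. The short Python sketch below (with hypothetical variable names) recomputes $\alpha_n$, the sequences $k_n$ and $h_n$, and the expected number of exceedances $n(1 - \alpha_n) = n^{1-a}$, which is smaller than one and thus shows that the target quantile lies beyond the range of the learning sample.

```python
n = 2519                  # size of the learning sample
a, b, c = 1.047146, 0.6, 0.2

alpha_n = 1.0 - n ** (-a)                # quantile level alpha_n = 1 - n^(-a)
k_n = n ** b                             # number of upper order statistics
h_n = n ** (-c)                          # kernel bandwidth
expected_exceedances = n * (1.0 - alpha_n)  # equals n^(1 - a)

print(f"alpha_n = {alpha_n:.7f}")
print(f"k_n = {k_n:.1f}, h_n = {h_n:.4f}")
print(f"expected exceedances = {expected_exceedances:.4f}")
```

Since $a > 1$, the expected number of sample points above the target quantile is below one, which is why an extrapolation based on the extremal parameters is needed instead of an empirical quantile.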

Conclusion
In this paper, we propose two estimators $\hat{\ell}_{k_n,h_n}(x)$ and $\hat{\eta}_{k_n}$, respectively for the extremal parameters $\ell(x)$ and $\eta$ introduced in Equation (2.13). We have proved their consistency and asymptotic normality according to the asymptotic relationships between the sequences $k_n$ and $h_n$. Using these estimators, we have defined estimators for intermediate and high quantiles, proved their consistency, given their asymptotic normality under stronger conditions, and deduced estimators for extreme $L_p$-quantiles and Haezendonck-Goovaerts risk measures. Consistency and asymptotic normality are also provided for these estimators, under conditions. We have also illustrated the performance of our estimators with a numerical example, and applied them to a real data set. As working perspectives, we intend to propose a method for the optimal choice of the sequences $k_n$ and $h_n$, which is not fully discussed in this paper. Furthermore, the form of $\ell(x)$ and $\eta$ outside Assumption 1 is a current research topic. More generally, the asymptotic relationships between conditional and unconditional quantiles in other maximum domains of attraction may be developed, using for example the results of Hashorva (2007a). However, this requires a second-order refinement, just as a second-order refinement of Equation (2.14) is needed to establish the asymptotic normality results of Propositions 4.2 and 4.5 under weaker assumptions than Assumption 2. Finally, it seems that the ratio of the two terms in Equation (2.14) tends to 1 more and more slowly as the covariate vector size $N$ grows. Our estimation approach may therefore perform poorly if $N$ is large, and it might be wise to propose another method in that case.

Appendix
Proof of Lemma 2.1.
Replacing $\eta$ in the previous equation, $\ell(x)$ is easily deduced:
Proof of Proposition 3.4. It is obvious that under conditions (K1)-(K2), $\sqrt{k_n}\left(\hat{g}_{h_n} - c_N g_N(M(x))\right) \overset{P}{\to} 0$ as $n \to +\infty$ if $k_n = o(nh_n)$ and $\sqrt{k_n} h_n^2 \to 0$. Then we get the following asymptotic normality:
Since $\ell(x) = u(\gamma)$, the delta method entails
A quick calculation of $u'$, using Equation (2.12), gives the first result. The second part of the proof is similar. Indeed, if $nh_n = o(k_n)$ and $nh_n^5 \to 0$ as $n \to +\infty$, then
The delta method completes the proof. □
In order to make the proof of Theorem 4.1 easier to read, we give the following lemma, which provides the asymptotic behavior of an order statistic under Assumption 1.
The delta method entails that the second term tends to $N(0, \gamma^2)$. Moreover, Assumption 1 and $\sqrt{k_n} A(n/k_n) \to 0$ as $n \to +\infty$ ensure that the first term vanishes asymptotically.
Proof of Theorem 4.1. First, we can notice that $\tilde{v}_n$ is related to $\hat{\ell}_{k_n,h_n}(x)$. Then, according to Proposition 3.4, (i) entails that we can deal with $v_n$ instead of $\tilde{v}_n$ in Equation (4.2). Furthermore, we give the decomposition:
Under Assumption 1, and according to Proposition 3.1 and Theorem 2.4.1 in de Haan and Ferreira (2006) (with (C)), we have:
By noticing that $v_n$ is equivalent to $\ell(x)^{-1}(1 - \alpha_n)$ as $n \to +\infty$, and using condition 1/η̂ kn −1/η → 1 as $n \to +\infty$, and therefore the first term of the decomposition tends to 0. It thus remains to calculate the limit of the second term. It is not complicated to notice that we get the result (4.2). Using the asymptotic relationship (2.14), the consistency 4.3 is obvious. □
Proof of Proposition 4.2. We recall that the density associated with $\Phi_{R^*}$ is proportional to $c_{N+1} g_{N+1}\left(M(x) + t^2\right)$, and, from Assumption 2, there exist $\lambda_1, \lambda_2 \in \mathbb{R}$ such that:
The previous expression may be rewritten as follows, where $\lambda_1, \lambda_2, \lambda_3 \in \mathbb{R}$:
In order to make the proof more readable, we do not specify the values of the constants $\lambda_i$, because they are not essential. Then, in the case $\rho/\gamma \le -2$, we get
In other terms, $c_{N+1} g_{N+1}\left(M(x) + t^2\right)$ is regularly varying of second order with indices $-N - 1 - \gamma^{-1}$, $-2$, and an auxiliary function proportional to $t^{-2}$. According to Proposition 6 of Hua and Joe (2011), $\bar{\Phi}_{R^*}(t) = \int_t^{+\infty} c_{N+1} g_{N+1}\left(M(x) + u^2\right) du \in 2RV_{-N-\gamma^{-1},-2}$ with an auxiliary function proportional to $t^{-2}$. Equivalently, there exist $\lambda_1, \lambda_2 \in \mathbb{R}$ such that
Since Assumption 1 and Assumption 2 provide Φ −1 , for some constants $\lambda_1, \lambda_2 \in \mathbb{R}$. In that case, we considered $\rho \le -2\gamma$, hence $-\rho > 2\gamma/(\gamma N + 1)$. We then deduce the following expansion:
, for a certain constant $\lambda \in \mathbb{R}$. We can notice that $(1 - \alpha_n)/v_n = 2(1 - \ell(x))(1 - \alpha_n) + \ell(x)$, and let us now focus on the limit:
The first term is easy to calculate. Indeed, since $\sqrt{k_n}(1 - \alpha_n)/\ln(1 - \alpha_n) \to 0$ as $n \to +\infty$, we deduce
By a similar calculation, the second term also tends to 0, supposing √ kn γN +1 → 0 as $n \to +\infty$. Then, we deduce, using Proposition 4.1:
Now, let us focus on the case $\rho/\gamma > -2$. The proof is exactly the same, with
Using the same calculations and making the further assumption lim
Proof of Theorem 4.3. Firstly, we can notice that $\frac{k_n}{n v_n}$
Furthermore, according to Theorem 4.3.8 in de Haan and Ferreira (2006)
From Assumption 1, it is not difficult to prove that $\ln \Phi_R^{-1}(1 - v_n) / \ln\left(k_n/(n v_n)\right)$ is asymptotically equivalent to $\gamma \ln(1 - \alpha_n) / \ln\left(n(1 - \alpha_n)/k_n\right)$. Then, if we focus on the second term, we obtain, using the limit given in $(C_{high})$:
When $n \to \infty$, this expression is the sum of the following bivariate normal distribution:
To conclude, $\ln\frac{k_n}{n v_n} \sim \ln\frac{k_n}{n(1 - \alpha_n)}$ as $n \to +\infty$, hence the result. The consistency is immediate. □
Proof of Proposition 4.5. The proof is similar to that of Proposition 4.2. Indeed, we have given, in the case $\rho/\gamma \le -2$:
, for a certain constant $\lambda \in \mathbb{R}$. It thus remains to calculate (1 − α n ) = 0.
Now, let us focus on the case $\rho/\gamma > -2$. The proof is exactly the same, with , $\lambda \in \mathbb{R}$.
Using the same calculations and making the further assumption lim
Proof of Lemma 5.1. The density of $Y|X = x$ is given by
where $M(x) = (x - \mu_X)^\top \Sigma_X^{-1} (x - \mu_X)$. In order to simplify, we consider the reduced and centered case, i.e. $\mu_{Y|X} = 0$ and $\sigma_{Y|X} = 1$. A quick calculation gives
Equation (2.10) leads to
Proof of Proposition 5.3. We first recall that condition $(C^{L_p}_{int})$ entails $(C^{HG}_{int})$. We have the following decomposition:
We know that $f_L\left((\hat{\gamma}_{k_n}^{-1} + N)^{-1}, p\right)$, as a function of $\hat{\gamma}_{k_n}$, is asymptotically normal with rate $\sqrt{k_n}$ (see Equation (3.2)). Then, the first term in the sum clearly tends to 0 as $n \to +\infty$. Using Proposition 4.2, the second term tends to the normal distribution given in (4.6). Finally, we have to check that the third term tends to 0. For that purpose, we use the second-order expansion given in Daouia et al. (2017b):
$$\frac{q_{p,\alpha_n}\left(R^* U^{(1)}\right)}{f_L\left((\gamma^{-1} + N)^{-1}, p\right) q_{\alpha_n}\left(R^* U^{(1)}\right)} = 1 - (\gamma^{-1} + N)^{-1} r(\alpha_n, p) + (\lambda + o(1)) A^*\left(\frac{1}{1 - \alpha_n}\right),$$
where
$$r(\alpha_n, p) = \lambda_1 \frac{1}{q_{\alpha_n}\left(R^* U^{(1)}\right)} \left(E\left[R^* U^{(1)}\right] + o(1)\right) + \lambda_2 A^*\left(\frac{1}{1 - \alpha_n}\right)(1 + o(1)),$$
$\lambda, \lambda_1, \lambda_2 \in \mathbb{R}$ do not depend on $n$, and $A^*(t)$ is the auxiliary function of $\Phi_{R^*}\left(1 - \frac{1}{t}\right)$. It is important to point out that the conditional distribution $R^* U^{(1)}$ is regularly varying with tail index $\gamma^{-1} + N > 1$; then $E\left[R^* U^{(1)}\right]$ exists and, $R^* U^{(1)}$ being symmetric, equals 0. Then, a sufficient condition for asymptotic normality may be
We know, using Assumption 2 and the proof of Proposition 4.5, that $q_{\alpha_n}\left(R^* U^{(1)}\right) = \Phi_{R^*}^{-1}(\alpha_n)$ is asymptotically proportional to $(1 - \alpha_n)^{-\frac{\gamma}{\gamma N + 1}}$, while $A^*\left(\frac{1}{1 - \alpha_n}\right)$ is asymptotically proportional to $(1 - \alpha_n)^{-\frac{\rho}{\gamma N + 1}}$ if $\rho > -2\gamma$ and to $(1 - \alpha_n)^{\frac{2\gamma}{\gamma N + 1}}$ otherwise. Finally, it is not difficult to check that $(C^{L_p}_{int})$ leads to the nullity of the two limits, and therefore of the third term of the decomposition, hence the result.
The proof is exactly the same for the second normality, replacing the first estimator of $q_{p,\alpha_n}(Y|X = x)$ by the second one, $\ln(1 - \alpha_n)$ by $\ln\frac{k_n}{n(1 - \alpha_n)}$, and using Proposition 4.5 instead of Proposition 4.2. □