Partial and average copulas and association measures

This research was supported by the IAP Research Network P7/06 of the Belgian State (Belgian Science Policy). The first author gratefully acknowledges support from the GOA/12/014 project of the Research Fund KU Leuven. The second author gratefully acknowledges support from the grant GACR 15-04774Y. The third author is an extraordinary professor at the North-West University, Potchefstroom, South Africa. The authors are grateful to an Associate Editor and two referees for their valuable comments that led to a considerable improvement of the paper.


Introduction
Suppose we observe a random vector (Y_1, Y_2). In statistics we often need to characterize the degree of dependence of Y_1 and Y_2. The most standard (and probably also the oldest) measure of dependence is Pearson's correlation coefficient, which has proved useful in many situations. In particular, if (Y_1, Y_2) follows a bivariate normal distribution, then ρ^(P)(Y_1, Y_2) completely characterizes the dependence structure of (Y_1, Y_2). On the other hand, ρ^(P)(Y_1, Y_2) can be of little use if the bivariate distribution of (Y_1, Y_2) is far from normal. Moreover, ρ^(P) is not even defined if Y_1 and Y_2 do not have finite and positive variances. That is why alternative measures of dependence have been introduced; among the most popular are Kendall's tau and Spearman's rho. See Section A.1 for a brief recall of their definitions and an overview of other commonly-used association measures.
The situation becomes more difficult when we observe a three-dimensional vector (Y_1, Y_2, X) and one is interested in the relationship between Y_1 and Y_2 when the effect of X is taken into consideration. A simple concept which has proved useful in many situations is the original partial Pearson's correlation coefficient given in (1.2). In Section 2 we will introduce a new concept of partial association measures, not to be confused with this original partial correlation coefficient. As for the global Pearson's correlation coefficient ρ^(P)(Y_1, Y_2) defined in (1.1), the partial Pearson's correlation coefficient completely characterizes the dependence structure of Y_1 and Y_2, taking into account X, only if (Y_1, Y_2, X) has a trivariate normal distribution. Among the earliest attempts to get away from this normality assumption is the concept of the original partial Kendall's tau (see (A.4)). However, some criticisms were formulated regarding this measure. See Section A.2 for a brief recall and some examples that illustrate the criticism.
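For the reader's convenience, the classical partial Pearson correlation coefficient referred to above has the following well-known textbook form (recalled here; the paper's own display (1.2) is not reproduced in this extract):

```latex
% Classical partial Pearson correlation of Y_1 and Y_2 given X (textbook form)
\rho^{(P)}(Y_1, Y_2 \mid X) \;=\;
\frac{\rho^{(P)}(Y_1,Y_2)\;-\;\rho^{(P)}(Y_1,X)\,\rho^{(P)}(Y_2,X)}
     {\sqrt{1-\rho^{(P)}(Y_1,X)^2}\,\sqrt{1-\rho^{(P)}(Y_2,X)^2}}\,.
```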
A more comprehensive and detailed characterization of the dependence structure is provided by conditional measures of dependence/association, which measure the dependence structure in (Y_1, Y_2) conditionally upon an event formulated in terms of X. The simplest (and most common) conditional setting is to consider the dependence in (Y_1, Y_2) given the event X = x (i.e. a value x taken by the covariate). In Section A.3 we briefly recall such conditional association measures, and provide an example to illustrate their merits compared to the original partial type of association measures. If one is particularly interested in settings with high covariate values, one might consider the conditioning event X ≥ x (or conversely X ≤ x). This is, for example, often the case in economic (e.g. production frontier) or actuarial applications; an example of such a conditioning event is included in Section 4. Although the presentation in this paper focuses almost entirely on conditioning upon the event X = x, the concepts and methodology apply to general events in terms of the covariate X.
Conditional association measures thus quantify clearly how the dependence structure between the two components of (Y_1, Y_2) changes in terms of (the event related to) X. Graphically, such a conditional association measure is depicted as a function of X. Of interest is then to look into the average (or alternatively, for example, median) strength of dependence. Furthermore, one might want to quantify the differences in the dependence structures within (Y_1, Y_2) and (V_1, V_2), conditionally upon a similar event related to the same covariate X. Comparing the strengths of the two dependence structures, taking into account the behaviour of the common covariate, then translates into comparing two curves. A first approach to do so is to look at a kind of global (mean) behaviour of the curves.
In summary, the aim of the paper is to provide some insights into different ways to study such a global/mean behaviour of conditional dependencies. We discuss two approaches, leading to the concepts of partial and average conditional association measures. A unifying framework for our study is provided by focusing on association measures that can be expressed as a functional of the copula function C (assumed to be unique), denoted as ϕ(C). See Table 1 for some commonly-used association measures in this class. In case of conditional dependence (given X) one has to deal with a conditional copula function C_X, leading to the corresponding conditional association measure ϕ(C_X). A first approach towards a global/mean behaviour of the conditional dependencies is to take the average (with respect to X) of this conditional association measure, i.e. E_X{ϕ(C_X)}. We refer to this as the average conditional association measure. In a second approach one starts from the so-called partial copula, defined by E_X{C_X(·, ·)} and denoted by C̄(·, ·), and then considers the corresponding association measure ϕ(C̄). This is referred to as the partial association measure. Table 1 (third column) gives an overview of some average and partial association measures. We show that for most, but not all, conditional association measures these approaches coincide. An interesting case where they do not coincide is Kendall's tau. A second contribution of this paper consists of discussing estimation of the partial and average association measures. A crucial starting point for this is the estimation of C_X and C̄; that of C_X has been studied in the recent literature, whereas that of C̄ is part of the contribution of this paper. For the most interesting case of partial and average Kendall's tau we also establish the asymptotic behaviour of the estimators.

Table 1. Overview of some (un)conditional association measures (Spearman's rho, Kendall's tau, tail coefficients) — the unconditional measure as a functional ϕ(C) of the copula, its conditional version in terms of C_x, and the corresponding average and partial versions. *Measures of tail dependence as introduced in Schmid and Schmidt (2007).
The paper is further organized as follows. In Section 2 we briefly discuss the unifying framework for the various concepts of association measures (such as unconditional and conditional), and introduce the two fairly-new concepts of association measures. Section 3 discusses estimation of the various concepts; for the proposed estimators of the partial copula and of the average and partial Kendall's tau, we establish the asymptotic properties in Section 5. The use of the various concepts of association measures is illustrated on a real data application in Section 4. Some conclusions and a discussion are given in Section 6. Appendix A provides a brief review of association measures, and their specific drawbacks and merits. The proofs of the theoretical results, the assumptions under which these hold, as well as some needed auxiliary results, are provided in Appendices B–F.

Various concepts of association measures, defined in terms of copulas
Many measures of association can be expressed as functionals of copulas, which link the marginal distributions into the joint distribution. This unifying framework, together with the different conceptual notions of copulas, allows us to provide a unified approach towards various concepts of association measures. In Section 2.1 we briefly review existing concepts, whereas in Section 2.2 we introduce new concepts, all within the same unifying framework.

Unconditional (global) and conditional copulas and association measures
To formalize the definition of an (unconditional) copula, let H(y_1, y_2) be the joint distribution function of the random vector (Y_1, Y_2) and denote by F_{Y_1} and F_{Y_2} the marginal distribution functions of Y_1 and Y_2 respectively. Then a copula C_{Y_1,Y_2} on [0, 1]^2 is a function such that

H(y_1, y_2) = C_{Y_1,Y_2}(F_{Y_1}(y_1), F_{Y_2}(y_2)).

In case of continuous marginal distribution functions F_{Y_1} and F_{Y_2}, the copula function C_{Y_1,Y_2} is uniquely defined; see Nelsen (2006). Many association measures can be expressed as specific functionals of C_{Y_1,Y_2}, say as ϕ(C_{Y_1,Y_2}) or, shortly, as ϕ(C). For example, Kendall's tau and Spearman's rho are given by

τ = 4 ∫∫_{[0,1]^2} C(u_1, u_2) dC(u_1, u_2) − 1,   ρ^{(S)} = 12 ∫∫_{[0,1]^2} C(u_1, u_2) du_1 du_2 − 3.

Table 1 lists other association measures, indicating the specific functional ϕ(·). These include the lower and upper tail coefficients (denoted by λ_L and λ_U) and other association measures focusing on tail behaviour, such as those introduced by Schmid and Schmidt (2007). For a detailed study of association measures see Chapter 5 of Nelsen (2006).

In the literature so far, one has studied unconditional copulas as well as conditional copulas. The latter concept was introduced in Patton (2006), and serves to study the conditional dependence structure of Y_1 and Y_2 given X = x (as the simplest conditioning event). Denote the joint and marginal distribution functions of (Y_1, Y_2), conditionally upon X = x, as

H_x(y_1, y_2) = P(Y_1 ≤ y_1, Y_2 ≤ y_2 | X = x),  F_{1x}(y_1) = P(Y_1 ≤ y_1 | X = x),  F_{2x}(y_2) = P(Y_2 ≤ y_2 | X = x).

If F_{1x} and F_{2x} are continuous, then according to Sklar's theorem (see e.g. Nelsen, 2006) there exists a unique copula C_x which links the conditional marginals into the conditional joint distribution,

H_x(y_1, y_2) = C_x(F_{1x}(y_1), F_{2x}(y_2)).    (2.2)

The function C_x fully describes the conditional dependence structure of the bivariate vector (Y_1, Y_2) given X = x and is called a conditional copula.
As discussed in Gijbels, Veraverbeke and Omelka (2011) and Veraverbeke, Omelka and Gijbels (2011), the conditional measures of association that do not depend on the marginal distributions of Y_1 and Y_2 can be written as functionals of C_x. For instance, the conditional Kendall's tau defined in (A.6) can be expressed as

τ(x) = 4 ∫∫_{[0,1]^2} C_x(u_1, u_2) dC_x(u_1, u_2) − 1.

Similarly, the conditional Spearman's rho is given by

ρ^{(S)}(x) = 12 ∫∫_{[0,1]^2} C_x(u_1, u_2) du_1 du_2 − 3.

Other conditional association measures, including conditional tail coefficients, are given in the first rows of each block of column 3 of Table 1.

Partial and average copulas and association measures
Conditional association measures (or more generally conditional copulas) are very useful when one wants to get a deeper insight into the dependence structure and how it changes with the covariate X. However, it might still be of interest to summarize/capture the strength of this dependence in one single number. Indeed, in the case of two random vectors (Y_1, Y_2, X) and (V_1, V_2, X), such a global number would allow us to make simple comparisons of the strengths of the dependence between Y_1 and Y_2 on the one hand and that between V_1 and V_2 on the other hand, taking into account the covariate X. We thus need one number (one copula) summarizing the dependence of Y_1 and Y_2 when adjusted for X. We now discuss two approaches to arrive at such a summarizing type of copula.
A first obvious idea is to average the conditional measures with respect to the distribution of X, yielding the average conditional copula

C^A(u_1, u_2) = E_X{C_X(u_1, u_2)},    (2.3)

and, for example, the average (conditional) Kendall's tau or the average (conditional) lower tail coefficient (see Table 1). The concept of average conditional copula was first mentioned by Bergsma (2004, 2011). Another way is to follow the original idea that a partial correlation coefficient is supposed to measure the correlation of Y_1 and Y_2 after removal of any part of the variation due to the influence of X (see e.g. p. 306 of Cramér, 1946). The most general way of removing the effect of X on Y_1 and on Y_2 is through their conditional (marginal) distribution functions, which results in U_1 = F_{1X}(Y_1) and U_2 = F_{2X}(Y_2). Note that neither U_1 nor U_2 depends on X any more, and both are uniformly distributed (due to the probability integral transform). Indeed, for example, for all t ∈ [0, 1],

P(U_1 ≤ t) = ∫ P(F_{1x}(Y_1) ≤ t | X = x) dF_X(x) = ∫ t dF_X(x) = t,

with F_X the cumulative distribution function of X. See also Song (2009), who exploits this transformation in the problem of testing for conditional independence. So, after having removed the effect of X on the marginal distributions, the dependence structure of the transformed random variables is fully described by the copula function C̄ corresponding to the pair (U_1, U_2). We will call this the partial copula; see also Definition 3 of Bergsma (2004). As the marginals of (U_1, U_2) are already uniform, C̄ coincides with the joint distribution function of (U_1, U_2): the partial copula is defined by

C̄(u_1, u_2) = P(U_1 ≤ u_1, U_2 ≤ u_2).    (2.4)

Related ideas appear elsewhere in the literature: Joe (2006) builds on partial correlations to generate random correlation matrices, using a vine decomposition to access the joint density of pairwise correlations, and Bedford and Cooke (2002) introduced the concept of vines for dependent random variables.
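To make the probability-integral-transform step concrete, here is a minimal self-contained sketch (not from the paper) in which the conditional margins are known; the model Y_j = X + ε_j with Gaussian margins is an illustrative assumption, as are all names below.

```python
# Sketch (hypothetical toy model): removing the effect of X on the margins
# via the probability integral transform U_j = F_{jX}(Y_j).  Here the
# conditional margins are known Gaussians, Y_j | X = x ~ N(x, 1); in
# practice F_{jx} must be estimated (Section 3.4).
import math
import random

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

random.seed(1)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [xi + random.gauss(0.0, 1.0) for xi in x]   # Y1 = X + eps1
y2 = [xi + random.gauss(0.0, 1.0) for xi in x]   # Y2 = X + eps2

# U_j = F_{jX}(Y_j): with F_{jx} = N(x, 1), this is Phi(Y_j - X).
u1 = [norm_cdf(y - xi) for y, xi in zip(y1, x)]
u2 = [norm_cdf(y - xi) for y, xi in zip(y2, x)]

# U1 and U2 should now be (approximately) uniform on [0, 1], free of X.
print(round(sum(u1) / n, 2), round(sum(u2) / n, 2))  # both near 0.5
```

The transformed pair (U_1, U_2) is what the partial copula C̄ describes; here the two noise terms are independent, so C̄ is close to the independence copula.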
In Gaussian copulas, commonly-used association measures such as Kendall's tau, Blomqvist's beta, Spearman's rho and Gini's index can all be expressed in terms of Pearson's correlation coefficient, whereas the upper and lower tail coefficients are zero. Kim et al. (2011) studied partial correlation assuming a Gaussian copula for C̄.
The notions of average (conditional) copula and partial copula, defined in (2.3) and (2.4) respectively, in fact coincide, as is stated and proved in Proposition 1.
Proposition 1. For random variables Y_1 and Y_2 with continuous distribution functions, it holds that C̄(u_1, u_2) = C^A(u_1, u_2) for all (u_1, u_2) ∈ [0, 1]^2.

Proof. This is straightforward since

C̄(u_1, u_2) = P(U_1 ≤ u_1, U_2 ≤ u_2) = E_X{P(F_{1X}(Y_1) ≤ u_1, F_{2X}(Y_2) ≤ u_2 | X)} = E_X{C_X(u_1, u_2)} = C^A(u_1, u_2).

Thus the copula of Y_1 and Y_2 after removal of the effect of X on the marginal distributions coincides with the average conditional copula function. In other words, there are two ways of viewing the copula C̄: it is the copula describing the dependence between Y_1 and Y_2 after removal of the effect of X, but also the copula obtained after taking the expectation (with respect to the covariate X) of the conditional copula.
We can now consider association measures derived from the partial copula C̄. We call these measures the partial association measures. For instance, the partial Kendall's tau is given by

τ̄ = P((U_1 − U_1′)(U_2 − U_2′) > 0) − P((U_1 − U_1′)(U_2 − U_2′) < 0),    (2.5)

where (U_1′, U_2′) is an independent copy of the random vector (U_1, U_2) defined in (2.4). Similarly, the partial Spearman's rho is defined by

ρ̄^{(S)} = 12 ∫∫_{[0,1]^2} C̄(u_1, u_2) du_1 du_2 − 3.

The partial measure τ̄ should not be confused with the original partial Kendall's tau given in (A.4).
Note that C̄ = C^A does not imply that the average conditional measures, obtained by averaging the conditional measures with respect to X, equal the partial measures. In general this holds true only when C_x does not depend on x (see also Section 2.3 below) or if the measure of association is a linear functional of the underlying copula. Thus, while ρ̄^{(S)} = ρ^{(S)A}, in general τ̄ ≠ τ^A. See also Section A.4 in Appendix A for an example where τ̄ ≠ τ^A. Nevertheless, for many association measures the functional ϕ(C) constitutes a linear functional in C, and hence the equality E_X{ϕ(C_X)} = ϕ(C̄) is rather evident. For other association measures, such as the upper and lower tail coefficients that involve limit expressions, Proposition A.1 establishes the coincidence.

Simplified pair-copula construction
Sometimes it is reasonable to expect that the covariate X only affects the marginal distributions of Y_1 and Y_2, but does not affect the dependence structure. This results in the conditional joint distribution of (Y_1, Y_2) given by

H_x(y_1, y_2) = C(F_{1x}(y_1), F_{2x}(y_2)).    (2.7)

This is also called the simplified pair-copula construction in the recent literature; see e.g. Hobaek Haff, Aas and Frigessi (2010) and Acar, Genest and Nešlehová (2012). Note that in model (2.7) the conditional copula C_x does not depend on x (i.e. C_x = C), in contrast to the general model (2.2). Hence, in this special setting, the conditional and the partial copula coincide (C = C̄), and also all three types of association measures (conditional, average conditional, partial) coincide; for the Kendall's type of association measures, τ(x) = τ^A = τ̄ for all x. Table 2 summarizes the various concepts (types of association measures), with their respective notations. The entries in the last column will be discussed in Section 3.

Summary: Various concepts of association measures
In Figure 1 we depict the unconditional, the original partial, the conditional, the partial and the average conditional Kendall's tau for Example A.3.

Estimation of copulas and association measures
Suppose we have independent random vectors (Y 11 , Y 21 , X 1 ), . . . , (Y 1n , Y 2n , X n ), all having the same distribution as (Y 1 , Y 2 , X). To illustrate the estimation of (un)conditional, average and partial association measures, we concentrate on estimation of different notions of Kendall's tau. The various notions of other association measures (Spearman's rho, Gini coefficient, Blomqvist's beta, upper and lower tail coefficients, ...) can be estimated analogously.
A crucial point is that all these different notions of association measures (unconditional or global, conditional, partial and average) can be expressed as functionals of the corresponding notion of copula (i.e. ϕ(C_X) or ϕ(C̄)). Hence, plugging an appropriate nonparametric estimator of the specific copula function (C_X or C̄) into these expressions leads directly to a nonparametric estimator of the specific notion of the association measure. Nonparametric estimation of unconditional and conditional copulas has been studied in the (recent) literature, whereas nonparametric estimation of the partial copula is largely unexplored. In Section 5 we discuss and study a nonparametric estimator of the partial copula. In this paper we focus on kernel-type estimation, for two reasons: (i) to be able to rely on results available in the literature on nonparametric estimation of a conditional copula; (ii) since all estimators have explicit forms, this allows us to establish asymptotic results. Obviously, alternative flexible estimation methods such as spline basis expansions (see e.g. Kauermann and Schellhase, 2014) and/or Bayesian methods (see e.g. Burda and Prokhorov, 2014) could also be applied.
In the remainder of this section we focus, for brevity and clarity, on the estimators of the association measures resulting from the above plug-in step, which involves nonparametric estimation of the appropriate notion of copula.

Conditional and average conditional Kendall's tau
Note that by mimicking formula (3.1) it would be possible to estimate the conditional Kendall's tau through a weighted analogue of that formula, given in (3.2), where {w_ni(x, h_n)} is a sequence of weights that smooth over the covariate space. But, as discussed in Veraverbeke, Omelka and Gijbels (2011), it is better to replace the original observations (Y_{1i}, Y_{2i}) in formula (3.2) with observations that are already adjusted for the effect of the covariate X. For detailed discussions on nonparametric estimation of conditional copulas see Veraverbeke, Omelka and Gijbels (2011) and Gijbels, Veraverbeke and Omelka (2011), among others. The method that is used to remove the effect of X on the marginal distributions of Y_1 and Y_2 depends on what can be assumed about this effect (see Section 3.4). In general, let G_j(y, x) stand for the transformation that removes the effect of X on Y_j. Generally, G_j(y, x) = F_{jx}(y) does the job, but sometimes simpler functions (not requiring nonparametric estimation of F_{jx}, and hence the introduction of an additional smoothing parameter) are available. Further, let G_{jn} stand for an estimate of the function G_j. Then the adjusted observations are given by

(Û_{1i}, Û_{2i}) = (G_{1n}(Y_{1i}, X_i), G_{2n}(Y_{2i}, X_i)),  i = 1, …, n.    (3.3)

The estimate τ_n(x) of the conditional Kendall's tau is then obtained by applying the weighted formula (3.2) to these adjusted observations; this is estimator (3.4). An estimator of the average (conditional) Kendall's tau τ^A = E_X τ(X) is now simply

τ^A_n = (1/n) Σ_{i=1}^n τ_n(X_i),    (3.5)

where τ_n(X_i) is the estimate (3.4) evaluated at the point X_i.
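The two estimation steps (a kernel-weighted conditional Kendall's tau, then its average over the observed X_i) can be sketched as follows. The Gaussian kernel, the bandwidth, the pairwise weighting scheme and all names are illustrative choices, not the authors' exact estimator.

```python
# Minimal sketch (not the paper's exact estimator (3.4)): a kernel-weighted
# conditional Kendall's tau tau_n(x) from observations already adjusted for
# X, and the average conditional tau tau^A_n as the mean of tau_n(X_i).
import math
import random

def kernel_weights(x0, xs, h):
    """Nadaraya-Watson weights w_ni(x0, h) from a Gaussian kernel."""
    k = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in xs]
    s = sum(k)
    return [ki / s for ki in k]

def cond_kendall_tau(x0, xs, u1, u2, h):
    """Weighted sample Kendall's tau of (u1, u2), local to x0."""
    w = kernel_weights(x0, xs, h)
    num = den = 0.0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            wij = w[i] * w[j]
            prod = (u1[i] - u1[j]) * (u2[i] - u2[j])
            if prod > 0:
                num += wij       # concordant pair
            elif prod < 0:
                num -= wij       # discordant pair
            den += wij
    return num / den

random.seed(2)
n = 150
xs = [random.uniform(0, 1) for _ in range(n)]
# toy adjusted data: positive association whose strength does not vary with X
u1 = [random.uniform(0, 1) for _ in range(n)]
u2 = [min(1.0, max(0.0, a + random.gauss(0, 0.2))) for a in u1]

tau_A = sum(cond_kendall_tau(xi, xs, u1, u2, h=0.2) for xi in xs) / n
print(round(tau_A, 2))  # clearly positive: u1 and u2 are positively associated
```

Since the toy dependence does not vary with X, the local estimates τ_n(X_i) all fluctuate around the same value and the average is stable.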

Partial Kendall's tau
The population version of the partial Kendall's tau was introduced in (2.5). With the help of the adjusted observations given by (3.3), one can estimate τ̄ by the ordinary sample Kendall's tau of these adjusted observations; this yields the estimator τ̄_n in (3.6). Note that while both τ̄ and τ^A are well defined and reasonable summaries of the dependence of Y_1 and Y_2 when adjusted for X, the advantage of τ̄ is that its estimator τ̄_n given by (3.6) is less computationally intensive than τ^A_n. On the other hand, we establish an asymptotic normality result for τ^A_n under more general assumptions than for τ̄_n (see Section 5).
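The plain sample Kendall's tau underlying this estimator can be sketched as follows, applied here to already-adjusted toy pairs; the O(n²) double loop is the direct definition, not an optimized implementation.

```python
# Sketch of the partial Kendall's tau estimator: the ordinary sample
# Kendall's tau applied to observations adjusted for X (producing the
# adjusted pairs is the topic of Section 3.4; here they are taken as given).
def sample_kendall_tau(u1, u2):
    """Plain sample Kendall's tau: concordant minus discordant pairs,
    divided by the total number of pairs n*(n-1)/2."""
    n = len(u1)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (u1[i] - u1[j]) * (u2[i] - u2[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    return 2.0 * s / (n * (n - 1))

# perfectly concordant toy pairs -> tau = 1
print(sample_kendall_tau([0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8]))  # 1.0
```

In contrast to the average conditional tau, this requires a single pass over the pairs rather than one weighted pass per evaluation point, which is the computational advantage mentioned above.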

Some standard methods of adjustments
In this section we list some appealing methods for adjusting the observations for the effect of the covariate.

Parametric location-scale model estimation of F 1x and F 2x
Consider the following model:

Y_1 = m_1(X, β_1) + σ_1(X, γ_1) ε_1,  Y_2 = m_2(X, β_2) + σ_2(X, γ_2) ε_2,

where m_1, m_2, σ_1, σ_2 are known functions, β_1, β_2, γ_1, γ_2 are unknown finite-dimensional parameters, and ε_1 and ε_2 are independent of X with unknown distribution functions F_{1ε} and F_{2ε}.
Note that the 'ideal' transformation function would be given by G_j(y, x) = F_{jε}((y − m_j(x, β_j))/σ_j(x, γ_j)); in practice it is estimated by plugging in β_{jn} and γ_{jn}, the estimates of the unknown parameters. Since F_{jε} is a monotone transformation that does not alter the copula, the adjusted observations coincide with the estimated residuals

ε̂_{ji} = (Y_{ji} − m_j(X_i, β_{jn}))/σ_j(X_i, γ_{jn}),  j = 1, 2,  i = 1, …, n.
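As a toy instance of this parametric adjustment, consider a linear mean with constant scale; the least-squares fit and standardized residuals below are an illustrative sketch, not the paper's estimation procedure.

```python
# Hypothetical sketch of the parametric adjustment of Section 3.4.1 with a
# linear mean and constant scale: fit Y_j = b0 + b1*X + sigma*eps by least
# squares and take the standardized residuals as adjusted observations.
import random

def ols_residuals(x, y):
    """Least-squares fit y ~ a + b*x; return standardized residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sd = (sum(r * r for r in res) / n) ** 0.5
    return [r / sd for r in res]

random.seed(3)
x = [random.gauss(0, 1) for _ in range(400)]
y1 = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]
e1 = ols_residuals(x, y1)

# the residuals are (by construction) orthogonal to X in the sample
corr = sum(xi * ri for xi, ri in zip(x, e1)) / len(x)
print(round(corr, 2))  # close to 0
```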

Nonparametric location-scale model estimation of F 1x and F 2x
In this setting one assumes that the influence of the covariate on the marginal distributions is given by the model

Y_1 = m_1(X) + σ_1(X) ε_1,  Y_2 = m_2(X) + σ_2(X) ε_2,

where m_1, m_2, σ_1 and σ_2 are unknown functions and both ε_1 and ε_2 are independent of X with E ε_1 = E ε_2 = 0 and var(ε_1) = var(ε_2) = 1. For simplicity of presentation we will consider only local linear regression estimates m_{jn} and σ_{jn} of these unknown functions (j = 1, 2), based on the local linear weights w_ni(x, g_n) given in (3.8), constructed from a given kernel function k(·) and a bandwidth g_n.

The transformation function is now given by G_j(y, x) = F_{jε}((y − m_j(x))/σ_j(x)), and its estimate G_{jn} uses the local linear estimates m_{jn}(x) and σ_{jn}(x). The adjusted observations now coincide with the estimated residuals

ε̂_{ji} = (Y_{ji} − m_{jn}(X_i))/σ_{jn}(X_i),  j = 1, 2,  i = 1, …, n.    (3.10)
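The local linear smoother used in this adjustment can be sketched as follows; the Epanechnikov-type kernel, the bandwidth and all names are illustrative choices, not those of the paper.

```python
# Sketch of a local linear estimator m_jn(x) as used in the nonparametric
# location-scale adjustment (kernel and bandwidth are illustrative).
import random

def local_linear(x0, xs, ys, g):
    """Local linear fit at x0: weighted least squares on (X_i - x0)."""
    k = [max(0.0, 1.0 - ((xi - x0) / g) ** 2) for xi in xs]  # kernel values
    s0 = sum(k)
    s1 = sum(ki * (xi - x0) for ki, xi in zip(k, xs))
    s2 = sum(ki * (xi - x0) ** 2 for ki, xi in zip(k, xs))
    # local linear weights w_ni(x0, g), up to the common normalization
    w = [ki * (s2 - (xi - x0) * s1) for ki, xi in zip(k, xs)]
    denom = s0 * s2 - s1 * s1
    return sum(wi * yi for wi, yi in zip(w, ys)) / denom

random.seed(4)
xs = [random.uniform(0, 1) for _ in range(500)]
ys = [xi ** 2 + random.gauss(0, 0.05) for xi in xs]  # true m(x) = x^2

est = local_linear(0.5, xs, ys, g=0.15)
print(round(est, 2))  # should be near m(0.5) = 0.25
```

The same weights, applied to the squared centered responses, give the scale estimate σ_{jn}²(x) in the obvious way.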

General nonparametric estimation of F 1x and F 2x
Sometimes one has no idea about the influence of X on Y_1 and Y_2. Then one uses the general transformation functions G_j(y, x) = F_{jx}(y), estimated as

F_{jx,n}(y) = Σ_{i=1}^n w_ni(x, g_{jn}) 1{Y_{ji} ≤ y},    (3.11)

where {w_ni(x, g_{jn})} is a sequence of local linear weights introduced in (3.8). The estimator in (3.11) is a standard kernel distribution function estimator. Other nonparametric estimators of a conditional distribution function can be used; for a recent contribution in this area see e.g. Veraverbeke, Gijbels and Omelka (2014).
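A minimal sketch of such a kernel conditional distribution function estimator follows; for brevity it uses Nadaraya-Watson rather than local linear weights, and the kernel, bandwidth and toy model are illustrative assumptions.

```python
# Sketch of a kernel estimator of the conditional distribution function
# F_{jx}(y) in the spirit of (3.11), with simple Nadaraya-Watson weights.
import math
import random

def cond_cdf(y, x0, xs, ys, g):
    """Weighted empirical CDF: sum_i w_ni(x0, g) * 1{Y_i <= y}."""
    k = [math.exp(-0.5 * ((xi - x0) / g) ** 2) for xi in xs]
    s = sum(k)
    return sum(ki for ki, yi in zip(k, ys) if yi <= y) / s

random.seed(5)
xs = [random.uniform(0, 1) for _ in range(2000)]
ys = [xi + random.gauss(0, 0.1) for xi in xs]  # Y | X = x ~ N(x, 0.1^2)

# at x0 = 0.5 the conditional median is 0.5, so F_{x0}(0.5) should be ~0.5
p = cond_cdf(0.5, 0.5, xs, ys, g=0.05)
print(round(p, 2))
```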

Application
As a practical illustration, the data on hydro-geochemical stream and sediment reconnaissance from Cook and Johnson (1981) are revisited. They consist of the observed log-concentrations of seven chemicals in 655 water samples collected near Grand Junction, Colorado. The data can be found, e.g., as the data set uranium in the R package copula (Kojadinovic and Yan, 2010). Following Acar, Genest and Nešlehová (2012), we first concentrate on Cobalt (Co), Scandium (Sc) and Titanium (Ti). The pairwise scatter plots are shown in Figure 2. Suppose we are interested in the relation between Cobalt (Y_1) and Scandium (Y_2) when Titanium (X) is taken into account. For exploration purposes, we fitted simple linear models (lm) Y_j = β_{j1} + β_{j2}X + ε_j, indicated in Figures 2(b) and (c) with a dotted line. Similarly, nonparametric location models Y_j = m_j(X) + ε_j were fitted with the help of the locpol R package (Cabrera, 2012). The fits are indicated in Figures 2(b) and (c) with a solid line (lp).
As the fits of the nonparametric mean functions in Figures 2(b) and (c) are in reasonably good agreement with the simple linear fits for the majority of data points, we use the following methods of adjustment to estimate the partial Kendall's tau:

lm: adjustment by simple linear regression models Y_j = β_{j1} + β_{j2}X + ε_j;
unif: adjustment by nonparametric estimation of the conditional distribution functions F_{1x} and F_{2x} (see Section 3.4.3).

The results for both methods, partial lm and partial unif, applied to the triplet (Co, Sc, Ti) = (Y_1, Y_2, X) are quite comparable, as can be seen from Table 3. The table also lists the estimated value of the average conditional Kendall's tau defined in (2.6), and the sample version of the original partial Kendall's tau τ_K, see (A.4), which is slightly higher than all the above values. Note that all these values are lower than the unconditional (global) Kendall's tau, defined in (A.2), of Co and Sc (so unadjusted for Ti), which is 0.535.
As explained in the previous sections, although the marginals may be adjusted for the effect of the covariate, the dependence structure may still change with the value of the covariate. The (original) partial, partial and average conditional measures then provide different approaches to measuring average dependence over X.
To quantify the effect of the covariate in more detail, we also present the conditional Kendall's tau, which measures the dependence of Co and Sc when Titanium is fixed at a given value. In Figure 3 we present the estimator of the conditional Kendall's tau, constructed from the observations adjusted (nonparametrically) for the values of the covariate and weighted according to the distance of X_i to the point of interest x, as in (3.4). For details about the construction of an estimator of the conditional Kendall's tau see Gijbels, Veraverbeke and Omelka (2011). The bandwidth used to construct the weights (for smoothing in the covariate direction) was fixed at 0.57 in order to obtain results comparable with Acar, Genest and Nešlehová (2012).

Fig 3. Estimated unconditional, conditional and partial Kendall's tau for (a) Cobalt (Co) and Scandium (Sc) given Titanium (Ti); (b) Cesium (Cs) and Potassium (K) given Titanium (Ti); and (c) Cesium (Cs) and Scandium (Sc) given Titanium (Ti).
The estimated conditional Kendall's tau, together with pointwise 95% confidence intervals, is plotted for different values of Ti in Figure 3(a), using a solid line and dotted lines, respectively. The lower and upper limits of the confidence intervals are derived by the bootstrap method presented in Omelka, Veraverbeke and Gijbels (2013). The estimates of the unconditional (global) Kendall's tau and the partial (via the method unif) Kendall's tau are indicated by horizontal lines (dashed-dotted and dashed lines respectively). The range of Ti extends from the 5th to the 95th quantile of that variable. The 10th and 90th quantiles of Ti are indicated by dotted vertical lines.
The dependence between Cobalt and Scandium clearly depends on the Titanium value, as shown by the estimated conditional Kendall's tau.
We also present similar results for the log-concentrations of other chemicals. The additional triplets considered are (Cesium, Potassium, Titanium) and (Cesium, Scandium, Titanium): (Cs, K, Ti) = (Y_1, Y_2, X) and (Cs, Sc, Ti) = (Y_1, Y_2, X). Figures 3(b) and (c) summarize the results. Of particular interest is that the unconditional Kendall's tau is around 0.2 for both pairs (Y_1, Y_2), whereas the partial Kendall's tau is close to zero for the pair (Cs, Sc). In other words, the average strength of the dependence between the log-concentrations of Cesium and Scandium is far smaller than that between the log-concentrations of Cesium and Potassium, when taking the log-concentration of Titanium into account. See also Table 3 for the estimated values of the other quantities. From this and other examples and simulations, we have observed that the values of the original partial Kendall's tau are often in between those of the unconditional Kendall's tau and the newly-defined partial Kendall's tau. See also Figure 3 and Table 3.
To further illustrate the use of other association measures and other conditioning settings, we provide in Figure 4(a) (respectively Figure 4(b)) the estimated upper (respectively lower) unconditional, conditional, partial and average tail coefficients of Schmid and Schmidt (2007) (ρ_L, ρ_L(X), ρ̄_L and ρ^A_L for the lower tail, and similarly for the upper tail coefficients). Note that the average and partial coefficients are close (their population versions coincide, as proved in Proposition A.1). Further, the tail dependence seems to be somewhat stronger in the upper tail than in the lower tail. Moreover, the upper tail dependence reaches a maximum around a Ti-value of 3.6 and then weakens. Figure 4(c) depicts the estimates of the conditional and average (conditional) Kendall's tau for two different conditioning settings: X = x and X ≥ x (the black and grey solid curves respectively). Although the curves look quite different, with a switch in dependence strength around 3.65, their average values (the horizontal lines) are close to each other, meaning that on average the strength of the dependence between Cobalt and Scandium is comparable whether one looks at a given Titanium value or at Titanium values exceeding a given threshold. For all estimated conditional association measures in Figure 4(a)-(c) we also plot 95% confidence intervals. For most confidence intervals the bootstrap procedure of Omelka, Veraverbeke and Gijbels (2013) was applied, but the intervals for the conditional Kendall's tau when conditioning on the event X ≥ x were constructed using the asymptotic normality result for the conditional Kendall's tau.

Theoretical results
In this section we first discuss nonparametric estimation of a partial copula, defined in (2.4).
We need to transform the observed random variables Y_{1i} and Y_{2i}, i = 1, …, n, to be less (or not) dependent on X_i. The transformations are based on G_j(y, x) (with j = 1, 2), using their estimates G_{jn}(y, x). Depending on whether the influence of X on Y_j (j = 1, 2) can be modelled by a parametric location-scale model (see Section 3.4.1), by a nonparametric location-scale model (see Section 3.4.2), or is fully unknown as in Section 3.4.3, we use the corresponding estimated transformations of Section 3.4, collected in (5.1) and (5.2): parametric and nonparametric location-scale adjustments based on estimated residuals, and the general nonparametric adjustment based on the estimated conditional distribution functions. The nonparametric estimator of the partial copula in (2.4) is then given by the empirical distribution function of the adjusted observations,

C̄_n(u_1, u_2) = (1/n) Σ_{i=1}^n 1{Û_{1i} ≤ u_1, Û_{2i} ≤ u_2},    (5.3)

where, for j = 1, 2, Û_{ji} = G_{jn}(Y_{ji}, X_i). The estimator (5.3) was studied in Gijbels, Omelka and Veraverbeke (2015), but under the restrictive setting that the simplifying assumption holds, i.e. that only the marginal distribution functions are affected by the covariate X (see (2.7)).
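A plug-in estimator of the partial copula, viewed as the empirical copula of the adjusted pairs, might be sketched as follows; the rank rescaling used below is one common implementation choice, assumed here rather than taken from the paper.

```python
# Sketch of a partial copula estimator in the spirit of (5.3): the empirical
# copula of the adjusted observations (hat U_{1i}, hat U_{2i}).
import random

def empirical_partial_copula(u1_hat, u2_hat):
    """Return C_bar_n(., .) built from adjusted observations."""
    n = len(u1_hat)

    def rank_rescale(v):
        # map each observation to its rank / (n + 1)
        order = sorted(range(n), key=lambda i: v[i])
        r = [0.0] * n
        for pos, i in enumerate(order, start=1):
            r[i] = pos / (n + 1.0)
        return r

    r1, r2 = rank_rescale(u1_hat), rank_rescale(u2_hat)

    def C_bar_n(a, b):
        # empirical joint distribution function of the rescaled pairs
        return sum(1.0 for x, y in zip(r1, r2) if x <= a and y <= b) / n

    return C_bar_n

random.seed(6)
u1 = [random.random() for _ in range(1000)]
u2 = [random.random() for _ in range(1000)]  # independent of u1
C = empirical_partial_copula(u1, u2)
print(round(C(0.5, 0.5), 2))  # near 0.25 under independence
```

Any of the partial association measures (e.g. τ̄) can then be obtained by applying the corresponding functional ϕ(·) to C̄_n.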
In the next section we establish the asymptotic properties of the estimator defined in (5.3), in its general setting. These results are then the basis for proving asymptotic properties (see Section 5.2) of the estimator of the partial Kendall's tau, defined in (3.6). Finally, in Section 5.3, we provide asymptotic results for the estimator of the average conditional Kendall's tau, given in (3.5). For clarity of presentation, all assumptions are formulated in the Appendix. The theoretical results are presented according to the three major transformations considered in the adjustment/transformation step (5.2), as the asymptotic behaviour of the estimators depends on this step (and the degree of prior knowledge that it reflects).

Asymptotic results for the nonparametric partial copula estimator
Theorems 1, 2 and 3 establish asymptotic i.i.d. representations for the estimator (5.3) of the partial copula (2.4), when using the respective estimated transformations in (5.1).
In what follows, let ψ be the function defined in (5.4).

Parametric location-scale adjustments
Theorem 1. Assume that the marginal distributions follow the parametric location-scale models described in Section 3.4.1 and that assumptions (Cp), (βγ), (F1p), (F2p) and (mσp) given in the Appendix hold. Then, uniformly in (u_1, u_2) ∈ [0, 1]^2, an asymptotic i.i.d. representation for √n(C̄_n(u_1, u_2) − C̄(u_1, u_2)) holds.

As mentioned in the introduction of Section 5, Theorem 1 can be viewed as an extension of the results presented in Gijbels, Omelka and Veraverbeke (2015). From it we can thus tell what happens if the pairwise simplifying assumption is wrongly assumed. The consequences for the estimator C̄_n can be summarized as follows.
• C̄_n still converges at the √n-rate, but now C̄_n estimates the partial copula function C̄ (and not C, which is not even well defined in this situation).
• The limiting structure of the estimator C̄_n is more complicated, due to the second and third terms in the asymptotic representation of √n (C̄_n − C̄).
On the other hand, if the pairwise simplifying assumption (2.7) really holds, then C̄ ≡ C, both A_j(u_1, u_2) and B_j(u_1, u_2) vanish, and thus so do the second and third terms in the asymptotic representation of √n (C̄_n − C̄). The latter then coincides with the results of Gijbels, Omelka and Veraverbeke (2015), where this asymptotic representation is derived.

Nonparametric location-scale adjustments
Let F jε be the distribution function of ε j .
Theorem 2. Assume that the marginal distributions follow the nonparametric location-scale models described in Section 3.4.2 and that conditions (Cn), (Bwn), (F1n), (F2n), (kn), (mσ) and (Xn) given in the Appendix hold. Then an asymptotic i.i.d. representation for √n (C̄_n − C̄) holds uniformly in (u_1, u_2), with ψ given in (5.4) and where, for j = 1, 2, the functions φ_j are given in (5.5).
Just as Theorem 1 extends the result of Gijbels, Omelka and Veraverbeke (2015) for the parametric location-scale model adjustment, Theorem 2 does this for the nonparametric location-scale model adjustment. If the simplifying assumption (2.7) holds, then the functions φ_j given in (5.5) vanish and the result of Theorem 2 is in agreement with Gijbels, Omelka and Veraverbeke (2015). On the other hand, if the simplifying assumption (2.7) does not hold, then C̄_n still converges at the √n-rate, but now it estimates C̄ and the limiting structure of the estimator is more involved.
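The nonparametric location-scale adjustment behind Theorem 2 can be sketched as follows. This is a minimal illustration, not the paper's estimator: the local linear weights (3.8) are replaced by a simpler Nadaraya–Watson smoother with a Gaussian kernel, and the test regression function is an assumption.

```python
import math
import random

def nw_estimate(x0, x, y, g):
    """Nadaraya-Watson estimate of E[Y | X = x0] with bandwidth g
    (a simpler stand-in for the local linear weights (3.8))."""
    w = [math.exp(-0.5 * ((xi - x0) / g) ** 2) for xi in x]
    sw = sum(w)
    return sum(wi * yi for wi, yi in zip(w, y)) / sw

def np_residuals(x, y, g):
    """Standardized residuals (y_i - m_hat(x_i)) / sigma_hat(x_i) of a
    nonparametric location-scale model, with sigma_hat^2 obtained by
    smoothing the squared residuals."""
    m = [nw_estimate(xi, x, y, g) for xi in x]
    r2 = [(yi - mi) ** 2 for yi, mi in zip(y, m)]
    s2 = [max(nw_estimate(xi, x, r2, g), 1e-12) for xi in x]
    return [(yi - mi) / math.sqrt(v) for yi, mi, v in zip(y, m, s2)]

# Toy check on an assumed model m(x) = sin(3x) with small noise.
random.seed(2)
n = 400
x = [random.random() for _ in range(n)]
y = [math.sin(3 * xi) + 0.1 * random.gauss(0, 1) for xi in x]
print(abs(nw_estimate(0.5, x, y, g=0.08) - math.sin(1.5)))
```

The residuals returned by `np_residuals` would then be rank-transformed exactly as in the parametric case before forming the empirical partial copula.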

General nonparametric adjustments
Theorem 3. Suppose that assumptions (Bw), (F), (k) and (X) given in the Appendix are satisfied. Then the estimator C̄_n is a consistent estimator of the partial copula C̄, that is, C̄_n − C̄ = O_P(r_n) uniformly in (u_1, u_2), where
r_n = max{ g_{1n}^2, g_{2n}^2, √(log n / (n g_{1n})), √(log n / (n g_{2n})) }. (5.6)
Note that this theorem also gives the same rate as the corresponding (more restrictive) theorem in Gijbels, Omelka and Veraverbeke (2015).

Parametric location-scale adjustments
Note that, thanks to the Hadamard differentiability of the functional C ↦ ∫ C dC (tangentially to the set of functions that are continuous on [0, 1]²), proved in Lemma 1 of Veraverbeke, Omelka and Gijbels (2011), one gets (5.7), where α_n stands for the asymptotic representation of √n (C̄_n − C̄) (see Theorem 1). Now (5.7), together with some further calculations, yields the following result. Note that, provided one has asymptotic representations for √n (β_jn − β_j) and √n (γ_jn − γ_j), one can, with the help of (5.8), derive the asymptotic distribution of τ̄_n.
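Since Kendall's tau of a copula C can be written as the functional τ(C) = 4∫C dC − 1 (the standard copula formula), the differentiability step presumably produces an expansion along the following lines; this is a hedged sketch of the elided display (5.7), reconstructed from the functional delta method, not a verbatim reproduction of the paper's equation.

```latex
% Kendall's tau as a functional of the partial copula:
%   \bar\tau = 4 \int_{[0,1]^2} \bar C \,\mathrm{d}\bar C \; - \; 1 .
% Hadamard differentiability of C \mapsto \int C \,\mathrm{d}C at \bar C
% (tangentially to the continuous functions on [0,1]^2) then gives
\sqrt{n}\,(\bar\tau_n - \bar\tau)
  = 4\sqrt{n}\left(\int \bar C_n \,\mathrm{d}\bar C_n
      - \int \bar C \,\mathrm{d}\bar C\right)
  = 4\int \alpha_n \,\mathrm{d}\bar C
    + 4\int \bar C \,\mathrm{d}\alpha_n + o_P(1),
% where \alpha_n is the asymptotic representation of
% \sqrt{n}\,(\bar C_n - \bar C) from Theorem 1.
```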

Nonparametric location-scale adjustments
With the help of (5.7), and similarly as in the previous section, one can show the following i.i.d. representation of the estimator of the partial Kendall's tau.

General nonparametric adjustments
Theorem 6. Suppose that assumptions (Bw), (F), (k) and (X) given in the Appendix are satisfied. Then τ̄_n − τ̄ = O_P(r_n), where r_n is given in (5.6).

Asymptotic results for the estimator of the average conditional Kendall's tau
Let τ^A_n = n⁻¹ Σ_{i=1}^n τ_n(X_i), with τ_n(x) given by (3.2).
Theorem 7. Assume that (kn), (Xn) and (H) given in the Appendix hold. Assume also that the bandwidth h_n satisfies the assumptions on a bandwidth stated in (Bwn). Then an asymptotic i.i.d. representation for √n (τ^A_n − τ^A) holds.
An attractive feature of τ^A_n, when compared with τ̄_n, is that it is asymptotically normal without requiring that the marginal distributions follow either parametric or nonparametric location-scale models. This might be surprising, as for each x ∈ R_X the estimator of the conditional Kendall's tau τ_n(x) typically converges at most at the n^{2/5}-rate. But thanks to the averaging of the τ_n(X_i) this rate is improved to √n (see Akritas and Van Keilegom, 2001; Neumeyer and Van Keilegom, 2010, among others, for similar settings in nonparametric smoothing where averaging improves the rate of convergence).
Note that asymptotically we do not even need to bother about adjusting the marginals. At first sight this might be surprising in view of the previous results on conditional Kendall's tau estimation. It can be explained by assumption (Bwn) on the bandwidth which, together with assumption (H), guarantees that for each x ∈ R_X the conditional bias (given X_1, . . . , X_n) of the conditional Kendall's tau estimator τ_n(x) is of order o_P(n^{−1/2}) uniformly in x. The bias of τ^A_n is of the same order. To improve the finite-sample properties we nevertheless recommend pre-adjusting the observations for the effect of X on their marginal distributions, as described in Section 3.2. The asymptotic normality of the resulting estimator of the average conditional Kendall's tau has also been established by the authors (result not included here, for brevity).
Remark 1. Suppose that the pairwise simplifying assumption (2.7) holds. Then the function φ in (5.9) simplifies to φ(x, u_1, u_2) = 2{4C(u_1, u_2) − 1 − τ^A} − 4{u_1 + u_2 − 1}, and in fact does not depend on x any more. This implies that the estimator of the average conditional Kendall's tau τ^A_n has the same asymptotic distribution as the oracle estimator based on the unobserved (U_{1i}, U_{2i}), i = 1, . . . , n. Note that this asymptotic distribution then also coincides with the asymptotic distribution of the estimator of the partial Kendall's tau when either parametric or nonparametric location-scale models are correctly used to remove the effects of the covariate on the marginal distributions (see Theorems 2 and 4).
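The average conditional Kendall's tau discussed above can be sketched in a few lines of Python. This is an illustration only: the paper's estimator (3.2) uses specific local linear weights, whereas here simple Epanechnikov kernel weights of Nadaraya–Watson type are an assumption made for brevity.

```python
def epan(u):
    """Epanechnikov kernel: symmetric, supported on [-1, 1],
    consistent in spirit with assumption (kn)."""
    return 0.75 * (1 - u * u) if abs(u) < 1 else 0.0

def cond_tau(x0, x, y1, y2, h):
    """Kernel-weighted sample version of the conditional Kendall's
    tau at x0 (a sketch of an estimator of the type tau_n(x))."""
    n = len(x)
    w = [epan((xi - x0) / h) for xi in x]
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            wij = w[i] * w[j]
            if wij == 0.0:
                continue
            s = (y1[i] - y1[j]) * (y2[i] - y2[j])
            num += wij * (1.0 if s > 0 else -1.0 if s < 0 else 0.0)
            den += wij
    return num / den

def average_cond_tau(x, y1, y2, h):
    """tau^A_n = n^{-1} sum_i tau_n(X_i), the averaged estimator."""
    return sum(cond_tau(xi, x, y1, y2, h) for xi in x) / len(x)
```

As a sanity check: if Y_2 is a strictly increasing transformation of Y_1, every local pair is concordant, so `cond_tau` equals 1 at every x and the average is 1 as well.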

Conclusion and Discussion
In this paper we focus on several conditional association measures describing the dependence between two response variables Y_1 and Y_2, given that a third (covariate) variable X takes some value x. The common feature of all these measures is that they are copula-based, i.e. they can be described as functionals ϕ(C_X) of the conditional copula C_X. This leads to two different ways of summarizing the level of dependence by a single number. The first is to consider E_X{ϕ(C_X)}, leading to the average (conditional) association measures. The second is to calculate ϕ(C̄), where C̄ is the so-called partial copula C̄(·, ·) = E_X{C_X(·, ·)}, resulting in partial association measures. We provide statistical inference for the corresponding estimators in the important case of the average and partial Kendall's tau.
Based on the obtained results on estimation of the average and partial Kendall's tau, we reported the following interesting findings. A first finding is that the nonparametric estimator of the partial Kendall's tau τ̄_n (given in (3.6)) is easier to compute than the nonparametric estimator of the average (conditional) Kendall's tau τ^A_n (see (3.2), (3.4) and (3.5)). A second finding is that for τ^A_n we can establish an (asymptotic) i.i.d. representation for √n (τ^A_n − τ^A) in a general setting (see Theorem 7), and hence an asymptotic normality result for τ^A_n is available in this general setting. For the partial Kendall's tau estimator τ̄_n, however, we could only establish (asymptotic) i.i.d. representations for √n (τ̄_n − τ̄) under the more restrictive settings of parametric or nonparametric location-scale modelling of the conditional marginal distributions. In the more general setting (not requiring such location-scale models to hold) we only obtain consistency of the estimator τ̄_n at the nonparametric rate r_n (see Theorem 6). In conclusion, each estimator exhibits its own specific advantage: a computational one for τ̄_n and a theoretical one for τ^A_n.
In practical examples the choice between the various association measures depends, among other things, on the research question considered, but also on the taste of the researcher. For example, if one is interested in dependence structures in the tails of joint distributions, then a study of association measures of the tail-coefficient type would be of primary interest.
Association measures of the Kendall's tau and Spearman's rho type are often used in economic and social statistics. A possible disadvantage, however, is that these concordance measures are not very sensitive to the dependence in the tails of the bivariate copula. Such additional information is crucial in bivariate extreme value theory and can be provided by the tail-coefficient measures, see Table 1. Their definitions describe the limiting amount of dependence in the corners of the copula domain. The classical lower and upper tail dependence coefficients have the drawback that they only evaluate the copula on the diagonal section. The association measures of Schmid and Schmidt (2007) in Table 1 offer an alternative by averaging over all directions in the corner.

Appendix A: (Un)conditional, average and partial association measures
In this appendix we provide, in Sections A.1 and A.2, some background information on the original partial correlation measures, illustrate their drawbacks with examples, and illustrate the notion of a conditional association measure (in Section A.3). Moreover, we provide some further insight into when the notions of average conditional association measure and partial association measure coincide (Section A.4).

A.1. Global or unconditional association measures
Among the most popular unconditional association measures is Kendall's tau. Consider an independent copy (Y′_1, Y′_2) of (Y_1, Y_2). Kendall's tau is then defined as the probability of concordance minus the probability of discordance of the couples (Y_1, Y_2) and (Y′_1, Y′_2), i.e.
where the second equality holds when Y 1 and Y 2 are continuous random variables.
Another popular association measure is Spearman's rho, defined as follows. Consider (Y′_1, Y′_2) and (Y″_1, Y″_2), two independent copies of (Y_1, Y_2). Spearman's rho is defined as Recall that, due to the probability integral transformation, the following holds: if Y_1 and Y_2 are continuous random variables with respective distribution functions F_{Y1} and F_{Y2}, then F_{Y1}(Y_1) and F_{Y2}(Y_2) are uniformly distributed on [0, 1]. In this particular case of continuous random variables Y_1 and Y_2, Spearman's rho is equal to Pearson's correlation coefficient of the transformed random variables F_{Y1}(Y_1) and F_{Y2}(Y_2), that is (see Nelsen, 2006), since the variance of a uniform distribution on [0, 1] equals 1/12.
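The two global measures just recalled can be computed with short pure-Python sample versions: Kendall's tau as the normalized count of concordant minus discordant pairs (the empirical analogue of (A.1)), and Spearman's rho as Pearson's correlation of the rank-transformed data, per the displayed identity. These are the standard sample versions, written out here only for illustration (ties are ignored by the rank helper).

```python
def kendall_tau(y1, y2):
    """Sample Kendall's tau: (concordant - discordant) / total pairs."""
    n = len(y1)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            p = (y1[i] - y1[j]) * (y2[i] - y2[j])
            s += (p > 0) - (p < 0)
    return 2.0 * s / (n * (n - 1))

def ranks(y):
    """Ranks 1..n of the sample (no tie handling, for illustration)."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    r = [0] * len(y)
    for k, i in enumerate(order):
        r[i] = k + 1
    return r

def pearson(a, b):
    """Sample Pearson's correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    ca = [ai - ma for ai in a]
    cb = [bi - mb for bi in b]
    num = sum(u * v for u, v in zip(ca, cb))
    den = (sum(u * u for u in ca) * sum(v * v for v in cb)) ** 0.5
    return num / den

def spearman_rho(y1, y2):
    """Spearman's rho = Pearson's correlation of the ranks, i.e. of
    F_{Y1}(Y1) and F_{Y2}(Y2) in the continuous case."""
    return pearson(ranks(y1), ranks(y2))

print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))  # perfectly concordant: 1.0
print(spearman_rho([1, 2, 3], [3, 1, 2]))
```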

A.2. On some original partial association measures
Assuming a trivariate normal distribution for the triple (Y_1, Y_2, X) implies the following regression model structures, where ε_1 and ε_2 are independent of X. The partial Pearson's correlation coefficient (given in (1.2)) then measures the correlation of ε_1 and ε_2. Note that model (A.3) implies that the dependence structure of the 'X-adjusted' variables Y_1 − α_1 − β_1 X and Y_2 − α_2 − β_2 X does not depend on X any more; thus in this model the partial correlation coefficient coincides with the conditional correlation coefficient that measures the dependence of Y_1 and Y_2 given X = x. Analogously as for Pearson's correlation coefficient, researchers soon realized the need for alternatives to the partial Pearson's correlation coefficient that would not require the assumption of a trivariate normal distribution. Inspired by formula (1.2) for the partial Pearson's correlation coefficient, Kendall (1942) suggested defining the (original) partial Kendall's tau, whose population version is denoted by τ̄_K and given by (A.4), where τ(A, B) is the (global) Kendall's tau of the random variables A and B, see (A.1). See also Goodman (1959). While the obvious advantage of τ̄_K is its simplicity (to get the estimate it suffices to replace the pairwise Kendall's taus by their empirical versions), several criticisms appeared in the literature, questioning whether τ̄_K is a reasonable measure of the dependence of Y_1 and Y_2 when X is taken into consideration; see e.g. Korn (1984), Nelson and Yang (1988) and Gripenberg (1992). The difficulties with τ̄_K are also illustrated by the following two examples (for other examples see the references given above).
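The plug-in estimate the text describes (replace the pairwise Kendall's taus in (A.4) by their sample versions) can be sketched as follows, using the standard Kendall (1942) closed form that mirrors the partial Pearson formula (1.2). The `kendall_tau` helper is repeated so that the sketch is self-contained.

```python
import math

def kendall_tau(a, b):
    """Sample Kendall's tau of the pairs (a_i, b_i)."""
    n = len(a)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            p = (a[i] - a[j]) * (b[i] - b[j])
            s += (p > 0) - (p < 0)
    return 2.0 * s / (n * (n - 1))

def partial_kendall(y1, y2, x):
    """Plug-in sample version of the original partial Kendall's tau
    (A.4): pairwise taus in the partial-correlation-type formula."""
    t12 = kendall_tau(y1, y2)
    t1x = kendall_tau(y1, x)
    t2x = kendall_tau(y2, x)
    return (t12 - t1x * t2x) / math.sqrt((1 - t1x ** 2) * (1 - t2x ** 2))
```

On a toy triple with τ(Y_1, Y_2) = τ(Y_1, X) = 1/3 and τ(Y_2, X) = −1/3, the formula gives (1/3 + 1/9)/(8/9) = 1/2, which the test below checks by hand.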
Example A.1 Consider the models: where X ∼ N(0, 1), ε ∼ N(0, σ²), and X and ε are independent. Note that Y_1 − X = Y_2 − X², thus Y_1 and Y_2 are perfectly dependent when (correctly) adjusted for the effect of X; but it can be checked via Monte Carlo simulation that τ̄_K depends on the value of σ, and even that τ̄_K → 0 as σ → 0. Thus for small σ the coefficient τ̄_K completely fails to measure the dependence of Y_1 and Y_2 when adjusted for X.
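The Monte Carlo check mentioned in Example A.1 can be sketched as follows. The model equations are elided in the text, so the sketch assumes Y_1 = X + ε and Y_2 = X² + ε, which is consistent with the stated relation Y_1 − X = Y_2 − X²; the helper functions repeat those from the previous sketches to stay self-contained.

```python
import math
import random

def kendall_tau(a, b):
    """Sample Kendall's tau of the pairs (a_i, b_i)."""
    n = len(a)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            p = (a[i] - a[j]) * (b[i] - b[j])
            s += (p > 0) - (p < 0)
    return 2.0 * s / (n * (n - 1))

def partial_kendall(y1, y2, x):
    """Plug-in sample version of the original partial Kendall's tau (A.4)."""
    t12, t1x, t2x = kendall_tau(y1, y2), kendall_tau(y1, x), kendall_tau(y2, x)
    return (t12 - t1x * t2x) / math.sqrt((1 - t1x ** 2) * (1 - t2x ** 2))

# Assumed model, consistent with Y1 - X = Y2 - X^2:
#   Y1 = X + eps,  Y2 = X^2 + eps,  X ~ N(0,1),  eps ~ N(0, sigma^2).
random.seed(3)
n, sigma = 300, 0.2
x = [random.gauss(0, 1) for _ in range(n)]
eps = [random.gauss(0, sigma) for _ in range(n)]
y1 = [xi + e for xi, e in zip(x, eps)]
y2 = [xi * xi + e for xi, e in zip(x, eps)]

# Perfect dependence after (correct) adjustment for X: both adjusted
# series equal eps, so their sample Kendall's tau is exactly 1.
print(kendall_tau([a - b for a, b in zip(y1, x)],
                  [a - b * b for a, b in zip(y2, x)]))
# ... yet the original partial Kendall's tau stays far below 1:
print(partial_kendall(y1, y2, x))
```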

Example A.2 Suppose we have the models
where X, ε_1 and ε_2 are independent, all with a uniform distribution on [0, 1] (denoted by U[0, 1]). Note that here Y_1 − 2 exp{X} and Y_2 − 2 exp{X} (that is, Y_1 and Y_2 adjusted for X) are independent. But by Monte Carlo simulation one finds that τ̄_K ≈ −0.24.

A.3. Conditional association measures
A more detailed characterization of the dependence structure can be gained with the help of conditional measures of dependence/association, which measure the dependence/association of (Y_1, Y_2) conditionally on the event that X = x. Consider for instance the conditional Kendall's tau (see Gijbels, Veraverbeke and Omelka, 2011), which is denoted and defined as follows, where (Y′_1, Y′_2, X′) is an independent copy of the random vector (Y_1, Y_2, X), and Y_1 and Y_2 are continuous random variables. See also (A.1) and (A.2). It is easy to see that in the first example (Example A.1) τ(x) = 1 for all x, while in Example A.2 τ(x) = 0 for all x. Examples A.1 and A.2 are thus simple in the sense that the conditional dependence structures do not change with x, i.e. they are constant in x. Denote by Y^a_1 the variable Y_1 adjusted for X, and similarly by Y^a_2 the variable Y_2 adjusted for X. Note that in the above two examples one can characterize the dependence of Y^a_1 and Y^a_2 by a single (global) association measure (respectively 1 and 0).
In more complex (and realistic) models, the dependence structure of Y^a_1 and Y^a_2 can still change with X, which prevents us from describing the dependence/association structure of Y^a_1 and Y^a_2 by one single real number. Such a more complex model is presented in the following example, which modifies Example A.2.
Example A.3 Suppose that Y_1 and Y_2 follow (A.5), and that the marginal distributions of X, ε_1 and ε_2 are the same as in Example A.2, but that now the conditional joint distribution of (ε_1, ε_2) given X = x, with x ∈ [0, 1] and (u_1, u_2) ∈ [0, 1]², is as specified below. Note that ε_1 is independent of X, and thus ε_1 given X = x is U[0, 1] for each value of x ∈ [0, 1]. The same holds for ε_2 given X = x. But X still influences the dependence structure of the random vector (ε_1, ε_2). In this case, a straightforward calculation gives the conditional Kendall's tau. This example illustrates the usefulness of the concept of conditional Kendall's tau, or more generally, the concept of conditional association measures (see also Section 2).

A.4. On average conditional and partial association measures
We now look into the average conditional association measures, defined in general as E_X{ϕ(C_X)}, and the partial association measures, defined as ϕ(C̄). Table 1 in Section 1 indicates in the last column some interesting facts regarding the equality (or not) of the two concepts. Example A.3 serves to illustrate that for Kendall's tau the two in general do not coincide.
For the lower and upper tail coefficients λ_L, λ_U, ρ_L and ρ_U in Table 1, the functional ϕ(·) involves a limit expression. It is fairly easy to show, though, that for this type of association measure the concepts of average and partial tail coefficients coincide. Since the limits need not exist (see e.g. Larsson et al., 2011), we need to assume their existence.
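For concreteness, the classical tail coefficients are defined as below (a standard fact), and the lower-tail case of the coincidence result presumably runs along the following lines; this is a hedged sketch reconstructing the elided statement, not the paper's display.

```latex
% Standard lower/upper tail dependence coefficients of a copula C:
\lambda_L(C) = \lim_{u \downarrow 0} \frac{C(u,u)}{u}, \qquad
\lambda_U(C) = \lim_{u \uparrow 1} \frac{1 - 2u + C(u,u)}{1 - u}.
% For the lower tail coefficient, average and partial versions coincide:
E_X\{\lambda_L(C_X)\}
  = E_X\Bigl\{\lim_{u \downarrow 0} \frac{C_X(u,u)}{u}\Bigr\}
  = \lim_{u \downarrow 0} \frac{E_X\{C_X(u,u)\}}{u}
  = \lim_{u \downarrow 0} \frac{\bar C(u,u)}{u}
  = \lambda_L(\bar C),
% where the interchange of limit and expectation uses dominated
% convergence, justified by the Lipschitz bound
% 0 \le C_x(u,u)/u \le 1 for every x.
```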

Proposition A.1. Suppose that all quantities below exist. Then
where the second equality is justified by applying Lebesgue's dominated convergence theorem (allowing the interchange of the integral and the limit). Indeed, by the Lipschitz continuity of the copula C_x we have a bound which is integrable with respect to dF_X(x). The other statements can be proven in a similar way.

Appendices B-F: Assumptions, proofs and auxiliary results
Here we list the assumptions needed for each of the theoretical results, and provide the proofs. The proofs of the asymptotic results for the nonparametric partial copula estimator are presented in Appendices B, C and D respectively, divided according to the three major transformations. The proof of Theorem 7, establishing the asymptotic representation for the estimator of the average conditional Kendall's tau, is given in Appendix E. Finally, some auxiliary results needed in the proofs can be found in Appendix F.

Appendix B: Adjusting through parametric location-scale models
In this appendix we prove Theorem 1.

Regularity assumptions
(Cp) For j = 1, 2, the j-th first-order partial derivative of C_x exists and is continuous on the indicated set.
Note that if the parameters β_1, β_2, γ_1, γ_2 were known, then one could estimate the unknown distribution function F_jε accordingly. The estimator of C̄ could then be taken as C̄^(or)_n in (B.1), which can be viewed as a kind of 'oracle' estimator. The proof of Theorem 1 follows from Proposition B.1 and by applying standard results on the asymptotic representation of the empirical copula process to C̄^(or)_n (see e.g. Gänssler and Stute, 1987; Fermanian, Radulović and Wegkamp, 2004; Tsukahara, 2005; Segers, 2012, among others).
Proposition B.1. Assume that the assumptions of Theorem 1 are satisfied.

Proof of Proposition B.1
The proof closely follows the proof of Theorem 4 in Gijbels, Omelka and Veraverbeke (2015).

Decomposition
Let us decompose the copula process as in (B.3); the remainder term E_n is then given by (B.4). Completely analogously to the proof of Theorem 4 in Gijbels, Omelka and Veraverbeke (2015) one can show that Ā_n and B̄_n are asymptotically negligible uniformly in (u_1, u_2). Thus it remains to investigate the process E_n.

Proving (B.7)
Using the mean value theorem one can calculate the expansion (B.11), where y^x_j(u) lies between the indicated points, γ*_j lies between γ_jn and γ_j, and finally β*_j lies between β_jn and β_j. First note that, by assumptions (βγ) and (mσp), the corresponding bounds hold uniformly in x.
To proceed further with the analysis of the right-hand side of (B.11), one needs to analyse the term F_jε(F̂⁻¹_jε̂(u)). To do so it is useful to investigate the process {F̂_jε̂(F⁻¹_jε(u)), u ∈ [0, 1]}. In the same way as in the proof of Theorem 4 in Gijbels, Omelka and Veraverbeke (2015) one can show that (B.13) holds uniformly in u ∈ [0, 1]. This further implies an expansion, uniform in u ∈ [0, 1], in which z^x_j(u) lies between F⁻¹_jε(u) and y^x_j(u). Thanks to assumptions (βγ), (mσp) and Lemma F.4 (see Appendix F), this holds uniformly in u and x, which further yields (B.14). Now, thanks to (B.14) and assumptions (βγ), (F2p) and (mσp), one gets that the process {√n (F̂_jε̂(z) − F_jε(z)), z ∈ R} converges in distribution to a limiting process F_j. Further, F_j satisfies P(F_j ∈ A) = 1, where A is a set of functions on [0, 1] such that each function α ∈ A can be written as α(x) = h(F_jε(x)), with h a continuous function on [0, 1] satisfying h(0) = h(1) = 0. Now we are ready to use the Hadamard differentiability of the functional F ↦ F ∘ F̂⁻¹ at the point F̂ = F (see Lemma A.2 of Omelka, Gijbels and Veraverbeke, 2009). One can then use Theorem 3.9.4 of van der Vaart and Wellner (1996), together with the Hadamard differentiability, (B.14) and (B.15), to deduce the uniform (in u ∈ [0, 1]) expansion (B.16). It remains to deal with the terms f_jε(y^x_j(u)) and f_jε(y^x_j(u)) F̂⁻¹_jε̂(u) on the right-hand side of (B.11). First note that (B.16) together with (F2p) implies that (B.17) holds uniformly in u ∈ [0, 1], and analogously (B.18). Further note that y^x_j(u) introduced in (B.11) satisfies y^x_j(u) = F⁻¹_jε(u)(1 + a_n) + b_n, where both a_n and b_n are of order o_P(1) (uniformly in u and x). Thus, uniformly in u and x, the expansion (B.19) holds, with a*_n (respectively b*_n) lying between zero and a_n (respectively b_n). Now, (B.19) together with (B.17) and (B.18) implies the needed approximation (uniformly in u and x). Analogously to (B.16), using the Hadamard differentiability of the functional, one obtains an expansion which, together with (B.6) and (B.22), yields (B.7).

Appendix C: Adjusting through nonparametric location-scale models
In this appendix we list the assumptions needed for Theorem 2, and we provide a proof for this result.

Regularity assumptions
Let F jε stand for the distribution function of ε j .
(kn) The kernel k is twice continuously differentiable, symmetric, with support [−1, 1], decreasing on [0, 1), and integrates to one.
(Xn) The support R_X of X is a non-empty finite interval (a, b). Suppose that inf_{x∈R_X} f_X(x) > 0 and that f_X is twice continuously differentiable on R_X.
(mσ) For j = 1, 2 the functions m_j and σ_j are twice continuously differentiable on the interior of R_X and inf_{x∈R_X} σ_j(x) > 0.
Remark 2. Note that assumption (Bwn) requires that g_jn = o(n^{−1/4}). As the optimal rate is usually g_jn = O(n^{−1/5}), assumption (Bwn) says that g_jn should converge to zero faster than at the optimal rate, i.e. one should undersmooth.
The proof of Theorem 2 follows from Proposition C.1 and standard results on the asymptotic representation of the empirical copula process applied to C̄^(or)_n, given in (B.1).
Proposition C.1. Assume that the assumptions of Theorem 2 are satisfied. Then the representation (C.1) holds uniformly in (u_1, u_2).
Proof of Proposition C.1. As the proof is analogous to the proof of Proposition B.1 in Appendix B, only the differences between the two proofs are briefly indicated. In the same way as in the proof of Proposition B.1 one can use the decomposition (B.3), with the copula estimator C̄_n and the distribution functions F̂_jε̂ based on the residuals given by (3.10).
Further, completely analogously to the proof of Theorem 6 in Gijbels, Omelka and Veraverbeke (2015), one can show that Ā_n and B̄_n are asymptotically negligible uniformly in (u_1, u_2). Thus it remains to investigate the process E_n.

Treatment of E n
Note that E_n can be expressed as in (B.4), with F_jx(y) and F̂_jx(y) given by (C.2). Now, with the help of a second-order Taylor series expansion one gets (C.3), where u_jx (for j = 1, 2) lies between the points F_jx(F⁻¹_jx(u)) and F_jε(F⁻¹_jε(u)), and Y^(n)_jx is given in (B.6) (with F_jx(y) and F̂_jx(y) given by (C.2)). Now, completely analogously to the proof of Theorem 6 of Gijbels, Omelka and Veraverbeke (2015), one can show that all the terms given in (C.4) are of order o_P(1/√n) uniformly in (u_1, u_2) ∈ [0, 1]². The only difference is that, thanks to the results of Ojeda (2008) and assumption (Bwn), one gets (C.5) with a = 1/3 instead of a = 3/8 as in (C5) of Gijbels, Omelka and Veraverbeke (2015). But this is compensated by requiring δ > 1/3 in (F2n) (instead of δ > 1/4 as in (F2n) of Gijbels, Omelka and Veraverbeke (2015)), so that Lemma 3 of Gijbels, Omelka and Veraverbeke (2015) still holds true.
Thus it remains to investigate the right-hand side of (C.3). To proceed with the investigation of Y^(n)_jx one needs to deal with the quantity F_jx(F̂⁻¹_jx(u)). By (F2n), (C.5) and a second-order Taylor series expansion, one gets (C.6) uniformly in u ∈ [0, 1] and x ∈ R_X. To continue, one needs to treat the quantity F_jε(F̂⁻¹_jε̂(u)). Analogously as in the proof of Proposition B.1, it is useful to investigate F̂_jε̂(F⁻¹_jε(u)). In the same way as in the proof of Theorem 6 in Gijbels, Omelka and Veraverbeke (2015) one can show that (B.13) holds. This, together with assumptions (mσ), (F1n), (F2n), (C.5), and Lemmas F.2 and F.3 (see Appendix F), yields (C.7) uniformly in u ∈ [0, 1]. Now, analogously as in the proof of Proposition B.1, one can use the Hadamard differentiability of the functional F ↦ F ∘ F̂⁻¹ at the point F̂ = F to deduce a uniform (in u ∈ [0, 1]) expansion. Note that (C.7) together with (F2n) also yields that (B.17) and (B.18) hold uniformly in u ∈ [0, 1]. Now, using (B.17), (B.18) and the approximation (C.7) for F_jε(F̂⁻¹_jε̂(u)) in (C.6) gives an expansion that holds uniformly in u ∈ [0, 1] and x. Analogously one can show that (B.23) holds, which together with (B.22) yields a further expansion, uniform in u ∈ [0, 1] and x. Now, combining (C.8) with Lemmas F.2 and F.3 yields, uniformly in (u_1, u_2), an approximation involving φ_j defined in (5.5). Thus the right-hand side of (C.3) can be approximated by the right-hand side of (C.1) uniformly in (u_1, u_2) ∈ [0, 1]², which finishes the proof of the proposition.
This further yields the next display. Further, let τ(x) stand for the conditional Kendall's tau at the point x, and note that one can decompose as in (E.2), where V_n is given by the first term (except for the factor 4) of the right-hand side of (E.2).
where S_n,j(x) is introduced in (3.9). Note that V_n can be rewritten as displayed, where the function v(y_1, y_2, y′_1, y′_2, x) is defined in (E.1).
Below we show that (E.6) and (E.7) hold uniformly. Combining (E.5), (E.6) and (E.7) now yields a representation with a kernel h(y_1, y_2, x, y′_1, y′_2, x′, y″_1, y″_2, x″). Note that h in fact does not depend on its last two y-arguments; the reason why we keep these arguments is to stress that V_n is a V-statistic (see e.g. Chapter 5.1.2 of Serfling, 1980) with a kernel h of degree 3. Although the kernel is not symmetric, it could easily be symmetrized without affecting the quantity V_n. Further, as V-statistics and U-statistics are equivalent in terms of their √n-asymptotic distributions (Chapter 5.7.3 of Serfling, 1980), in what follows we can think of V_n as a U-statistic.
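The V- versus U-statistic equivalence invoked above can be illustrated numerically. The toy kernel h(x, y) = xy and the degree 2 are illustrative assumptions (the paper's kernel has degree 3): the V-statistic averages over all ordered tuples including the diagonal, the U-statistic only over tuples of distinct indices, and the two differ by a term of order 1/n.

```python
def v_statistic(data, h):
    """Degree-2 V-statistic: average of h over ALL ordered pairs,
    diagonal terms included."""
    n = len(data)
    return sum(h(a, b) for a in data for b in data) / (n * n)

def u_statistic(data, h):
    """Degree-2 U-statistic: average of h over pairs of distinct
    indices only."""
    n = len(data)
    tot = sum(h(a, b) for i, a in enumerate(data)
              for j, b in enumerate(data) if i != j)
    return tot / (n * (n - 1))

h = lambda x, y: x * y
data = [1.0, 2.0, 3.0]
print(v_statistic(data, h))  # (1+2+3)^2 / 9 = 4.0
print(u_statistic(data, h))  # (36 - 14) / 6 = 22/6
```

Here the gap V_n − U_n equals (diagonal mean − U_n)/n, which vanishes at the 1/n rate and so does not affect √n-asymptotics.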
In the following, the expectation E is taken with respect to the distribution of the random vectors (Y′_1, Y′_2, X′) and (Y″_1, Y″_2, X″), which are two independent copies of (Y_1, Y_2, X).
Using the notation introduced above, one can decompose V_n as in (E.12), where V^π_n is the first term on the right-hand side of (E.12).
With the help of assumptions (Bwn), (kn), (Xn) and Lemma A of Section 5.1 of Serfling (1980), one gets that, for a sufficiently large constant K and for all sufficiently large n ∈ N, the stated bound holds; this, together with E[R_n] = 0 and Chebyshev's inequality, implies that R_n = o_P(1/√n). Thus one can concentrate on V^π_n. Combining (E.8), (E.9), (E.10) and (E.11) yields the result and finishes the proof of the theorem.
Proof of statements (E.6) and (E.7). First note that, analogously as in Lemma 4 of Gijbels, Omelka and Veraverbeke (2015), using assumptions (Bwn), (kn) and (Xn), one can show that (E.13) holds uniformly, where r_n = √(log n/(n h_n)) and μ_2k = ∫ u² k(u) du.
Treatment of V_1n. (E.13) and the definition of D_n(x) in (E.4) imply that (E.14) holds uniformly. Further, by Lemma F.5 in Appendix F and a straightforward calculation, one gets an approximation that holds uniformly in x ∈ R_X, which together with (E.14) yields (E.6).
Treatment of V_jn, j = 2, 3, 4. Let us start with j = 2. Analogously as before, thanks to Lemma F.5 of Appendix F one gets (E.15), with U_n(x) given by the displayed expression, and one concludes with the help of (E.13). Analogously one can prove (E.7) for j = 3 and 4.

Appendix F: Auxiliary results
In this appendix we state auxiliary results that are used in the proofs of the main theorems and that can also be of independent interest. The results are formulated with the help of the assumptions defined in the previous appendices. To simplify the notation, the index j is usually dropped.
The following lemma is a simple adaptation of Lemma 19.24 of van der Vaart (2000), and will be useful in the proofs of Lemmas F.2 and F.3.
Lemma F.1. Suppose that X_1, X_2, . . . are identically distributed random vectors with distribution P, and that H is a set of real-valued, uniformly bounded functions defined on S_X that is P-Donsker. Further, let g be a real-valued function on S_X with a finite second moment, and let {h_n} be a sequence of real-valued functions on S_X satisfying the stated conditions.
Proof. Thanks to the assumptions of the lemma one can suppose that h_n ∈ H and that h_n is bounded. Further, by the permanence of the Donsker property, the set of functions F = {g h, h ∈ H} is also Donsker. The proof now follows by applying Lemma 19.24 of van der Vaart (2000) with f_n(x) = g(x) h_n(x) and f_0 taken to be the zero function.
Further, in the proofs of Lemmas F.2 and F.3 we make use of bracketing numbers for sets of differentiable functions. Following the notation of Chapter 2.7 of van der Vaart and Wellner (1996), let C^M_1(R_X) stand for the set of real-valued functions defined on R_X that are Lipschitz of order 1, with Lipschitz constant bounded by M. By Corollary 2.7.2 of van der Vaart and Wellner (1996) there exists a constant K such that the logarithm of the bracketing number N_[](ε, C^M_1(R_X), L_2(P)) of the set C^M_1(R_X) (with L_2(P) denoting the norm that is used) is bounded as in (F.1), for every ε > 0 and every probability measure P on R_X. Note that (F.1) implies that C^M_1(R_X) is Donsker (see e.g. Theorem 19.5 in van der Vaart, 2000).
Lemma F.2. Suppose that our observations follow the nonparametric location-scale models described at the beginning of Section 3.4.2, and let m̂_n(X) be given by (3.7). Further, let the assumptions (Bwn), (F1n), (kn), (Xn) and (mσ) hold, and let b : R_X → R possess a bounded derivative on R_X. Then the stated approximation holds.
Proof.
where A_n and B_n are the first and second terms on the right-hand side of (F.3), respectively. To deal with A_n, one can use a second-order Taylor series expansion of m(X_i) with respect to X_i at the point X, which together with the properties of the local linear weights yields an expansion in which X*_i lies between X_i and X. Note that m is bounded and that w_ni(X, g_n) = 0 for |X_i − X| > g_n. Further, following the arguments of Section 2.4.1 of Omelka, Veraverbeke and Gijbels (2013), one can easily show that Σ_{i=1}^n w_ni(x, g_n) = 1 + o_P(1) uniformly in x ∈ R_X, and thus, using (mσ) and (Bwn), one can bound A_n. One can now concentrate on B_n. As the local linear weights are given by (3.8), one needs to investigate the following two quantities, where b_1n(x) = b(x) S_n,2(x) / [S_n,0(x) S_n,2(x) − S²_n,1(x)] and b_2n(x) = b(x) S_n,1(x) / [S_n,0(x) S_n,2(x) − S²_n,1(x)], with S_n,j(x) defined in (3.9).
Note that, with the help of (E.13) and the assumptions of the lemma on g, one gets for l = 0, 1, uniformly in x ∈ R_X, the bound (F.6). Now, combining (Bwn), (F.5) and (F.6) with l = 0 yields (F.7). Analogously, combining (Bwn), (F.5) and (F.6) with l = 1 yields B_2n = o_P(1/√n). Thus it remains to treat only B_1n. Thanks to (F.7), this quantity can be approximated as in (F.8). Let us introduce the set of functions on R_X × R given by F = {(x, e) ↦ (∫ b(x + t g) k(t) dt) σ(x) e, g > 0}.
Then the first term on the right-hand side of (F.8) can be viewed as an empirical measure P_n of (X_i, ε_i), i = 1, . . . , n, indexed by the functions from F and evaluated at the function f_n(x, e) = (∫ b(x + t g_n) k(t) dt) σ(x) e. Thanks to the boundedness of the derivative of b, the set of functions is a subset of C^M_1(R_X) for a sufficiently large M, and thus Donsker. Thus Lemma F.1 with X_i = (X_i, ε_i), h_n(x, e) = ∫ b(x + t g_n) k(t) dt − b(x) and g(x, e) = σ(x) e yields the desired negligibility, which finishes the proof of the lemma.
Thanks to Ojeda (2008) and our assumptions one gets sup_{x∈R_X} |σ̂²_n(x) − σ²(x)| = o_P(n^{−1/3}), which together with (mσ) yields that sup_{x∈R_X} |σ̂_n(x) − σ(x)| = o_P(n^{−1/3}); this gives a uniform approximation. Analogously as in Lemma F.2, one can show that A_2n given by (F.16) can be approximated, up to o_P(1/√n), by an average involving ∫ k(t)/{2σ²(X_i + t g_n)} dt, and one can further use Lemma F.1 to deduce its negligibility. Further, similarly as in the proof of Lemma F.2, one can show that A_3n given by (F.13) can be approximated accordingly. By the results of Ojeda (2008) one gets P(m̂_n ∈ C^M_1(R_X)) → 1, and thus one can view A_3n as P_n(f_n), where P_n is the empirical measure indexed by the set of functions F = {(x, e) ↦ r(x) σ(x) e; r ∈ C^M_1(R_X)}, evaluated at f_n(x, e) = (m(x) − m̂_n(x)) σ(x) e ∫ b(x + t g_n) k(t) dt. Further, m̂_n(x) = m(x) + o_P(1) uniformly in x ∈ R_X. Thus one can put h_n(x, e) = m(x)−