Dependence Measuring from Conditional Variances

Abstract. A conditional variance is an indicator of the level of independence between two random variables. We exploit this intuitive relationship and define a measure ν which is almost a measure of mutual complete dependence. Unsurprisingly, the measure attains its minimum value for many pairs of non-independent random variables. Adjusting the measure so as to make it invariant under all Borel measurable injective transformations, we obtain a copula-based measure of dependence ν* satisfying A. Rényi's postulates. Finally, we observe that every nontrivial convex combination of ν and ν* is a measure of mutual complete dependence.


Introduction
The problem of how to assign the level of dependence between two random variables in a consistent manner can never be solved completely by a single measure of dependence. There are many attributes to consider in choosing the "right" measure of dependence in a given situation, among them the nature of the dependence (linear, monotone, or other), a reference to the normal correlation coefficient, and other specific purposes. Many measures of dependence have been proposed and studied since the beginning of the twentieth century; see [10,14,16,18,20,22]. But it was not until the seminal paper of A. Rényi [16] that this problem attracted much wider attention. He proposed the following set of seven properties that should hold for a generic measure of dependence δ. To the best of our knowledge, the only measure satisfying all of these properties is the maximal correlation coefficient [8]

R(X, Y) = sup_{f,g} γ(f(X), g(Y)),

where the supremum is taken over all Borel measurable functions f and g such that the correlation coefficient γ(f(X), g(Y)) can be defined.

R0. δ(X, Y) is defined for all random variables X and Y, neither of them being constant almost surely (a.s.).
R1. δ(X, Y) = δ(Y, X).
R2. 0 ≤ δ(X, Y) ≤ 1.
R3. δ(X, Y) = 0 if and only if X and Y are independent.
R4. δ(X, Y) = 1 if X and Y are completely dependent, i.e. X is a.s. a Borel measurable function of Y or vice versa.
R5. δ(f(X), g(Y)) = δ(X, Y) for all Borel measurable injective transformations f and g.
R6. δ(X, Y) = |ρ| if X and Y are jointly normal with correlation coefficient ρ.
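Postulates R3 and R4 are exactly where the ordinary correlation coefficient fails: for X standard normal and Y = X², the variables are completely dependent yet uncorrelated, since Cov(X, X²) = E[X³] = 0. A minimal numerical sketch (the sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = x ** 2                      # Y is a Borel function of X: complete dependence

# Pearson correlation is (essentially) zero, since Cov(X, X^2) = E[X^3] = 0 ...
pearson = float(np.corrcoef(x, y)[0, 1])

# ... yet transforming X first recovers perfect correlation, the effect that the
# supremum in the maximal correlation coefficient R(X, Y) captures
recovered = float(np.corrcoef(x ** 2, y)[0, 1])
```

This is precisely the defect that the supremum over transformations in R(X, Y) repairs.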
Since the discovery of copulas and the famous Sklar's theorem [13,21], many measures of dependence defined via copulas, called copula-based measures of dependence, have been introduced. As a pair of random variables X and Y has a unique copula only when they are continuous, a copula-based measure of dependence is guaranteed to be well-defined only for continuous random variables, and hence R0 may not hold. For copula-based measures of dependence, R6 is usually replaced by the weaker postulate that δ(X, Y) be a strictly increasing function of |ρ|. Two such measures are Schweizer and Wolff's σ [18], given by σ(X, Y) = 12 ∫_{I²} |C(u, v) − uv| du dv, and Siburg and Stoimenov's ω [20], given by ω(X, Y) = (3‖C‖² − 2)^{1/2} in terms of the modified Sobolev norm ‖·‖, where C is the copula of X and Y, denoted by C_{X,Y}. Recall that Π, defined by Π(x, y) = xy, is the copula of independent continuous random variables; M(x, y) = min(x, y) is the copula of comonotonic random variables; and W(x, y) = max(x + y − 1, 0) is the copula of countermonotonic random variables. Both σ and ω are defined for continuous random variables and satisfy R1-R3. σ is called a measure of monotone dependence because its maximum value detects strict monotone dependence:

σ4. σ(X, Y) = 1 if and only if X and Y are a.s. strictly monotonically dependent, i.e. Y = f(X) a.s. for some a.s. strictly monotone measurable transformation f.
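As a numerical sanity check on the three copulas just recalled, Schweizer and Wolff's σ(X, Y) = 12 ∫_{I²} |C(u, v) − uv| dλ₂ evaluates to 0 at Π and to 1 at both M and W. A midpoint-rule sketch (the grid size is an arbitrary choice):

```python
import numpy as np

def sw_sigma(C, n=1000):
    """Schweizer-Wolff sigma = 12 * int |C(u,v) - uv| du dv, midpoint rule."""
    t = (np.arange(n) + 0.5) / n
    u, v = np.meshgrid(t, t, indexing="ij")
    return 12.0 * float(np.mean(np.abs(C(u, v) - u * v)))

Pi = lambda u, v: u * v                        # independence
M = lambda u, v: np.minimum(u, v)              # comonotonicity
W = lambda u, v: np.maximum(u + v - 1.0, 0.0)  # countermonotonicity

sigma_Pi, sigma_M, sigma_W = sw_sigma(Pi), sw_sigma(M), sw_sigma(W)
```

The values 12(1/3 − 1/4) = 1 at M and 12(1/4 − 1/6) = 1 at W confirm that σ treats increasing and decreasing functional dependence symmetrically.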
The measure ω is called a measure of mutual complete dependence because its maximum value is attained exactly when the random variables are mutually completely dependent, i.e. they are completely dependent on each other: ω4. ω(X, Y) = 1 if and only if X and Y are mutually completely dependent.
Observe that the properties R4 and R5 need to be adjusted according to which type of dependence a measure aims to detect. Historically, σ and ω have their roots in Spearman's ρ and the (modified) Sobolev norm of copulas [13,18-20]. The conditional variance of Y given X is an indicator of how weakly Y depends on X. We make this relationship more explicit as follows. For uniform [0, 1] random variables X and Y with joint distribution function, or copula, C, we observe that the L¹-norm of the conditional variance, called the total conditional variance, satisfies σ²_{Y|X} = ∫₀¹ Var(Y | X = x) dx = 1/3 − ∫_{I²} C^T * C dλ₂. This suggests that the L¹-norms of C^T * C and C * C^T might give rise to new measures of dependence with close ties to conditional variances. It turns out that the sum of these two L¹-norms gives a "measure of mutual complete dependence" ν that satisfies R1-R5, except that in R3 only the implication that ν(X, Y) = 0 when X and Y are independent holds. Moreover, ν(X, Y) is very close to |ρ| in the case when X and Y are jointly normal with correlation coefficient ρ. All of the above is developed in Section 3. In Section 4, we overcome the inability of ν to classify independence and define a new measure ν*, which is proved to satisfy R1-R5. We also show that the converse of R4 does not hold for ν*. Finally, a class of measures of mutual complete dependence is given by the nontrivial convex combinations of ν and ν*. Note also that, by computation, both ν and ν* are increasing functions of |ρ|. Let us begin with a section summarizing the necessary background on copulas, including their properties and constructions.
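The announced identity can be tested numerically: from the *-product formula one obtains ∫_{I²} C^T * C dλ₂ = ∫₀¹ ((d/dx) ∫₀¹ C(x, y) dy)² dx, so the total conditional variance 1/3 − ∫_{I²} C^T * C dλ₂ should equal Var(Y) = 1/12 under Π (X carries no information about Y) and 0 under M (Y = X a.s.). A finite-difference sketch (grid sizes are arbitrary choices):

```python
import numpy as np

def int_CtC(C, n=4000):
    """int over I^2 of C^T * C, via int_0^1 (d/dx int_0^1 C(x,y) dy)^2 dx,
    inner integral by the midpoint rule, d/dx by differencing."""
    x = np.linspace(0.0, 1.0, n + 1)
    y = (np.arange(n) + 0.5) / n
    g = np.array([float(np.mean(C(xi, y))) for xi in x])  # g(x) = int C(x,y) dy
    dg = np.diff(g) * n                                   # g'(x) at cell midpoints
    return float(np.mean(dg ** 2))

Pi = lambda u, v: u * v
M = lambda u, v: np.minimum(u, v)

tcv_Pi = 1.0 / 3.0 - int_CtC(Pi)  # total conditional variance under independence
tcv_M = 1.0 / 3.0 - int_CtC(M)    # total conditional variance when Y = X a.s.
```

The two outputs match the extreme values ∫ Π^T * Π dλ₂ = 1/4 and ∫ M dλ₂ = 1/3.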

Background on copulas
Denote I = [0, 1], let B(I) be the Borel σ-algebra on I, and let λ and λ₂ denote the Lebesgue measures on I and I², respectively. The Lebesgue integral on I is denoted simply by ∫₀¹ · dx. The symbol ∂ᵢC denotes the partial derivative of C with respect to the i-th variable.
A function C : I² → I is called a (bivariate) copula if, for all u, v ∈ I, C(u, 0) = 0 = C(0, v), C(u, 1) = u, C(1, v) = v, and C is 2-increasing, i.e. C(u₂, v₂) − C(u₂, v₁) − C(u₁, v₂) + C(u₁, v₁) ≥ 0 whenever u₁ ≤ u₂ and v₁ ≤ v₂. Every copula C can be extended to a joint distribution function of uniform [0, 1] random variables in a unique way. Let X and Y be any random variables whose distribution functions are F and G, respectively. Sklar's theorem states that every joint distribution function H of X and Y can be written as

H(x, y) = C(F(x), G(y))    (1)

for some copula C. If F and G are continuous, then C is uniquely determined by (1) and called the copula of X and Y. Conversely, putting an arbitrary copula C into (1) always yields a joint distribution function H. A copula C is said to be symmetric if its transpose C^T, given by C^T(u, v) = C(v, u), is equal to C. For more details on the theory of copulas, see [13].
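The two defining requirements, uniform behaviour on the boundary and 2-increasingness, are easy to spot-check numerically for Π, M and W; the rectangle count and seed below are arbitrary choices:

```python
import numpy as np

Pi = lambda u, v: u * v
M = lambda u, v: np.minimum(u, v)
W = lambda u, v: np.maximum(u + v - 1.0, 0.0)

def c_volume(C, u1, u2, v1, v2):
    """C-volume of [u1,u2] x [v1,v2]; 2-increasingness means this is >= 0."""
    return C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1)

rng = np.random.default_rng(1)
u1, u2 = np.sort(rng.random((2, 10_000)), axis=0)   # random rectangles, u1 <= u2
v1, v2 = np.sort(rng.random((2, 10_000)), axis=0)   # v1 <= v2
min_volume = min(float(c_volume(C, u1, u2, v1, v2).min()) for C in (Pi, M, W))

# boundary conditions: C(u, 0) = 0 = C(0, v), C(u, 1) = u, C(1, v) = v
t = rng.random(100)
boundary_ok = all(
    np.allclose(C(t, 1.0), t) and np.allclose(C(1.0, t), t)
    and np.allclose(C(t, 0.0), 0.0) and np.allclose(C(0.0, t), 0.0)
    for C in (Pi, M, W)
)
```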
In a series of papers [3,4,15], Darsow, Nguyen and Olsen introduced a binary operation on the class of bivariate copulas, called the *-product, defined by

(A * B)(u, v) = ∫₀¹ ∂₂A(u, t) ∂₁B(t, v) dt.

It was shown that a copula C is left invertible (right invertible) if and only if C is the copula of X and f(X) (of f(X) and X) for some continuous random variable X and Borel measurable transformation f. Random variables X and Y are said to be completely dependent if Y = f(X) a.s. or X = f(Y) a.s. for some Borel measurable f. They are said to be mutually completely dependent if Y = f(X) a.s. for some Borel measurable injection f. A mutual complete dependence copula is the copula of two continuous random variables which are mutually completely dependent. Note that the invertible copulas, whose class is denoted by I, are exactly the mutual complete dependence copulas.
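Discretized on a uniform grid, the *-product becomes a product of finite-difference matrices, which makes its algebra easy to experiment with. The sketch below checks the standard facts that M is the identity of the *-product and that W is its own inverse, W * W = M (the grid size is an arbitrary choice):

```python
import numpy as np

n = 200
x = np.linspace(0.0, 1.0, n + 1)
U, V = np.meshgrid(x, x, indexing="ij")
M, Pi, W = np.minimum(U, V), U * V, np.maximum(U + V - 1.0, 0.0)

def star(A, B):
    """Discrete *-product: (A*B)(u,v) = int d2A(u,t) d1B(t,v) dt becomes a
    matrix product of finite differences along the shared variable t."""
    return np.diff(A, axis=1) @ np.diff(B, axis=0) * n

MM = star(M, M)   # M acts as the identity
WW = star(W, W)   # W is its own inverse: W * W = M
```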
Shuffles of Min are the copulas of random variables X and f(X) for which f is a piecewise continuous injection. They are simple mutual complete dependence copulas in the sense that they can be constructed by cutting I² into a finite number of vertical stripes and shuffling the mass of M(u, v) = min(u, v) on the main diagonal, with possible flipping of the stripes. See [12,13] for more details on shuffles of Min. Note that the *-product of two shuffles of Min is again a shuffle of Min.
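A straight shuffle of Min can be coded as the interval-permuting map f itself; the permutation below is an arbitrary illustrative choice, and the test confirms that f is a measure-preserving piecewise-continuous injection of [0, 1), so that (X, f(X)) realizes the shuffle:

```python
import numpy as np

perm = np.array([2, 0, 3, 1])   # hypothetical shuffle: stripe k goes to slot perm[k]
n_stripes = len(perm)

def shuffle_map(u):
    """Piecewise translation implementing a straight shuffle of Min on [0, 1)."""
    u = np.asarray(u, dtype=float)
    k = np.minimum((u * n_stripes).astype(int), n_stripes - 1)  # stripe containing u
    return u + (perm[k] - k) / n_stripes

# applied to a fine uniform grid, f merely permutes the points
grid = (np.arange(100_000) + 0.5) / 100_000
image = shuffle_map(grid)
```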
The ordinal sum of copulas C₁, . . . , Cₙ with respect to a partition 0 = a₀ < a₁ < · · · < aₙ = 1 of I is the copula which, in each square [aᵢ₋₁, aᵢ]², spreads the appropriately rescaled mass of Cᵢ, and which agrees with M elsewhere.

The measure ν

Let C be the copula of uniform [0, 1] random variables X and Y on a common probability space and let x ∈ I.
Recall that the conditional distribution of Y given X = x satisfies P(Y ≤ y | X = x) = ∂₁C(x, y) a.s., and so the conditional expectation of Y given X = x is given by E[Y | X = x] = 1 − ∫₀¹ ∂₁C(x, y) dy. Denote the conditional variance of Y given X = x by Var(Y | X = x) and the total conditional variance of Y given X by σ²_{Y|X} = ∫₀¹ Var(Y | X = x) dx. Our proof will use some identities collected in the following lemma.
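Assuming uniform [0, 1] margins and the standard facts that P(Y ≤ y | X = x) = ∂₁C(x, y) a.s. and hence E[Y | X = x] = 1 − ∫₀¹ ∂₁C(x, y) dy, the formula must return 1/2 under Π (X uninformative) and x under M (Y = X a.s.). A central-difference sketch (step sizes are arbitrary choices):

```python
import numpy as np

def cond_exp(C, x, n=4000, h=1e-6):
    """E[Y | X = x] = 1 - int_0^1 d1 C(x, y) dy, d1 by a central difference."""
    y = (np.arange(n) + 0.5) / n
    d1 = (C(x + h, y) - C(x - h, y)) / (2.0 * h)
    return 1.0 - float(np.mean(d1))

Pi = lambda u, v: u * v
M = lambda u, v: np.minimum(u, v)

e_pi = cond_exp(Pi, 0.3)   # = 1/2 for the independence copula
e_m = cond_exp(M, 0.3)     # = x for the comonotone copula
```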

Lemma 3.2. Let C be a copula and f be a nonnegative bounded measurable function on I.
Proof.
(3) can be proved by repeating the arguments in the proof of Lemma 3.1 in [3].
To prove (4), let x ∈ I be such that (d/dx) ∫₀¹ C(x, y) dy exists and ∂₁C(x, y) exists for a.e. y. Note that almost every x possesses these properties. Consider a sequence {xₙ} converging to x. By the Lipschitz condition of copulas, |C(x, y) − C(xₙ, y)| / |x − xₙ| ≤ 1 for all y ∈ I. So, by the dominated convergence theorem, (d/dx) ∫₀¹ C(x, y) dy = ∫₀¹ ∂₁C(x, y) dy. We have used (3) twice in the second line, the first fundamental theorem of calculus and integration by parts in the third line, and (4) in the last line. Applying Tonelli's theorem, the second integral in the last line equals ∫_{I²} C^T * C dλ₂, which proves the identity σ²_{Y|X} = 1/3 − ∫_{I²} C^T * C dλ₂.

The total conditional variance σ²_{X|Y} of X given Y is defined similarly and can be proved to satisfy σ²_{X|Y} = 1/3 − ∫_{I²} C * C^T dλ₂. Motivated by this relationship, we define, for every bivariate copula C,

|[C]|₁ = ∫_{I²} C^T * C dλ₂,  |[C]|₂ = ∫_{I²} C * C^T dλ₂,  and  |[C]| = |[C]|₁ + |[C]|₂.

Recall that if C is the copula of mutually completely dependent continuous random variables, then C is invertible with inverse C^T, i.e. C * C^T = C^T * C = M; see [19,20]. So |[C]| = 2 ∫_{I²} M dλ₂ = 2/3. Let us note here that every idempotent copula C is symmetric [5,23]. Note also that ∫_{I²} C dλ₂ = (ρ + 3)/12, where ρ denotes Spearman's rho.
Theorem 3.3. Let C be a bivariate copula. Then:
(i) 1/4 ≤ |[C]|₁ ≤ 1/3 and 1/4 ≤ |[C]|₂ ≤ 1/3; hence 1/2 ≤ |[C]| ≤ 2/3.
(ii) |[C]|₁ = 1/3 if and only if C is left invertible, and |[C]|₂ = 1/3 if and only if C is right invertible; hence |[C]| = 2/3 if and only if C is invertible.
(iii) |[C^T]|₁ = |[C]|₂ and |[C^T]|₂ = |[C]|₁.
(iv) |[C]|₁ = 1/4 if and only if ∫₀¹ C(x, y) dy = x/2 for all x ∈ I, and |[C]|₂ = 1/4 if and only if ∫₀¹ C(x, y) dx = y/2 for all y ∈ I; hence |[C]| = 1/2 if and only if both conditions hold.
Proof. Let C be a bivariate copula. We only prove the statements for |[·]|₁.
Therefore, by Tonelli's theorem and the fundamental theorem of calculus, the desired bounds follow.

Lemma 3.5. The function A assembled from the four pieces A_ij is a copula satisfying the stated identity.

Proof. By considering all pertinent cases, it can be verified straightforwardly that A is a copula. In fact, A is the copula whose mass is spread uniformly on the corresponding supports. Summing the four integrals gives the stated value.

Figure 1. The supports of C, C^T * C and C * C^T.

Example 3.6. Let C be the copula whose mass is spread uniformly on the line segments shown in Figure 1. It follows that C * C^T and C^T * C are the uniform copulas on the supports shown in Figure 1. The integral of each A_ij is 1/4. By Lemma 3.5, the value of |[C]| follows.

Proof. By a straightforward but tedious computation, we obtain the claimed value. Since |[·]| is defined in terms of C^T * C and C * C^T, let us investigate further some properties of the self-map Ψ on the class of copulas C defined by Ψ(C) = C^T * C. The mapping C ↦ C * C^T has analogous properties.

Proposition 3.8. Let C be a copula.

1. Ψ is neither one-to-one nor onto.
2. Ψ(C) is symmetric, and hence |[Ψ(C)]|₁ = |[Ψ(C)]|₂.
3. Ψ is a continuous function from C endowed with the Sobolev norm into itself. That is, if a sequence of copulas {Cₙ} converges to a copula C in the Sobolev norm, then Ψ(Cₙ) converges to Ψ(C) in the Sobolev norm.

Proof. For 2, Ψ(C) is symmetric because (C^T * C)^T = C^T * C. For 1, one argues via Theorem 3.3 (iv) and the extremal value 1/4. For 3, the claim follows from the fact that the *-product is jointly continuous with respect to the Sobolev norm; see Theorem 4.2 in [4].
We are now ready to define the first candidate for a measure of dependence ν. For all continuous random variables X and Y with copula C_{X,Y}, let

ν(X, Y) = (6 |[C_{X,Y}]| − 3)^{1/2}.    (5)

Theorem 3.9. The measure ν satisfies the following properties:
ν1. ν(X, Y) = ν(Y, X).
ν2. 0 ≤ ν(X, Y) ≤ 1.
ν3. ν(X, Y) = 0 if X and Y are independent.
ν4. ν(X, Y) = 1 if and only if X and Y are mutually completely dependent.
ν5. ν(f(X), g(Y)) = ν(X, Y) for all a.s. strictly monotone transformations f and g.
Proof. ν1-ν4 follow directly from the definitions of |[·]| and ν and Theorem 3.3. To prove ν5, let C_{X,Y} be the copula of X and Y and consider the following four cases. If f and g are strictly increasing, then C_{f(X),g(Y)} = C_{X,Y} [13, Theorem 2.4.3], so ν(f(X), g(Y)) = ν(X, Y). If f is strictly increasing and g is strictly decreasing, then C_{f(X),g(Y)} = C_{X,Y} * W, and thus ν(f(X), g(Y)) = ν(X, Y). The case where f is strictly decreasing and g is strictly increasing follows from the symmetry of ν. The last case, when f and g are both strictly decreasing, can be proved using the fact that C_{f(X),g(Y)} = W * C_{X,Y} * W.
Note that the property ν5 in Theorem 3.9 is not valid for all Borel measurable injections; we utilize the same counterexample as the one in page 109.

Example 3.10. Consider jointly normal random variables X and Y with correlation coefficient ρ. Then ν(X, Y) is a strictly increasing function of |ρ|. Its graph, obtained from a Matlab implementation, is shown in Figure 2. Note the small difference between ν(X, Y) and |ρ|, whose graph is shown in Figure 3.

The measure ν*

In Section 3, we showed that ν, defined in (5), is almost a measure of dependence. In fact, ν is almost a measure of mutual complete dependence, as it can classify mutual complete dependence: ν(X, Y) = 1 if and only if X and Y are mutually completely dependent. However, it lacks the ability to classify independence, as there are infinitely many copulas whose measure ν is zero. This is due to the fact, proved in Theorem 3.3, that the minimum value of the ν-measure is attained exactly when

∫₀¹ C(x, y) dy = x/2 and ∫₀¹ C(x, y) dx = y/2 for all x, y ∈ I.    (6)

These equalities reflect that the copula C spreads its probability mass in an almost uniform way, albeit in a weaker sense than having the uniform mass distribution. For a copula satisfying (6) but having a non-uniform mass distribution, shuffling the masses of horizontal or vertical stripes can alter the copula values in such a way that (6) no longer holds. Recall from [17] that shuffling a bivariate copula amounts to transforming one of the two random variables by a Borel measurable piecewise continuous injection. More generally, C_{f(X),X} * C_{X,Y} = C_{f(X),Y} for any Borel measurable injection f; see [17, Theorem 4.1]. Such a copula C_{f(X),X} is invertible with inverse C^T_{f(X),X} = C_{X,f(X)}: C_{X,f(X)} * C_{f(X),X} = M = C_{f(X),X} * C_{X,f(X)}. In light of this observation, given continuous random variables X and Y with copula C_{X,Y}, we define

|[C_{X,Y}]|* = sup_{f,g} |[C_{f(X),g(Y)}]|  and  ν*(X, Y) = (6 |[C_{X,Y}]|* − 3)^{1/2},

where the supremum is taken over all Borel measurable injective transformations f and g.
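The computation behind Example 3.10 can be reproduced with elementary quadrature. Here we assume the normalization ν = (6 |[C]| − 3)^{1/2}, chosen so that ν runs from 0 to 1 as |[C]| runs over [1/2, 2/3]; by the symmetry of the normal copula this gives ν² = 12 ∫_{I²} C^T * C dλ₂ − 3, and for jointly normal variables ∫_{I²} C^T * C dλ₂ = ∫ φ(x) Φ(ρx/√(2 − ρ²))² dx. The closed form in the code is obtained by evaluating this integral analytically under the same assumptions:

```python
import numpy as np
from math import erf, sqrt, pi, asin

def nu_normal(rho, n=20001, L=8.0):
    """Quadrature of int phi(x) * Phi(c x)^2 dx with c = rho / sqrt(2 - rho^2),
    then nu = sqrt(12 * integral - 3) (assumed normalization)."""
    x = np.linspace(-L, L, n)
    dx = x[1] - x[0]
    phi = np.exp(-0.5 * x * x) / sqrt(2.0 * pi)
    Phi_cx = np.array(
        [0.5 * (1.0 + erf(rho / sqrt(2.0 - rho * rho) * xi / sqrt(2.0))) for xi in x]
    )
    integral = float(np.sum(phi * Phi_cx ** 2) * dx)
    return sqrt(max(12.0 * integral - 3.0, 0.0))

def nu_closed(rho):
    """Same quantity in closed form under the same assumptions."""
    return sqrt(6.0 * asin(0.5 * rho * rho) / pi)

vals = {r: nu_normal(r) for r in (0.0, 0.3, 0.5, 0.8)}
```

The values stay within a few hundredths of |ρ|, matching the small difference visible in Figure 3.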
Using the facts that C_{f(X),g(Y)} = C_{f(X),X} * C_{X,Y} * C_{Y,g(Y)} ([17, Corollary 4.6]) and that C_{f(X),X} and C_{Y,g(Y)} are invertible, we obtain

|[C_{X,Y}]|* = sup_{S,R ∈ I} |[S * C_{X,Y} * R]|.

Likewise, we can show that ν* is invariant under all Borel measurable injective transformations of X and Y. The following theorem shows that ν* is a copula-based measure of dependence in the sense of A. Rényi [16], i.e. all of R1-R5 are satisfied.
Theorem 4.1. The measure ν* satisfies the properties ν*1-ν*5, the analogues of ν1-ν5, where ν*3 is the equivalence that ν*(X, Y) = 0 if and only if X and Y are independent, and ν*5 holds for all Borel measurable injective transformations f and g.

Proof. In order to prove ν*4, we show only that ν*(X, Y) = 1 if Y is a Borel measurable function of X, as the other case is similar. This is equivalent to proving that |[C]|* = 2/3 when C is left invertible. Suppose a copula C is left invertible. Then C^T * C = M and |[C * S]|₁ = ∫_{I²} S^T * C^T * C * S dλ₂ = ∫_{I²} M dλ₂ = 1/3 for every invertible copula S. For sup_{S∈I} |[S * C]|₂, let us start from the fact, see [5], that E = C * C^T is an idempotent copula whose invariant sets form a nonatomic σ-algebra E ⊆ B(I).¹ As a consequence, by Corollary 1.12.10 in [1], for every integer n = 1, 2, . . ., there exists an essential partition consisting of sets P₁, . . . , Pₙ in E, in the sense that the Pᵢ are essentially disjoint and their union is essentially all of I. Since each Pᵢ is an invariant set of E, we obtain the corresponding identity, where the first equality follows from a standard measure-theoretic argument starting from the Pᵢ being intervals. By the symmetry of the idempotent E, we get the same identity with the roles of the variables interchanged. We then use {Pᵢ} to construct an invertible copula Sₙ for which Sₙ * E * Sₙ^T is supported in the union of the diagonal squares. In fact, by Lemma 4.
To prove the opposite direction of ν*3, or equivalently that |[C]|* = 1/2 implies C = Π, we define S_{α,β}, for 0 ≤ α ≤ β ≤ 1, as the shuffle of Min whose support consists of at most three line segments of slope 1, shown in Figure 4.

¹ A measurable set S is called an invariant set of a copula A if the characteristic function χ_S is a fixed point of the Markov operator associated with A. A σ-algebra E is said to be nonatomic if for every S ∈ E there exists a subset S′ of S in E such that 0 < λ(S′) < λ(S).
For every copula C and 0 ≤ α ≤ β ≤ 1, a direct computation gives an expression for S_{α,β} * C. Integrating S_{α,β} * C with respect to x and making suitable changes of variables yields the key identity. Since i is arbitrary, repeated use of this equation gives C(i/2ⁿ, y) = · · · = (i/2ⁿ) C(1, y) = (i/2ⁿ) y.
By the continuity of C and the denseness of the dyadic rationals in [0, 1], we have C = Π as desired.
Remark. We now give an example to demonstrate that the converse of ν*4 is not true. By the proof of ν*4 above, any nonatomic idempotent copula attains the maximum value of |[·]|*. However, in order to illustrate the shuffling in the proof, let us consider E₀ = (M + W)/2, which is neither left nor right invertible; equivalently, E₀ is the copula whose mass is spread uniformly on the two diagonals of I². It follows that Sₙ is a shuffle of Min,

Sₙ = ord_{2^{n−1}}(S₁) * S_{n−1}  and  Eₙ = ord_{2^{n−1}}(S₁) * E_{n−1} * ord_{2^{n−1}}(S₁)^T.    (8)
Using the recursive relation (8), it can be shown by induction on n that Eₙ = ord_{2ⁿ}(E₀). The first few Eₙ are illustrated in Figure 6. Since the value of this ordinal sum agrees with M except possibly on the union ∪_{i=1}^{2ⁿ} [(i−1)/2ⁿ, i/2ⁿ]², whose area is 1/2ⁿ → 0, ∫_{I²} Sₙ * E₀ * Sₙ^T dλ₂ converges to ∫_{I²} M dλ₂ = 1/3.
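The convergence at the end of this example can be imitated numerically: writing the ordinal sum of k equal copies of E₀ = (M + W)/2 explicitly, its integral over I² must increase toward ∫_{I²} M dλ₂ = 1/3, since the two copulas differ only on the k diagonal squares. A sketch (grid size arbitrary; the block formula is the standard ordinal-sum scaling):

```python
import numpy as np

M = lambda u, v: np.minimum(u, v)
W = lambda u, v: np.maximum(u + v - 1.0, 0.0)
E0 = lambda u, v: 0.5 * (M(u, v) + W(u, v))

def ordinal_sum(C, k, u, v):
    """Ordinal sum of k equal copies of C: a rescaled copy of C on each diagonal
    square [(i-1)/k, i/k]^2, and agreement with M elsewhere."""
    iu = np.minimum(np.floor(u * k), k - 1)
    iv = np.minimum(np.floor(v * k), k - 1)
    a = iu / k
    local = a + C((u - a) * k, (v - a) * k) / k
    return np.where(iu == iv, local, M(u, v))

def integral(C, k, n=1000):
    """Midpoint-rule integral of the ordinal sum over I^2."""
    t = (np.arange(n) + 0.5) / n
    u, v = np.meshgrid(t, t, indexing="ij")
    return float(np.mean(ordinal_sum(C, k, u, v)))

ints = [integral(E0, k) for k in (1, 2, 4, 8)]   # increases toward 1/3
```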
Owing to a suggestion of one of the anonymous referees, we are pleased to propose a class of measures of mutual complete dependence ν_α defined as the nontrivial convex combinations ν_α = αν + (1 − α)ν* for 0 < α < 1. Each ν_α satisfies the analogues ν_α1-ν_α5 of the properties in Theorems 3.9 and 4.1.

Proof. The proof uses the corresponding properties of ν and ν* in Theorems 3.9 and 4.1. ν_α1 and ν_α2 clearly follow from the same properties of ν and ν*. If X and Y are independent, then ν(X, Y) = ν*(X, Y) = 0 and hence ν_α(X, Y) = 0. Conversely, if ν_α(X, Y) = 0, then ν*(X, Y) must be zero and so X and Y are independent.
να5 is a result of ν5 and ν * 5.

Conclusion
We show that the L¹-norm of C^T * C + C * C^T gives rise to a [0, 1]-valued function ν of continuous random variables which is almost a measure of mutual complete dependence, as it cannot identify independence. We then prove that the measure ν*, modified from ν in such a way that it is invariant under all one-to-one Borel measurable transformations, satisfies the five essential properties in Rényi's postulates for measures of dependence. Finally, every nontrivial convex combination of ν and ν* is a measure of mutual complete dependence.