Local Dependence in Bivariate Copulae with Beta Marginals

The local dependence function (LDF) describes changes in the correlation structure of continuous bivariate random variables along their range. Bivariate density functions with Beta marginals can be used to model jointly a wide variety of data with bounded outcomes in the (0,1) range, e.g. proportions. In this paper we obtain expressions for the LDF of bivariate densities constructed using three different copula models (Frank, Gumbel and Joe) with Beta marginal distributions, present examples for each, and discuss an application of these models to analyse data collected in a study of marks obtained on a statistics exam by postgraduate students.


Introduction
Interest often lies in studying bivariate bounded distributions, for example where both random variables (rv's) are rates or proportions limited from 0 to 1, and these can be efficiently modelled using bivariate Beta distributions. There are many fields of application for such joint models, e.g. mathematics and language exam marks of students, results of psychological tests applied to matched groups of patients receiving intervention and placebo, the percentiles of height and weight, or the proportions of household income spent on food and on heating.
There are several models for bivariate Beta distributions: some are derived from transformations of three standard (Olkin & Liu 2003), non-central and five (Gupta, Orozco-Castañeda & Nagar 2011) Gamma-distributed rv's; others arise from the relations between the Beta, F and skew-t distributions (Jones 2002, El-Bassiouny & Jones 2009).
Sometimes there is no desire to impose constraints on the data-generating mechanism, such as those driven by transformations of Gamma densities, and alternative processes are required. Thus, another possibility is provided by the class of bivariate distributions with Beta marginals constructed via copula functions.
The copula parameter θ defines the degree of association between the marginals, and it is well known that it is directly associated with Kendall's correlation coefficient τ (Schweizer & Wolff 1981). Moreover, a scalar measure of correlation, e.g. Pearson's r, is often not enough to adequately describe the dependence structure of bivariate rv's $(X_1, X_2)$ with continuous joint density function $f$. An interesting review paper on copulas and dependence structures is Escarela & Hernández (2009).
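For reference, the standard way of writing Kendall's τ in terms of a copula, which is presumably the relation meant above, is sketched here. The notation $C_\theta$ for a one-parameter copula is an assumption of this sketch; for such a family the right-hand side is a function of θ alone.

```latex
\tau \;=\; 4 \int_0^1\!\!\int_0^1 C_\theta(u, v)\, \mathrm{d}C_\theta(u, v) \;-\; 1
```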
Local dependence measures allow a thorough exploration of the nature of the joint variation by analysing changes in the strength of the association along the range of $(X_1, X_2)$. There are several models and graphical representations of the notion of local dependence, for example local dependence maps (Jones & Koch 2003), local Gaussian correlation (Tjøstheim & Hufthammer 2013), Kendall plots (Genest & Boies 2003) and chi-plots (Fisher & Switzer 1985). Here we focus on the definition of the local dependence function (LDF) as given by Holland & Wang (1987):

$$\gamma(x_1, x_2) = \frac{\partial^2}{\partial x_1\, \partial x_2} \ln f(x_1, x_2). \qquad (1)$$

The LDF $\gamma(x_1, x_2)$ gives the mixed rate of change (the cross partial derivative) of the natural logarithm of the bivariate density at every combination of the $x_1$ and $x_2$ values. The LDF can be seen as a localization of the Pearson correlation coefficient r for $(X_1, X_2)$ (Jones 1996), and measures the strength of the association between $X_1$ and $X_2$ in a neighbourhood of any point $(x_1, x_2)$ in the domain of $f$. For independent rv's γ is uniformly 0, and it is constant only for the few distributions identified by Jones (1998), among them the bivariate normal with value $\gamma = r / (1 - r^2)$.
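As a minimal sketch of the Holland & Wang definition, the mixed partial derivative of the log density can be approximated by central differences. The function names below are illustrative, and a standard bivariate normal (unit variances) is assumed so that the result can be compared with the constant value $r / (1 - r^2)$ quoted above.

```python
import math

def log_bvn_density(x1, x2, r):
    """Log density of a standard bivariate normal with correlation r."""
    q = (x1 * x1 - 2.0 * r * x1 * x2 + x2 * x2) / (1.0 - r * r)
    return -math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - r * r) - 0.5 * q

def ldf_numeric(log_f, x1, x2, h=1e-3):
    """Central-difference estimate of d^2 log f / (dx1 dx2) at (x1, x2)."""
    return (log_f(x1 + h, x2 + h) - log_f(x1 + h, x2 - h)
            - log_f(x1 - h, x2 + h) + log_f(x1 - h, x2 - h)) / (4.0 * h * h)

r = 0.6  # illustrative correlation, not taken from the paper
gamma_hat = ldf_numeric(lambda s, t: log_bvn_density(s, t, r), 0.3, -1.2)
gamma_exact = r / (1.0 - r * r)
```

Because the log density of the normal is quadratic, the finite-difference estimate agrees with the constant value at any evaluation point, up to rounding.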
The class of bivariate models with Beta marginals provides an interesting framework in which to explore the role of the LDF in revealing bivariate structures, because the Beta distribution is extremely flexible and can produce probability density functions (pdf's) in many shapes, e.g. U- and J-shaped, symmetric, and even uniform. Some work on this has been done by Gupta, Kirmani & Srivastava (2010); see the LDF formulae in their equations 11 and 20 for the Farlie-Gumbel-Morgenstern (FGM) (Morgenstern 1956, Gumbel 1960, Farlie 1960) and Ali-Mikhail-Haq (AMH) (Ali, Mikhail & Haq 1978, Kumar 2010) copulae with Beta marginals, respectively. We aim to extend these LDF formulae to further bivariate copulas with Beta marginals that have not been previously studied. The LDF, as given in equation (1), is directly related to the density of the bivariate distribution itself via its moments, as shown by Jones (1996); hence it is easy to construct when the bivariate distribution is fully specified. We hope to improve research practice in exploring bivariate associations by providing the LDF for three bivariate distributions obtained by coupling Beta marginals and by focusing on the interpretation of their graphical displays.
The copula models featured in this paper were chosen based on their widespread usage in the multivariate research field, the ease of construction of their LDF, and their ability to capture a wide range of dependency structures. We focus on three models from the Archimedean family (Nelsen 2013): Frank (1979), Gumbel (1960) and Joe (1993, 1997).
In the next section, we provide expressions of the LDF for the three bivariate copulas with Beta marginals (as obtained via the computational software Mathematica version 10 (Wolfram Research Inc 2014) using the MathStatica extension (Rose & Smith 2002)), along with illustrations of the copula density and the LDF for each with varying parameters (using R version 3.1.2, R Development Core Team (2007)) to facilitate the reader's understanding and interpretation of the LDF figures. In section 3, for the purposes of completeness, we discuss the graphical interpretation of the results for the FGM and AMH copulas with Beta marginals as presented by Gupta et al. (2010).
There are very few published applications of the LDF to real data, despite its great potential for use on the non-Normal bivariate associations that researchers often deal with. Usually the analysis of such associations is restricted to a nonparametric scalar correlation coefficient, such as Spearman's or Kendall's. In section 4, we fit copula models with Beta marginals to a dataset from the field of statistical education by maximum likelihood (Yan 2007), compare the fitted models, and demonstrate the usefulness of the derived LDF formulations as an alternative to scalar correlation coefficients.

Copula-Defined Bivariate Distributions with Beta Marginals
Let $X_1, X_2$ be random variables, each with a univariate Beta distribution with shape parameters $a_i, b_i > 0$, $i = 1, 2$, respectively, i.e.:

$$f_i(x_i) = \frac{x_i^{a_i - 1} (1 - x_i)^{b_i - 1}}{B(a_i, b_i)}, \qquad 0 < x_i < 1,$$

where $B(\cdot, \cdot)$ is the Beta function and $F_i(x_i) = \int_0^{x_i} f_i(t)\, dt$ denotes the corresponding cdf.

In each copula section below, each LDF is followed by two pairs of graphs displaying the contours of the copula density function on the left and of the LDF on the right, for different marginal and dependence parameters. The parameters of the bivariate distribution ($a_i$, $b_i$ and θ) differ between copulas for the first pair of graphs (Figures 1, 3 and 5), but are the same for the second set of graphs (Figures 2, 4 and 6). This illustrates how the same copula function can lead to markedly different bivariate structures for varying marginal and dependence parameters, as well as how the same parameters can lead to markedly different bivariate associations for differing copula functions. The contours are slices of a distribution drawn at certain levels; in these examples the following values are shown: 0, 0.2, 1, 2, 3, 4, 5, 10, 15 and 100. These represent density levels on the left-hand graphs and local dependence levels on the right-hand graphs. For both graphs, the x and y axes represent the two rv's, $X_1$ and $X_2$ respectively.
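The Beta marginals enter every model below through their pdf and cdf. The following is a minimal standard-library sketch of both; the function names are illustrative, the cdf is obtained by Simpson integration rather than a library routine, and shape parameters of at least 1 are assumed so that the density is finite at the endpoints.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density on [0, 1]; assumes a, b >= 1 (finite at the endpoints)."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / math.exp(log_beta)

def beta_cdf(x, a, b, n=2000):
    """Beta(a, b) cdf via composite Simpson integration of the pdf over [0, x]; n must be even."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for k in range(1, n):
        # Simpson weights: 4 on odd nodes, 2 on even interior nodes
        total += (4.0 if k % 2 else 2.0) * beta_pdf(k * h, a, b)
    return total * h / 3.0
```

For U- or J-shaped marginals (a shape parameter below 1) the endpoint singularity would need a different quadrature or a library incomplete Beta function.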

Frank Copula
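A Frank copula is constructed via the following combination of univariate marginals $F_1(x_1)$ and $F_2(x_2)$, where $\theta \neq 0$:

$$F(x_1, x_2) = -\frac{1}{\theta} \ln\left[ 1 + \frac{\left(e^{-\theta F_1(x_1)} - 1\right)\left(e^{-\theta F_2(x_2)} - 1\right)}{e^{-\theta} - 1} \right].$$

The density of the copula with Beta marginals can be written as follows, and corresponds to a five-parameter family of bivariate distributions with Beta marginals (with $F_i = F_i(x_i)$ and $f_i = f_i(x_i)$ for $i = 1, 2$):

$$f(x_1, x_2) = \frac{\theta \left(1 - e^{-\theta}\right) e^{-\theta (F_1 + F_2)}\, f_1 f_2}{\left[ \left(1 - e^{-\theta}\right) - \left(1 - e^{-\theta F_1}\right)\left(1 - e^{-\theta F_2}\right) \right]^2}.$$

The LDF of this copula is equal to:

$$\gamma(x_1, x_2) = 2 \theta\, f(x_1, x_2).$$

Notice that the latter has the same sign as θ all over the unit square.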
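Differentiating the logarithm of the standard Frank copula density gives $\partial^2 \ln c / \partial u\, \partial v = 2\theta c$, so that with any marginals the LDF is $2\theta$ times the joint density. The sketch below checks this numerically for Beta marginals. It assumes the standard parameterisation of the Frank copula, shape parameters of at least 1, and illustrative parameter values that are not those of the paper's figures.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density; assumes a, b >= 1."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / math.exp(log_beta)

def beta_cdf(x, a, b, n=400):
    """Composite Simpson estimate of the Beta(a, b) cdf at x in (0, 1)."""
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * beta_pdf(k * h, a, b)
    return total * h / 3.0

def frank_joint_pdf(x1, x2, a1, b1, a2, b2, theta):
    """Joint density: Frank copula density at (F1, F2) times the two Beta pdfs."""
    u, v = beta_cdf(x1, a1, b1), beta_cdf(x2, a2, b2)
    k = 1.0 - math.exp(-theta)
    denom = k - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))
    c = theta * k * math.exp(-theta * (u + v)) / denom ** 2
    return c * beta_pdf(x1, a1, b1) * beta_pdf(x2, a2, b2)

def ldf_numeric(pdf, x1, x2, h=1e-3):
    """Central-difference estimate of d^2 log f / (dx1 dx2)."""
    lf = lambda s, t: math.log(pdf(s, t))
    return (lf(x1 + h, x2 + h) - lf(x1 + h, x2 - h)
            - lf(x1 - h, x2 + h) + lf(x1 - h, x2 - h)) / (4.0 * h * h)

# Illustrative (hypothetical) parameters: Beta(2, 3) and Beta(3, 2) marginals, theta = -4
params = (2.0, 3.0, 3.0, 2.0, -4.0)
pdf = lambda s, t: frank_joint_pdf(s, t, *params)
density = pdf(0.4, 0.6)
ldf_fd = ldf_numeric(pdf, 0.4, 0.6)
ldf_closed = 2.0 * params[4] * density
```

With a negative θ both `ldf_fd` and `ldf_closed` are negative, in line with the remark that the LDF shares the sign of θ.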
Figure 1 presents the pdf and LDF contours of a Frank copula with given parameters.
Figure 1: pdf and LDF of a Frank copula with Beta marginals.

Both the pdf and the LDF follow the same general pattern, the only difference being that the local dependence function extends more widely to accommodate pairs of $x_1$ and $x_2$ that have low bivariate density but are locally associated with respect to the whole range of values. Notice the negative values of the LDF contours, a result of the negative dependence parameter.
The following set of graphs (Figure 2) shows a Frank copula with changed marginal and dependence parameters. Notice that the general shape of the LDF is very similar to that of the pdf, as in the previous example, i.e. local dependence values mirror the density values very well around the same areas of the bivariate relationship.

Gumbel Copula
The Gumbel copula is defined as follows, where θ ∈ [1, ∞):

$$F(x_1, x_2) = \exp\left\{ -\left[ \left(-\ln F_1(x_1)\right)^{\theta} + \left(-\ln F_2(x_2)\right)^{\theta} \right]^{1/\theta} \right\}.$$

The density of the copula when the two marginals are Beta distributions, and its LDF, can be written in terms of $L_{x_i} = -\ln F_i(x_i)$, $i = 1, 2$. In contrast with the previous copula, there is no linear relationship between the dependence parameter θ and the LDF for this copula.
The pdf graph of the copula shown in Figure 3 may give the impression of a similar positive association over most of the joint range. However, the LDF graph provides a more thorough view of the dependence. The correlation is maximal along the main positive diagonal, whilst it decreases rather quickly off the diagonal and becomes minimal along the (0, 1) and (1, 0) axes.
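A numerical sketch of the Gumbel case follows. It uses the standard closed form of the Gumbel copula density, written with $L = -\ln u$, and evaluates the LDF by finite differences instead of a closed-form expression. Function names and parameter values are illustrative assumptions, and shape parameters of at least 1 are assumed.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density; assumes a, b >= 1."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / math.exp(log_beta)

def beta_cdf(x, a, b, n=400):
    """Composite Simpson estimate of the Beta(a, b) cdf at x in (0, 1)."""
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * beta_pdf(k * h, a, b)
    return total * h / 3.0

def gumbel_copula_density(u, v, theta):
    """Standard Gumbel copula density for u, v in (0, 1) and theta >= 1."""
    lu, lv = -math.log(u), -math.log(v)
    s = lu ** theta + lv ** theta
    cdf = math.exp(-s ** (1.0 / theta))
    return (cdf / (u * v) * (lu * lv) ** (theta - 1.0)
            * (s ** (1.0 / theta) + theta - 1.0) / s ** (2.0 - 1.0 / theta))

def gumbel_joint_pdf(x1, x2, a1, b1, a2, b2, theta):
    """Joint density with Beta marginals."""
    u, v = beta_cdf(x1, a1, b1), beta_cdf(x2, a2, b2)
    return gumbel_copula_density(u, v, theta) * beta_pdf(x1, a1, b1) * beta_pdf(x2, a2, b2)

def ldf_numeric(pdf, x1, x2, h=1e-3):
    """Central-difference estimate of d^2 log f / (dx1 dx2)."""
    lf = lambda s, t: math.log(pdf(s, t))
    return (lf(x1 + h, x2 + h) - lf(x1 + h, x2 - h)
            - lf(x1 - h, x2 + h) + lf(x1 - h, x2 - h)) / (4.0 * h * h)

# Illustrative values: uniform (Beta(1, 1)) marginals, theta = 2, centre of the unit square
ldf_centre = ldf_numeric(lambda s, t: gumbel_joint_pdf(s, t, 1.0, 1.0, 1.0, 1.0, 2.0), 0.5, 0.5)
# theta = 1 is the independence copula, so the LDF should vanish
ldf_indep = ldf_numeric(lambda s, t: gumbel_joint_pdf(s, t, 2.0, 2.0, 2.0, 2.0, 1.0), 0.4, 0.7)
```

The finite-difference route is slower than a closed form but applies unchanged to any copula whose density can be evaluated.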

Joe Copula
The Joe copula with Beta marginals yields the following bivariate distribution, where θ ∈ (1, ∞):

$$F(x_1, x_2) = 1 - \left[ \left(1 - F_1(x_1)\right)^{\theta} + \left(1 - F_2(x_2)\right)^{\theta} - \left(1 - F_1(x_1)\right)^{\theta} \left(1 - F_2(x_2)\right)^{\theta} \right]^{1/\theta}.$$

The density of the copula when the two marginals are Beta distributions, and its LDF, can be written in terms of powers of the survival functions $1 - F_i(x_i)$, $i = 1, 2$. As with the Gumbel copula, the overall sign of the LDF cannot be determined directly from the sign of θ.
In Figure 5, density is highest at the top right corner, whilst the remaining corners have lower peaks. The highest levels of local dependence are found in similar locations on the LDF graph too, i.e. all corners, with the top right corner having the highest level of local dependence, i.e. large values of both $X_1$ and $X_2$ are most highly correlated. The bivariate copula shown in Figure 6 has the same parameters as the second example of the previous two copulas, Frank (Figure 2) and Gumbel (Figure 4). This illustrates the flexibility of the various copula models to capture different bivariate structures given the same parameters. This is mirrored in the LDF graphs too, which capture local dependence at all sets of neighbouring points. The LDF in Figure 6 is an example of how the same amount of local dependence is found in two different areas of the bivariate plot, i.e. for low values of $X_1$ and $X_2$ and for values of $X_1$ and $X_2$ within the range (0.4, 1).
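The Joe case can be sketched in the same way. Differentiating the standard Joe copula cdf gives the density used below, and the LDF of the joint density with Beta marginals is again evaluated by finite differences. Names and parameter values are illustrative assumptions, and shape parameters of at least 1 are assumed.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density; assumes a, b >= 1."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return x ** (a - 1.0) * (1.0 - x) ** (b - 1.0) / math.exp(log_beta)

def beta_cdf(x, a, b, n=400):
    """Composite Simpson estimate of the Beta(a, b) cdf at x in (0, 1)."""
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * beta_pdf(k * h, a, b)
    return total * h / 3.0

def joe_copula_cdf(u, v, theta):
    """Standard Joe copula cdf, theta >= 1."""
    a, b = (1.0 - u) ** theta, (1.0 - v) ** theta
    return 1.0 - (a + b - a * b) ** (1.0 / theta)

def joe_copula_density(u, v, theta):
    """Mixed partial derivative of the Joe copula cdf with respect to u and v."""
    a, b = (1.0 - u) ** theta, (1.0 - v) ** theta
    s = a + b - a * b
    return (s ** (1.0 / theta - 2.0) * ((1.0 - u) * (1.0 - v)) ** (theta - 1.0)
            * (theta - 1.0 + s))

def joe_joint_pdf(x1, x2, a1, b1, a2, b2, theta):
    """Joint density with Beta marginals."""
    u, v = beta_cdf(x1, a1, b1), beta_cdf(x2, a2, b2)
    return joe_copula_density(u, v, theta) * beta_pdf(x1, a1, b1) * beta_pdf(x2, a2, b2)

def ldf_numeric(pdf, x1, x2, h=1e-3):
    """Central-difference estimate of d^2 log f / (dx1 dx2)."""
    lf = lambda s, t: math.log(pdf(s, t))
    return (lf(x1 + h, x2 + h) - lf(x1 + h, x2 - h)
            - lf(x1 - h, x2 + h) + lf(x1 - h, x2 - h)) / (4.0 * h * h)

# Illustrative (hypothetical) parameters: Beta(2, 3) and Beta(3, 2) marginals, theta = 2.5
ldf_value = ldf_numeric(lambda s, t: joe_joint_pdf(s, t, 2.0, 3.0, 3.0, 2.0, 2.5), 0.4, 0.6)
```

Evaluating `ldf_numeric` over a grid of points gives the values needed for a contour display of the kind described for Figures 5 and 6.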

More Examples and Results
For completeness, we present here examples of the FGM and AMH copulas, based on the expressions presented by Gupta et al. (2010), in Figures 7 and 8 respectively. Notice that, although the general shapes of the two densities are similar, the LDFs are markedly different. The LDF of this FGM model has its highest values for $x_1$ values higher than 0.4 combined with low values of $x_2$, and similarly for $x_2$ values higher than 0.4 combined with low values of $x_1$, whilst this pattern was not evident on the density graph. In contrast, the AMH copula reaches its highest local dependence close to the bottom left corner of the graph, where the highest density is also observed. An interesting result for the FGM copula expresses Pearson's r for this bivariate model as a function of its five parameters. Since this bivariate distribution is constructed as an FGM copula, its joint moments reduce to products of univariate integrals in each variable, as shown by D'Este (1981) and Gupta & Wong (1985); hence r can be written in closed form in terms of ${}_3F_2$, the generalized regularized hypergeometric function (El-Bassiouny & Jones 2009). It was shown by Schucany, Parr & Boyer (1978) that the absolute value of the correlation coefficient for any FGM copula is less than or equal to 1/3, together with examples of particular marginals with correlations smaller than 1/3. It is easy to see that this bound is reached for this bivariate distribution, e.g. if $a_1 = b_1 = a_2 = b_2 = 1$ then $r = \theta / 3$. If all four parameters of the univariate Beta distributions in the copula are equal, r takes values between θ/4 (all parameters tending to 0) and θ/3 (all equal to 1), and then decreases very little, staying just above 0.318θ for parameter values much larger than 1. We were only able to reach this result for the FGM copula due to the ease of calculation of the moments of its bivariate expression; this was not possible for the other bivariate copulas presented in this paper.
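The product structure of the FGM moments can be illustrated numerically. Assuming the standard FGM density $1 + \theta (1 - 2u)(1 - 2v)$, the covariance is $\theta E_1 E_2$ with $E_i = \int_0^1 x \,(1 - 2 F_i(x))\, f_i(x)\, dx$, so r needs only two one-dimensional integrals. The sketch below evaluates them with a midpoint rule instead of the hypergeometric closed form; all function names are illustrative.

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density for x strictly inside (0, 1)."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x) - log_beta)

def fgm_integral(a, b, n=20000):
    """Midpoint-rule estimate of the integral of x (1 - 2 F(x)) f(x) over (0, 1)."""
    h = 1.0 / n
    cum = 0.0    # running estimate of F at the left edge of the current cell
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        fx = beta_pdf(x, a, b)
        big_f = cum + 0.5 * fx * h      # F at the cell midpoint
        total += x * (1.0 - 2.0 * big_f) * fx * h
        cum += fx * h
    return total

def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) random variable."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))

def fgm_pearson_r(a1, b1, a2, b2, theta):
    """Pearson's r for an FGM copula with Beta(a1, b1) and Beta(a2, b2) marginals."""
    cov = theta * fgm_integral(a1, b1) * fgm_integral(a2, b2)
    return cov / (beta_sd(a1, b1) * beta_sd(a2, b2))

r_uniform = fgm_pearson_r(1.0, 1.0, 1.0, 1.0, 1.0)    # all parameters equal to 1
r_large = fgm_pearson_r(20.0, 20.0, 20.0, 20.0, 1.0)  # all parameters much larger than 1
```

With all shape parameters equal to 1 the function reproduces $r = \theta / 3$, and for large equal parameters it settles close to the lower value described in the text.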

Application
Having calculated the cdfs and LDFs of three copula models with Beta marginals, we analyse the association between postgraduate students' grades for a practical SPSS task and a set of multiple-choice questions (MCQs), which together comprise a statistics module exam. The data were collected as part of an MSc Statistics module run at University College London by the first two authors in 2013 (n = 116 students) and are shown in Figure 9. Both variables are expressed as proportions between 0 and 1; the median grade is 0.72 for the SPSS task and 0.51 for the MCQs, with r = 0.67 and τ = 0.54. Additionally, the estimates of multivariate skewness ($m_1$) and kurtosis ($m_2$) defined by Mardia (1970) have values of 1.5 and 10.7, respectively, providing evidence of a bivariate non-Normal distribution when compared with the threshold values of 0 and d/(d + 2) = 0.5, where d is the dimensionality of the dataset, which indicate multivariate Normality (Mardia 1974). The bivariate Normal distribution is neither a natural choice for this data set, due to the bounded nature of the variables, nor does it provide a good fit (BIC = −266) compared to the other models that were fitted, as shown in Table 1.
For this application, we fitted the Frank, Gumbel and Joe copulas with univariate Beta marginals for the two related proportions. The use of the AMH and FGM copulas is not possible for this dataset, as the estimated τ is outside the allowed bounds of (−0.18, 0.33) and (−0.22, 0.22), respectively, for these copula models. According to the BIC and AIC values (Kuha 2004), the Frank copula provides the best fit to this dataset. The left-hand graphs of Figure 10 present the contours of the fitted bivariate densities, based on the maximum likelihood estimates of the parameters of each of the Beta marginals and of the dependence parameter θ (all were found significant at 5%). The right-hand graphs show the contours of the associated LDF. The fitted pdf shows how densely observations occur within a certain area but does not tell us anything about the association (not necessarily linear) of those points within the same area, which is what the LDF does. The pdf of the Frank copula model shows higher density around marks of (0.6, 0.9) and (0.4, 0.6), as well as a wider spread of the values for the lower marks, hence wider contours. This is additionally emphasised in the LDF: the highest dependence value is located in the exact same range of values, with lower local dependence values found for the remaining ranges. The LDF provides a more comprehensive description of the relationship between the two types of grades, which would otherwise have been summarised merely via a constant Kendall's correlation coefficient of τ = 0.54.
Focusing more on the LDF, what we know from years of teaching experience is apparent in this sample: practical tasks are able to boost students' overall marks, as students can score relatively well on the practical task (achieving marks between 0.6 and 0.9) whilst they only achieve a pass on average on the theoretical questions (marks between 0.4 and 0.6).
The Gumbel and Joe copulae fail to capture the evident dependency between the SPSS and MCQ marks. The Gumbel copula puts more emphasis on the peripheral points of this graph, the two extreme areas of the bivariate plot, and entirely misses the strong dependency in the densest areas of the scatterplot. The Joe copula adopts a flatter association between the two variables, with marks for the software-based question between 0.2 and 0.9 being associated with marks ranging from 0.3 to 0.6 for the non-practical questions, suggesting that students can perform rather well at the SPSS task whilst their theoretical marks might not be as high.
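The quoted bounds on τ can be reproduced from the standard closed-form expressions for Kendall's τ of the FGM and AMH copulas. The sketch below is only an illustration: the function names are assumptions, and the AMH upper bound is approached as θ tends to 1.

```python
import math

def fgm_tau(theta):
    """Kendall's tau of the FGM copula, theta in [-1, 1]."""
    return 2.0 * theta / 9.0

def amh_tau(theta):
    """Kendall's tau of the AMH copula, theta in [-1, 1) with theta != 0."""
    return (1.0 - 2.0 / (3.0 * theta)
            - 2.0 * (1.0 - theta) ** 2 * math.log(1.0 - theta) / (3.0 * theta ** 2))

fgm_bounds = (fgm_tau(-1.0), fgm_tau(1.0))       # roughly (-0.22, 0.22)
amh_bounds = (amh_tau(-1.0), amh_tau(0.999999))  # roughly (-0.18, 0.33)

observed_tau = 0.54  # sample value reported for the exam data
fgm_feasible = fgm_bounds[0] <= observed_tau <= fgm_bounds[1]
amh_feasible = amh_bounds[0] <= observed_tau <= amh_bounds[1]
```

Both flags come out false for the observed τ = 0.54, which is why neither copula can reproduce the association in these data.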

Discussion
The flexibility of individual copula types to cover a wide range of bivariate distributions with Beta marginals is emphasised via the use of different parameter sets (contrast Figures 1 and 2, 3 and 4, and 5 and 6). To show the difference between copula models, we have used the same parameter sets for Figures 2, 4 and 6. It is important for researchers to realise that correlation coefficients do not always convey the relationship between numerical variables in the best possible way. Summarising the entire correlation structure in a single constant value does not account for changes in the strength of the association across the joint range of $x_1$ and $x_2$. The local dependence function overcomes this problem by producing a detailed graphical display of the association between $x_1$ and $x_2$ across all values. The LDF can produce distinctly different graphs for bivariate density functions that look very much alike, as demonstrated by the contrasting Figures 7 and 8.
We have presented three LDF formulae, which can be easily programmed to produce LDF graphs for bivariate scenarios where Beta marginals are a relevant and good choice for the univariate density functions.
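As a rough indication of how little code such a display needs, the sketch below evaluates an LDF on a grid of interior points, ready to be passed to any contour-plotting routine. It uses the FGM density with uniform (Beta(1, 1)) marginals purely as a compact example, because its LDF has the simple closed form $4\theta / f(x_1, x_2)^2$ against which the grid can be checked. The function names and the value of θ are assumptions of the sketch.

```python
import math

def fgm_uniform_pdf(x1, x2, theta):
    """FGM joint density with Beta(1, 1) (uniform) marginals, |theta| <= 1."""
    return 1.0 + theta * (1.0 - 2.0 * x1) * (1.0 - 2.0 * x2)

def ldf_numeric(pdf, x1, x2, h=1e-3):
    """Central-difference estimate of d^2 log f / (dx1 dx2)."""
    lf = lambda s, t: math.log(pdf(s, t))
    return (lf(x1 + h, x2 + h) - lf(x1 + h, x2 - h)
            - lf(x1 - h, x2 + h) + lf(x1 - h, x2 - h)) / (4.0 * h * h)

def ldf_grid(pdf, n):
    """Return the grid points (i + 1) / (n + 1) and the n-by-n table of LDF values."""
    pts = [(i + 1) / (n + 1) for i in range(n)]
    return pts, [[ldf_numeric(pdf, x1, x2) for x2 in pts] for x1 in pts]

theta = 0.5  # illustrative dependence parameter
pts, grid = ldf_grid(lambda s, t: fgm_uniform_pdf(s, t, theta), 5)
```

Replacing `fgm_uniform_pdf` with any other joint density, for instance a copula density multiplied by two Beta pdfs, leaves the grid routine unchanged.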
Our results could be extended to incorporate covariates, so that the LDF could then be drawn conditional on selected predictor values. Additionally, the same principles used here could be applied to marginals other than the univariate Beta, such as the multivariate skew-normal distribution defined by Azzalini & Dalla Valle (1996). Such extensions would lead to even greater scope for the application of the LDF in the many fields where bivariate rv's are analysed, thus enhancing researchers' understanding of local dependence structures.

Conclusions
We have explored bivariate associations between rv's by combining the notion of local dependence with copulas. We anticipate that this work will facilitate better research within the field of multivariate dependence structures.

Figure 4 shows another example of the density and dependence functions of a Gumbel copula. The pdf emphasises the high density at the bottom left corner of the bivariate distribution, whereas the LDF shows a strong correlation along the central part of the distribution.

Figure 9: Scatterplot of students' marks for SPSS and MCQs.

Figure 10: Frank, Gumbel and Joe copula bivariate distribution for data on students' marks.

Table 1: AIC and BIC of fitted models.