CONDITIONAL EXPECTATION FORMULA OF COPULAS FOR HIGHER DIMENSIONS AND ITS APPLICATION

Abstract. Owing to its simple form, linear regression is the most commonly used predictive model. However, the model has some limitations: it can only capture linear relationships between variables, it assumes normally distributed errors, and it requires the absence of multicollinearity among the independent variables. One alternative that is free from these limitations is the copula-based regression model, defined by the conditional expectation formula of copulas. Leong and Valdez [Claims prediction using copula models, Insurance Math. Econom., 2005] [15] developed a conditional expectation formula of copulas for higher dimensions in implicit form, with bivariate examples. Crane and Hoek [Conditional expectation formulae for copulas, Aust. N.Z.J. Stat, 2008] [5] provided the conditional expectation formula of copulas explicitly for two dimensions, with examples. In practice, however, a predictive model, including a copula-based regression model, often involves more than two variables, i.e. one dependent variable and several independent variables. Motivated by these problems and by the limited dimensions treated in previous studies, our contribution is to extend the copula-based regression model to higher dimensions for the Farlie-Gumbel-Morgenstern, elliptical, and Archimedean classes of copulas. We obtain closed-form conditional expectation formulas for the Farlie-Gumbel-Morgenstern, Gaussian, Student-t, and Clayton copulas in n dimensions and provide the formula for the Gumbel copula up to four dimensions. We apply our extended formula to estimate


INTRODUCTION
One of the simplest and most widely used predictive models is the linear regression model. It is used to predict data that follow a linear relationship. In addition, the normality assumption must be met and multicollinearity must be avoided. However, the relationship between the independent variables and the dependent variable may be non-linear, and the data are not always normally distributed. In such cases, the linear regression model is not appropriate. The copula function, first introduced by Abe Sklar in 1959, can describe both linear and non-linear relationships between variables. It is widely used to model the dependence structure between variables with any non-linear relationship. Its advantage lies in its ability to capture any dependence structure without imposing assumptions on the marginal distributions of the variables involved.
The copula function has been developed as an alternative predictive model to linear regression. Several studies note that a copula-based regression model can generally be written as a function of the independent variables, where the model is defined as the conditional expectation of the dependent variable given the independent variables [15,5,9,11,8,6].
The main idea of constructing a simple copula-based regression model from the conditional expectation formula was introduced by Leong and Valdez [15] and Crane and Hoek [5]. Leong and Valdez [15] expressed the predicted claim in the form of a conditional expectation formula of copulas. They provided some formulas for multivariate copulas implicitly, with examples for bivariate cases. Crane and Hoek [5] derived the conditional expectation formula for the Farlie-Gumbel-Morgenstern (FGM), iterated FGM, Gaussian, and Archimedean copulas for two- and three-dimensional cases. The formula for the two-dimensional model has been applied to several examples, such as exchange-rate data, male waist size given male forearm size, and male chest size given male waist size. Kolev and Paiva [9] discussed the copula-based regression model for the Gaussian copula, the transitional regression model, the longitudinal model, and Archimedean copulas up to three dimensions. Parsa and Klugman [11] presented the formulas and algorithms necessary for conducting normal copula-based regression and provided examples for three-dimensional cases. Their simulations show that copula regression provides a good alternative to OLS and GLM. Noh et al. [8] studied the estimator for different copula-based regression models, which is asymptotically normally distributed. Masarotto and Varin [6] developed an R package that implements the Gaussian copula-based regression model.
Based on previous research, the copula-based regression model gives more flexible results than the linear regression model because it can accommodate non-linear relationships that the linear regression model cannot capture. The foundation of the copula-based regression model is the conditional expectation formula of the dependent variable given the independent variables. However, in the studies mentioned earlier, the copula-based regression model is still focused on low-dimensional cases (two or three dimensions). Therefore, our contribution in this paper is to extend the conditional expectation formula discussed by Leong and Valdez [15] and Crane and Hoek [5] to higher dimensions for the FGM copula, the elliptical copulas (Gaussian and Student-t), and the Archimedean copulas (Clayton and Gumbel). This is of interest because, in many cases, predictive models are used to solve problems involving many independent variables. We obtain closed-form conditional expectation formulas for the FGM, Gaussian, Student-t, and Clayton copulas in n dimensions and provide the formula for the Gumbel copula up to four dimensions. To give a better understanding of the extended formula, we provide an application to a financial case. We apply the three-dimensional conditional expectation of copulas to estimate the KRW/USD exchange rate based on its association with CNY/USD and JPY/USD. For visualization purposes, we plot the estimation results against the original data in a three-dimensional plot.
The rest of this paper is organized as follows. In Section 2, we discuss the conditional expectation formula for multivariate copulas and provide general functions for the FGM, elliptical, and Archimedean copulas. In Section 3, we explain the estimation procedures for marginal and copula modeling. In Section 4, we apply the three-dimensional conditional expectation formula of copulas to estimate the KRW/USD exchange rate based on its association with CNY/USD and JPY/USD. We also provide a three-dimensional plot of the estimation results for visualization purposes. The last section provides the conclusion of our research.

CONDITIONAL EXPECTATION FORMULA FOR MULTIVARIATE COPULA
Suppose that $Y$ is a random variable representing the dependent variable and $X_1, X_2, \ldots, X_n$ are random variables representing the independent variables. The conditional expectation of $Y$ given $X_1, X_2, \ldots, X_n$ is defined following [5].
For bivariate cases, suppose that $X$ represents the independent variable and $Y$ represents the dependent variable. Suppose that $u = F_Y(y)$ and $v = G_X(x)$, and that $H$ is the joint distribution function of $X$ and $Y$. Then $H$ can be expressed through a copula $C$ evaluated at $(u, v)$, and the conditional distribution function is [14]

$$P(Y \le y \mid X = x) = C_2(u, v) = \frac{\partial C(u, v)}{\partial v}. \tag{4}$$

The index 2 in Eq. (4) denotes the partial derivative with respect to the second variable, that is, the independent variable.

Furthermore, for the three-dimensional case, suppose that $Y$ is the dependent variable and $X_1$ and $X_2$ are the independent variables, with $u_1 = F(y)$, $u_2 = G_1(x_1)$, and $u_3 = G_2(x_2)$ (see the illustration in Figure 1). The multivariate distribution function at the points $P_1, P_2, \ldots, P_8$ can be expressed in terms of the copula function. Based on the region defined for the three-dimensional case in Figure 1 and the relationship between the multivariate distribution function and the copula function, the conditional distribution function of $Y$ given $X_1 = x_1$ and $X_2 = x_2$ can be derived as in [5].

Generally, for $n$ independent variables, suppose that $u_1 = F(y)$ and $u_{j+1} = G_j(x_j)$ for $j = 1, \ldots, n$, where the multivariate distribution function can be expressed in copula form. Then the conditional distribution function can be expressed as [5]

$$P(Y \le y \mid X_1 = x_1, \ldots, X_n = x_n) = \frac{C_{23 \ldots n(n+1)}\big(F(y), G_1(x_1), \ldots, G_n(x_n)\big)}{C_{23 \ldots n(n+1)}\big(1, G_1(x_1), \ldots, G_n(x_n)\big)}. \tag{7}$$

Therefore, the conditional expectation formula for multivariate cases, Eq. (8), is defined as in [5].

The Farlie-Gumbel-Morgenstern (FGM) copula is one of the parametric copulas that is widely used because of its simple form. The multivariate FGM copula, Eq. (9), is defined in [13,7]. Based on [7], the multivariate FGM copula in $d$ dimensions has $2^d - d - 1$ parameters.
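As a rough illustration of how a conditional expectation is evaluated under an FGM copula, the sketch below treats the bivariate case only. It relies on the standard identity $E(Y \mid X = x) = \int_0^1 F^{-1}(u)\, c(u, G(x))\, du$, where $c$ is the copula density, and on the bivariate FGM density $c(u, v) = 1 + \theta (1 - 2u)(1 - 2v)$. The unit-exponential margin for $Y$, the midpoint quadrature, and all function names are assumptions made for this example; they are not taken from the paper.

```python
from math import log


def fgm_density(u, v, theta):
    """Bivariate FGM copula density c(u, v) = 1 + theta*(1 - 2u)*(1 - 2v)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def exp_quantile(u):
    """Quantile function of a unit exponential margin (illustrative assumption)."""
    return -log(1.0 - u)


def cond_exp_fgm(v, theta, quantile, n=100_000):
    """Approximate E[Y | X = x] with v = G(x) by a midpoint rule on (0, 1).

    Integrates quantile(u) * c(u, v) over u; the midpoint rule never touches
    the endpoints, where the exponential quantile is unbounded.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += quantile(u) * fgm_density(u, v, theta)
    return total * h


# Hypothetical inputs: theta = 0.5 and a conditioning value with G(x) = 0.8.
estimate = cond_exp_fgm(0.8, 0.5, exp_quantile)
print(round(estimate, 3))  # close to 1.15
```

For the unit-exponential margin the integral can be done by hand and equals $1 - \tfrac{\theta}{2}(1 - 2v)$, which gives 1.15 for the inputs above, so the quadrature can be checked against it.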
By doing an algebraic derivation according to Eq.(8) using FGM d-copula from Eq.(9), we obtain the following theorem.
Theorem 2.1. Suppose that $Y$ represents a dependent variable and $X_1, X_2, \ldots, X_n$ represent the independent variables, with $u_1 = F(y)$ and $u_{j+1} = G_j(x_j)$ for $j = 1, \ldots, n$. Then the conditional expectation of $Y$ given $X_1, X_2, \ldots, X_n$ for the multivariate FGM copula is defined by

Proof. Based on Eq. (9), we have $(n + 1)$ variables consisting of $Y$ and $X_1, \ldots, X_n$, with $u_1 = F(y)$ and $u_2 = G_1(x_1)$, $u_3 = G_2(x_2)$, $\ldots$, $u_{n+1} = G_n(x_n)$; therefore we have Eq. (11) and Eq. (12). Using Eq. (8), we differentiate Eq. (11) and Eq. (12) partially to the $(n + 1)$-th and $n$-th order, respectively, which gives Eq. (13) and Eq. (14). Substituting Eq. (13) and Eq. (14) into Eq. (8), we obtain the stated formula.

For the two-dimensional case, the conditional expectation of $Y$ given $X = x$ for the bivariate FGM copula is

For the three-dimensional case, suppose that there are three variables: $Y$, which represents the dependent variable, and $X_1$ and $X_2$, which represent the independent variables, with $u_1 = F(y)$, $u_2 = G_1(x_1)$, and $u_3 = G_2(x_2)$. Then the FGM 3-copula has $2^3 - 3 - 1 = 4$ parameters and is defined by

The conditional expectation of $Y$ given $X_1 = x_1$ and $X_2 = x_2$, using Theorem 2.1, is

Example 2.4. This example provides the conditional expectation formula for the four-dimensional FGM copula. Suppose that $Y$ represents the dependent variable and $X_1$, $X_2$, and $X_3$ are the independent variables. Then the FGM 4-copula has $2^4 - 4 - 1 = 11$ parameters and is defined by

The conditional expectation formula for the FGM 4-copula is defined by

$\ldots, n + 1$, then $\theta$ has a unique parameter range, given by

Proof. The proof is an extension of Johnson and Kotz's work [10]. Suppose we have $n + 1$ variables consisting of $Y, X_1, \ldots, X_n$ with $u_1 = F(y)$, $u_2 = G_1(x_1)$, $\ldots$, $u_{n+1} = G_n(x_n)$, and let the joint probability density function of $Y, X_1, \ldots, X_n$ be given by Eq. (21). Eq. (21) is a probability density function if $h \ge 0$, which means that the expression in the bracket must be greater than or equal to zero.
For two dimensions ($d = 2$), the result is immediate. Referring to the bivariate FGM copula, we obtain the parameter range $\theta \in [-1, 1]$.
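The non-negativity condition can also be checked numerically for small dimensions. The sketch below assumes that every coefficient of the FGM $d$-copula equals one common $\theta$, so that the density is $1 + \theta S(c)$, where $c_j = 1 - 2u_j \in [-1, 1]$ and $S(c)$ is the sum of the products of the $c_j$ over all index subsets of size at least two. Because $S$ is linear in each $c_j$, it is enough to inspect the vertices $c_j = \pm 1$. The function name and the brute-force enumeration are choices made for this illustration, not the paper's derivation.

```python
from itertools import combinations, product
from math import prod


def common_theta_range(d):
    """Range of a common FGM parameter theta that keeps the d-dim density >= 0.

    Each vertex c in {-1, +1}^d gives one condition 1 + theta * S(c) >= 0:
    a lower bound on theta when S(c) > 0, an upper bound when S(c) < 0,
    and no restriction (the trivial case) when S(c) == 0.
    """
    lower, upper = float("-inf"), float("inf")
    for c in product((-1, 1), repeat=d):
        s = sum(prod(sub) for k in range(2, d + 1) for sub in combinations(c, k))
        if s > 0:
            lower = max(lower, -1.0 / s)
        elif s < 0:
            upper = min(upper, -1.0 / s)
    return lower, upper


print(common_theta_range(2))  # (-1.0, 1.0)
print(common_theta_range(3))  # (-0.25, 0.5)
```

The two printed ranges agree with the two-dimensional range above and with the three-dimensional range stated in the proof.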
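For three dimensions, consisting of $Y$, $X_1$, and $X_2$, the non-negativity condition yields three inequalities for $\theta$ (up to similar conditions obtained by permuting the indices) and a trivial solution for the case $c_{j_1} = 1$; $c_{j_2} = c_{j_3} = -1$ and two other similar conditions. Combining the three inequalities, we obtain the parameter range of $\theta$ for three dimensions, that is, $\theta \in \left[-\tfrac{1}{4}, \tfrac{1}{2}\right]$.

For four dimensions, consisting of $Y$, $X_1$, $X_2$, and $X_3$, the possible values of $\theta$ are obtained in a similar way. This method can be extended to $d$ dimensions to obtain the general solution.

In the class of elliptical copulas, two popular copulas are often used, namely the Gaussian and Student-t copulas. The multivariate copulas for this class are defined by

$$C(u_1, \ldots, u_n) = \Phi_n\big(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_n)\big) \quad \text{and} \quad C(u_1, \ldots, u_n) = t_n\big(t^{-1}(u_1), \ldots, t^{-1}(u_n)\big),$$

where $\Phi_n(\cdot)$ and $t_n(\cdot)$ are the joint distribution functions of the standard normal and Student-t distributions, respectively, and $\Phi^{-1}$ and $t^{-1}$ are the corresponding univariate quantile functions.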
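To illustrate how conditioning works under a Gaussian copula, the sketch below evaluates $E(Y \mid X_1 = x_1, X_2 = x_2)$ for three variables. It uses the standard fact that, under a Gaussian copula, the normal score of $Y$ given the normal scores of $X_1$ and $X_2$ is again normal, with mean and variance given by the usual conditional-normal formulas. The log-normal margins, all parameter values, and the function names are hypothetical choices for this example, and the routine is a numerical check rather than the closed-form expression derived in the paper.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def conditional_score_moments(z1, z2, r_y1, r_y2, r_12):
    """Mean and variance of the normal score of Y given the scores z1, z2."""
    det = 1.0 - r_12 ** 2
    b1 = (r_y1 - r_y2 * r_12) / det
    b2 = (r_y2 - r_y1 * r_12) / det
    return b1 * z1 + b2 * z2, 1.0 - (b1 * r_y1 + b2 * r_y2)


def cond_exp_gaussian(x1, x2, margins, r_y1, r_y2, r_12, n=20_000):
    """Midpoint-rule estimate of E[Y | x1, x2] with log-normal margins.

    margins = [(mu_y, s_y), (mu_1, s_1), (mu_2, s_2)] are the means and
    standard deviations of log Y, log X1, log X2 (hypothetical values).
    """
    (mu_y, s_y), (mu_1, s_1), (mu_2, s_2) = margins
    z1 = (log(x1) - mu_1) / s_1  # normal score of x1 under its margin
    z2 = (log(x2) - mu_2) / s_2  # normal score of x2 under its margin
    mean, var = conditional_score_moments(z1, z2, r_y1, r_y2, r_12)
    sd, std = sqrt(var), NormalDist()
    total = 0.0
    for i in range(n):
        z = std.inv_cdf((i + 0.5) / n)  # quantile of a standard normal
        total += exp(mu_y + s_y * (mean + sd * z))  # back to the scale of Y
    return total / n


def cond_exp_lognormal_closed(x1, x2, margins, r_y1, r_y2, r_12):
    """Closed form for log-normal margins, used only to check the quadrature."""
    (mu_y, s_y), (mu_1, s_1), (mu_2, s_2) = margins
    z1, z2 = (log(x1) - mu_1) / s_1, (log(x2) - mu_2) / s_2
    mean, var = conditional_score_moments(z1, z2, r_y1, r_y2, r_12)
    return exp(mu_y + s_y * mean + 0.5 * s_y ** 2 * var)


# Hypothetical parameters, loosely on the scale of exchange-rate levels.
margins = [(7.0, 0.03), (1.9, 0.03), (4.7, 0.02)]
numeric = cond_exp_gaussian(6.9, 110.0, margins, 0.8, 0.3, 0.2)
closed = cond_exp_lognormal_closed(6.9, 110.0, margins, 0.8, 0.3, 0.2)
print(numeric, closed)
```

With log-normal margins the integral has a closed form, which is why the second function is included; for other margins only the numerical routine would apply, with the exponential replaced by the appropriate quantile function.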
We obtain the conditional expectation formula for multivariate elliptical copula as follows.
Example 2.15. The conditional expectation of $Y$ given $X = x$ for the two-dimensional Gumbel copula is the same as that provided by [5], i.e.

Example 2.16. For the three-dimensional case, the Gumbel 3-copula is defined by

Suppose that $u_1 = F(y)$, $u_2 = G_1(x_1)$, and $u_3 = G_2(x_2)$; then the conditional expectation of $Y$ given $X_1$ and $X_2$ is
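For the bivariate Gumbel case, the conditional expectation can be approximated numerically as a cross-check. The sketch below assumes the standard bivariate Gumbel copula $C(u, v) = \exp\{-[(-\ln u)^\theta + (-\ln v)^\theta]^{1/\theta}\}$ with $\theta \ge 1$, uses its partial derivative with respect to $v$ as the conditional distribution function, and integrates the conditional survival function of a non-negative $Y$. The unit-exponential margin, the truncation point of the integral, and the function names are assumptions made for this example.

```python
from math import exp, log


def gumbel_h(u, v, theta):
    """C_2(u, v) = dC/dv for the bivariate Gumbel copula (0 < v < 1, theta >= 1)."""
    if u <= 0.0:
        return 0.0
    a = max(0.0, -log(u))
    b = -log(v)
    s = a ** theta + b ** theta
    c = exp(-s ** (1.0 / theta))  # copula value C(u, v)
    return c * s ** (1.0 / theta - 1.0) * b ** (theta - 1.0) / v


def cond_exp_gumbel(v, theta, upper=40.0, n=20_000):
    """E[Y | X = x] for Y ~ Exp(1) and v = G(x).

    Uses E[Y | x] = integral over y >= 0 of 1 - P(Y <= y | x), truncated at
    `upper` and evaluated with a midpoint rule.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        u = 1.0 - exp(-y)  # F(y) for the unit exponential margin
        total += 1.0 - gumbel_h(u, v, theta)
    return total * h


# theta = 1 is the independence case, so the result stays at E[Y] = 1.
print(round(cond_exp_gumbel(0.5, 1.0), 4))
# With theta = 2 the estimate increases with the conditioning level v.
print(cond_exp_gumbel(0.1, 2.0), cond_exp_gumbel(0.9, 2.0))
```

The independence case gives a simple sanity check, since $C_2(u, v) = u$ when $\theta = 1$ and the conditional expectation then reduces to the unconditional mean.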

ESTIMATION PROCEDURES
Suppose that the joint distribution function of the marginal variables $\{X_1, \ldots, X_n\}$ with parameter spaces $\{\Omega_1, \ldots, \Omega_n\}$ is written in copula form, where $C(\cdot)$ is the copula function and $\Theta = \{\theta_{FGM}, \rho_{Ga}, \{\rho_t, \nu_t\}, \theta_{Cl}, \theta_{Gb}\}$ is the set of parameters of the copula function.
Parameter estimation for the marginal variables and the copula function can be carried out by the maximum likelihood estimation (MLE) method. The log-likelihood function of the copula model is given by Eq. (53). Because direct maximum likelihood estimation is complicated by the number of unknown parameters, the estimation can be carried out in a two-step procedure [2,4,1]. First, estimate the marginal parameters $(\Omega_1, \ldots, \Omega_n)$, and then estimate the parameter of the copula function $\Theta$. Therefore, Eq. (53) can be written as

$$L = \sum_{j=1}^{n} L_{f_j}(\Omega_j) + L_{c_\Theta}(\Theta),$$

where $L_{f_j}(\Omega_j)$, $j = 1, \ldots, n$, is the log-likelihood function of the marginal variables and $L_{c_\Theta}(\Theta)$ is the log-likelihood function of the copula.
Finally, the copula parameter $\Theta$ is estimated by maximizing $L_{c_\Theta}(\Theta)$, and the best copula is selected by choosing the smallest AIC value.
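The two-step procedure can be sketched as follows for a deliberately small case: two log-normal margins and a bivariate Gaussian copula. The margins are fitted first by maximum likelihood, the observations are then converted to normal scores, and the copula correlation is chosen by maximizing the copula log-likelihood over a grid. The simulated data, the grid search, and every function name are assumptions made for this illustration; the AIC is computed as $2k - 2 \ln L$ with $k = 1$ copula parameter.

```python
import random
from math import exp, log, sqrt


def fit_lognormal(xs):
    """Step 1: MLE of a log-normal margin (mean and std of the log data)."""
    logs = [log(x) for x in xs]
    mu = sum(logs) / len(logs)
    sigma = sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def fit_gaussian_copula(z1, z2, grid_size=1999):
    """Step 2: maximize the Gaussian-copula log-likelihood over a rho grid."""
    n = len(z1)
    s_sq = sum(a * a + b * b for a, b in zip(z1, z2))
    s_ab = sum(a * b for a, b in zip(z1, z2))

    def loglik(rho):
        c = 1.0 - rho * rho
        return -0.5 * n * log(c) - (rho * rho * s_sq - 2.0 * rho * s_ab) / (2.0 * c)

    grid = [-0.999 + 0.001 * i for i in range(grid_size)]
    rho_hat = max(grid, key=loglik)
    ll = loglik(rho_hat)
    return rho_hat, ll, 2 * 1 - 2 * ll  # estimate, log-likelihood, AIC


def two_step_fit(x1, x2):
    """Fit both margins, transform to normal scores, then fit the copula."""
    (m1, s1), (m2, s2) = fit_lognormal(x1), fit_lognormal(x2)
    z1 = [(log(v) - m1) / s1 for v in x1]
    z2 = [(log(v) - m2) / s2 for v in x2]
    return (m1, s1), (m2, s2), fit_gaussian_copula(z1, z2)


def simulate(n, rho, seed=0):
    """Hypothetical data: log-normal margins joined by a Gaussian copula."""
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rho * a + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        x1.append(exp(0.5 + 0.2 * a))
        x2.append(exp(1.0 + 0.1 * b))
    return x1, x2


x1, x2 = simulate(2000, 0.6)
margin1, margin2, (rho_hat, loglik_hat, aic) = two_step_fit(x1, x2)
print(margin1, margin2, rho_hat, aic)
```

To compare several candidate copulas, the second step would be repeated for each family on the same normal scores or uniform scores, and the family with the smallest AIC would be kept.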

APPLICATION
In this section, we apply the general formula of the conditional expectation of copulas for higher dimensions to estimate exchange rate values. We use KRW/USD, CNY/USD, and JPY/USD data from 1 January 2018 to 11 December 2020, gathered from Yahoo Finance. First, we fit distributions to the marginal variables. We then estimate the copula parameters for the FGM, Gaussian, Student-t, Clayton, and Gumbel copulas and select the copula that best describes the relationship between the three marginal variables by selecting the smallest AIC value. Table 2 provides the parameter estimates of the marginal distribution of each exchange rate. According to Table 2, the best distribution for each marginal variable is the log-normal distribution.
The next step is to identify the dependence structure of the three marginal variables using copula modeling. Table 3 provides the parameter estimates of the copula functions. The plot of the estimation results, where $Y$ represents KRW/USD and $X_1$ and $X_2$ represent CNY/USD and JPY/USD, respectively, is provided in Figure 3. The estimated values are quite close to the original data. This is also indicated by the mean absolute percentage error (MAPE) between the original and estimated data, which is 0.0476%. Therefore, the general formula of the conditional expectation of copulas for higher dimensions can be used for estimation in cases with more than two dimensions.
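For reference, the MAPE used to compare original and estimated values can be computed as in the short sketch below. The function name and the sample values are hypothetical and serve only to show the calculation; they are not the exchange-rate data used in this section.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - e) / a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)


# Hypothetical original and estimated values.
print(mape([100.0, 200.0], [110.0, 190.0]))  # about 7.5
```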

CONCLUSION
We have extended the general formulas of the conditional expectation of copulas to higher dimensions, as the definition of the copula-based regression model, together with its basic properties, building on Crane and Hoek's work [5]. We obtained closed-form conditional expectation formulas in higher dimensions for the Farlie-Gumbel-Morgenstern, Gaussian, Student-t, and Clayton copulas, and provided some examples for the Gumbel copula.
We have also applied the conditional expectation formula of copulas in the three-dimensional case to estimate the KRW/USD exchange rate based on its association with CNY/USD and JPY/USD. The result gives a small MAPE of 0.0476%. The estimation results show that the conditional expectation formula of copulas for higher dimensions can be used to estimate cases with more than two dimensions quite accurately. In addition, the visualization shows that the estimation results are distributed around the actual values.