Assessing green bond risk: an empirical investigation

Green bonds have gained a significant share in the bond market. However, dynamic risk and its spillover to other conventional bond investments plays an important role in its understanding. In this paper, we analyze the volatility and correlation dynamics between conventional bond and green bond assets under both loose and stringent eligibility green-labeled criteria. We build dynamic conditional correlation (DCC) model specifications using alternative distributional assumptions. We also assess risk dynamics expressed by Value-at-Risk (VaR) and its corresponding loss function. We illustrate risk assessment in within and out-of-sample periods using conventional and green bond returns. The results show that there is significant spillover between conventional and green bond assets, triggering significant hedging strategies. However, these spillover effects are subjected to the type of green-labeled criteria. Finally, a risk assessment using VaR forecasting and its corresponding loss function estimation also demonstrates significant differentiation between green and conventional bonds.

potential risks, specifically contagious risk, which should be considered when investing. Recently, Agliardi and Agliardi (2021) propose a structural model for defaultable bonds incorporating both uncertainty about corporate earnings and uncertainty due to climate-related risks. They study how bond pricing is affected by transition risks, such as those coming from an abrupt change of climate policies. They show how the issuer's credit quality changes as a result of its engagement in projects funded by GBs and study the impact of GBs on investors' portfolio allocation.
Finally, other approaches investigate hedging strategies based on dynamic hedge ratio (hereafter HR) estimation using conditional volatility models of the GARCH family (Pham, 2016;Jin et al., 2020). Pham (2016) signifies that a shock in the overall conventional bond market tends to spillover into the green bond market, where this spillover effect is variable over time. Moreover, he reveals a non-constant interaction of volatility between the green bond market and the overall broader bond market due to HR instability. On the other hand, Jin et al. (2020) point out that GBs are strong hedging instruments since the green bond index produces positive HR values in the volatile period while other market indexes such as carbon, energy and commodity index fail to do so. However, the above estimation strategy adopted by Pham (2016), can be made more robust in the direction of the use of alternative distributional assumptions regarding to the existing competitive models. Other issues also arise regarding the use of other measures of risk such as the VaR and its corresponding loss function. In this paper, we will resolve the model selection problems using univariate and multivariate conditional volatility analysis and the estimation of conditional VaR at various nominal levels. One of the most popular multivariate volatility models is the dynamic conditional correlation (hereafter DCC) model (Engle, 2002;Tse and Tsui, 2002). The main advantage of the DCC model is the positive definiteness of the conditional covariance matrices and its ability to describe time-varying conditional correlations and covariances in a parsimonious way. Concerning parametric estimation, the DCC model can be estimated in two stages, which makes this approach relatively simple and possible to apply even for very large portfolios. Here, we adopt DCC models of alternative distributional assumptions such as: the Normal, the Student-t and the Laplace. Concerning the VaR, this is a quantile risk measure used in finance representing the minimum loss for an asset or portfolio over a pre-specified time horizon for a given probability (confidence level). In our case, time-varying VaR and its corresponding loss measure is derived from competing DCC models. Another issue that is addressed in the present paper is the investigation of the existence of differentiation between alternative GBs indexes and how this is affecting the volatility clustering and volatility spillovers. Within this context, this paper can give answers to the following questions: Is correlation between conventional and GBs subjected to the type of green-labeling? Does correlation between conventional and GBs evolve differently when it comes to adopt either loose or stringent eligibility green-labeled criteria? How does the last differentiation affects hedging strategies.
The objective of this study is three-fold. First, we assess the conditional variance and other risk measures of green bond returns in comparison with the corresponding CBs under alternative volatility specifications. Thus, by using univariate and multivariate conditional volatility specifications of the GARCH family, we compare alternative model specifications for their ability to estimate volatility, conditional covariance and conditional correlation. Special reference is given to the forecasting of the Value-at-Risk and its corresponding loss using the alternative multivariate specifications. Second, under the bivariate analysis which involves both conventional and GBs, we analyze issues such as how volatility spills over between conventional and GBs, whether there is any asymmetry in volatility transfer and the effect of condition hedging using a Hedge Ratio (HR). Third, we investigate the above volatility and correlation dynamics when CB interacts with GBs under either loose or stringent eligibility green-labeled criteria. This can reveal the existence of potential differentiation in the way one invest when considering GBs of alternative labeling.
The paper is organized as follows. Section 2 reviews the notion of the conditional volatility and the conditional VaR. In the same section, we approach conditional volatility and VaR under a univariate and multivariate version using the DCC models. We specify these models under various distributional assumptions which involves the Normal, the Student-t and the Laplace cases. Special reference is given to the conditional VaR estimation and its corresponding loss function. In Section 3 we illustrate our risk assessment results using real data under univariate and multivariate volatility specifications.
Here, a within sample period is used to assess conditional volatility, covariance and the HR between conventional and GBs. Section 4 presents forecasting evaluation results using conditional covariance, VaR and its corresponding loss function measure performed for all of the competing DCC models. Finally, we present our conclusions in Section 5.

Conditional volatility models
The time-varying risk of bond returns expressed by volatility needs to be investigated. A widely used methodology in modeling volatility is through GARCH(p, q) models. Under this framework, the bond returns at time t, y t is related to a conditional mean component plus an error term, such as (Bollerslev, 1986): where the µ t ∼ ARMA(p, q) can be an autoregressive moving average process and the t an independent and identically distributed error term with mean zero and variance that equals h t . The latter component then evolves according to the autoregressive moving average structure with h t being the conditional variance, such as: Estimation is usually based on maximum likelihood where the stationarity condition α 1 + · · · + α p + β 1 + · · · + β q < 1 need to be considered. Here, high and statistically significant values for α i and β j indicate the existence of volatility persistence, where periods of high volatility are followed by periods of high volatility and periods of low volatility are followed by periods of low volatility. Usually, in financial returns analysis and considering likelihood test criteria, the GARCH(1, 1) model is adopted. This is of the form: The time-varying volatility comparison between the green bond and the conventional bond returns can be done using the above equations separately for each series. Therefore, the estimation and testing procedure of a univariate GARCH(1, 1) model has the following steps: 1. We estimate the mean valuesμ t for each time t based on an ARMA(p, q) model (usually an ARMA(1, 1) is enough). 2. We derive the estimated errorsˆ t = y t −μ t which are then used as component in the GARCH(1, 1) specification and the likelihood function derivation.
3. Given the α 1 + β 1 < 1 stationarity condition, a conditional maximum likelihood method is used to estimate the ω, α 1 and β 1 parameters from a GARCH(1, 1) model. 4. The estimated volatilityĥ t is derived together with diagnostic testing score functions such as the AIC and BIC.
However, if we want to consider the interactions between the green bond and the conventional bond market this cannot be taken into account using a univariate GARCH model. Thus, under a multivariate GARCH model the volatility of an asset return doesn't only depend on its past values but also on the volatility of an other asset expressed by the covariance function.
Thus, to understand how GBs' risk evolves dynamically in relation with other CBs we need to investigate its volatility dynamics simultaneously. Engle (2002) has extended the above univariate conditional volatility model to the so-called dynamic conditional correlation model (DCC). According to this, let's suppose a multivariate N × 1 error process t which can be written as: distributed according to a D distribution with a N ×N conditional covariance matrix. Then, its covariance functions evolves such as: with the conditional correlation function and with D t = diag(h 1/2 1t , . . . , h 1/2 Nt ). Here, the z t denotes the standardized IID N × 1 error such as z t = D −1 t t , the cor t denotes the N × N conditional correlation function of the z t , the S the unconditional correlation of the errors and Q * t the diagonal N × N matrix which consists of the square root of the diagonal elements of Q t . The above specification represents the so-called DCC(P, Q)-GARCH model. Concerning the parameters, these should have the following stationary condition: Q i=1 ζ i + P j=1 θ j < 1. Here, the green bond returns y t,G and the conventional bond returns y t,C formulate a bivariate DCC-GARCH model. The estimation of the DCC's model involves two steps. First, the estimation of the following univariate models: for the green bond and the conventional bond index returns respectively. Here, ω i > 0, α j,i , β k,i ≥ 0 for i = G, C, j = 1, . . . p and k = 1, . . . , q. The covariance stationarity of t,i requires that p j=1 α j,i + q k=1 β k,i < 1. After estimating the above parameters, the standardized residuals are estimated for each case:ẑ t,i =ˆ t,i / ĥ t,i where i = G, C. Then, the conditional correlations between the standardized residuals are calculated. Thus, the conditional covariance function is used to produce the conditional correlation functionρ t,i, j =ĉ t,i, j / ĉ t,i,i ·ĉ t, j, j . At the end, the first and the second steps are used to estimate the conditional variance-covariance matrix of the errors of the two univariate errors analyzed (Engle, 2002). The parameters a and b express the magnitude of volatility spillover between the analyzed returns. Thus, the estimation and testing procedure of a bivariate DCC(1,1)-GARCH model has the following steps: 1. Steps 1 and 2 of the univariate GARCH(1, 1) are followed here for each individual series. 2. The likelihood function is split into two parts: the pure volatility one which is like a linear combination of two univariate likelihood functions and the correlation plus volatility part. Here, volatility for each series is estimated separately using a univariate-likelihood function under the maximum likelihood method. 3. Given the univariate volatility estimates derived in step 2, the a and b parameters are estimated using the second part of the likelihood function and applying the maximum likelihood method.
Here, the AIC and BIC diagnostic scores are also estimated (Engle, 2002).
In the past, various authors have used DCC-GARCH models of linear and non-linear specifications; such as the DCC-APGARCH, DCC-T-GARCH, and DCC-GJR-GARCH models (Pham, 2016;Reboredo, 2018;Jin et al., 2020). Concerning their results, the two first papers show no systematic evidences for the need for diversification to CBs when buying GBs. However, all of them used the Normality assumption for the errors. In the present analysis, we will adopt three DCC-GARCH model specifications: with Normal, Student-t and Laplace errors. * After deriving conditional volatility estimates, one can generate another important risk measure: the VaR. The VaR is a quantile measure that is widely used in finance as an instrument to control and manage risk. This measure represents the minimum loss for an asset or portfolio for a given probability α. This is used as a mathematical tool designed to offer optimal asset allocation in financial management strategies. The VaR estimates the worst expected loss over a specific time period for a given nominal level α, such as the following: P(y t < −q t,α ) = α with q t,α ≡ {r : F(r) ≥ α} ≡ F −1 (α), where F is the conditional distribution function of the y t observed financial return, F −1 the inverse conditional distribution, and q t,α the α-level quantile of the distribution function of the y t conditional on the information up until t − 1 given a vector of returns y = (y 1 , . . . , y t−1 ). By estimating the worst risk, risk analysts can make optimal asset allocation. Conditional VaR strategies that account for the dependence structure of volatility have been modeled using mostly some fully parametric models such as the GARCH-class (for other approaches in conditional VaR estimation see: Tsiotas, 2018;Tsiotas, 2020). In such a case, the location-scale class is based on the assumption that returns belong to a location-scale family of probability distributions. Using the above approach, we can specify the conditional VaR estimate at the α-level quantile as: (11) * Here, the probability density function of the Student-t and Laplace errors is given by: where D −1 is the inverse CDF for the distribution D at α nominal level. In the empirical analysis that follows, we utilize three different assumptions for the innovation distribution: the Normal; the Student-t with ν degrees of freedom and the Laplace density.  Table 2. VaR evaluation for the AB, GB and GBS return series using empirical, Gaussian and Cornish-Fisher (C-F) methods.

AB GB GBS
VaR

Real data analysis
To make a risk analysis comparison between the conventional bond and the green bond market, we consider the daily time series data from the S&P U.S. Aggregate Bond Index (denoted hereafter as AB), the S&P Green Bond Index (hereafter GB), and S&P Green Bond Select Index (hereafter GBS). The AB measures measures the performance of U.S. dollar-dominated investment grade U.S. fixed income market, including U.S. Treasuries, quasi-governments, supranational non-U.S. goverments and agencies, corporates, covered bonds, residential mortgage pass-troughs and taxable municipal bonds. Concerning its credit rating criteria, the AB has as the minimum credit rating for inclusion in investment-grade indexes with BBB-/Baa3/BBB-. On the other hand, the GB is composed of a variety of global bonds labeled "green" by Climate Bond Initiative (CBI) and subject to eligibility criteria. This index undergoes a re-balancing process once a month, with the intend of keeping the index current. The GBS is a market value-weighted subset of the GB that seeks to measure the performance of green-labeled bonds issued globally, subject to stringent financial and extra-financial eligibility criteria. In other words, the GBS includes those issuers have provided accurate information about the use of proceeds, or whose compliance with the Green Bond Principles around the "Use of Proceeds, Process for Project Evaluation and Selection, Management of Proceeds and Reporting" has been independently verified. Under this procedure, the GBS calls for higher standards of transparency, disclosure and accountability in the green bond market (S&P, 2021). Whereas, the the credit rating of the GBS requires as the minimum credit rating for inclusion in investment-grade indexes with BBB-/Baa3/BBB-, the GB includes sub-indexes of non-rated bonds issued by U.S. government sponsored enterprises.
For these closing price indexes, we derive the daily returns than span between the 2nd of May 2011 and the 21th of May 2021. Figure 1 presents the returns of the AB, GB and GBS indexes. What we observe is that in all returns the mean values are nearly zero and the GB and the GBS returns have spikes that are more extreme than those observed in the AB returns. Before we proceed with our analysis, we summarize some descriptive statistics measures of the AB, GB and GBS return series described in Table 1. These include the mean, standard deviation, minimum, maximum, skewness, excess kurtosis coefficients, and the Jarque-Bera normality test. The mean values show that in all series the return is nearly zero a phenomenon that is widely observed in all stock return series. As regards the standard deviation this indicates an overall variation of the series. Thus, we observe that the GBS return series exhibits the highest variation followed by the GB one. Also, the maximum and the minimum values indicate the upper and the lower range of values returns can take. Again, the GB return indexes show higher risk than the conventional bond return index with the GB having the lowest minimum value and the GBS the highest maximum value among the bond return indexes. The skewness and excess kurtosis measures assess how much the returns deviate from the Normality assumption. Therefore, we observe that both the AB and GBs have negative skewness. Concerning the excess kurtosis, they all have leptokutosis with the GB returns having more than the others. Finally, the Jarque-Bera test is a test that examines the existence or not of Normality. It is based on a chi-square distribution with two degrees of freedom. In practice the higher its value the more probable is to reject the Normality assumption. Under this test and using the p-value results for all return series we have Normality violation with the stronger one being observed for the GB return index.
An important statistical measure that can quantify the bond market risk between the green and the conventional market is the VaR. VaR is a number that indicates how much a financial institution can loose with some probability over a given time horizon.
Here, together with the empirical VaR we demonstrated the values under the Normal and Cornish-Fisher (C-F) VaR estimates. The Normal VaR estimates (VaR N ) are generated by: where µ and s are the empirical mean and standard deviation values of the return series respectively and q α,N is the α-level percentile under standard Normality (here (P(Z ≤ q N,α ) = α)). A robust type of VaR that can count for Normality violations such as excess kurtosis and skewness can be estimated using C-F VaR estimates. These are generated by first deriving the C-F α-level percentile as illustrated below: where a 1 is the skewness coefficient and K = a 2 − 3 is the excess kurtosis coefficient. Therefore, C-F VaR estimates (VaR CF ) are given by: In Table 2, we present the α = 1% and 5% VaR of the AB, GB and GBS return series using: the empirical VaR, the Normal VaR and Cornish-Fisher VaR. The empirical one is in accordance to the formal VaR definition. Under all methods the GBs seem riskier to the conventional bond with the GBS showing the highest risk under all methods of estimation for both α = 1% and 5% nominal levels. In Figure 2 we present the α = 1% and 5% VaR of the AB, GB and GBS return using the Normal VaR estimates.  An additional statistical measure that reveals the overall return series characteristics is the rolling conditional covariance and correlation functions. By selecting a 100 and a 200 day window, we estimate the empirical covariance and the correlation between AB and GB and between AB and GBS index return series. Figure 3 demonstrates the results for the 100 and 200 day window respectively. The rolling covariance and correlation are generally very stable with the stability to be increased with the increase of the rolling window. What we can observe is that the rolling covariance and correlation between the AB and the GBS are more stable than those between the AB and the GB a phenomenon that shows that there is more causal relationship between the AB and the GBS than between the AB and the GB indexes.

Univariate GARCH modeling
In this subsection we will do conditional volatility estimation for each return series separately. Before we proceed with a univariate GARCH estimation, we test whether a GARCH model is valid for the available return series. To do so, we need to test for the existence of correlation in the squared returns using the Ljung-Box test. The Ljung-Box statistic uses the squared residuals of an ARMA(p, q) model to check for GARCH model adequacy (for details see Tsay, 2005). Table 3 reports the statistic of this test using the squared residuals of the ARMA(1, 1) model. Under the p-value results, the null hypothesis which signifies that there are not GARCH effects is rejected even at the 1% significance level for all return series. Another, less formal test for the validity or not of GARCH modeling is to assess the absolute and the squared return values for serial correlation using autocorrelation functions (ACF). In Figure 4, we report the ACF of the observed, the absolute and the squared AB, GB and GBS return series. These indicate that these is nearly zero correlation in the returns and significant correlation in both the absolute and squared returns for all series. This signifies that the series can exhibit significant correlation in their volatility (volatility clustering). In such a case, periods of high volatility may be followed by periods of high volatility and periods of low volatility may be followed by periods of low volatility.
After this formal and informal test for the existence of the conditional volatility effect, we run univariate volatility estimation for the three return series using the whole sample. Initially, we consider alternative Autoregressive Conditional heteroscedasticity (ARCH) and GARCH models of various p and q specifications. However, as the extended financial literature suggests, the GARCH(1, 1) process is the best specification when it comes to estimate volatility dynamics (Tsay, 2005). The parametric model that is considered here is the GARCH(1, 1) model with Gaussian and Student-t errors. Much of the literature on conditional volatility estimation and forecasting uses these models as benchmarks. The models are specified as: for distribution D with Normal and Student-t errors. Table 4 presents the estimated parameters of the univariate GARCH(1, 1) model for the AB, the GB and the GBS return series under the Normal and the Student-t distribution assumptions. To perform the models fitness we use the AIC and the BIC score functions. Under these measures the Student-t GARCH(1, 1) model is the best performer since it minimizes both scores under all data cases. The reported parametric estimates that correspond to the α 1 , β 1 and ν have a high level of significance in all data cases. In terms of the measure of persistence in conditional volatility (α 1 + β 1 ), the GB return series have the highest persistence with the GBS following and the AB in the third place. This is translated into more volatility clustering for the green bond indexes compared to the conventional one. Also, using the ln 0.5/(ln(α 1 + β 1 ) measure one can count the days needed to reduce a volatility shock by 50% (see Pham, 2016 for details). This is estimated in 17.7207 days for the AB index, 137.2096 days for the GB index and 132.2114 days for the GBS one. Again, the volatility impact of the GB index is considered the highest compared to the other two. If we see the conditional variance estimates using the Student-t model, presented in Figure 5, we can diagnose that although the GB index has a higher volatility clustering compared to the GBS the latter has higher minimum and maximum volatility values with a very high volatility spike observed in mid of 2014. This characteristic comes in accordance to the VaR results where the GBS index has been considered as the riskiest asset compared to the AB and GB ones. Concerning the kurtosis parameter ν, again the GB index returns show higher deviation from Normality than the CB. Here, the GBS series show the highest deviation since the estimated degrees of freedom parameter ν is the lowest one.

Bivariate GARCH modeling
In the univariate GARCH analysis we have captured volatility dynamics for the individual time series. However, this fails to capture a potential volatility interaction (or spillover) between the time series especially those between GBs and CBs. As seen on Figure 5, there are periods where the bond returns' volatility moves towards the same and in some other periods the opposite direction. In this subsection, we intend to analyze the volatility movements of the GB and the GBS returns in relation with those expressed by conventional bond returns such as the AB ones. In practice we will demonstrate the results of the bivariate DCC(1, 1)-GARCH model applied in two cases: the AB with the GB (hereafter AB-GB) return series and the AB with the GBS return series (hereafter AB-GBS).
We have specified three DCC-GARCH specifications depending on the distribution assumption assumed: The Multivariate Normal (hereafter MV-Normal), the Multivariate Student-t (hereafter MV-Std) and the Multivariate Laplace (hereafter MV-Laplace). Under these specifications, we estimate the Bivariate models for the AB-GB and AB-GBS index return series. In Table 5 we demonstrate two diagnostic testing measures such as the AIC and the BIC scores. Results show that the MV-Std is the best performer under the two multivariate data cases. Here, we need to point out that other DCC-GARCH specifications such as the asymmetric (DCC-GJR-GARCH), with threshold (DCC-T-GARCH) and the flexible DCC-GARCH cases do not perform better than the "standard" DCC-GARCH models. More specifically, the asymmetric DCC-GARCH model aims to count for the leverage effect observed in many financial data. As results have shown (NB: not demonstrated in the present manuscript), the asymmetric coefficient is very small and statistically insignificant signifying that the volatility transfer between green and CBs is not affected by the leverage effect.
In Table 6 we summarize our results under the MV-Std DCC-GARCH model. First, in the AB-GB case we see that in the p-value results of all important variables that influence volatility estimation are statistically significant at the 1% significance level (they have p-value < 1%). Again, the GB index exhibit higher volatility clustering compared to the AB index. Moreover, the positive and statistical significant value of b AB,GB signifies a volatility spillover between the GB and the AB index. This is translated into a positive conditional correlation of a small value. In Figure 6, we see how the conditional standard deviation of the GB return series is larger compared to that of the AB ones. This comes in accordance with the univariate GARCH results. Also, in Figure 8, we see that the conditional correlation between these two return series is unstable. However, on average, the GB is positively correlated with the market benchmark AB index. This signifies that the use of environmentally friendly projects has sometimes created different patterns from conventional bonds. Secondly, in the AB-GBS case we see that the p-value results of all important variables that influence volatility estimation are statistically significant at the 1% significance level (they have p-value < 1%) with the exception of the α 1,GBS which is significant at 10% significance level (they have p-value < 10%). Here, the AB and the GBS indexes exhibit similar volatility clustering. The positive and statistical significant value of b AB,GBS also signifies a volatility spillover between the GBS and the AB indexes. In Figure 7 we see the conditional standard deviation of the GBS return series to be systematically larger compared to that of the AB ones. This comes in accordance with the univariate GARCH results.     Moreover, in Figure 8, we see the conditional correlation between these two return series. In contrary to the AB-GB case, the AB-GBS ones show on average negative conditional correlation which is less stable than the corresponding in the AB-GB case. Another important finding is that, for the period after the 2015, positive correlation values between the AB-GB indexes is associated with negative correlation values between the AB-GBS indexes and vice versa. This signifies an important sign of differentiation between the GB and GBS.
Using the above conditional estimates of standard deviations and covariance functions, we use these results to derive HRs. The HR between the AB and a green bond returns is given by (see Kroner and Sultan, 1993): with j = GB, GBS where σ AB, j,t to denote the covariance between the AB and the green bond return indexes at time t and σ 2 AB,t to denote the variance of the AB index at time t † . Generally, a positive HR shows the extent to which a long position in a green bond can be hedged by a short position in the conventional bond. On the other hand, a negative HR shows the extent to which a short position in the green bond can be hedged by a long position in the conventional bond.
In Figure 9 we show the time-varying HR for the AB-GB and the AB-GBS cases. First, we observe that on average the HR value of the AB-GB case is positive and the corresponding one for the AB-GBS case is negative. This indicates that, for the AB-GB case, a long position in the GB should be hedged by a short position in the AB asset which price is expected to drop. By having an average HR value of 0.0219, one can say that the optimal quantity of hedge for 1 million dollar of investment is 0.0219 million dollars. On the other hand, a negative HR value for the AB-GBS case indicates that, a short position in the GBS can be hedged by a long position in the AB which price is expected to rise. By having an average HR value of −0.0012, one can say that the optimal quantity of hedge for 1 million dollar of investment is 0.0012 million dollars. Looking at the dynamics of the HRs, we observe significant differences in hedging strategies where periods with a positive HR values between the AB-GB indexes is associated with negative HR values between the AB-GBS indexes and vice versa. This again signifies the existence of differentiation strategies one should adopt when he/she considers hedging. Also, the magnitude of the absolute HR value is larger in the AB-GB than in the AB-GBS case signifying that the level of hedging is stronger in the AB-GB than in the AB-GBS case.
To robustify the above estimation results, we present the AIC and the BIC scores when out total sample is divided into three sub-samples. Here, we denote as period I the one that than span between the 2nd of May 2011 and the 22th of August 2014, as period II the one that than span between the 25th of August 2014 and the 15th of January 2018, and as period III the one that than span between the 16th of January 2018 and the 21th of May 2021. For these periods and for the AB-GB and the AB-GBS cases, we have estimated the MV-Normal, the MV-Std and MV-Laplace models. In Table 7 we demonstrate two diagnostic testing measures such as the AIC and the BIC scores. Again, results show that the MV-Std is the best performer under the two multivariate data cases for all three periods.

Portfolio assessment using a portfolio rebalancing method
In this paragraph, we assess the portfolio weights adopted for both the AB-GB and the AB-GBS cases using portfolio rebalancing. Portfolio rebalancing is an important part of the portfolio management process. Usually, once a portfolio manager determines a target portfolio, maintaining this balance of assets is non-trivial. Therefore, due to changes in portfolio return and risk, a manager must rebalance actively the portfolio weights in an optimal way. Conventional approaches to portfolio rebalancing include periodic and tolerance band rebalancing (Masters, 2003). With periodic rebalancing, the portfolio manager adjusts to the target weights at a consistent time interval (e.g., monthly or quarterly). The time-varying weights are estimated using an objective function which is usually based on mean and variance estimates of the portfolio assets.
Here, we use the quadratic utility function, with it expected value being of the form: with µ p the portfolio mean, σ p the portfolio standard deviation and φ the risk aversion parameter ‡ . In practice, from the selected MV-Std DCC-GARCH model, we draw random samples for the mean return values from each series i based onμ i +σ t,i ν−2 ν · Tν(0, 1). Here, Tν(0, 1) denotes a Student-t random variable with mean 0, variance 1 andν degrees of freedom which are estimated. Theμ i andσ t,i components represent the corresponding mean and variance estimates of each series. After, we derive the portfolio mean value µ p = 2 i=1 w i µ i and the portfolio variance valueσ 2 p = 2 i=1 w iσ 2 i + 2w 1 w 2ĉt,1,2 . Thus, the w 1 and w 2 weight components are derived optimally under the w 1 + w 2 = 1 restriction. In Figure 10, we show the time-varying portfolio weights for both the AB-GB and AB-GBS cases. What is diagnosed is that the AB dominates both the GB and GBS indexes in most of the observed quarters. Also, one can observe that the GB weights are more volatile and of marginally higher value than the corresponding GBS weights. A recent work by Nguyen and Huynh (2019) has used an alternative approach to derive portfolio weights. Under this paper's approach, a risk measure based on conditional value-at-risk (CVaR) is used in order to achieved the maximal expected return for investments. Table 7. AIC and BIC scores for alternative Bivariate DCC(1, 1)-GARCH models on AB-GB and AB-GBS return series using three sub-periods (boxed numbers indicate the favored estimation case).   Finally in this section, we investigate the effect of the COVID-19 Pandemic on the Bivariate GARCH Modeling results. In practice we divide the sample period to two sub-periods: from the 2nd of May 2011 till the end of December 2019, and from the 2nd of January 2020 till the 21th of May 2021. The first period is called as the pre-COVID-19 period and the second as the COVID-19 period. Our intention is to assess the COVID-19 Pandemic effect on the volatility clustering, the conditional covariance-correlation and the HR. Thus, we investigate whether one can observe differences in risk analysis when comparing the pre-COVID-19 with the COVID-19 period.
In Table 8 we present the mean estimate results under the MV-Std DCC(1,1)-GARCH model for these two time-periods. Here, we need to point out that for all estimation results we have statistical significance with p-value < 1%. First, we diagnose that the volatility persistence of the COVID-19 period is lower than the corresponding in the pre-COVID-19 period. More specifically, under the AB-GB case, the volatility persistence for the AB index equals 0.995554 for the pre-COVID-19 period and 0.920267 for the COVID-19 period which is translated to the corresponding 155.5567 and 8.341981 days needed to reduce a volatility shock by 50%. For the GB index, we have volatility persistence that equals 0.996037 for the pre-COVID-19 period and 0.988603 for the COVID-19 period. Again, this is translated to 174.5579 and 60.47115 days needed to reduce a volatility shock by 50% for the pre-COVID-19 and COVID-19 periods respectively. Now, under the AB-GBS case, the volatility persistence for the AB index equals 0.996037 for the pre-COVID-19 period and 0.988603 for the COVID-19 period. This is translated to the 155.5567 and 8.341981 days needed to reduce a volatility shock by 50% for the pre-COVID-19 and COVID-19 periods respectively. For the GBS index, we have volatility persistence that equals 0.995444 for the pre-COVID-19 period and 0.988278 for the COVID-19 period. Again, this is translated to 151.7926 and 58.78491 days needed to reduce a volatility shock by 50% for the pre-COVID-19 and COVID-19 periods respectively.
Concerning the volatility spillover, expressed by the a and b parameters, we diagnose that this is higher for the COVID-19 period when compared with the pre-COVID-19 period. This is observed for both AB-GB and AB-GBS cases.. Finally, as regards the level of conditional correlation and that of HR, we see that for both periods we have mean values of the same sign with no significant differences. As it is more obvious in Figures 8 and 9, where we present the time-varying correlation and HR functions for the AB-GB and AB-GBS cases, the COVID-19 period is characterized by a period of negative and a period of positive correlation. The latter signifies non-constant interaction of volatility between the GBs and the AB within the COVID-19 period. A recent work in Huynh et al. (2021) has analyzed the effect of the COVID-19 period in the equity market using the predictive power of a newly constructed "feverish" index on stock returns and volatility.

Forecasting
In this section, we compare the forecasting performance of the three DCC-GARCH models. Based on a within-sample estimation where parameters are estimated separately for each competing model, we formulate out-of-sample forecasts for the conditional variance and covariance functions. The out-of-sample period is derived by extracting the last S = 1000 observations from a period of 2619 observations. Thus, the forecasting period is from the 23th of May 2017 to the 21th of May 2021. The purpose of this exercise is to investigate the effect of the use of alternative distributional assumptions in forecasting the covariance and VaR estimates. Also, by estimating the VaR its corresponding loss function, one can have a view of the tail of the distribution as regards both conventional and GBs.

Conditional covariance
For the conditional covariance evaluation, we use the index return products as a proxy. Here, we perform a one-step ahead conditional covariance function estimation. As evaluation measures, we use the Mean Absolute Forecast Error (MAFE) and the Mean Squared Forecast Error (MSFE). Compared to the MSFE, the MAFE measure puts less weight on possible outliers of our forecasts. The forecasting performance results are presented in Table 9. As the results indicate the DCC-GARCH under the multivariate Student-t errors outperforms the other two in the AB-GBS conditional covariance estimation, whereas, in the AB-GB conditional covariance estimation, the DCC-GARCH under the multivariate Normal errors is the best performer under MAFE measure and the corresponding under the Student-t errors under the MSFE measure.

Conditional value-at-risk
For the same out-of-sample period, we perform one-step-ahead volatility forecasts for the three competing models using the two bivariate settings. These forecasts are then used to formulate the one-step-ahead α-level conditional VaR estimate for each univariate series, such as: where D −1 is the inverse CDF for the distribution D at α nominal level. In the empirical analysis below, we utilize three different assumptions for the innovation distribution: the normal; the Student-t with ν degrees of freedom and the Laplace density. In the case of the Student-t model, the quantile measure is formulated as µ t+1 + D −1 (α)σ t+1 ν−2 ν for the ν degrees of freedom. We will evaluate the conditional VaR estimates using three measures: the Violation Rate (hereafter VRate), the quantile loss function and the two-hypothesis testing methods based on the unconditional and conditional tests (see Kupiec, 1995;Christoffersen, 1998). First, the VRate is a common nontest criterion to compare alternative VaR estimates defined as the proportion of days for which the actual return is more extreme than the forecasted VaR level, over the forecast period. Thus, one should investigate how these specifications can estimate VaR at a pre-specified nominal level given the alternative models. Here, we calculate the VRate which takes the following form for an S forecasting period S −1 S t=1 I(y t < VaR t,α ). This non-statistical validation test estimates the empirical nominal level a. This becomes the proportion which represents the fraction of days for which the actual return yields bigger losses than the estimated VaR over a given forecast evaluation period of S . When the estimated α equals to the given nominal level α this is highly desirable. However, thisα = α condition is only rarely met. In practice, whenα < α, the losses are underestimated (conservative), while on the other hand, whenα > α the losses are overestimated and this results in capital allocation with potential losses. In practice, underestimation is more desirable than overestimation unlessα/α equals one.
Then, we calculate a quantile criterion function based on the log-transformed loss function where e t = y t − VaR t,α . The above criterion function should be minimized for the forecasting period adopted. Finally, we use the unconditional coverage test by Kupiec (1995) and the conditional coverage test by Christoffersen (1998). The first, is a likelihood ratio test of the hypothesis that the nominal level equals the estimated VRate when conditional dependence is assumed. According to this, we test whether the VRateα obtained by the candidate model is significantly different from the proposed nominal level α. According to this, the likelihood function of the i.i.d. Bernoulli sequences is defined as: where S 0 and S 1 are the number of 0s and 1s in the sample. Therefore, the VRate is expressed aŝ α = S 1 /S . Under the H 0 hypothesisα = α and its likelihood function is defined as: L(α) = (1 − α) S 0 α S 1 . Then, we derive the likelihood ratio unconditional coverage test, such as: distributed as a χ 2 distribution with one degree of freedom. Under the second test: the conditional coverage test derived by Christoffersen (1998) this is based on a first-order Markov sequence with a transition probability matrix such as:Â Under this matrix, theα 01 corresponds to the probability given 0 today to have a 1 result the next day; i.e. Pr(I t + 1 = 1, I t = 0), whereas theα 11 corresponds to the probability given 1 today to have a 1 result the next day; i.e. Pr(I t + 1 = 1, I t = 1). Thus, the probability of a 0 event following a 0 event is 1 −α 01 , and the probability of a 0 event following a 1 event is 1 −α 11 . Then the likelihood function for the S forecast period is defined as: Now, under the null hypothesis of the test, the probability of a violation tomorrow does not depend on today being a violation or not having α 01 = α 11 = α. Now, the likelihood ratio conditional coverage test becomes as follows: LR cc = −2 ln[L(A)/L(Â)] ∼ χ 2 1 (24) In practice, in order for the VRate to be validated, we intend for the forecasting data to show acceptance of the null hypothesis with p-values at least equal to 5%. Table 10 reports the one-step-aheadα/α ratio for all data series under the bivariate DCC-GARCH models under MV-Normal, MV-Std and MV-Laplace specifications at the 1% and 5% significance level. Also, for the same models, we report a quantile criterion function based on the log-loss function log E[L α (e t )] appearing in (4). Finally, in the same table we report two formal statistical tests: the unconditional coverage test (denoted as LR uc ) and the conditional coverage test appearing in (5) and (6) respectively. Results at α = 1% show that under theα/α ratio criterion, the MV-Std model specification ranked first in three out of four data cases, whereas the MV-Normal and MV-Laplace specification show similar VRate results in one case. Concerning the unconditional and the conditional coverage tests the first seems very sensitive with nine rejections out of twenty four test cases. The conditional coverage test is never rejected. At α = 5% and under theα/α ratio criterion, we have the MV-Normal specification coming first in two out of four cases. Concerning the unconditional and the conditional coverage tests they both have no rejections. Under the loss function criterion, we observe that GBs systematically have larger losses than the convectional ones in all model specifications and under both nominal levels. Now, concerning the distribution specification, at α = 1% nominal level the MV-Std specification outperforms the others in all cases whereas for α = 5% the MV-Normal one dominates the rest as in the case of theα/α ratio criterion.
The conditional VaR forecasting results demonstrate a number of general results: 1. Among the DCC(1, 1)-GARCH models with different distribution assumptions, the one selected under the VRate criterion depends on the nominal level assumed. 2. At α = 1% nominal level the MV-Std model seems to outperform the MV-Normal and MV-Laplace whereas at α = 5% the MV-Normal model outperforms the MV-Std and the MV-Laplace models. 3. Model selection results under the loss function measure seem to coincide with those of the VRate one. 4. Under the loss function measure, GBs convey higher risk than CBs.

Conclusions
The need for financing the green economy has urged the financial market to issue green bond products. These have been in the interest of investors who mostly want to promote investments that are environmentally friendly. However, growing interest in these products has raised questions regarding whether GBs differ to CBs in terms of their risk and whether there are any systematic patterns between these two different products. To answer these questions we have collected daily closing prices of green and conventional bond indexes. For the purposes of analyzing their risk, we took the daily returns of these series. The preliminary and conditional variance and covariance statistical analysis implemented using univariate and bivariate GARCH models has revealed the following results: 1. Risk expressed by conditional volatility as well as VaR measures is higher for green bond than for conventional bond returns. 2. Volatility clustering exists within conventional bond and green bond returns with the GBs demonstrating a marginally higher level of clustering than the CBs. 3. There are cases where we observe volatility spillover between the green bond and the conventional bond returns. 4. There are significant differences in conditional correlations when comparing the conventional bonds with green bonds of either loose or stringent green-labeled criteria. 5. The adoption of loose or stringent eligibility green-labeled criteria can significantly affect hedging strategies. 6. The COVID-19 Pandemic period is associated with lower volatility persistence for both green and conventional bonds when compared to the pre-COVID-19 period. Also, we observe a higher spillover effect between conventional and GBs.
These results have several consequences for investors and decision makers. These are as follows: First, the estimation of time-varying volatility as well as the VaR measure can help investors to construct an optimal portfolio where assets of high risk like the GBs can be mixed with assets of lower risk such as the CBs. Second, the different signs of correlations and HRs observed in the AB-GB and AB-GBS cases indicate that, the way these indexes are generated has a significant effect in their real diversification.
Finally, the present approach can be easily applied in cases of other sub-indexes of green bonds, indexes of specific issuers, sectors or geographical areas, stocks, gold, crude oil, cryptocurrencies etc.