A COMPARISON OF THE EXTREME VALUE THEORY AND GARCH MODELS IN TERMS OF RISK

In this paper, we apply extreme value theory (EVT) and time series models to eight developed and emerging stock markets published in the Morgan Stanley Capital International (MSCI) Index. Based on the Human Development Index (HDI) rankings, which are consistent with the MSCI index, we analyse Singapore, Spain, UK and US for developed stock markets and Chile, Russia, Malaysia and Turkey for emerging stock markets. We use the daily prices (in USD) of eight countries for the period from January 2014 to December 2017 and examine the performances of the models based on in-sample testing. Calculating the value-at-risk (VaR) as a risk measure for both right and left tails of the log-returns of the selected models, we compare these countries in terms of their financial risks. The obtained risk measures enable us to discuss the grouping and the ranking of the stock markets and their relative positions.


Introduction
In this paper, we compare the financial risks of different countries according to a commonly used risk measure, Value-at-Risk (VaR). Market risk management enables us to evaluate potential risks and their possible future effects. Risk measures have been developed and have taken an increasingly important role in financial risk management and regulation since the financial disasters of the 1990s. In this context, forecasting the risk of stock markets requires comprehensive information, and the models which best reflect the characteristics of the portfolios should be determined. Time series models and extreme value theory (EVT) are two popular methods used for financial modelling.
The aim of examining financial time series is to investigate the behaviour of prices and to manage future risks according to the price movements. Since the future price is uncertain, it must be considered as a random variable with a probability distribution. Therefore, in order to investigate the prices, we use models describing how consecutive prices are determined statistically. By analysing stock returns, time series models are used for forecasting, pricing and financial risk management (Aas and Dimakos, 2004). Since generalized autoregressive conditional heteroskedasticity (GARCH) models capture price volatilities effectively, we prefer using GARCH models to represent stock returns.
On the other hand, EVT has become one of the most popular methods to forecast extreme financial risks since it is a well-developed approach within probability theory. In this approach, the asymptotic distribution of extreme events, which are rare but severe, is taken into consideration. Statistical methods obtained from EVT are very useful when they are applied to finance, particularly within the frame of risk measurement (Rocco, 2014).
Gençay et al. (2003) divide risk models into two groups according to the volatility of quantile forecasts. The first group consists of GARCH(1,1) and GARCH(1,1)-t whereas the second group includes historical simulation, the variance-covariance approach, and adaptive and non-adaptive generalized Pareto distribution (GPD) models. When the models are compared in terms of the VaR, the GPD model appears as a robust quantile forecasting tool for the Turkish Stock Exchange index. Ekşi et al. (2006) compare the EVT, which is used to generate VaR estimates and provide the tail forecasts of daily returns, with other risk computation methods for the Turkish stock market. The relative performance of the expected shortfall is measured with respect to other risk measures in their study. Gençay and Selçuk (2004) investigate the relative performance of VaR models with the daily stock market returns of nine different emerging markets including Turkey.
Lawler (2003) provides a time series analysis of the Shanghai and New York Stock Exchange composite price indices to compare their weekly rates of return and volatility. The co-movement of these two markets in 1992-2002 is also analysed in this study. Gilli and Këllezi (2000) present a practical application in which 31 years of daily returns on an index representing the Swiss market are analysed. Point and interval estimates of the tail risk measures are computed by modelling the loss tail.
We analyse the stock indices of different developed (Singapore, Spain, UK and US) and emerging (Chile, Russia, Malaysia and Turkey) markets using both the EVT and GARCH models. These models are compared by in-sample testing and the best model is chosen for each country. Finally, we compare these countries in terms of their financial risks by means of VaR and discuss the groupings and the relative positions of the countries based on the Human Development Index (HDI).
The paper is organised as follows. Section 2 introduces the data. Section 3 presents a brief background of the models fitted to the financial data. We discuss the results of the application in Section 4. The risk measures are obtained and analysed by comparing the countries in Section 5. Finally, Section 6 concludes.

Data
Emerging and developed financial markets have different dynamics. Emerging markets have experienced more severe financial disasters than developed economies have. The structure of prices must hence be investigated carefully, especially when emerging and developed countries are compared. A relative comparison is also beneficial for investors in their decisions (Gençay and Selçuk, 2004).
We use the daily prices published in the Morgan Stanley Capital International (MSCI) index (MSCI, 2018). According to this index, countries are classified into two main groups, "developed" and "emerging". Within each group, countries are also categorized according to their regional characteristics. We choose developed economies and emerging markets which are comparable with each other according to the HDI rankings. United Nations Develop[...]
We apply the Augmented Dickey-Fuller (ADF) unit root test to check the stationarity of the series. The results in Table 1 and [...]
According to the common tendency in both the time series and the EVT literature, the rate of return is measured by the change in the natural logarithm of the price index in a given period (Lawler, 2003). The log-return is obtained as
\[ r_t = \log\left(\frac{P_t}{P_{t-1}}\right) = p_t - p_{t-1}, \]
where $P_t$ is the price level and $p_t = \log(P_t)$ is the natural logarithm of the price level (Camilleri, 2006). Using the logarithmic transformation of prices has important advantages: (i) a non-linear relationship can be transformed into a linear one, (ii) the use of linear regressions with logarithmic series provides an immediate interpretation of the estimated coefficients, such as elasticities, and (iii) the series are usually compressed and a constant variance is obtained for the transformed series.
In addition to the statistical tests, graphical representations show that the log-returns are more appropriate for time series analysis due to the stationarity property. EVT is also applicable to the log-returns, as the volatility can be seen in the graphs. The graphs also indicate that emerging countries experience more jumps in the returns compared with the developed countries, which might cause higher risk measures. Figures 1 and 2 show the daily prices and log-returns of the selected countries. Since the daily prices are not stationary, we use the log-returns in the subsequent analysis.
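As a minimal sketch of the log-return transformation, assuming the definition $r_t = \log(P_t) - \log(P_{t-1})$ used in this section, the computation can be written in plain Python. The function name `log_returns` and the price values are hypothetical and serve only as an illustration; they are not MSCI data.

```python
import math

def log_returns(prices):
    """Return r_t = log(P_t) - log(P_{t-1}) for consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    logs = [math.log(p) for p in prices]
    # Pair each log-price with its successor and take the difference.
    return [later - earlier for earlier, later in zip(logs, logs[1:])]

# Hypothetical price path, for illustration only (not MSCI data).
prices = [100.0, 101.5, 99.8, 100.4, 102.0]
returns = log_returns(prices)

# Log-returns are additive over time: their sum is log(P_n / P_1).
total = sum(returns)
```

A series of n prices yields n - 1 log-returns, and their additivity is one practical reason for preferring them to simple percentage changes.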

Models for Index Returns
The risk measurement in this paper is based on tail estimation and the variability of the index returns. We thus choose two modelling approaches, EVT and GARCH models, which are commonly used in financial studies.
EVT is used to model high-severity, low-frequency events, whereas GARCH accounts for the heteroskedasticity which appears when high and low volatility occur in different periods.

Extreme Value Theory
EVT is very effective for modelling financial data in order to examine the limit behaviour of fat-tailed distributions (Gençay and Selçuk, 2004; McNeil, 1997). Since one of the ways in which we assess risk is the analysis of tail risk, we handle only the randomness in the tail using limit laws rather than fitting a single distribution to the whole sample.
There are two principal types of methods to determine extreme events (Gençay and Selçuk, 2004; Gilli and Këllezi, 2000):
(i) Block maxima (BM) method: In order to define the asymptotic distribution of extreme events, the Fisher-Tippett Theorem suggests that the limit distribution of the maximum values of a random variable in specified periods belongs to an extreme value distribution family.
(ii) Peaks over threshold (POT) method: According to the Pickands-Balkema-de Haan Theorem, the conditional distribution of the values above a given high threshold is approximately GPD.
BM is a traditional method which is used to analyse data with seasonality. On the other hand, POT is preferred in many financial applications because this method uses data more efficiently (Gilli and Këllezi, 2000). In this paper, we analyse the behaviour of large observations which exceed a high threshold instead of the maximum values of observations. We thus use the POT approach to model the log-returns.
Under the POT approach, the excess distribution function $F_u(y)$ converges to the GPD for a high value of the threshold $u$ as follows:
\[ G_{\xi,\sigma}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi}, & \xi \neq 0, \\ 1 - \exp\left(-\dfrac{y}{\sigma}\right), & \xi = 0, \end{cases} \]
where $\sigma$ is the scale parameter and $\xi$ is the shape parameter of the distribution.
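The POT idea can be illustrated with a short sketch: collect the excesses over a threshold $u$ and evaluate the GPD distribution function with scale $\sigma$ and shape $\xi$. The function names `gpd_cdf` and `excesses_over`, the sample values and the threshold are hypothetical choices for illustration; parameter estimation (for example by maximum likelihood) is deliberately left out.

```python
import math

def gpd_cdf(y, xi, sigma):
    """GPD distribution function for an excess y over the threshold."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:
        # Limiting case xi = 0: exponential distribution.
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0.0:
        # Beyond the finite upper endpoint, which exists only when xi < 0.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def excesses_over(data, u):
    """Return the excesses x - u for all observations x above u."""
    return [x - u for x in data if x > u]

# Hypothetical positive log-returns and threshold (illustration only).
sample = [0.004, 0.011, 0.025, 0.031, 0.018, 0.047]
u = 0.02
excesses = excesses_over(sample, u)

# Probability that an excess is at most 0.01 under assumed parameters.
p = gpd_cdf(0.01, xi=0.2, sigma=0.01)
```

A positive shape parameter corresponds to a heavy tail, which is the case usually of interest for financial returns.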

Time Series Models
In the statistical analysis of time series, autoregressive-moving-average (ARMA) models provide a parsimonious description of a (weakly) stationary stochastic process in terms of two polynomials, one for the autoregression and the second for the moving average. The ARMA model is described as a tool for understanding and predicting future values in time series. The model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. The AR part involves regressing the variable on its own lagged values.
The MA part involves modelling the error term as a linear combination of error terms occurring contemporaneously and at various times in the past.
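The ARMA(p, q) model (where p is the order of the AR terms $\varphi$ and q is the order of the MA terms $\psi$) is given as
\[ X_t = \sum_{i=1}^{p} \varphi_i X_{t-i} + w_t + \sum_{j=1}^{q} \psi_j w_{t-j}, \]
where $w_t$ is the error term and the unknown parameters are collected in the vector $\theta$ (Hamilton, 1994).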
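To make the ARMA recursion concrete, the following sketch simulates a path, assuming the common zero-mean form $X_t = \sum_i \varphi_i X_{t-i} + w_t + \sum_j \psi_j w_{t-j}$ with Gaussian white noise and pre-sample values set to zero. The function name `simulate_arma` and the coefficients are hypothetical and are not estimates from the data.

```python
import random

def simulate_arma(phi, psi, n, sigma=1.0, seed=0):
    """Simulate n values of a zero-mean ARMA(p, q) process.

    phi: AR coefficients (phi_1, ..., phi_p)
    psi: MA coefficients (psi_1, ..., psi_q)
    Pre-sample values of X and w are taken to be zero.
    """
    rng = random.Random(seed)
    x, w = [], []
    for t in range(n):
        w_t = rng.gauss(0.0, sigma)
        # AR part: weighted sum of the series' own lagged values.
        ar = sum(phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        # MA part: weighted sum of lagged error terms.
        ma = sum(psi[j] * w[t - 1 - j] for j in range(len(psi)) if t - 1 - j >= 0)
        x.append(ar + w_t + ma)
        w.append(w_t)
    return x

# Hypothetical ARMA(1,1) coefficients, for illustration only.
path = simulate_arma(phi=[0.5], psi=[0.3], n=200, seed=42)
```

With empty coefficient lists the function returns plain white noise, which is a convenient sanity check.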
The autoregressive conditional heteroskedasticity (ARCH) model is a statistical model for time series data that describes the variance of the current error term or innovation as a function of the actual sizes of the previous time periods' error terms. The ARCH model is appropriate when the error variance in a time series follows an AR model; if an ARMA model is assumed for the error variance, the model is a GARCH model. For forecasting, combining autoregressive integrated moving average (ARIMA) and ARCH models could also be considered.
The GARCH(p, q) model (where p is the order of the ARCH terms $w^2$ and q is the order of the GARCH terms $\sigma^2$) is given by
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i w_{t-i}^2 + \sum_{i=1}^{q} \beta_i \sigma_{t-i}^2, \]
where $\alpha_0 > 0$, $\alpha_i \geq 0$ and $\beta_i \geq 0$ for all $i$, and
\[ \sum_{i=1}^{\max(p,q)} (\alpha_i + \beta_i) < 1, \]
with $\alpha_i = 0$ for $i > p$ and $\beta_i = 0$ for $i > q$. Here $w_t$ is a white noise (WN) with time-varying conditional variance (conditional heteroskedasticity) (Hamilton, 1994). Specifically, we write
\[ w_t = \sigma_t \varepsilon_t, \]
where $\varepsilon_t$ is a strong WN with zero mean and unit variance, i.e., $\varepsilon_t \sim \text{iid } N(0, 1)$.
The conditional variance of the time series $X_t$ is calculated as
\[ \operatorname{Var}(X_t \mid X_{t-1}, X_{t-2}, \ldots) = E(w_t^2 \mid w_{t-1}, w_{t-2}, \ldots) = \sigma_t^2. \]
After selecting the model, the residuals of the fitted model should be tested in several ways. According to the assumption of independent residuals, once the model is developed and the residuals are computed, there should be no remaining autocorrelations or partial autocorrelations at any lag in the autocorrelation functions (ACFs) and partial autocorrelation functions (PACFs). In addition to the graphical interpretation, the Ljung-Box test is a quantitative way to test for autocorrelation at multiple lags simultaneously by testing the hypotheses below:
$H_0: \rho_1 = \rho_2 = \cdots = \rho_m = 0$, i.e. the residuals are independent,
$H_1: \rho_j \neq 0, \ \exists j \in \{1, 2, \ldots, m\}$, i.e. the residuals are not independent.
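The two ingredients of this subsection, the GARCH variance recursion and the Ljung-Box statistic, can be sketched for the simplest case. The code below assumes a GARCH(1,1) model with Gaussian innovations, started at its unconditional variance, and computes the usual statistic $Q = n(n+2)\sum_{k=1}^{m} \hat\rho_k^2/(n-k)$. The function names and parameter values are hypothetical and are not the fitted values reported in the tables.

```python
import math
import random

def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate w_t = sigma_t * eps_t with a GARCH(1,1) variance."""
    if not (alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and alpha1 + beta1 < 1):
        raise ValueError("parameters violate the GARCH(1,1) constraints")
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance.
    var = alpha0 / (1.0 - alpha1 - beta1)
    w, sig2 = [], []
    for _ in range(n):
        w_t = math.sqrt(var) * rng.gauss(0.0, 1.0)
        w.append(w_t)
        sig2.append(var)
        # sigma_{t+1}^2 = alpha0 + alpha1 * w_t^2 + beta1 * sigma_t^2
        var = alpha0 + alpha1 * w_t ** 2 + beta1 * var
    return w, sig2

def ljung_box_q(x, m):
    """Ljung-Box Q statistic using sample autocorrelations at lags 1..m."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    total = 0.0
    for k in range(1, m + 1):
        rho_k = sum(d[t] * d[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# Hypothetical parameters, for illustration only.
w, sig2 = simulate_garch11(alpha0=1e-5, alpha1=0.10, beta1=0.85, n=500, seed=1)
q_stat = ljung_box_q(w, m=10)
```

Under the null hypothesis, Q is compared with a chi-square critical value with m degrees of freedom (reduced by the number of fitted ARMA parameters when the test is applied to residuals); for m = 10 at the 5% level this value is approximately 18.31.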

Application
We analyse the data by separating the positive and negative returns in order to apply the EVT to both the upper and lower tails. Similarly, the time series analyses are applied to both positive and negative returns. Although separating positive and negative returns might distort the dependency structure of the data, which is important for regression and time series analysis, it is a common practice in the financial and implied volatility literature (Davidson et al., 2001; Simon, 2003; Ahoniemi, 2008; Ma, 2013; Bugge et al., 2016; Rizvi et al., 2017). Negative and positive returns affect the volatility differently, and using absolute or squared returns does not address this fact. To capture this effect, volatility studies treat the positive and negative returns as two data series.
Dividing the stock return data into gains and losses might be useful for considering an investor's position as short or long. In finance, holding a short position in a stock means that an increase in the stock price causes a loss. Hence, positive log-returns represent the risk for a short position. On the other hand, negative log-returns represent the risk for a long position in a stock, since a price decline affects the long position negatively.

Selection of the Optimal EVT Model
The choice of the optimal threshold u is a crucial step for parameter estimation of the GPD. We first obtain the mean excess plots (MEP) to determine the threshold intervals for each country. In order to choose the best threshold carefully, we select four candidate intervals from the MEP. After that, we choose two potential optimal thresholds for each interval that yield stable scale and shape parameter graphs simultaneously. The potential thresholds are determined graphically according to threshold stability plots, and finally the efficiency of the relevant GPD models is evaluated by goodness-of-fit tests.
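The quantity drawn in a mean excess plot is the empirical mean excess function, the average of $x - u$ over all observations $x > u$. For a GPD with $\xi < 1$ the theoretical mean excess is linear in the threshold, $e(u) = (\sigma + \xi u)/(1 - \xi)$, which is why approximately linear regions of the plot suggest candidate thresholds. A minimal sketch follows; the function names and the data are hypothetical.

```python
def mean_excess(data, u):
    """Empirical mean excess over the threshold u."""
    exc = [x - u for x in data if x > u]
    if not exc:
        raise ValueError("no observations exceed the threshold")
    return sum(exc) / len(exc)

def mean_excess_curve(data, thresholds):
    """Return (u, e(u)) pairs for thresholds that have exceedances."""
    top = max(data)
    return [(u, mean_excess(data, u)) for u in thresholds if u < top]

# Hypothetical positive log-returns, for illustration only.
sample = [0.004, 0.011, 0.025, 0.031, 0.018, 0.047, 0.009, 0.022]
curve = mean_excess_curve(sample, [0.005, 0.010, 0.020, 0.030])
```

Plotting the resulting pairs against the threshold gives the MEP; the visual choice of stable intervals described above is a judgment step that this sketch does not automate.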

Estimation of the Upper Tail
In this subsection, we obtain the optimal thresholds for the positive log-returns of each country. After choosing the best threshold for the right tail, we estimate the parameters of the GPD that fits the log-returns exceeding this threshold. Having obtained four threshold intervals for each country, we use threshold stability plots to find two potential values of u for each interval, which are candidates for the optimal threshold, so that eight optimal threshold candidates are found for each country. The test results of significance for the optimal thresholds are displayed in the following table for developed and emerging countries.
The bootstrap goodness-of-fit test is used to test the null hypothesis "$H_0$: A random sample has a GPD with unknown shape parameter $\xi$, which is a real number" (Villasenor-Alva and Gonzalez-Estrada, 2009). After the optimal thresholds are determined visually, a goodness-of-fit test is needed for an efficient decision. Since the chosen threshold gives the estimates of the GPD parameters, we simulate values according to the number of exceedances with respect to the potential threshold and compare them with the real values. The optimal threshold for each country is chosen by the root mean square error (RMSE), the mean absolute percentage error (MAPE), the log-likelihood and the Bayesian Information Criterion (BIC) given in Table 5. Although the RMSE and MAPE are important performance criteria, the BIC has been considered as the primary criterion to decide the optimal threshold since the EVT models will be compared with the time series models.
E. Nevruz y Ş. Şahin
For the lower tail, since the chosen threshold again gives the estimates of the GPD parameters, we simulate values according to the number of exceedances with respect to the potential threshold and compare them with the real values. The optimal threshold for each country is examined by the RMSE, the MAPE, the log-likelihood and the BIC. As in the upper tail results, the BIC is considered as the key criterion to compare the EVT models with the time series models in the next section.
In order to forecast the future returns, we choose the optimal parameters given in Table 9 for the EVT method based on the best-fitting models.
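The selection criteria named above can be sketched as follows. The source does not spell out the formulas it uses, so the conventional definitions are assumed here: RMSE as the root of the mean squared difference, MAPE as the mean absolute relative difference in percent, and BIC $= k \ln n - 2 \ln L$ for k parameters, n observations and maximised log-likelihood $\ln L$. All function names and numbers are hypothetical.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def mape(observed, simulated):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(observed, simulated)) / n

def bic(loglik, k, n):
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return k * math.log(n) - 2.0 * loglik

# Hypothetical sorted exceedances and simulated counterparts.
observed = [0.021, 0.024, 0.030, 0.041]
simulated = [0.022, 0.025, 0.028, 0.045]
scores = (rmse(observed, simulated), mape(observed, simulated))

# Hypothetical comparison of two candidate thresholds by BIC.
bic_a = bic(loglik=310.2, k=2, n=80)
bic_b = bic(loglik=295.7, k=2, n=76)
```

Note that BIC values computed on different numbers of exceedances are not strictly comparable, so a comparison across thresholds such as the one above should be read with that caveat in mind.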

Selection of the Optimal Time Series Model
We use both graphical tools and statistical tests in order to choose the best time series models for positive and negative log-returns of each country.

The parameters of the optimal time series models for positive and negative returns of all countries are presented in Tables 11 and 12, respectively. These parameters are used to simulate log-return data, which are compared with the observed values in order to analyse the goodness of fit of the proposed models.

Model Estimation of the Positive and Negative Returns
First, we analyse the ACF and PACF graphs of the log-returns to see whether they can be modelled by ARMA(p, q). We also check the goodness of fit of the models using statistics such as the BIC, the adjusted $R^2$ and the Ljung-Box test results. After trying several ARMA, ARIMA, GARCH and ARMA-GARCH models for the positive returns of each country, we see that the GARCH or ARMA-GARCH models fit the data best for all countries.
Table 10 presents the best models and the BIC values for the positive and negative log-returns for developed and emerging countries. GARCH and ARMA-GARCH models appear to be the best models for the returns, but the orders of the models change from country to country. Although the models presented in the table have been chosen as the best, all adjusted $R^2$ values are very low and close to zero, which might indicate that increasing the number of parameters provides very little information. The results also show that higher-order GARCH models are required for the negative log-returns of most countries compared with the models proposed for the positive log-returns. When we compare the BIC values of the best time series models with those of the EVT models, we see that the time series models display better fits, since they produce lower BIC values. Thus, based on these results, we might conclude that GARCH and ARMA-GARCH models with the specified parameters represent the log-return data better than the EVT models.

Table 12. Parameters for the optimal time series models for negative returns for developed and emerging countries

Risk Measures
VaR is one of the most commonly used risk measures; it reflects the maximum loss within a given confidence level (Jorion, 2001). Despite its simplicity, VaR is not a coherent risk measure because it does not satisfy the subadditivity property except under the normal distribution assumption.
VaR can be expressed as
\[ \mathrm{VaR}_q(X) = F^{-1}(q), \]
where $F^{-1}$ is the quantile function of the random variable $X$, defined as the inverse of the distribution function $F$, and $q$ is the confidence level. Although it would be better to present both VaR and expected shortfall in order to compare the models using different risk measures, we only calculate VaR, which provides a simple benchmark comparison. We will consider other risk measures in future studies.
The methods used for computing VaR can be grouped into parametric and non-parametric approaches. In this paper, we estimate VaR values for EVT and GARCH models, both of which are parametric approaches. In addition to the above definition, which is used to calculate VaR for the selected EVT models, we use another calculation of VaR suggested by Gençay et al. (2003). This definition, which is proposed as a variance-covariance approach, is useful for our study due to its applicability to GARCH models.
Let $r_t$, $t = 1, 2, \ldots, n-1$, be the log-return of the prices, which follows a martingale process $r_t = \mu_t + \varepsilon_t$, where $\varepsilon_t$ has a distribution function $F$ with zero mean and variance $\sigma_t^2$. The VaR in this case can be calculated as
\[ \mathrm{VaR}_t(\alpha) = \hat{\mu}_t + F^{-1}(\alpha)\,\hat{\sigma}_t, \]
where $F^{-1}(\alpha)$ is the q-th quantile ($q = 1 - \alpha$) value of the unknown distribution function $F$. Estimates of $\mu_t$ and $\sigma_t^2$ can be obtained from the sample mean and the sample variance. Instead of the sample variance, the standard deviation in this equation can also be estimated by a statistical model (Gençay et al., 2003).
According to the optimal EVT and GARCH models, we estimate the VaR of developed and emerging countries for positive and negative log-returns, respectively. Figures 3 and 4 display the calculated VaR values of the chosen methods for each country at the 95% and 99.5% confidence levels.
Figure 3. Risk measure results for positive log-returns for developed and emerging countries
Figure 3 shows that the risk measures obtained from the EVT models are generally higher than those obtained from the time series models for positive returns, which indicates that the EVT models provide more conservative results. The rankings of the countries according to the VaR are consistent within the same model (EVT or time series) across the two confidence levels (95% and 99.5%), except for a few discrepancies. For the EVT models, the VaR analysis of positive returns in Figure 3 shows that Russia and Turkey have the highest risk measures and the US has the smallest. However, the VaRs obtained from the time series models indicate that Malaysia and the US have the highest risk measures while Singapore has the smallest. Different models might therefore lead to different country rankings in terms of risk measures. Moreover, the graphs do not suggest a clear boundary between the developed and emerging markets, which makes it difficult to analyse the relative positions of the countries.
Figure 4 shows that the absolute values of the VaR results for negative returns are higher than those for positive returns for both the EVT and time series models. The rankings of the countries based on the calculated risk measures differ between the two sets of models. The time series models give more consistent country rankings across the two confidence levels, while the EVT models show some discrepancies. Based on the negative returns, the EVT models show that Russia and Turkey have the highest risk while Malaysia and Singapore have the lowest. The time series models give almost identical risk measures for the two confidence levels, differing only in the fifth or sixth decimal place, and also indicate that Russia and Turkey are the riskiest countries while the US and Malaysia are the least risky. For neither set of models do the risk measures indicate a separation between the developed and emerging countries.
The results obtained for different confidence levels might indicate that the structures of the tail distributions of these countries are similar, given the increments in the risk measures between the confidence levels.
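To illustrate how VaR moves between the 95% and 99.5% confidence levels under the two approaches, the sketch below uses two standard formulas that the source does not state explicitly and that are therefore assumptions here: the usual POT quantile estimator $u + \frac{\sigma}{\xi}\left[\left(\frac{n}{N_u}(1-q)\right)^{-\xi} - 1\right]$, with $N_u$ exceedances out of $n$ observations, and a variance-covariance VaR in which the unknown $F$ is replaced by the standard normal distribution. All names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def gpd_var(u, xi, sigma, n, n_u, q):
    """POT-based VaR at confidence level q (standard estimator, assumed)."""
    if not 0.0 < q < 1.0 or sigma <= 0 or not 0 < n_u <= n:
        raise ValueError("invalid inputs")
    ratio = (n / n_u) * (1.0 - q)
    if abs(xi) < 1e-12:
        # Limiting case xi = 0 (exponential tail).
        return u - sigma * math.log(ratio)
    return u + (sigma / xi) * (ratio ** (-xi) - 1.0)

def var_cov_var(mu, sigma, q):
    """Variance-covariance VaR, assuming a standard normal F."""
    return mu + sigma * NormalDist().inv_cdf(q)

# Hypothetical fitted values, for illustration only.
evt_95 = gpd_var(u=0.02, xi=0.15, sigma=0.008, n=1000, n_u=80, q=0.95)
evt_995 = gpd_var(u=0.02, xi=0.15, sigma=0.008, n=1000, n_u=80, q=0.995)
ts_95 = var_cov_var(mu=0.0005, sigma=0.012, q=0.95)
ts_995 = var_cov_var(mu=0.0005, sigma=0.012, q=0.995)
```

In a GARCH setting, the `sigma` passed to `var_cov_var` would be the model's conditional standard deviation rather than the sample standard deviation. With a positive shape parameter, the EVT quantile grows faster in the far tail than the normal quantile, which is consistent with the more conservative EVT results discussed in this section.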
When we compare the rankings of these countries based on the HDI and the riskiness of their stock markets, we see that there is a correlation between the two. A higher HDI indicates lower riskiness for emerging markets, and the countries are ordered from the highest to the lowest HDI as Chile, Russia, Malaysia and Turkey. However, the ranking is not as strict for the developed countries, since Spain has the second lowest VaR, just after Singapore, while it is fourth in terms of the HDI. Table 13 shows the rankings of the countries for both the HDI and the VaR obtained from the analysis of the returns. The rankings presented in Table 13 are consistent with the graphs of the prices and the log-returns displayed in Figures 1 and 2. Since the only difference in ranking exists for the US and Spain, the log-return graph of the US confirms the relative riskiness of its stock market due to the higher volatility.

Conclusions
We have compared developed and emerging countries based on the HDI and the VaR results obtained from EVT, GARCH and ARMA-GARCH models. The results show that, based on the BIC values, the time series models fit the log-return data better for all the countries chosen (Singapore, Spain, UK, US, Chile, Russia, Malaysia and Turkey). The risk measures obtained from the EVT models are more conservative than those obtained from the time series models. They are also sensitive to the confidence level, since the rankings of the countries based on the calculated risk measures change. However, the time series models provide more consistent results for both negative and positive log-returns. Given the increments of the risk measures between the 95% and 99.5% confidence levels, the tail structures appear similar for all countries. The risk measures obtained from different models at different confidence levels might indicate different rankings, which leads us to consider more sophisticated ranking and ordering approaches. Although the HDI rankings and the VaR rankings seem consistent for most of the countries, the main components of the HDI, which are the life expectancy index, the education index and gross national income per capita, might have different effects on the rankings with different weights. We intend to consider stochastic ordering approaches to compare the countries based on different risk measures and the HDI in further studies.

Figure 2 .
Figure 1. The prices and log-returns of the developed countries

Figure 4. Risk measure results for negative log-returns for developed and emerging countries

Table 1. The ADF test results for the prices

Table 2. The ADF test results for the log-returns

Table 3. The threshold intervals obtained by MEP of upper tail of log-returns for developed countries

Table 3 and Table 4 present the candidate intervals which contain the optimal threshold for each country. These intervals are determined by MEP based on the regions where the graph is stable.

Table 4. The threshold intervals obtained by MEP of upper tail of log-returns for emerging countries

The analyses in Section 4.1.1 are repeated for negative log-returns in this subsection. Table 6 and Table 7 present the candidate intervals which contain the optimal threshold for each country. As with the upper tail results, the potential optimal thresholds and significance test results are displayed in Table 8 for developed and emerging countries.

Table 5. Test of significance results for the best threshold of upper tail of log-returns for developed and emerging countries

Table 7. The threshold intervals obtained by MEP of lower tail of log-returns for emerging countries

Table 8. Test of significance results for the best threshold of lower tail of log-returns for developed and emerging countries

Table 9. The optimal GPD parameters for upper and lower tail of log-returns for developed and emerging countries

Table 10. The time series model selection results for the positive and negative log-returns for developed and emerging countries

Table 11. Parameters for the optimal time series models for positive returns for developed and emerging countries

Table 13. Rankings of the countries in terms of HDI and VaR