Not all that glitters is RMT in the forecasting of risk of portfolios in the Brazilian stock market

https://doi.org/10.1016/j.physa.2014.05.006

Highlights

  • Use of Random Matrix Theory and the Single Index Model in the building of portfolios.

  • We compare combinations of techniques in the cleaning of the correlation matrix.

  • Cleaning the correlation matrix is not always advisable for times of high volatility.

Abstract

Using stocks of the Brazilian stock exchange (BM&F-Bovespa), we build portfolios based on Markowitz’s theory and compare the predicted and realized risks. This is done using the correlation matrices between stocks, and also using Random Matrix Theory to remove noise from those correlation matrices. We also calculate correlation matrices using a regression model that removes the effect of common market movements, as well as their versions cleaned with Random Matrix Theory. This is done for years of both low and high volatility of the Brazilian stock market, from 2004 to 2012. The results show that the use of regression to subtract the market effect from returns greatly increases the accuracy of the risk prediction, and that, although cleaning the correlation matrix often leads to portfolios that better predict risk, in periods of high market volatility this procedure may fail to do so. The results may be used in the assessment of the true risks when one builds a portfolio of stocks during periods of crisis.

Introduction

Modern portfolio theory is largely based on Markowitz’s ideas, in which portfolios of various equities are built on the principle of minimizing risk for a given expected return, yielding an efficient frontier of portfolio risk and return. Risk is assessed through the volatility of each stock that makes up the portfolio, as well as the covariances between stocks. The covariance matrix is used to predict the risk of a portfolio, and this predicted risk usually differs from the realized risk of the same portfolio, since the matrix is built from past stock returns.
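
To make the distinction between predicted and realized risk concrete, the following sketch (not the authors’ code; the use of numpy, the global minimum-variance weights, and the simulated returns are illustrative assumptions) builds weights from the covariance matrix of one period and evaluates the risk of those same weights on a later period.

```python
import numpy as np

def min_variance_weights(cov):
    """Global minimum-variance weights: w proportional to inv(Sigma) @ 1, normalized to sum to 1."""
    w = np.linalg.inv(cov) @ np.ones(cov.shape[0])
    return w / w.sum()

def portfolio_risk(w, cov):
    """Portfolio volatility sqrt(w' Sigma w)."""
    return np.sqrt(w @ cov @ w)

# Illustrative data: returns of 60 stocks over two consecutive ~250-day years.
rng = np.random.default_rng(0)
past_returns = rng.normal(size=(250, 60))
future_returns = rng.normal(size=(250, 60))

cov_past = np.cov(past_returns, rowvar=False)      # built from past data
cov_future = np.cov(future_returns, rowvar=False)  # what actually happened

w = min_variance_weights(cov_past)
predicted_risk = portfolio_risk(w, cov_past)   # risk forecast from past data
realized_risk = portfolio_risk(w, cov_future)  # risk incurred out of sample
print(predicted_risk, realized_risk)
```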

Three problems arise from this approach. The first is that past data reflect the market as it was, not as it will be. The theory thus assumes that future events will mimic past events, which is usually not true, since past data incorporate neither news releases nor the current mood of the market. There is not much that can be done about this, except that, to minimize the effect of events that might change the behavior of a market, one should not use data from too far in the past.

This leads us to the second problem: the deviations associated with the finite-sample effect, which arise purely from the fact that the available data are finite. Since one cannot go back in time indefinitely (and even if one could, it would not be advisable, given the discussion in the preceding paragraph), there is only a limited amount of data (in our case, price quotations) from which to build a covariance matrix. The problem becomes even more severe when one considers that an efficient portfolio should be built from many and diverse equities while keeping the historical data fairly recent, since this lowers the ratio between the number of days in the historical data and the number of stocks in the portfolio and thus amplifies the finite-sample effects.

A third problem is the statistical noise that emerges from the complex interactions between the many elements of a stock market: news, foreign markets, crises, and the prices of the stocks themselves all interact to drive the price of a stock. Those interactions are usually too complex to be captured by any econometric model.

All of these effects are incorporated into the covariance matrix used to forecast the risk of a particular portfolio, and if some of them can be removed from the matrix, better risk predictions become possible. Several authors have studied the influence of noise and other factors on the covariance matrix used in building portfolios [1], [2], [3], [4], [5], [6]. Most approaches to these problems involve reducing the dimensionality of the covariance matrix by introducing some structure into it, obtained through principal component analysis, the separation of stocks into economic sectors, or other means [7], [8].

A technique that has been applied to a number of complex systems, and particularly to financial markets, is Random Matrix Theory [9]. Among its many results is the building of portfolios whose predicted risk, based on past data, most closely resembles the realized risk of the future market [10], [11], [12]; the approach has been successfully applied to stocks [13], [14] and to hedge funds [15].

Random Matrix Theory had its origins in 1953, in the work of the Hungarian physicist Eugene Wigner [16], [17]. He was studying the energy levels of complex atomic nuclei, such as uranium, and had no means of calculating the distances between those levels. He then assumed that those distances between energy levels should be similar to the ones obtained from a random matrix expressing the connections between the many energy levels. Surprisingly, he was then able to make sensible predictions about how the energy levels related to one another by removing the results due to a random matrix. The theory was further developed later, yielding many surprising results. Of particular importance for our study are the results obtained by Marčenko and Pastur [18] on Random Matrix Theory applied to correlation matrices, better described in the section on methodology.
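
In short, the Marčenko–Pastur result gives explicit bounds for the eigenvalues of the correlation matrix of purely random, uncorrelated time series; eigenvalues inside these bounds are compatible with pure noise. A minimal sketch of those bounds (assuming unit-variance series and the standard formula, with N stocks and T observations; not a transcription of the authors’ code) is:

```python
import numpy as np

def marchenko_pastur_bounds(n_stocks, n_days, sigma2=1.0):
    """Eigenvalue bounds lambda_± = sigma^2 * (1 ± sqrt(N/T))^2 for the
    correlation matrix of purely random series, valid when T >= N."""
    q = n_stocks / n_days
    return sigma2 * (1 - np.sqrt(q)) ** 2, sigma2 * (1 + np.sqrt(q)) ** 2

# Example: 61 stocks over roughly 250 trading days, as in the 2004-2005 window.
print(marchenko_pastur_bounds(61, 250))
```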

Today, Random Matrix Theory is applied to quantum physics, nanotechnology, quantum gravity, the study of the structure of crystals, and may have applications in ecology, linguistics, and many other fields where a huge amount of apparently unrelated information may be understood as being somehow connected. The theory has also been applied to finance in a series of works dealing with the correlation matrices of stock prices, as well as with risk management in portfolios  [19], [20], [21], [22], [23] (for a recent review on the subject, see Ref.  [24]).

Another technique that can be used to better estimate the real relations among the components of the correlation matrix is to use a regression model to remove the market effect from the asset returns, i.e., to estimate the relationship between the returns and an asset that represents the market (like the BM&F-Bovespa index, in Brazil’s case) and keep only the residuals of this model, thus eliminating the common variation of all stocks due to market movements. Since a large part of the co-movement of stocks is due to the collective response of the market to news or to other factors, removing it leaves only the dependence that is genuinely due to the assets, allowing the correlation matrix to be estimated with greater precision and generating more reliable forecasts for the risk of a portfolio. This procedure is standard in many models in finance, most importantly in the CAPM (Capital Asset Pricing Model), and it is called the Single Index Model (SIM), based on the idea that most of the systemic risk is captured by a single market index. The use of the SIM is similar to the use of a one-component RMT filter, since the highest eigenvalue of the correlation matrix corresponds to the market.
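
A minimal sketch of the Single Index Model step (an ordinary least-squares regression of each stock’s returns on the index returns, keeping the residuals; the exact specification used by the authors is given in the methodology section, so the code below is only illustrative):

```python
import numpy as np

def sim_residuals(stock_returns, market_returns):
    """Regress each stock's returns on the market index (with intercept)
    and return the residuals, which carry the non-market co-movements."""
    T = stock_returns.shape[0]
    X = np.column_stack([np.ones(T), market_returns])         # [1, R_market]
    beta = np.linalg.lstsq(X, stock_returns, rcond=None)[0]   # OLS for all stocks at once
    return stock_returns - X @ beta

# The correlation matrix is then computed from these residuals, e.g.:
# residual_corr = np.corrcoef(sim_residuals(R, R_index), rowvar=False)
```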

Other models, the so-called factor models, can also be used to remove internal or external effects and noise; they assume that the systemic risk is due to a number of factors, which may include statistical, macroeconomic, or fundamentalist influences. Ross [25], for example, presents the Arbitrage Pricing Theory (APT), which uses more than one factor to explain systemic risk. According to Campbell, Lo and MacKinlay [26], the APT provides an approximate relation for expected asset returns with an unknown number of unidentified factors. In the same way as Rosenow [27] selected the number of factors to be used in an MV-GARCH model based on the number of eigenvectors of the correlation matrix outside the Wishart (noise) region, RMT may be used to decide how many factors should enter a multifactor model of the APT type.
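
This count is straightforward to compute: a sketch (the function name and the unit-variance Marčenko–Pastur bound are illustrative assumptions, not the authors’ exact procedure) is:

```python
import numpy as np

def n_factors_from_rmt(corr, n_days):
    """Count eigenvalues above the Marchenko-Pastur upper bound
    lambda_max = (1 + sqrt(N/T))^2, i.e. beyond the pure-noise region."""
    n_stocks = corr.shape[0]
    lam_max = (1.0 + np.sqrt(n_stocks / n_days)) ** 2
    return int(np.sum(np.linalg.eigvalsh(corr) > lam_max))
```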

Previous works on the stock exchanges of emerging markets using Random Matrix Theory have been conducted for South Africa  [28], [29], India  [30], Sri Lanka  [31], and Mexico  [32]. Their results show some differences between the stock exchanges of emerging markets and the stock exchanges of more developed ones, such as less liquidity for the stocks, and less integration of different sectors.

Recent results on the application of Random Matrix Theory to financial data are basically concerned with the actual calculation of optimal portfolios, which involves the inversion of the correlation matrix of the log-returns of the stocks that may take part in the portfolio [33], [34], and with a better formalization of the theory [35], with recent results [36] that claim to outperform the usual cleaning procedures used in this article.

Following a methodology similar to ours, the authors of Ref. [37] studied the Chilean stock market using Random Matrix Theory, analyzing the eigenvalues and eigenvectors of the correlation matrix as well as the dynamics of those eigenvectors, which revealed some structure based on key industrial sectors of the Chilean economy. They also used Vector Autoregressive Analysis to pinpoint the main drivers of the Chilean stock market.

The contribution of this article is to combine the RMT and SIM methods, both of which can improve the risk forecasts of a portfolio built with stock market assets based on past data, and to do so for periods of both low and high volatility. Using Markowitz’s theory, we calculate portfolios of stocks in three different ways: cleaning the correlation matrix using RMT, removing the market effect from the assets (SIM), and combining the two procedures. We then compare all these results with the risk of portfolios built in the usual way, i.e., without using RMT and/or SIM.

In order to analyze the suitability of the proposed methods, we use the daily returns of BM&F-Bovespa stocks from 2004 to 2012, thus covering years of both low and high volatility. We only use stocks with 100% liquidity, meaning stocks that were traded every day the stock exchange was open, considering pairs of consecutive years ranging from 2004 to 2012. For each target year, we build a portfolio using data from the previous year in order to forecast the risk for that year, and the forecasted risk is then compared with the realized risk. We use 61 stocks for 2004–2005, 72 stocks for 2005–2006, 86 stocks for 2006–2007, 105 stocks for 2007–2008, 148 stocks for 2008–2009, 153 stocks for 2009–2010, 134 stocks for 2010–2011, and 125 stocks for 2011–2012.
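
As an illustration of the liquidity filter, the sketch below (assuming a pandas DataFrame of daily prices with NaN on days a stock was not traded; names are hypothetical) keeps only the stocks quoted on every trading day of the pair of years considered.

```python
import pandas as pd

def fully_liquid_stocks(prices: pd.DataFrame) -> list:
    """Keep only the stocks that have a quote on every day the exchange was open."""
    return prices.columns[prices.notna().all()].tolist()

# e.g. restrict the 2004-2005 analysis to stocks traded on every day of both years:
# liquid = fully_liquid_stocks(prices.loc["2004":"2005"])
```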

We also analyze the evolution of the portfolios in time, using the 32 stocks that were 100% liquid over the years 2003 to 2012, and study the differences between predicted and realized portfolios over time. Since the data used in this article include periods of both low and high volatility in the BM&F-Bovespa, in particular the data collected during the Subprime Mortgage Crisis of 2007 and 2008, we are able to study how the techniques of cleaning the correlation matrix and/or using the Single Index Model perform in times of high volatility.

The article is organized as follows: Section  2 is dedicated to the standard way of building portfolios (according to Markowitz). Section  3 introduces the basic concepts of Random Matrix Theory, and the characteristics of the eigenvalues of the correlation matrix, in addition to building portfolios by cleaning the correlation matrix when short selling is not allowed, as well as the regression for the removal of the market effect. The measures of how well predicted risk approximates realized risk for equal values of returns and the discussion of the results are in Section  4. Section  5 presents the analysis of how the proposed measures evolve in time, and the article ends with final remarks in Section  6.

Section snippets

Building portfolios using Markowitz’s theory

In this section, we shall start by building portfolios using the stocks that were 100% liquid during the years 2004 and 2005, as an example, based on the correlation matrix of their returns (we shall refer to log-returns as simply returns) in the year 2004, and then also for the remaining pairs of years. According to the usual portfolio theory, we can obtain w, the vector of weights of the portfolio due to each stock, by fixing the portfolio return (RE) and minimizing the risk (RI) of the
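
For reference, the optimization problem that this snippet begins to state is the standard Markowitz formulation (a hedged reconstruction in the notation above, with $\Sigma$ the covariance matrix of the returns and $\mathbf{R}$ the vector of expected returns; the paper may equivalently define the risk $RI$ as the square root of this quadratic form):

$$\min_{\mathbf{w}} \; RI = \mathbf{w}^{\mathsf{T}} \Sigma \,\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^{\mathsf{T}} \mathbf{R} = RE, \qquad \mathbf{w}^{\mathsf{T}} \mathbf{1} = 1,$$

with the additional constraint $w_i \geq 0$ when short selling is not allowed, as considered in Section 3.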

Methodology

In this section, we briefly describe the method proposed for the construction of portfolios by cleaning the correlation matrix and removing the market effect, aiming at a better forecasting of risk based on the previous behaviors of the assets. We use the year 2004 as an example of the application of such method in this section, and then apply the same methodology to the remaining years.
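
Although the full filter is described below, a minimal sketch of the eigenvalue-cleaning step (assuming the common Laloux-style recipe, in which eigenvalues inside the noise region are replaced by their average so that the trace is preserved; this is an assumption about the filter, not a transcription of the authors’ code) is:

```python
import numpy as np

def clean_correlation_matrix(corr, n_days):
    """Replace eigenvalues below the Marchenko-Pastur upper bound by their
    average, rebuild the matrix, and renormalize it to unit diagonal."""
    n_stocks = corr.shape[0]
    lam_max = (1.0 + np.sqrt(n_stocks / n_days)) ** 2
    vals, vecs = np.linalg.eigh(corr)
    noisy = vals < lam_max
    vals = vals.copy()
    vals[noisy] = vals[noisy].mean()          # keep the trace unchanged
    cleaned = (vecs * vals) @ vecs.T          # V diag(vals) V'
    d = np.sqrt(np.diag(cleaned))
    cleaned = cleaned / np.outer(d, d)        # back to a correlation matrix
    np.fill_diagonal(cleaned, 1.0)
    return cleaned
```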

Results

In this section, we present a measure of agreement between the predicted and realized risks calculated with the various combinations of techniques for building portfolios, and a measure of agreement between the correlation matrices from which those risks are calculated.

Evolution in time

Our analysis so far has been based on large windows, with a varying number of stocks for each window, and large jumps from one window to the other. In order to perform a temporal analysis of the evolution of the portfolios, we now consider the 32 stocks that were 100% liquid in the period from 2003 to 2012 in moving windows of 100 days each, with a lag of 5 days between each window. For each of these windows, portfolios are built on efficient frontiers with and without regression, with and
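
A sketch of the moving-window scheme just described (100-day windows advanced 5 days at a time; the function name is illustrative):

```python
def moving_windows(n_days, window=100, step=5):
    """Start/end indices of 100-day estimation windows shifted by 5 days."""
    return [(start, start + window)
            for start in range(0, n_days - window + 1, step)]

# For each window, portfolios are built from the in-window correlation matrix
# (raw or RMT-cleaned, with or without the SIM regression) and the predicted
# risk is compared with the risk realized out of sample.
```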

Conclusions

In this article, we used two techniques in order to clean the correlation matrix in the building of portfolios using Markowitz’s theory. The first technique is the use of Random Matrix Theory in order to clean the correlation matrix built from the time series data of stocks in the year prior to that for which the portfolio is to be built. The second technique is to use a regression model in the removal of the market effect due to the common movement of all stocks. These are used in order to

Acknowledgments

L. Sandoval Jr. and M.K. Venezuela thank Insper, Instituto de Ensino e Pesquisa, for supporting this work through a grant. We are also grateful to Gustavo Curi Amarante, who collected the data, and to Nicolas Eterovic for useful discussions. We also thank the anonymous reviewers for their valuable suggestions and insights, which improved this article immensely. This article was written using LaTeX, all figures were made using PSTricks, and the calculations were made using Matlab, R and Excel.

References (46)

  • J.D. Jobson et al., Estimation for Markowitz efficient portfolios, J. Amer. Statist. Assoc. (1980)

  • R.O. Michaud, The Markowitz optimization enigma: is ‘optimized’ optimal?, Financ. Anal. J. (1989)

  • V. Chopra et al., The effect of errors in mean and co-variance estimates on optimal portfolio choice, J. Portfolio Manage. (1993)

  • P. Jorion, Bayes–Stein estimation for portfolio analysis, J. Finan. Quant. Anal. (1986)

  • V. DeMiguel et al., A generalized approach to portfolio optimization: improving performance by constraining portfolio norms, Manag. Sci. (2009)

  • M.L. Mehta, Random Matrices (2004)

  • L. Laloux et al., Noise dressing of financial correlation matrices, Phys. Rev. Lett. (1999)

  • L. Laloux et al., Random matrix theory and financial correlations, Int. J. Theor. Appl. Finance (2000)

  • B. Rosenow et al., Portfolio optimization and the random magnet problem, Europhys. Lett. (2002)

  • V. Plerou et al., A random matrix theory approach to cross-correlations in financial data, Phys. Rev. E (2002)

  • T. Conlon et al., Random matrix theory and fund of funds portfolio optimization, Physica A (2007)

  • E.P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. of Math. (1955)

  • E.P. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of Math. (1958)