Not all that glitters is RMT in the forecasting of risk of portfolios in the Brazilian stock market

doi:10.1016/j.physa.2014.05.006

Physica A: Statistical Mechanics and its Applications

Volume 410, 15 September 2014, Pages 94-109

https://doi.org/10.1016/j.physa.2014.05.006

Highlights

• Use of Random Matrix Theory and the Single Index Model in the building of portfolios.
• We compare combinations of techniques in the cleaning of the correlation matrix.
• Cleaning the correlation matrix is not always advisable for times of high volatility.

Abstract

Using stocks of the Brazilian stock exchange (BM&F-Bovespa), we build portfolios of stocks based on Markowitz’s theory and test the predicted and realized risks. This is done using the correlation matrices between stocks, and also using Random Matrix Theory in order to clean such correlation matrices from noise. We also calculate correlation matrices using a regression model in order to remove the effect of common market movements and their cleaned versions using Random Matrix Theory. This is done for years of both low and high volatility of the Brazilian stock market, from 2004 to 2012. The results show that the use of regression to subtract the market effect on returns greatly increases the accuracy of the prediction of risk, and that, although the cleaning of the correlation matrix often leads to portfolios that better predict risks, in periods of high volatility of the market this procedure may fail to do so. The results may be used in the assessment of the true risks when one builds a portfolio of stocks during periods of crisis.

Introduction

Modern portfolio theory is largely based on Markowitz’s ideas, where portfolios of various equities are built on the principle of minimizing risk for a given expected return, allowing one to obtain an efficient frontier of risk and return. Risk is assessed through the volatility of each stock that makes up the portfolio, as well as through the covariances between stocks. The covariance matrix is used to predict the risk of a portfolio, and this predicted risk usually differs from the realized risk of the same portfolio, since the matrix is built from past stock returns.

Three problems arise from this approach. The first is that past data reflect the market as it was, not as it will be. The theory therefore assumes that future events will mimic past events, which is usually not true, since past data incorporate neither news releases nor the current mood of the market. Not much can be done about this, but in order to minimize the effects of events that might change the behavior of a market, one should not use data from too far in the past.

This leads us to the second problem, the deviations associated with the finite sample effect, which arise purely from the fact that the available data are finite. Since one cannot go back in time indefinitely, and since doing so would not be advisable even if one could, given the discussion in the preceding paragraph, there is only a limited amount of data (in our case, price quotations) from which to build a covariance matrix. The problem becomes more severe when one considers that an efficient portfolio should be built from many diverse equities while keeping the historical data fairly recent: this lowers the ratio between the number of days in the historical data and the number of stocks in the portfolio, which increases the finite sample effects.

A third problem is the statistical noise that emerges from the complex interactions between the many elements of a stock market: news, foreign markets, crises, and the very prices of stocks interact to guide the price of a stock. Those interactions are usually too complex to be accommodated by any econometric model.

All these effects are incorporated into the covariance matrix that is used in the attempt to forecast the risk of a particular portfolio, and if one can remove some of them from the matrix, one is then able to make better risk predictions. Several authors have studied the influence of noise and other factors on the covariance matrix in the building of portfolios [1], [2], [3], [4], [5], [6]. Most approaches to these problems reduce the dimensionality of the covariance matrix by introducing some structure into it, obtained by principal component analysis or by the separation of stocks into economic sectors, among other means [7], [8].

A technique that has been applied to a number of complex systems, and particularly to financial markets, is Random Matrix Theory [9]. One of its many results is the building of portfolios whose predicted risk, based on past data, more closely resembles the risk realized in the future market [10], [11], [12]; it has been successfully applied to stocks [13], [14] and to hedge funds [15].

Random Matrix Theory had its origins in 1953, in the work of the Hungarian physicist Eugene Wigner [16], [17]. He was studying the energy levels of complex atomic nuclei, such as uranium, and had no means of calculating the distances between those levels. He then assumed that the distances between energy levels should be similar to the ones obtained from a random matrix expressing the connections between the many energy levels. Surprisingly, he was then able to make sensible predictions about how the energy levels related to one another by removing the results due to a random matrix. The theory was developed further, yielding many surprising results. Of particular importance for our study are the results obtained by Marčenko and Pastur [18] on Random Matrix Theory applied to correlation matrices, better described in the section on methodology.

Today, Random Matrix Theory is applied to quantum physics, nanotechnology, quantum gravity, the study of the structure of crystals, and may have applications in ecology, linguistics, and many other fields where a huge amount of apparently unrelated information may be understood as being somehow connected. The theory has also been applied to finance in a series of works dealing with the correlation matrices of stock prices, as well as with risk management in portfolios [19], [20], [21], [22], [23] (for a recent review on the subject, see Ref. [24]).

Another technique that can be used to better estimate the real relations among the components of the correlation matrix is to use a regression model to remove the market effect on the asset returns, i.e., to estimate the relationship between the returns and an asset that represents the market (like the BM&F-Bovespa index, in Brazil’s case) and to use only the residuals of this model, thus eliminating the common variations of all stocks due to market movements. Since a large part of the dependence between stocks is due to the collective responses of the market to news or to other factors, and only a part of it is due to the assets themselves, this procedure allows the correlation matrix to be estimated with greater precision, which generates more reliable forecasts for the risk of a portfolio. This procedure is standard in many models in finance, most importantly in the CAPM (Capital Asset Pricing Model), and it is called the Single Index Model (SIM), based on the idea that the majority of the systemic risk is captured by a single market index. The use of SIM is similar to the use of one component in the RMT filter, since the highest eigenvalue of the correlation matrix corresponds to the market.
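The removal of the market effect described above can be illustrated with a minimal sketch, assuming an ordinary least squares regression of each stock's returns on the returns of a market index with an intercept. The function name `sim_residuals` and the synthetic data are hypothetical and are not taken from the article; the sketch only shows that the correlations between residuals are much weaker than the correlations between raw returns when a common market factor is present.

```python
import numpy as np

def sim_residuals(returns, market):
    """Residuals of r_i = alpha_i + beta_i * r_m + e_i, fitted stock by stock.

    returns: (T, N) array of stock log-returns
    market:  (T,) array of market-index log-returns
    """
    returns = np.asarray(returns, dtype=float)
    market = np.asarray(market, dtype=float)
    # Design matrix: a column of ones (intercept) and the market return.
    X = np.column_stack([np.ones_like(market), market])
    # Least squares for all stocks at once; coef has shape (2, N).
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    return returns - X @ coef

# Synthetic data for illustration only (not the BM&F-Bovespa sample).
rng = np.random.default_rng(0)
T, N = 250, 5
market = rng.normal(0.0, 0.01, size=T)
betas = rng.uniform(0.5, 1.5, size=N)
returns = market[:, None] * betas + rng.normal(0.0, 0.01, size=(T, N))

residuals = sim_residuals(returns, market)
corr_raw = np.corrcoef(returns, rowvar=False)    # includes the market effect
corr_sim = np.corrcoef(residuals, rowvar=False)  # market effect removed
```

By construction, the residuals are orthogonal to the market series, so whatever correlation remains in `corr_sim` is not explained by the single index.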

Other models can be used to remove internal or external effects: the so-called factor models, which rest on the hypothesis that the systemic risk is due to a number of factors, which may include statistical, macroeconomic, or fundamental influences, and which can also be used to remove noise. Ross [25], for example, presents a model, called Arbitrage Pricing Theory (APT), which uses more than one factor to explain systemic risk. According to Campbell, Lo and MacKinlay [26], the APT model provides an approximate relation for expected asset returns with an unknown number of unidentified factors. In the same way as Rosenow [27] selected the number of factors to be used in a MV-GARCH model based on the number of eigenvectors of the correlation matrix outside the Wishart (noise) region, RMT may be used in order to decide how many factors should be used in a multifactor model of the APT type.

Previous works on the stock exchanges of emerging markets using Random Matrix Theory have been conducted for South Africa [28], [29], India [30], Sri Lanka [31], and Mexico [32]. Their results show some differences between the stock exchanges of emerging markets and the stock exchanges of more developed ones, such as less liquidity for the stocks, and less integration of different sectors.

Recent results on the application of Random Matrix Theory to financial data are mainly concerned with the actual calculation of the optimal portfolios, which involves the inversion of the correlation matrix of log-returns of the time series of the stocks that may take part in the portfolio [33], [34], and with a better formalization of the theory [35], with recent results [36] that claim to outperform the usual cleaning procedures used in this article.

Following a methodology similar to ours, the authors of Ref. [37] studied the Chilean stock market using Random Matrix Theory, analyzing the eigenvalues and eigenvectors of the correlation matrix and the dynamics of those eigenvectors, which revealed some structure based on key industrial sectors of the Chilean economy. They also used Vector Autoregressive Analysis in order to pinpoint the main drivers of the Chilean stock market.

The contribution of this article is to combine the RMT and SIM methods, both of which are capable of improving the risk forecasts of a portfolio built with stock market assets based on past data, and to do so for periods of both low and high volatility. Using Markowitz’s theory, we calculate portfolios of stocks in three different ways: cleaning the correlation matrix using RMT, removing the market effect from the asset returns (SIM), and combining the two procedures. We then compare all these results with the risk of portfolios built in the usual way, i.e., without using RMT and/or SIM.

In order to analyze the suitability of the proposed methods, we use the daily returns of the BM&F-Bovespa stocks from 2004 to 2012, thus covering years of both low and high volatility. We only use stocks with 100% liquidity, which means that those stocks were traded every day the stock exchange was open, and we consider pairs of consecutive years ranging from 2004 to 2012. For each year being analyzed, we build a portfolio using data from the previous year in order to forecast the risk for the target year, and that forecasted risk is then compared with the realized risk in that year. We use 61 stocks for 2004–2005, 72 stocks for 2005–2006, 86 stocks for 2006–2007, 105 stocks for 2007–2008, 148 stocks for 2008–2009, 153 stocks for 2009–2010, 134 stocks for 2010–2011, and 125 stocks for 2011–2012.
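The comparison between forecasted and realized risk for a pair of years can be sketched as follows. This is only an illustration under stated assumptions: the risk is taken as the standard deviation $\sqrt{w^{T}Cw}$ of the portfolio, the weights are equal (not optimized), the data are synthetic, and the ratio `realized / predicted` is a hypothetical summary that is not necessarily the agreement measure used in the article.

```python
import numpy as np

def portfolio_risk(weights, returns):
    """Portfolio standard deviation sqrt(w' C w), with C the sample covariance
    of the (T, N) array of returns."""
    cov = np.cov(returns, rowvar=False)
    return float(np.sqrt(weights @ cov @ weights))

# Synthetic data for illustration only: a calm "previous year" and a
# more volatile "target year".
rng = np.random.default_rng(1)
n_stocks = 4
past = rng.normal(0.0, 0.01, size=(250, n_stocks))
future = rng.normal(0.0, 0.02, size=(250, n_stocks))

w = np.full(n_stocks, 1.0 / n_stocks)   # equal weights, for illustration
predicted = portfolio_risk(w, past)     # risk forecast from past data
realized = portfolio_risk(w, future)    # risk observed in the target year
ratio = realized / predicted            # above 1: the forecast underestimated risk
```

When volatility rises between the estimation year and the target year, as in this synthetic example, the forecast built from past data underestimates the realized risk.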

We also analyzed the evolution of the portfolios in time using 32 stocks that were 100% liquid in the years ranging from 2003 to 2012 and studied the differences between predicted and realized portfolios in time. As the data used in this article include periods of both low and high volatility in the BM&F-Bovespa, in particular the data collected during the Subprime Mortgage Crisis of 2007 and 2008, we are able to study how the technique of cleaning the correlation matrix and/or using the Single Index Model applies to times of high volatility.

The article is organized as follows: Section 2 is dedicated to the standard way of building portfolios (according to Markowitz). Section 3 introduces the basic concepts of Random Matrix Theory, and the characteristics of the eigenvalues of the correlation matrix, in addition to building portfolios by cleaning the correlation matrix when short selling is not allowed, as well as the regression for the removal of the market effect. The measures of how well predicted risk approximates realized risk for equal values of returns and the discussion of the results are in Section 4. Section 5 presents the analysis of how the proposed measures evolve in time, and the article ends with final remarks in Section 6.

Section snippets

Building portfolios using Markowitz’s theory

In this section, we shall start by building portfolios using the stocks that were 100% liquid during the years 2004 and 2005, as an example, based on the correlation matrix of their returns (we shall refer to log-returns as simply returns) in the year 2004, and then also for the remaining pairs of years. According to the usual portfolio theory, we can obtain $w$, the vector of weights of the portfolio due to each stock, by fixing the portfolio return (RE) and minimizing the risk (RI) of the
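The optimization outlined in this snippet, fixing the portfolio return and minimizing the risk, can be sketched in closed form under two assumptions that are not taken from the article: the weights sum to one, and short selling is allowed, so that the Lagrange multiplier solution applies. A version without short selling would need a quadratic programming solver instead. The function name and the numbers below are hypothetical.

```python
import numpy as np

def min_risk_weights(cov, mean_returns, target_return):
    """Weights minimizing w' C w subject to w' mu = target_return and sum(w) = 1.

    Closed-form Lagrangian solution; weights may be negative (short selling).
    """
    mean_returns = np.asarray(mean_returns, dtype=float)
    ones = np.ones(len(mean_returns))
    A = np.column_stack([mean_returns, ones])   # the two linear constraints
    b = np.array([target_return, 1.0])
    cinv_A = np.linalg.solve(cov, A)            # C^{-1} A
    multipliers = np.linalg.solve(A.T @ cinv_A, b)
    return cinv_A @ multipliers

# Hypothetical inputs for three assets.
mu = np.array([0.001, 0.002, 0.003])
cov = 1e-4 * np.array([[4.0, 1.0, 0.0],
                       [1.0, 9.0, 2.0],
                       [0.0, 2.0, 16.0]])
target = 0.002

w = min_risk_weights(cov, mu, target)
risk = float(np.sqrt(w @ cov @ w))   # predicted risk at this point of the frontier
```

Repeating the calculation for a range of target returns traces the efficient frontier of risk against return.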

Methodology

In this section, we briefly describe the method proposed for the construction of portfolios by cleaning the correlation matrix and removing the market effect, aiming at a better forecast of risk based on the previous behavior of the assets. We use the year 2004 as an example of the application of this method, and then apply the same methodology to the remaining years.
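Since the details of the cleaning procedure are not reproduced in this excerpt, the following is only a sketch of one common RMT filter, not necessarily the one used in the article. It assumes that eigenvalues of the correlation matrix at or below the upper Marčenko–Pastur edge $\lambda_{+} = (1 + \sqrt{N/T})^{2}$, for $N$ stocks and $T$ days, are treated as noise and replaced by their average, after which the matrix is rescaled to a unit diagonal. Function names and data are hypothetical.

```python
import numpy as np

def mp_upper_bound(n_stocks, n_days):
    """Upper edge of the Marchenko-Pastur spectrum for a correlation matrix
    of n_stocks uncorrelated series observed over n_days."""
    return (1.0 + np.sqrt(n_stocks / n_days)) ** 2

def clean_correlation(corr, n_days):
    """Replace eigenvalues in the noise band by their mean and rebuild the matrix."""
    n_stocks = corr.shape[0]
    lam_max = mp_upper_bound(n_stocks, n_days)
    vals, vecs = np.linalg.eigh(corr)       # eigenvalues in ascending order
    noisy = vals <= lam_max
    cleaned_vals = vals.copy()
    if noisy.any():
        cleaned_vals[noisy] = vals[noisy].mean()
    cleaned = (vecs * cleaned_vals) @ vecs.T
    # Rescale so that the result is again a correlation matrix (unit diagonal).
    d = np.sqrt(np.diag(cleaned))
    return cleaned / np.outer(d, d)

# Synthetic returns with one common factor, for illustration only.
rng = np.random.default_rng(2)
n_days, n_stocks = 250, 20
market = rng.normal(size=n_days)
returns = 0.7 * market[:, None] + rng.normal(size=(n_days, n_stocks))

corr = np.corrcoef(returns, rowvar=False)
cleaned = clean_correlation(corr, n_days)
```

In this synthetic example the largest eigenvalue, associated with the common factor, lies well above the noise band and is left untouched, while the remaining eigenvalues are flattened.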

Results

In this section, we present one measure of agreement between the predicted and realized risks calculated with the many combinations of techniques for building portfolios, and one measure of agreement between the correlation matrices from which those risks are calculated.

Evolution in time

Our analysis so far has been based on large windows, with a varying number of stocks for each window, and large jumps from one window to the other. In order to perform a temporal analysis of the evolution of the portfolios, we now consider the 32 stocks that were 100% liquid in the period from 2003 to 2012 in moving windows of 100 days each, with a lag of 5 days between each window. For each of these windows, portfolios are built on efficient frontiers with and without regression, with and
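The layout of the moving windows can be sketched as below, assuming that "a lag of 5 days" means that each window starts 5 trading days after the previous one. The function name and the total number of days in the example are hypothetical.

```python
def moving_windows(n_days, window=100, step=5):
    """(start, end) index pairs, end exclusive, for windows of `window` days
    whose starting points are `step` days apart."""
    return [(start, start + window)
            for start in range(0, n_days - window + 1, step)]

# Example with a hypothetical series of 250 trading days.
windows = moving_windows(n_days=250)
first, last = windows[0], windows[-1]
```

Each pair can then be used to slice the matrix of returns, for instance `returns[start:end]`, before building the portfolios for that window.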

Conclusions

In this article, we used two techniques in order to clean the correlation matrix in the building of portfolios using Markowitz’s theory. The first technique is the use of Random Matrix Theory in order to clean the correlation matrix built from the time series data of stocks in the year prior to that for which the portfolio is to be built. The second technique is to use a regression model in the removal of the market effect due to the common movement of all stocks. These are used in order to

Acknowledgments

L. Sandoval Jr. and M.K. Venezuela thank the support of this work by a grant from Insper, Instituto de Ensino e Pesquisa. We are also grateful to Gustavo Curi Amarante, who collected the data and to Nicolas Eterovic, for useful discussions. We also thank the anonymous reviewers, for their valuable suggestions and insights, which improved this article immensely. This article was written using LaTeX, all figures were made using PSTricks, and the calculations were made using Matlab, R and Excel.

References (46)

S. Sharifi et al.
Random matrix theory for portfolio optimization: a stability approach
Physica A
(2004)
S. Pafka et al.
Noisy covariance matrices and portfolio optimization II
Physica A
(2003)
V. Tola et al.
Cluster analysis for portfolio optimization
J. Econom. Dynam. Control
(2008)
B. Rosenow
Determining the optimal dimensionality of multivariate volatility models with tools from random matrix theory
J. Econom. Dynam. Control
(2008)
K.G.D.R. Nilantha et al.
Eigenvalue density of cross-correlations in Sri Lankan financial market
Physica A
(2007)
N.A. Eterovic et al.
Separating the wheat from the chaff: understanding portfolio returns in an emerging market
Emerg. Mark. Rev.
(2013)
M. Tumminello et al.
Correlation, hierarchies, and networks in financial markets
J. Econ. Behav. Organ.
(2010)
G.M. Frankfurter et al.
Portfolio selection: the effects of uncertain means, variances, and covariances
J. Finan. Quant. Anal.
(1971)
G.M. Frankfurter et al.
Estimation risk in the portfolio selection model: a comment
J. Finan. Quant. Anal.
(1972)
J.P. Dickinson
The reliability of estimation procedures in portfolio analysis
J. Finan. Quant. Anal.
(1974)

J.D. Jobson et al.
Estimation for Markowitz efficient portfolios
J. Amer. Statist. Assoc.
(1980)
R.O. Michaud
The Markowitz optimization enigma: is ‘optimized’ optimal?
Financ. Anal. J.
(1989)
V. Chopra et al.
The effect of errors in mean and co-variance estimates on optimal portfolio choice
J. Portfolio Manage.
(1993)
P. Jorion
Bayes-Stein estimation for portfolio analysis
J. Finan. Quant. Anal.
(1986)
V. DeMiguel et al.
A generalized approach to portfolio optimization: improving performance by constraining portfolio norms
Manag. Sci.
(2009)
M.L. Mehta
Random Matrices
(2004)
L. Laloux et al.
Noise dressing of financial correlation matrices
Phys. Rev. Lett.
(1999)
L. Laloux et al.
Random matrix theory and financial correlations
Int. J. Theor. Appl. Finance
(2000)
B. Rosenow et al.
Portfolio optimization and the random magnet problem
Europhys. Lett.
(2002)
V. Plerou et al.
A random matrix theory approach to cross-correlations in financial data
Phys. Rev. E
(2002)
T. Conlon et al.
Random matrix theory and fund of funds portfolio optimization
Physica A
(2007)
E.P. Wigner
Characteristic vectors of bordered matrices with infinite dimensions
Ann. of Math.
(1955)
E.P. Wigner
On the distribution of the roots of certain symmetric matrices
Ann. of Math.
(1958)

Cited by (7)

On a spiked model for large volatility matrix estimation from noisy high-frequency data
2019, Computational Statistics and Data Analysis
Recently, inference about high-dimensional integrated covariance matrices (ICVs) based on noisy high-frequency data has emerged as a challenging problem. In the literature, a pre-averaging estimator (PA-RCov) is proposed to deal with the microstructure noise. Using the large-dimensional random matrix theory, it has been established that the eigenvalue distribution of the PA-RCov matrix is intimately linked to that of the ICV through the Marčenko–Pastur equation. Consequently, the spectrum of the ICV can be inferred from that of the PA-RCov. However, extensive data analyses demonstrate that the spectrum of the PA-RCov is spiked, that is, a few large eigenvalues (spikes) stay away from the others which form a rather continuous distribution with a density function (bulk). Therefore, any inference on the ICVs must take into account this spiked structure. As a methodological contribution, a spiked model is proposed for the ICVs where spikes can be inferred from those of the available PA-RCov matrices. The consistency of the inference procedure is established. In addition, the methodology is applied to the real data from the US and Hong Kong markets. It is found that the model clearly outperforms the existing one in predicting the existence of spikes and in mimicking the empirical PA-RCov matrices.
Global financial indices and twitter sentiment: A random matrix theory approach
2016, Physica A: Statistical Mechanics and its Applications
We use Random Matrix Theory (RMT) approach to analyze the correlation matrix structure of a collection of public tweets and the corresponding return time series associated to 20 global financial indices along 7 trading months of 2014. In order to quantify the collection of tweets, we constructed daily polarity time series from public tweets via sentiment analysis. The results from RMT analysis support the fact of the existence of true correlations between financial indices, polarities, and the mixture of them. Moreover, we found a good agreement between the temporal behavior of the extreme eigenvalues of both empirical data, and similar results were found when computing the inverse participation ratio, which provides an evidence about the emergence of common factors in global financial information whether we use the return or polarity data as a source. In addition, we found a very strong presumption that polarity Granger causes returns of an Indonesian index for a long range of lag trading days, whereas for Israel, South Korea, Australia, and Japan, the predictive information of returns is also presented but with less presumption. Our results suggest that incorporating polarity as a financial indicator may open up new insights to understand the collective and even individual behavior of global financial indices.
Interest rate next-day variation prediction based on hybrid feedforward neural network, particle swarm optimization, and multiresolution techniques
2016, Physica A: Statistical Mechanics and its Applications
Citation Excerpt :
Accurate time series forecasting is always an important issue in different economics and business related applications; including prediction of wind speed [1,2], portfolio risk [3,4], energy market volatility [5,6], inflation [7], exchange rate [8], and financial market price index [9,10].
Multiresolution analysis techniques including continuous wavelet transform, empirical mode decomposition, and variational mode decomposition are tested in the context of interest rate next-day variation prediction. In particular, multiresolution analysis techniques are used to decompose interest rate actual variation and feedforward neural network for training and prediction. Particle swarm optimization technique is adopted to optimize its initial weights. For comparison purpose, autoregressive moving average model, random walk process and the naive model are used as main reference models. In order to show the feasibility of the presented hybrid models that combine multiresolution analysis techniques and feedforward neural network optimized by particle swarm optimization, we used a set of six illustrative interest rates; including Moody’s seasoned Aaa corporate bond yield, Moody’s seasoned Baa corporate bond yield, 3-Month, 6-Month and 1-Year treasury bills, and effective federal fund rate. The forecasting results show that all multiresolution-based prediction systems outperform the conventional reference models on the criteria of mean absolute error, mean absolute deviation, and root mean-squared error. Therefore, it is advantageous to adopt hybrid multiresolution techniques and soft computing models to forecast interest rate daily variations as they provide good forecasting performance.
A Random-Matrix-Theory-Based Analysis of the Brazilian Stock Market During the 2008 Financial Crisis and Asian Crisis and Temporal Neighborhoods
2022, Fluctuation and Noise Letters
An efficient deep learning based model to predict interest rate using twitter sentiment
2020, Sustainability (Switzerland)
Between nonlinearities, complexity, and noises: An application on portfolio selection using kernel principal component analysis
2019, Entropy
