Measuring the Systematic Risk of Sectors within the US Market Via Principal Components Analysis: Before and during the COVID-19 Pandemic

This research measures the systematic risk of 10 sectors in the American stock market, distinguishing the period before the COVID-19 pandemic from the period during it. The novelty of this study is the use of Principal Component Analysis (PCA) to measure the systematic risk of each sector, selecting the five stocks with the greatest market capitalization in each sector. The results show that the sectors with the greatest increase in exposure to systematic risk during the pandemic are restaurants, clothing, and insurance, whereas the sectors with the greatest decrease are automakers and tobacco. Based on these results, it seems advisable for practitioners to select stocks from the automakers or tobacco sectors for protection against health crises such as COVID-19.


Introduction
According to Sullivan and Sheffrin [1], diversification is the process of allocating capital in a way that reduces the exposure to any particular asset or risk. Fama and Miller [2] state that the Capital Asset Pricing Model (CAPM) introduces the concepts of diversifiable and non-diversifiable risk. Synonyms for diversifiable risk are unsystematic risk and security-specific risk. Synonyms for non-diversifiable risk are systematic risk, beta risk, and market risk. Thus, the CAPM argues that investors should only be compensated for non-diversifiable risk.
According to Pasini [3], Principal Component Analysis (PCA) is a method of multivariate analysis. The idea behind PCA is to reduce the dimensionality of a dataset containing a large number of interrelated variables by finding linear combinations of the variables with maximal variance. It is applied to data with no groupings among the observations and no partitioning of the variables into subsets y and x. The first principal component is the linear combination with maximal variance; the second is the linear combination with maximal variance in a direction orthogonal to the first; and so on for the others. The components are therefore ordered sequentially, with the first explaining as much of the variation as it can.
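The properties described above (components ordered by variance and mutually orthogonal) can be checked numerically. The following is a minimal sketch using NumPy on synthetic data; the data-generating process, the variable names, and the choice of an eigendecomposition of the sample covariance matrix are illustrative assumptions, not the procedure of any cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 observations of 4 interrelated variables,
# built from one shared driver plus independent noise.
common = rng.normal(size=(500, 1))
X = common @ np.array([[1.0, 0.8, 0.6, 0.4]]) + 0.5 * rng.normal(size=(500, 4))

# Center the data and form the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to descending.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Principal component scores: projections of the data onto each direction.
scores = Xc @ eigvecs

# Share of total variance explained by each component.
explained_ratio = eigvals / eigvals.sum()
print(np.round(explained_ratio, 3))
```

Each column of `eigvecs` is one direction, the variance of the corresponding column of `scores` equals the matching eigenvalue, and the columns of `scores` are uncorrelated with one another.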
With the help of the PCA, we measure how each sector is affected by market risk, measured by the first component. This article proceeds as follows. The next section presents relevant literature on PCA and the stock market, and the third section describes our methods and data. The fourth section presents the analyses of the findings, and lastly, we present our conclusions in the fifth section.

Systematic risk
Lakonishok and Shapiro [4] conclude that neither the traditional measure of risk (beta) nor the alternative risk measures (variance or residual standard deviation) can explain the cross-sectional variation in returns; only size seems to matter. Gencay et al. [5] propose a new approach to estimating systematic risk (the beta of an asset) and find that the relationship between the return of a portfolio and its beta becomes stronger as the wavelet scale increases. Campbell et al. [6] state that the systematic risks of individual stocks with similar accounting characteristics are primarily driven by the systematic risks of their fundamentals. Xing and Yan [7] indicate that improving accounting information quality causes the systematic risk to decrease, thus having important implications for disclosure decisions, portfolio management, and asset pricing.

PCA and the stock market
Liu and Wand [8] study the Chinese stock market and find that the performance of the BP model integrating PCA is closer to that of the proposed model in a relatively large sample. Hargreaves and Mani [9], using PCA through a perceptual map, provide a clear picture of the winning stocks that should be selected for trading. Wang et al. [10] achieve a good level of fitness using two-directional two-dimensional PCA and Radial Basis Function Neural Networks (RBFNN) in the Shanghai stock market. Zahedi and Rounaghi [11], studying the Tehran stock exchange with artificial neural network models and PCA, note that prices have been accurately predicted and modeled in the form of a new pattern consisting of all variables. Noby and Lee [12] analyze global financial indices over 1998-2012 and indicate that the dynamics of individual indices within the group become more similar over time, and that the dynamics of indices are more similar during crises. Gao et al. [13] predict the closing price of the stock market with two-dimensional PCA and deep belief networks (DBNs).
Waqar et al. [14] analyze three stock exchanges and show how PCA can help to improve the predictive performance of machine learning methods while reducing the redundancy among the data. Zhing and Enke [15] forecast the daily direction of the S&P 500 Index ETF (SPY) return and show that DNNs using two PCA-represented datasets give slightly higher classification accuracy than the entire untransformed dataset. Nahil and Lyhyaoui [16] show that the structure of the investment decision system can be simplified through the application of kernel PCA. Berradi and Lazaar [17], using both PCA and a recurrent neural network model, reduce the number of features from eight to six, giving a good prediction of the Total Maroc stock price. Cao and Wang [18] compare the performance of both PCA and backpropagation (BP) neural network algorithms and find that the latter has the highest prediction accuracy.
More recently, Wen et al. [19] demonstrate how both PCA and LSTM can accurately predict the stock price fluctuation trend of Pingon Bank. Liang et al. [20], using volatility information of grains and softs through PCA and FA, find significant predictive ability in forecasting the realized volatility (RV) of the S&P 500. Xu et al. [21], through the use of PCA, investigate the Chinese A-shares market over the 2013-2019 period and find that, regardless of investor sentiment, stock prices react significantly to rumors as well as when a rumor goes public. Yaojie et al. [22], using PCA and other methods, show the significant ability of the combined international volatility to predict US stock volatility. The literature review shows that PCA has been useful in dimensionality reduction, price prediction, and the study of other features of the stock market. This paper applies the technique in an innovative way, namely to measure the systematic risk in various sectors of the US stock market.

Methods & data
According to Ross et al. [23], systematic risk is the one that influences a large number of assets, thus having market-wide effects. On the other hand, unsystematic risk is the one that affects a single asset or a small group of assets. Since the former cannot be eliminated through diversification, it is called non-diversifiable risk, whereas the latter is called diversifiable risk because it can be eliminated through portfolio diversification.
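The distinction can be made concrete with the standard textbook variance decomposition for an equally weighted portfolio. The notation here is introduced only for illustration: $n$ is the number of assets, $\overline{\sigma^2}$ the average variance of the individual assets, and $\overline{\sigma_{ij}}$ the average pairwise covariance.

```latex
\sigma_p^2
  = \frac{1}{n}\,\overline{\sigma^2}
  + \left(1 - \frac{1}{n}\right)\overline{\sigma_{ij}},
\qquad
\lim_{n \to \infty} \sigma_p^2 = \overline{\sigma_{ij}}
```

The first term, which reflects asset-specific variance, vanishes as $n$ grows; the average covariance remains however many assets are added, which is the sense in which systematic risk cannot be diversified away.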

Principal Component Analysis
According to [24], PCA is a technique that may be useful where explanatory variables are closely related. Specifically, if there are k explanatory variables in the regression model, PCA will transform them into k uncorrelated new variables. To explain, suppose that the original explanatory variables are denoted $x_1, x_2, \ldots, x_k$, and denote the principal components by $p_1, p_2, \ldots, p_k$. These principal components are independent linear combinations of the original data:

$$p_i = \alpha_{i1} x_1 + \alpha_{i2} x_2 + \cdots + \alpha_{ik} x_k, \qquad i = 1, \ldots, k$$

where $\alpha_{ij}$ are coefficients to be calculated, representing the coefficient on the jth explanatory variable in the ith principal component. These coefficients are also known as factor loadings. The principal components are derived in such a way that they are in descending order of importance. In particular, for this study, we take the first component as a representative of systematic risk, that is, the risk that affects the whole sector and cannot be diversified in a stock portfolio. For this analysis, we write a script in Python; in particular, we use the sklearn library to compute the principal components.
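As an illustration of the kind of computation described here, the sketch below fits sklearn's `PCA` to synthetic daily returns for five hypothetical stocks that share one sector-wide factor. It is not the authors' script: the simulated data, the helper name `fit_sector_pca`, and the decision to leave the returns unstandardized are assumptions made for the example (sklearn's `PCA` centers the data but does not rescale it).

```python
import numpy as np
from sklearn.decomposition import PCA


def fit_sector_pca(returns, n_components=3):
    """Fit PCA to a (days x stocks) return matrix.

    Returns the explained-variance ratios of the leading components and
    the loadings of the first component (hypothetical helper).
    """
    pca = PCA(n_components=n_components)
    pca.fit(returns)
    return pca.explained_variance_ratio_, pca.components_[0]


rng = np.random.default_rng(42)
n_days, n_stocks = 250, 5

# Assumed data-generating process: one common sector factor plus
# stock-specific noise, standing in for real daily log returns.
sector_factor = rng.normal(scale=0.01, size=(n_days, 1))
idiosyncratic = rng.normal(scale=0.005, size=(n_days, n_stocks))
returns = sector_factor + idiosyncratic  # factor broadcasts across stocks

ratios, first_loadings = fit_sector_pca(returns)
print("explained variance ratios:", np.round(ratios, 3))
print("first-component loadings:", np.round(first_loadings, 3))
```

Under this reading, `ratios[0]` plays the role of the sector's systematic-risk measure. When a single common factor dominates, the first-component loadings all share the same sign, although the overall sign of a principal component is arbitrary.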
We gather all data from Yahoo Finance, covering 10 sectors of the US stock market and choosing the five biggest companies per sector by market capitalization (Table 1). We take daily log returns of stock prices and divide the periods of study into two-the pre-COVID-19 era-January 10 to May 10, 2021.
Table 2 displays the explained variance per principal component by sector. Specifically, we consider the first principal component to be representative of systematic risk, whereas the other two are representative of non-systematic risk, that is, diversifiable risk. The three principal components capture the majority of the variance, ranging from 86.3% (restaurants) to 95.5% (airlines) during the pre-COVID period and from 88.1% (clothing) to 97.1% (banks) during the COVID period. Figure 1 shows the overall results for the variance explained by the first principal component in all sectors analyzed. Before the pandemic, the three sectors with the highest systematic risk, as measured by the first principal component, are banks, energy, and airlines; the sectors with the lowest systematic risk are restaurants, healthcare, and automakers. During COVID-19, the three sectors whose exposure to systematic risk increased the most are restaurants, with an increase of 39.3%, clothing, with 22.2%, and insurance, with 14.5%. On the other hand, the sectors that show a reduction in systematic risk during COVID-19 are automakers, with 13.2%, and tobacco, with 10.3%.
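The pipeline in this section (prices to daily log returns, then the first component's share of variance in each period, then the change between periods) can be sketched as follows. Everything here is illustrative: the prices are simulated rather than downloaded, the function names are hypothetical, and because the text does not say whether the reported changes are relative changes or percentage-point differences, the sketch computes both.

```python
import numpy as np


def log_returns(prices):
    """Daily log returns from a (days x stocks) price array."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)


def pc1_share(returns):
    """Share of total variance explained by the first principal component."""
    cov = np.cov(returns, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)  # ascending order
    return eigvals[-1] / eigvals.sum()


def simulate_prices(rng, n_days, factor_scale, noise_scale, n_stocks=5):
    """Hypothetical price paths driven by one common factor plus noise."""
    factor = rng.normal(scale=factor_scale, size=(n_days, 1))
    noise = rng.normal(scale=noise_scale, size=(n_days, n_stocks))
    return 100.0 * np.exp(np.cumsum(factor + noise, axis=0))


rng = np.random.default_rng(7)
# Assumed scenario: the common factor becomes more volatile in the second period.
pre = simulate_prices(rng, 500, factor_scale=0.008, noise_scale=0.008)
during = simulate_prices(rng, 300, factor_scale=0.020, noise_scale=0.008)

share_pre = pc1_share(log_returns(pre))
share_during = pc1_share(log_returns(during))

relative_change = (share_during - share_pre) / share_pre  # relative to the earlier share
point_change = share_during - share_pre                   # percentage-point difference

print(f"pre: {share_pre:.3f}, during: {share_during:.3f}")
print(f"relative change: {relative_change:+.1%}, point change: {point_change:+.3f}")
```

With real data, each sector's five price series would replace the simulated arrays, and the two periods would be the sample windows defined in the study.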

Findings
Under our proposed metric of systematic risk, the sectors most affected by crises such as pandemics are restaurants, clothing, and insurance; in contrast, the sectors that show resilience during the pandemic are automakers and tobacco. Given these results, it seems advisable for practitioners to rely more on stocks in the automakers or tobacco sectors, owing to their lesser exposure to systematic risk.

Conclusions
The innovation of this research is twofold: first, we apply PCA to measure systematic risk, and second, we distinguish systematic risk before and during COVID-19. In particular, the sectors whose exposure to systematic risk increases the most are restaurants, clothing, and insurance; in contrast, the sectors that show a decrease in systematic risk during the pandemic are automakers and tobacco, indicating resilience. The results suggest that portfolio managers are better off picking stocks from sectors such as automakers and tobacco in times of health crises such as pandemics, enhancing the benefits of diversification and creating a shield against the increase in systematic risk caused by these kinds of shocks. Consequently, further research could use the methodology proposed in this paper to measure systematic risk and better protect against crises such as COVID-19, thus having practical implications around the world (Video, https://youtu.be/o5SIhEHrRW8).