Forecasting the cross-sectional stock returns: Evidence from the United Kingdom

This study forecasts expected stock returns in the United Kingdom using cross-sectional estimates from Fama-MacBeth regressions. We collect data on firms listed on the London Stock Exchange from DataStream for January 1980 to December 2020 and analyze three cross-sectional models using ten-year rolling estimates of Fama-MacBeth slopes. The empirical findings demonstrate that an investor can derive a composite estimate of expected returns in real time by combining various firm-specific variables. In model 1, the expected-return estimates have a predictive slope for future monthly returns of 95.07%, with a standard error of 0.1981. The corresponding predictive slopes for model 2 and model 3 are 77.57% and 76.94%. In short, our empirical evidence suggests that investors and stakeholders may consider using model 1 to estimate the cost of equity because of its simplicity and effective predictive ability. Our findings are consistent with trade-off theory and prior literature.


Introduction
Cross-sectional stock returns have long been of interest to practitioners and academics. The CAPM has been popular for decades because of its simplicity and solid theoretical foundation. However, recent studies document various limitations of the CAPM because it struggles to account for size effects, value effects, and momentum effects. Fama and French (2008) confirm that the correlations between a company's future stock returns and firm characteristics such as size, book-to-market (B/M), past returns, and investment are significantly positive. These findings are unlikely to be due to chance or data snooping. They are robust in both characteristic-sorted portfolio tests and Fama-MacBeth (FM) cross-sectional regressions.
Although academics have examined cross-sectional and time-series models of expected stock returns extensively, two questions remain unanswered. First, traditional cross-sectional regressions and portfolio sorts have not determined whether an investor could forecast cross-sectional variation in expected returns. Second, although studies such as Sharma and Chakraborty (2019) employ Fama-MacBeth (1973) (FM hereafter) regressions to price risk factors, there is no clear evidence on how reliable FM regression estimates of expected returns are. Lewellen (2015) addresses both questions by using slopes estimated in prior years to form expected-return estimates from FM regressions and by reporting the distribution and out-of-sample predictive ability of those estimates. The main concern of Lewellen (2015) is whether subsequent realized returns, regressed on the expected-return estimates, have a slope of one. The closer this slope is to one, the more reasonable the out-of-sample forecast is as an estimate of the expected return.
We follow the approach of Lewellen (2015) to conduct an empirical analysis of the United Kingdom (the UK hereafter) for the following reasons. Firstly, to our knowledge, no previous study has applied this approach to UK data. Most research on multifactor asset pricing models uses only data from the United States, and evidence from other countries is scarce. As a result, it is critical to determine whether the results obtained by Lewellen (2015) apply to other stock markets. Our research could be used to test the robustness of asset pricing models. Furthermore, different asset pricing models produce varied results because of differences in institutional setups and investor preferences (Drobetz et al., 2019; Hyde & Sherif, 2010), which lead to different exposures to macroeconomic factors and degrees of internationalization, as well as differences in companies' accounting treatments and reporting periods (French, 2017). Therefore, decision-makers and authorities in the UK who use asset pricing models to estimate the cost of equity or evaluate performance may need to adopt a different model than decision-makers in the US. As a result, there is an evident need for asset pricing models to be tested using data from markets other than the United States (Fama & French, 2017; Huynh, 2018). Secondly, the UK capital market plays an essential role in the world. The UK's trading volume and market capitalization are the third largest globally (Hsieh et al., 2018). Therefore, we are interested in providing new evidence from a major developed market by analyzing UK data. Our empirical findings also serve as a robustness check for evidence derived from US data and provide complementary out-of-sample evidence to the findings of Lewellen (2015). Finally, there is uncertainty about which models could provide a meaningful measure of the expected cost of equity in the UK market. Fletcher et al. (2018) compare the efficiency of different asset pricing models in the UK market and conclude that it is difficult to recommend one model over the others. Furthermore, Chen and Sherif (2016), Kariofyllas et al. (2017), and Foye (2018) examine the UK market using three-, four-, or five-factor models, and they provide mixed evidence on the premium of each factor. As a result, researchers and managers in the UK market need a helpful model to predict cross-sectional stock returns. On this basis, we construct the first hypothesis:

H1: The capital asset pricing model is applicable in the UK market.
Using UK data from 1986 to 2020, we analyze three FM regression specifications with progressively larger sets of independent variables. The first model consists of size, B/M, and past 12-month stock returns (three predictors). The second model adds accruals, profitability, and asset growth to the three predictors of model 1 (six predictors). Based on models 1 and 2, we propose the second hypothesis:
H2: Investors can make a profit through the above predictors.
The third model adds further variables that could forecast returns: beta, Return-13,-36, market leverage (debt/price), and sales/price (ten predictors). The three regression models combine variables measured at different frequencies, namely monthly variables (size and B/M) and annual variables (accruals, asset growth, and dividend yields). Therefore, we propose the third hypothesis as follows:
H3: The extended model 3 predicts returns better than models 1 and 2.
Our preliminary tests concentrate on monthly stock return forecasts based on 10-year rolling averages of FM slopes. The estimates of expected returns line up well with actual expected returns. When subsequent returns are regressed on the expected-return estimates from the three sets of predictor variables, the first model generates a mean slope of 0.865 (with a t-value of 7.08). With t-statistics of 7.39, these slope estimates are significant. The second model generates an insignificant slope, and the third model generates a mean slope of 0.192 (with a t-value of 5.01). In short, the slopes of model 1 and model 3 are consistent with Lewellen (2015), but they are more significant in the UK market. In addition to the preliminary tests, we sort stocks into deciles based on the different expected-return estimates. The aim is to examine the estimates obtained from 10-year rolling FM slopes and to see whether a trading strategy that goes long the decile with the highest predicted returns and short the decile with the lowest predicted returns (the H-L strategy) yields a significant profit. For the first model, the average spread between the predicted monthly returns of the top and bottom deciles is around 5.6% per month. The spread is significant, with a t-value of 10.17. Similarly, the third model generates a significant spread between the predicted monthly returns of the top and bottom deciles of around 11.31% per month (with a t-value of 14.11). However, the second model generates an insignificant spread between the top and bottom deciles. Therefore, investors may consider employing the first and third models to generate potential arbitrage profit in the UK market.
Finally, we run cross-sectional regressions of monthly returns on lagged firm characteristics. The evidence from the three models suggests that the book-to-market ratio, size, and momentum have insignificant effects on monthly returns. The size result is consistent with Zagonov and Hanke (2020), who conclude that there is an insignificant size effect in the EU. It is also consistent with evidence from the US market (Chemmanur & Yan, 2019), which is based on data from Standard and Poor's Compustat. However, the finding is not consistent with Drobetz et al. (2019), who document a significant negative size effect in the European stock market from September 1999 to December 2018, or with Zaremba and Czapkiewicz (2017), who detect a significantly positive size effect in six European countries. The evidence from model 2 and model 3 suggests that accruals and asset growth positively affect the expected return. In model 3, Return-2,-12 has an insignificant positive relationship with expected returns, while Return-13,-36 has a significant negative impact on expected returns. This finding is consistent with the literature because long-term past returns show signs of reversal. The sign of beta is positive, consistent with the argument of Bali et al. (2017).
Our study contributes to the literature in three ways. First, while prior research suggests that numerous firm characteristics are related to subsequent stock returns, there is limited evidence on whether these characteristics can be used, either individually or in combination, to estimate actual expected returns. For example, even though B/M, accruals, and past returns are closely related to subsequent returns, we cannot be sure whether forecasts based on them line up well with actual expected returns. If the cross-sectional slopes are unstable or imprecisely estimated, or if the true parameters vary over time, the out-of-sample predictive power of estimated expected returns can be low even when firm characteristics have historically been significant predictors of returns.
Second, there is very little work on how an investor can derive a composite trading strategy, in real time, based on the combination of company characteristics available at the time. Out-of-sample forecasts from FM regressions provide a simple but surprisingly effective way to construct a composite trading strategy that goes long high-expected-return stocks and short low-expected-return stocks, using real-time slope estimates. We test a maximum of ten firm characteristics in the regressions. The results show that many characteristics do not predict stock returns, which makes it challenging to identify the best variables.
Finally, by replicating the approach of Lewellen (2015), we not only examine the empirical explanatory power of these models but also provide evidence from outside the United States; to the best of our knowledge, none of the three models has previously been tested for consistency with Lewellen (2015) using non-US data. Moreover, we employ an extensive data sample, which generates less biased findings.
The remainder of the paper is structured as follows: Section 2 describes the data and method, Section 3 analyses the monthly return forecasts and examines their relation to subsequently realized returns, and Section 4 concludes.

Data and Models
This study collects data on UK common stocks from DataStream for January 1980 to December 2020. We exclude suspended companies to prevent survivorship bias, and the data are expressed in pounds sterling. We exclude utilities and financials (Zheng et al., 2017). Moreover, size and the book-to-market ratio used to explain portfolio returns in month t are calculated at the end of month t-1. Therefore, market capitalization (our measure of size) and the book-to-market ratio must be available in month t − 1. Following Fama and French (1993), we retain only stocks with non-negative book-to-market equity. Following Guo and Savickas (2008), we impose additional filters on the DataStream data to eliminate potential coding errors. The market return (Rm,t) is the return on the FTSE 250, and the short-term interest rate is the 90-day (three-month) Treasury bill rate. Since the market returns and 90-day T-bill rates are available from February 1986, our tests run from February 1986 to December 2020. We follow Duong et al. (2021) and restrict the sample to firms with valid data for all variables. Moreover, independent variables are winsorized at the 1st and 99th percentiles. The final sample includes 72,626 firm-month observations from February 1986 to December 2020.
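As a minimal sketch of the winsorization step, the following Python function clips one cross-section of a characteristic at its 1st and 99th percentiles. The function names, the linear-interpolation percentile rule, and the idea of applying the clipping separately to each month's cross-section are assumptions made for illustration; the text does not specify these details.

```python
def percentile(sorted_vals, q):
    """Percentile q (0-100) of an already sorted list, by linear interpolation.

    The interpolation rule is an assumption; other conventions exist.
    """
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower=1.0, upper=99.0):
    """Clip values below the lower percentile and above the upper percentile."""
    ordered = sorted(values)
    lo = percentile(ordered, lower)
    hi = percentile(ordered, upper)
    return [min(max(v, lo), hi) for v in values]
```

Under these assumptions, calling `winsorize` on the values of one characteristic in one month leaves the interior of the distribution unchanged and only pulls the extreme tails in to the percentile bounds.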
We match accounting data as of March of year t with market data at the end of September of year t to construct the variables used to form portfolios. Agarwal and Taffler (2008) show that 22% of UK companies have a March fiscal year-end, while 33% have a December fiscal year-end. The study uses FM regressions of returns on lagged firm characteristics to make the return forecasts. We focus on three regression models with progressively larger sets of predictor variables. The first two models use characteristics that prior studies find to be significant:
Model 1 (three predictors): size, B/M, and the lagged 12-month stock return.
Model 2 (six predictors): adds one-year accruals, profitability, and asset growth.
Model 3 (ten predictors): adds four characteristics that historically have a weaker relation to subsequent returns, namely beta, three-year stock returns, market leverage, and the sales-to-price ratio.
Models 1 and 2 are arguably the most relevant if an investor identified the strongest predictors early on. In contrast, Model 3 is most relevant if an investor considered a more extensive set of predictors, even though we do not offer substantial justification for the model. Variables are defined in Table 1.
Variable: Description (source)
LogSize-1: Log of market capitalization at the end of the previous month t-1 (Nartea et al., 2017)
Return-2,-12: Momentum factor, measured as the change in log prices over 11 months, lagged one month (Chang et al., 2018)
AccrualsYr-1: Change in non-cash working capital minus depreciation and amortization in the prior fiscal year (Liu et al., 2019)
ROAYr-1: Profitability, measured as the ratio of income before extraordinary items to average assets in the previous fiscal year (Stambaugh & Yuan, 2017)
LogAGYr-1: Log growth in total assets in the previous fiscal year (Angulo-Ruiz et al., 2018)
LogReturn-13,-36: Three-year stock return (from month -36 to month -13) (Mselmi et al., 2019)
Beta-1,-36: Market beta estimated from monthly returns from month -36 to month -1 (Lewellen, 2015)
Debt/PriceYr-1: Market leverage, measured as total debt (short-term and long-term) divided by the market value of total assets at the end of the prior month (Lewellen, 2015)
Sales/PriceYr-1: Sales in the prior fiscal year divided by market value at the end of the prior month (Zaremba & Andreu, 2018)
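The lagged-return variables in Table 1 (Return-2,-12 and LogReturn-13,-36) can be sketched as differences of log month-end prices. The helper below is hypothetical, and the indexing convention (the window from month t-12 to month t-2 runs from the month-end price of t-13 to that of t-2) is an assumption, since the table does not spell it out.

```python
import math


def lagged_log_return(prices, t, first_lag, last_lag):
    """Cumulative log return over months t-last_lag through t-first_lag.

    prices[k] is the month-end price of month k (assumed convention).
    Returns None when the price history is too short for the window.
    """
    start = t - last_lag - 1  # month-end just before the window opens
    end = t - first_lag       # month-end at which the window closes
    if start < 0 or end >= len(prices):
        return None
    return math.log(prices[end] / prices[start])
```

For example, `lagged_log_return(prices, t, 2, 12)` would give the 11-month momentum return skipping the most recent month, and `lagged_log_return(prices, t, 13, 36)` the longer-horizon return used in Model 3.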

Descriptive statistics
Firm characteristics are either slowly changing level variables (size and B/M) or flow variables measured over a year (income and revenues). Therefore, the monthly observations are highly persistent. Table 2 reports summary statistics for monthly returns and the ten variables used in the study. The statistics are time-series averages of the monthly cross-sectional mean, standard deviation, and sample size of each variable. We winsorize all characteristics, but not monthly returns, at the 1st and 99th percentiles. Table 3 presents the correlation matrix of the independent variables. The correlations between variables are at most moderate: the maximum correlation is 0.55, so we conclude that there is no multicollinearity issue in our sample. Table 4 reports average slopes, R², and sample sizes for the monthly cross-sectional regressions from February 1986 to December 2020. The t-statistics are based on the time-series variability of the slope estimates, with a Newey-West correction with four lags to account for possible autocorrelation in the slopes. As explained above, Table 4 shows results for three specifications of the regressions.
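The procedure behind Table 4 can be sketched as follows: run one cross-sectional OLS regression per month, average the monthly slopes, and compute t-statistics from the time-series variability of the slopes with a Newey-West correction. This is a minimal pure-Python illustration, not the authors' code; the function names, the data layout, and the use of Bartlett weights are assumptions.

```python
import math


def ols(X, y):
    """OLS coefficients for y = X b; each row of X already includes a constant."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def newey_west_se(series, lags=4):
    """Newey-West standard error of the mean of a series (Bartlett weights)."""
    n = len(series)
    mean = sum(series) / n
    d = [v - mean for v in series]
    lrv = sum(v * v for v in d) / n  # lag-0 autocovariance
    for lag in range(1, lags + 1):
        gamma = sum(d[t] * d[t - lag] for t in range(lag, n)) / n
        lrv += 2.0 * (1.0 - lag / (lags + 1.0)) * gamma
    return math.sqrt(max(lrv, 0.0) / n)


def fama_macbeth(panel, lags=4):
    """panel: list of (X_rows, returns), one entry per month.

    Returns (average coefficients, Newey-West t-statistics).
    """
    monthly = [ols(X, y) for X, y in panel]
    means, tstats = [], []
    for j in range(len(monthly[0])):
        series = [b[j] for b in monthly]
        mean = sum(series) / len(series)
        se = newey_west_se(series, lags)
        means.append(mean)
        tstats.append(mean / se if se > 0 else float("nan"))
    return means, tstats
```

Each element of `panel` holds one month's regressor rows (constant first, then lagged characteristics) and the same month's returns. The monthly coefficient vectors computed inside `fama_macbeth` are also the inputs a rolling-average forecast would need.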

Firm-level regressions
In all three models, the slopes on B/M, lagged 12-month returns, and profitability are insignificant. These results are inconsistent with prior research such as Duong et al. (2021), Kubota and Takehara (2018), Kongsilp and Mateus (2017), and Nartea et al. (2017). On the other hand, model 2 and model 3 report positive slopes on LogAG and accruals. These findings are consistent with Angulo-Ruiz et al. (2018), whose results reflect the strong predictive power of total asset growth for future stock returns (Zaremba & Maydybura, 2019; Jacobs, 2016). In particular, model 3 indicates a reversal effect because the slope on Return-2,-12 is positive while the slope on Return-13,-36 is negative. This finding is consistent with Kelly et al. (2021), who provide strong evidence that past returns predict long-term price reversals. Moreover, in model 3 the slope on beta is positive while the slope on ROA is negative. The beta coefficient is consistent with Bali et al. (2017). Other studies find statistically reliable evidence that the relation between expected return and market beta is positive but not as strong as that predicted by the CAPM. Table 4 reports several interesting findings. First, the FM R² does not measure the predictive power of the variables. It reflects only how much of the contemporaneous return volatility the characteristics explain, not their predictive strength. A simple example illustrates the point: suppose all stocks have the same expected return but different betas, and a single-factor market model explains all return variation, so that stocks have no idiosyncratic residuals. In FM regressions, beta would then have perfect explanatory power for monthly returns even though it has no power to forecast returns.
Beta would be perfectly positively correlated with returns in the months when the market rises and perfectly negatively correlated in the months when it falls, because realized returns always line up exactly with beta. More generally, the slopes in FM regressions can be interpreted as the returns on characteristic-based portfolios (Fama and MacBeth, 1973).
This table reports average slopes and R² from Fama-MacBeth cross-sectional regressions of monthly returns on up to ten lagged firm characteristics. The t-statistics are based on the time-series variability of the slope estimates, incorporating a Newey-West correction with four lags to account for possible autocorrelation. All data are retrieved from DataStream. Accounting data as of March of year t are matched with market data at the end of September of year t.
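The example of stocks with identical expected returns but different betas can be made concrete with a small simulation. The sketch below assumes a hypothetical market return that alternates between +2% and -2%, so that its mean is zero and every stock has the same (zero) expected return; the betas and all names are illustrative only.

```python
def slope_and_r2(x, y):
    """Slope and R-squared of a univariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy * sxy / (sxx * syy)


betas = [0.5, 0.75, 1.0, 1.25, 1.5]  # hypothetical betas
market = [0.02, -0.02] * 6           # hypothetical zero-mean market returns

# With no idiosyncratic residuals, each stock's return is beta times the market.
results = [slope_and_r2(betas, [b * m for b in betas]) for m in market]
avg_slope = sum(s for s, _ in results) / len(results)
avg_r2 = sum(r for _, r in results) / len(results)
```

In this setup the monthly R² is essentially one in every month, while the average slope on beta is essentially zero, which is why a high FM R² by itself says nothing about forecasting power.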
Second, the way B/M is measured significantly affects the slopes on B/M and 12-month momentum. Many studies follow Fama and French (1992) and compute B/M once a year at the end of June, using book equity for the last fiscal year and the market value of the firm's equity in December. Lewellen (2015) calculates B/M once a year from the latest market and book equity (the latter updated four months after the fiscal year-end). Our measure is instead based on March accounting data for year t and market data at the end of September of year t. It therefore reflects recent stock price changes in a timelier way.
Table 5 presents the distribution and out-of-sample predictive ability of the expected-return estimates from the FM regressions above. The forecasts combine a firm's beginning-of-month characteristics with the prior 10-year rolling average of the slopes from the three models in Table 4, starting in 1986. Therefore, the first rolling forecasts start in 1996. Again, the primary purpose of this paper is to demonstrate what investors could have forecast about expected returns using only slopes from prior regressions. The columns on the right in Table 5 provide summary statistics for the expected-return estimates, including the monthly cross-sectional means, standard deviations, and 10th and 90th percentiles. The cross-sectional mean is reported mainly for descriptive purposes; it matters less than the cross-sectional dispersion for judging whether the estimates capture variation in expected returns across stocks. Table 5 also shows how well the forecasts from the three models predict subsequent stock returns.
This table reports the distribution (average, standard deviation, and 10th and 90th percentiles) and predictive ability (slope, standard error, t-statistic, and R²) of monthly return forecasts derived from a firm's current characteristics and slopes from past FM regressions (10-year rolling estimates starting in 1986). All point estimates are time-series averages of monthly cross-sectional parameters. The predictive slopes and R² come from out-of-sample FM regressions of actual monthly returns on the expected-return estimates; standard errors are based on the time-series variability of the estimates, incorporating a Newey-West correction with four lags. All data are retrieved from DataStream. Model 1 consists of size, B/M, and 12-month momentum; Model 2 adds accruals, profitability, and asset growth; Model 3 adds beta, market leverage, sales/price, and three-year returns.
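The forecasting step summarized in Table 5 can be sketched in three parts: average the FM coefficients from the previous 120 months, combine that average with each firm's current characteristics, and regress realized returns on the resulting forecasts. The function names and data layout below are hypothetical, and the sketch covers a single month rather than the full time-series average.

```python
def rolling_average_slopes(slopes, t, window=120):
    """Average the FM coefficient vectors estimated in the `window` months before t.

    slopes[m] is the coefficient vector (intercept first) from month m.
    """
    if t < window:
        raise ValueError("not enough prior months for a full window")
    past = slopes[t - window:t]
    return [sum(s[j] for s in past) / window for j in range(len(past[0]))]


def forecast_returns(avg_slopes, characteristics):
    """Forecast = intercept + sum of (average slope * current characteristic)."""
    return {stock: avg_slopes[0] + sum(b * x for b, x in zip(avg_slopes[1:], xs))
            for stock, xs in characteristics.items()}


def predictive_slope(forecasts, realized):
    """Cross-sectional OLS slope of realized returns on forecasts for one month."""
    stocks = sorted(forecasts)
    f = [forecasts[s] for s in stocks]
    r = [realized[s] for s in stocks]
    mf, mr = sum(f) / len(f), sum(r) / len(r)
    sxy = sum((a - mf) * (b - mr) for a, b in zip(f, r))
    sxx = sum((a - mf) ** 2 for a in f)
    return sxy / sxx
```

Only slopes from months before t enter the forecast for month t, which is what makes the exercise out-of-sample. A monthly predictive slope near one indicates that the dispersion in forecasts matches the dispersion in subsequent returns, while a slope below one indicates that the forecasts are too spread out.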

Estimates of expected stock returns
The expected-return estimates display cross-sectional variation in all three models. For the entire sample, the cross-sectional standard deviations range from 0.5775 using 10-year rolling slope estimates for Model 3 to 3.13 using cumulative slope estimates for Model 1. Table 5 also addresses the critical question of whether the estimates pick up cross-sectional variation in actual expected returns. An unbiased forecast of expected returns should predict subsequent returns with a slope of one (a better forecast does not necessarily have greater statistical significance, which depends on how precisely the estimates capture cross-sectional variation in returns). The tests in Table 5 are out-of-sample FM regressions, with t-statistics based on the time-series variability of the monthly slopes. Notably, in model 1, the predicted returns based on current characteristics track the variation in subsequent returns well: the predictive slope is 0.9507 with a t-statistic of 4.8. For model 2 and model 3, the predictive slopes are 77.57% and 76.94%, and these slopes are also statistically significant. The results indicate that actual expected returns vary across stocks in line with the expected-return estimates. However, the standard deviation of model 1 is the highest among the three models. Therefore, investors face a trade-off between the simplicity of the model and the variance of its outcomes. The findings are consistent with the trade-off theory (Khoa et al., 2020). The evidence in Table 5 has several implications. First, the expected-return estimates from model 1 have strong predictive power for subsequent stock returns. Although the coefficients on individual characteristics in the FM regression are not significant for model 1, its out-of-sample forecast is the most accurate and reliable among the three models. Moreover, the model includes a small number of variables, so it is a simple approach to estimating expected returns. Finally, adding more variables to the FM regression reduces the out-of-sample predictive power of the forecasts.
The results also suggest that the FM slopes are sufficiently stable and precisely estimated for the out-of-sample expected-return forecasts to be informative. Unlike regressions on time-series data (Fletcher et al., 2018), forecasts of subsequent returns based on prior FM regressions are reliable. The FM regressions indicate that companies with more favorable characteristics tend to earn higher future returns, and the predictive slopes below one imply that the predicted returns should be shrunk toward their cross-sectional mean (Lewellen, 2015).

Expected Returns of Sorted Portfolios
To give further insight into the predictive ability of the return forecasts, Table 6 reports results for portfolios sorted on predicted expected returns. Predicted returns for value-weighted portfolios increase monotonically from the Low to the High decile in all three models. In model 1, predicted returns show the smallest dispersion among the three models. The H-L profit for model 1 is 5.06%, with a t-value of 10.17. Model 3 also generates a significant H-L profit of 11.31%, with a t-value of 14.11. However, the H-L profit for model 2 is not significant because of its extremely high standard deviation. In summary, the results in Table 6 have a fundamental implication: even if the expected-return estimates from the FM model predict subsequent returns well, they vary too much across portfolios relative to average realized returns.
This table reports average predicted (Pred) returns for value-weighted portfolios formed by ranking stocks on predicted expected returns, together with the standard deviation (Std) and the Newey-West t-statistic (t-stat), which tests whether the risk premium is positive or negative. Model 3 uses all ten firm characteristics; predicted expected returns are calculated from firms' current characteristics and 10-year rolling estimates of past FM slopes. All data are retrieved from DataStream.
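The decile sort behind Table 6 can be sketched for a single month as follows. Stocks are ranked by predicted return, split into ten groups, and each group's value-weighted return is computed; the H-L spread is the top group minus the bottom group. The function name, the dictionary-based data layout, and the equal-count grouping rule are assumptions made for illustration.

```python
def decile_returns(forecasts, realized, weights, n_groups=10):
    """Value-weighted realized returns of portfolios sorted on forecasts.

    forecasts, realized, and weights map each stock to its predicted return,
    realized return, and market value. Returns (group returns from Low to
    High, High-minus-Low spread).
    """
    ranked = sorted(forecasts, key=forecasts.get)
    n = len(ranked)
    if n < n_groups:
        raise ValueError("need at least one stock per group")
    groups = [ranked[g * n // n_groups:(g + 1) * n // n_groups]
              for g in range(n_groups)]

    def value_weighted(stocks):
        total = sum(weights[s] for s in stocks)
        return sum(weights[s] * realized[s] for s in stocks) / total

    returns = [value_weighted(g) for g in groups]
    return returns, returns[-1] - returns[0]
```

Repeating this calculation every month yields a time series of H-L spreads, and the average spread and its t-statistic can then be computed from that series.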

Conclusion
The study clarifies the nature of expected returns in the UK stock market. We employ Fama-MacBeth cross-sectional regressions on up to ten lagged firm characteristics, in three models, to predict monthly stock returns. We also conduct a robustness test using monthly return forecasts based on 10-year rolling averages of FM slopes. In addition, we perform an arbitrage profit test by sorting stocks into deciles based on their estimated expected returns.
The results differ across the three FM regression models. The forecasts vary across stocks and have strong predictive power for actual returns, with predictive slopes ranging from 0.9507 in model 1 to 0.7694 in model 3. Specifically, the forecasts derived from the FM regressions effectively simulate how an investor can, in real time, combine a variety of company characteristics to obtain a composite estimate of a company's expected return. This result is consistent with the first hypothesis that the capital asset pricing model is effective in predicting expected returns. We also find that in all three models the slopes on B/M, lagged 12-month returns, and profitability are insignificant. Specifically, in model 1 most variables, namely B/M, size, and the past 12-month return, are not statistically significant. The result for model 2 is similar, except that the leverage variable is statistically significant. This result aligns with the second hypothesis in that the predictors can partially explain returns.
Furthermore, the expected-return results of Models 2 and 3 differ considerably. In Model 3, accruals, leverage, market beta, and three-year past returns are statistically significant. Model 3 is economically and statistically significant both in the out-of-sample predictive slopes from the FM cross-sectional regressions and in the return differential across portfolios sorted on expected returns. This result is consistent with the third hypothesis that model 3, with its extended set of characteristics, can predict returns better than models 1 and 2.
In many finance applications, the time series and cross-section of stock returns play a significant role in evaluating asset pricing models, developing trading strategies, and estimating firms' cost of capital. The primary objective of this study was to evaluate whether an investor can obtain an unbiased real-time forecast of expected stock returns based on the company's current characteristics and the historical slopes of FM regressions.