Abstract
It is widely recognized that the limited attention capacity of individual investors affects stock performance. We construct five aggregate investor attention indices for each stock by extracting the common information components related to stock returns from various attention proxies, using equal-weighted (EW), principal component analysis (PCA), partial least squares (PLS), gradient boosting decision tree (GBDT), and random forest (RF) methods. In a sample of all Shanghai Stock Exchange 50 constituent stocks, we find that the two attention indices constructed with machine learning algorithms, RF and GBDT, provide economically meaningful improvements in the prediction of stock returns both in sample and out of sample. Moreover, these indices are negatively related to return volatility. The results suggest that machine learning is useful for forming proxies of investor attention and reveal the excellent forecasting power of these proxies in asset pricing.
Notes
Here \(U\) and \(V\) are orthogonal matrices whose columns are orthonormal eigenvectors of \(A{A}^{T}\) and \({A}^{T}A\), respectively, and \({S}^{\prime}\) is a diagonal matrix whose \(r\) nonzero diagonal elements are the square roots of the positive eigenvalues of \(A{A}^{T}\) (which coincide with the positive eigenvalues of \({A}^{T}A\)).
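The properties stated in this note can be checked numerically. The following is a minimal sketch, assuming the note refers to the singular value decomposition \(A = U{S}^{\prime}{V}^{T}\); the matrix `A` below is random, hypothetical data and not taken from the paper's sample.

```python
import numpy as np

# Hypothetical 4x3 matrix standing in for A
# (e.g. rows = dates, columns = attention proxies); illustrative data only.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigenvalues of the symmetric matrix A^T A, re-sorted in descending order.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]

# Singular values equal the square roots of the positive eigenvalues.
# Clipping guards against tiny negative values caused by rounding.
sv_match = np.allclose(s, np.sqrt(np.clip(eigvals, 0.0, None)))

# Columns of U and rows of Vt are orthonormal, and the factors rebuild A.
u_orthonormal = np.allclose(U.T @ U, np.eye(U.shape[1]))
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))
reconstructs = np.allclose(U @ np.diag(s) @ Vt, A)

print(sv_match, u_orthonormal, v_orthonormal, reconstructs)
```

The thin form (`full_matrices=False`) is used here so that the diagonal factor is square; the full decomposition pads it with zero rows or columns but has the same \(r\) nonzero entries.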
Acknowledgements
This work is supported by the National Natural Science Foundation of China (72071141).
About this article
Cite this article
Chu, G., Goodell, J.W., Shen, D. et al. Machine learning to establish proxies for investor attention: evidence of improved stock-return prediction. Ann Oper Res 318, 103–128 (2022). https://doi.org/10.1007/s10479-022-04892-0
DOI: https://doi.org/10.1007/s10479-022-04892-0