A Sentiment Index of the Housing Market in China: Text Mining of Narratives on Social Media

The Journal of Real Estate Finance and Economics

Abstract

Many efforts have been made to investigate the sentiment in financial and commercial real estate markets, but only a few studies focus on residential markets because of the lack of appropriate sentiment measuring approaches. In this study, we utilize social media narratives to build sentiment indexes for the housing market in China, where house-price-related narratives are abundantly documented on social media. With the help of the latest text analysis technologies from the deep learning and natural language processing fields, our indexes are built on a solid basis for understanding the semantic meanings of textual data. Highlighting the semantic temporality of text, we build separate future and past sentiment indexes to capture people’s prior beliefs and posterior feelings about price movements, respectively. The future sentiment index could serve as an alternative to survey-based expectations, measure the impacts of policies on people’s beliefs, and have remarkable power in predicting the future movements of both listed developers’ stock prices and house prices.

Notes

  1. See http://weibo.com.

  2. See https://www.techinasia.com/weibo-more-users-than-twitter.

  3. Source: http://tech.sina.com.cn/i/2011-01-28/18045144960.shtml and http://tech.sina.com.cn/i/2015-02-20/doc-iavxeafs1236800.shtml.

  4. See https://kefu.weibo.com/faqclassifylist?id=75 for more details about Sina Weibo.

  5. When building sentiment indexes, we also try to exploit the reposting, commenting and liking volumes as the weights for the original microblogs. The resulting indexes remain almost unchanged. More details are provided in Appendix 1.

  6. See http://www.deeplearning.net/software/theano/index.html.

  7. See https://github.com/fxsjy/jieba.

  8. See https://dumps.wikimedia.org/.

  9. See http://radimrehurek.com/gensim/.

  10. It should be noted that the classification models in related work are typically ternary classifiers that distinguish between three types (positive, neutral and negative), while ours is a stack of binary sub-classifiers. We have therefore carefully adjusted the accuracy metrics before comparison.

  11. The average accuracies of LSTM and the best benchmark are 88.40% and 85.24%, respectively, so the misclassified samples are reduced by (88.40% - 85.24%) / (100% - 85.24%) = 21.41%.

  12. The distribution is inconsistent with that of the labeled corpus because, in the corpus labeling, we disproportionately add more relevant sentences to provide a larger training sample for the temporality and sentiment classifiers.

  13. About 5.9% of microblogs in the whole dataset provide geo-tags.
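
The arithmetic in note 11 can be checked with a short sketch. The function name `relative_error_reduction` is ours and is not taken from the paper; the two accuracy figures are the ones quoted in the note.

```python
def relative_error_reduction(acc_model: float, acc_baseline: float) -> float:
    """Share of the baseline's misclassified samples that the model removes."""
    baseline_error = 1.0 - acc_baseline
    if baseline_error <= 0:
        raise ValueError("baseline accuracy must be below 1")
    # Gain in accuracy, expressed relative to the baseline's error rate.
    return (acc_model - acc_baseline) / baseline_error


# Accuracies from note 11: LSTM 88.40%, best benchmark 85.24%.
reduction = relative_error_reduction(0.8840, 0.8524)
print(f"{reduction:.2%}")  # prints 21.41%
```

Expressing the gain relative to the benchmark's error rate, rather than as a raw 3.16-point accuracy difference, shows how much of the remaining error the LSTM removes.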

References

  • Akerlof, G. A., & Shiller, R. J. (2009). Animal spirits: How human psychology drives the economy, and why it matters for global capitalism. Princeton University Press.

  • Ammann, M., & Schaub, N. (2021). Do Individual Investors Trade on Investment-related Internet Postings? Management Science, 67(9), 5679–5702.

  • Antweiler, W., & Frank, M. Z. (2004). Is all that talk just noise? The information content of Internet stock message boards. The Journal of Finance, 59(3), 1259–1294.

  • Armona, L., Fuster, A., & Zafar, B. (2018). Home price expectations and behavior: Evidence from a randomized information experiment. Review of Economic Studies, rdy038. https://doi.org/10.1093/restud/rdy038

  • Azar, P. D., & Lo, A. W. (2016). The wisdom of Twitter crowds: predicting stock market reactions to FOMC meetings via Twitter feeds. Journal of Portfolio Management Special QES Issue 2016, 42(5), 123–134.

  • Baker, M., & Wurgler, J. (2006). Investor sentiment and the cross-section of stock returns. The Journal of Finance, 61(4), 1645–1680.

  • Baker, M., & Wurgler, J. (2007). Investor sentiment in the stock market. The Journal of Economic Perspectives, 21(2), 129–151.

  • Barberis, N., Shleifer, A., & Vishny, R. (1998). A model of investor sentiment. Journal of Financial Economics, 49(3), 307–343.

  • Bartov, E., Faurel, L., & Mohanram, P. S. (2017). Can Twitter help predict firm-level earnings and stock returns? The Accounting Review, 93(3), 25–57.

  • Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of Computational Science, 2(1), 1–8.

  • Bollerslev, T., Patton, A. J., & Wang, W. (2016). Daily house price indices: Construction, modeling, and longer-run predictions. Journal of Applied Econometrics, 31(6), 1005–1025.

  • Brockman, P., Li, X., & Price, S. M. (2015). Differences in conference call tones: managers vs. analysts. Financial Analysts Journal, 71(4), 24–42.

  • Brown, G. W., & Cliff, M. T. (2004). Investor sentiment and the near-term stock market. Journal of Empirical Finance, 11(1), 1–27.

  • Brown, G. W., & Cliff, M. T. (2005). Investor sentiment and asset valuation. The Journal of Business, 78(2), 405–440.

  • Case, K. E., & Shiller, R. J. (1989). The efficiency of the market for single-family homes. American Economic Review, 79(1), 125–137.

  • Case, K. E., & Shiller, R. J. (2003). Is there a bubble in the housing market? Brookings Papers on Economic Activity, 2, 299–342.

  • Case, K. E., Shiller, R. J., & Thompson, A. K. (2012). What Have They Been Thinking? Homebuyer Behavior in Hot and Cold Markets. Brookings Papers on Economic Activity, 2012(2), 265–315.

  • Checkley, M. S., Higón, D. A., & Alles, H. (2017). The hasty wisdom of the mob: How market sentiment predicts stock market behavior. Expert Systems with Applications, 77, 256–263.

  • Chen, H., De, P., Hu, Y. J., & Hwang, B. H. (2014). Wisdom of crowds: The value of stock opinions transmitted through social media. The Review of Financial Studies, 27(5), 1367–1403.

  • Chen, H., Sun, M., Tu, C., Lin, Y., & Liu, Z. (2016). Neural sentiment classification with user and product attention. Paper presented at Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 1650–1659), Austin, Texas. Association for Computational Linguistics. https://doi.org/10.18653/v1/D16-1171

  • Chen, X., & Ang, P. H. (2011). The Internet police in China: Regulation, scope and myths. In D. Herold & P. Marolt (Eds.), Online Society in China (pp. 40–52). Routledge.

  • Clayton, J., Ling, D. C., & Naranjo, A. (2009). Commercial real estate valuation: Fundamentals versus investor sentiment. The Journal of Real Estate Finance and Economics, 38(1), 5–37.

  • Cohen, L., & Frazzini, A. (2008). Economic links and predictable returns. The Journal of Finance, 63(4), 1977–2011.

  • Cornelli, F., Goldreich, D., & Ljungqvist, A. (2006). Investor sentiment and pre-IPO markets. The Journal of Finance, 61(3), 1187–1216.

  • Da, Z., Engelberg, J., & Gao, P. (2014). The sum of all fears investor sentiment and asset prices. The Review of Financial Studies, 28(1), 1–32.

  • Das, P. K., Freybote, J., & Marcato, G. (2015). An investigation into sentiment-induced institutional trading behavior and asset pricing in the REIT market. The Journal of Real Estate Finance and Economics, 51(2), 160–189.

  • Das, S. R., & Chen, M. Y. (2007). Yahoo! for Amazon: Sentiment extraction from small talk on the Web. Management Science, 53(9), 1375–1388.

  • Day, M., & Lee, C. (2016). Deep learning for financial sentiment analysis on finance news providers. Paper presented at 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) (pp. 1127–1134). IEEE. https://doi.org/10.1109/ASONAM.2016.7752381

  • De Long, J. B., Shleifer, A., Summers, L. H., & Waldmann, R. J. (1990). Noise trader risk in financial markets. Journal of Political Economy, 98(4), 703–738.

  • Deng, Y., Girardin, E., & Joyeux, R. (2018). Fundamentals and the volatility of real estate prices in China: A sequential modelling strategy. China Economic Review, 48(4), 205–222.

  • Deng, Y., Gyourko, J., & Wu, J. (2012). Land and house price measurement in China. In A. Heath, F. Packer, & C. Windsor (Eds.), Property Markets and Financial Stability (pp. 13–43). Bank for International Settlements and Reserve Bank of Australia.

  • Doran, J. S., Peterson, D. R., & Price, S. M. (2012). Earnings conference call content and stock price: The case of REITs. The Journal of Real Estate Finance and Economics, 45(2), 402–434.

  • Fang, H., Gu, Q., Xiong, W., & Zhou, L. (2016). Demystifying the Chinese housing boom. NBER Macroeconomics Annual, 30(1), 105–166.

  • Freybote, J., & Seagraves, P. A. (2017). Heterogeneous investor sentiment and institutional real estate investments. Real Estate Economics, 45(1), 154–176.

  • Gallimore, P., & Gray, A. (2002). The role of investor sentiment in property investment decisions. Journal of Property Research, 19(2), 111–120.

  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

  • Jin, C., Soydemir, G., & Tidwell, A. (2014). The US housing market and the pricing of risk: Fundamental analysis and market sentiment. Journal of Real Estate Research, 36(2), 187–219.

  • Johnson, S. G. B., & Tuckett, D. (2017). Narrative decision-making in investment choices: How investors use news about company performance. SSRN preprint. https://doi.org/10.2139/ssrn.3037463

  • Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 82(1), 35–45.

  • Kim, Y. (2014). Convolutional neural networks for sentence classification. https://doi.org/10.48550/arXiv.1408.5882

  • King, G., Pan, J., & Roberts, M. E. (2013). How censorship in China allows government criticism but silences collective expression. American Political Science Review, 107(2), 326–343.

  • Kumar, A., & Lee, C. (2006). Retail investor sentiment and return comovements. The Journal of Finance, 61(5), 2451–2486.

  • Lee, C., Shleifer, A., & Thaler, R. H. (1991). Investor sentiment and the closed-end fund puzzle. The Journal of Finance, 46(1), 75–109.

  • Li, F. (2010). The information content of forward-looking statements in corporate filings—a naïve Bayesian machine learning approach. Journal of Accounting Research, 48(5), 1049–1102.

  • Li, Q., Wang, T., Li, P., Liu, L., Gong, Q., & Chen, Y. (2014). The effect of news and public mood on stock movements. Information Sciences, 278, 826–840.

  • Lin, C. Y., Rahman, H., & Yung, K. (2009). Investor sentiment and REIT returns. The Journal of Real Estate Finance and Economics, 39(4), 450.

  • Ling, D. C. (2005). A random walk down main street: Can experts predict returns on commercial real estate? Journal of Real Estate Research, 27(2), 137–154.

  • Ling, D. C., Naranjo, A., & Scheick, B. (2014). Investor sentiment, limits to arbitrage and private market returns. Real Estate Economics, 42(3), 531–577.

  • Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance, 66(1), 35–65.

  • Marcato, G., & Nanda, A. (2016). Information content and forecasting ability of sentiment indicators: Case of real estate market. Journal of Real Estate Research, 38(2), 165–203.

  • Menzly, L., & Ozbas, O. (2010). Market segmentation and cross-predictability of returns. The Journal of Finance, 65(4), 1555–1580.

  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 2013, 3111–3119.

  • Mordhorst, M., & Schwarzkopf, S. (2017). Theorising narrative in business history. Business History, 59(8), 1155–1175.

  • Nassirtoussi, A. K., Aghabozorgi, S., Wah, T. Y., & Ngo, D. C. L. (2014). Text mining for market prediction: A systematic review. Expert Systems with Applications, 41(16), 7653–7670.

  • Nguyen, T. H., Shirai, K., & Velcin, J. (2015). Sentiment analysis on social media for stock movement prediction. Expert Systems with Applications, 42(24), 9603–9611.

  • Price, S. M., Doran, J. S., Peterson, D. R., & Bliss, B. A. (2012). Earnings conference calls and stock returns: The incremental informativeness of textual tone. Journal of Banking and Finance, 36(4), 992–1011.

  • Price, S. M., Seiler, M. J., & Shen, J. (2017). Do investors infer vocal cues from CEOs during quarterly REIT conference calls? The Journal of Real Estate Finance and Economics, 54(4), 515–557.

  • Qiu, L., & Welch, I. (2004). Investor sentiment measures. NBER Working Paper No. 10794. https://doi.org/10.3386/w10794

  • Ranco, G., Aleksovski, D., Caldarelli, G., Grčar, M., & Mozetič, I. (2015). The effects of Twitter sentiment on stock price returns. PLoS ONE, 10(9), e0138441.

  • Rauch, H. E., Tung, F., & Striebel, C. T. (1965). Maximum likelihood estimates of linear dynamic systems. AIAA Journal, 3(8), 1445–1450.

  • Schmeling, M. (2009). Investor sentiment and stock returns: Some international evidence. Journal of Empirical Finance, 16(3), 394–408.

  • Shiller, R. J. (2005). Irrational exuberance. Princeton University Press.

  • Shiller, R. J. (2017). Narrative economics. The American Economic Review, 107(4), 967–1004.

  • Shleifer, A., & Vishny, R. W. (1997). The limits of arbitrage. The Journal of Finance, 52(1), 35–55.

  • Siganos, A., Vagenas-Nanos, E., & Verwijmeren, P. (2014). Facebook’s daily sentiment and international stock markets. Journal of Economic Behavior & Organization, 107, 730–743.

  • Sohangir, S., Wang, D., Pomeranets, A., & Khoshgoftaar, T. M. (2018). Big data: Deep learning for financial sentiment analysis. Journal of Big Data, 5(1), 3.

  • Soo, C. K. (2018). Quantifying sentiment with news media across local housing markets. The Review of Financial Studies, 31(10), 3689–3719.

  • Sprenger, T. O., Sandner, P. G., Tumasjan, A., & Welpe, I. M. (2014). News or noise? Using Twitter to identify and understand company-specific news flow. Journal of Business Finance & Accounting, 41(7–8), 791–830.

  • Stambaugh, R. F., Yu, J., & Yuan, Y. (2012). The short of it: Investor sentiment and anomalies. Journal of Financial Economics, 104(2), 288–302.

  • Sun, L., Najand, M., & Shen, J. (2016). Stock return predictability and investor sentiment: A high-frequency perspective. Journal of Banking & Finance, 73, 147–164.

  • Sun, W., Zheng, S., Geltner, D. M., & Wang, R. (2017). The housing market effects of local home purchase restrictions: Evidence from Beijing. The Journal of Real Estate Finance and Economics, 55(3), 288–312.

  • Sundermeyer, M., Schlüter, R., & Ney, H. (2012). LSTM neural networks for language modeling. Paper presented at 13th Annual Conference of the International Speech Communication Association (pp. 194–197). International Speech Communication Association. https://doi.org/10.21437/Interspeech.2012-65

  • Tang, D., Qin, B., & Liu, T. (2015). Document modeling with gated recurrent neural network for sentiment classification. Paper presented at Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 1422–1432). Association for Computational Linguistics. https://doi.org/10.18653/v1/D15-1167

  • Tang, V. W. (2018). Wisdom of crowds: Cross-sectional variation in the informativeness of third-party-generated product information on Twitter. Journal of Accounting Research, 56(3), 989–1034.

  • Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139–1168.

  • Tsai, I. (2013). The asymmetric impacts of monetary policy on housing prices: A viewpoint of housing price rigidity. Economic Modelling, 31, 405–413.

  • Walker, C. B. (2014). Housing booms and media coverage. Applied Economics, 46(32), 3954–3967.

  • Wang, X., Li, K., & Wu, J. (2020). House price index based on online listing information: The case of China. Journal of Housing Economics, 50, 101715.

  • Wu, J., & Deng, Y. (2015). Intercity information diffusion and price discovery in housing markets: Evidence from Google searches. The Journal of Real Estate Finance and Economics, 50(3), 289–306.

  • Wu, J., Gyourko, J., & Deng, Y. (2012). Evaluating conditions in major Chinese housing markets. Regional Science and Urban Economics, 42(3), 531–543.

  • Wu, J., Gyourko, J., & Deng, Y. (2016). Evaluating the risk of Chinese housing markets: What we know and what we need to know. China Economic Review, 39, 91–114.

  • Yu, J., & Yuan, Y. (2011). Investor sentiment and the mean-variance relation. Journal of Financial Economics, 100(2), 367–381.

  • Zeiler, M. D. (2012). Adadelta: an adaptive learning rate method. https://doi.org/10.48550/arXiv.1212.5701

  • Zheng, S., Sun, W., & Kahn, M. E. (2016). Investor confidence as a determinant of China’s urban housing market dynamics. Real Estate Economics, 44(4), 814–845.

  • Zhou, G. (2018). Measuring investor sentiment. Annual Review of Financial Economics, 10(1), 239–259.

  • Zweig, M. E. (1973). An investor expectations stock price predictive model using closed-end fund premiums. The Journal of Finance, 28(1), 67–78.

Acknowledgements

We are grateful for the comments of the anonymous reviewers, the editors, and participants in the 2017 GCREC Annual Conference and the 2018 Asia-Pacific Real Estate Research Symposium. We appreciate the excellent research assistance of Xindian Li. This research is funded by the National Natural Science Foundation of China (Project No: 91546113, 71373006, 71673156, 71874093), and the Tsinghua University Initiative Scientific Research Program.

Author information

Corresponding author

Correspondence to Jing Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Robustness Checks on Weights of Microblogs.

When aggregating the classification results of microblog sentences to sentiment indexes, we treat all microblogs with equal weight. However, the reposting, commenting, and liking volumes of a microblog may suggest its influential power on social media and thus can act as the weight that the microblog contributes to the sentiment index. Hence, we calculate the total volume of reposting, commenting, and liking, RCL, and separately use RCL + 1 and log(RCL + 1) + 1 as the weight for each microblog.

Table 6 presents the correlation coefficients between sentiment indexes built with different weights. All the coefficients are over 0.97, suggesting that the resulting index values are not sensitive to the weights applied to microblogs.
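
The weighting check described above can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not give the index formula, so the weighted mean of per-microblog labels (+1 or -1) is our assumption, the function names are hypothetical, and the toy data are made up.

```python
import math


def microblog_weight(rcl: int, scheme: str) -> float:
    """Weight of one microblog given RCL, its total reposts, comments and likes."""
    if scheme == "equal":
        return 1.0
    if scheme == "linear":
        return rcl + 1.0
    if scheme == "log":
        return math.log(rcl + 1.0) + 1.0
    raise ValueError(f"unknown scheme: {scheme}")


def weighted_index(labels, rcls, scheme):
    """Assumed aggregation: weighted mean of sentiment labels in one period."""
    weights = [microblog_weight(r, scheme) for r in rcls]
    return sum(w * s for w, s in zip(weights, labels)) / sum(weights)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Made-up toy data: three periods, each with (labels, RCL counts).
periods = [
    ([1, 1, -1], [10, 0, 2]),
    ([-1, -1, 1], [0, 5, 1]),
    ([1, -1, 1], [3, 3, 0]),
]
equal_series = [weighted_index(l, r, "equal") for l, r in periods]
log_series = [weighted_index(l, r, "log") for l, r in periods]
print(pearson(equal_series, log_series))
```

Both alternative weights equal 1 when RCL is 0, so a microblog with no interactions counts the same under every scheme.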

Table 6 Correlation coefficients between sentiment indexes with different weights of microblogs

Appendix 2

Numerical Series of Sentiment Indexes.

Table 7 presents the numerical series of our national-level sentiment indexes.

Table 7 Numerical series of sentiment indexes

Appendix 3

User-Characteristic-Specified Indexes.

Figure 7 presents the user-characteristic-specified sentiment indexes. In general, most of the sub-indexes show quite similar patterns to that of the full-sample index.

Panel a displays the sentiment indexes specified for different genders. The two indexes follow very similar historical patterns, while the absolute index values for females are higher than those for males by a small but consistent spread (about 0.04 and 0.03 for future and past indexes, respectively). This suggests that female users are more likely to hold positive tendencies for both perceptions and expectations of house price changes.

Panels b and c show the sentiment indexes specified for different levels of follower numbers and microblog volumes, respectively. Again, the sub-indexes show very similar variations to each other during the sample period. According to Granger tests, the sentiment indexes for users with more followers or microblogs do not lead the indexes for other users. This implies that the opinion leader phenomenon does not exist or is relatively weak.
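
A Granger test of whether one series leads another can be sketched with a single lag, as below. This is a minimal illustration, not the authors' specification: the lag length, the function names `ols_rss` and `granger_f_lag1`, and the simulated series are all assumptions, and the sketch stops at the F-statistic without computing a p-value.

```python
import random


def ols_rss(rows, y):
    """Residual sum of squares from OLS of y on the regressors in rows."""
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, k):
            factor = aug[i][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[i][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - rest) / aug[i][i]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y)
    )


def granger_f_lag1(x, y):
    """F-statistic for H0: x[t-1] adds nothing to predicting y[t] beyond y[t-1]."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    n, k, q = len(target), 3, 1  # observations, unrestricted params, restrictions
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))


# Simulated data in which x leads y by one period (illustrative only).
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]
f_x_leads_y = granger_f_lag1(x, y)
f_y_leads_x = granger_f_lag1(y, x)
print(f_x_leads_y, f_y_leads_x)
```

In this simulation the statistic for "x leads y" is large and the reverse is small. A finding that indexes for high-follower users do not lead the others corresponds to a small statistic in that direction.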

Fig. 7 User-characteristic-specified sentiment indexes: (a) gender specified, (b) follower-volume specified, (c) microblog-volume specified

Appendix 4

Robustness Checks on VAR Results.

Table 8 presents the VAR and Granger test results between the sentiment indexes and listed developers’ stock price changes. The lag lengths are longer than those selected by the AIC and BIC criteria, to test the robustness of the results in Table 4. Both the estimated VAR coefficients and Granger test results remain broadly unchanged.

Table 9 presents the corresponding robustness checks for house price prediction in Table 5. In addition to adding extra lags, we further replace the house price data with two alternative indexes: the NBS newly-built house price index and a resale house price index based on listing data by Wang et al. (WLW) (2020). In all cases, the test results remain robust.

Table 8 Robustness checks on VAR and Granger test for stock index changes
Table 9 Robustness checks on VAR and Granger test for house price changes

About this article

Cite this article

Zhu, E., Wu, J., Liu, H. et al. A Sentiment Index of the Housing Market in China: Text Mining of Narratives on Social Media. J Real Estate Finan Econ 66, 77–118 (2023). https://doi.org/10.1007/s11146-022-09900-5

  • DOI: https://doi.org/10.1007/s11146-022-09900-5
