Dynamic Cross-Correlation between Online Sentiment and Stock Market Performance: A Global View

This paper investigates the dynamic cross-correlation between online sentiment and the returns of major global stock markets using the MF-DCCA method. We use Daily Happiness (DHS), an index derived from Twitter posts through textual analysis, as a proxy for online sentiment. By dividing the global financial markets into developed and developing ones, we are able to test whether the relationship between stock market performance and sentiment differs across levels of economic development. Empirical results show that there exists a power-law cross-correlation between financial markets and online sentiment in some developed countries and in all developing countries, and the relationship is more stable in the developing countries. Moreover, we apply rolling window analysis to capture the dynamic evolution of the relationship and find that it is strongly consistent over time. Our work provides a finer-grained perspective for testing the relationship between online sentiment and financial market performance and enriches the existing literature.


Introduction
Investor behavior in financial markets cannot be fully explained by classical financial theory under the hypothesis of a completely rational person. Behavioral finance theory takes human behavioral biases, including limited investor attention [1][2][3] and emotional behavior, into consideration, guiding researchers to examine the relationship between sentiment and financial market performance. Various proxies have been proposed with the aim of capturing sentiment precisely, including the closed-end fund discount [4,5], indices extracted from financial and market indicators [6], and indices generated by textual analysis of either financial newspapers [7] or social media [8,9]. Compared with the others, a sentiment indicator based on social media information is exogenous to financial markets and can be acquired at high frequency, making it practical for sentiment studies. The interaction between media sentiment and stock market activity began to draw widespread attention after Tetlock [7] found that the emotional orientation of Wall Street Journal content helps predict broad movements of the stock market and that high media pessimism puts downward pressure on stock prices. Alanyali [8] conducted similar research based on Financial Times issues. With the rapid development of modern technology, media now include not only newspapers but also Internet social platforms, which makes it possible to capture sentiment through Internet search engine and social media data. Various proxies have been employed to represent online sentiment, based on Facebook posts [9,10], Google searches [11,12], or the Baidu search index [13,14]. In particular, Bollen et al. [15] analyzed the text content of daily Twitter posts and found that the sentiment index derived from Twitter is correlated with the Dow Jones Industrial Average over time.
Another widely used sentiment indicator based on Twitter posts is the Daily Happiness Index (DHS), an index extracted from 10% of all tweets using textual analysis. DHS is derived from Twitter, a worldwide social media platform with a very large user base, and this breadth of users around the world supports the use of DHS as a measure of online sentiment. Compared with sentiment indices extracted from financial markets, DHS provides a broader view of sentiment. In fact, a strand of recent papers has documented a link between stock returns and sentiment proxies that are strictly exogenous to financial markets, including losses in sports games [16], morning sunshine [17], TV programs [18], and even sunspots and the stars [19]. Likewise, DHS is strictly exogenous to the financial market and so avoids the endogeneity problems that might otherwise arise. The existing literature using DHS as a sentiment proxy focuses mainly either on the linear relationship between sentiment and stock market performance [20][21][22] or on the lead-lag Granger causality relationship in developed financial markets [23]. These research methods have shortcomings in capturing the fine-grained dynamic changes and the nonlinear relationship between sentiment and stock market performance. A more flexible model is therefore needed to give a more comprehensive and accurate picture of how sentiment relates to market performance.
The deepening of globalization not only enables people from different countries to share messages via the same social media but also connects financial markets worldwide more closely [24][25][26][27]. Thus, it is of great importance to characterize the correlation between online sentiment and financial market performance from a worldwide perspective and to study how this relationship differs across markets. In this paper, we investigate the cross-correlation between online sentiment and the returns of major global financial markets. Specifically, we choose Daily Happiness (DHS) as the exogenous proxy for online sentiment and characterize the nonlinear relationship between sentiment and stock market returns dynamically through MF-DCCA, which has been shown to be practical in modeling the multifractal features of financial markets.
Our research may contribute to the existing literature in two ways. On the one hand, we characterize the dynamic relationship between online sentiment and stock market returns using MF-DCCA, and we distinguish the cross-correlation between the two in different wavebands. Thus, we obtain a finer-grained perspective for testing the relationship between sentiment and financial markets. On the other hand, previous studies mainly use DHS as a sentiment proxy for either the US market [7] or another individual market [9,10]. In this paper, we divide the global stock markets into subsamples according to the level of economic development and compare the similarities and differences in the cross-correlation between online sentiment and stock market performance. The rest of the paper is organized as follows. Section 2 introduces the model and methodology, Section 3 describes the data used in this study, Section 4 presents the empirical results, and Section 5 concludes the paper.

Methodology
We mainly follow the rationale of Zhou [27], using the MF-DCCA (multifractal detrended cross-correlation analysis) model to characterize the dynamic relationship between online sentiment and the stock market. MF-DCCA is a frontier approach for measuring nonlinear and unstable correlations. This research branch originated from [28], in which detrended fluctuation analysis (DFA) was proposed; DFA gradually became the most widely used method of nonlinear analysis. Subsequent studies continued to refine the DFA model [29][30][31]. Podobnik and Stanley [32] applied DFA to the long-range cross-correlation analysis of two nonstationary series and constructed a new method named DCCA. On this basis, Zhou [27] added the multifractal formalism to DCCA and proposed MF-DCCA. MF-DCCA has advantages in fitting the nonlinearity and multifractality of cross-correlations between financial time series and has been widely used [32][33][34][35][36][37][38].
For any two time series of equal length $N$, $\{x_i\}$ and $\{y_i\}$, $i = 1, 2, \ldots, N$, the MF-DCCA algorithm consists of five main steps.
Step 1. Construct the profiles as the cumulative sums of the demeaned series:
$$X_t = \sum_{k=1}^{t} (x_k - \bar{x}), \qquad Y_t = \sum_{k=1}^{t} (y_k - \bar{y}), \qquad t = 1, 2, \ldots, N,$$
where $\bar{x}$ and $\bar{y}$ are the mean values of the two time series, respectively.
Step 2. Divide $X_t$ and $Y_t$ into $N_s = \mathrm{int}(N/s)$ nonoverlapping segments of equal length $s$. Because $N$ is often not an integer multiple of $s$, a short part at the end of each profile, of length less than $s$, would otherwise be discarded. To avoid this, we repeat the same segmentation starting from the end of the series and finally obtain $2N_s$ segments that together contain all the information in the original time series.
Step 3. For each segment, estimate the local trend with a least-squares polynomial fit, and then take the difference between the profile and the fitted polynomial to obtain the detrended covariance. For segments $\lambda = 1, 2, \ldots, N_s$,
$$F^2(s, \lambda) = \frac{1}{s} \sum_{j=1}^{s} \left[ X_{(\lambda-1)s+j} - \tilde{X}^{\lambda}_{j} \right] \left[ Y_{(\lambda-1)s+j} - \tilde{Y}^{\lambda}_{j} \right],$$
and for the reverse-order segments $\lambda = N_s + 1, N_s + 2, \ldots, 2N_s$,
$$F^2(s, \lambda) = \frac{1}{s} \sum_{j=1}^{s} \left[ X_{N-(\lambda-N_s)s+j} - \tilde{X}^{\lambda}_{j} \right] \left[ Y_{N-(\lambda-N_s)s+j} - \tilde{Y}^{\lambda}_{j} \right],$$
where $\tilde{X}^{\lambda}_{j}$ and $\tilde{Y}^{\lambda}_{j}$ denote the fitted local trends in segment $\lambda$.
Discrete Dynamics in Nature and Society
Step 4. Average all the detrended covariances to obtain the $q$-order fluctuation function. For any $q \neq 0$,
$$F_q(s) = \left\{ \frac{1}{2N_s} \sum_{\lambda=1}^{2N_s} \left[ F^2(s, \lambda) \right]^{q/2} \right\}^{1/q},$$
and for $q = 0$,
$$F_0(s) = \exp\left\{ \frac{1}{4N_s} \sum_{\lambda=1}^{2N_s} \ln\left[ F^2(s, \lambda) \right] \right\}.$$
It is worth noting that MF-DCCA reduces to the conventional DCCA method when $q = 2$.
Step 5. Draw a log-log graph with $F_q(s)$ on the $y$ axis and $s$ on the $x$ axis, and observe the trends at different scales. Specifically, if the two series have long-range cross-correlation, there exists a power-law relationship of the form
$$F_q(s) \sim s^{H_{xy}(q)},$$
where the scaling exponent $H_{xy}(q)$ for each $q$ is the slope of the log-log plot of $F_q(s)$ versus $s$, estimated by ordinary least squares.
$H_{xy}(q) < 0.5$ indicates that the two series tend to fluctuate in opposite directions. Conversely, when $H_{xy}(q) > 0.5$, there is a positive cross-correlation between the two series; that is, when one series fluctuates in the positive direction, the other tends to fluctuate in the same direction. No significant cross-correlation exists if $H_{xy}(q) = 0.5$. If $q = 2$, MF-DCCA reduces to DCCA, and the exponent $H_{xy}(2)$ is equivalent to the generalized Hurst exponent.
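The five steps can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation used in the paper: the function names `mfdcca` and `scaling_exponents` are hypothetical, the local trend is a first-order polynomial by default, and the absolute value of each detrended covariance is taken (an assumption made here) so that the fractional powers in Step 4 stay real-valued.

```python
import numpy as np

def mfdcca(x, y, scales, qs, order=1):
    """Return F_q(s) as an array of shape (len(qs), len(scales))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Step 1: profiles, i.e. cumulative sums of the demeaned series.
    X = np.cumsum(x - x.mean())
    Y = np.cumsum(y - y.mean())
    F = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        ns = n // s
        # Step 2: ns segments from the start plus ns segments from the end.
        starts = [k * s for k in range(ns)] + [n - (k + 1) * s for k in range(ns)]
        t = np.arange(s)
        f2 = np.empty(2 * ns)
        for i, a in enumerate(starts):
            xs, ys = X[a:a + s], Y[a:a + s]
            # Step 3: remove a polynomial trend, then average the residual products.
            rx = xs - np.polyval(np.polyfit(t, xs, order), t)
            ry = ys - np.polyval(np.polyfit(t, ys, order), t)
            # Assumption: absolute value keeps fractional powers real-valued.
            f2[i] = np.abs(np.mean(rx * ry))
        # Step 4: q-order fluctuation function (q = 0 uses the logarithmic form).
        for k, q in enumerate(qs):
            if q == 0:
                F[k, j] = np.exp(0.5 * np.mean(np.log(f2)))
            else:
                F[k, j] = np.mean(f2 ** (q / 2.0)) ** (1.0 / q)
    return F

def scaling_exponents(F, scales):
    """Step 5: OLS slope of log F_q(s) against log s, one slope per q."""
    log_s = np.log(np.asarray(scales, dtype=float))
    return np.array([np.polyfit(log_s, np.log(row), 1)[0] for row in F])
```

Calling `scaling_exponents(mfdcca(x, y, scales, qs), scales)` returns one estimate of $H_{xy}(q)$ per order in `qs`. When both inputs are the same white-noise series, the value at $q = 2$ should come out close to 0.5, which is a convenient sanity check.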

Data
We use the Daily Happiness Index (DHS) extracted from Twitter as a proxy for online sentiment, in accordance with previous literature [20][21][22][23]. DHS is compiled by Hedonometer using Amazon's Mechanical Turk service and a natural language text analysis algorithm (for more details on DHS, see http://hedonometer.org/index.html). DHS covers 10% of all daily Twitter posts (nearly 50 million text messages). This large volume of real social data supports the use of DHS as a measure of online sentiment. We obtain the DHS index from September 10, 2008, to August 31, 2019.
We use market size and trading volume as the basis for selecting representative financial markets in this study. Our empirical sample starts from the daily data of the world's top 20 stock market indices, covering the main Asian, European, and American stock exchanges. Among these 20 indices, the S&P 500, NASDAQ, Dow Jones Industrial Average, and four other indices are all from the USA. We choose the two highest-ranked indices, the S&P 500 and NASDAQ, as representatives in order to avoid multicollinearity between indices. Meanwhile, we drop the Shanghai Composite Index from our sample because Twitter is prohibited in the Chinese Mainland, so DHS cannot effectively reflect local online sentiment there. Our final sample thus contains 11 stock market indices from 10 countries in total (all stock market data used in this paper are from YAHOO! Finance: http://finance.yahoo.com). The developed markets include the United States (S&P 500 and NASDAQ), the United Kingdom (FTSE), Germany (DAX), France (FCHI), Japan (Nikkei), and South Korea (KOSPI), and the developing markets include Brazil (BVSP), Mexico (MXX), India (BSE), and Indonesia (JKSE). The sample period runs from September 10, 2008, to August 31, 2019, and observations on dates that are not shared by both series are excluded in each group to ensure comparability. (See Table 1.)
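The exclusion of nonsynchronized dates can be illustrated with a short sketch. The paper does not spell out its return definition or alignment procedure, so the code below rests on assumptions: returns are taken as log returns between consecutive shared dates, DHS is used in levels, and the function name `aligned_returns` and the dictionary-based inputs are hypothetical.

```python
import math

def aligned_returns(index_close, dhs):
    """Align an index with DHS on their shared dates and compute log returns.

    index_close, dhs: dicts mapping ISO date strings (YYYY-MM-DD) to floats.
    Returns (dates, returns, sentiment), all of equal length.
    """
    # Keep only the dates on which both series have an observation.
    common = sorted(set(index_close) & set(dhs))
    dates, rets, sent = [], [], []
    for prev, cur in zip(common, common[1:]):
        # Assumption: log return between consecutive shared dates.
        rets.append(math.log(index_close[cur] / index_close[prev]))
        sent.append(dhs[cur])
        dates.append(cur)
    return dates, rets, sent
```

ISO date strings sort chronologically, which is why a plain `sorted` call is enough here. The first shared date is dropped because it has no preceding price from which to compute a return.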

Cross-Correlation Test.
It is necessary to verify whether there is a cross-correlation between online sentiment and stock market returns before using MF-DCCA for dynamic cross-correlation analysis. Following previous studies [29,30,34,35], we employ the $Q_{cc}$ statistic to quantitatively test the cross-correlation between online sentiment and stock market returns. The cross-correlation statistic $Q_{cc}$ between the time series $x_i$ and $y_i$ is defined as
$$Q_{cc}(m) = N^2 \sum_{i=1}^{m} \frac{X_i^2}{N - i},$$
where the cross-correlation function is
$$X_i = \frac{\sum_{k=i+1}^{N} x_k y_{k-i}}{\sqrt{\sum_{k=1}^{N} x_k^2 \sum_{k=1}^{N} y_k^2}}.$$
According to Podobnik and Stanley [29], the cross-correlation statistic $Q_{cc}(m)$ is approximately $\chi^2(m)$ distributed with $m$ degrees of freedom. The null hypothesis of the $\chi^2$ test is that there is no cross-correlation between the two series. In other words, if the statistic $Q_{cc}(m)$ is larger than the critical value of the chi-square test, the null hypothesis is rejected, indicating that the two series are cross-correlated.
Figure 2 shows the cross-correlation test results between stock market returns and online sentiment measured by DHS in the developed countries, while Figure 3 shows the test results in the developing ones. The horizontal axis is the natural logarithm of the degrees of freedom $m$, and the vertical axis is the statistic $Q_{cc}(m)$. The solid line marked with circles is the cross-correlation statistic of online sentiment and the corresponding financial market. The other solid line, shown for comparison, is the critical value of the $\chi^2(m)$ distribution with $m$ degrees of freedom at the 5% significance level. We let the degrees of freedom $m$ range from 1 to 1500. The results in Figure 2 show that, in the United States, Japan, and Germany, the cross-correlation statistics $Q_{cc}(m)$ of these four financial markets are all larger than the critical value regardless of the degrees of freedom, suggesting a significant cross-correlation between online sentiment and financial market returns in these countries.
As to the other two developed countries, South Korea and the UK, the cross-correlation statistics $Q_{cc}(m)$ are quite close to or even lower than the critical value of the $\chi^2(m)$ distribution at the 5% significance level for large degrees of freedom. The empirical results show that, in the developed countries, the relationship between online sentiment and stock market returns is heterogeneous. For the developing countries, Figure 3 shows that the cross-correlation statistics $Q_{cc}(m)$ are all higher than the critical value. To sum up, the cross-correlation between online sentiment and financial market returns is stronger in the developing countries than in the developed ones. The empirical results complement Zhang et al. [21] by analyzing the dynamic correlations between financial markets and sentiment separately for developing and developed countries. In addition, compared with Da et al. [11,12], we are concerned with the cross-correlation rather than the linear relationship between online sentiment and stock market performance.
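The test described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `qcc` and `chi2_critical` are hypothetical, both series are demeaned before the statistic is computed (an assumption made here), and the chi-square critical value comes from the Wilson-Hilferty approximation so that only NumPy and the standard library are needed. An exact quantile from a statistics package could be substituted.

```python
import numpy as np
from statistics import NormalDist

def qcc(x, y, m_max):
    """Cross-correlation statistic Q_cc(m) for m = 1, ..., m_max."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumption: work with demeaned series.
    x = x - x.mean()
    y = y - y.mean()
    n = len(x)
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    lags = np.arange(1, m_max + 1)
    # X_i = sum_{k=i+1}^{N} x_k * y_{k-i} / sqrt(sum x_k^2 * sum y_k^2)
    c = np.array([np.dot(x[i:], y[:n - i]) for i in lags]) / denom
    # Q_cc(m) = N^2 * sum_{i=1}^{m} X_i^2 / (N - i)
    return n ** 2 * np.cumsum(c ** 2 / (n - lags))

def chi2_critical(m, alpha=0.05):
    """Approximate upper-alpha critical value of chi-square with m degrees of freedom.

    Uses the Wilson-Hilferty approximation, which is not exact and is
    least accurate for very small m.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    m = np.asarray(m, dtype=float)
    a = 2.0 / (9.0 * m)
    return m * (1.0 - a + z * np.sqrt(a)) ** 3
```

Comparing `qcc(returns, sentiment, 1500)` with `chi2_critical(np.arange(1, 1501))` element by element reproduces the kind of comparison plotted in Figures 2 and 3: the null hypothesis of no cross-correlation is rejected wherever the statistic exceeds the critical value.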

MF-DCCA.
In the $Q_{cc}(m)$ test of cross-correlation, South Korea, the UK, and France cannot reject the null hypothesis; that is, there is insufficient evidence of a significant cross-correlation between financial market returns and the online sentiment measured by the DHS index. The other seven countries reject the null hypothesis, confirming a significant cross-correlation between financial market returns and online sentiment. The cross-correlation test based on the statistic $Q_{cc}(m)$ gives qualitative evidence of the presence of cross-correlation. In this part, we test the cross-correlations quantitatively by estimating the cross-correlation exponent using the MF-DCCA method. In this paper, the range of the segment length $s$ is set to $10 < s < N/4$ ($N$ is the length of the financial market return series in each group), and the fluctuation function order $q$ ranges from $-10$ to $10$. When $q < 0$, the fluctuation function is dominated by small fluctuations (the small waveband); otherwise, it is dominated by large fluctuations (the large waveband). Figure 4 shows the log-log plots of $\log(F_q(s))$ versus $\log(s)$ for $q = -10, -9, \ldots, 9, 10$ for the fluctuation function of financial markets and investor sentiment in both the developed countries (left side) and the developing ones (right side). It can be seen that the curves of all 8 financial markets present an obvious overall linear trend, despite some fluctuation as the interval length $s$ changes. The empirical results demonstrate that there is a significant power-law cross-correlation between financial markets and online sentiment.
In Figure 4, the fluctuation function of financial market returns and online sentiment in the various countries shows a relatively stable trend before $\log(s^*)$, after which the trend changes significantly. Here $s^*$ is the "crossover" defined by Podobnik et al. [32]. We use the crossover to divide the cross-correlation between the two series into a short-term relationship (when $s < s^*$) and a long-term relationship (when $s > s^*$). Specifically, in the developed countries, the crossover of the cross-correlation between online sentiment and the market returns of the S&P 500 (USA), NASDAQ (USA), Nikkei 225 (Japan), and DAX (Germany) occurs at about 245 days, 255 days, 95 days, and 102 days, respectively. Among the developing countries, the crossover of the cross-correlation between online sentiment and the returns of the BVSP (Brazil), JKSE (Indonesia), MXX (Mexico), and BSE (India) occurs at about 79 days, 96 days, 61 days, and 161 days, respectively.
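The text does not state how $s^*$ was located, and crossovers are often read off the log-log plot by eye. As one possible way to make the choice reproducible, the sketch below fits two straight lines in log-log space and picks the breakpoint with the smallest total squared residual. This two-segment fit is a technique swapped in here for illustration, and the name `find_crossover` and the `min_pts` parameter are hypothetical.

```python
import numpy as np

def find_crossover(scales, fq, min_pts=3):
    """Estimate a crossover scale by a two-segment linear fit in log-log space.

    Returns (s_star, short_term_slope, long_term_slope), where s_star is the
    first scale assigned to the long-term segment.
    """
    log_s = np.log(np.asarray(scales, dtype=float))
    log_f = np.log(np.asarray(fq, dtype=float))
    best = None
    # Try every breakpoint that leaves at least min_pts points on each side.
    for b in range(min_pts, len(log_s) - min_pts + 1):
        sse = 0.0
        slopes = []
        for part in (slice(0, b), slice(b, None)):
            coef = np.polyfit(log_s[part], log_f[part], 1)
            resid = log_f[part] - np.polyval(coef, log_s[part])
            sse += float(resid @ resid)
            slopes.append(coef[0])
        if best is None or sse < best[0]:
            best = (sse, scales[b], slopes[0], slopes[1])
    _, s_star, h_short, h_long = best
    return s_star, h_short, h_long
```

Applied to one row of fluctuation-function values $F_q(s)$, the two returned slopes correspond to the short-term ($s < s^*$) and long-term ($s > s^*$) scaling exponents.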
We further estimate the scaling exponent $H_{xy}(q)$ of the fluctuation function between the return of each financial market and online sentiment for different scales $s$ and different orders $q$, so as to explore the heterogeneity of markets in different countries. Table 2 shows the average value of the scaling cross-correlation exponents $H_{xy}(q)$ in the developed and developing countries under different orders $q$ for the long term ($s > s^*$) and the short term ($s < s^*$). As is shown, when $q = 2$, the values of $H_{xy}(2)$ are all greater than 0.5 in the different countries, indicating a strong positive cross-correlation between financial market returns and online sentiment. In other words, financial market returns tend to change in the same direction as online sentiment measured by DHS. However, it is also noteworthy that, in our sample, the scaling exponent $H_{xy}(q)$ in the developed countries is smaller than that in the developing countries. In addition, in the developed countries, $H_{xy}(q)$ is close to 0.5 ($H_{xy}(2) = 0.5564$) in the short term (when $s < s^*$), which means that the degree of synergy between the financial markets and online sentiment is low.
To further study the multifractal nature of the cross-correlation between financial markets and online sentiment, we calculate the degree of multifractality $\Delta H_q$ under different wavebands:
$$\Delta H_q = \max_q H_{xy}(q) - \min_q H_{xy}(q),$$
where the maximum and minimum are taken over the orders $q$ in the waveband considered. The greater $\Delta H_q$ is, the greater the degree of multifractality. The last three rows of Table 2 report the degree of multifractality under different ranges of the order $q$. Overall, the degree of multifractality of the cross-correlation between financial markets and online sentiment in the developing countries is significantly greater than that in the developed countries. The empirical results show that the relationship between financial market returns and online sentiment is more stable in the developed countries, although the cross-correlation there is weaker than in the developing countries. In other words, although the cross-correlation between financial market returns and online sentiment in the developing countries is stronger, its volatility is also greater, and the relationship is less stable.
Considering the different wavebands ($q < 0$ or $q > 0$), the degree of multifractality between financial markets and online sentiment in the developing countries is significantly smaller in the small waveband ($q < 0$) than in the large waveband ($q > 0$).
This result further shows that, in the developing countries, the cross-correlation between financial markets and online sentiment is more stable in the small waveband.
This finding is consistent with most studies except Zhang et al. [40], whose work shows that, in the long run, the relationship between internet activity and Chinese market volatility is more accurate. This may be partially due to the differences between China and our sample countries in financial market composition as well as in the level of internet development.

Rolling Windows Discussion.
Grech and Mazur [39] argued that the exponent at a given time depends on the time-window length. To rule out the impact of the time-window length, we redo the empirical test using MF-DCCA based on rolling windows. MF-DCCA based on rolling windows is practical for capturing the dynamic evolution of the cross-correlation between the financial markets of the various countries and online sentiment. It can be seen from the figures that the cross-correlation between financial market returns and online sentiment shows strong consistency over time, and this finding applies to all countries. Moreover, the cross-correlation declined rapidly with the short, sharp drop of the global financial markets at the end of 2015. The scaling exponent $H_{xy}(2)$ of some countries even fell below the critical value of 0.5, which indicates that the cross-correlation between financial market returns and online sentiment can be affected by the macroeconomic environment and can even change from positive to negative. Table 3 shows the statistical characteristics of the dynamic cross-correlation between financial market returns and online sentiment in the various countries.
Overall, the averages of the scaling exponent $H_{xy}(2)$ are all greater than 0.5 regardless of the country, suggesting that financial markets and online sentiment generally show a positive cross-correlation worldwide. Besides, the standard deviation of $H_{xy}(2)$ in the developing markets is generally larger than that in the developed markets; this implies that financial markets in the developing countries tend to fluctuate more and are more easily affected by online sentiment, whereas this effect is weaker in the developed countries. Zhang et al. [20] focus on testing whether there exists linear or nonlinear Granger causality between sentiment and the returns of major financial markets; they find a strong relationship in the USA, whereas in the Middle East and North Africa there exists only one-directional Granger causality from DHS to market returns. In contrast, we find evidence that markets in the developing countries tend to be affected more intensely by online sentiment; this may be because we focus on dynamic correlations, while Zhang et al. [23] were concerned with the causal relationship between sentiment and financial market performance.
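A rolling-window estimate of $H_{xy}(2)$ can be sketched as below. The window length, step size, and scale range used in the paper are not stated here, so `window`, `step`, and `scales` are left as free parameters; the function names are hypothetical, the local trend is linear by default, and the absolute value of each detrended covariance is an assumption made to keep the logarithm defined.

```python
import numpy as np

def dcca_exponent(x, y, scales, order=1):
    """H_xy(2): slope of log F_2(s) versus log s for one window of data."""
    X = np.cumsum(x - np.mean(x))
    Y = np.cumsum(y - np.mean(y))
    n = len(X)
    log_f = []
    for s in scales:
        ns = n // s
        # Segments taken from the start and again from the end of the window.
        starts = [k * s for k in range(ns)] + [n - (k + 1) * s for k in range(ns)]
        t = np.arange(s)
        f2 = []
        for a in starts:
            rx = X[a:a + s] - np.polyval(np.polyfit(t, X[a:a + s], order), t)
            ry = Y[a:a + s] - np.polyval(np.polyfit(t, Y[a:a + s], order), t)
            # Assumption: absolute value of the detrended covariance.
            f2.append(abs(np.mean(rx * ry)))
        # F_2(s) is the square root of the mean detrended covariance.
        log_f.append(0.5 * np.log(np.mean(f2)))
    return np.polyfit(np.log(scales), log_f, 1)[0]

def rolling_hxy(x, y, window, step, scales):
    """Estimate H_xy(2) on successive windows of length `window`."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ends = range(window, len(x) + 1, step)
    return np.array([dcca_exponent(x[e - window:e], y[e - window:e], scales)
                     for e in ends])
```

Summary statistics of the kind reported in Table 3 then follow directly from the returned array, for example its mean and standard deviation via `h.mean()` and `h.std()`.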

Conclusion
In this paper, we investigate the cross-correlation between financial market returns and online sentiment based on the MF-DCCA method. We choose representative financial markets covering developed and developing countries in different regions and calculate the market index returns, and the Daily Happiness Index (DHS) is used as a proxy variable for online sentiment. We first find that there is no generic cross-correlation between financial market returns and online sentiment in the developed countries; specifically, we do not find a cross-correlation between financial market returns and online sentiment in South Korea and the UK. Yet, our research shows that there exists a power-law cross-correlation between financial markets and online sentiment in some developed countries and in all developing countries in our sample, represented by Brazil and India, and the cross-correlation in the developing countries is stronger than that in the developed ones.
We further set different time interval lengths and retest the cross-correlation, and we find that, whether in the long term or the short term, there is a significant positive cross-correlation between financial market returns and online sentiment; that is, financial market returns tend to change in the same direction as online sentiment measured by the Daily Happiness Index (DHS). It is worth noting that the cross-correlation between financial market returns in the developed countries and online sentiment is weak in the short term. Moreover, we study the cross-correlation under different fluctuation orders $q$ ranging from $-10$ to $10$. The empirical results show that the cross-correlation between financial markets and online sentiment in the developed countries is more stable.
Finally, we perform rolling window analysis to capture the dynamic evolution characteristics of cross-correlation relationship. We find that the cross-correlation relationship between financial market and online sentiment has a strong consistency over time, but the cross-correlation relationship between financial markets and online sentiment in the developing countries fluctuates more drastically.
Our findings confirm the dependency between online sentiment and global financial markets, and they also suggest that the relationship between sentiment and market performance is heterogeneous across economies. As is shown, the emerging financial markets in the developing countries fluctuate drastically and show a certain degree of instability compared with the developed ones. The underlying mechanism may be attributed to the degree of market maturity, regulatory effectiveness, and the financial literacy of market participants. This calls for an interdisciplinary analysis from a more holistic perspective. We leave these questions for future research.

Data Availability
More details on DHS can be obtained from http://hedonometer.org/index.html. All stock market data used in this paper are from YAHOO! Finance (http://finance.yahoo.com).

Conflicts of Interest
The authors declare no conflicts of interest.