The Cowles–Jones test with unspecified upward market probability

Abstract: The Cowles-Jones test for sign dependence is one of the earliest tests of the random walk hypothesis, which stands at the beginning of modern empirical finance. The test is still discussed in popular textbooks and used in research articles. However, the Cowles-Jones test statistic considered in the literature requires that the upward probability of the market or asset under consideration be specified under the null hypothesis, which is only very rarely possible. If the upward probability is estimated in advance, the resulting test is undersized (even asymptotically). This note considers a corrected Cowles-Jones test statistic which does not require the upward probability to be specified under the null. It turns out that the asymptotic variance is greatly simplified as compared to the uncorrected test. The corrected test is illustrated with an application to daily returns of the Dow Jones Industrial Average index and monthly returns of the MSCI Emerging Markets index. It is shown that the corrected and uncorrected tests can lead to opposite conclusions.


Introduction
The random walk hypothesis (RWH) states that it is not possible to construct nontrivial forecasts of an asset price based only on the asset's price history, and the desire to shed light on its validity played a major role in the development of modern empirical finance (Lo, 2000). Cowles and Jones (1937) were among the first to devise a statistical test for this purpose. This test became widely known through its discussion and development in the popular textbook by Campbell et al. (1997) and is still frequently used and cited, with about 250 citations in Google Scholar in the last decade.
The Cowles-Jones (CJ) test is a test for the predictability of the sign of the return or price change, i.e. for whether the conditional probability of the market rising or declining depends nontrivially on the past return history or, more precisely, the sign history. Absence of sign predictability means that the conditional probability of an upward movement of the market is always equal to the unconditional probability, but it does not imply any particular value of this probability. However, the CJ test described in the literature requires that the upward market probability be specified under the null hypothesis, which poses a problem in applications. To overcome this problem, it is often suggested to estimate the upward market probability in advance via the relative sample frequency of positive returns, but the effect of this procedure on the distribution of the test statistic is not taken into account (e.g. Fiorenzani et al., 2012, Ch. 1.3.2; Linton, 2019, Ch. 4.1.1; Tsinaslanidis and Guijarro, 2023). Therefore, a version of the test that corrects for the use of an estimated upward probability of the market may be desirable, and the current paper develops such a test statistic. It turns out that the uncorrected test is undersized, but only moderately so if the upward probability is close to 0.5, which is often the case for daily stock returns. The difference between the corrected and uncorrected tests is more pronounced in applications to returns at lower frequencies, such as monthly, quarterly, or yearly. This is illustrated below in Section 3.2.
As set out in the previous paragraph, the aim of this paper is to derive a Cowles-Jones test statistic which takes into account the typical practical situation where the upward probability is not specified under the null hypothesis of no predictability. We will not go into the details of the discussion about the advantages and disadvantages of the various tests of the RWH. Rather, in the current paragraph, we only briefly mention some important aspects and provide references to the literature for further details. A comprehensive analysis of the relation between sign predictability and conditional mean and variance dynamics is provided by Christoffersen and Diebold (2006). An important result is that, in the presence of conditional heteroskedasticity, it is possible to have sign predictability without conditional mean predictability. Linton (2019, Ch. 4) compares sign predictability tests with tests based on the sample autocorrelation function, such as the variance ratio test of Lo and MacKinlay (1988); see also Charles and Darné (2009). An advantage of the sign-based tests is that they are robust to the fat tails which characterize financial markets. For example, under conditional heteroskedasticity, standard distributional results for the sample autocorrelation function require the finiteness of the fourth moments (Mikosch and Stărică, 2000), which may be a questionable assumption for many financial return series (for some evidence and a review of the literature, see Haas and Pigorsch, 2009). For an overview of different tests of the RWH, see Chs. 5 and 6 of Taylor (2005) and Chs. 3 and 4 of Linton (2019). To exploit sign predictability for actual forecasting, a variety of approaches have been developed. A recent contribution, with a review of the literature, is Xie et al. (2023). Finally, recall that the RWH focuses only on predictability by means of the asset's own price history and thus does not exclude predictability from other factors; see e.g. Li et al. (2023) for discussion.
The remainder of this article is organized as follows. The corrected CJ test statistic is developed in Section 2. Section 3 presents two applications of the modified test, namely to the daily returns of the Dow Jones Industrial Average index from 1928 to 2022 and the monthly returns of the MSCI Emerging Markets index from 1988 to 2022. Section 4 provides a short summary of the results.

The Cowles-Jones test with unspecified upward probability
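Suppose that $(r_t)$ is a stationary time series of returns, and define a corresponding sequence of indicators by
$$I_t = \begin{cases} 1 & \text{if } r_t > 0, \\ 0 & \text{if } r_t \le 0. \end{cases} \tag{1}$$
Under the null hypothesis of no sign predictability, $(I_t)$ is a sequence of independent and identically distributed (iid) Bernoulli random variables, i.e.
$$H_0: \; I_t \overset{\text{iid}}{\sim} \text{Bernoulli}(\pi), \quad \pi \in (0, 1). \tag{2}$$
Note that the success probability of the Bernoulli distribution, i.e. the probability of a positive return, is not specified under $H_0$ in (2). Let this probability be denoted by
$$\pi := \Pr(I_t = 1) = \Pr(r_t > 0). \tag{3}$$
The Cowles-Jones (CJ) test of (2) is based on the notion of sequences and reversals, where a sequence is a pair of consecutive returns of the same sign, whereas a reversal is a pair of consecutive returns of opposite sign. Suppose we observe a stretch of $T + 1$ returns $r_t$, $t = 0, 1, \ldots, T$. The number of sequences in the sample is
$$N_S = \sum_{t=1}^{T} Y_t, \quad Y_t := I_t I_{t-1} + (1 - I_t)(1 - I_{t-1}),$$
and the number of reversals is $N_R = T - N_S$. The CJ ratio is the ratio of sequences to reversals, $N_S/N_R$, and the test is carried out by assessing whether this ratio is significantly different from the number we would expect under (2). For the purpose of testing, it is convenient to write the ratio as
$$\widehat{CJ} = \frac{N_S}{N_R} = \frac{N_S/T}{N_R/T} = \frac{\hat\pi_S}{1 - \hat\pi_S}, \tag{4}$$
where
$$\hat\pi_S = \frac{N_S}{T} = \frac{1}{T} \sum_{t=1}^{T} Y_t \tag{5}$$
is a consistent estimator of the unconditional probability of a sequence, $\pi_S$, i.e. $\pi_S := \Pr(Y_t = 1)$, and (4) is a consistent estimator of the population CJ ratio
$$CJ = \frac{\pi_S}{1 - \pi_S}. \tag{6}$$
The asymptotic distribution of (4) under the null hypothesis has been calculated by Campbell et al. (1997) and is given by
$$\sqrt{T}\,(\widehat{CJ} - CJ_0) \xrightarrow{d} N(0, \sigma_1^2), \tag{7}$$
where $CJ_0$ is the population value of the CJ ratio under the null, i.e.
$$CJ_0 = \frac{\pi_{S,0}}{1 - \pi_{S,0}}, \tag{8}$$
where $\pi_{S,0} = \pi^2 + (1 - \pi)^2$ is the probability of a sequence under the null, and
$$\sigma_1^2 = \frac{\pi_{S,0}(1 - \pi_{S,0}) + 2\left(\pi^3 + (1 - \pi)^3 - \pi_{S,0}^2\right)}{(1 - \pi_{S,0})^4}. \tag{9}$$
Result (7) provides the basis for a test of the hypothesis
$$H_0': \; I_t \overset{\text{iid}}{\sim} \text{Bernoulli}(\pi), \quad \pi \text{ given}. \tag{10}$$
Note that, in contrast to (2), the upward probability $\pi$ is specified under $H_0'$ in (10). However, this is a rather uncommon situation and usually we want to test the less restrictive null hypothesis (2). We now develop this case, where a suitable test statistic is of the general form considered in Hausman (1978),
$$\sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0), \tag{11}$$
where $\widehat{CJ}_0$ is the CJ ratio estimated under $H_0$ in (2), i.e.
$$\widehat{CJ}_0 = \frac{\hat\pi_{S,0}}{1 - \hat\pi_{S,0}}, \quad \hat\pi_{S,0} = \hat\pi^2 + (1 - \hat\pi)^2, \tag{12}$$
where
$$\hat\pi = \frac{1}{T} \sum_{t=1}^{T} I_t. \tag{13}$$
To find the limiting distribution of (11), note that, under $H_0$, $(I_t)$ is an iid sequence, and $(Y_t)$ is a 1-dependent process. Thus, we can invoke the multivariate central limit theorem for strictly stationary processes with finite dependence (cf. Theorem 7.7.6 in Anderson, 1971) to conclude that
$$\sqrt{T} \left( \begin{pmatrix} \hat\pi \\ \hat\pi_S \end{pmatrix} - \begin{pmatrix} \pi \\ \pi_{S,0} \end{pmatrix} \right) \xrightarrow{d} N(0, \Sigma), \tag{14}$$
where
$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}, \tag{15}$$
with
$$\sigma_{11} = \pi(1 - \pi), \quad \sigma_{22} = \pi_{S,0}(1 - \pi_{S,0}) + 2\left(\pi^3 + (1 - \pi)^3 - \pi_{S,0}^2\right), \quad \sigma_{12} = \text{Cov}(I_t, Y_t) + \text{Cov}(I_t, Y_{t+1}) = 2\pi(1 - \pi)(2\pi - 1), \tag{16}$$
where the covariance in (16) follows from Equation (31) on p. 429 in Anderson (1971), and $\text{Cov}(I_{t+1}, Y_t) = 0$.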
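The counting step behind the CJ ratio can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function names `cj_counts`, `cj_ratio` and `cj_ratio_under_null` are hypothetical, and zero returns are treated as non-positive, which is an assumption about the indicator's convention.

```python
def cj_counts(returns):
    """Count sequences (N_S) and reversals (N_R) in r_0, ..., r_T."""
    # Indicators I_t: 1 for a positive return, 0 otherwise (assumed convention).
    signs = [1 if r > 0 else 0 for r in returns]
    # Y_t = 1 when two consecutive indicators agree, i.e. a "sequence".
    y = [1 if cur == prev else 0 for cur, prev in zip(signs[1:], signs[:-1])]
    n_s = sum(y)
    n_r = len(y) - n_s
    return n_s, n_r


def cj_ratio(returns):
    """Sample CJ ratio N_S / N_R."""
    n_s, n_r = cj_counts(returns)
    if n_r == 0:
        raise ValueError("no reversals in the sample; CJ ratio is undefined")
    return n_s / n_r


def cj_ratio_under_null(pi):
    """CJ ratio implied by iid signs with upward probability pi."""
    pi_s0 = pi ** 2 + (1 - pi) ** 2  # probability of a sequence under the null
    return pi_s0 / (1 - pi_s0)


if __name__ == "__main__":
    sample = [0.5, 1.2, -0.3, -0.8, 0.4, 0.9, 0.1]
    print(cj_counts(sample), cj_ratio(sample), cj_ratio_under_null(0.5))
```

For the seven illustrative returns in the example, the signs are +, +, -, -, +, +, +, giving four sequences and two reversals. Under iid signs with upward probability 0.5, the implied ratio is 1.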
From (14), the limiting distribution of (11) is normal with a variance that can be determined by means of the delta method. To keep the calculations as simple as possible, note that, under $H_0$, $\widehat{CJ}_0$ is an asymptotically efficient estimator of CJ, with asymptotic variance $\sigma_0^2$ given by the Cramér-Rao lower bound, i.e.
$$\sigma_0^2 = \frac{(1 - 2\pi)^2}{4\pi^3(1 - \pi)^3}. \tag{17}$$
Thus, from Lemma 2.1 of Hausman (1978), the asymptotic variance of (11) is simply [5]
$$\nu = \sigma_1^2 - \sigma_0^2 = \frac{1}{4\pi^2(1 - \pi)^2}, \tag{21}$$
where $\sigma_1^2$ is given in (9). [6] Quantities $\sigma_0^2$, $\sigma_1^2$ and $\nu$ can be consistently estimated by inserting $\hat\pi$ in (13) for $\pi$ in (17), (9) and (21), respectively, and a standard application of Slutsky's theorem shows that the corrected CJ test statistic is
$$t_{CJ} = \frac{\sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0)}{\sqrt{\hat\nu}} = 2\hat\pi(1 - \hat\pi)\sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0) \xrightarrow{d} N(0, 1), \tag{22}$$
with the limiting distribution being valid under (2).
[5] For the case of a scalar parameter, as in our application, the result in Lemma 2.1 of Hausman (1978) was already given by Fisher (1925). Fisher's (1925) argument is essentially as follows. Consider two asymptotically normal estimators $\hat\theta_1$ and $\hat\theta_2$ of a parameter $\theta$, with asymptotic variances $\nu_{11}$ and $\nu_{22}$, respectively, i.e.
$$\sqrt{T}\,(\hat\theta_i - \theta) \xrightarrow{d} N(0, \nu_{ii}), \quad i = 1, 2. \tag{18}$$
The linear combination $\hat\theta_3 = x\hat\theta_1 + (1 - x)\hat\theta_2$ is likewise asymptotically normal, with asymptotic variance
$$x^2 \nu_{11} + (1 - x)^2 \nu_{22} + 2x(1 - x)\nu_{12}, \tag{19}$$
where $\nu_{12}$ is the asymptotic covariance between $\hat\theta_1$ and $\hat\theta_2$. The quadratic (19) attains its minimum when
$$x = \frac{\nu_{22} - \nu_{12}}{\nu_{11} + \nu_{22} - 2\nu_{12}}. \tag{20}$$
Now assume that $\hat\theta_2$ is asymptotically efficient, so that (19) attains its minimum when $x = 0$ (because the variance of $\hat\theta_3$ cannot be made smaller than that of $\hat\theta_2$). From (20) we can observe that this implies $\nu_{12} = \nu_{22}$. Thus, the asymptotic variance of $\hat\theta_1 - \hat\theta_2$ is $\nu_{11} + \nu_{22} - 2\nu_{12} = \nu_{11} - \nu_{22}$.
[6] See the appendix for a sketch of the (straightforward) calculations leading to the expression on the right-hand side of (21).
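A minimal Python sketch of the corrected test may help make the recipe concrete. It is written under stated assumptions and is not the author's code. The function name `corrected_cj_test` is hypothetical, the upward probability is estimated from the signs at t = 1, ..., T, and the variance is computed as the difference between the Campbell et al. (1997) expression for σ₁² and the Cramér-Rao bound σ₀² = (1 − 2π)² / (4π³(1 − π)³), both evaluated at the estimated π. Both closed forms are assumptions of the sketch.

```python
import math


def corrected_cj_test(returns):
    """Return (t_CJ, two-sided p-value) for returns r_0, ..., r_T."""
    signs = [1 if r > 0 else 0 for r in returns]
    T = len(signs) - 1
    if T < 1:
        raise ValueError("need at least two returns")

    # Relative frequency of sequences (pairs of equal consecutive signs).
    n_s = sum(1 for cur, prev in zip(signs[1:], signs[:-1]) if cur == prev)
    pi_s_hat = n_s / T
    # Assumption: upward probability estimated from t = 1, ..., T.
    pi_hat = sum(signs[1:]) / T
    if n_s == T or pi_hat in (0.0, 1.0):
        raise ValueError("degenerate sample: statistic is undefined")

    cj_hat = pi_s_hat / (1 - pi_s_hat)        # sample CJ ratio
    pi_s0 = pi_hat ** 2 + (1 - pi_hat) ** 2   # sequence probability under H0
    cj0_hat = pi_s0 / (1 - pi_s0)             # CJ ratio estimated under H0

    # Asymptotic variances, with pi replaced by its estimate.
    sigma1_sq = (pi_s0 * (1 - pi_s0)
                 + 2 * (pi_hat ** 3 + (1 - pi_hat) ** 3 - pi_s0 ** 2)
                 ) / (1 - pi_s0) ** 4
    sigma0_sq = (1 - 2 * pi_hat) ** 2 / (4 * pi_hat ** 3 * (1 - pi_hat) ** 3)
    nu_hat = sigma1_sq - sigma0_sq

    t_stat = math.sqrt(T) * (cj_hat - cj0_hat) / math.sqrt(nu_hat)
    # Two-sided normal p-value: 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


if __name__ == "__main__":
    print(corrected_cj_test([0.5, 1.2, -0.3, -0.8, 0.4, 0.9, 0.1]))
```

The seven-observation input only checks the arithmetic. The normal approximation is asymptotic, so a sample this small says nothing about actual predictability.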
The uncorrected CJ test used in the literature is carried out by considering the test statistic $\sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0)/\hat\sigma_1$, where $\hat\sigma_1$ is obtained by replacing $\pi$ with $\hat\pi$ in (9). Since $\sigma_1^2 > \nu$ when $\pi \neq 0.5$, the uncorrected test is undersized. For $\pi = 0.5$, $CJ_0$ attains its minimum, and in this case $\widehat{CJ}_0$ converges at rate $T$ rather than $\sqrt{T}$, so that the need to estimate $\pi$ under the null has no effect on the limiting distribution of (22) (cf. Lehmann and Casella, 1998).
The variances (9) and (21), as functions of $\pi$, are shown in Figure 1. As they are close to each other as long as $\pi$ is close to 0.5, the uncorrected test will be only moderately undersized for daily returns, where positive and negative returns typically occur with roughly equal frequency. The differences between the corrected and the uncorrected test will be more pronounced when applied to returns at lower frequencies, such as monthly, quarterly or yearly. An illustration of this feature is given in the next section.
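The comparison of the two variances can also be tabulated numerically. The Python sketch below is illustrative only. It assumes the Campbell et al. (1997) form of σ₁² and the Cramér-Rao bound σ₀² = (1 − 2π)² / (4π³(1 − π)³), takes ν as their difference, and derives the asymptotic rejection probability of the uncorrected two-sided 5% test from the fact that, under the null, the uncorrected statistic is asymptotically N(0, ν/σ₁²). The function names are hypothetical.

```python
import math


def sigma1_sq(pi):
    """Asymptotic variance of the CJ ratio when pi is specified (assumed form)."""
    pi_s0 = pi ** 2 + (1 - pi) ** 2
    num = pi_s0 * (1 - pi_s0) + 2 * (pi ** 3 + (1 - pi) ** 3 - pi_s0 ** 2)
    return num / (1 - pi_s0) ** 4


def sigma0_sq(pi):
    """Cramér-Rao bound for the CJ ratio estimated under the null (assumed form)."""
    return (1 - 2 * pi) ** 2 / (4 * pi ** 3 * (1 - pi) ** 3)


def nu(pi):
    """Asymptotic variance of the corrected statistic's numerator."""
    return sigma1_sq(pi) - sigma0_sq(pi)


def asymptotic_size_uncorrected(pi, z_crit=1.959964):
    """Asymptotic rejection rate of the uncorrected two-sided test under H0."""
    # Rejecting when |t_uc| > z_crit is the same as |Z| > z_crit * sigma_1 / sqrt(nu).
    scale = math.sqrt(sigma1_sq(pi) / nu(pi))
    return math.erfc(z_crit * scale / math.sqrt(2))


if __name__ == "__main__":
    for pi in (0.50, 0.55, 0.60, 0.70, 0.80):
        print(f"pi={pi:.2f}  sigma1^2={sigma1_sq(pi):8.3f}  nu={nu(pi):8.3f}  "
              f"size={asymptotic_size_uncorrected(pi):.4f}")
```

At π = 0.5 the two variances coincide and the nominal size is retained. As π moves away from 0.5, the implied size of the uncorrected test falls below the nominal level.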

Examples
In this section, we illustrate the modified CJ test statistic (22) with two empirical examples. Section 3.1 examines the daily returns of the Dow Jones Industrial Average (DJIA) index from October 1928 to September 2022. The DJIA has been widely used in studies of stock market predictability and the results in Section 3.1 largely confirm the findings in the literature, namely that the predictability of earlier decades appears to have disappeared since the 1980s, most likely because markets have become more efficient. The second example deals with the monthly returns of the MSCI Emerging Markets index from January 1988 to October 2022. For monthly returns, the unconditional upward probability deviates significantly from 0.5, which has a nonnegligible effect on the degree to which the uncorrected test is undersized (cf. the discussion at the end of Section 2). In fact, for the first half of the sample, we find below in Section 3.2 that the uncorrected test incorrectly [7] fails to reject the null hypothesis at the 5% level.

Application to the Dow Jones index
Following Cowles and Jones (1937), and given the availability of a long price history, the Dow Jones Industrial Average (DJIA) index has often been used to test the predictability of stock market returns and the profitability of trading rules, for example in the influential paper by Brock et al. (1992).
Even though Charles Henry Dow (1851-1902) had published a stock market index since 1894, the publication of the DJIA with 30 stocks commenced on October 1, 1928, and we select this as the starting point of the series of daily data that we use in this section, with the endpoint being September 30, 2022. [8] The level and the percentage log returns of the index over the sample period are shown in the upper and lower panel of Figure 2, respectively.
[7] Here, "incorrectly" means that the test decision is caused by the use of an inappropriate asymptotic variance, leading to an undersized test.
[8] Information about the index, as well as the data, are obtained from Williamson (2023).
Results are reported in Table 1 for nine subperiods as well as for the entire sample period (bottom row), with the value of the test statistic (22) being given in the last column of the table. Over the entire sample period from October 1928 to September 2022, the test rejects the null hypothesis (2), which appears to be mainly due to the presence of sign predictability over roughly the first half of the sample. Evidence for predictability of the daily DJIA returns is much weaker for the more recent decades. These results are broadly in agreement with those in the literature examining predictability in long time series of daily DJIA returns. For example, Brock et al. (1992) investigate the daily returns of the DJIA from 1897 to 1986 and find strong evidence for the predictive power of technical trading rules. Sullivan et al. (1999) reconsider the results of Brock et al. (1992) and find that the predictive content of these rules has disappeared in the decade following the sample used in Brock et al. (1992). As one possible explanation for this phenomenon, Sullivan et al. (1999, p. 1684) suggest that "[...] it is possible that, historically, the best technical trading rule did indeed produce superior performance, but that, more recently, the markets have become more efficient and hence such opportunities have disappeared. This conclusion certainly seems to match up well with the cheaper computing power, the lower transaction costs and increased liquidity in the stock market that may have helped to remove possible short-term patterns in stock returns." More recent results based on variance ratio tests pointing in the same direction are reported in Linton (2019, Ch. 3.6), who provides a similar interpretation and further references.

Application to the MSCI Emerging Markets index
The second example we consider is for the monthly returns of the MSCI Emerging Markets (EM) index (denominated in US$) from January 1988 (the first available observation) to October 2022. [9] The level and the percentage log returns are shown in the upper and lower panel of Figure 3, respectively. The sample consists of 418 returns and we consider results for the entire sample period as well as for an equal split into two subperiods of 209 observations each.
[9] The data have been extracted from the Refinitiv Eikon database.
Results are reported in Table 2. In addition to the quantities reported in Table 1 for the DJIA, the last column of Table 2 also reports the value of the uncorrected (uc) Cowles-Jones test statistic, i.e. $t^{uc}_{CJ} = \sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0)/\hat\sigma_1$, where $\hat\sigma_1$ is as in (9) but with $\pi$ replaced by $\hat\pi$ in (13); see the discussion at the end of Section 2. From the analysis at the end of Section 2, this test is undersized. In particular, for the first subsample, the corrected test rejects the absence of sign predictability at the 5% significance level, whereas the uncorrected test fails to do so. This illustrates that, when the relative frequency of positive returns is not very close to 0.5, using the corrected version of the test may indeed lead to a revision of the test decision from the uncorrected test.

Data Science in Finance and Economics
Volume 3, Issue 4, 324-336.

Conclusions
The Cowles-Jones test for sign dependence is a classical test of the random walk hypothesis for financial markets. The Cowles-Jones test currently discussed in the literature is not suitable for the practically most relevant situation with an unspecified upward probability under the null hypothesis. This article provides a corrected Cowles-Jones test statistic that can be used in this case. Somewhat surprisingly, the resulting test statistic turns out to be simpler than the uncorrected version. The empirical example in Section 3.2 shows that the corrected and uncorrected versions of the test can indeed lead to opposite test decisions.

Appendix
We can express $\sigma_1^2$ in (9) as
$$\sigma_1^2 = \frac{1 - 3\pi(1 - \pi)}{4\pi^3(1 - \pi)^3} = \frac{(1 - 2\pi)^2}{4\pi^3(1 - \pi)^3} + \frac{1}{4\pi^2(1 - \pi)^2}.$$
Subtracting (17), we get the expression after the second equality sign in (21).

Use of AI tools declaration
The author declares he has not used Artificial Intelligence (AI) tools in the creation of this article.

Figure 1. Shown are the asymptotic variances of the Cowles-Jones test statistic both for the case where $\pi$ is specified (dashed line) as well as the case where $\pi$ is unspecified under the null (solid line).

Figure 2. Daily index level (log scale) and percentage log return time series of the Dow Jones Industrial Average (DJIA) index. With the index level at time $t$ denoted by $S_t$, the percentage log return is defined as $r_t = 100 \times \log(S_t/S_{t-1})$.

Figure 3. Monthly index level (log scale) and percentage log return time series of the MSCI Emerging Markets (MSCI EM) index. With the index level at time $t$ denoted by $S_t$, the percentage log return is defined as $r_t = 100 \times \log(S_t/S_{t-1})$.

Table 1. Cowles-Jones test results for the daily DJIA returns. Shown are results of the Cowles-Jones (CJ) test applied to the daily DJIA returns for nine subperiods spanning the time from October 1928 to September 2022, as well as for the entire sample period (bottom row). Quantity $t_{CJ}$ reported in the last column is the test statistic defined in (22), which has a limiting standard normal distribution under the null hypothesis (2).

Table 2. Cowles-Jones test results for monthly MSCI Emerging Markets returns. Shown are results of the Cowles-Jones (CJ) test applied to the monthly MSCI Emerging Markets (EM) returns for two subperiods spanning the time from January 1988 to October 2022, as well as for the entire sample period (bottom row). Quantity $t_{CJ}$ reported in the penultimate column is the test statistic defined in (22), and quantity $t^{uc}_{CJ}$ reported in the last column is the uncorrected Cowles-Jones test statistic, i.e. $t^{uc}_{CJ} = \sqrt{T}\,(\widehat{CJ} - \widehat{CJ}_0)/\hat\sigma_1$, where $\hat\sigma_1$ is (9) with $\pi$ replaced by $\hat\pi$ in (13). Note that, for purpose of illustration, the p-value of the uncorrected test has likewise been calculated under the (actually unwarranted) assumption that it has a limiting standard normal distribution under the null, i.e. p-value $= 2(1 - \Phi(|t^{uc}_{CJ}|))$, where $\Phi$ is the cdf of the standard normal distribution.