Trends, Reversion, and Critical Phenomena in Financial Markets

Financial markets across all asset classes are known to exhibit trends. These trends have been exploited by traders for decades. Here, we empirically measure when trends revert, based on 30 years of daily futures prices for equity indices, interest rates, currencies and commodities. We find that trends tend to revert once they reach a critical level of statistical significance. Based on polynomial regression, we carefully measure this critical level. We find that it is universal across asset classes and has a universal scaling behavior, as the trend's time horizon runs from a few days to several years. The corresponding regression coefficients are small, but statistically highly significant, as confirmed by bootstrapping and out-of-sample testing. Our results signal to investors when to exit a trend. They also reveal how markets have become more efficient over the decades. Moreover, they point towards a potential deep analogy between financial markets and critical phenomena: our analysis supports the conjecture that financial markets can be modeled as statistical mechanical ensembles of Buy/Sell orders near critical points. In this analogy, the trend strength plays the role of an order parameter, whose dynamics is described by a Langevin equation with a quartic potential.

It is well-known that financial markets across all asset classes exhibit trends. These trends have been exploited very successfully by the tactical trading industry over the past decades, including the former "turtle traders" [1] and today's CTA industry.
A close look at the available data reveals that those trends tend to revert as soon as they become too strong. In this paper, we demonstrate this based on 30 years of daily futures returns for equity indices, interest rates, currencies and commodities. We analyze trends with 10 different time horizons, ranging from 2 days to 4 years, and empirically measure the critical strength, beyond which trends tend to revert. Here, the "strength" of a trend is defined in terms of its statistical significance, namely as the t-statistic of the trend.
In a first step, we measure the daily average return of a market as a function of the values of the 10 trend strengths on the previous day. In order to increase the statistical significance of the results, we aggregate across different markets and time scales. Our key observation is that tomorrow's average return can be quite accurately modeled by a polynomial of today's trend strength. It consists of a positive linear term that is responsible for the persistence of trends, and a negative cubic term that is responsible for the reversion of trends.
Trends tend to revert beyond a critical trend strength, where the two terms balance each other. The corresponding regression coefficients are small, but statistically highly significant.
In a second step, we refine this quantitative analysis. Using multiple nonlinear regression, we empirically measure how the observed cubic function varies
• with the time scale of the trends: we find that trends of medium strength persist at scales of several days to several years, while reversion dominates at shorter or longer time scales. We model this scale-dependence by polynomial regression as well.
• with the asset class: we find that the available data do not allow us to fit different model parameters to different asset classes. Within the limits of statistical significance, the model parameters are thus universal, i.e., independent of the asset.
• over time: we find that the patterns have gradually changed over the decades. In particular, trends have become less persistent, and there is little evidence that classical trend-following can perform as well in the future as it did in the past.
Since financial market returns are only in a rough approximation independent, normally distributed random variables, we cannot trust the standard significance analyses for regression results. Instead, we use bootstrapping and cross validation to confirm that our results are statistically highly significant out-of-sample, and robust. Throughout this paper, we try hard not to introduce a single parameter more than is absolutely necessary to capture the essence of the empirically observed patterns. We find that we may fit at most 6 parameters to our 30-year data set, and identify what we believe are the 4-6 most relevant parameters.
While trends have been exploited by the systematic trading industry for decades, they arrived relatively late in academia. Early observations on market trends appear, e.g., in [2,3]. Early literature on the interplay of trends and reversion has focused on their cross-sectional counterparts (momentum and value) for single stocks [4]. With the advent of alternative beta strategies [5,6], trend-following has become an active academic research area [7,8,9,10,11,12]. By now, there is an extensive literature on trend-following, including backtests of its performance more than a century into the past [13,14], and efforts to optimize trend-following strategies by machine learning methods [15]. For a recent review of trend and reversion strategies, see [16] and references therein.
Much of the financial literature in this field tries to improve trading strategies, be it by new trend signals, by new algorithms for mapping signals to position sizes, by identifying market environments in which a given strategy works best or worst, or by reducing trading costs or risks. However, while the results reported in our article also have implications for investors (e.g., they signal when to exit trends), our key motivation for publishing them goes much further: as discussed in section 5, the cubic polynomial, the scaling relations, and the universality that we observe all point towards a potential deep analogy between financial markets and statistical mechanical systems near second-order phase transitions. This in turn supports the idea that markets can be modeled in terms of "social networks" of traders. Our results lay the empirical basis for systematically analyzing the nature of these networks.
As a corollary, our observations also support a modified version of the efficient market hypothesis: they suggest that market inefficiencies do exist, but disappear before they become strongly statistically significant. In addition, our measurements quantify how markets have become more efficient with respect to trends over the decades.

Data
Our analysis is based on historical daily log-returns for the set of 24 futures contracts shown in table 1. This set is diversified across four asset classes (equity indices, interest rates, currencies, commodities), three regions (Americas, Europe, Asia) and three commodity sectors (energy, metals, agriculture). We use futures returns, instead of the underlying market returns, because futures returns are guaranteed to be marked-to-market daily. Moreover, they are readily available for all asset classes and net of the risk-free rate, which also makes returns in different currencies and interest rate regimes comparable with each other.
where the long-term daily risk premium µ i and the long-term daily standard deviation σ i of a market i are measured over the whole 30-year period. For some futures markets, the log-returns r i (t) had to be backtracked or proxied as follows:
6. German 10-year "Bund" futures: their history begins on Nov 27, 1990. Before that date, we have reconstructed futures returns from daily German 10-year and short-term interest rates, assuming a duration of 8. This data correction is also minor: it merely slightly affects the initial trend strengths at the beginning of 1992, when the analysis begins.
7. Natural gas futures: their history begins on Apr 5, 1990. Before that date, 1.5-fold levered crude oil futures returns are used as proxies for the natural gas futures returns, using the U.S. Libor rate as the cost of leverage. The 1.5-fold leverage reflects the higher volatility of natural gas compared with crude oil. Again, this data correction is minor, as it merely affects the initial trend strengths at the beginning of 1992, when the analysis starts.

Time Scales
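We will examine the interplay between trends and reversion at 10 different time scales: $T = 2^k$ trading days with $k = 1, 2, \dots, 10$, i.e., horizons from 2 trading days to 1024 trading days (about 4 years).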

Trend Strengths
As reviewed in [17,18], there are many different definitions of the strength of a trend, most of which are highly correlated. For the purpose of this study, we need a definition that
• has only a single free parameter, the horizon T (to avoid overfitting historical data)
• can be computed recursively (which will later help to relate it to critical phenomena)
Let us develop the most convenient such definition step by step. For a given time horizon T , we define the trend strength φ i,T (t) of a market i at the end of day t ∈ Z as a weighted average of past daily returns of that market (i.e., on or before day t), more precisely, of the normalized past daily log-returns (1) in excess of the long-term risk premium:
$$\phi_{i,T}(t) = \sum_{n \ge 0} w_T(n)\,\big(R_i(t-n) - E[R_i]\big) \qquad (2)$$
where w T (n) is a weight function for the time scale T . Removing the long-term risk premia in (2) is necessary to ensure that the long-term expectation value of the trend strengths φ i,T is zero. If we did not remove the risk premia, very long-term trends in equity and bond markets, where such risk premia are generally assumed, would almost always be positive and never revert. This mix-up of trends with risk premia would distort our results, as discussed in appendix A2. Note that the long-term risk premia are estimated over the whole time period in (2). However, to avoid any biases, in the out-of-sample cross-validation of section 4 we estimate the risk premia only from the training samples, excluding the validation samples.
We also normalize the weight function w T (n) such that the trend strength φ i,T has standard deviation 1. Assuming that market returns on different days are independent of each other (which is true to high accuracy), this implies:
$$\sum_{n \ge 0} w_T(n)^2 = 1 \qquad (3)$$
With this normalization, φ i,T can be regarded as the statistical significance of the trend.
E.g., φ i,T = 2 represents a highly significant up-trend, while φ i,T = −0.5 represents a weakly significant down-trend. This normalization makes all trend strengths comparable with each other, and will thus allow us to aggregate across different markets and time scales below.
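The normalization can be illustrated with a minimal sketch under stated assumptions. The exponential weight shape, the truncation at 8T days, and the helper names `normalized_weights` and `trend_strength` are illustrative choices and are not the weight function used in this paper. The sketch only shows that rescaling a weight function to unit sum of squares yields a trend strength with standard deviation close to 1 for independent unit-variance returns.

```python
import numpy as np

def normalized_weights(raw_weights):
    """Rescale weights so that sum(w**2) == 1."""
    w = np.asarray(raw_weights, dtype=float)
    return w / np.sqrt(np.sum(w ** 2))

def trend_strength(excess_returns, weights):
    """phi[t] = sum_n weights[n] * excess_returns[t - n].

    The first len(weights) - 1 entries are NaN because the window is incomplete.
    """
    x = np.asarray(excess_returns, dtype=float)
    phi = np.convolve(x, weights, mode="full")[: len(x)]
    phi[: len(weights) - 1] = np.nan
    return phi

rng = np.random.default_rng(0)
returns = rng.standard_normal(20_000)        # synthetic iid unit-variance returns
T = 32                                       # horizon in days
n = np.arange(8 * T)                         # lags; truncation is an assumption
w = normalized_weights(np.exp(-n / T))       # hypothetical weight shape
phi = trend_strength(returns - returns.mean(), w)

print("sum of squared weights:", float(np.sum(w ** 2)))
print("std of trend strength: ", float(np.nanstd(phi)))
```

For real market data the standard deviation would deviate from 1 to the extent that daily returns are autocorrelated.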
It can be verified that w T satisfies (3). However, ψ is quite volatile and jumps when an outlier return enters the rolling time window.
One way to solve this problem is to use the common definition of φ i,T in terms of a moving average crossover: one subtracts the average log-price of asset i over a longer time period L from the average log-price of the same asset over a shorter time period S. As pointed out in [17], this corresponds to a wedge-like weight function (fig. 1, solid line). It makes the trend strength less volatile, as outlier returns affect it only gradually over the time period S.
It also filters out short-term trends on time scales smaller than S, which helps to separate trends at different time scales from each other.
Figure 1 shows the weight function used in this paper, compared with three standard alternatives. All four weight functions shown here have the same average lookback period.
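The correspondence between a moving average crossover and a wedge-like weight function on returns can be checked numerically. The following is a minimal sketch, assuming simple (equally weighted) averages of log-prices over the last S and L days including today; the function name `crossover_weights` and the values S = 10, L = 50 are illustrative.

```python
import numpy as np

def crossover_weights(S, L):
    """Weights w[n] on the return r(t - n) such that
    mean(log-price, last S days) - mean(log-price, last L days) = sum_n w[n] * r(t - n).
    """
    n = np.arange(L - 1)
    long_part = (L - 1 - n) / L                      # from the long average
    short_part = np.clip(S - 1 - n, 0, None) / S     # from the short average
    return long_part - short_part

rng = np.random.default_rng(1)
r = rng.standard_normal(500)     # synthetic daily log-returns
p = np.cumsum(r)                 # log-prices, so that p[t] - p[t-1] == r[t]
S, L = 10, 50
t = len(p) - 1

direct = p[t - S + 1 : t + 1].mean() - p[t - L + 1 : t + 1].mean()
w = crossover_weights(S, L)
via_returns = np.dot(w, r[t - np.arange(L - 1)])

print(direct, via_returns)       # the two numbers agree up to rounding
print(int(np.argmax(w)))         # the wedge peaks at lag S - 1
```

The weights rise linearly up to lag S − 1 and then decay linearly to zero at lag L − 1, which is the wedge shape described above.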
Unfortunately, the moving price average has two parameters L, S (instead of just one parameter T ) that must be fitted to the data in any analysis, which tends to reduce the statistical significance of the results. In this article, we will therefore use another similar weight function that involves only the single parameter T (fig. 1, grey area; for comparability with the other weight functions, the figure shows w T /2 instead of w T ):
With this normalization factor N T , one can verify that (3) is indeed satisfied. Moreover, this definition, together with (5), allows for a recursive combined computation of the two variants ψ, φ of the trend strength (which will be important in section 5):
The "average lookback period" of this trend strength, i.e., the expectation value E[n + 1] of the number of days we look back (where "today", i.e. n = 0, counts as a 1-day lookback), is
We have verified that, for a given horizon T , all definitions of the trend strength in fact yield quite similar results in our regression analysis of section 4, as long as the weight function rises gradually, decays gradually, and the average lookback period is the same. However, we use (6,7) here, as it is the simplest mathematical function that satisfies these criteria, has only the single free parameter T , and can be computed recursively.
To limit the impact of outlier values of φ i,T on our results, we will cut it off at ±2.5 in the actual regression analysis of section 4, i.e., we will use the capped and floored version $\max(-2.5,\ \min(2.5,\ \phi_{i,T}))$. According to [11], the standard practice of the managed futures industry (which focuses on trends, and not reversion) for this threshold is 2.0. We use the slightly higher value of 2.5, because this will allow us to study more precisely the regime where trends revert, yet it will not give excessive weights to outliers. This is supported by Appendix A1, which compares the results of our regression analysis of section 4 for thresholds from 2.0 to 3.0.
For thresholds > 2.5, we get a higher adjusted R-squared. However, the results are less robust, and the statistical significance of the regression betas decreases. For thresholds < 2.5, the reversion regime would be largely removed from the analysis, leading to a lower adjusted R-squared without improving the overall statistical significance of the results.

Database
Table 2 displays a small extract of the resulting database for our analysis. Only two of the 7305 business days and only three of the 24 markets are shown. The third column shows the normalized daily log-returns (1), which have standard deviation 1. The 7305 business days cover only the 28-year period from Jan 1992 to Dec 2019, because the first two years (1990-1991) were only used to compute the initial trend strengths at the beginning of 1992.
The full table with 7305 × 24 = 175 320 lines is published along with this paper.

Qualitative Observations
This section begins with an exploratory analysis of our data. The analysis in this section is only qualitative, but it serves to motivate the specific quantitative, statistically rigorous regression analysis of the following section. We stress again that our aim is not to improve futures trading strategies, which would have to include risk limits, trading cost minimization, and other features. Rather, we simply want to empirically measure and model the small autocorrelations of market returns as accurately as possible as a basis for future work.

Next-day Return vs. Trend Strength
We use the data of table 2 to measure the expected daily return of a futures market as a function of the trend strengths in that market on the previous day. To this end, we first construct 7305 · 24 · 10 = 1 753 200 pairs of data. Each pair consists of the normalized log-return R i (t) in that market on day t, and one of the 10 trend strengths φ i,T (t − 1) on the previous day. So each return appears in 10 data pairs, each time paired with a different trend strength.
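The construction of the data pairs and a bin-wise average of next-day returns can be sketched as follows. This is a minimal illustration on synthetic placeholder data that carries no signal; the array shapes, the bin edges, and the function name `binned_mean_next_return` are assumptions and not taken from the paper's database.

```python
import numpy as np

def binned_mean_next_return(phi_prev, next_ret, edges):
    """Average next-day return within each trend-strength bin."""
    phi_prev = np.asarray(phi_prev, dtype=float)
    next_ret = np.asarray(next_ret, dtype=float)
    n_bins = len(edges) - 1
    # Values on the outermost right edge are assigned to the last bin.
    idx = np.clip(np.digitize(phi_prev, edges) - 1, 0, n_bins - 1)
    means = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            means[b] = next_ret[mask].mean()
    return means

rng = np.random.default_rng(2)
n_days, n_markets, n_scales = 2_000, 3, 10
returns = rng.standard_normal((n_days, n_markets))            # stand-in for R_i(t)
strengths = np.clip(
    rng.standard_normal((n_days, n_markets, n_scales)), -2.5, 2.5
)                                                             # stand-in for capped phi_{i,T}(t)

# Pair each return R_i(t) with each of the trend strengths phi_{i,T}(t - 1).
phi_prev = strengths[:-1].reshape(-1)
next_ret = np.broadcast_to(returns[1:, :, None], strengths[:-1].shape).reshape(-1)

edges = np.linspace(-2.5, 2.5, 11)
print(binned_mean_next_return(phi_prev, next_ret, edges))
```

Each return enters `next_ret` once per time scale, mirroring the statement that each return appears in 10 data pairs.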
We observe that the average next-day return is close to zero at zero trend strength, and grows linearly with the trend strength for small strengths. As the trend strength increases further, the average next-day return peaks, then decreases again until it becomes zero somewhere below trend strength 2. For even stronger trends, the average return decreases dramatically. This behavior is mirrored on the left-hand side of the graph for down-trends.
We have verified that this pattern remains almost the same if another day of delay is added, i.e., if the next-day return in our data pairs is replaced by the return 2 days later.
Thus, trends tend to revert when they become too strong. This makes sense intuitively: after strong trends, markets tend to be overbought or oversold, so one expects a reversion to "value". Our analysis quantifies where exactly this happens: below a critical trend strength of 2, before trends become strongly statistically significant. Note that this is not in line with classical trend-following, which would follow the trend no matter how strong it becomes.
The dashed line in fig. 2 (left) indicates the trading position that a classical trend-follower would take as a function of the trend strength.

Dependence on the Time Scale
In a next step, we analyze how the pattern observed in the previous sub-section depends on the time scale. To this end, we refine the bins used above: we split each bin into 10 smaller

Counting Degrees of Freedom
As emphasized in [19], one must be very conservative in introducing new factors and parameters in financial market models. Before modeling the observed patterns in detail, let us therefore do a back-of-the-envelope calculation of how many parameters we can hope to fit in our model without over-fitting our daily return data, and what fraction of the variance of these returns we can hope to explain by trend factors.
Our 7 305 · 24 = 175 320 daily log-returns are not independent, because the 24 markets are correlated with each other. How many independent markets are there? The daily returns are normalized to have variance 1. For a portfolio that invests 1/24 in each market, we find a variance of σ 2 ∼ 1/8, just as if it contained n m = 8 independent assets. A principal component analysis confirms that the first 8 (resp. 12) principal components explain 65% (resp. 80%) of the variance of the returns of our 24 markets. In this sense, these returns effectively live in a space of dimension n m ∼ 8. Adding more markets to our 24 time series does not significantly increase n m .
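The notion of an effective number of independent markets can be illustrated with a small sketch. It uses a toy one-factor model with a uniform pairwise correlation of 2/23, chosen only because it reproduces n m ≈ 8 for 24 markets; the real correlation structure is of course richer, so the PCA output of this toy model is not expected to match the 65% and 80% quoted above. Function names are illustrative.

```python
import numpy as np

def effective_number_of_markets(returns):
    """1 / variance of the equally weighted portfolio of unit-variance returns."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean(axis=0)) / r.std(axis=0)
    return 1.0 / z.mean(axis=1).var()

def pca_explained_variance(returns):
    """Cumulative share of variance explained by the leading principal components."""
    corr = np.corrcoef(np.asarray(returns, dtype=float), rowvar=False)
    eig = np.linalg.eigvalsh(corr)[::-1]          # eigenvalues, largest first
    return np.cumsum(eig) / eig.sum()

rng = np.random.default_rng(3)
n_days, n_markets = 50_000, 24
rho = 2.0 / 23.0                                  # hypothetical uniform correlation
common = rng.standard_normal((n_days, 1))
idio = rng.standard_normal((n_days, n_markets))
correlated = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idio

n_eff_corr = effective_number_of_markets(correlated)
n_eff_indep = effective_number_of_markets(idio)
print(n_eff_corr, n_eff_indep)
print(pca_explained_variance(correlated)[:3])
```

For uniformly correlated markets the portfolio variance is (1 + (n − 1)ρ)/n, so n = 24 and ρ = 2/23 give a variance of 1/8.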
What is the highest annualized Sharpe ratio S that one can hope to achieve by systematically trading a broadly diversified set of highly liquid futures markets based on trends and reversion? Experience with the Managed Futures ("CTA") industry suggests that S can be at best 1. The small number of CTA's that have achieved a higher Sharpe ratio for several years in a row presumably also pursue other strategies that are not purely based on trends, or they are not market-neutral (in the sense of zero net exposure to each market over time).
An annualized Sharpe ratio of S = 1 implies a daily Sharpe ratio ρ for each market of $\rho \sim S/\sqrt{260 \cdot n_m} \approx 0.02$, assuming 260 trading days per year and n m = 8 independent markets. So the predicted next-day return of a market has a correlation of ρ = 0.02 with the actual next-day return. E.g., if we only try to predict the sign of the next return, we can at best hope to be right on 51 and wrong on 49 out of 100 days. The adjusted R-squared (achieved out-of-sample in real trading) is then R 2 adj ∼ ρ 2 = 4 basis points (1 bp = 10 −4 ). Clearly, the variance of financial market returns is overwhelmingly due to random noise.
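The arithmetic behind these orders of magnitude can be written out as a short sketch. It assumes that the annualized Sharpe ratio aggregates over 260 trading days and n m = 8 independent markets, i.e. S = ρ · sqrt(260 · n m), and that a binary sign forecast with correlation ρ is right with probability (1 + ρ)/2; both are simplifying assumptions for illustration.

```python
import math

def daily_correlation(sharpe, n_markets, days_per_year=260):
    """Per-market, per-day correlation implied by an annualized Sharpe ratio."""
    return sharpe / math.sqrt(days_per_year * n_markets)

rho = daily_correlation(sharpe=1.0, n_markets=8)
r2_bp = rho ** 2 * 1e4          # R-squared in basis points (1 bp = 1e-4)
hit_rate = (1.0 + rho) / 2.0    # share of days with the correct sign forecast

print(f"rho = {rho:.4f}, R^2 = {r2_bp:.1f} bp, hit rate = {hit_rate:.3f}")
```

Without rounding, ρ is slightly above 0.02 and ρ² is between 4 and 5 bp; rounding ρ to 0.02 gives the 4 bp used in the text.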
If we fit k parameters to our data, and if our returns were independent and identically distributed ("iid"), then, for small R 2 , the adjusted R-squared would be approximately $R^2_{adj} \approx R^2 - k/N$ with $N = 260 \cdot n_m \cdot Y$ for Y years of daily data. If we require that the correction for the in-sample bias does not erode more than 20% of our R 2 , then we conclude that we cannot fit more than k ∼ N ·1 bp ∼ 6 parameters to our 28 years of data. This 20%-requirement is not too conservative, as our returns are only approximately "iid", and therefore the actual correction per fitted parameter will be higher than 20% (below, cross-validation will show that it is indeed more than twice as big). We conclude that we must use parameters wisely, and not "waste" them on features that may be artifacts of our limited data set.
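The parameter budget can be reproduced with a few lines of arithmetic. This sketch assumes the small-R² approximation in which each fitted parameter costs 1/N of R-squared, with N = 260 · n m · Y effective independent observations; the function name `in_sample_bias` is illustrative.

```python
def in_sample_bias(k, years, n_markets=8, days_per_year=260):
    """Approximate R^2 - R^2_adj for k fitted parameters and iid returns."""
    n_obs = days_per_year * n_markets * years
    return k / n_obs

n_obs = 260 * 8 * 28                                  # effective observations in 28 years
bias_bp = in_sample_bias(k=2, years=28) * 1e4         # bias for two parameters, in bp
max_params = n_obs * 1e-4                             # parameters that cost 1 bp in total

print(f"N = {n_obs}, bias for k=2: {bias_bp:.2f} bp, budget: {max_params:.1f} parameters")
```

With these inputs the bias for two parameters is about 0.34 bp, and a total budget of 1 bp corresponds to roughly 6 parameters.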

Dependence on the Trend Strength
The graph in fig. 2 (left) suggests modeling the next-day normalized log-return R(t + 1) (1) as a polynomial of the current trend strength φ(t) (2) across all markets and time scales:
$$R(t+1) = a + b\,\phi(t) + c\,\phi(t)^3 + \epsilon(t+1) \qquad (10)$$
where ε represents random noise, and a measures the average risk premium µ i /σ i across all assets. Similar models with a polynomial random force have been postulated previously, notably by econophysicists with a background in critical phenomena [20,21,22,23]. Our observations of the previous section give clear empirical support for a polynomial ansatz. We will discuss the relationship with critical phenomena in more detail in section 5.
Since market returns cannot be assumed to be independent, identically distributed normal variables, we cannot trust the usual estimates of the t-statistics, adjusted R-squared, and F-statistic. Instead, the test statistics shown in table 3 are measured empirically as follows:
• Table 3 also reports R 2 and R 2 adj "aggregated across time scales". Those are based on using the equally-weighted mean of the 10 trend strengths on each day to predict the next-day return for each market. That is, we combine the 10 different trend factors into a single one, which naturally has a higher predictive power than each single factor by itself. This regression is thus performed on only 7305 · 24 = 175 320 pairs of data.
• The F-statistic can be computed numerically to be F = 4.6 with a p-value of 0.7% by modelling the distribution of regression coefficients in fig. 2 (right) by an elliptical distribution. However, the distributions in subsequent sections are not even approximately elliptical. We will therefore use R 2 adj and not F to compare the out-of-sample explanatory power of our models with each other.
The regression results of table 3 confirm and quantify our conclusions from the previous section. We see that the values of b and c -although very small -are statistically highly significant, despite the fact that market returns are neither normally distributed, nor independent, nor identically distributed. So is the average long-term risk premium a. The overall result is significant at the 99% level. The aggregated out-of-sample R 2 adj that combines the predictions from all 10 time scales matches our initial expectation of 4 bp (9). Note that the correction for the aggregated in-sample bias, R 2 − R 2 adj = 0.93 bp, is much bigger than what would have been expected if returns were "iid", namely 2/(260 · 8 · 28) = 0.34 bp.
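A regression of this type, with bootstrapped standard errors, can be sketched as follows. The sketch assumes a model with a constant, a linear and a cubic term in the trend strength, and it resamples individual observations independently, which is a simplification and not necessarily the bootstrapping procedure used for table 3. The synthetic coefficients (0.02 and −0.006) and all function names are hypothetical.

```python
import numpy as np

def fit_cubic(phi, next_ret):
    """Least-squares fit of next_ret = a + b*phi + c*phi**3; returns (a, b, c)."""
    X = np.column_stack([np.ones_like(phi), phi, phi ** 3])
    coef, *_ = np.linalg.lstsq(X, next_ret, rcond=None)
    return coef

def bootstrap_se(phi, next_ret, n_boot=50, seed=0):
    """Standard errors of (a, b, c) from resampling observations with replacement."""
    rng = np.random.default_rng(seed)
    n = len(phi)
    draws = np.empty((n_boot, 3))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[i] = fit_cubic(phi[idx], next_ret[idx])
    return draws.std(axis=0)

def critical_strength(b, c):
    """Trend strength at which b*phi + c*phi**3 vanishes (needs b > 0 > c)."""
    return float(np.sqrt(-b / c)) if b > 0 and c < 0 else float("nan")

# Synthetic data: a weak cubic signal buried in unit-variance noise.
rng = np.random.default_rng(4)
phi = np.clip(rng.standard_normal(100_000), -2.5, 2.5)
next_ret = 0.02 * phi - 0.006 * phi ** 3 + rng.standard_normal(len(phi))

a_hat, b_hat, c_hat = fit_cubic(phi, next_ret)
se = bootstrap_se(phi, next_ret)
print("estimates:", a_hat, b_hat, c_hat)
print("bootstrap standard errors:", se)
print("critical strength for b=0.02, c=-0.006:", critical_strength(0.02, -0.006))
```

Even with 100 000 synthetic observations, the standard errors are a sizeable fraction of the coefficients, which illustrates why aggregation across markets and time scales is needed.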
We have also tested the quadratic, quartic and quintic terms in φ T in (10). None of them turned out to be statistically significant at the 95% level (in particular, the t-statistic for d is below 1). We therefore drop them from our analysis to avoid over-fitting the historical data.
fig. 4 (we neglect the overall risk premium a, which is not the focus of this paper).

Dependence on the Time Scale
From the coefficient b of the linear term, which models trends, we observe that trend-following works best at time scales from 3 months to 1 year, where b peaks. This appears to be in line with the time scales on which typical CTAs follow trends. Even at those scales, the critical trend strength beyond which trends tend to revert is below 2. So trends never become strongly significant.
For scales below a few days and above several years, b seems to go to zero, which means that trends are not persistent there. This is consistent with the heat map in fig. 2 (right). On the other hand, the coefficient c of the cubic term, which ensures that trends revert, is quite stable, except that its magnitude appears to be somewhat lower for the 2- and 4-year scales. The 2- and 4-year results must be taken with a grain of salt, though, as there are only 14 independent 2-year trends and 7 independent 4-year trends in our 28-year time window.
Indeed, a preliminary check based on 60 years of monthly returns resulted in c ∼ −0.6% at the 8-year scale. The available data thus indicate that, unlike trend-following, mean reversion works at all time scales. This is also consistent with our earlier observation from the heat map in fig. 3 (left).
To quantify these observations, we refine our regression ansatz (10). We continue to model the cubic coefficient c by a constant, but we model the dependence of the linear coefficient b(k) on the logarithm k of the time scale T = 2 k by a parabola:
$$b(k) = b - e\,(k - k_0)^2 \qquad (11)$$
with (∆k) 2 = b/e. The critical trend strength φ c (k) = (−b(k)/c) 1/2 , at which the expected return E(R t+1 ) is zero (without the noise ε), and beyond which trends revert, is then an ellipse with semi-axes ∆k and φ c (k 0 ). Altogether, we now fit 4 parameters to our data:
• The "persistence of trends" b, i.e. the value of b(k) at its peak
• The "strength of reversion" c
• The range k 0 ± ∆k of the log of the time scales T = 2 k at which markets may trend.
Fig. 3 (right) plots the elliptic region, which separates the "trend regime" (inside) from the "reversion regime" (outside). For its second semi-axis, we find φ c (k 0 ) = 1.78±0.32. This quantifies the empirical heat map in fig. 3 (left) and confirms that highly significant trends of strength (i.e., t-statistic) φ ≥ 2 always tend to revert.
The errors of the regression parameters in table 4 are again computed by bootstrapping. The distribution of b and c looks the same as in the univariate case (Fig. 2, right).
We have tried to further refine ansatz (11). First, b(k) in fig. 4 (left) seems to be tilted to the right, which could be accounted for by models such as
We find that such models increase the adjusted R-squared at best marginally. Therefore, we use the simplest model (11) in this paper, to avoid over-fitting the historical data.
Second, we also tested for a polynomial dependence of c on k. The most significant ansatz was that c(k) is also a parabola proportional to −b(k). In this case, the critical trend strength is constant across all time scales, and the region within which markets trend is rectangular instead of elliptic. The distribution of the parameters b(k 0 ), c(k 0 ) then turns out to have the shape of the stretched annulus shown in fig. 5 (right). However, this scenario seems less likely, as it yields a much lower adjusted R-squared (0.77 bp).
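The elliptic shape of the trend regime follows directly from a parabolic b(k) and a constant c, as the following sketch shows. The peak value 2.00% and the semi-axis 1.78 are quoted from the text, whereas the centre k0 = 6 and the width ∆k = 5 are placeholder values chosen only for illustration; the function names are hypothetical.

```python
import numpy as np

def linear_coefficient(k, b_peak, k0, dk):
    """Parabolic scale dependence: b(k) = b_peak * (1 - ((k - k0) / dk)**2)."""
    return b_peak * (1.0 - ((np.asarray(k, dtype=float) - k0) / dk) ** 2)

def critical_strength(k, b_peak, c, k0, dk):
    """phi_c(k) = sqrt(-b(k) / c); zero where b(k) <= 0 (no trend regime)."""
    b_k = linear_coefficient(k, b_peak, k0, dk)
    return np.sqrt(np.clip(-b_k / c, 0.0, None))

b_peak = 0.02                       # peak persistence of trends (2.00%)
phi_c0 = 1.78                       # critical strength at the peak
c = -b_peak / phi_c0 ** 2           # implied strength of reversion
k0, dk = 6.0, 5.0                   # placeholder centre and half-width in log2(T)

k = np.arange(1, 11)                # time scales T = 2**k
phi_c = critical_strength(k, b_peak, c, k0, dk)
for k_i, p_i in zip(k, phi_c):
    print(f"T = {2 ** k_i:5d} days: critical trend strength {p_i:.2f}")

# Points on the boundary satisfy the ellipse equation.
ellipse = (phi_c / phi_c0) ** 2 + ((k - k0) / dk) ** 2
print(np.allclose(ellipse, 1.0))
```

Writing b(k) = b · (1 − (k − k0)²/(∆k)²) is equivalent to the parabola with (∆k)² = b/e used above.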

Dependence on the Asset Class
Can we refine our 4-parameter-model further by distinguishing between asset classes, i.e., by fitting separate regression parameters for equities, bonds, currencies and commodities?
To test this, we have repeated the regression analysis of the previous section for these 4 sub-sets of our data. Fig. 6 (left) shows the 16th, 50th and 84th percentile of the values of the 4 regression parameters for each asset class, divided by the values of the regression parameters for the overall sample. E.g., for equities, the quantiles for b are (1.80%, 2.82%, 4.01%), which are multiples of (0.90, 1.41, 2.01) of the overall regression coefficient 2.00% (see table 3). Those multiples are what is shown in the first bar of fig. 6 (left).
The results of a regression analysis, including bootstrapping and cross-validation, are shown in table 5. The decrease of c, which measures the strength of reversion, is only weakly significant. However, the decrease of b, which measures the persistence of trends, is significant at the 97.5% confidence level. In principle, we could compute the year Y 0 , in which b(t) = 0:
Thus, if one were to take this linear down-trend of b literally, one would conclude that the phenomenon of persistent market trends may have already disappeared. However, there are other scenarios for the time decay of the persistence of trends that are consistent with our data. E.g., for an exponential decay scenario, in which trends never disappear, we find only a slightly lower adjusted R-squared of 1.39 bp instead of 1.52 bp, with $b(t) \propto e^{-Qt}$ and decay rate Q ∼ (24 years) −1 . We have also tested scenarios where all 4 parameters or other subsets of them change at different rates, but found that all of these scenarios significantly reduce the adjusted R-squared. It is left for future work to investigate the time evolution of the pattern of trends and reversion in more detail.

Analogies with Critical Phenomena
In this section, we point out some striking analogies between the empirical observations of sections 3 and 4 and critical phenomena in statistical mechanics. Analogies between financial markets and critical phenomena, such as scaling relations, have long been observed [24]. Our results go further: they seem to directly and specifically identify the trend strength with the order parameter of a Landau-type mean field theory with a quartic potential.
Analogies with critical phenomena are plausible, if financial markets are regarded as statistical mechanical systems, whose microscopic constituents are the Buy/Sell orders of individual traders. It is conceivable that these orders can be modeled by degrees of freedom that sit on the vertices of a hypothetical "social network of traders". These degrees of freedom may interact with each other in analogy with spins on a lattice, thereby creating the macroscopic phenomena of trends (herding behavior) and reversion (contrarian behavior).
To imitate these phenomena and their interplay, various spin and agent models have been proposed in the literature (see, e.g., [25,26], and [27] for a recent review).
Candidates for the "social network of traders" include small-world networks [28], scale-free networks [29], or the Feynman diagrams of large-N field theory [30]. For a recent review of candidates for social networks, see [31]. To our knowledge, no convincing specific model has emerged as a consensus so far. Our results provide an empirical basis for accepting or rejecting such candidates: any statistical-mechanical model of financial markets, if accurate, must replicate the interplay of trends and reversion observed in this paper.
To make this precise, let us first reap the benefits of our recursive definitions (5,7) of the trend strength, which lead to simple differential equations in the "continuum limit" T ≫ 1:
To be specific, let us focus on the 6-month time horizon, i.e., T = 2^7 = 128 trading days (the results for other horizons are similar). Combining (12) with the ansatz (10) implies the following second-order stochastic differential equation for the trend strength φ:
with rescaled random noise ε. Its simpler cousin ψ in (5) obeys a first-order equation:
with the following empirical parameter values, as measured by a regression analysis that is analogous to that reported in section 4 for ψ:
Equation (14) is the purely dissipative Langevin equation, which is reminiscent of the earlier description [20] of the dynamics of financial markets at intraday scales by another Langevin equation. In the theory of critical phenomena, the Langevin equation is well-known to describe the dynamics of the order parameter of certain statistical mechanical systems near second-order phase transitions [32,33]. This is consistent with the conjecture that the trend strength (defined as either φ or ψ) plays the role of an order parameter, in analogy with the magnetization in spin models.
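For readers unfamiliar with this type of equation, the following is a generic sketch of a purely dissipative Langevin equation with a quartic potential, dx = −V'(x) dt + σ dW with V(x) = (g2/2) x² + (g4/4) x⁴, integrated with the Euler-Maruyama scheme. The parameter values, the variable x, and the function name are purely illustrative; they are not the empirical parameter values or the exact equations (13,14) of this paper.

```python
import numpy as np

def simulate_langevin(g2, g4, sigma, dt=0.01, n_steps=1000, n_paths=5000, seed=0):
    """Euler-Maruyama integration of dx = -(g2*x + g4*x**3) dt + sigma dW.

    Returns the final values of n_paths independent paths started at x = 0.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    for _ in range(n_steps):
        drift = -(g2 * x + g4 * x ** 3)          # minus the gradient of the potential
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        x = x + drift * dt + noise
    return x

# Hypothetical parameters: same quadratic term, with and without the quartic term.
x_quadratic = simulate_langevin(g2=1.0, g4=0.0, sigma=np.sqrt(2.0))
x_quartic = simulate_langevin(g2=1.0, g4=1.0, sigma=np.sqrt(2.0))

print("stationary variance, quadratic potential:", x_quadratic.var())
print("stationary variance, quartic potential:  ", x_quartic.var())
```

With a purely quadratic potential the stationary variance is σ²/(2 g2); adding a positive quartic term confines large excursions more strongly and lowers the variance, which is the qualitative role played by the reversion term.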
To take the analogy further, statistical mechanical systems near second-order phase transitions are characterized by universal critical exponents. E.g., a scalar field theory with a φ^4 potential similar to the potentials V in (13,14) describes water and steam and other physical systems in the same universality class (such as CO 2 or the Ising model) near their critical points [33]. For all systems within this universality class, the parameters b and c show the same scaling behavior as a function of the length scale L (e.g., b ∼ L κ for some exponent κ).
In critical dynamics, scaling with L also translates into a scaling with the time horizon T [32].
In section 4, we have seen that, within the limits of statistical significance, the values of the coefficients b and c are the same for very different markets, such as equity indices, bonds, FX-rates, and commodities. The parameters k 0 and ∆k in (11), which characterize how b behaves under a rescaling of the time horizon T , are also the same. This could be an expression of universality and scaling in financial markets. To confirm this, it will be key to examine how the scaling behavior in (11) extends to intra-day and multi-year time horizons T = 2 k with k > 10 or k < 1. For example, it might reflect a complex critical exponent [34].
Together with the stochastic differential equations (13,14), the empirically observed scaling behavior may uniquely specify a particular social network that models financial markets.
To conclude this section, let us compare with some previous work. In [21], a related model for the dynamics of asset prices was postulated. The role of the trend was played by the deviation of the current asset price from its unknown "value". Terms of any order were considered in the polynomial potential, and the corresponding classical solutions were discussed. Compared with [21], our trends are measurable, and we focus on a quartic potential, empirically observe the values of its coefficients and their scale dependence, and provide a simple and intuitive map between the quadratic (quartic) terms and trends (reversion).
In [22], another model with a polynomial random force similar to (10) was postulated.
The trend strength was defined by a moving average crossover (which does not lead to exact differential operators such as (12) in the continuum limit). This model was applied in [23] to intraday returns for the USD/JPY and USD/EUR exchange rates during stress periods.
Instead of our quartic potential with stable coefficients, only a cubic potential was measured.
Moreover, its coefficients, including their signs, were found to vary rapidly in time.
However, these studies were based on very different data sets, namely tick data (instead of daily data) for single assets over time periods of several weeks (instead of decades). Thus, it is no surprise that the stable quartic potential (corresponding to the cubic term in (10)) was not found in [23]: as we have seen, in order to detect it with strong statistical significance, one not only needs decades of data, but must also aggregate them over a broadly diversified set of assets. Also, since the coefficient of the cubic potential reported in [23] varies rapidly in time, it can be expected to average out over long time scales. This is consistent with the fact that we do not observe a cubic potential in our empirical long-term analysis.

Summary and Discussion
In this paper, we have empirically observed the interplay of trends and reversion in financial markets, based on 30 years of daily futures returns across equity indices, interest rates, currencies and commodities. We have considered trends over ten different time horizons of $T = 2^k$ days with $k \in \{1, 2, ..., 10\}$, ranging from 2 days to approximately 4 years. For a given market i on a given day t, we have defined the trend strength $\phi_{i,k}(t)$ as the statistical significance (t-statistic) of a smoothed version of its mean return over the past $2^k$ days, in excess of the market's long-term risk premium.
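The idea of a trend strength as a t-statistic can be sketched in a few lines. The version below is a deliberate simplification: it uses an equal-weighted window of $2^k$ days rather than the smoothed, recursive definitions (5,7), and it assumes that daily returns have already been normalized to unit variance. The function and argument names are hypothetical.

```python
import numpy as np

def trend_strength(norm_returns, k, risk_premium=0.0):
    """Simplified trend strength over T = 2**k days.

    Assumes unit-variance daily returns, so the t-statistic of the mean
    excess return over the window is sum(excess) / sqrt(T).
    Returns one value per day on which a full window is available.
    """
    T = 2 ** k
    excess = np.asarray(norm_returns, dtype=float) - risk_premium
    # Rolling window sums via differences of cumulative sums.
    csum = np.concatenate(([0.0], np.cumsum(excess)))
    window_sums = csum[T:] - csum[:-T]
    return window_sums / np.sqrt(T)

# Hypothetical usage: a constant normalized return of 0.1 per day, T = 4 days.
phi = trend_strength(np.full(10, 0.1), k=2)
```

Because the window sum is divided by the square root of T, a pure random walk with unit-variance returns yields trend strengths of order 1 at every horizon, which is what makes different horizons comparable.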
Our key results, as illustrated in figs. 2 and 3, are the following: for a given market i and each time horizon labeled by k, tomorrow's normalized log-return $R_i(t+1)$ can accurately be modeled by the cubic polynomial (15) of today's trend strength in that market. In (15), $\epsilon_i$ represents random noise, and $\alpha_i$ is the normalized long-term risk premium of market i, which has not been the focus of this paper. Instead, we have concentrated on determining the coefficients b, c, and the function $f_k$, which measure how the expected return of an asset varies in time. As discussed, we interpret b as the persistence of trends, and c as the strength of trend reversion. Within the limits of statistical significance, we find that they are universal, i.e., the same for all assets. Over the past 30 years, we find the values of b and c given in (16) and the dependence of $f_k$ on the time horizon given in (17). The latter implies that trends may only be stable if the base-2 logarithm k of the time horizon is within the range $k_0 \pm \Delta k$, corresponding to time scales from a few days to several years. The parameters $k_0$ and $\Delta k$ are also universal. By bootstrapping and cross-validation, we have found that all four parameters in (16) and (17) are statistically highly significant out-of-sample.
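The type of regression summarized here can be sketched as an ordinary least-squares fit of the next-day return on the trend strength and its cube. The example below runs on synthetic data: the coefficient values, the sample size, and the convention of writing the reversion term as +c·φ³ with c < 0 are assumptions for illustration, and the plain OLS t-statistics shown are a simpler stand-in for the bootstrapping used in the paper.

```python
import numpy as np

def fit_cubic(phi, next_return):
    """OLS fit of next_return = alpha + b*phi + c*phi**3 + noise.
    Returns (coefficients, t_statistics), each ordered as (alpha, b, c)."""
    phi = np.asarray(phi, dtype=float)
    y = np.asarray(next_return, dtype=float)
    X = np.column_stack([np.ones_like(phi), phi, phi ** 3])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof              # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical OLS covariance
    return coef, coef / np.sqrt(np.diag(cov))

# Synthetic example with hypothetical "true" coefficients.
rng = np.random.default_rng(42)
phi = rng.standard_normal(200_000)
true_alpha, true_b, true_c = 0.0, 0.05, -0.01
ret = true_alpha + true_b * phi + true_c * phi ** 3 + rng.standard_normal(phi.size)
coef, t_stats = fit_cubic(phi, ret)
```

The synthetic signal is made far stronger than anything observed in real daily returns so that the fit is easy to verify; with realistic coefficients, a much larger pooled sample across markets and decades is needed to reach comparable significance.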
Let us now discuss these results. First, they imply that trends tend to revert above a critical trend strength, where the linear and cubic terms in (15) balance each other. This critical trend strength lies below 2 in all cases. In other words, by the time a trend has become statistically significant, such that it is obvious in a price chart, it is already over. This supports a variant of the efficient market hypothesis [35,36,37]: inefficiencies in financial markets are eliminated before they become strongly statistically significant.
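As a worked illustration of this balance, suppose that the trend-dependent part of the expected return in (15) is proportional to $b\,\phi - c\,\phi^3$ with $b, c > 0$. This sign convention, and the omission of any common prefactor (which would drop out), are assumptions made here for the sake of the example. The critical trend strength $\phi_c$ then follows from setting this expression to zero:

```latex
\begin{align*}
  \mathbb{E}\left[R_i(t+1)\right] - \alpha_i &\propto b\,\phi - c\,\phi^3
     = \phi\,\bigl(b - c\,\phi^2\bigr), \\
  b - c\,\phi_c^2 = 0 \quad &\Longrightarrow \quad \phi_c = \sqrt{b/c}, \\
  \phi_c < 2 \quad &\Longleftrightarrow \quad c > b/4 .
\end{align*}
```

For $|\phi| < \phi_c$ the expected excess return has the same sign as the trend, so the trend tends to persist; for $|\phi| > \phi_c$ the sign flips and the trend tends to revert.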
Despite being statistically insignificant, small trends can add value for investors through tactical asset allocation strategies, if accompanied by appropriate risk management and broad diversification across assets. While this paper does not recommend investment strategies, we note that the inclusion of the cubic term in (15) appears to be a major improvement over classical trend-following, as it takes investors out of trends before they are likely to revert (see also the comments on systematic asset management in appendix A3). We believe that publishing such strategies and subjecting them to academic discussion and independent review will ensure a high level of professionalism in asset management.
Trend-following was very successful in the 1980s and 1990s, when it was the proprietary strategy of a limited number of traders. By now, large amounts of capital have flowed into this strategy, so it can no longer be expected to provide a "free lunch." Indeed, while we have not observed a consistent weakening of the strength of reversion c, we have seen that the persistence of market trends b has clearly decreased over the decades. This decrease measures the rate at which markets are becoming more efficient with respect to trends.
What will happen when all investors try to exploit trends and reversion? Both phenomena should then weaken until they earn a moderate equilibrium return that just compensates for the systematic risk of these strategies and their implementation costs. In this sense, trend-following and mean reversion may just become "alternative market factors" as part of the general market portfolio. In fact, the weakening of b that we have observed here indicates that this development is already well underway, at least for traditional trend-following.
On a conceptual level, our precise measurement of trends and reversion reveals intriguing analogies with critical phenomena in physics. They support the conjecture that financial markets can be modeled by statistical mechanical systems near second-order phase transitions. In such a model, Buy/Sell orders would represent microscopic degrees of freedom that live on a "social network" of traders. The trend strength would play the role of an order parameter, whose dynamics is described by the stochastic differential equations (13,14). Together with an extension of the scaling behavior (17) to shorter and longer horizons, these equations provide an empirical starting point for developing such a model.
If such a statistical mechanical theory of financial markets can be established, it will introduce powerful concepts from field theory into finance, such as the renormalization group, critical exponents, and Feynman diagrams. This will lead to a new and deeper understanding of financial markets, and phenomena such as trends, reversion, and shocks will become more accessible to scientific analysis. Further research in this direction is underway.
The following table compares the regression results of subsection 4.2 with the results that would be obtained for alternative choices of some of the parameters:

A1. Caps and Floors for the Trend Strength
In section 2, we have capped the magnitude of the trend strength at 2.5 to limit the effect of outliers on the results.
• Increasing the cap beyond 2.5 increases the adjusted R-squared, but decreases the significance (t-statistics) of the regression betas. This makes sense intuitively, because the results are then dominated by the regime of strong reversion at "outlier" trend strengths |φ| > 2.5. As such outliers are rare, the statistical significance decreases.
• Decreasing the cap below 2.5 decreases the adjusted R-squared without improving the overall significance of the regression betas. This is also understandable, as it removes much of the reversion regime from the analysis.
Thus, our cap/floor of ±2.5 is a good compromise, where neither trends nor reversion outliers dominate the results.
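To see why rare outliers can dominate a regression that contains a cubic term, the following sketch measures how much of the cubic regressor's sum of squares comes from observations with |φ| > 2.5, with and without the cap. It assumes, purely for illustration, that trend strengths are standard normal; the function name and sample size are hypothetical.

```python
import numpy as np

def cubic_share_from_outliers(phi, threshold=2.5, cap=None):
    """Share of sum((phi**3)**2) contributed by points with |phi| > threshold.
    If cap is given, phi is first clipped to [-cap, cap]."""
    phi = np.asarray(phi, dtype=float)
    outlier = np.abs(phi) > threshold          # defined on the raw values
    used = phi if cap is None else np.clip(phi, -cap, cap)
    cubic_sq = (used ** 3) ** 2
    return cubic_sq[outlier].sum() / cubic_sq.sum()

rng = np.random.default_rng(0)
phi = rng.standard_normal(200_000)             # assumed Gaussian trend strengths
outlier_fraction = np.mean(np.abs(phi) > 2.5)
share_uncapped = cubic_share_from_outliers(phi)
share_capped = cubic_share_from_outliers(phi, cap=2.5)
```

Under this Gaussian assumption, roughly 1% of the observations carry about half of the uncapped cubic regressor's sum of squares, and capping at 2.5 reduces that share noticeably. This is in line with the qualitative argument above that a higher cap lets a few outliers drive the fit.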

A2. Long-term Risk Premia
In equation (2), we have removed the long-term risk premia $\mu_i$ from the trend strengths $\phi_{i,T}$. Here, we explain what happens if we do not remove the risk premia from the trend strengths:
• Trends in markets for which such risk premia are generally assumed would then have an upward bias, i.e., a positive expectation value. In particular, very-long-term trends in equity and bond markets would almost always be positive and never revert.
• Table 6 (col. 7-9) shows how this mix-up of trends and risk premia would modify the results of subsection 4.2 (shown under "0%"). If 50% or 100% of the risk premia were included in the definition of the trend strength, the parameter $k_0$ (which measures the time horizon at which the persistence b of trends peaks) would strongly increase.
• In fact, if we think of risk premia as trends with infinite time horizon, we expect that, without removing risk premia, the trending regime of subsection 4.2 would extend all the way to infinite horizon. We would then model it by a parabola instead of an ellipse.

A3. Comments on Systematic Asset Management
The key motivation for this article is to lay the empirical basis for a statistical-mechanical model of financial markets, which can hopefully explain the analogies with critical phenomena in physics. Nevertheless, let us briefly comment on implications for systematic trading:
• According to the back-of-the-envelope estimate of subsection 3.3, an annualized Sharpe ratio of order 1 for a systematic futures trading strategy corresponds to an adjusted R-squared of about 4 basis points in predicting daily returns of individual markets.
• By the same argument, the aggregated adjusted R-squared of 6 basis points of subsection 4.2 corresponds to an annual Sharpe ratio of $\sqrt{1.5} \approx 1.2$ for a market-neutral strategy.
• Trading costs reduce the Sharpe ratio, especially at intra-month time horizons. Diversifying into new types of markets or including risk premia can increase the Sharpe ratio.
• Risk control mechanisms, such as sizing positions based on current market volatility, or stop-losses in the reversion regime, can either decrease or increase the Sharpe ratio.
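The arithmetic behind the first two bullets can be made explicit. The two statements together imply that the Sharpe ratio scales with the square root of the adjusted R-squared, calibrated so that 4 basis points correspond to a Sharpe ratio of about 1. The helper below encodes only this scaling; its name and default arguments are hypothetical, and it ignores trading costs and the other effects listed above.

```python
import math

def sharpe_from_r2(r2_bp, ref_r2_bp=4.0, ref_sharpe=1.0):
    """Rough annualized Sharpe ratio implied by an adjusted R-squared
    (in basis points), assuming Sharpe scales as sqrt(R-squared) and
    using the calibration 4 bp <-> Sharpe ratio of about 1."""
    return ref_sharpe * math.sqrt(r2_bp / ref_r2_bp)

sharpe_6bp = sharpe_from_r2(6.0)  # equals sqrt(6 / 4) = sqrt(1.5)
```

Because of the square root, doubling the explanatory power of a signal raises the estimated Sharpe ratio by only about 41%.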