Inferring characteristic timescales from the effect of autoregressive dynamics on detrended fluctuation analysis

Exploiting recent progress in the theoretical understanding of detrended fluctuation analysis (DFA), we use the non-asymptotic properties of the fluctuation function in order to extract more information from time series data than just the Hurst exponent. In particular, we can identify exponential relaxation and oscillation periods and estimate their specific values. We illustrate the strength of this method through applications to climate data. We thereby determine the relaxation time of the atmospheric response to perturbations. Using DFA, we also find the period length of the dominant frequency mode of the El Niño Southern Oscillation to be 3.3 years.


Introduction
Anomalous statistical properties in time have been of interest to researchers ever since the Hurst effect [1] was found in 1951 [2]. Long range correlations in time have been found in many real-world time series, such as temperatures [3], heart rates [4], biological signals [5], financial data [6] and various others. The question whether or not a system exhibits long range correlations is relevant for error bars [7], predictions [8] and modeling in general. However, results that indicate long range correlations should always be treated with care, since power law decay and exponential decay of correlations cannot be distinguished easily for finite time series [9,10].
Detrended fluctuation analysis (DFA) was developed in the 1990s [11] and is to this day a very popular method for the detection of long range autocorrelations. For a long time the literature that dealt with the interpretation of DFA results was entirely empirical [12][13][14]. Nevertheless, the effects of all kinds of nonstationarities are well documented thanks to these studies. The effect of short time correlations, and how they can be mistaken for long range correlations, is a well known pitfall of this method [9].
In recent years there has been a lot of progress in the theoretical understanding of DFA [15][16][17]. Its relation to the correlation function, and therefore also to other tools of fluctuation analysis, is known [15]. Due to this better understanding we no longer have to rely on empirical results; instead we can discuss the behaviour of theoretical fluctuation functions [10]. DFA has two advantages over traditional methods for analyzing fluctuations: on the one hand it is smoother and numerically more robust for short datasets, and on the other hand the influence of simple trends can be removed easily [18].
In this text we aim at a better understanding of what DFA really shows for a large class of systems. We go one step further and ask what we can learn from fluctuation functions even if the process is not long range correlated. By fitting fluctuation functions we are able to determine fundamental properties like relaxation times and period lengths even for noisy systems.
In order to illustrate the power of our method, we show applications to climate time series. We use temperature measurements from the Hadley Centre Met Office (crudata.uea.ac.uk/cru/data/temperature/). For global mean surface temperature (GMST) we use HadCRUT4 data [19]. The El Niño Southern Oscillation (ENSO) signal T_Nino is calculated from HadSST3 [20], the sea surface temperature anomalies (SST), by first averaging SST in the Pacific domain between 160°E–90°W and 5°S–5°N [21]. Then we subtract the global mean SST time series and remove the small remaining trend.

DFA and calculation rules
The correct implementation of DFA is described elsewhere [22]. For our purpose, it is sufficient to understand what it does. DFA-q produces the so-called fluctuation function F(s). For stationary processes this is a nonlinear transformation of the autocorrelation function C(t) of the process [15],

F²(s) = σ² Σ_t L_q(t, s) C(t), (1)

where L_q(t, s) is a sophisticated kernel which is able to suppress the effects of polynomial trends of order q−1, and σ² is the variance of the process, which can be measured for stationary time series and approximated otherwise. The scale s has the same unit as t; however, it is not possible to calculate the fluctuation function up to the total length of the time series. In fact, s should be smaller than 1/4 of the maximal value of t [22].
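For readers unfamiliar with the procedure behind the fluctuation function, the following is a minimal sketch of DFA-q in Python (the function name `dfa` and all parameter choices are ours, not the paper's supplementary code). For white noise it reproduces the expected scaling F(s) ∝ s^0.5.

```python
import numpy as np

def dfa(x, s_values, q=1):
    """Minimal DFA-q sketch: profile, windowing, order-q detrending."""
    y = np.cumsum(x - np.mean(x))              # the profile
    F = []
    for s in s_values:
        t = np.arange(s)
        f2 = []
        for i in range(len(y) // s):           # non-overlapping windows
            seg = y[i*s:(i+1)*s]
            trend = np.polyval(np.polyfit(t, seg, q), t)
            f2.append(np.mean((seg - trend)**2))
        F.append(np.sqrt(np.mean(f2)))
    return np.array(F)

# sanity check: white noise should give F(s) ~ s^0.5
rng = np.random.default_rng(0)
s_vals = np.array([8, 16, 32, 64, 128, 256])
F = dfa(rng.normal(size=20000), s_vals)
alpha = np.polyfit(np.log(s_vals), np.log(F), 1)[0]
print(round(alpha, 2))   # close to 0.5
```

Production implementations differ in details (overlapping windows, backward windows, interpolation of s), but this captures the structure of the method.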
The fluctuation function inherits a superposition principle from the autocorrelation function; that means for independent processes X and Y we get [12]

σ²_{X+Y} C_{X+Y}(t) = σ²_X C_X(t) + σ²_Y C_Y(t), (2)

F²_{X+Y}(s) = F_X²(s) + F_Y²(s). (3)

This principle explains cross-overs which can occur if a signal is the sum of two independent signals with different correlation times or scaling.
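The superposition principle can be checked numerically with a minimal DFA-1 implementation of our own (the function name `dfa1_sq` is ours, not the paper's supplementary code): the squared fluctuation function of the sum of an AR(1) process and independent white noise is close to the sum of the individual squared fluctuation functions, up to finite-sample fluctuations.

```python
import numpy as np

def dfa1_sq(x, s_values):
    """Minimal DFA-1, returning the squared fluctuation function F^2(s)."""
    y = np.cumsum(x - np.mean(x))
    out = []
    for s in s_values:
        t = np.arange(s)
        f2 = [np.mean((y[i*s:(i+1)*s]
                       - np.polyval(np.polyfit(t, y[i*s:(i+1)*s], 1), t))**2)
              for i in range(len(y) // s)]
        out.append(np.mean(f2))
    return np.array(out)

rng = np.random.default_rng(1)
n = 50000
# X: an AR(1) process; Y: independent white noise with a different variance
eta = rng.normal(size=n)
X = np.zeros(n)
for i in range(1, n):
    X[i] = 0.7 * X[i-1] + eta[i]
Y = 3.0 * rng.normal(size=n)

s_vals = [16, 64, 256]
dev = np.max(np.abs(dfa1_sq(X + Y, s_vals)
                    / (dfa1_sq(X, s_vals) + dfa1_sq(Y, s_vals)) - 1))
print(dev)   # small relative deviation: the superposition principle holds
```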

Uncorrelated and correlated noise
As equation (1) shows, DFA measures the correlations in a time series. This should not be confused with other anomalous statistical properties. Anomalous diffusion can have three root causes [23,24]: the correlation time can be infinite, the second moment of the increments can diverge, or the system can be non-stationary. As a general model we consider a set of independent, identically distributed random variables with distribution ρ(ξ) (see figure 1). These random variables are also randomly distributed in time, where the inter-event times r have a probability distribution W(r). We assume that W(r) has a finite first moment and that the process η_t is therefore stationary. If ρ(ξ) has a finite second moment, equation (1) holds. If the second moment is infinite, the scaling α=0.5 is still the same; only the offset σ² becomes a random variable, as we show in figure 2 (left). Since C(t) = δ_{t0} for such uncorrelated noise, the squared fluctuation function is given as [15]

F²(s) = σ² L_q(0, s). (4)

Now we can define a dynamics X_t = D(η_t, X_{t−1}, X_{t−2}, …). Usually this is done by a fractional or an ordinary differential equation, as we do later in this text. In addition one might ask what happens if the noise is already correlated, as in a process Y_t = D(X_t, Y_{t−1}, Y_{t−2}, …). In general, these processes yield exponents

α_Y = α_X + α_D − 1/2, (5)

where α_D is the exponent the dynamics D would produce if driven by white noise. The relation is simple for ARFIMA(0, d, 0) [25] processes, defined as

(1 − B)^d x_t = ε_t, (6)

with the backshift operator B x_t = x_{t−1}. If such a process is driven by another ARFIMA process instead of white noise ε, it can be written as (1 − B)^{d₁+d₂} y_t = ε_t. Figure 2 (right) shows that relation (5) is also true for processes other than ARFIMA. This is not surprising, since all continuous correlation functions can be locally approximated with ARFIMA correlation functions.
We conclude that while the additive composition of two signals leads to equation (3), inserting one process instead of the white noise of another one leads to equation (5).
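The exponent relation (5) can be checked numerically. The sketch below is our own construction, not the paper's code: it generates ARFIMA(0, d, 0) noise by truncating the MA(∞) expansion of (1 − B)^{−d}, with coefficients ψ_k = Γ(k+d)/(Γ(d)Γ(k+1)), drives one such process with another, and estimates the DFA exponent of the result. For d₁ = 0.2 and d₂ = 0.15, the fitted slope should be near α = d₁ + d₂ + 0.5 = 0.85.

```python
import numpy as np
from math import lgamma

def arfima(eta, d, trunc=2000):
    """x_t = (1-B)^{-d} eta_t via the truncated MA(inf) expansion,
    psi_k = Gamma(k+d) / (Gamma(d) Gamma(k+1))."""
    psi = np.exp([lgamma(k + d) - lgamma(d) - lgamma(k + 1)
                  for k in range(trunc)])
    return np.convolve(eta, psi)[:len(eta)]

def dfa1(x, s_values):
    """Minimal DFA-1 fluctuation function F(s)."""
    y = np.cumsum(x - np.mean(x))
    F = []
    for s in s_values:
        t = np.arange(s)
        f2 = [np.mean((y[i*s:(i+1)*s]
                       - np.polyval(np.polyfit(t, y[i*s:(i+1)*s], 1), t))**2)
              for i in range(len(y) // s)]
        F.append(np.sqrt(np.mean(f2)))
    return np.array(F)

rng = np.random.default_rng(5)
d1, d2 = 0.2, 0.15
x = arfima(rng.normal(size=60000), d1)   # ARFIMA(0, d1, 0)
y = arfima(x, d2)                        # driven by x instead of white noise

s_vals = np.array([16, 32, 64, 128, 256, 512])
alpha = np.polyfit(np.log(s_vals), np.log(dfa1(y, s_vals)), 1)[0]
print(round(alpha, 2))   # should be near d1 + d2 + 0.5 = 0.85
```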

Linear response (AR(1))
We will now go back to systems driven by white noise. A standard example for a dynamics D is the autoregressive model of order one, AR(1). It is the discrete version of a linear first-order differential equation in one dimension, driven by white noise,

X_t = a X_{t−1} + η_t. (7)

We consider AR parameters a in the range 0 < a < 1. We have seen in section 3 that the distribution of the noise η is not important for our purpose as long as it is uncorrelated. The variance σ_X² can be obtained by taking the square and the ensemble average of equation (7). The result, σ_X² = σ_η²/(1 − a²), depends on a as well as on the variance of η. The correlation function decays exponentially with relaxation time τ [25],

C(t) = a^t = e^{−t/τ}, τ = −1/ln(a), for t ≥ 0. (8)

It is well known in the literature that DFA results for processes with exponential decay of correlations can easily be mistaken for long range correlations [9,26]. Thus we know we have to be careful, especially when dealing with short time series. Instead of fitting a straight line to F(s), we can try to fit the theoretical fluctuation function of AR(1). The analytical result for the AR(1) process can be derived from equation (1). For DFA-1 it yields [10] (see supplementary material available online at stacks.iop.org/NJP/21/033022/mmedia for Python code)

F²(s) = J(s) + a^s K(s), (9)

with J(s), K(s) polynomials in s, whose lengthy explicit forms (10) are evaluated in the supplementary Python code. Instead of equation (7), let us now consider an idealized random process with spikes, which shall make the idea of uncorrelated noise and our generalized notion of AR(1) a bit clearer. The spike magnitudes shall be distributed exponentially according to ρ(ξ). The probability of occurrence of a spike is identical for each time step, which yields exponential waiting times W(r). Finally, each spike also decays exponentially with rate γ=0.2. In fact, this is an AR(1) process according to definition (7) with a=e^{−γ} and a noise η which is a Poisson process with random magnitudes. In the left panel of figure 3, we show one realization of the process. On the right we show F(s) of a time series of length 3000.
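The spike process can be simulated in a few lines. The sketch below is our own (the spike rate of 0.05 per time step is a hypothetical choice); it confirms that the lag-1 autocorrelation of the resulting series is close to a = e^{−γ}.

```python
import numpy as np

rng = np.random.default_rng(2)
n, gamma = 200000, 0.2
# Poisson spike train: exponential magnitudes, exponential waiting times
spikes = rng.exponential(scale=1.0, size=n) * (rng.random(n) < 0.05)

x = np.empty(n)
x[0] = spikes[0]
for i in range(1, n):
    x[i] = np.exp(-gamma) * x[i-1] + spikes[i]   # each spike decays at rate gamma

# lag-1 autocorrelation of the stationary process should equal a = e^{-gamma}
x0 = x - x.mean()
a_hat = np.dot(x0[1:], x0[:-1]) / np.dot(x0, x0)
print(round(a_hat, 3))   # close to exp(-0.2) ≈ 0.819
```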
The theoretical AR(1) result is fitted by the method of least squares to the logarithm of the fluctuation function at logarithmically spaced time steps. In this way we ensure that all timescales are taken into account equally in the fit. The AR(1) coefficient obtained by the fit satisfies a ≈ e^{−γ} to a very good approximation, while the spike magnitudes and their frequency only affect the values of F(s) through a global prefactor.
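The fitting procedure can be sketched as follows. Since we do not reproduce the lengthy closed form of the AR(1) fluctuation function here, this sketch (our own; all function names are assumptions, not the supplementary code's) computes the theoretical DFA-1 curve numerically from the covariance of the profile within one window, which is an equivalent route via equation (1), and then finds the AR parameter by least squares on log F²(s) at logarithmically spaced s, absorbing the global prefactor.

```python
import numpy as np

def dfa1_sq(x, s_values):
    """Minimal DFA-1, returning the squared fluctuation function F^2(s)."""
    y = np.cumsum(x - np.mean(x))
    out = []
    for s in s_values:
        t = np.arange(s)
        f2 = [np.mean((y[i*s:(i+1)*s]
                       - np.polyval(np.polyfit(t, y[i*s:(i+1)*s], 1), t))**2)
              for i in range(len(y) // s)]
        out.append(np.mean(f2))
    return np.array(out)

def theo_F2_ar1(a, s):
    """Expected DFA-1 F^2(s) for AR(1) with unit noise variance, from the
    covariance of the profile inside one window (numerical route to eq (1))."""
    t = np.arange(s)
    C = a ** np.abs(t[:, None] - t[None, :]) / (1 - a**2)    # Cov(x_i, x_j)
    S = np.cumsum(np.cumsum(C, axis=0), axis=1)              # Cov of profile
    X = np.column_stack([np.ones(s), t])
    R = np.eye(s) - X @ np.linalg.solve(X.T @ X, X.T)        # linear detrending
    return np.trace(R @ S @ R.T) / s

rng = np.random.default_rng(4)
n, a_true = 80000, 0.7
eta = rng.normal(size=n)
x = np.zeros(n)
for i in range(1, n):
    x[i] = a_true * x[i-1] + eta[i]

s_vals = np.unique(np.round(np.logspace(np.log10(8), np.log10(200), 9)).astype(int))
log_emp = np.log(dfa1_sq(x, s_vals))

def cost(a):
    d = log_emp - np.log([theo_F2_ar1(a, s) for s in s_vals])
    return np.sum((d - d.mean())**2)   # centering absorbs a global prefactor

a_grid = np.arange(0.05, 0.96, 0.01)
a_fit = a_grid[np.argmin([cost(a) for a in a_grid])]
print(round(a_fit, 2))   # close to a_true = 0.7
```

The grid search stands in for a proper optimizer; the point is that the shape of F(s) around the crossover pins down the AR parameter.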
The same method can be applied to real data. As an example we show monthly and yearly averages of GMST. These datasets have a trend due to climate change that can be removed using DFA-3 [3] (see the appendix of [26] for theory and the supplementary material for Python code). In previous work [26], the authors have shown that there is a case to be made for yearly GMST data being modeled as an AR(1) process. There are two main fluctuating drivers of global temperature: volcanic eruptions and ENSO. On the timescale of years, ENSO is slightly anti-persistent, while the volcanic forcing has α > 0.5. When both are superimposed, the result in DFA is α ≈ 0.5 for the relevant timescales. A linear response model with these inputs is able to model GMST quite accurately [21].
When fitting ( ) s with the theoretical fluctuation function of AR(1) in DFA-3, we obtain an AR parameter of a=0.32 (see figure 4), which is signifficantly lower than to the one used in [21] (a=0.5), however, the goodness of the fit implies, that such a linear response model with relaxation time τ=10.2m is at least one good model for climate variability of the timescale of years. When performing DFA on monthly data and fitting again the fluctuation function of AR(1), we obtain a=0.78 in monthly time units, which yields an even shorter correlation time than the one for yearly data. However, a closer look at figure 4 reveals that the fit is bad, i.e. the monthly data is not well described by AR (1). The reason is that intraseasonal climate variability, which can not be described by white noise, plays a role here. So we see that we should not blindly fit AR(1) models but should take systematic deviations as warnings, as one does when fitting straight lines to fluctuation functions.

Periodicity and noisy oscillations (AR(2))
In addition to the linear response model we also want to study a model with oscillatory response. The autoregressive model of order two, AR(2), is defined by

X_t = a₁ X_{t−1} + a₂ X_{t−2} + η_t, (11)

where, as in the previous section, η is any type of uncorrelated noise and a₁, a₂ are the AR parameters. The variance is

σ_X² = σ_η² (1 − a₂) / [(1 + a₂)((1 − a₂)² − a₁²)], (12)

where σ_η² is the variance of the noise. Using the backshift operator, equation (11) is equivalent to (1 − a₁B − a₂B²) X_t = η_t, which we can always rewrite as

(1 − g₁B)(1 − g₂B) X_t = η_t, (13)

with real or complex roots g₁, g₂. The correlation function C(t) can be calculated from the well known Yule–Walker equations [25]. The result is C(t) = c₁ g₁^t + c₂ g₂^t, with c₁ and c₂ obtained from C(1) = a₁/(1 − a₂) and C(0) = 1. The theoretical fluctuation function for AR(2) (see supplementary material for Python code) can be expressed in terms of F_a of AR(1), because C(t) is the superposition of two AR(1) correlation functions (8),

F²(s) = c₁ F_{g₁}²(s) + c₂ F_{g₂}²(s). (14)

If g₁, g₂ are complex numbers with positive real part, equation (11) is the discrete version of a damped driven harmonic oscillator and the period length T is given by [25]

T = 2π / arccos(a₁ / (2√(−a₂))). (15)

We want to concentrate on this scenario, as oscillations are a fundamental feature of many real world systems, and equation (11) is the simplest way to model them.
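The root decomposition and the period formula T = 2π/arccos(a₁/(2√(−a₂))) can be checked directly: for complex roots g = r·e^{±iθ} of equation (11) we have a₁ = 2r cos θ and a₂ = −r², so the formula inverts exactly. The helper names below are ours.

```python
import numpy as np

def ar2_from_period(T, r):
    """AR(2) coefficients for complex roots g = r*exp(±i*2*pi/T):
    a1 = 2 r cos(2*pi/T), a2 = -r^2."""
    theta = 2 * np.pi / T
    return 2 * r * np.cos(theta), -r**2

def period_from_ar2(a1, a2):
    """Period of the damped oscillation: T = 2*pi / arccos(a1 / (2*sqrt(-a2)))."""
    return 2 * np.pi / np.arccos(a1 / (2 * np.sqrt(-a2)))

a1, a2 = ar2_from_period(12.0, 0.95)   # e.g. a yearly cycle in monthly data
print(round(period_from_ar2(a1, a2), 6))   # → 12.0
```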
A very prominent example is the seasonal cycle present in all kinds of climate data. The fluctuation function of a signal with a sinusoidal trend was presented in [12]. The problems it causes for DFA are known and there are several methods to deal with it [27]. In figure 5 (left), we show that sinusoidal signals without damping can be fitted with AR(2) models, just as we did for AR(1), and the period length can be determined in this way with low uncertainty. Note that, in contrast to the plain sinusoidal signal, the fluctuation function (14) would yield α=0.5 in the asymptotic limit of long times. This means that the fit in figure 5 (left) only works for a finite time range, which depends on the s-range of the analysis.
There are other frequency-mode phenomena in nature that do not produce clear sinusoidal signals, due to stronger perturbing effects and superposition with other modes; a very prominent one is the already mentioned ENSO [28]. The question whether or not there are frequency modes, and which frequencies are dominant, is still not settled [29]. Here we use the fluctuation function of the AR(2) model in order to find the dominant frequency of ENSO. Since intraseasonal climate variability is present in T_Nino and we only want to examine the El Niño oscillation itself, we average over a longer time period (T_Nino in figure 6 (left)). In figure 5 (right) we see that we can obtain a good fit of the AR(2) model to F(s) of the 11-month average of ENSO. This method cannot reveal more than one dominant frequency mode; however, the goodness of fit suggests that this is enough, and that other effects can be neglected on this timescale. We obtain a period length of 3.3 years, which is slightly below the results of [30] for the low-frequency mode.
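To illustrate this kind of period estimation on synthetic data instead of T_Nino, the sketch below (our own; the period and damping values are hypothetical, and we estimate a₁, a₂ via the Yule–Walker equations rather than by fitting F(s) as done in the paper) simulates a noisy AR(2) oscillation and inverts the period formula.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200000
T_true, r = 40.0, 0.97              # hypothetical period (in time steps), damping
theta = 2 * np.pi / T_true
a1, a2 = 2 * r * np.cos(theta), -r**2

x = np.zeros(n)
eta = rng.normal(size=n)
for i in range(2, n):
    x[i] = a1 * x[i-1] + a2 * x[i-2] + eta[i]

# Yule-Walker estimates of a1, a2 from the sample autocorrelations r1, r2
x0 = x - x.mean()
c0 = np.dot(x0, x0)
r1 = np.dot(x0[1:], x0[:-1]) / c0
r2 = np.dot(x0[2:], x0[:-2]) / c0
a1_hat = r1 * (1 - r2) / (1 - r1**2)
a2_hat = (r2 - r1**2) / (1 - r1**2)

T_hat = 2 * np.pi / np.arccos(a1_hat / (2 * np.sqrt(-a2_hat)))
print(round(T_hat, 1))   # close to T_true = 40
```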
Similar to the seasonal cycle, ENSO also affects F(s) of temperature time series [31]. Around the ENSO period we see a crossover in GMST for monthly data in figure 4 (right). The yearly analysis is not affected, because the ENSO period is too short. We now subtract from the GMST time series the 11-month averaged ENSO time series with a tuned prefactor, which is compatible with direct least-squares fits. In figure 6 we see that F(s) of the resulting signal is significantly less curved than the original GMST fluctuation function; hence 'El Niño detrending' works similarly to removing the seasonal cycle. In the ideal case, where the resulting signal T − T_Nino is independent of T_Nino, the two fluctuation functions add up to the fluctuation function of GMST according to equation (3).

Discussion
With these examples we have shown that, if we use all the advantages of DFA, we have a new tool in the box of time series analysis. These advantages are present especially for short datasets or datasets with trends, where a direct measurement of the correlation function is hopeless. We can fit the fluctuation functions of AR(1) and AR(2) models to DFA results for time series and learn the relaxation time and oscillatory modes, even if they are hidden under white noise and even if they are long compared to the total length of the dataset (up to one quarter of it). The strength of this method, compared to conventional methods, is that together with the parameter estimation it already yields a validation of the appropriateness of a linear model. Here we have shown some examples for climate data; however, there may be applications in many other fields too. Note that the advantage of DFA, namely its smoothness, might also be a disadvantage when we are interested in details and not only the dominant timescale. Those details might not be visible due to the superposition principle of DFA. The method also does not tell us anything about the second moment of the distribution (see figure 2), which is also associated with anomalous diffusion. Still, understanding fluctuation functions for as many systems as possible certainly helps to distinguish between different correlation types.