On Studying Relations between Time Series in Climatology

6 Relationships between time series are often studied on the basis of crosscorrelation coefficients 7 and regression equations. This approach is generally incorrect for time series irrespective of the 8 crosscorrelation coefficient value because relations between time series are frequency-9 dependent. Multivariate time series should be analyzed in both time and frequency domains, 10 including fitting a parametric (preferably, autoregressive) stochastic difference equation to the 11 time series and then calculating functions of frequency such as spectra and coherent spectra, 12 coherences, and frequency response functions. The example with a bivariate time series 13 'Atlantic Multidecadal Oscillation (AMO)  sea surface temperature in Niño area 3.4 (SST3.4)' 14 proves that even when the crosscorrelation is low, the time series' components can be closely 15 related to each other. A full time and frequency domain description of this bivariate time series 16 is given. The AMO  SST3.4 time series is shown to form a closed feedback loop system with 17 a two-year memory. The coherence between AMO and SST3.4 is statistically significant at 18 intermediate frequencies where the coherent spectra amount up to 55% of the total spectral 19 densities. The gain factors are also described. Some recommendations are offered regarding 20 time series analysis in climatology. 21

Thank you for your suggestion.
The article by Guez et al (2013) does not deal with time series analysis methods (at least, explicitly), uses cross-correlation coefficients, and its objects of study do not include the Atlantic Multidecadal Oscillation.Its "… aim is to follow climate dynamics and to investigate the temporal stability of their structure" (p.68006-p1), which is not a goal in my article.The data consists of daily values of air temperature anomalies "(actual values minus the climatological averaged over the years for each day)" on a spatial grid, not of mean annual values as in my article.The authors do not analyze any time series in the frequency domain nor does the article contain time series models in the form of stochastic difference equations.Thus, it has no direct bearing on either the mathematical methods described in my article (parametric multivariate time series analysis), upon the time scales (climate), or upon the geographical objects (no AMO data).
The article by Ludescher et al ( 2013) deals with spatial gridded data in the Niño area 3.4 (that is, SST3.4).Similar to the Guez et al article, it is based on the climate network approach, not on time series analysis approach used in my article.It uses crosscorrelation between nodes of the grid (random variables analysis) within the area (no teleconnections through oceans as in my article) and deals with forecasting El Niño events (not a subject in my article).Again, no overlap with my article either in methods of analysis or the goals.
With all due respect, the articles that you recommend are not relevant to my article.
Referring to them in my article would be inappropriate. what is the optimal time-domain stochastic model for a given multivariate time series? which components of the time series could be regarded as inputs and outputs of respective climatic system?

List of corrections in the article
 is there any interaction between the inputs and the outputs (are there any closedfeedback loops within the system)?
 what are the statistical properties of the multivariate time series in the time and frequency domains?
In this article, we will apply the methods first developed in theory of information (Gelfand and Yaglom, 1956), time series analysis (Bendat and Piersol, 1966;Box and Jenkins, 1970), econometrics (Granger and Hatanaka, 1964;Granger, 1969), and in geophysics (Robinson, 1967;Robinson and Treitel, 1980) to study relations between multivariate time series of climatic data; the goal is to describe time series in the time and frequency domains, including climatic teleconnections that can hardly be found within the correlation/regression approach.

Elements of multivariate time series analysis
Note first that the linear correlation/regression approach as a means of studying relations between scalar time series, including teleconnections within the climatic system, is generally inapplicable to time series analysis.The simplest example given in Privalsky and Jensen (1995) and repeated in Emery and Wilson (2004) is a zero crosscorrelation coefficient between two strictly linearly connected white noise sequences, one of which is obtained by applying a shift operator to the other.A low correlation coefficient may occur between any time series related to each other through more complicated but still strictly linear transformations.In particular, it can be a time series and its first difference, or any autoregressive-moving average (ARMA) time series and its innovation sequence, or the time series at the input and output of a linear filter.The general statement is that if a time series is obtained from another time series through a strictly linear inertial transformation the correlation coefficient between them will not be equal to 1 in spite of the strictly linear dependence between them.
Relations between two time series (say, A and B), which are not mutually independent, always correspond to one of the following three situations: A affects B but not vice versa, B affects A but not vice versa, A and B affect each other (interaction).Again, the correlation/regression approach does not allow one to determine what the actual situation is.It can be done within the framework of the time-and-frequency domain analysis of multivariate time series.
The linear regression equation B = A + , where  is a constant and  is a white noise sequence (regression error) means that the spectra sA(f), sB(f) of the time series A and B are identically shaped because sA(f) =  2 sB(f) + 2   (where f is frequency and 2   the regression error variance).This result is irrelevant when A and B are time-invariant random variables but if A and B are time series, it puts an unnecessary limitation upon their properties.In the general case, the shapes of the spectra are not identical, which would mean that  is not white noise thus making the regression equation inadequate.This is another reason why both the crosscorrelation coefficient and respective linear regression equation cannot describe relations between time series.
The problem is solved if one uses methods of time series analysis including simultaneous description of multivariate time series in the time and frequency domains.It means fitting a stochastic difference equation to the time series, analyzing its properties in the time domain and then calculating and analyzing functions that describe the time series in the frequency domain.
For a number of considerations (see below), the approach used here will be limited to the autoregressive (AR) case.Also, we will be regarding only the bivariate case.The extension to higher dimensions is rather simple (e.g., Bendat and Piersol, 1966;Robinson, 1967) and will be briefly described at the end of this section.
Let the bivariate time series 1, 2, [ , ] be a (zero mean) sample record of an ergodic discrete-time random process; here n = 1, …, N is the dimensionless argument, N∆t time series length in time units ∆t, and the strike means matrix transposition.The sampling interval ∆t is supposed here to be equal to 1.
In the time domain, the time series is described with a stochastic difference equation where p <  is the order of autoregression, are matrix AR coefficients, and is a bivariate innovation sequence (white noise) with a covariance matrix If the white noise an is stationary, the AR coefficients are not time-dependent, and the roots of the characteristic equation corresponding to Eq. ( 1) lie outside the unit circle, the bivariate time series (1) presents a sample of a stationary linearly regular random process.
The order p of an optimal AR model that agrees with the observed time series xn must be chosen on the basis of quantitative statistical considerations.Probably, the most efficient way to select an optimal order is to use order-selection criteria such as Akaike's AIC, Schwarz-Rissanen's BIC, Parzen's CAT, and Hannan-Quinn's  (e.g., Box et al., 2008;Parzen, 1977;Hannan and Quinn, 1979).
Properties of the time series xn in the frequency domain are defined with the spectral matrix which is obtained through a Fourier transform of stochastic difference equation ( 1) fitted to the time series xn.Here f is frequency in cycles per sampling interval (in our case, year -1 ), s11(f), s22(f) are spectra and s12(f), s21(f) complex-conjugated cross-spectra of the time series components x1,n and x2,n.In particular, the coherence function The importance of the coherence function in time series analysis and modelling is illustrated with the following property.If the components of an ergodic bivariate time series present processes at the input and output of any linear time-invariant system, the coherence between them will be equal to 1 at all frequencies where the spectral density is not too close to zero.
The spectral matrix (4) describes a linear stochastic system with the time series x2,n and x1,n as the system's input and output, respectively.The coherent spectral density, or coherent spectrum ( ) / ( ) shows in what manner the spectral energy s22(f) of the input x2,t is transformed into the spectral energy s11(f) of the output x1,t as well as the phase difference between them (the gain factorG12(f)and phase factor 12(f), respectively).The spectral characteristics calculated on the basis of Gaussian AR models with properly selected autoregressive orders satisfy the requirements of the maximum entropy method in spectral analysis.This is one of the reasons for selecting the AR modelling.
In the general case of an M-variate time series 1, , [ ,..., ] , the time domain model is still given by Eq. ( 1) with the matrix AR coefficients and with the innovation sequence Consequently, the spectral matrix (4) changes to where sii(f) and sij(f) are spectral (if i = j) and cross-spectral (if i  j) densities, respectively, of the time series components xi,n, i = 1, …, M. The elements of the spectral matrix (8) are used to calculate spectra, multiple and partial coherence functions, multiple and partial coherent spectra, and M-1 frequency response functions (see Bendat and Piersol, 2010).The spectral matrix (8) is Hermitian.
It should be noted that if the multivariate time series is long (by orders of magnitude longer than the largest time scale of interest) and if the spectra of its components are intricate, the above-described approach may not be the best, especially in the time domainbecause of the high order of the optimal stochastic difference equation.In such cases the analysis may have to be limited to a frequency domain description of the time series by using the nonparametric (e.g., Percival andWalden, 1993, Bendat andPiersol, 2010) approach.However, this can hardly happen if one is interested in properties of the time series at climatic time scales.

An example of bivariate time series analysis
The example with actual climatic data given below proves that the components of a bivariate time series can be connected to each other in spite of the fact that the crosscorrelation coefficient between them is low.It also provides a simultaneous description of a climatic system in both time and frequency domains.The case with a high crosscorrelation coefficient between the components of the ENSO system (Southern Oscillation Index and SST variations in the Niño area 3.4) has been treated in detail in Privalsky and Muzylev, 2013 where it was shown, in particular, that both time series are close to white noise, interact with each other mostly through the innovation sequence, and that the coherence function, coherent spectra, and the frequency response functions between SOI and SST are frequency dependent.
The El Niño−Southern Oscillation (ENSO) system is believed to affect many phenomena in the Earth climate (e.g., Philander, 1990).We will construct an autoregressive model of the bivariate time series  Morice et al., 2011), for the same time interval from 1876 through 2014 (Fig. 1a).
As seen from the fiugure, the two components behave in a different manner: AMO contains much stronger low-frequency variations than SST3.4.The correlation between AMO and SST3.4 (Fig. 1b) is very low, with the crosscorrelation coefficient 0.06 and the maximum absolute values of the crosscorrelation function below 0.26.With the crosscorrelation-based approach that prevails in climatology, the conclusion would have to be that the two scalar time series are either not related to each other at all or that the connection between them is very weak.And it would not be correct.
Consider first the time domain properties.All four above-mentioned order selection criteria selected the same order p = 2.The respective AR(2) model is with the innovation covariance matrix 0.014 0.007 0.007 0.290 All coefficients in Eq. ( 9) are statistically significant at a confidence level of 0.95.
Obviously, Eq. ( 9) describes a closed feedback loop system: AMO (x1,n) depends upon two of its previous values and upon two previous values of SST3.4 (x2,n) and SST3.4 depends upon two previous values of both SST3.4 and AMO.The eigen-frequency of this system was found to be 0.24 year -1 so that respective period of about 4 years is the time required to close the system's feedback loop.
The stochastic difference equations ( 9) and the innovation sequence covariance matrix (10) allow one to understand how much of the variances of AMO and SST3.4 can be explained by the "deterministic" components of the model ( 9) that describe the dependence of x1,n (AMO) and x2,n (SST3.4)upon their own past values and upon past values of the other time series.The variance of AMO 2 1   0.035(C) 2 while, according to Eq. ( 10), the variance R11 of the innovation sequence a1,n is 0.014(C) 2 .Therefore, the part of the AMO variance 2 1  which cannot be explained by the dependence of the time series upon their past behavior is 0.014/0.035 0.4.Thus, Eq. ( 9) allows one to explain about 60% of the AMO variance by its dependence upon its own past values and upon the past values of SST3.4.
Both AMO and SST3.4 time series are Gaussian so that their autoregressive spectral estimates satisfy the requirements of the maximum entropy spectral analysis.The AMO spectrum s11(f) quickly decreases with frequency, which is characteristic of spatially-averaged climatic processes.Such behavior of the spectrum means that AMO is strongly dependent on its past values.The SST3.4 spectrum s22(f) is not monotonic, has a wide maximum at intermediate frequencies and, in general, does not differ much from a white noise spectrum.The dependence on its past values is weak.
Though the crosscorrelation coefficient between AMO and SST3.4 is very low, the maximum entropy estimate of coherence Co12(f) shown in Fig. 2b is statistically significant everywhere except at the frequencies below 0.13 year -1 and above 0.44 year -1 .It is bell-shaped and exceeds 0.6 within the frequency band from 0.20 year -1 to 0.36 year -1 , that is, roughly, at time scales between 3 and 5 years.Its maximum value is 0.74 at f  0.27 year -1 .This behavior of the coherence function means that the components of this bivariate time series contribute to each other up to about 55% of the spectral density at intermediate frequencies.
The coherence between AMO and SST3.4 is weak at the low-frequency end, where AMO's spectral energy is much higher than elsewhere.The high coherence occurs at intermediate frequencies where the spectral density of AMO is much lower.The strong dependence of AMO on its past values and the relative closeness of the SST3.4 spectrum to a constant seem to be the reasons why the stochastic model ( 9) can explain so much of the total AMO variance and less of the total SST3.4 variance.
The contribution of SST3.4 to the AMO spectrum is where s11(f) is the AMO spectrum (Fig. 3a).Respective contribution of AMO to SST3.4 shown in Fig. 3b is . These coherent spectrum estimates are statistically significant within the frequency band from 0.18 year -1 to 0.38 year -1 where they amount to 25%  55% of the spectral densities s11(f) and s22(f).This is a substantial contribution but it occurs within the frequency band where the spectral density of AMO is at least an order of magnitude smaller than at lower frequencies where the coherence between AMO and SST3.4 is not significant.The crosscorrelation coefficient that "integrates" the coherence function is small in spite of the relatively close connection between the two processes at moderate frequencies for at least two reasons: the complex structure of the interdependence between the time series components expressed by Eqs. ( 9) and (10) which cannot be described with a linear regression equation and the low absolute contribution of SST3.4 to the AMO variance.
This example also shows that using proper methods of analysis allows one to avoid filtering of time series in order to suppress 'noise'.Indeed, though the low-frequency variations dominate the spectrum of AMO, the coherence function has revealed the 'signal'  a teleconnection between AMO on SST3.4 at intermediate frequencies where the AMO spectrum is low.This is another useful property of autoregressive time and frequency domain models.
Our frequency-domain results generally agree with the earlier results by Park and Dusek (2013) regarding the connection between AMO and the Multivariate ENSO Index (MEI) at intermediate frequencies.The authors used a nonparametric spectral estimationsingular spectrum analysis keeping 10 first empirical orthogonal functions that cover slightly over 75% of the time series total variances.At frequencies above 0.15 year -1 , our estimate of coherence is quite similar to the coherence estimate in Park and Dusek (2013).However, the authors did The phase factors in this case cannot give explicit information about the AMO -SST3.4system because its feedback loop is closed (interaction between AMO and SST3.4).We cannot compare our spectra with those shown in Park and Dusek (2012) because their spectral estimates are given without confidence bounds but generally the shapes of the spectra at frequencies below 0.5 year -1 seem to be rather similar.

Conclusions and comments
1. Relations between time series should not be studied on the basis of crosscorrelation coefficients and regression equations.An efficient approach within the framework of time series analysis includes two stages both involving parametric (preferably, autoregressive) modelling:  fitting a stochastic difference equation to the time series (time domain), analyzing the selected model, and  using the fitted equation to calculate and analyze frequency domain characteristics (spectra and coherent spectra, coherence function(s), gain and phase factor(s)).
This two-pronged approach is little known in climatology and related sciences.
2. Methods of multivariate time series analysis should be used in all cases, irrespective of the value of the crosscorrelation coefficient.The crosscorrelation and regression coefficients do not generally describe relations between time series.In particular, a low crosscorrelation coefficient does not necessarily mean the lack of even a strictly linear dependence between the time series.
3. The stochastic difference equation ( 9) with the innovation sequence covariance matrix (10) shows quantitatively that AMO and SST3.4 interact with each other so that AMO and SST3.4 can be regarded as either inputs or outputs to the AMO -SST3.4system.
It also reveals that the system's memory extends for two years.The dependence of AMO and SST3.4 upon their own past and upon the past of the other time series explains about 60% and 25% of the AMO and SST3.4 variances, respectively.
4. The frequency domain analysis of the system shows that the spectra of AMO and 6.The coherent spectra AMO -SST3.4 and SST3.4 -AMO are ststistically significant at frequencies from 0.18 to 0.38 year -1 contributing 25%-55% of the spectral densities of SST3.4 and AMO to each other.The gain factors in the band behave in a manner similar to the behavior of the coherennt spectra and peak at about 0.1 and 4.0, respectively.
These conclusions provide answers to the questions formulated at the introduction to this work.
Time series can often be treated as Gaussian.The ability to use a Gaussian approximation is important because for such time series the nonlinear approach cannot give better results than what is obtained within the linear approximation.This latter statement holds, in particular, for time series extrapolation within the framework of the Kolmogorov-Wiener theory.Also, as shown by Choi and Cover (1984), the random process that maximizes the entropy rate under constrains on p first values of the correlation function is a Gaussian autoregressive process of order p.
Many time series, especially at climatic time scales, are short, that is, their length does not exceed the time scales of interest by orders of magnitude.Therefore, the non-parametric analysis in the frequency domain may not be efficient because, with short time series, it would produce less reliable results.Besides, the nonparametric approach does not allow one to obtain explicit stochastic models in the time domain.(These are two more reasons to prefer the parametric modeling.) A parametric (first of all, autoregressive) analysis in time and frequency domains is effective because it results in relatively accurate estimates due to the postulation of a stochastic model for the time series.In particular, the frequency domain estimates obtained with the properly selected order satisfy, in the Gaussian case, the requirements of the maximum entropy spectral analysis.However, it is not correct to say that any autoregressive spectral estimate has this important property.The number of parameters to be estimated should always be much smaller than the time series length.If, for example, the length Nt of a bivariate time series is 128 years, one should hardly expect statistically reliable results for models with AR orders higher than 5 (20 AR coefficients plus 3 elements of the noise covariance matrix to be estimated).This is one of the reasons why it is not possible to make any realistic conclusions about the presence of 60-70 years long "periods", "cycles", "oscillations", or about any other features at such large time scales unless the time series is at least 300 -400 years long.For parametric models, it is strongly recommended to determine the model's order and, consequently, the number of parameters to be estimated, on the basis of order selection criteria; an improper selection of the order invalidates the results of analysis.
Page ## are shown in accordance with this document 1. Page 8, line 5: two references added in accordance with Prof. Gluhovsky's recommendation 2. Page 10, lines 2-4: the paragraph removed in accordance with Prof. Piterbarg's recommendation 3. Page 18, lines 14-17: two references added to the list of references in accordance with Prof. Gluhovsky's recommendation 4. Page 20, lines 7-10: caption to Figure 2 is corrected in accordance with Prof. Piterbarg's recommendation 5. Page 21, line 3: caption to Figure 3 is corrected.1IntroductionStudying relations between time series on the basis of observations is a common task in all branches of Earth sciences.Normally, it requires getting quantitative answers to the following questions: ) describes the linear dependence between the time series components x1,n and x2,n in the frequency domain.It can be thought of as a frequency-dependent set of correlation coefficients between components of a bivariate time series.It is the coherence function Co12(f) (and not the crosscorrelation coefficient) that describes the degree of linear dependence between two scalar time series.Values of Co12(f) satisfy the condition 0  Co12(f)  1.
of the output spectrum s11(f) that can be explained by the linear dependence between x1,n and x2,n.describes the share of the spectrum s22(f) defined by the contribution of x1,n to x2,n.Finally, the complex-valued frequency response function 12 () G f  12 22 not estimate the frequency response function (FRF) because, according to them, "a physicallybased transfer function is likely [to be] nonlinear".Actually, a nonlinear theoretical model of the FRF between AMO and SST3.4 time series would have been in disagreement with observations because both time series are Gaussian.At frequencies from 0.1 year -1 to 0.4 year - 1 , the gain factors of the empirical FRFs AMO -SST3.4 and SST3.4 -AMO (Fig.4) behave in the same manner as the coherent spectra shown in Fig.3and their values at intermediate frequencies amount to approximately 0.1 and 4, respectively.The coherent spectra and gain factors are shaped similarly because the coherence function is bell-shaped and the AMO and SST3.4 spectra are rather flat at intermediate frequencies.

SST3. 4
behave in a different manner, with the AMO spectrum decreasing fast with frequency and with a relatively flat SST3.4 spectrum.5.In spite of the very low crosscorrelation coefficient between the time series of AMO and SST3.4,a close linear dependence between them was shown to exist at intermediate frequencies corresponding to time scales from 3 to 5 years.The coherence between AMO and SST3.4 is statistically significant in a wide frequency band centered at 0.26 year -1 where the coherence peaks at 0.74.This result has been obtained earlier byPark and Dusek (2013) for a similar climatic systm.

Figure 3 .
Figure 3. Coherent spectra: contribution of SST3.4 to the AMO spectrum (a), and contribution of AMO to the SST3.4 spectrum (b).Dashed, and grey and blue lines show approximate 90% confidence bounds and AMO and SST3.4 spectra, respectively.