Forecasting abrupt changes in foreign exchange markets: method using dynamical network marker

We apply the idea of dynamical network markers (Chen et al 2012 Sci. Rep. 2 342) to foreign exchange markets so that early warning signals can be provided for any abrupt changes. The dynamical network marker constructed achieves a high odds ratio for forecasting these sudden changes. In addition, we also extend the notion of the dynamical network marker by using recurrence plots so that the notion can be applied to delay coordinates and point processes. Thus, the dynamical network marker is useful in a variety of contexts in science, technology, and society.


Introduction
There are many situations where it is desirable to forecast when sudden changes in foreign exchange markets might occur because their changes influence the fundamentals of economies for many countries. Although methods of providing early warning signals for the markets have been proposed [1,2], these methods have some limitations because they assume that their models remain valid even when we face sudden changes. However, these assumptions are too strong because sudden changes might be due to bifurcations that change the qualitative characteristics of the underlying economic systems.
Chen et al [3] proposed an index, called a dynamical network marker, for providing early warning signals that forecast qualitative changes accompanying bifurcations. In this paper, we modify the construction of the dynamical network marker to create an index for a dynamical network marker from a single time series. Then we apply the dynamical network marker to a time series of foreign exchange markets. Moreover, we extend the dynamical network marker using recurrence plots [4,5] so that the dynamical network marker can be applied to a set of delay coordinates and point processes.
The rest of this paper is organized as follows. In section 2, we briefly review the dynamical network marker. In section 3, we modify the construction of the dynamical network marker so that we can apply the index for a dynamical network marker to a single time series. In section 4, we extend the dynamical network marker using recurrence plots with delay coordinates [6,7] or distances for point processes [8,9]. In section 5, we discuss the results and conclude this paper.

Dynamical network marker
The dynamical network marker proposed by Chen et al [3] provides an early warning signal for a change in a network system in which a bifurcation is involved. Let $G = (V, E)$ represent a graph with a set of nodes $V$ and a set of edges $E$, for which we now consider all-to-all connections between every two nodes of $V$. We then divide the nodes into two groups: $V_l \subset V$ is a subset of nodes which we call the leading network; the remaining subset $V_o = V \setminus V_l$ contains the nodes that are outside of the leading network. If a network system is getting close to a bifurcation point, then the standard deviations for the temporal changes of the nodes in $V_l$ increase; the correlation coefficients between two nodes in $V_l$ also increase; each correlation coefficient between a node in $V_l$ and another node in $V_o$ decreases to 0. Let $\sigma$ be the mean of the standard deviations for the temporal changes of the nodes in the leading network. Denote by $C_l$ the mean of the correlation coefficients between two nodes in $V_l$. We also use $C_o$ for the mean of the correlation coefficients between the nodes in $V_l$ and the nodes in $V_o$. Then, an index for the dynamical network marker is defined as

$$I = \frac{\sigma C_l}{C_o}. \qquad (1)$$

In the paper by Chen et al [3], the authors calculated $\sigma$, $C_l$, and $C_o$ for a set of samples for each of the three conditions, normal, pre-disease, and disease states, and showed that $I$ is significantly larger for the pre-disease state, where a transition related to a bifurcation is going on, than for the other two states.
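As a rough illustration of how the three ingredients combine, the following Python sketch computes the ratio $\sigma C_l / C_o$ for a chosen leading network. It is a minimal sketch under stated assumptions, not the implementation of [3]: the function names (`dnm_index`, `pearson`, `sample_std`), the dictionary layout, and the toy data are hypothetical; correlation coefficients are taken in absolute value; and the standard deviation is computed directly on each node's samples.

```python
import math
from itertools import combinations, product


def sample_std(x):
    """Sample standard deviation of one node's values."""
    m = sum(x) / len(x)
    return math.sqrt(sum((a - m) ** 2 for a in x) / (len(x) - 1))


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def dnm_index(series, leading):
    """Return sigma * C_l / C_o for the nodes named in `leading`.

    `series` maps a node name to its list of samples (hypothetical layout).
    `leading` needs at least two nodes and must leave at least one node outside.
    Absolute correlations are used here, which is an assumption of this sketch.
    """
    lead = sorted(leading)
    outside = sorted(set(series) - set(lead))
    sigma = mean(sample_std(series[n]) for n in lead)
    c_l = mean(abs(pearson(series[a], series[b])) for a, b in combinations(lead, 2))
    c_o = mean(abs(pearson(series[a], series[b])) for a, b in product(lead, outside))
    return sigma * c_l / c_o


# Toy data: "a" and "b" fluctuate strongly together, "c" is only weakly related.
toy = {
    "a": [0, 2, 4, 6, 8, 10],
    "b": [1, 3, 5, 7, 9, 12],
    "c": [1, -1, 1, -1, 1, -1],
}
print(dnm_index(toy, {"a", "b"}))  # strongly coupled pair as leading network
print(dnm_index(toy, {"a", "c"}))  # weakly coupled pair as leading network
```

In this toy case the strongly coupled pair gives the larger value, which is the qualitative behaviour the index is meant to capture near a bifurcation.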

Modification for a single time series
We modify the index $I$ so that we can obtain an index for a dynamical network marker from a single time series generated from a network system, especially foreign exchange markets. Suppose that a $g$-dimensional time series $x(s)$ of the observed part of the state is given and that each variable $x_m(s)$ of the time series $x(s)$ corresponds to the price of one currency against another at time $s$. In addition, we suppose that there is an unobserved part $\bar{x}(s)$ of the state. Then, the system we consider evolves jointly in $x(s)$ and $\bar{x}(s)$, each driven by its own dynamical noise, $W(s)$ and $\bar{W}(s)$, either of which might be zero. We regard the unobserved part $\bar{x}(s)$ as a bifurcation parameter and apply the idea of the dynamical network biomarker.
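Let $w$ be the window size. Given a leading network $V_l$, the index $I$ is also calculated for the window $(w(t-1), wt]$, which we denote by $I_t(V_l)$. Namely, for a leading network $V_l$, we define $I_t(V_l)$ mathematically as follows:

$$I_t(V_l) = \frac{\sigma_t C_{l,t}}{C_{o,t}},$$

where $\sigma_t$, $C_{l,t}$, and $C_{o,t}$ are the mean standard deviation and the two mean correlation coefficients of section 2, evaluated over that window for $V_l$. Using the first part of the time series, we choose the leading network $\hat{V}_l$ and the thresholds $\hat{\theta}_I(\hat{V}_l)$ and $\hat{\theta}_c(\hat{V}_l)$ that maximize the chi-squared value between threshold crossings of $I_t(V_l)$ and of the subsequent change $c_{t+1}$, and we use them for forecasting the abrupt changes that happen in the second part of the time series. We write the chi-squared value obtained for the second part of the time series using $\hat{V}_l$, $\hat{\theta}_I(\hat{V}_l)$, and $\hat{\theta}_c(\hat{V}_l)$ by $\chi^2$.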
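The selection and evaluation step can be sketched as follows in Python. This is an illustration under assumptions rather than the procedure actually used: it assumes that an alarm is raised when the index exceeds a threshold `theta_i`, that an abrupt change is counted when the subsequent change measure exceeds `theta_c`, and that the chi-squared value is Pearson's statistic for the resulting 2×2 table without continuity correction. All function names and the toy numbers are hypothetical.

```python
from itertools import product


def contingency(index_values, changes, theta_i, theta_c):
    """2x2 table of counts: rows = alarm (no/yes), columns = abrupt change (no/yes).

    index_values[t] is paired with changes[t], which stands for the change
    observed in the window that follows window t.
    """
    table = [[0, 0], [0, 0]]
    for i_t, c_next in zip(index_values, changes):
        table[int(i_t > theta_i)][int(c_next > theta_c)] += 1
    return table


def chi_squared(table):
    """Pearson chi-squared statistic of a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def odds_ratio(table):
    """Sample odds ratio; infinite when an off-diagonal cell is empty."""
    (a, b), (c, d) = table
    return float("inf") if b * c == 0 else (a * d) / (b * c)


def select_thresholds(index_values, changes, grid_i, grid_c):
    """Grid search for the threshold pair with the largest chi-squared value."""
    return max(
        product(grid_i, grid_c),
        key=lambda th: chi_squared(contingency(index_values, changes, th[0], th[1])),
    )


# Toy training data: large index values precede large changes.
idx = [0.1, 0.2, 5.0, 6.0]
chg = [0.0, 0.0, 1.0, 1.0]
theta_i, theta_c = select_thresholds(idx, chg, [0.15, 1.0, 5.5], [0.5])
table = contingency(idx, chg, theta_i, theta_c)
print(theta_i, theta_c, table, chi_squared(table), odds_ratio(table))
```

In practice the thresholds would be fitted on the first part of the series and the table then recomputed on the second part. Note that the sample odds ratio computed here can differ slightly from the conditional estimate reported by Fisher's exact test in R.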
First, we tested the index $I$ with a coupled map lattice [10], for which we set $a = 3.8$, $\varepsilon = 0.05$, $\eta = 0.01$, and $N = 100$. We generated a series of length 10 000. A part of the time series is shown in figure 1. We assumed that we could observe $y_n(s)$ $(n = 1, \ldots, 5)$. We calculated the index $I_t$ every 10 points to forecast the abrupt changes within the next 10 points ($w = 10$). From the first half of the dataset, we found that the leading network containing the nodes $y_1$, $y_2$, and $y_5$ is most significant (figure 2). We present the scatter plot relating $\log(I_t + 1)$ to $c_{t+1}$ for this example.

Figure 2. Leading network selection for the example of the coupled map lattice using equation (4). Labels on the horizontal axis show sets of nodes for leading networks. For example, '1, 2' means the leading network consisting of $y_1$ and $y_2$. In this example, the most significant leading network was the one with $y_1$, $y_2$, and $y_5$.

Figure. Scatter plot for the example of the coupled map lattice. The gray dash-dotted line is the linear fitting for the scatter plot.

Second, we applied the index $I_t$ to a Lorenz'96 model [11,12] in a 240 dimensional phase space with periodic boundary conditions. We assumed that several of the variables $v_{n,1}$ can be observed every 0.01 unit times to generate a time series of 20 000 time points and that we forecast their abrupt changes for the following time period of 0.1 unit times every 0.1 unit times using the idea of the dynamical network marker ($w = 10$). An example of this time series is shown in figure 4. By maximizing the chi-squared value for each potential leading network over $\theta_I(V_l)$ and $\theta_c$, we obtained the most significant leading network of nodes $v_{4,1}$ and $v_{5,1}$ (see figure 5). Then, we forecasted the abrupt changes for the second half of the time series. The scatter plot relating $\log(I_t + 1)$ to $c_{t+1}$ is shown in figure 6. These quantities were correlated with a correlation coefficient of 0.13 and a p-value of $4.7 \times 10^{-5}$. The forecasting results are summarized in table 2. We can see that the value of $c_{t+1}$ tends to be higher when $I_t$ is higher. These forecasts were significantly associated with the outcomes of the abrupt changes.
Third, we tested our method using the datasets of foreign exchange markets. The time period was 88 weeks between 7 May 2007 and 28 December 2008. We used the pairs of the euro (EUR)/United States dollar (USD), USD/Japanese yen (JPY), USD/Swiss franc (CHF), EUR/JPY, and EUR/CHF. We took a moving average of the exchange price every five minutes for each pair. We removed weekends by connecting 12 am on Saturdays with 1 am on the following Mondays. Thus, there are 125 664 time points in this dataset.
An example of this time series is shown in figure 7. We calculated the index $I_t$ every hour to predict the abrupt changes that might happen in the following hour ($w = 12$). Hence there are 5235 instances to be predicted. By maximizing the chi-squared value, we obtained the most significant leading network comprising the USD/CHF, EUR/JPY, and EUR/CHF pairs (figure 8). Then we tried to forecast the abrupt changes for the following part of the datasets. The quantities were correlated with a correlation coefficient of 0.74 and a p-value of less than $10^{-300}$ (figure 9). We found that we tended to see a larger change in $c_{t+1}$ when $I_t$ was larger (table 3). Thus, these forecasts correlated well with the outcomes of abrupt changes.

Table 1. The results of forecasting abrupt changes for the coupled map lattice using equation (5). The table shows the numbers of the corresponding events using the most significant leading network for the second half of the dataset. The odds ratio was 47.43. The p-value obtained using Fisher's exact test in R was less than $2.2 \times 10^{-16}$. The null hypothesis is that the two classifications are independent. The same R package was used for the other tables to obtain the p-values.

Figure 5. Leading network selection for the example of the Lorenz'96 model using equation (4). Labels on the horizontal axis show sets of nodes for leading networks. For example, '1, 2' corresponds to the leading network composed of $v_{1,1}$ and $v_{2,1}$. In this example, the most significant leading network was the one with $v_{4,1}$ and $v_{5,1}$.

Figure 8. Leading network selection for the example of foreign exchange markets using equation (4). The numbers 1, 2, 3, 4, and 5 on the horizontal axis correspond to EUR/USD, USD/JPY, USD/CHF, EUR/JPY, and EUR/CHF, respectively. In this example, the most significant leading network was the one with the nodes USD/CHF, EUR/JPY, and EUR/CHF.

Figure 9. Scatter plot for the foreign exchange markets. The gray dash-dotted line is the linear fitting for the scatter plot.

Table 3. The results of forecasting abrupt changes for foreign exchange markets using equation (5). See the caption of table 1 to interpret the results. The odds ratio was 13.29. One row reads 425 and 1984 (row total 2409); the column totals are 2311 and 2646 (grand total 4957).

Extension to recurrence plots
We translate equation (5) so that we can use a similar notion for delay coordinates and point processes using recurrence plots [4,5].
A recurrence plot is originally a two dimensional visualization of time series data. Both axes show the same time axis. For a pair of times, we calculate a distance and plot a point if the distance is smaller than a threshold value. Otherwise, we do not plot a point there. Mathematically, given a time series $x_m(s)$ $(s = 1, 2, \ldots)$ and a threshold $r_m$ for system $m$, a recurrence plot for system $m$ can be defined as

$$R_m(i, j, r_m) = \begin{cases} 1, & d_m(i, j) < r_m, \\ 0, & \text{otherwise}, \end{cases}$$

where $d_m(i, j)$ is the distance between the delay coordinates of system $m$ at times $i$ and $j$, and $E$ is the embedding dimension. This simple graph carries rich information; for example, correlation entropy and correlation dimension can be calculated from recurrence plots [13,14]. It has also been shown recently that a recurrence plot contains almost all the information on the underlying dynamics, because the rough shape of the original time series can be reproduced from it if the original time series is given with a fixed sampling frequency [15,16]. Thus, recurrence plots have been used in a variety of contexts in science and technology including climate [17,18], medicine [19], and economics [20,21], to name a few. An important quantity for a recurrence plot is the recurrence rate [5], which is the proportion of plotted places. We control each threshold $r_m$ so that the recurrence rate becomes 0.2. Intuitively, such $r_m$ becomes larger when the standard deviation is larger.
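The construction described above can be sketched in a few lines of Python. The sketch makes several assumptions that the text does not fix: delay coordinates use unit lag, the distance is Euclidean, and the recurrence rate is counted over off-diagonal pairs only. The function names and the toy series are hypothetical.

```python
import math


def delay_vectors(x, E):
    """E-dimensional delay coordinates with unit lag: (x[i], ..., x[i+E-1])."""
    return [tuple(x[i:i + E]) for i in range(len(x) - E + 1)]


def distance_matrix(vectors):
    """Euclidean distance between every pair of delay vectors."""
    return [[math.dist(u, v) for v in vectors] for u in vectors]


def threshold_for_rate(dist, rate):
    """Threshold r such that roughly `rate` of the off-diagonal pairs satisfy d < r."""
    n = len(dist)
    off_diag = sorted(dist[i][j] for i in range(n) for j in range(n) if i != j)
    k = min(round(rate * len(off_diag)), len(off_diag) - 1)
    return off_diag[k]


def recurrence_plot(dist, r):
    """Binary matrix with 1 where the distance is below the threshold."""
    return [[1 if d < r else 0 for d in row] for row in dist]


def recurrence_rate(rp):
    """Proportion of plotted off-diagonal places."""
    n = len(rp)
    return sum(rp[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))


# Toy series; E = 2 mirrors the embedding dimension used in the examples.
x = [math.sin(0.37 * s) + 0.1 * math.cos(1.3 * s) for s in range(60)]
dist = distance_matrix(delay_vectors(x, E=2))
r = threshold_for_rate(dist, rate=0.2)
rp = recurrence_plot(dist, r)
print(round(recurrence_rate(rp), 3))
```

Because the threshold is read off the sorted distances, rescaling the series by a constant rescales the threshold by the same constant, which matches the remark that $r_m$ grows with the standard deviation.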
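A joint recurrence plot [22] is an extension of the recurrence plot towards bivariate analysis. A joint recurrence plot for systems $m_1$ and $m_2$ is defined as the intersection of two recurrence plots for the corresponding systems, namely,

$$J_{m_1,m_2}(i, j, r_{m_1}, r_{m_2}) = R_{m_1}(i, j, r_{m_1})\, R_{m_2}(i, j, r_{m_2}),$$

where $R_m(i, j, r_m)$ equals 1 if a point is plotted at $(i, j)$ in the recurrence plot of system $m$ with threshold $r_m$ and 0 otherwise. The important quantity for the joint recurrence plot is the joint recurrence rate [22], which is the proportion of plotted places in the joint recurrence plot. Similarly to the absolute value of the correlation coefficient, the joint recurrence rate for $J_{m_1,m_2}(i, j, r_{m_1}, r_{m_2})$ quantifies how strongly systems $m_1$ and $m_2$ are related. In the extended index, the thresholds and the joint recurrence rates within the leading network and between the leading network and the remaining nodes take the places of the standard deviation and the two mean correlation coefficients, respectively, while these new values can be obtained for delay coordinates and point processes as well. Therefore, we can expect that equation (10) can also detect early warning signals for a variety of contexts.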
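To make the analogy with the correlation coefficient concrete, here is a small Python sketch of the joint recurrence rate. It is an illustration under assumptions: scalar series without delay coordinates are used for brevity, the threshold of each series is set so that about 20% of the off-diagonal pairs recur, and the function names and toy data are hypothetical.

```python
import math
import random


def recurrence_plot(x, rate):
    """Binary recurrence matrix of a scalar series.

    The threshold is chosen so that about `rate` of the off-diagonal pairs recur.
    """
    n = len(x)
    dists = sorted(abs(x[i] - x[j]) for i in range(n) for j in range(n) if i != j)
    r = dists[min(round(rate * len(dists)), len(dists) - 1)]
    return [[1 if abs(x[i] - x[j]) < r else 0 for j in range(n)] for i in range(n)]


def joint_recurrence_rate(rp_a, rp_b):
    """Proportion of off-diagonal places plotted in both recurrence plots."""
    n = len(rp_a)
    joint = sum(rp_a[i][j] * rp_b[i][j] for i in range(n) for j in range(n) if i != j)
    return joint / (n * (n - 1))


rng = random.Random(1)
x = [math.sin(0.37 * s) for s in range(80)]
y = [2.0 * v for v in x]              # rescaled copy of x: strongly related
z = [rng.random() for _ in range(80)]  # unrelated series

rp_x, rp_y, rp_z = (recurrence_plot(s, rate=0.2) for s in (x, y, z))
jrr_xy = joint_recurrence_rate(rp_x, rp_y)
jrr_xz = joint_recurrence_rate(rp_x, rp_z)
print(jrr_xy, jrr_xz)
```

For the related pair the joint recurrence rate stays close to the individual recurrence rate, whereas for the unrelated pair it drops towards the product of the two rates.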
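Using the first part of the time series, we define the most significant leading network for the above extension similarly to the case of equation (5). Namely, we select the leading network and the thresholds that maximize the chi-squared value for the extended index, and use them to forecast the abrupt changes in the second part of the time series. The first example is the coupled map lattice above, for which we obtained an odds ratio of infinity (see table 4). Thus, the odds ratio was higher when using equation (12) than when using equation (5).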
The second example is that of the Lorenz'96 model above. In this example, we set $E = 2$. When we applied equation (12), we found that the leading network containing $v_{4,1}$ and $v_{5,1}$ was most significant (figure 12). The quantity $\log(I'_t + 1)$ and the subsequent change were correlated with a correlation coefficient of 0.13 and a p-value of $7.5 \times 10^{-5}$ (figure 13). When we forecasted the abrupt changes for the second half of the dataset, we obtained the odds ratio of 2.59 (see table 5), which was larger than that of table 2 obtained using equation (5). Therefore, our extension using equation (12) yielded better forecasts than equation (5).
Our third example is the dataset of foreign exchange markets. In this application, we set $E = 2$. Namely, the size of delay coordinates is 10 min. We found, using the first half of the dataset, that the leading network containing USD/JPY, USD/CHF and EUR/JPY was most significant (figure 14). The quantity $\log(I'_t + 1)$ and the subsequent change were correlated with a correlation coefficient of 0.73 and a p-value of less than $10^{-300}$ (figure 15). When we forecasted the abrupt changes in the second half of the dataset, we achieved the odds ratio of 33.34 (see table 6). The value is also larger than the one in table 3 obtained by equation (5).

Figure. Leading network selection for the example of the coupled map lattice using equation (11). See the caption of figure 2 to interpret the figure. The most significant leading network in this example was the one with nodes $y_4$ and $y_5$.

Figure. Scatter plot for the coupled map lattice. The gray dash-dotted line is the linear fitting for the scatter plot.

Table 4. The results of forecasting abrupt changes for the coupled map lattice using equation (12). See the caption of table 1 to interpret the results. The odds ratio was infinity because one cell of the table contained no event. One row reads 0 and 304 (row total 304); the column totals are 56 and 443 (grand total 499).

Discussions
We checked the validity of the proposed indices of equations (5) and (12) for the real dataset of the foreign exchange markets by using surrogate data analysis [23-25]. First, we applied random shuffle surrogates [23] to generate 200 surrogate datasets that do not have temporal correlation but preserve the spatial distribution. We found that the chi-squared values obtained by equations (5) and (12) from the original time series were larger than those obtained from the random shuffle surrogates (see figures 16 and 17). Therefore, we can reject the null hypothesis that there is no temporal correlation (empirical p-values: 0.01 and 0.01, respectively, two-sided test). Second, we applied the multivariate version of iterative amplitude adjusted Fourier transform surrogates [24] to the datasets, testing the null hypothesis that the time series was generated from linear noise with monotonic nonlinear transformations. The results presented in figures 18 and 19 show that we cannot reject the null hypothesis (the empirical p-values: 0.76 and 0.70, respectively). We also took into account the trends of the time series and generated surrogate data by combining the method of [25] with that of [24], because it is known that nonstationarity sometimes influences surrogate data analysis [26]. We chose to randomize only the phases for the top 50 high-frequency components. However, the results were still the same (see figures 20 and 21): we could not reject the null hypothesis of linear noise (the empirical p-values: 0.66 and 0.61, respectively).

There are two possibilities: the first possibility is that equations (5) and (12) are not sensitive to the potential nonlinear characteristics; the second possibility is that the markets are governed by linear dynamics. To differentiate these two possibilities, we used the entropy obtained by the joint permutations [27,28], which is an extension of the permutation entropy [29]. We used permutations of length three and combined the five permutations, one for each price time series, to construct a joint permutation. In this case, the null hypothesis of linear noise was rejected by both the surrogate data of [24] (figure 22, the empirical p-value: 0.01) and those of [25] extended to multivariate time series based on [24] (figure 23, the empirical p-value: 0.01). Therefore, it is likely that the temporal evolution of foreign exchange markets was governed by nonlinear dynamics. At the same time, the results of figures 18-23 mean that the proposed quantities from equations (5) and (12) mainly reflect linear characteristics of the underlying market dynamics.

Figure 12. Leading network selection for the example of the Lorenz'96 model using equation (11). See the caption of figure 5 to interpret the results. The most significant leading network in this example was the one with $v_{4,1}$ and $v_{5,1}$.

Table 5. The results of forecasting abrupt changes for the Lorenz'96 model using equation (12). See the caption of table 1 to interpret the results. The odds ratio was 2.59. The p-value was $6.6 \times 10^{-9}$.

Figure 14. Leading network selection for the example of foreign exchange markets using equation (11). See the caption of figure 8 to interpret the results. The most significant leading network was the one with USD/JPY, USD/CHF, and EUR/JPY.

Figure 15. Scatter plot for the foreign exchange markets. The gray dash-dotted line shows the linear fitting for the scatter plot.

Table 6. The results of forecasting abrupt changes for foreign exchange markets using equation (12). See the caption of table 1 to interpret the results. The odds ratio was 33.34. The p-value was less than $2.2 \times 10^{-16}$. One row ends with 3728 (row total 3823); the column totals are 744 and 4491 (grand total 5235).

Figure 16. Analysis using random shuffle surrogates [23] with equation (5), for the foreign exchange markets. All the observables were shuffled simultaneously so that the 'spatial' correlations were preserved. The solid thick line shows the value for the original data and the histogram shows the values for the random shuffle surrogates.

Figure 17. Analysis using random shuffle surrogates [23] with equation (12), for the foreign exchange markets. Here all the observables were shuffled simultaneously. The solid thick line shows the value for the original data and the histogram shows those for the random shuffle surrogates.

Figure 18. Analysis using multivariate iterative amplitude adjusted Fourier transform surrogates [24] with equation (5), for the foreign exchange markets. The thick line shows the value for the original data and the histogram shows the values for the surrogate data.

Figure 19. Analysis using multivariate iterative amplitude adjusted Fourier transform surrogates [24] with equation (12), for the foreign exchange markets. The thick solid line shows the value for the original data and the histogram shows the values for the surrogate data.
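The random shuffle test can be sketched as follows in Python. This is a hedged illustration, not the analysis reported here: the lag-one autocorrelation is only a stand-in for the chi-squared values of equations (5) and (12), the two-sided rank-based formula for the empirical p-value is an assumption, and all function names and the toy series are hypothetical. Whole time points are permuted together so that the relation among observables at each time is preserved.

```python
import math
import random


def shuffle_surrogate(series, rng):
    """Permute time points; each row (all observables at one time) stays intact."""
    surrogate = list(series)
    rng.shuffle(surrogate)
    return surrogate


def lag1_autocorrelation(series, component=0):
    """Stand-in test statistic: lag-one autocorrelation of one observable."""
    x = [row[component] for row in series]
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def empirical_p_value(original, surrogate_values):
    """Two-sided rank-based p-value (assumed convention for this sketch)."""
    n = len(surrogate_values)
    at_least = sum(v >= original for v in surrogate_values)
    at_most = sum(v <= original for v in surrogate_values)
    return min(1.0, 2 * (min(at_least, at_most) + 1) / (n + 1))


rng = random.Random(0)
series = [(math.sin(0.1 * s), math.cos(0.1 * s)) for s in range(300)]
original = lag1_autocorrelation(series)
surrogates = [
    lag1_autocorrelation(shuffle_surrogate(series, rng)) for _ in range(200)
]
p = empirical_p_value(original, surrogates)
print(p)
```

Under this convention, 200 surrogates and an original value beyond all of them give $2/201 \approx 0.01$, which is the smallest p-value such a test can report.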
Next, we randomized the 'spatial correlation', or correlation among different price time series components, by introducing random delays to each of the five components. First, we used delays of multiples of an hour, up to 1000 h. We generated 200 combinations of such data. Then, we compared their chi-squared values with those obtained by applying the same delay simultaneously to all five components; the simultaneous delays tended to achieve larger chi-squared values than the random delays (see figures 24(a) and (b) for equations (5) and (12), respectively; we used 50 different simultaneous delays to obtain the values). We also applied random delays of multiples of a week, up to 44 weeks. The results were similar: the simultaneous delays tended to achieve larger chi-squared values than random delays that temporally shift different components (see figures 24(c) and (d) for equations (5) and (12), respectively). Thus, the spatial correlation also exists in the dataset of the foreign exchange markets.
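The difference between simultaneous and random delays can be illustrated with a short Python sketch. It rests on assumptions: each component is shifted by dropping its first samples and all components are truncated to the common overlap, which may differ from how the delays were actually applied, and the function name and toy data are hypothetical. The toy example uses delays of up to 10 h rather than 1000 h to stay small.

```python
import random

SAMPLES_PER_HOUR = 12  # five-minute sampling


def delay_components(columns, delays):
    """Shift component m by delays[m] samples and truncate to the common overlap."""
    longest = max(delays)
    length = min(len(c) for c in columns) - longest
    return [c[d:d + length] for c, d in zip(columns, delays)]


# Toy data: each entry records (component, original time) so alignment is visible.
columns = [[(m, s) for s in range(500)] for m in range(5)]
rng = random.Random(0)

# Same delay for every component: time points stay aligned across components.
common = SAMPLES_PER_HOUR * rng.randint(0, 10)
simultaneous = delay_components(columns, [common] * 5)

# Independent delay per component: the alignment across components is broken,
# while the temporal order within each component is untouched.
random_delays = [SAMPLES_PER_HOUR * rng.randint(0, 10) for _ in range(5)]
randomized = delay_components(columns, random_delays)

print([col[0][1] for col in simultaneous])
print([col[0][1] for col in randomized])
```

Computing the same statistic on both kinds of delayed data then isolates the contribution of the cross-component alignment.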
The overall results agreed well with the common hypotheses of econophysics that the markets are not random walks [30,31] and are correlated with each other [30], and the results were consistent with our previous work [20]; the underlying dynamics is possibly deterministic chaos.
When we partially observe a high-dimensional system and apply an index for the dynamical network marker, we eventually regard the unobserved part of the high-dimensional system as a bifurcation parameter, as we discussed in section 3. Therefore, when we try to forecast abrupt changes in the partial observations for a high-dimensional system, the notion of the dynamical network marker is still valid.
The method has a limited tolerance to observational noise. We present the results with 1% Gaussian observational noise in figures 25-32. By comparing them with figures 3, 6, 11, and 13, we found that both equations (5) and (12) were robust in terms of observational noise.
Equation (12) is more robust than equation (5) when one of the time series does not change and is constant. When a time series does not change, $C_{l,t}$ and $C_{o,t}$ become 0 and we cannot define the index $I_t$ properly. Such events actually happened 278 times in the time series of foreign exchange markets. From this viewpoint, equation (12) is more desirable.

Figure 24. Chi-squared values for simultaneous and random delays in the foreign exchange markets. In (a) and (c), we used equation (5). In (b) and (d), we used equation (12). In each panel, the range of $\chi^2$ or $\chi'^2$ is shown by a box plot: the range between 25% and 75% is shown by the box; the median is shown by the red horizontal line; the outliers are shown by pluses. 'Simul' and 'Random' correspond to the cases of applying the same simultaneous delays and random delays to different price time series components, respectively. The p-values obtained from the rank-sum tests were $8.1 \times 10^{-28}$, $1.3 \times 10^{-26}$, $3.6 \times 10^{-8}$, and $2.7 \times 10^{-20}$, respectively. Namely, the original time series consistently showed higher significance levels than those of different temporal shifts applied to different components for the foreign exchange markets.

Figure. Leading network selection for the example of the coupled map lattice with 1% observational noise using equation (4). See the caption of figure 2 to interpret the results. The most significant leading network was the one with $y_1$, $y_2$, and $y_5$.
The results are not sensitive to the selection of threshold values. When we used 100 initial conditions to evaluate the effects of the size of threshold values, we found that the p-value based on equation (12) was smaller than the p-value obtained based on equation (5) in 91, 96, 83, and 74 out of 100 cases for the example of the coupled map lattice, and in 76, 76, 78, and 63 out of 100 cases for the example of the Lorenz'96 model, when the threshold values were controlled so that the recurrence rates became 0.05, 0.1, 0.2, and 0.3, respectively. These numbers also show the tendency for the proposed equation (12) to be more effective than the original equation (5).

Figure. Leading network selection for the example of the coupled map lattice with 1% observational noise using equation (11). See the caption of figure 2 to interpret the figure. The most significant leading network was the one with $y_4$ and $y_5$.
The superiority of equation (12) to equation (5) implies that delay coordinates help to reconstruct the information of the hidden part of a high-dimensional system including dynamical noise [32], deterministic driving force [33], and genuine time-varying parameters [34], so that we can forecast the future values for the observed part more accurately. When we use delay coordinates, we still have the important problem of how we should choose embedding dimensions. Possible methods include predictions [35,36] and false nearest neighbors [37,38]. We might be able to optimize the embedding dimension using the first half of the dataset as well, while we simply set $E = 2$ throughout the paper. This part is a topic for future research. Although our examples here are limited to the ones in foreign exchange markets, the proposed methods themselves are general, and thus can be applied to wider contexts including the prediction of abrupt changes for renewable energy outputs [39-41]. These applications are in progress and we will write about them on another occasion; they should make it possible to introduce more renewable energy resources by keeping power grid systems stable.

Figure. Leading network selection for the example of the Lorenz'96 model with 1% observational noise using equation (4). See the caption of figure 5 to interpret the results. The most significant leading network was the one with $v_{1,1}$, $v_{4,1}$ and $v_{5,1}$.
In summary, we extended and applied the dynamical network marker proposed by Chen et al [3] to a single time series. We achieved statistically significant odds ratios. In addition, we extended the index of [3] by using recurrence plots so that we can apply the idea of the dynamical network marker to delay coordinates [6,7] and point processes [8,9]. The toy examples and the real example of foreign exchange markets showed that we tend to achieve higher probabilistic gains by using our extension. The proposed new index defined in equation (12) can provide better early warning signals given a stream of high-dimensional time series data.

Figure. Leading network selection for the example of the Lorenz'96 model with 1% observational noise using equation (11). See the caption of figure 5 to interpret the results. The most significant leading network was the one with $v_{4,1}$ and $v_{5,1}$.

Figure. Scatter plot for the Lorenz'96 model under 1% observational noise. The gray dash-dotted line shows the linear fitting for the scatter plot.