Capturing rogue waves by multi-point statistics

As an example for complex systems with extreme events we investigate ocean wave states exhibiting rogue waves. We present a statistical method of data analysis based on multi-point statistics which for the first time allows grasping extreme rogue wave events in a statistically highly satisfactory manner. The key to the success of the approach is mapping the complexity of multi-point data onto the statistics of hierarchically ordered height increments for different time scales for which we can show that a stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities as well as the Fokker-Planck equation itself can be estimated directly from the available observational data. With this stochastic description surrogate data sets can in turn be generated allowing to work out arbitrary statistical features of the complex sea state in general and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics.


Introduction and motivation
The observation and study of waves on the sea is probably one of the oldest scientific and cultural endeavours of mankind. But even today the sea's state can not be regarded as anything else than an enigma to man and science. Of course ocean waves have inspired a tremendous number of often groundbreaking results in mathematics, physics and related sciences, including nonlinear waves, localisation, extreme events, turbulence and many more. But still, the fully irregular and complex state of the sea is far from being understood. And both the characteristics of its irregularity, as well as the rare but extremely large wave events occurring sometimes, now often called rogue waves, abscond satisfying description, even in statistical terms.
Obviously the difficulties with understanding irregular and extreme or rogue ocean waves has to be seen in the context of extreme events in complex systems in general. Driven by various motives there has been extensive research on extreme events in a many fields, from the sciences, via meteorology and climate change, up to the social sciences and economics [1,2,3,4,5,6]. It is still a strongly debated question whether extreme events are generally linked to some universal stochastic mechanisms or if they rather originate through special features of the individual systems under study [7]. Still, a common point of all observations is that the empirical data are frequently punctuated by extreme events which seem to play an important role. Often an analysis approach is to approximate the observations by means of a generalized stochastic model in which some variables are represented in terms of stochastic components [8]. Usually the complex systems under study are very high dimensional and thus finding adequate methods to model the stochastic components remains a challenge.
Besides the description of extreme events in complex systems there is also the demand of their prediction. Despite the fact that we have irregular and complex behavior of rogue waves, there are increasing number of research towards defining an early warning system for rogue wave occurrences [9,10] or establishing a prediction method for short term prediction of rogue events. Studies on prediction methods mainly relys on deterministic behavior of non-stationary solutions of the underlying wave equations [11,12,13] and also deterministic nonlinear time series analysis [14].
The present work is based on the finding that complex systems can often be described highly successfully as stochastic processes in scale rather than time or space. Examples are now known for various very different fields, like turbulence [15,16], economics [17,18,19], biology [20,21,22], and many more, see [23]. With scale dependent processes, originally introduced by Friedrich and Peinke [15], fractal and multi-fractal structures [23] and even more generally joint n-point statistics [18,24] can be reduced by Markov properties to particular three point statistics. In a previous study [25] we have already shown for ocean wave data that certain scale dependent processes may have Markov properties. However, in [25] the Markov properties for the pure scale process could only be derived for deliberately pre-filtered data. In the present contribution we extend our previous investigations and base our analysis of the wave dynamics on general joint (N + 1)-point probability density functions (PDF) p(h(t), h(t − τ 1 ), ..., h(t − τ N )).
Here h(t) denotes the water surface elevation measured at a given location at time t and τ i are different time increments. The joint PDF provides the likelihood of a sequence of water surface elevation heights for N + 1 different instants of time. We show how a Fokker-Planck equation can be derived which describes these general joint PDFs. Knowing the corresponding stochastic process for the general multi-point statistics we can show that also the extreme events, i.e. the rogue waves, are grasped by this stochastic approach. The approach also allows time-series reconstruction in a statistical sense, and thus a statistically valid prediction of rogue wave occurrence.
The paper is structured as follows. First the mathematical aspects of multi-point and multi-scale description as well as the connection to scale dependent stochastic processes are introduced. Then the validity of the description based on observational ocean wave data is demonstrated. Finally the approach is applied to reconstruct time series for the underlying observational data and to forecast the occurrence probability of rogue waves in the given sea state.

Multi-point statistics
In this section the statistical background of our approach for a multi-point reconstruction is presented. In the following we use the short-hand notation h i := h(t i ) for the elevation of the water surface measured at a given location at time t i , with h i+j := h(t i + t j ). We define the relative change in surface height over a time interval or, respectively, a time scale τ j as The aim is to calculate the joint probability p(h * , t * ; h 1 , t * − τ 1 ; ...; h N , t * − τ N ) of occurrence of the event {h * , t * }, together with the knowledge of the past points We assume that the system has no explicit time dependence, i.e. the system is stationary. The probability of occurrence of the event {h * } under the conditions of the past points is given by Next, the joint (N+1)-point PDF can be expressed in an equivalent way by joint increments statistics Note instead of the knowledge of wave heights at N+1 points, we consider now the knowledge of N height increments and one selected height h * . Without loss of generality we take τ i < τ i+1 , and thus introduce a hierarchical ordering of the increments ξ i .
Only if the conditional PDFs do not depend on h * , i. e. if p(ξ 1 ; ξ 2 ; ...; ξ N |h * ) = p(ξ 1 ; ξ 2 ; ...; ξ N ), (4) the (N+1)-point statistics reduces to N-scale statistics of the increments ξ i at scales τ i . In our previous work [25] we had applied filtering based on Hilbert-Huang transform techniques (HHT) to the wave data. The filtering removed the dependency on h * by kind of separating off the underlying dominant frequency, i.e. the wave like nature of the system. Still, already in this case Markov properties could be shown for the filtered wave data and thus the multi-scale PDF could be factorized in In our present work here we do not apply any pre-filtering and stay focussed on the very direct data itself. This renders the approach much more general and we directly start with the multi-point statistics (Eq. (2)) to investigate if the Markov property of the process is given for More specifically we will first investigate from the observational data if holds. This we will take as hint for the validity of Markov property. Using Eq. (6), the multi-point PDF Eq. (3) can then be factorised as As Eq. ( 6) is nothing else than the Markov property of a stochastic process of ξ i evolving in the time scale τ i , the evolution of conditional PDFs of Eq. (8) can be expressed by Kramers-Moyal expansion [26], where Kramers-Moyal coefficients D (n) are defined as Note that the pre-factor −τ in Eq. (9) indicates that we consider the process for decreasing τ -values and an evolution in log-scale of τ . If the Kramers-Moyal coefficient D (4) is zero, then it follows from the Pawula theorem that all coefficients for n ≥ 3 are zero, too cf. [26]. The Kramers-Moyal expansion then yields a Fokker-Planck equation with just two coefficients, D (1) denotes the drift and D (2) the diffusion coefficient. With this the Fokker-Planck equation turns out a suitable description for the conditional probabilities of the water surface height increments, from which in turn the general multi-point joint PDF of the surface heights themselves, Eq. (8), can be determined as .
Using the increment notation, omitting the τ -values and definingξ j := h 1 − h j with the corresponding time scale τ j − τ 1 and j = 2, . . . , N, this equation simplifies to For a given height h * the probability of its occurrence p(h * |h 1 , τ 1 ; ...; h N , τ N ) is given by the simple conditional PDFs, which can be calculated from the Fokker-Planck equation, or which can be estimated directly from the data. Note the simple conditional PDFs p(ξ i , τ i |ξ j , τ j ; h * ) only contain information about three height values h * , h i , h j ; or more abstractly, of three points of the time series h(t). Thus Eq. (12) is a three point closure of the multi-point problem.

Results based on observational data
The wave measurements used in this study were taken in the Sea of Japan, at a location 3 km off the Yura fishery harbor, where the water depth is about 43 meters, further details can be found in [27,28,29,30]. First we want to examine if the conditional PDFs depend on the wave height itself by comparing both sides of the equation In Fig. 1 the comparison of conditional PDFs from both sides of Eq. (13) at scales τ 1 = 14 and τ 2 = 28 seconds and for two different values of h * are shown. To get sufficient data we always use an interval of h * with ±σ h /4 (where σ h = h 2 ). For h * = 0 in Fig. 1(a) both distributions are almost the same but for values of h 0 = 0, like in Fig. 1(c) a significant shift of the red contour plot (solid lines), which is the left hand side of Eq. (13), is found. As .epsa result from this one can clearly deduce that the conditional PDFs p(ξ 1 |ξ 2 ; h * ) do depend on h * .
Next the Markov properties according to Eq. (7) can be checked. Note that we have to compare two data sets according to ξ j | ξ j+1 ;h * and ξ j | ξ j+1 ;ξ j+2 ;h * , and the size of each of these data sets is very different. The verification is thus performed by the use of the Wilcoxon test [16] as this test is suitable to compare the statistical similarity of two sample sets of different sizes. The validity of the Wilcoxon test can be shown by the normalized expectation value < ∆Q * > of the number of inversions of the conditional wave height increments ξ j | ξ j+1 ;h * and ξ j | ξ j+1 ;ξ j+2 ;h * . If Markov properties are given, < ∆Q * > has a value of 2/π ≈ 0.8. The values of < ∆Q * > in Fig. 2 for different values of h * show that Markov properties hold for (τ ≥ 14 seconds). This defines a finite minimum step size or scale in the Markov process of the evolution of the surface elevation increments ξ i .Such a finite step size is well known for stochastic processes in general [31], and the scale is called the Einstein-Markov length, which has for example also been found in a similar way for turbulent flow data, cf. [32,33]. The scale has been marked by a vertical red dashed line in Fig. 2. Note that compared to our previous work [25] the Markov properties are fulfilled without applying a Hilbert-Huang Transform (HHT) to the original data, which is due to the fact that we have now included the dependencies on h * . Based on the finding that Markov properties are fulfilled for the evolution of water surface height increments ξ j with decreasing time scale τ j we can now proceed to estimate the corresponding stochastic process via the above mentioned Kramers-Moyal coefficients. Based on the knowledge of the conditional PDF like shown in Fig. 1 the conditional average in Eq. (10) is known too. The estimation of lim δτ →0 causes some problems, in particular due to the Einstein-Markov length, but has been worked out already in several publications [16,34,35]. Besides this direct estimation we optimize the obtained functional forms of D (1) and D (2) by minimizing the differences between measured conditional PDFs and those obtained by numerical solutions of the resulting Fokker-Planck equation [36]. Fig. 3 shows the estimated drift and diffusion functions, D (1) (ξ, τ, h) and D (2) (ξ, τ, h), of ocean wave surface elevation data for τ = 140 seconds and different values of wave height h * . For h * = 0 the D (1) (ξ, τ, h) curves are shifted in the vertical direction, whereas no significant change is found for the diffusion term. Furthermore the fourth order Kramers-Moyal coefficient D (4) (ξ, τ, h) is indeed found to be close to zero for different values of h * , thus the Fokker-Planck description we propose in Eq. (11) can be assumed to be valid. Note that the surface elevation height increments, ξ, are given in the units of their standard deviation in the limit τ → ∞, σ ∞ , which is identical to √ 2σ h ≡ 2 h 2 [16].  To ease parameterisation, the drift and diffusion terms can be approximated by first and second order polynomials in ξ, The height dependency of the drift function is expressed by the d 10 (h * , τ )-coefficient and our results are shown in Fig. 4. The results indicate once more that we have a strong wave height dependency in our process.

Reconstruction of time series
The knowledge of the conditional probabilities p(h * |h 1 , τ 1 , . . . , h N , τ N ) and its estimation by Eq. (12) can be used to generate a new data point h * . Shifting the procedure by one step and repeating the same procedure may be used to generate new surrogate time series. For technical reasons one should avoid zeros in conditional pdfs if one uses Eq. (12). Here we used kernel density estimation which is very helpful for parameter ranges for which we have only limited data [37,38]. The initial idea for reconstructing time-series following this procedure was originally developed in a similar way for fluid turbulence data, see [39]. The time scales we use here for this process are τ n = n · τ EM where n = 1, 2, . . . , 7 and the Einstein-Markov time scale τ EM = 14 seconds, as shown in Fig. 2. (The maximal value of n = 7 was chosen, as for that time step the autocorrelation of the height increments approaches zero.) In Fig. 5(a)  To illustrate that the reconstructed time series are indeed statistically similar to the measured wave data we repeat the above mentioned Fokker-Planck analysis. In Fig. 6 we show that from the surrogate data we obtain the same drift and diffusion coefficients. Also the corresponding PDFs p(ξ i , τ i ) obtained from the measured data and from the numerical solution of the Fokker-Planck equation using the estimated drift and diffusion terms are the same as shown in Fig. 7(a). Furthermore the statistics of the wave height maxima are well grasped by the reconstructed data, see Fig. 7(b). Both empirical and reconstructed data follow a generalized gamma distribution very well, as expected from [25]. From this verification of the obtained stochastic process we conclude that both, the empirical data and the reconstructed data, have the same multi-point statistics. Based on the proposed reconstruction of time series it is now possible to generate long synthetic time series to work out further statistical features of the wave data. We have chosen 1000 data points of empirical data as initial condition and run it to produce 1.1 × 10 6 synthetic data with sampling rate of 1 Hz. In this reconstructed time series we have captured three events that we could consider as rogue waves, using the usual definition [40], saying h * must be larger than 2 times the significant wave height, which is 2.4 m for our data. The corresponding three sections of the reconstructed time series are shown in Fig. 8 (b,c and d). Also, we performed 4096 different runs of 2048 seconds blocks. From these data we captured 33 time series with extreme values and a corresponding waiting time of about 2.5 × 10 5 seconds to obtain an extreme event, or a rogue wave.
Next we discuss the possibility to forecast emerging rogue waves. From the conditional probability, Fig. 5(e) (Black curve) we can quantify the likelihood of the appearance of the measured amplitude of h r > 5.2 m by integration and obtain 23.6%. This likelihood P extreme (h r > 5.2 m) can be evaluated for each time step and result in the changing risk of emerging rogue waves, as shown in Fig. 9. This  probabilistic characterization of extreme events returns some false alarms and as well some true hits. 6.6×10 5 6.8×10 5 A common method to test the quality of a prediction is the receiver operating characteristic curve (ROC) [41,42,43]. The idea of ROC consists of comparing the rate of true predicted events with the rate of false alarm. The most quantitative index describing a ROC curve is the area under it which is known as accuracy. In Fig. 10 we have plotted ROC curves for our prediction, P extreme , first by considering h r = 5.2 m to detect the extreme event alarms. The corresponding ROC curve is plotted in black (solid) line. To investigate the robustness of our reconstruction method, we considered lower amplitude wave height for h r = 2.5 m and h r = 3.5 m and the corresponding ROC curves are shown in Fig. 10 in red (dotted) and blue (dash) lines, respectively. In all three cases we have accuracy of ROC curves bigger than 80% which indicate that our multi-point procedure is a proper method for time series reconstruction and can be used for short time prediction purposes.

Conclusions
We have presented a new approach for a comprehensive analysis of the complexity of ocean wave dynamics. The complexity of multi-point statistics can be simplified by a three point closure, based on which an arbitrary N-Point statistics can be expressed by a hierarchy of nested three point statistics ordered in a cascade like structure. We have been able to show for the first time that by our stochastic approach not only the joint N-point statistics can be grasped but also extreme events, rogue waves, can be captured statistically. We have also shown how for each instant in time the conditional probability of the next wave height can be determined. As the height profile of waves changes from moment to moment, also the probability of the next value of the wave height is changing dynamically. These changes may thus clearly give rise to measures indicating the risk of the appearance of rogue waves ahead of their actual emergence. Most interestingly this was possible although in the measured data only one event of rogue wave was recorded. From our analysis of the occurrence probabilities it becomes clear that the rogue wave for this wave conditions is integral part of the entire complex stochastic. In our opinion this can only be achieved as we were able to crave out an N-point approach for this complex system.