Advanced analysis of temporal data using Fisher-Shannon information: theoretical development and application in geosciences

Complex non-linear time series are ubiquitous in geosciences. Quantifying the complexity and non-stationarity of these data is a challenging task, and advanced complexity-based exploratory tools are required for understanding and visualizing such data. This paper discusses the Fisher-Shannon method, from which one can obtain a complexity measure and detect non-stationarity, as an efficient data exploration tool. State-of-the-art studies related to the Fisher-Shannon measures are collected, and new analytical formulas for positive unimodal skewed distributions are proposed. Case studies on both synthetic and real data illustrate the usefulness of the Fisher-Shannon method, which can find application in different domains, including time series discrimination and the generation of time series features for clustering, modeling and forecasting. The paper is accompanied by Python and R libraries for the non-parametric estimation of the proposed measures.


INTRODUCTION
The ubiquity and extensive growth of available temporal data require the development of reliable techniques to extract knowledge from them and to understand multifaceted time-dependent phenomena. Over the last decades, increasing attention has been paid to the use of Fisher-Shannon information as a measure to characterize the complexity and non-stationarity of non-linear time series. Originally proposed for statistical estimation purposes (Fisher, 1925), the Fisher information measure (FIM) has been extensively used in theoretical physics (Frieden, 1990). FIM and Shannon entropy power (SEP) (Shannon, 1948) are closely related, as shown by information theory (Dembo et al., 1991; Cover and Thomas, 2006). The Fisher-Shannon complexity (FSC), the product of the FIM and the SEP, was proposed as a possible definition of atom complexity (Angulo et al., 2008; Esquivel et al., 2010). Following Frieden's work, FIM has found applications in non-linear time series analysis. Martin et al. (1999) analysed complex non-stationary EEG signals and showed that FIM can have better discrimination performance than Shannon entropy. FIM was also used to detect behavior changes of dynamical systems (Martin et al., 2001). Vignat and Bercher (2003) showed that a joint analysis of both SEP and FIM can be required to perform effective discrimination of non-stationary signals.
The Fisher-Shannon method has been used to analyse complex dynamical processes in geophysics. It has been applied to discriminate between the electric and magnetic components of magnetotelluric signals. Tsunamigenic and non-tsunamigenic earthquakes were efficiently separated in the Fisher-Shannon information plane using the FSC. Micro-tremor time series were identified depending on the soil characteristics of the measurement sites (Telesca et al., 2015b). Telesca et al. (2015a) proposed a classifier of the (non-)tsunamigenic potential of earthquakes built on several time series features, including FIM, SEP and FSC. Finally, FIM was also used dynamically with sliding window techniques in order to study precursory patterns in seismology (Telesca et al., 2009b) and volcanology (Telesca et al., 2010).
Many environmental processes have also been studied using the Fisher-Shannon method. Lovallo et al. (2013) and Pierini et al. (2011) studied the identification of climatic regimes in rainfall time series. The discrimination of hydrological regimes has also been investigated (Pierini et al., 2015). Analysing remotely sensed sea surface temperature, Pierini et al. (2016) have shown that the Fisher-Shannon method is able to clearly identify the Brazil-Malvinas Confluence Zone, which is known to be one of the most energetic areas of the oceans. More than ten years of hourly wind speed data have been analysed in the Fisher-Shannon information plane, and the same authors studied the yearly variation of the FIM, the SEP and the FSC of wind measurements. Guignard et al. (2019b) found correlations between the daily variance of temperature and the daily FSC of high-frequency wind speed records in an urban area. The authors also pointed out relationships between the Fisher-Shannon analysis of wind speed daily means and topographical features (height and slope) in complex mountainous regions (Guignard et al., 2019a). Telesca et al. (2009a) discriminated some pollutants, including cadmium, iron and lead, in the Fisher-Shannon plane. Similarly, Amato et al. (2019) have shown a relationship between the Fisher-Shannon analysis outputs of three air pollutants (nitrogen dioxide, ground-level ozone and particulate matter) and the measurement location in terms of land use and of anthropogenic sources of pollutant emission.
The research involving the Fisher-Shannon method is rather scattered and comes from various fields, e.g. information theory, physics, dynamical systems and statistics. This paper aims to gather and consolidate the state of the art concerning Fisher-Shannon information measures, discussing theoretical and operational tools for the application of FIM and SEP to the analysis of non-linear time series. Moreover, FSC is identified as a sensitivity measure of the SEP and as a non-Gaussianity measure of the data. Applications to a synthetic experiment will illustrate the behavior of SEP, FIM and FSC when used to analyze chaotic time series. Finally, the discussed measures will be used to study real environmental data, highlighting their usefulness as tools for exploratory data analysis in geosciences.
The remainder of the paper is organised as follows. Concepts of Fisher-Shannon analysis are reviewed in section 2, including SEP, FIM, FSC and the information plane. Section 3 provides analytical formulas for these quantities in the particular cases of random variables following some positive skewed distributions, namely the Gamma, Weibull and log-normal ones. Then, non-parametric estimation using kernel density estimation, for which Python and R packages are proposed, is presented in section 4. Experiments on simulated and real-world data are performed in section 5. Finally, section 6 concludes the paper.

Shannon Entropy Power and Fisher Information Measure
Let us consider a univariate continuous random variable $X$ with probability density function (PDF) $f(x)$, which is assumed to be sufficiently regular for our purposes. Its differential entropy (Cover and Thomas, 2006) is defined as
$$H_X = \mathbb{E}\left[-\log f(X)\right] = -\int f(x) \log f(x)\, dx. \tag{1}$$
For example, if $X$ is a centered Gaussian random variable of variance $\sigma^2$, a direct computation gives $H_X = \frac{1}{2}\log(2\pi e \sigma^2)$. However, it will be more convenient to work with the following quantity, called the Shannon Entropy Power (SEP) (Dembo et al., 1991),
$$N_X = \frac{1}{2\pi e}\, e^{2 H_X}, \tag{2}$$
which is a strictly increasing transformation of $H_X$. The SEP is constructed such that in the Gaussian case we have $N_X = \sigma^2$. Very often, the entropies $H_X$ and $N_X$ are interpreted as global measures of disorder, uncertainty or spread of $f(x)$. The higher the entropy, the higher the disorder.
The Fisher Information Measure (FIM) (Vignat and Bercher, 2003), also known as the Fisher information of $X$ with respect to a scalar translation parameter (Dembo et al., 1991), is defined as
$$I_X = \mathbb{E}\left[\left(\frac{\partial}{\partial x} \log f(X)\right)^2\right] = \int \frac{\left(f'(x)\right)^2}{f(x)}\, dx. \tag{3}$$
This quantity should not be confused with the Fisher information of a distribution parameter. In particular, the derivative of the log-density is taken with respect to $x$ and not with respect to some parameter. However, the FIM is equivalent to the Fisher information of a location parameter of a parametric distribution (Cover and Thomas, 2006). Under mild regularity conditions, one has the following alternative formulation (Lehmann, 1999),
$$I_X = \mathbb{E}\left[-\frac{\partial^2}{\partial x^2} \log f(X)\right]. \tag{4}$$
The quantity $I_X$ is sometimes interpreted as a measure of order, organization or narrowness of $X$. If $X$ is Gaussian, $I_X = 1/\sigma^2$. It should be noted that $H_X$, $N_X$ and $I_X$ only depend on the distribution $f(x)$.
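The Gaussian values quoted above can be checked numerically. The sketch below evaluates the differential entropy, the SEP and the FIM by trapezoidal quadrature for a centered Gaussian; the helper names (`trapezoid`, `fisher_shannon_from_pdf`), the grid size and the integration range are illustrative assumptions and are not taken from the accompanying libraries.

```python
import math


def trapezoid(ys, dx):
    """Composite trapezoidal rule on an equally spaced grid."""
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def fisher_shannon_from_pdf(pdf, dpdf, lo, hi, n=20001):
    """Return (H, N, I) for a density `pdf` with derivative `dpdf` on [lo, hi]."""
    dx = (hi - lo) / (n - 1)
    xs = [lo + i * dx for i in range(n)]
    f = [pdf(x) for x in xs]
    df = [dpdf(x) for x in xs]
    # Differential entropy: H = -int f log f dx (points with f = 0 contribute 0).
    h = -trapezoid([fi * math.log(fi) if fi > 0.0 else 0.0 for fi in f], dx)
    # Shannon entropy power: N = exp(2H) / (2 pi e).
    sep = math.exp(2.0 * h) / (2.0 * math.pi * math.e)
    # Fisher information measure: I = int (f')^2 / f dx.
    fim = trapezoid(
        [dfi * dfi / fi if fi > 0.0 else 0.0 for fi, dfi in zip(f, df)], dx
    )
    return h, sep, fim


sigma = 2.0


def gauss(x):
    """Centered Gaussian density with standard deviation `sigma`."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def dgauss(x):
    """Derivative of the Gaussian density with respect to x."""
    return -x / sigma ** 2 * gauss(x)


H, N, I = fisher_shannon_from_pdf(gauss, dgauss, -12.0 * sigma, 12.0 * sigma)
print(H, N, I)  # N should be close to sigma^2 and I close to 1 / sigma^2
```

The same function can be reused for any density whose derivative is available in closed form, provided the integration range covers essentially all of the probability mass.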

Properties
The SEP and the FIM satisfy several properties. First, both quantities are positive. It is also easy to see the scaling properties of the SEP and the FIM (Rioul, 2011),
$$N_{aX} = a^2 N_X \quad \text{and} \quad I_{aX} = a^{-2} I_X, \tag{5}$$
for any real number $a \neq 0$, by change of variable. Notice also that the SEP and the FIM are invariant under the addition of a deterministic constant, by the same argument. Harder to show are the entropy power inequality (Dembo et al., 1991),
$$N_{X+Y} \geq N_X + N_Y, \tag{6}$$
and its dual, the Fisher information inequality (Zamir, 1998),
$$I_{X+Y}^{-1} \geq I_X^{-1} + I_Y^{-1}, \tag{7}$$
for a random variable $Y$ independent of $X$, with equality if $X$ and $Y$ are Gaussian.
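For completeness, the change of variable behind the scaling properties can be written out as follows. This is a sketch of the standard argument, with $Y = aX$ and $f_Y$ introduced here only as notation for the density of $Y$.

```latex
% Density of Y = aX for a real a \neq 0:
%   f_Y(y) = \frac{1}{|a|} f\!\left(\frac{y}{a}\right).
\begin{align*}
H_{aX} &= -\int f_Y(y) \log f_Y(y)\, dy
        = -\int f(x) \left[\log f(x) - \log |a|\right] dx
        = H_X + \log |a|, \\
N_{aX} &= \frac{1}{2\pi e}\, e^{2 H_X + 2 \log |a|} = a^2 N_X, \\
I_{aX} &= \int \frac{\left(f_Y'(y)\right)^2}{f_Y(y)}\, dy
        = \int \frac{\left(f'(x)\right)^2 / a^4}{f(x) / |a|}\, |a|\, dx
        = \frac{1}{a^2} I_X .
\end{align*}
```

Replacing $aX$ by $X + b$ for a constant $b$ only shifts the integration variable, which is why both quantities are unchanged by the addition of a deterministic constant.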
Moreover, several relationships show that the FIM closely interacts with the SEP and the differential entropy. Let $Z$ be a random variable independent of $X$ with finite variance $\sigma_Z^2$. The de Bruijn identity (Rioul, 2011; Cover and Thomas, 2006) states that
$$\left.\frac{d}{dt} H_{X + \sqrt{t} Z}\right|_{t=0} = \frac{\sigma_Z^2}{2}\, I_X, \tag{8}$$
i.e. the variation of the differential entropy of a perturbed $X$ is proportional to $I_X$. Therefore, a possible interpretation of the FIM is that it quantifies the sensitivity of $H_X$ to a small independent additive perturbation $Z$. Using the entropy power inequality (6) and the de Bruijn identity (8), one can show the isoperimetric inequality for entropies,
$$N_X I_X \geq 1, \tag{9}$$
with equality if and only if $X$ is Gaussian. The proof of equation (9) and the motivation for its name can be found in Dembo et al. (1991), where a remarkable analogy is drawn with geometry. This shows that SEP and FIM are intimately interlinked.
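As a worked special case, the de Bruijn identity can be verified directly when both $X$ and $Z$ are centered Gaussian variables. The Gaussian assumption on $Z$ is made here only to keep the computation explicit.

```latex
% X ~ N(0, \sigma^2), Z ~ N(0, \sigma_Z^2), independent, so that
% X + \sqrt{t} Z ~ N(0, \sigma^2 + t \sigma_Z^2).
\begin{align*}
H_{X + \sqrt{t} Z} &= \tfrac{1}{2} \log\!\left(2 \pi e \left(\sigma^2 + t \sigma_Z^2\right)\right), \\
\left.\frac{d}{dt} H_{X + \sqrt{t} Z}\right|_{t=0}
  &= \frac{\sigma_Z^2}{2 \sigma^2}
   = \frac{\sigma_Z^2}{2}\, I_X ,
   \qquad \text{since } I_X = 1/\sigma^2 .
\end{align*}
```

In the same Gaussian case $N_X I_X = \sigma^2 \cdot \sigma^{-2} = 1$, which is the equality case of the isoperimetric inequality for entropies.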

Fisher-Shannon Complexity
The joint FIM/SEP analysis has been used as a statistical complexity measure, although there is no clear consensus about the definition of signal complexity (Esquivel et al., 2010). The Fisher-Shannon Complexity (FSC) is defined as $C_X = N_X I_X$ (Angulo et al., 2008). From the scaling properties (5), it is easy to show that the FSC is invariant under multiplication by a non-zero scalar and under the addition of a constant. In particular, normalisation or standardisation of $X$ has no effect on the FSC. Additionally, the isoperimetric inequality for entropies (9) states that $C_X \geq 1$, with equality if and only if $X$ is Gaussian. An interpretation of this quantity is the following. If $Z$ is independent of $X$ and has a finite variance $\sigma_Z^2$, one obtains, by using the de Bruijn identity (8), the relationship
$$\left.\frac{d}{dt} N_{X + \sqrt{t} Z}\right|_{t=0} = \sigma_Z^2\, C_X.$$
Hence, the FSC can be interpreted as a sensitivity measure of $N_X$ to a small independent additive perturbation.
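The sensitivity interpretation follows from the chain rule applied to the definition of the SEP. The short derivation below is a sketch that uses only the definition $N_X = e^{2H_X}/(2\pi e)$ and the de Bruijn identity.

```latex
\begin{align*}
\left.\frac{d}{dt} N_{X + \sqrt{t} Z}\right|_{t=0}
  &= \left.\frac{d}{dt}\, \frac{1}{2 \pi e}\, e^{2 H_{X + \sqrt{t} Z}}\right|_{t=0}
   = 2 N_X \left.\frac{d}{dt} H_{X + \sqrt{t} Z}\right|_{t=0} \\
  &= 2 N_X \cdot \frac{\sigma_Z^2}{2}\, I_X
   = \sigma_Z^2\, N_X I_X
   = \sigma_Z^2\, C_X .
\end{align*}
```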

Fisher-Shannon Information Plane
The PDF of $X$ can then be analyzed by displaying the SEP and the FIM within the so-called Fisher-Shannon Information Plane (FSIP), see Fig. 1 (Vignat and Bercher, 2003). Although standard linear-scale plots are very often used for the FSIP in the literature, log-log plots are more adequate in practice. In the FSIP, the only reachable values are in the set $D = \{(N_X, I_X) \in \mathbb{R}^2 \mid N_X > 0,\ I_X > 0 \text{ and } N_X I_X \geq 1\}$, due to (9). Vignat and Bercher (2003) showed that for any point $(N, I) \in D$, there exists a random variable $X$ (from an exponential power distribution) such that $N_X = N$ and $I_X = I$.
Figure 1. The Fisher-Shannon information plane with a random variable $X$ of FSC equal to 10. Scalar multiplication of $X$ corresponds to a displacement along the iso-complex curve passing through $X$. The unreachable points are in grey. Note the logarithmic scale.
A curve in $D$ is said to be iso-complex if the FSC along the curve is constant. As $C_X$ is invariant under multiplication of $X$ by a factor $a \neq 0$, the scaling properties (5) show that one can move along any iso-complex curve by varying $a$. Fig. 1 shows the iso-complex curve of complexity $C_X = 10$ as an example. The boundary of $D$ is the iso-complex curve with FSC equal to 1, and it is reached if and only if $X$ is Gaussian, as stated by (9). On this boundary, the standard deviation $\sigma$ (which plays the role of the scaling parameter in the Gaussian case) is equivalent to the multiplicative factor $a$. Hence, while a point in the FSIP is described by $(N_X, I_X)$, one can also describe it by $(a, C_X)$. In the light of this, one can also think of the FSC as a scale-independent measure of non-Gaussianity of $X$.
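The geometry of the FSIP described above can be illustrated with a few small helpers. This is a minimal sketch; the function names (`in_reachable_set`, `rescale`, `iso_complex_curve`) are hypothetical and chosen here for illustration only.

```python
def fsc(sep, fim):
    """Fisher-Shannon complexity C = N * I."""
    return sep * fim


def in_reachable_set(sep, fim, tol=1e-12):
    """True if (N, I) lies in D = {N > 0, I > 0, N * I >= 1}."""
    return sep > 0.0 and fim > 0.0 and sep * fim >= 1.0 - tol


def rescale(sep, fim, a):
    """FSIP coordinates of aX from those of X: N_aX = a^2 N_X, I_aX = I_X / a^2."""
    if a == 0:
        raise ValueError("the scaling factor a must be non-zero")
    return a * a * sep, fim / (a * a)


def iso_complex_curve(c, sep_values):
    """Points (N, C / N) of the iso-complex curve with complexity c."""
    return [(n, c / n) for n in sep_values]


# A point of complexity 10 moves along its iso-complex curve when X is rescaled.
point = (2.0, 5.0)
moved = rescale(*point, a=3.0)
print(fsc(*point), fsc(*moved))
```

On a log-log plot the iso-complex curves returned by `iso_complex_curve` are straight lines of slope -1, which is one reason why logarithmic axes are convenient for the FSIP.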

ANALYTICAL SOLUTIONS FOR SOME DISTRIBUTIONS
In this section we derive closed-form expressions for the SEP, FIM and FSC of some parametric distributions, which could be used for parametric estimation. Vignat and Bercher (2003) obtained analogous results for the Student's t-distribution and the exponential power distribution (also known as the generalized Gaussian distribution). The Gaussian case was already presented in section 2 as an example. The differential entropies of the distributions considered in this section were computed by Lazo and Rathie (1978), from which the SEP is directly obtained. However, to our knowledge, the FIM-based calculations for the Gamma, Weibull and log-normal distributions have never been presented. Proofs can be found in the appendix.

Gamma distribution
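The PDF of a Gamma random variable $X$ is given by
$$f(x) = \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}, \quad \text{for } x \geq 0,$$
and $f(x) = 0$ for $x < 0$, where $\Gamma$ denotes the gamma function and $\theta, k > 0$ are respectively the scale and shape parameters.
PROPOSITION 1. The SEP of the Gamma distribution with scale $\theta > 0$ and shape $k > 0$ is
$$N_X = \frac{\theta^2\, \Gamma(k)^2}{2\pi e} \exp\left(2\left[(1-k)\psi(k) + k\right]\right),$$
where $\psi$ is the digamma function.
The FIM and the FSC of the Gamma distribution with scale $\theta > 0$ and shape $k > 2$ are respectively
$$I_X = \frac{1}{\theta^2 (k-2)} \quad \text{and} \quad C_X = \frac{\Gamma(k)^2}{2\pi e\, (k-2)} \exp\left(2\left[(1-k)\psi(k) + k\right]\right).$$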
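The Gamma case can be evaluated with the Python standard library alone. The sketch below assumes the closed forms $N_X = \theta^2 \Gamma(k)^2 \exp(2[(1-k)\psi(k)+k])/(2\pi e)$, obtained from the Gamma differential entropy, and $I_X = 1/(\theta^2(k-2))$ for $k > 2$. Since `math` has no digamma function, a small `digamma` helper based on the recurrence $\psi(x) = \psi(x+1) - 1/x$ and the usual asymptotic series is included; all function names are illustrative.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 6.0:  # shift the argument upwards until the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # log x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def gamma_sep(theta, k):
    """SEP of a Gamma(shape k, scale theta) variable, computed on the log scale."""
    log_n = 2.0 * (math.log(theta) + math.lgamma(k) + (1.0 - k) * digamma(k) + k)
    return math.exp(log_n - math.log(2.0 * math.pi * math.e))


def gamma_fim(theta, k):
    """FIM of a Gamma(shape k, scale theta) variable; requires k > 2."""
    if k <= 2.0:
        raise ValueError("the closed form requires k > 2")
    return 1.0 / (theta ** 2 * (k - 2.0))


def gamma_fsc(theta, k):
    """FSC of the Gamma distribution; it does not depend on theta."""
    return gamma_sep(theta, k) * gamma_fim(theta, k)


print(gamma_sep(2.0, 3.0), gamma_fim(2.0, 3.0), gamma_fsc(2.0, 3.0))
```

The FSC approaches 1 as the shape $k$ grows, in line with the Gamma distribution becoming closer to a Gaussian for large $k$.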

Weibull distribution
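The PDF of a Weibull random variable is
$$f(x) = \frac{k}{\lambda} \left(\frac{x-\mu}{\lambda}\right)^{k-1} \exp\left(-\left(\frac{x-\mu}{\lambda}\right)^{k}\right), \quad \text{for } x \geq \mu,$$
and $f(x) = 0$ for $x < \mu$, where $\mu$ is the location parameter, $\lambda > 0$ is the scale parameter and $k > 0$ is the shape parameter.
PROPOSITION 2. The SEP of the Weibull distribution with location $\mu$, scale $\lambda > 0$ and shape $k > 0$ is
$$N_X = \frac{\lambda^2}{2\pi k^2}\, e^{1 + 2\alpha\gamma},$$
where $\alpha = \frac{k-1}{k}$ and $\gamma$ is the Euler-Mascheroni constant.
The FIM and the FSC of the Weibull distribution with location $\mu$, scale $\lambda > 0$ and shape $k > 2$ are respectively
$$I_X = \frac{(k-1)^2}{\lambda^2}\, \Gamma\left(1 - \frac{2}{k}\right) \quad \text{and} \quad C_X = \frac{\alpha^2}{2\pi}\, \Gamma(2\alpha - 1)\, e^{1 + 2\alpha\gamma}.$$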
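A direct evaluation of the Weibull case is sketched below. It assumes the closed forms $N_X = \lambda^2 e^{1+2\alpha\gamma}/(2\pi k^2)$ with $\alpha = (k-1)/k$, which follows from the Weibull differential entropy, and $I_X = (k-1)^2\,\Gamma(1-2/k)/\lambda^2$ for $k > 2$. The location $\mu$ does not appear because both quantities are translation invariant; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def weibull_sep(lam, k):
    """SEP of a Weibull variable with scale lam > 0 and shape k > 0."""
    alpha = (k - 1.0) / k
    return lam ** 2 * math.exp(1.0 + 2.0 * EULER_GAMMA * alpha) / (2.0 * math.pi * k ** 2)


def weibull_fim(lam, k):
    """FIM of a Weibull variable; the closed form requires k > 2."""
    if k <= 2.0:
        raise ValueError("the closed form requires k > 2")
    return (k - 1.0) ** 2 * math.gamma(1.0 - 2.0 / k) / lam ** 2


def weibull_fsc(lam, k):
    """FSC of the Weibull distribution; it depends on the shape k only."""
    return weibull_sep(lam, k) * weibull_fim(lam, k)


# A shape around 3.6 gives a nearly symmetric density, so the FSC is close to 1.
print(weibull_fsc(1.0, 3.6))
```

For $k = 1$ the Weibull distribution reduces to the exponential distribution, for which `weibull_sep` returns $\lambda^2 e / (2\pi)$ while the FIM closed form does not apply.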

Log-normal distribution
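The log-normal PDF with parameters $\mu$ and $\sigma > 0$ is
$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\log x - \mu)^2}{2\sigma^2}\right), \quad \text{for } x > 0,$$
and $f(x) = 0$ for $x \leq 0$.
The notation of the parameters $\mu$ and $\sigma$ is motivated by the fact that the logarithm of a log-normal random variable follows a normal distribution of mean $\mu$ and variance $\sigma^2$. However, $\mu$ and $\sigma$ respectively control the scale (through $e^{\mu}$) and the shape of the log-normal distribution.
PROPOSITION 3. The SEP, the FIM and the FSC of the log-normal distribution with parameters $\mu$ and $\sigma > 0$ are given by
$$N_X = \sigma^2 e^{2\mu}, \quad I_X = \frac{1 + \sigma^2}{\sigma^2}\, e^{2(\sigma^2 - \mu)} \quad \text{and} \quad C_X = (1 + \sigma^2)\, e^{2\sigma^2}.$$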
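The log-normal case is simple enough to be evaluated in a few lines. The sketch below assumes the closed forms $N_X = \sigma^2 e^{2\mu}$, $I_X = (1+\sigma^2)\,\sigma^{-2}\, e^{2(\sigma^2-\mu)}$ and $C_X = (1+\sigma^2)\, e^{2\sigma^2}$; the function name is illustrative.

```python
import math


def lognormal_fisher_shannon(mu, sigma):
    """Return (SEP, FIM, FSC) of a log-normal variable with parameters mu, sigma."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    sep = sigma ** 2 * math.exp(2.0 * mu)
    fim = (1.0 + sigma ** 2) / sigma ** 2 * math.exp(2.0 * (sigma ** 2 - mu))
    return sep, fim, sep * fim


# The FSC depends on the shape parameter sigma only, not on mu.
print(lognormal_fisher_shannon(0.0, 1.0))
print(lognormal_fisher_shannon(3.0, 0.5)[2], lognormal_fisher_shannon(-1.0, 0.5)[2])
```

As $\sigma$ tends to 0 the FSC tends to 1, which is consistent with a log-normal distribution of small shape parameter being nearly Gaussian.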

DATA DRIVEN NON-PARAMETRIC ESTIMATION
Complex real-world data sets rarely follow parametric distributions. Provided that enough data are available, it is also possible to carry out Fisher-Shannon analysis with a non-parametric estimation of the density, which releases us from parametric assumptions on the distribution. In this paper, integral estimates of the SEP and the FIM are considered, which consist in substituting kernel density estimators (KDE) of $f(x)$ and its derivative in the integral forms of (1) and (3) (Bhattacharya, 1967; Dmitriev and Tarasenko, 1973; Prakasa Rao, 1983; Györfi and van der Meulen, 1987; Joe, 1989). Python and R implementations of the content of this section are proposed, see the software availability section at the end of this paper.
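Following Wand and Jones (1994), let $X_1, \dots, X_n$ be a random sample of size $n$ from a PDF $f(x)$. Consider also a kernel $K(u)$, a bounded PDF which is symmetric around zero, has a finite fourth moment and is differentiable. The KDE of $f$ is
$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \tag{10}$$
where $h > 0$ is the bandwidth parameter. In this paper, the Gaussian kernel defined by $K(u) = (2\pi)^{-1/2} \exp(-u^2/2)$ is used and the estimator (10) becomes
$$\hat{f}_h(x) = \frac{1}{nh\sqrt{2\pi}} \sum_{i=1}^{n} \exp\left(-\frac{(x - X_i)^2}{2h^2}\right).$$
The integral estimate of (2) is
$$\hat{N}_X = \frac{1}{2\pi e} \exp\left(-2 \int \hat{f}_h(x) \log \hat{f}_h(x)\, dx\right).$$
Let us denote by $f'$ the derivative of $f$ with respect to $x$. Usually, $f'$ is estimated by $\hat{f}'_h$. With the Gaussian kernel we obtain
$$\hat{f}'_h(x) = \frac{1}{nh^3\sqrt{2\pi}} \sum_{i=1}^{n} -(x - X_i) \exp\left(-\frac{(x - X_i)^2}{2h^2}\right).$$
Then, the integral estimate of (3) is
$$\hat{I}_X = \int \frac{\left(\hat{f}'_h(x)\right)^2}{\hat{f}_h(x)}\, dx.$$
The FSC is estimated by multiplying $\hat{N}_X$ by $\hat{I}_X$.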
Several techniques exist to automate the bandwidth choice (Wand and Jones, 1994). In the following, the two-stage direct plug-in method (Sheather and Jones, 1991) is used. This method estimates the optimal bandwidth with respect to the asymptotic mean integrated squared error of $\hat{f}_h$. The interested reader can find further technical details in Wand and Jones (1994) and Sheather and Jones (1991).
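The integral estimates described in this section can be sketched with the Python standard library as follows. This is not the code of the accompanying Python and R packages: the function names are hypothetical, the integrals are approximated by a trapezoidal rule on a regular grid, and, for brevity, the bandwidth is chosen with Silverman's rule of thumb instead of the two-stage direct plug-in method used in the paper.

```python
import math
import random
import statistics


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth (a stand-in for the direct plug-in method)."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    spread = min(sd, iqr / 1.34) if iqr > 0.0 else sd
    return 0.9 * spread * n ** (-0.2)


def kde_fisher_shannon(sample, h=None, grid_size=512, pad=6.0):
    """Integral estimates (SEP, FIM, FSC) based on a Gaussian-kernel KDE."""
    if h is None:
        h = silverman_bandwidth(sample)
    n = len(sample)
    lo, hi = min(sample) - pad * h, max(sample) + pad * h
    dx = (hi - lo) / (grid_size - 1)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    entropy, fim = 0.0, 0.0
    for j in range(grid_size):
        x = lo + j * dx
        f, df = 0.0, 0.0
        for xi in sample:
            u = (x - xi) / h
            w = math.exp(-0.5 * u * u)
            f += w        # kernel sum for the density estimate
            df -= u * w   # kernel sum for the derivative estimate
        f *= norm
        df *= norm / h
        weight = 0.5 if j in (0, grid_size - 1) else 1.0  # trapezoidal weights
        if f > 0.0:
            entropy -= weight * f * math.log(f) * dx
            fim += weight * df * df / f * dx
    sep = math.exp(2.0 * entropy) / (2.0 * math.pi * math.e)
    return sep, fim, sep * fim


random.seed(0)
data = [random.gauss(0.0, 2.0) for _ in range(500)]
sep_hat, fim_hat, fsc_hat = kde_fisher_shannon(data)
print(sep_hat, fim_hat, fsc_hat)
```

For a Gaussian sample of standard deviation 2, the estimates should be roughly 4 for the SEP and 0.25 for the FIM, with an FSC slightly above 1. The double loop costs `grid_size * n` kernel evaluations, so a vectorised or binned implementation is preferable for long series.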

CASE STUDIES
In this section we explore two applications of SEP, FIM and FSC to time series. First, a synthetic experiment is used to show the usefulness of the method in detecting the dynamical behaviours of chaotic systems. Then, an example of possible application of the proposed method to real environmental data is discussed.

Logistic map
A synthetic experiment is designed to investigate how SEP, FIM and FSC can be used to detect behavioural changes in nonlinear dynamical systems. Following the experiment proposed by Martin et al. (2001), the logistic map defined by
$$x_{n+1} = c\, x_n (1 - x_n),$$
where $c$ is the control parameter, is analyzed using a sliding window technique.
The sequence $(x_n)$ is computed up to $n = 1000$ for $c \in [3.5, 4]$. Centered Gaussian noise with different levels of variance, 0.05, 0.10 and 0.15, is added to $x_n$. The well-known bifurcation diagram of the logistic map is displayed in Fig. 2. The SEP, FIM and FSC are computed on the data included in overlapping windows of width $2.5 \cdot 10^{-3}$ along the control parameter, and the results are shown in the same figure. The Lyapunov exponent is also added for comparison. The results are also displayed in the FSIP, see Fig. 3.
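The data generation for this experiment can be sketched as below. Several details are assumptions made for illustration, since they are not specified in the text: the initial condition `x0`, the optional `burn_in`, the number `n_c` of control-parameter values pooled in each window, and the interpretation of the noise level as a standard deviation (`noise_sd`). The Lyapunov exponent is estimated as the orbit average of $\log |c(1 - 2x_n)|$.

```python
import math
import random


def logistic_orbit(c, n=1000, x0=0.3, burn_in=0):
    """Iterate x_{n+1} = c x_n (1 - x_n) and return n values after burn_in steps."""
    x = x0
    for _ in range(burn_in):
        x = c * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        x = c * x * (1.0 - x)
        orbit.append(x)
    return orbit


def lyapunov_exponent(c, orbit):
    """Average of log |f'(x)| along the orbit, with f'(x) = c (1 - 2x)."""
    tiny = 1e-300  # guards against log(0) if the orbit hits x = 0.5 exactly
    return sum(math.log(max(abs(c * (1.0 - 2.0 * x)), tiny)) for x in orbit) / len(orbit)


def noisy_window(c_center, width=2.5e-3, n_c=5, n=1000, noise_sd=0.0, seed=0):
    """Pool noisy orbit values for n_c control parameters inside one window."""
    rng = random.Random(seed)
    values = []
    for j in range(n_c):
        c = min(c_center - 0.5 * width + width * j / (n_c - 1), 4.0)
        values.extend(x + rng.gauss(0.0, noise_sd) for x in logistic_orbit(c, n))
    return values


print(lyapunov_exponent(3.5, logistic_orbit(3.5, burn_in=200)))  # periodic regime
print(lyapunov_exponent(4.0, logistic_orbit(4.0, n=5000)))       # chaotic regime
```

Each list returned by `noisy_window` can then be passed to any SEP, FIM and FSC estimator, and sliding `c_center` across $[3.5, 4]$ with a step smaller than `width` yields overlapping windows.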
Analyzing the results obtained from the data without noise, it is easy to see how the peaks of the SEP, FIM and FSC correspond to the dynamical changes shown by the bifurcation diagram and the Lyapunov exponent. With the logarithmic scale on the y-axis, the behaviour of the SEP is somewhat symmetric to the behaviour of the FIM, i.e. the FIM seems to be inversely proportional to the SEP. However, this is not exactly the case, otherwise the FSC would be constant. In some sense, the perturbations in the FSC reflect the departure from the inverse proportionality between the SEP and the FIM. In the FSIP, perfect inverse proportionality corresponds to iso-complex curves. Indeed, the trajectory of the logistic map in the FSIP is stretched along iso-complex curves, see Fig. 3.
Adding noise makes most of the peaks undetectable, see Fig. 2. However, the FSC seems to be the measure which suffers the least from noise in the data. Note also that the FIM is less impacted than the SEP. The noise effect is more interesting in the FSIP, see Fig. 3. While the uncorrupted data are quite hard to interpret due to the superposition of the trajectory with itself, adding some noise seems to clarify the complexity and trajectory behaviours in the FSIP. Noise stimulates the emergence of protuberances roughly corresponding to the "islands of stability" of the (uncorrupted) bifurcation diagram, where the Lyapunov exponent is negative. This emergence is due to the fact that the FIM is less impacted than the SEP, as seen above.

Application to high frequency wind data
The Fisher-Shannon information method can find wide application in the geo-environmental domains. Here we show how it can be applied to retrieve relevant knowledge from environmental time series. Specifically, high-frequency wind speed data are analyzed. The time series consists of 1 Hz wind speed data, from 28 November 2016 to 29 January 2017 (Fig. 4). The data (motus.epfl.ch) were measured at 25.5 m above the ground by a sensor placed on a meteorological mast located on the campus of the Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Notice that the mast is surrounded by a building layout of 10 m average height. More information on these measurements can be found in Mauree et al. (2017a,b). The Fisher-Shannon quantities are computed with non-overlapping moving windows of 1 hour width along the time axis. Globally, all quantities vary with time, indicating non-stationarity, see Fig. 4. The SEP seems to roughly replicate the behaviour of the original time series. This is due to a proportional effect between the mean and the variance of the data, as shown in Fig. 5. As for the logistic map case, the FIM is roughly inversely proportional to the SEP (not shown in logarithmic scale). The FSC is close to 1 during long periods of time, e.g. between the 17th and the 27th of January 2017. This should indicate a local behaviour of wind speed close to a Gaussian one. During these periods, wind speed is not necessarily calm, e.g. on the 17th of January. Conversely, the FSC also exhibits some peaks where wind speed is rather low, which should indicate a more complex distribution of the data. To verify this, a closer exploration of the data is required. To this aim, we considered four subsets of three hours each, denoted by A, B, C and D and represented in Fig. 4 in red, purple, blue and green, respectively. Histograms and quantile-quantile (Q-Q) plots of these data subsets are also plotted with the corresponding colors.
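The windowing step can be sketched generically as follows. The helper names are hypothetical, and `estimator` stands for any function mapping a list of values to a number, for instance a KDE-based FSC estimate. With 1 Hz data, a window of 1 hour corresponds to `width=3600` samples.

```python
import statistics


def non_overlapping_windows(series, width):
    """Yield consecutive windows of `width` samples; a trailing partial window is dropped."""
    for start in range(0, len(series) - width + 1, width):
        yield series[start:start + width]


def windowed_feature(series, width, estimator):
    """Apply `estimator` to every non-overlapping window of the series."""
    return [estimator(window) for window in non_overlapping_windows(series, width)]


# Toy usage with the variance as a stand-in estimator.
toy_series = [float(i % 7) for i in range(20)]
print(windowed_feature(toy_series, 5, statistics.pvariance))
```

Dropping the trailing partial window is a design choice made here so that every estimate is based on the same number of samples.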
The subset D is chosen during the period of almost unitary FSC. The corresponding histogram and Q-Q plot confirm the very-close-to-Gaussian behaviour of the data. The subset C is also chosen with an FSC close to 1, but centred on the maximum of the SEP on the 17th of January 2017, which also corresponds to high wind speed activity. The histogram again shows a distribution close to a Gaussian one, but with a higher variance than D. This was an expected output, since for a Gaussian distribution the SEP equals the variance and C was chosen with a high SEP. The Q-Q plot shows a small departure for the left tail, but the data are still relatively close to what was expected. The subset B is centred on a peak of the FSC. The histogram shows a distribution which is very far from Gaussianity. It is clearly asymmetric and has at least two modes, maybe three. The Q-Q plot shows a strong departure from the Gaussian distribution, especially on the left tail. The subset A is centred on the highest FSC value. Its histogram shows three modes, maybe four. The corresponding Q-Q plot shows that the data of this subset are even farther from Gaussianity than those of the previous subset.
These results show the high complexity of these data, whose behaviour can rapidly change locally in time or even during calm weather. Further analysis of a larger data set of these measurements using the FSC can be found in Guignard et al. (2019b).

CONCLUSIONS
This paper introduced the Fisher-Shannon information method as an effective data exploration tool able to give insight into complex non-linear and non-stationary time series. The Fisher-Shannon method was presented in a unified framework and new interpretations of the FSC were identified. In particular, the detection of potential Gaussian behaviour in the data was exemplified on high-frequency wind speed data. Moreover, FIM and FSC were computed in closed form for some parametric distributions. Theoretical formulas for other random variables can be derived depending on the problem at hand. While SEP, FIM and FSC were presented as information-based exploratory tools, they can also be used for time series discrimination or, more generally, to generate time series features for clustering, modelling and forecasting.
From a theoretical point of view, future studies should involve the generalisation of the Fisher-Shannon method to the multivariate case. Several investigations could be made for the KDE-based estimation of the FIM. In particular, other estimates could be provided by re-substitution techniques, as for the entropy. The optimal bandwidth choice with respect to the asymptotic mean squared error of the FIM, or even of the FSC, could be derived. More practically, an exploratory analysis of spatio-temporal data is planned.

APPENDIX
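The differential entropy $H_X$ for the Gamma, Weibull and log-normal distributions can be found in Lazo and Rathie (1978) and Cover and Thomas (2006). The SEP is simply a non-linear transform of $H_X$.
PROOF OF PROPOSITION 1. Computing the second derivative of $\log f(x)$, one has
$$\frac{\partial^2}{\partial x^2} \log f(x) = -\frac{k-1}{x^2},$$
and then, using (4), the variable change $x = \theta y$ and the properties of the Gamma function,
$$I_X = \frac{k-1}{\Gamma(k)\, \theta^{k}} \int_0^{\infty} x^{k-3} e^{-x/\theta}\, dx = \frac{k-1}{\theta^2\, \Gamma(k)} \int_0^{\infty} y^{k-3} e^{-y}\, dy = \frac{(k-1)\, \Gamma(k-2)}{\theta^2\, \Gamma(k)} = \frac{1}{\theta^2 (k-2)},$$
yielding the FIM for the Gamma distribution. The FSC is directly obtained by multiplying the SEP and the FIM.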
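The closed form of the Gamma FIM can be cross-checked against the integral definition $I_X = \int (f'(x))^2 / f(x)\, dx$. The sketch below uses a plain trapezoidal rule and the fact that, for the Gamma PDF, $(f')^2/f = f(x)\,((k-1)/x - 1/\theta)^2$. The function names, the grid size and the upper integration limit are illustrative assumptions, and the check is restricted to $k > 3$, where the integrand vanishes at the origin.

```python
import math


def gamma_pdf(x, theta, k):
    """Gamma density with scale theta and shape k, evaluated on the log scale."""
    return math.exp((k - 1.0) * math.log(x) - x / theta - math.lgamma(k) - k * math.log(theta))


def gamma_fim_quadrature(theta, k, n=200001):
    """Trapezoidal approximation of int (f')^2 / f dx for a Gamma density, k > 3."""
    if k <= 3.0:
        raise ValueError("this simple rule assumes k > 3 (integrand vanishing at 0)")
    hi = theta * (k + 40.0 * math.sqrt(k) + 40.0)  # far in the right tail
    dx = hi / (n - 1)
    total = 0.0  # the integrand is 0 at x = 0, so the sum starts at x = dx
    for i in range(1, n):
        x = i * dx
        score = (k - 1.0) / x - 1.0 / theta  # derivative of log f
        g = gamma_pdf(x, theta, k) * score * score
        total += 0.5 * g if i == n - 1 else g
    return total * dx


theta, k = 1.5, 4.0
print(gamma_fim_quadrature(theta, k), 1.0 / (theta ** 2 * (k - 2.0)))
```

For shapes $2 < k \leq 3$ the integrand is singular but integrable at the origin, so an adaptive rule or a change of variable would be needed to reach the same accuracy.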
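PROOF OF PROPOSITION 2. Starting from the Weibull PDF, one has
$$\frac{\partial^2}{\partial x^2} \log f(x) = -\frac{k-1}{(x-\mu)^2} - \frac{k(k-1)}{\lambda^2} \left(\frac{x-\mu}{\lambda}\right)^{k-2},$$
and, using (4) with the variable change $y = \left(\frac{x-\mu}{\lambda}\right)^{k}$,
$$I_X = \frac{k-1}{\lambda^2} \int_0^{\infty} \left(y^{-2/k} + k\, y^{1-2/k}\right) e^{-y}\, dy = \frac{k-1}{\lambda^2} \left[\Gamma\left(1 - \frac{2}{k}\right) + k\, \Gamma\left(2 - \frac{2}{k}\right)\right] = \frac{(k-1)^2}{\lambda^2}\, \Gamma\left(1 - \frac{2}{k}\right),$$
where the last equality uses $\Gamma(2 - 2/k) = (1 - 2/k)\, \Gamma(1 - 2/k)$. The FSC is obtained by multiplying the SEP and the FIM.
PROOF OF PROPOSITION 3. Starting from the log-normal PDF, one has
$$\frac{\partial^2}{\partial x^2} \log f(x) = \frac{\sigma^2 - 1 + \log x - \mu}{\sigma^2 x^2},$$
and, using (4) with the variable change $y = \log x - \mu$, one has
$$I_X = \frac{1}{\sigma^3 \sqrt{2\pi}} \int_{-\infty}^{\infty} \left(1 - \sigma^2 - y\right) \exp\left(-\frac{y^2}{2\sigma^2} - 2y - 2\mu\right) dy = \frac{1 + \sigma^2}{\sigma^2}\, e^{2(\sigma^2 - \mu)},$$
where the last equality follows by completing the square in the exponent. The FSC is obtained by multiplying the SEP and the FIM.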

CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTHOR CONTRIBUTIONS
F.G. conceived the main conceptual ideas, conducted investigations, developed the theoretical formalism, performed the calculations, interpreted the computational results and wrote the original draft. F.G. and M.L. developed Python and R packages. F.G., F.A. and M.K. wrote the final version of the paper. M.K. carried out the supervision, project administration and funding acquisition. All authors discussed the results, provided critical feedback, commented, reviewed and edited the original manuscript, and gave final approval for publication.

FUNDING
F.G. and M.K. acknowledge the support of the National Research Programme 75 "Big Data" (PNR75) of the Swiss National Science Foundation (SNSF), project no. 167285.