Data series embedding and scale invariant statistics

https://doi.org/10.1016/j.humov.2009.08.004

Abstract

Data sequences acquired from bio-systems, such as human gait data, heart rate interbeat data, or DNA sequences, exhibit complex dynamics that are frequently described by long memory or a power–law decay of the autocorrelation function. One way of characterizing these dynamics is through scale invariant statistics or “fractal-like” behavior. Several methods have been proposed for quantifying the scale invariant parameters of physiological signals. The most common among them are detrended fluctuation analysis, sample mean variance analysis, power spectral density analysis, R/S analysis, and, more recently in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in a high-dimensional pseudo-phase space reveals scale invariant statistics in a simple fashion. The procedure is applied to different stride interval time series from human gait measurements (Physio-Bank data library). Results show that the introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were also analyzed, and scale-free trends over limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with varying noise content. The possibility that the method falsely detects long-range dependence in artificially generated short-range dependent series was also investigated.

Introduction

It has been recognized that many biological systems exhibit complex behavior that is governed by fractal dynamical processes (Goldberger et al., 2002, Peng et al., 2000, Stanley et al., 1992). For example, in healthy human gait this is revealed through long-memory fluctuations of the gait cycle, also referred to as the stride interval. Such a feature emerges from data characterized by long-range correlations with a power–law decay that lasts over a large number of data points. It persists in various “free” walking rhythms and disappears only in artificially constrained (metronome-paced) or naturally constrained (disease or aging) walking (Hausdorff et al., 1995, Hausdorff et al., 1996). It is also present in human heartbeat interval data series (RR intervals), whether extracted from subjects with a normal healthy sinus rhythm or from patients with congestive heart failure (Peng et al., 1995, Stanley et al., 1992).

Detrended fluctuation analysis (DFA) (Peng et al., 1994) and the multifractal approach (wavelet analysis) (Muzy, Bacry, & Arneodo, 1993; Stanley et al., 1999) are commonly used for revealing such correlations, and especially for uncovering alterations resulting from disease or aging. DFA is particularly well suited for quantifying long-range correlations in biological time series, which almost always include so-called “trends” as nonstationary ingredients. It is still unclear whether it is advisable to eliminate those heterogeneities from biological sequences when looking for the intrinsic long-range correlations that are produced solely by the internal dynamic process (Bernaola-Galván et al., 2002, Bernaola-Galván et al., 1996).

In this paper we show that when time series data are embedded in a high-dimensional vector space, they form a trajectory that exhibits fractal-like behavior when the dimension of the embedding space is used as the relevant scale of observation. We demonstrate that this is a competitive method for separating different fractal characteristics in bio-systems data. The proposed approach uses the original data without removing their intrinsic trends (heterogeneities).

Measured data sequences are frequently rather short because of objective and subjective difficulties in the measurement process (for example, with elderly subjects or subjects in poor health). Revealing scale invariant structures in such short data sets can be difficult. We demonstrate that embedding the data in a high-dimensional phase space allows long-memory trends present in smaller data sets to be estimated, which makes the method particularly suitable for clinical applications.

Embedding

Embedding is a mapping of the consecutive time series data $X(t_i)$ (or some other natural order data) to $m$-dimensional pseudo-phase space coordinates (delay coordinates) (Packard et al., 1980, Parker and Chua, 1987), where the respective state “delay” vectors are given as

$$\mathbf{x}_i^T = [x(t_i),\ x(t_i+\tau),\ x(t_i+2\tau),\ x(t_i+3\tau),\ \ldots,\ x(t_i+(m-1)\tau)].$$

In the above expression $\tau$ is the “lag time”, defined as some integer multiple of the sampling time $t_s$, $t_s = t_{i+1} - t_i$, with the product $(m-1)\tau$ identified as the “window width” $\tau_w$. It
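The delay-vector construction defined above can be illustrated with a minimal sketch. It assumes a uniformly sampled series held in a NumPy array and a lag `tau` expressed in samples; the function name `delay_embed` and the toy series are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np

def delay_embed(x, m, tau=1):
    """Return a matrix whose i-th row is the delay vector
    [x[i], x[i + tau], ..., x[i + (m - 1) * tau]].

    m   : embedding dimension
    tau : lag in samples (an integer multiple of the sampling time)
    """
    x = np.asarray(x, dtype=float)
    window = (m - 1) * tau            # window width tau_w, in samples
    n_vectors = x.size - window       # number of complete delay vectors
    if m < 1 or tau < 1 or n_vectors < 1:
        raise ValueError("series too short for this choice of m and tau")
    # Column k holds the series shifted by k * tau samples.
    return np.column_stack(
        [x[k * tau : k * tau + n_vectors] for k in range(m)]
    )

# Toy series 0, 1, ..., 9 embedded with m = 3 and tau = 2.
series = np.arange(10.0)
vectors = delay_embed(series, m=3, tau=2)
print(vectors.shape)   # (6, 3)
print(vectors[0])      # [0. 2. 4.]
```

With `m = 3` and `tau = 2` the window width is 4 samples, so a series of 10 points yields 6 complete delay vectors.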

Physio-Bank data

The first set of experiments was performed on selected data from the Physio-Bank gait database (Goldberger et al., 2000). These data records represent one-hour stride interval time series from 10 young, healthy men walking at their normal (usual) rate and at a rate paced with a metronome. The variance of the projection of the trajectory vectors onto the main diagonal was calculated in Mathematica by summing the rows of the data covariance matrix Ξ.
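The covariance-based calculation described above can be sketched as follows, in Python with NumPy rather than Mathematica. This is an illustration under stated assumptions, not the authors' code: it assumes the projection is taken onto the unit vector $(1, \ldots, 1)/\sqrt{m}$, for which the projection variance equals the sum of all entries of Ξ divided by $m$. The normalization, the function name `diagonal_projection_variance`, the synthetic white-noise input, and the log-log slope fit are all hypothetical choices that the excerpt does not specify.

```python
import numpy as np

def diagonal_projection_variance(x, m, tau=1):
    """Variance of the delay vectors projected onto the main diagonal.

    Assumption: the diagonal direction is the unit vector
    u = (1, ..., 1) / sqrt(m), so var(u . x_i) = u^T Xi u = sum(Xi) / m.
    """
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    if m < 2 or tau < 1 or n_vectors < 2:
        raise ValueError("need m >= 2 and at least two delay vectors")
    # Rows of `vectors` are the delay vectors x_i.
    vectors = np.column_stack(
        [x[k * tau : k * tau + n_vectors] for k in range(m)]
    )
    cov = np.cov(vectors, rowvar=False)   # m x m sample covariance matrix Xi
    row_sums = cov.sum(axis=1)            # summation of the rows of Xi
    return row_sums.sum() / m

# Synthetic uncorrelated noise stands in for a stride interval series.
rng = np.random.default_rng(0)
white = rng.standard_normal(4096)

# Use the embedding dimension m as the scale of observation.
dims = np.arange(2, 33)
variances = np.array([diagonal_projection_variance(white, m) for m in dims])

# Slope of log(variance) against log(m); under this normalization it
# should stay close to zero for uncorrelated data.
slope = np.polyfit(np.log(dims), np.log(variances), 1)[0]
print(slope)
```

Under this normalization, a series with positive long-range correlations would be expected to give a variance that grows with `m`, whereas uncorrelated data give a roughly flat curve. How the paper maps such a slope to a scaling exponent is not stated in this excerpt.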

Calculations were performed as follows: for the

Conclusions

Embedding time series data in a high-dimensional pseudo-phase space and projecting the trajectory vectors onto successively lower subspaces provides a simple method for uncovering fractal dynamics in physiological signals. The main diagonal projections of the data vectors reveal scale invariant statistics when the embedding dimension m is used as the scale of observation. The applicability of the method is tested on different human gait data sets of various lengths from the Physio-Bank database (normal and

Acknowledgments

The authors wish to thank PhysioNet and the National Institutes of Health (USA) for their valuable free resources of human gait data that made this investigation possible.

References (43)

  • H.E. Stanley et al. (1992). Fractal landscapes in biological systems: Long-range correlations in DNA and interbeat heart intervals. Physica A.
  • P. Terrier et al. (2005). GPS analysis of human locomotion: Further evidence for long-range correlations in stride-to-stride fluctuations of gait parameters. Human Movement Science.
  • J. Beran (1998). Statistics for long-memory processes.
  • P. Bernaola-Galván et al. (1996). Compositional segmentation and long-range fractal correlations in DNA sequences. Physical Review E.
  • Z. Chen et al. (2002). Effects of nonstationarities on detrended fluctuation analysis. Physical Review E.
  • R.B. Davies et al. (1987). Tests for Hurst effect. Biometrika.
  • D. Delignières et al. (2009). Fractal dynamics of human gait: A reassessment of the 1996 data of Hausdorff et al. Journal of Applied Physiology.
  • A. Eke et al. (2000). Physiological time series: Distinguishing fractal noises from motions. Pflügers Archives.
  • A. Eke et al. (2002). Fractal characterization of complexity in temporal physiological signals. Physiological Measurement.
  • A.L. Goldberger et al. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation.
  • A.L. Goldberger et al. (2002). Fractal dynamics in physiology: Alterations with disease and aging. Proceedings of the National Academy of Science of the United States of America (PNAS).