On the application of entropic half-life and statistical persistence decay for quantification of time dependency in human gait

Entropic half-life (ENT½) and statistical persistence decay (SPD) were recently introduced as measures of time dependency in stride time intervals during walking. The present study investigated the effect of data length on ENT½ and SPD and additionally applied these measures to stride length and stride speed intervals. First, stride times were collected from subjects during one hour of treadmill walking. ENT½ and SPD were calculated from a range of stride numbers between 250 and 2500. Secondly, stride times, stride lengths and stride speeds were collected from subjects during 16 min of treadmill walking. ENT½ and SPD were calculated from the stride times, stride lengths and stride speeds. The ENT½ values reached a plateau between 1000 and 2500 strides whereas the SPD increased linearly with the number of included strides. This suggests that ENT½ values can be compared if 1000 strides or more are included, but only SPD values obtained from the same number of strides should be compared. The ENT½ and SPD of the stride times were significantly longer than those of the stride lengths and stride speeds. This indicates that the time dependency is greater in the motor control of stride time than in that of stride length and stride speed. © 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
A key feature of human walking is the time dependency in the stride-to-stride fluctuation of stride time (ST), stride length (SL) and stride speed (SS) (Hausdorff et al., 1995; Hausdorff et al., 1996; Deriaz, 2011, 2012; Terrier et al., 2005). Several studies have characterized this time dependency using entropy measures and detrended fluctuation analysis (DFA) to improve the fundamental understanding of walking motor control as well as to describe the impairment induced by various diseases (Afsar et al., 2016; Alkjaer et al., 2015; Gates and Dingwell, 2007; Hausdorff, 2009; Hausdorff et al., 1997; Kaipust et al., 2012). Entropy measures quantify the regularity of a time series and, among several different entropy algorithms, sample entropy (SaEn) is the most popular (Yentes et al., 2013). DFA, in turn, returns a scaling exponent describing the degree of statistical persistence or anti-persistence in a time series. However, while the two methods quantify different characteristics of the time dependency in a time series, neither returns an output on an interpretable physiological or physical time scale. Thus, comparing the outcomes to other biomechanical or neurophysiological measurements is difficult.
Recently, Von Tscharner and colleagues introduced entropic half-life (ENT½), which estimates the elapsed time until the predictability in a time series is halved (Baltich et al., 2014; Federolf et al., 2015; Zandiyeh and Von Tscharner, 2013). ENT½ is based on consecutive calculations of SaEn on rescaled versions of the original time series with increasing randomization (Zandiyeh and Von Tscharner, 2013). When applied to movement variables, ENT½ quantifies how much time elapses before the influence of previous movements on future movements is substantially reduced. Recently, we applied ENT½ to ST intervals recorded during overground and treadmill walking in order to estimate the time dependency of human gait on an interpretable scale (number of strides). We observed that the predictability in ST intervals was halved within 11 and 14 consecutive strides during overground and treadmill walking, respectively, with no significant difference between the two conditions. In addition to ENT½, we introduced statistical persistence decay (SPD), which is based on the same rescaling method and applies DFA to estimate the deterioration of statistical persistence over time in a time series. We observed that the statistical persistence in ST intervals deteriorated into uncorrelated noise within approximately 50 strides during walking.
A critical aspect when applying DFA and SaEn is the number of included data points, and comparisons of results between studies with different numbers of data points should be made with caution (see Eke et al. (2000), Eke et al. (2002), Delignieres et al. (2005), Yentes et al. (2013), Yentes et al. (2018), Marmelat and Meidinger (2019)). Furthermore, the outcomes of SaEn and DFA estimate only the relative degree of regularity and persistency/anti-persistency, respectively. In contrast, ENT½ and SPD return outputs on an interpretable scale (e.g. the number of strides or milliseconds (Baltich et al., 2014)). To exploit this advantage, and to enable future comparisons with other biomechanical or neurophysiological measurements as well as between studies using different data lengths, it is crucial that any potential data length bias can be validly adjusted for. Thus, the main purpose of the present study was to investigate the effect of data length on ENT½ and SPD and verify that both methods can validly be used on data sets of various lengths. We included the ST data recorded during treadmill walking in our previous study and reanalyzed it for time series lengths of 250-2500 strides. Furthermore, to verify these results we analyzed ST data recorded from a similar subject group during treadmill walking of a shorter duration. For the methods to be valid, the analyses of the two data sets should reveal similar results.
In our previous study, we applied ENT½ and SPD to ST intervals. However, the dynamics of stride-to-stride fluctuations depend on the variable in question (e.g. ST, SL and SS), suggesting that different control strategies are utilized for different variables (Decker et al., 2013; Dingwell et al., 2010). Thus, it has been observed that the fluctuations in SS during treadmill walking exhibit an uncorrelated noise-like pattern, in contrast to ST and SL, which show statistical persistence (Decker et al., 2013; Dingwell and Cusumano, 2015; Dingwell et al., 2010; Terrier and Deriaz, 2012). The secondary purpose of the present study was to investigate ENT½ and SPD in ST, SL and SS fluctuations during treadmill walking. Additionally, to compare the present study with previously reported results, DFA was calculated on these time series.

Method
The present study included a reanalysis of data from two separate experiments. The first experiment has been described by Raffalt and Yentes (2018) and Yentes et al. (2018) and the second experiment has been described in Wiens et al. (2017).

Subjects
The first experiment included 14 subjects with an age (mean ± SD) of 25.0 ± 4.2 years, body mass of 69.4 ± 16.9 kg and height of 170.8 ± 11.9 cm, and the second experiment included 10 subjects with an age of 21.1 ± 1.5 years, body mass of 71.6 ± 10.0 kg and height of 172.8 ± 11.1 cm. The subjects had no diagnosed lower limb injuries within the past years and no neurological disorders. Upon arrival at the laboratory, the experimental protocol was explained to the subjects and they gave their informed written consent to participate. The two studies were approved by the Institutional Review Board of the University of Nebraska Medical Center and were carried out in accordance with the approved guidelines.

Experimental setup
During the first experiment, the subjects walked for one hour on a treadmill (AMTI, Watertown, MA) at their self-selected preferred walking speed. This speed was established by repeatedly increasing and decreasing the speed of the treadmill above and below what was reported as most comfortable by the subject. Footswitches (Trigno™ 4-channel FSR Sensor, Delsys Inc., Natick, MA) placed under both heels recorded heel strikes at 148 Hz. No objective measures of fatigue development were obtained; however, none of the subjects reported that fatigue influenced their gait.
During the second experiment, the subjects walked for 16 min on a treadmill (Bertec, Columbus, Ohio) at their self-selected preferred walking speed (determined as in the first experiment). The three-dimensional position of reflective markers placed on the heel and the first metatarsal of both feet was recorded at 100 Hz by eight infrared cameras (Vicon, Oxford, UK).

Data analysis
For the first experiment, heel strikes were identified from the footswitches and the time between consecutive heel strikes on the right leg was calculated for 2500 strides for each subject. Additional ST interval time series were created with 250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250 and 2500 strides included. For the second experiment, heel strike was identified as a local maximum in the heel marker data in the anterior-posterior direction indicating a change from a forward motion during the end of swing phase to a backward motion during the contact phase (Zeni et al., 2008). ST, SL and SS were extracted from the individual number of strides completed during the walking trial. ENT½ and SPD were calculated from the ST interval time series from both experiments and additionally, from the SL and SS intervals extracted from the second experiment.
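The extraction steps described above can be sketched as follows. This is a minimal illustration, not the authors' processing pipeline: the function names are hypothetical, the local-maximum rule is the simplest possible reading of the heel-marker criterion, and taking the first N strides for each shortened series is an assumption (the text does not state which strides were retained).

```python
def heel_strike_indices(heel_ap):
    """Return sample indices of local maxima in the anterior-posterior heel position."""
    return [
        i for i in range(1, len(heel_ap) - 1)
        if heel_ap[i - 1] < heel_ap[i] >= heel_ap[i + 1]
    ]


def stride_times(heel_strike_times):
    """Stride time = interval between consecutive heel strikes of the same foot."""
    return [t1 - t0 for t0, t1 in zip(heel_strike_times, heel_strike_times[1:])]


def truncated_series(series, lengths=range(250, 2501, 250)):
    """Map each requested length to the first `length` values (ASSUMED selection rule)."""
    return {n: series[:n] for n in lengths if n <= len(series)}


# Toy example with hypothetical marker values sampled at 100 Hz.
sampling_rate_hz = 100
heel_ap = [0, 1, 2, 1, 0, 1, 3, 1, 0, 2, 4, 2]
strikes = heel_strike_indices(heel_ap)            # sample indices of the maxima
times = [i / sampling_rate_hz for i in strikes]   # convert samples to seconds
print(stride_times(times))
```

Real marker data would normally be filtered before peak detection; that step is omitted here because the text does not describe it.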

Entropic half-life
The method of ENT½ has been described in detail elsewhere (Baltich et al., 2014; Raffalt and Yentes, 2018; Zandiyeh and Von Tscharner, 2013) and is summarized briefly below. The original ST interval time series was gradually randomized through successive reshaping, where the first reshaped time series (RTS) was equal to the original time series (e.g. [1-2-3-4-5-6-7-8-9-10-11-12]). The second and third RTS were then reorganized by taking every second and every third data point, respectively (e.g. [1-3-5-7-9-11-2-4-6-8-10-12] and [1-4-7-10-2-5-8-11-3-6-9-12]), and so on. The time series was reshaped 100 times, where each reshaping resulted in an increased distance between subsequent strides. SaEn was calculated with m = 2 and r = 0.2 on each RTS and normalized according to equation (1), where SaEn_RS was the SaEn of the reshaped time series, SaEn_OR was the SaEn of the original time series and SaEn_RAN was the average SaEn of 50 randomized time series created by random permutation of the original time series. Finally, the normalized SaEn values from the RTS were plotted in a semi-logarithmic plot as a function of stride number. ENT½ was identified as the stride number corresponding to the first normalized SaEn value above 0.5. To verify parameter consistency, the ENT½ was calculated with m = 2 and 3 and r = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7 and 0.8.
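The reshaping and thresholding procedure can be sketched as below. This is an illustrative sketch only. Equation (1) is not reproduced in the text, so the normalization used here, (SaEn_RS - SaEn_OR) / (SaEn_RAN - SaEn_OR), is an assumption chosen so that the original series maps to 0 and a fully randomized series to 1. Taking r as a fraction of the series standard deviation is likewise an assumption, and all function names are hypothetical.

```python
import math
import random
import statistics


def reshape(series, scale):
    """Reorder by taking every `scale`-th point, starting at each offset in turn."""
    return [v for offset in range(scale) for v in series[offset::scale]]


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy with tolerance r * SD (Chebyshev distance, no self-matches)."""
    n = len(x)
    tol = r * statistics.pstdev(x)  # ASSUMPTION: r is scaled by the series SD
    matches_m = matches_m1 = 0
    for i in range(n - m):
        for j in range(i + 1, n - m):
            if all(abs(x[i + k] - x[j + k]) <= tol for k in range(m)):
                matches_m += 1
                if abs(x[i + m] - x[j + m]) <= tol:
                    matches_m1 += 1
    if matches_m == 0 or matches_m1 == 0:
        return float("inf")  # undefined for too-short or too-irregular series
    return -math.log(matches_m1 / matches_m)


def entropic_half_life(x, max_scale=100, n_random=50, m=2, r=0.2, seed=0):
    """First reshape scale whose normalized SaEn exceeds 0.5, or None if never reached."""
    rng = random.Random(seed)
    saen_or = sample_entropy(x, m, r)
    saen_ran = statistics.fmean(
        [sample_entropy(rng.sample(x, len(x)), m, r) for _ in range(n_random)]
    )
    span = saen_ran - saen_or
    if span == 0:
        return None
    for scale in range(1, max_scale + 1):
        saen_rs = sample_entropy(reshape(x, scale), m, r)
        # ASSUMED form of equation (1): 0 for the original series, 1 for random.
        if (saen_rs - saen_or) / span > 0.5:
            return scale
    return None


# Small synthetic example (hypothetical autocorrelated series, not gait data).
rng = random.Random(42)
series, value = [], 0.0
for _ in range(200):
    value = 0.9 * value + rng.gauss(0.0, 1.0)
    series.append(value)
print(entropic_half_life(series, max_scale=15, n_random=5))
```

The pairwise SaEn loop is quadratic in the series length, so the full setting in the text (2500 strides, 100 reshapes, 50 permutations) would be slow in pure Python; a vectorized implementation would be preferable for real data.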

Statistical persistence decay
SPD uses the same RTS as the ENT½ and calculates the scaling exponent using DFA with a box size range of [2,N] and a scaling region of 10-30, as previously described. Critical limits were established using equation (2), where μ_αRAN and σ_αRAN are the average scaling exponent and corresponding standard deviation of 100 random time series created by random permutation of the time series. As the order of the RTS becomes increasingly randomized with each rescaling, the statistical persistence changes towards the critical limit. The number of strides corresponding to the first scaling exponent within the critical limit was identified as the SPD. Thus, SPD indicates a change in the time series fluctuations from persistence/anti-persistence towards uncorrelated noise.
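A rough sketch of the SPD procedure is given below. It is not the authors' implementation. Equation (2) is not reproduced in the text, so the critical limits are assumed to take the form mean ± n_sd × SD of the random-permutation exponents, with the hypothetical parameter n_sd defaulting to 2. The DFA here uses first-order detrending with window sizes 10 to 30, which is one reading of the stated scaling region.

```python
import math
import random
import statistics


def reshape(series, scale):
    """Reorder by taking every `scale`-th point, starting at each offset in turn."""
    return [v for offset in range(scale) for v in series[offset::scale]]


def _linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha(x, window_sizes=range(10, 31)):
    """First-order DFA scaling exponent over the given window sizes (ASSUMED region)."""
    mean = statistics.fmean(x)
    profile, total = [], 0.0
    for v in x:  # integrate the mean-centered series
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        if n_windows == 0:
            continue
        t = list(range(n))
        squared_error = 0.0
        for w in range(n_windows):  # detrend each non-overlapping window
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = _linear_fit(t, segment)
            squared_error += sum(
                (segment[i] - (slope * i + intercept)) ** 2 for i in range(n)
            )
        fluctuation = math.sqrt(squared_error / (n_windows * n))
        if fluctuation > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))
    return _linear_fit(log_n, log_f)[0]


def statistical_persistence_decay(x, max_scale=100, n_random=100, n_sd=2.0, seed=0):
    """First reshape scale whose DFA exponent falls inside the critical limits."""
    rng = random.Random(seed)
    random_alphas = [dfa_alpha(rng.sample(x, len(x))) for _ in range(n_random)]
    mu = statistics.fmean(random_alphas)
    sigma = statistics.stdev(random_alphas)
    lower, upper = mu - n_sd * sigma, mu + n_sd * sigma  # ASSUMED form of equation (2)
    for scale in range(1, max_scale + 1):
        if lower <= dfa_alpha(reshape(x, scale)) <= upper:
            return scale
    return None


# Small synthetic example (hypothetical autocorrelated series, not gait data).
rng = random.Random(7)
series, value = [], 0.0
for _ in range(500):
    value = 0.9 * value + rng.gauss(0.0, 1.0)
    series.append(value)
print(statistical_persistence_decay(series, max_scale=30, n_random=20))
```

The series must be long enough to contain at least two of the window sizes; otherwise the log-log fit is undefined.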

Detrended fluctuation analysis
Detrended fluctuation analysis was applied to calculate the scaling exponent from the ST, SL and SS time series of the second experiment. Scaling exponents above 0.5 indicate statistical persistency, scaling exponents below 0.5 indicate statistical anti-persistency and scaling exponents close to 0.5 indicate an uncorrelated structure in the time series in question.

Theoretical signals
The present study also included the analysis of four theoretical signals (brown noise, pink noise, white noise and a Lorenz attractor signal) of 10 different lengths from 250 to 2500 data points. The analysis and results are presented in the supplementary material and summarized in the results and discussion.
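For orientation, three of the four theoretical signals can be generated as sketched below. This is an assumption-laden illustration rather than the procedure used in the supplementary material: brown noise is taken as the cumulative sum of Gaussian white noise, the Lorenz signal is the x-component integrated with a fourth-order Runge-Kutta scheme using the conventional parameters (sigma = 10, rho = 28, beta = 8/3) and a hypothetical step of 0.01, and pink noise is omitted because it needs a spectral-shaping step outside the scope of this sketch.

```python
import random


def white_noise(n, rng):
    """Independent Gaussian samples."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def brown_noise(n, rng):
    """Cumulative sum of white noise (a random walk)."""
    out, total = [], 0.0
    for step in white_noise(n, rng):
        total += step
        out.append(total)
    return out


def lorenz_x(n, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0,
             state=(1.0, 1.0, 1.0)):
    """x-component of the Lorenz system, integrated with fourth-order Runge-Kutta."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def advance(s, k, h):
        return tuple(a + h * b for a, b in zip(s, k))

    out = []
    for _ in range(n):
        k1 = deriv(state)
        k2 = deriv(advance(state, k1, dt / 2))
        k3 = deriv(advance(state, k2, dt / 2))
        k4 = deriv(advance(state, k3, dt))
        state = tuple(
            s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        out.append(state[0])
    return out


# Ten lengths from 250 to 2500 data points, as in the text.
lengths = list(range(250, 2501, 250))
rng = random.Random(0)
signals = {n: {"white": white_noise(n, rng),
               "brown": brown_noise(n, rng),
               "lorenz": lorenz_x(n)} for n in lengths}
print(len(signals), len(signals[2500]["lorenz"]))
```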

Statistics
To investigate the effect of the number of included data points on the ENT½ and SPD, a one-way ANOVA for repeated measures with time series length as the independent factor with 10 levels (i.e. 10 different time series lengths) was applied to the results from the first experiment. In case of a significant effect of time series length, the nature of the relationship between the dependent variables and time series length was established by linear, quadratic and power regression analyses. The best fitting regression equation was determined by the percentage of variance explained by the regression (r²) and reported with corresponding 95% confidence intervals, 95% prediction intervals and p-value. To validate the regression equations for ENT½ and SPD extracted from the first experiment, they were used to predict the ENT½ and SPD values using the individual number of strides from the second experiment. The predicted values were then compared to the calculated values and the level of agreement was evaluated by a Bland-Altman plot. Additionally, to investigate the effect of using either ST, SL or SS intervals on the ENT½, SPD and scaling exponent, a repeated measures ANOVA on ranks was applied with a Student-Newman-Keuls post hoc test. The level of significance was set at 5%. All calculations were performed in SigmaPlot (Systat Software, Inc. 2014, version 13.0, Germany).
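The validation step (fit a regression on one data set, predict for another, then assess agreement) can be sketched as follows. The analyses in the study were run in SigmaPlot, so this is only an illustration of the arithmetic: the function names are hypothetical, all numbers are made up for demonstration, only the linear model is shown, and the limits of agreement use the conventional mean difference ± 1.96 SD.

```python
import statistics


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Proportion of variance in ys explained by the predictions."""
    mean_y = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def bland_altman(calculated, predicted):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [c - p for c, p in zip(calculated, predicted)]
    bias = statistics.fmean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width


# HYPOTHETICAL data: stride counts and an outcome from a "first experiment".
n_strides = [250, 500, 750, 1000, 1250, 1500]
outcome = [12.0, 18.0, 21.0, 30.0, 33.0, 41.0]
slope, intercept = linear_fit(n_strides, outcome)
fitted = [slope * n + intercept for n in n_strides]
print(round(r_squared(outcome, fitted), 3))

# HYPOTHETICAL "second experiment": individual stride counts and calculated values.
second_n = [760, 790, 805, 770]
second_calculated = [20.0, 27.0, 24.0, 22.0]
second_predicted = [slope * n + intercept for n in second_n]
bias, lower, upper = bland_altman(second_calculated, second_predicted)
within = sum(lower <= c - p <= upper
             for c, p in zip(second_calculated, second_predicted))
print(within, "of", len(second_n), "points inside the limits of agreement")
```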

Results

Time series length
There was a significant effect of the number of strides on the ENT½ from the first experiment (F = 3.835, p < 0.001). The post hoc test revealed that the ENT½ at 250 strides was significantly lower than the ENT½ at 2500, 2250, 2000, 1750 and 1250 strides, but no other differences in ENT½ were observed (Fig. 1). The relationship between the number of strides and ENT½ could be described with significant linear, quadratic and power law relationships (Table 1). The percentage of variance explained by the three regression analyses was low (<5%) but highest for the quadratic regression (3.8%).
There was a significant effect of the number of strides on the SPD from the first experiment (F = 36.490, p < 0.001). The SPD increased with an increasing number of strides, and the post hoc test revealed a general pattern of a significant increase in SPD when the analyzed number of strides was increased by 750 strides (Fig. 2). The relationship between the number of strides and SPD could be described with significant linear, quadratic and power law relationships (Table 1). The percentage of variance explained by the three regression analyses was approximately 57%.

Predictions of ENT½ and SPD
The Bland-Altman plots showed moderate agreement between the calculated and the predicted values for both ENT½ and SPD, with all data points but one within the mean ± 1.96 SD band (Fig. 3). The mean offset was approximately 0 strides for the ENT½ and 5 strides for the SPD, although the data point distributions were skewed.

Stride-to-stride fluctuation variables
The ENT½, SPD and scaling exponent of the ST, SL and SS in the second experiment are presented in Table 2. There was an effect of variable (ST, SL and SS) on the ENT½ (Chi-square = 11.03, p = 0.003), SPD (Chi-square = 15.24, p < 0.001) and scaling exponent (Chi-square = 20.00, p < 0.001). The post hoc test revealed that ENT½ was significantly higher for the ST variable than for the SL and SS variables (Fig. 4). The SPD and the scaling exponent were significantly higher for the ST variable than for the SL and SS variables. Additionally, the SPD and scaling exponent were significantly higher for the SL variable than for the SS variable.

Parameter consistency
The test for parameter consistency of the entropic half-life calculated from the second experiment showed that while m = 3 appeared more robust than m = 2 for ST, both m = 2 and m = 3 were equally affected by changes in the r parameter for SS (Fig. 5). For SL, m = 2 was more robust at low r values and m = 3 was more robust at high r values.

Theoretical signals
The analyses presented in the supplementary material showed that there was no effect of time series length on the ENT½ when applied to the theoretical signals. In contrast, SPD increased for brown noise, pink noise and the Lorenz attractor signal with increasing time series length while no effect of time series length was observed for the white noise signal.

Discussion
The main purpose of the present study was to investigate the effect of data length on the calculation of ENT½ and SPD in gait data and verify that both methods can validly be used on data sets of different lengths. Furthermore, the present study aimed at applying the two methods to SL and SS time series in addition to the previously presented ST interval time series. Recommendations based on the present study are summarized in Table 3. ENT½ and SPD quantify the time dependency: while ENT½ is related to the predictability of the signal, SPD is related to the statistical persistency or anti-persistency of the signal. Applying SaEn and DFA to each reshaped time series enables assessment of how much time elapses before information from previous data points no longer affects future data points. When applied to random signals with no correlation between data points, both measures return very low values and, when applied to brown noise, pink noise or a Lorenz attractor signal, the ENT½ and SPD are significantly higher, indicating that these signals have greater time dependency (see Supplementary material). When applied to gait, these methods quantify how information from previously completed strides influences future strides. Low values indicate that limited information from the current stride influences the next stride. Thus, each new stride can be considered a completely new task, which would raise the demand for cognitive processing of sensory information. In contrast, higher values of ENT½ and SPD indicate that considerable information from previous strides influences future strides.

Effect of data series length
Similar to other nonlinear tools, both ENT½ and SPD were affected by the number of included data points (in this case the number of strides) (Marmelat and Meidinger, 2019; Yentes et al., 2018; Yentes et al., 2013). The outcome of both methods increased with an increasing number of included strides. While the quadratic relationship between strides and ENT½ explained only a very limited amount of the variance, the linear relationship between strides and SPD explained a considerable amount.
From the first experiment, the ENT½ of the ST increased from 3 strides when using 250 strides to 14 strides when including 2500 strides. When 1000 strides or more were included, the ENT½ reached a plateau and did not change significantly. This suggests that including 1000 strides or more ensures a robust and valid estimation of the ENT½. This observation also validates the conclusion reached in our previous study, where 2500 strides were included and the predictability in ST was halved within 14 strides during treadmill walking, and rules out the possibility that the results were biased by the number of included strides. Furthermore, this observation has methodological implications, as including 1000 strides requires relatively long experimental trials. Considering an average ST of 1 s, this would require more than 16 min of walking (1000 strides × 1 s ≈ 16.7 min). While this should not constitute a problem for healthy individuals, it could be a challenge for individuals with walking impairments. While it could be tempting to solve this limitation by concatenating several shorter trials (e.g. 100 strides) into one long trial, we do not recommend this. Concatenation of shorter walking trials would disrupt the inter-stride correlation and potentially bias the results. A recent study observed questionable reliability when applying SaEn to concatenated time series (Orter et al., 2019). For the SPD, no plateau was reached with an increasing number of included strides. Thus, the more information (i.e. strides) that is included in the formation of statistical persistency, the more 'disruption' is needed to remove it. This apparent data point dependency of the SPD suggests that the robustness of the statistical persistency can be quantified as the slope of the plot of SPD against the number of data points. The steeper the slope, the more disruption is needed to interfere with the statistical persistency.
It is beyond the scope of this study to determine if this statistical persistence robustness can be manipulated by altered task constraints and future studies should explore this. However, based on the results of the present study, comparing SPD values calculated from different numbers of data points would be ill-advised.
The ENT½ and SPD results for the theoretical signals presented in the supplementary material were well in line with the observations on the ST intervals and emphasize the importance of using equal time series lengths when applying SPD.

Entropic half-life and statistical persistence decay in different data sets
The regression equations derived from the first experiment performed moderately in predicting the ENT½ and SPD outcomes from the second experiment, indicating that results from two different experiments should be compared with caution unless the same number of strides is used. One reason for the moderate predictions could be the considerable inter-subject variation in both ENT½ and SPD, but also the limited number of observations. The standard deviation of ENT½ from the first experiment when including 1000-2500 strides ranged from 15 to 22 strides. Equally, the standard deviation of ENT½ from the second experiment, where on average 785 strides were included, was 14.4 strides. The standard deviation of SPD increased from 9 to 20 strides when 1000 and 2500 strides were included, respectively. Acknowledging this considerable inter-subject variation, it is advisable to use multiple trials before and after interventions or between changing test conditions for each subject to achieve a valid estimation of ENT½ and SPD when conducting a repeated measures design study. This is well in line with recommendations made for other commonly used nonlinear tools in gait research.

Fluctuations in stride-to-stride characteristics
The results of the DFA revealed that both ST and SL fluctuations exhibited statistical persistency and SS fluctuations exhibited statistical anti-persistency, confirming the observations of previous studies (Decker et al., 2013; Dingwell et al., 2010). The ENT½ results revealed that significantly more strides were completed before the influence of previous ST intervals on future ST intervals was reduced substantially compared to that of SL intervals.
Furthermore, relatively few strides were required to remove the influence of previous SS intervals on future SS intervals. Similar results were observed for the SPD. Combined with the DFA results, this indicates that the structure of the ST and SL fluctuations, where a long stride is statistically likely to be followed by an even longer stride, also possesses a long time dependency. In contrast, the structure of the SS fluctuations, where a long stride is statistically likely to be followed by a shorter stride, possesses a short time dependency. In agreement with previous studies (Decker et al., 2013; Dingwell et al., 2010), this suggests that control of SS during treadmill walking relies on rapid adjustments in order to maintain a constant position in the middle of the treadmill belt.

Declaration of Competing Interest
The authors declare no conflict of interest.