Exploring time-delay-based numerical differentiation using principal component analysis

Natural systems, including the dynamics of engineered structures, are often considered complex; hence, engineers employ different statistical methods to understand these systems better. Analyzing these systems usually require accurate derivative estimations for better understanding, i.e. a measured displacement can be used to estimate the forces on a cylindrical structure in water by using its velocity, and acceleration estimations. In this study, we use a nonlinear method based on embedding theory and consider the time-delay coordinates of a signal with a fixed lag time. We propose a new method for estimating the derivatives of the signal via redefining the delay matrix. That is, the original signal is updated with the second principal component of the delay matrix in each derivation. We apply this simple method to both linear and nonlinear systems and show that derivatives of both clean and/or noisy signals can be estimated with sufficient accuracy. By optimizing the required embedding dimension for the best derivative approximation, we find a constant value for the embedding dimension, which illustrates the simplicity of the proposed method. Lastly, we compare the method with some common differentiation techniques. show that our proposed method works better in estimating higher-order derivatives than Legendre polynomial approximations proposed by Gibson et al. [9] when the signal has a certain level of noise. Also, our new approach only requires one simple function as opposed to the use of different polynomial functions for different derivatives as in Gibson et al. [9]. Moreover, the new approach does not require tuning of any parameters to get reliable derivative estimations. Further, we show that we can handle the intrinsic inability of PCA to deal with traveling motions by mirroring the signal before applying our method. We test the new method (with and without prior mirroring) by applying it to data from both numerical simulations (including 33 low-dimensional chaotic flow systems as in Sprott [10]) and experimental signals and compare the results. We show that the higher-order derivatives of both linear and nonlinear systems can be estimated with sufficient accuracy, and in some cases, the noise is reduced. For example, we apply the new method to analyze data from an experiment that was carried out in an ice tank. In the experiment, ice floes were marked with Qualisys markers and using motion tracking, the positions of the points were recorded. In addition, accelerometers were placed on some of the floes. Hence, the accuracy of the derivatives could be compared with direct measurements. When compared with the direct measurements (measured acceleration is compared with the estimated acceleration from displacement measurements), the proposed time-delay-based differentiation yields very close results, showing the accuracy of the method. To further increase the confidence in our proposed method, we compare results obtained by our method for signals collected in the ice tank with other 7 various differentiation methods. Comparison suggests that our method gives comparable/similar results with those from the other methods. In addition, the proposed method is applied on 17 electrocardiogram (ECG) experimental signals. Results show that our new method produces equivalent estimates with other common differentiators.


Introduction
Data collection and processing is a standard engineering practice used by industry and academia to understand the dynamics of a system in a given environment. These dynamics are usually complex; hence, engineers employ different statistical methods to understand the data better. For example, if the dynamical system is not known a priori and/or it is not possible to collect enough measurements to represent it, the time-delay coordinates of a single variable allow us to extract additional information (variables). The method of delays to study dynamics of experimental data was first proposed by Packard et al. [1]. Takens [2] improved the method and proved that the dynamics in the time-delay coordinates produce a new attractor with a similar topology. This is very important because it shows that it is possible to reconstruct the state-space from a low-dimensional time series, and when the dimension is high enough, reconstruction is always an embedding. For example, recently, Yu and Li [3] have used delay coordinates where the aircraft accidents for a certain time period have been reconstructed using chaos analysis.
In an earlier work, Broomhead and King [4] combined principal component analysis (PCA) with the method of delays and studied the effect of lag time on the reconstruction to eliminate linearly dependent coordinates and accurately estimate the phase portrait, hence the underlying dynamics. They showed that when the time series has some Gaussian noise embedded in the signal, PCA is the optimal linear coordinate transformation and can reduce noise. Similarly, Shang et al. [5] developed a smoothing algorithm that is based on Chaotic Singular-Value Decomposition, which essentially relies on the use of delay coordinates and PCA. It should be noted that PCA is also known as the Karhunen-Loeve decomposition, proper orthogonal decomposition (POD) or singular-value decomposition (SVD), which has been widely used in all engineering areas to find coherent structures (or dominant modes), e.g., Kerschen et al. [6]; Gedikli et al. [7,8]. Therefore, we use PCA and SVD interchangeably throughout the paper.
Gibson et al. [9] studied PCA and the method of delays in depth and derived a closed-form solution to PCA in the limit of small window widths. They demonstrated that with the exception of the first principal component, the resulting principal components are qualitatively equivalent to average time derivatives. In addition, they identified the relationships between delays, derivatives, and principal component analysis and showed that eigenvectors of the system resemble Legendre polynomials. They derived a discrete set of Legendre polynomials and found stable estimates of derivatives using these polynomials.
In this paper, we investigate the method of delays combined with PCA to find reliable estimates of the derivatives. In this method, we create a Hankel matrix of a single variable (a single time-dependent measurement) and compute its principal components. We relate the second principal component of the data to the first derivative of the signal using an analytical formula. Then, instead of analytically estimating higher-order derivatives using Legendre polynomials, we treat the first time derivative of the signal as new signal. Therefore, we create a new Hankel matrix of the first derivative, compute the second principal component of the new matrix and estimate the second derivative. We also use the same approach to estimate the higher-order derivatives and find a constant value for the required embedding dimension that best approximates derivative by optimization. In addition, we show that our proposed method works better in estimating higher-order derivatives than Legendre polynomial approximations proposed by Gibson et al. [9] when the signal has a certain level of noise. Also, our new approach only requires one simple function as opposed to the use of different polynomial functions for different derivatives as in Gibson et al. [9]. Moreover, the new approach does not require tuning of any parameters to get reliable derivative estimations. Further, we show that we can handle the intrinsic inability of PCA to deal with traveling motions by mirroring the signal before applying our method.
We test the new method (with and without prior mirroring) by applying it to data from both numerical simulations (including 33 low-dimensional chaotic flow systems as in Sprott [10]) and experimental signals and compare the results. We show that the higher-order derivatives of both linear and nonlinear systems can be estimated with sufficient accuracy, and in some cases, the noise is reduced. For example, we apply the new method to analyze data from an experiment that was carried out in an ice tank. In the experiment, ice floes were marked with Qualisys markers and using motion tracking, the positions of the points were recorded. In addition, accelerometers were placed on some of the floes. Hence, the accuracy of the derivatives could be compared with direct measurements. When compared with the direct measurements (measured acceleration is compared with the estimated acceleration from displacement measurements), the proposed time-delay-based differentiation yields very close results, showing the accuracy of the method. To further increase the confidence in our proposed method, we compare results obtained by our method for signals collected in the ice tank with other 7 various differentiation methods. Comparison suggests that our method gives comparable/similar results with those from the other methods. In addition, the proposed method is applied on 17 electrocardiogram (ECG) experimental signals. Results show that our new method produces equivalent estimates with other common differentiators.
In Section 2, the new method of the time-delay-based differentiation or, in short, the TDD is described. In Section 3, the applications of the TDD method to different numerical and experimental systems are presented in more detail. These systems include the following: Case 1 is a quasi-periodic signal; Case 2 is a nonlinear mass-spring-dashpot system as in Chatterjee [11]; Case 3 is a nonlinear Chi-Chi earthquake system; Case 4 is a hysteretic Bouc-Wen system; and Cases 5-6 are experimental systems that consist of wave-induced ice floe collisions. In Section 4, the optimization results to find the embedding dimension and required shifting are presented. In Sections 5-7, we investigate the performance of the proposed TDD method and its variant on clean and noisy simulated signals and experimental signals, and illustrate the results of higher-order derivative estimations. In the end, summary and conclusions are provided in Section 8.

Time-delay-based differentiation (TDD)
TDD method is inspired by the idea of state-space reconstruction using a time-delayed signal. Previously, Gibson et al. [9] proposed the use of Legendre polynomials to estimate derivatives of the signals where one need to calculate different coefficients with increasing complexities for increasing order of derivatives. Although such a method gives good estimates of derivatives, it becomes highly time consuming as the required order of derivatives increase. In this study, we pose the following question: if we can create the state-space representation of a given signal using the combined SVD and method of time delays, can we also estimate the higher-order derivatives using this dynamic measurement considering the directional behavior of the signal? In other words, when the SVD of a Hankel matrix is considered, does the second principal component of the signal help us to estimate higher-order derivatives (i.e., second, third and so on)? If so, how can we relate the second principal component to higher-order derivatives? Consider a signal with a time history of x ∈ R n×1 . We stack x to form a Hankel matrix H ∈ R (n−r)×r , which yields where the subscript represents the time instant when the entry of x is sampled, n is the length of the signal, and r is the embedding dimension and specifies the number of columns in H that are populated by the delayed x. H can be decomposed into three matrices by utilizing SVD: where columns of U ∈ R (n−r)×r are the left orthonormal singular vectors of H , columns of V ∈ R r×r are the right orthonormal singular vectors of H , and S ∈ R r×r is the singular value matrix with its diagonal elements in descending order. In this decomposition, U includes temporal modes, and V contains spatial modes. In addition, only diagonal elements of S, i.e., s ii , contain units since column vectors of U and V are normalized.
In the following, we use u i to denote the i th column vector of U and v ii to denote the i th diagonal elements of V counted from the left upper corner. By using components of U , V and S, the i th mode approximation of x can be computed as follows: where superscript d means that the quantity is estimated by means of the method based on time delay. When i = 1, x d 1 is the reconstruction of x. Assuming the first time derivative of x can be best estimated with the second mode approximation, as in Eq. (3), we show that the first time derivativeẋ d and the second time derivativeẍ d can be estimated as followṡ where F s is the sampling frequency, and x d 2 and x d 3 represent second and third modes, respectively. Although higher-order estimates of the derivatives can be approximated by higher subspace dimensions of SVD (x (k),d is the k th time derivative), error in the estimation increases significantly as the subspace dimension increases because most of the energy in the signal is embedded in the lower subspace dimensions. We call this estimation SVD-based differentiation to illustrate how it differs from the TDD method.
In the TDD method we ask: if SVD-based differentiation using the method of time delays, as shown above, produces reliable phase portraits (e.g., Gibson et al. [9], Broomhead and King [4]), can we repeat this process to estimate higherorder derivatives? For example, can we treat the first derivative of a signal as our new signal and estimate the second derivative; then, after estimating the second derivative, treat the second derivative as the new signal and estimate the third derivative and so on. If so, how accurate would the estimated derivatives be?  In order to answer these questions, we propose the TDD method, which updates the original signal in the Hankel matrix with the estimated derivative and finds one higher-order derivative with the same approach (see Fig. 1 for the illustration of the TDD method). In other words, we only need Eq. (6) to estimate the second time derivatives in TDD: x ud 2 =x 2 F s (6) wherex 2 represents the second mode approximation using the updated Hankel matrix. It should be remembered that SVD-based differentiation and the TDD method yield the same results for the first derivatives since the first column of the Hankel matrix is the original signal in both methods. However, we update the Hankel matrix for each derivative in the proposed TDD method, whereas we directly use higher subspace dimensions to estimate the higher-order derivatives in SVD-based differentiation (or higher-order derivatives can also be estimated using SVD-based differentiation if they are combined with Legendre polynomials as in Gibson et al. [9]).
Takens [2] proved that if the dimension of the underlying dynamical system D F is known, r > 2D F generally guarantees an embedding. In other words, when the dimension is sufficiently high, a reconstruction is almost always an embedding from the perspective of state-space reconstruction. There are many studies devoted to finding the minimum embedding dimension of a dynamical system for a reliable state-space reconstruction such as Kennel et al. [12], Cao [13], Kennel and Abarbanel [14], Maus and Sprott [15], Chelidze [16].
In our study, we use required embedding dimension that best approximates the derivatives, which is different from the minimum embedding dimension. Therefore, we create an optimization scheme to find the required embedding dimension that approximates the derivatives best (see Section 4).
Since our time-delay-based derivatives are shorter than the original signal, shifting of the signal is necessary to carry out the optimization. We show in Section 4 that the required embedding dimension (r) and the required shifting (ξ k −1) for the TDD method are 5 and 2k, respectively, where k is the order of the time derivative. For the SVD-based differentiation, we show thatr = 5 and the required shifting (ξ k=1 − 1) = (ξ k=2 − 1) = 2. Figs. 2 and 3 illustrater and (ξ k − 1) for the SVD-based differentiation and TDD method, respectively. As can be seen, the required shifting for the first two derivatives in the TDD method are (ξ k=1 − 1) = 2, (ξ k=2 − 1) = 4, respectively.

Applications of TDD in numerical and experimental systems
To test the performance of the TDD method, we use various numerical and experimental systems. First, we test the TDD method by applying it to a quasi-periodic signal (Case 1), and then, we apply it to various dynamical systems (Cases 2-6). Dynamical systems include numerical systems, such as the nonlinear mass-spring-dashpot system in Chatterjee [11] (Case 2), the nonlinear Chi-Chi earthquake system (Case 3) and the hysteretic Bouc-Wen system (Case 4), and experimental systems, such as the wave-induced ice floe collisions (Cases 5 and 6). Note that the six cases presented here are singlevariable cases, i.e., each case describes a single variable. We also applied the TDD method to 33 different chaotic systems in Sprott [10] to test the performance of the method (not shown).
To illustrate the quality of the time derivatives, we compute the normalized root mean square error (NRMSE) as follows: In Eq. (7), a represents the analytical, numerical or measured time derivatives of the signal, and b represents the estimated derivatives using the TDD method. a max and a min are the maximum and minimum values of a, respectively. N k is the length of the k th time derivative (see Figs. 2 and 3).

Case 1 -Quasi-periodic time series
We consider a quasi-periodic signal that is obtained by superimposing seven sinusoids with incommensurate frequencies. Eqs. (8), (9) and (10) show the analytical description of the signal with its corresponding first and second time derivatives, respectively. In this system, the sampling frequency is chosen as 50 Hz, and the signal is generated between 0 and 600 s. Fig. 4 shows the original signal with its estimated derivatives and the corresponding phase plane plots of second time derivative versus first time derivative using both SVD-based differentiation and the TDD methods. As seen, SVD-based differentiation fails to estimate the second time derivative of the signal (see Fig. 4c), obtaining an NRMSE value of 14.9%; on the other hand, TDD successfully estimates the second time derivative, achieving an NRMSE value of 0.0446% (see Fig. 4d). One can also compare the derivatives by observing their phase plane plots. In this case, Fig. 4e also shows that SVD-based derivative estimation fails to reconstruct the phase plane, whereas TDD-based derivative estimation (Fig. 4f) successfully reconstructs the desired phase plane.

Case 2 -A simple one-dimensional nonlinear mass-spring-dashpot system
We utilize the nonlinear mass-spring-dashpot system presented in Chatterjee [11] to examine the applicability of SVD-based differentiation and TDD for a nonlinear system. Fig. 5 shows the sketch of this nonlinear system. Each mass from m 1 to m 10 is subjected to a force from a linear spring with stiffness k, a force from a linear dashpot with a constant coefficient of c and an external forcing function F (t). Additionally, a nonlinear force from a hard spring k h is exerted on mass m 5 . This mass-spring-dashpot system moves on frictionless surface.
The equations of motion for this system can be written as follows:  Source: Adapted from [11].
where H(x) is the Heaviside function, the subscript represents different masses, i stands for mass 2, 3, . . . , 9 excluding 5, and F (t) = A sin ωt is the external force. Here, m 1 = m 2 = · · · = m 10 = 1, k s = 1, c = 0.3, A = 0.08, ω = 0.2 and k h = 5, which are the same nondimensional values as in Chatterjee [11]. The sampling rate is chosen as 100 Hz, and the simulation time is 0 ∼ 600 s. Fig. 6 shows the original displacement time history of the simple spring-mass-dashpot system with its estimated derivatives and the corresponding 3-dimensional phase plane plots using both SVD-based differentiation and TDD methods. Similar to the previous case, SVD-based differentiation once again fails to estimate the second time derivative of the signal (see Fig. 6c), obtaining an NRMSE value of 3.43%; on the other hand, TDD successfully estimates the second time derivative, with the NRMSE value of 0.0949% (see Fig. 6d). One can also compare the derivatives by observing their phase plane plots. In this example, Fig. 6e also shows that SVD-based derivative estimation does not reconstruct the phase plane accurately, whereas TDD-based derivative estimation (Fig. 6f) successfully reconstructs the desired phase plane with a high accuracy.

Case 3 -Nonlinear Chi-Chi earthquake system
In this case, we consider a single degree of freedom linear system with a nonlinear force, as presented in Smyth and Wu [17]. The system represents the Chi-Chi earthquake and can be expressed as follows:  where m = 1, c = 0.3, and k s = 9 andẍ g represents the ground acceleration. The system is excited by ground accelerations measured at 200 Hz at Taichung, Taiwan, during the Chi-Chi earthquake [17]. The data are provided by the Seismology Center, Central Weather Bureau in Taipei, Taiwan, through the strong motion virtual data center [18].
As in Smyth and Wu [17], velocity and displacement are obtained by using the Newmark method. Thereafter, Eq. (15) is solved to find acceleration. Simulation is performed for 0 ∼ 160 s. Fig. 7 shows the original displacement time history of the nonlinear Chi-Chi earthquake system with its estimated derivatives and the corresponding 3-dimensional phase plane plots using both SVD-based differentiation and TDD methods. As clearly seen, SVD-based differentiation fails to estimate the second time derivative of the signal (see Fig. 7c&e), obtaining the NRMSE value of 3.45% and underestimating the nonlinearity in the signal; on the other hand, TDD performs better at estimating the second time derivative, with the NRMSE value of 1.24% (see Fig. 7d&f), where the inherent nonlinearity in the signal is correctly estimated.
The displacement, velocity and acceleration are found first by utilizing the MATLAB built-in function ode45 to solve the dynamic system equations, Eqs. (16) ∼ (17). Thereafter, the SVD-based differentiation and TDD methods are applied, and the results are compared as shown in Fig. 8. It is evident that both the SVD-based differentiation and TDD methods estimate the first time derivatives quite well (see Fig. 8b). The acceleration estimations using SVD-based differentiation and TDD are very close to the simulated results, as shown in Fig. 8 (c, d, e and f). The quality of the TDD estimation of acceleration is similar to that of SVD-based differentiation when comparing the NRMSE values, with an NRMSE in SVD-based differentiation of 0.963% and an NRMSE in TDD of 1.23%.

Experimental systems
In this section, the quality of the TDD method in estimating derivatives is investigated by examining two different dynamical systems from experiments. The experimental data utilized here were collected during two experimental test runs (i.e., test runs #3210 and #3220, respectively) of the LS-WICE project (see Tsarau [19], Cheng et al. [20], Herman et al. [21], Tsarau et al. [22] and Li and Lubbad [23] for details). Fig. 9 shows the sketch of the experimental setup, where the wavemaker is capable of generating waves with different wave frequencies, wave amplitudes and wavelengths (hence various wave steepnesses). In the experiments, 28 (longitudinally) × 6 (transversely) rectangular ice floes with dimension 1.5 m by 1.63 m were produced in total by cutting a 36 mm thick intact ice sheet. In the tests, ice floes were instrumented with Qualisys markers, and an optical Qualisys system monitored 3-dimensional spatial positions of 5 ice floes with a frame rate 100 frames/s. To study the dynamics of wave-induced ice floe collisions, two inertia measurement units (IMUs) were deployed on two ice floes that provided direct acceleration measurements. Regular waves were generated by the wavemaker and propagated into the ice field. Two ultrasound sensors were mounted before the leading edge of the ice field to record the produced wave profiles. In the tests, ice floes moved predominately in the vertical plane under unidirectional monochromatic waves. Fig. 9. Simplified sketch of the experimental setup, where S1 and S2 are ultrasound sonars to measure wave elevations. The flow is generated by the wavemaker and flows to the right. Two markers of interest are highlighted by magenta and light blue color. The ice floes where these two markers were placed on are assigned with letters C and D, respectively. In this subsection, the position of the selected marker on ice floe C is analyzed for test runs #3210 and #3220 (wave steepnesses are 1.27 × 10 −2 and 1.56 × 10 −2 , respectively), where the selected marker on ice floe C is highlighted by magenta color in Fig. 9. The reason we selected two different runs is because these two different runs caused different dynamical motions of the ice floes such that different nonlinearities were observed (see Figs. 10c and 11c).
As shown in Fig. 9, the velocity and acceleration of the ice floe under consideration were not measured directly; hence, a differentiator filter [24, referred as DF henceforth] and the TDD method were both applied to estimate them. It should be remembered that IMUs and markers were placed on neighboring ice floes, and videos of the experiment showed that the two neighboring ice floes moved together; therefore, it is expected that similar accelerations will be obtained when the motion of the two ice floes is compared (see more details in Li et al. [25]).

Case 5 -Wave-induced ice floe collisions (wave steepness is
In this case, we apply the TDD method to wave-induced ice floe collisions, where the wave steepness is 1.27 × 10 −2 . Fig. 10 shows the measured (blue) and estimated accelerations using the TDD method (red) and the DF (black), respectively, where we exclude the results from SVD-based differentiation method for clarity. As a result, the resulting acceleration estimations using both the TDD and DF successfully generate the acceleration signals as expected. However, it should be pointed out that the amplitudes of the resulting estimations are not the same as the direct measurements. This is attributed to the interaction between the ice floes (see Fig. 9) and the weight of the IMUs since each IMU is very heavy compared to the weight of the ice floe (1.5% of the ice floe weight); therefore, a large inertia is produced.

Case 6 -Wave-induced ice floe collisions (wave steepness is
In this case, we apply the TDD method to the wave-induced floe collisions when the wave steepness is 1.56 × 10 −2 . Similar to the previous case, Fig. 11 shows that the computed accelerations using both the TDD method and the DF yield similar results. However, one can also observe strong nonlinear interactions in the phase portrait in Fig. 11c, which differs from the system in the previous case in Fig. 10c. For application of TDD method on more experimental signals, reader is referred to Section 7. In (c), the magenta color indicates the maximum positive acceleration, whereas the red color represents the maximum negative acceleration.

On the required embedding dimension and required shifting
Both SVD-based differentiation and the TDD methods rely on the critical parameter, viz., embedding dimension. The embedding dimension used in this work differs from the notion of minimum embedding dimensions (e.g., Kennel et al. [12], Cao [13], Kennel and Abarbanel [14], Maus and Sprott [15], Chelidze [16]), where we optimize the embedding dimensions to find the best derivative approximations when compared with the analytical/simulated results. The optimization problem can be solved by minimizing the following function:  , respectively; superscripts s and t denote simulated and theoretical, respectively;r is the required embedding dimension that best approximates the time derivatives; τ k is the truncation for the k th time derivatives relative to the signal because of the time delay; n is the length of the signal; and r is the embedding dimension tested, ranging from 3 to 20. Note that the SVD-based differentiation method requires the same embedding dimension and shifting for all orders of time derivatives. Thus, we only perform optimization for k = 1. According to Eqs. (18) ∼ (21), when ε k (r, ξ k ) reaches the minimum for a signal with different embedding dimensions r and different shiftings (ξ k − 1), it results in the best derivative approximation.
Optimization is applied to 33 low-dimensional chaotic flow systems listed in Sprott [10]. These systems have multiple variables in the ranges of 2 ∼ 4, i.e., N = 2 ∼ 4, for each system. The MATLAB function ode45 is applied to integrate the differential equations describing the low-dimensional chaotic systems. For the simulation, sampling frequencies have been arbitrarily chosen, ranging from 20 ∼ 100 Hz.
In addition to the low-dimensional chaotic systems mentioned above, optimization is further employed on a highdimensional chaotic flow system that is described by the Mackey-Glass differential equation in Mackey and Glass [26] as follows: where θ is the time delay and set to 100 herein, as in Cao [13]. The solutions of Eq. (22) are found by the MATLAB function dde23 and sampled at 100 Hz. Fig. 12 illustrates the optimization results of the required embedding dimension for the selected systems. In the analysis, signals including Case 1, Airy function [27], tanh(t) and sech(t) have been simulated up to the fourth order for Each color represents one system or a signal. (a) represents the error of first-order time derivative estimates for 33 low-dimensional chaotic flow systems and Mackey-Glass delay differential equations. (b) represents the error of second-order time derivatives using the Airy function, the jerk chaotic system I, Case 1, tanh(t) and sech(t). (c) represents the error of third-order time derivative estimates using the same signals as in (b). (d) represents the error of fourth-order time derivative estimates using the same signals as in (b) but excluding the jerk chaotic system I due to lack of a simulated fourth-order time derivative.
simplicity. The 33 low-dimensional chaotic flow systems and the Mackey-Glass differential equation have been omitted in the higher-order derivatives because higher-order derivative simulations were not possible for these systems. Because of the same reason, the jerk chaotic system I [28] is only simulated to the third order.
As a result of the optimization, we show that required embedding dimensionr = 5 always gives the best derivative approximation regardless of the systems/signals chosen. Actually, a smaller embedding dimensions result in underestimated derivatives, whereas higher embedding dimensions lead to overestimated values.
Figs. 13 and 14 show the example optimizations, illustrating the required shifting that minimizes the error in TDD. It is seen that the required shifting data points are 2, 4, 6 and 8 for the first four orders of time derivatives. Theoretically, the SVD-based differentiation and TDD methods yield the averaged derivative for each row of the Hankel matrix. When r = 5, the required shifting for SVD-based differentiation should therefore be 2. Analogously, the same shifting is needed for Gibson-II as well. Regarding Gibson-I, the embedding dimension is r = 3; therefore, the shifting is 1 for this method.
As a result, we show that the shifting for the TDD method is 2k, where k represents the k th time derivative, which is in agreement with the optimization results, as shown in Fig. 13. In addition, we employ genetic algorithm to check the required embedding dimension by varying the shifting between 0 and 19, and embedding dimension from 3 to 20 to minimize Eq. (18). Results also confirm that the required embedding dimensionr is 5 and shifting is 2k.

Clean signal
In this section, we systematically test the proposed TDD method for selected cases and compare with some common numerical differentiation techniques, such as the fast Fourier transform (fft) and fourth-order central difference (C4D), and with Gibson-I and Gibson-II. Methods that are described as Gibson refer to Gibson et al. [9], where they use the method of delays combined with Legendre polynomials to estimate the derivatives. Gibson-I and Gibson-II differ from each other based on the two different constant polynomial coefficients p, where Gibson-I refers to p = 1 and Gibson-II refers to p = 2.
Since Gibson-I and Gibson-II are based on the method of delays, the resulting derivatives should also be shifted. As in the TDD method, we follow the same optimization approach and find the corresponding shifts of 1 and 2 for Gibson-I and Gibson-II, respectively. Note that Gibson-I is limited to the second time derivative estimation because the Legendre polynomial exists up to the second order for p = 1. Similarly, Gibson-II is applicable for time derivative estimations up to the order of 4. Table 1 illustrates the NRMSE value of the first two derivative approximations for Cases 1 to 4 for clean (without noise) signals. Because the SVD-based differentiation and TDD methods are the same when the first derivative is considered, they are indicated in the same row next to each other in Table 1. As seen, the TDD method generally yields comparable results to the other methods for both the first and second derivative estimations, and the SVD-based second derivative estimation    is usually not as good as the other methods, as expected. In contrast to all of the other cases, the SVD-based approach produced comparable second-order time derivative estimations only in Case 4 in Table 1. Fig. 15 shows the comparisons of the first two derivatives obtained by simulation, TDD, Gibson-II, C4D and fft. As can be seen, no obvious discrepancy exists between the simulated velocity (ẋ) and the estimates from TDD, C4D and Gibson-II combined. However, because the boundary of the signal (not shown here) is not smooth and periodic, a boundary effect on the velocity estimates using fft can be clearly seen in Fig. 15g. As a consequence, estimated derivatives have to be truncated at the beginning and end section to eliminate boundary effects as described in Kutz [29]. The results show that all tested methods provide estimates of accelerations that generally agree well with analytical and simulated accelerations.
From an engineering perspective, jerk (derivative of the acceleration) and jounce (derivative of the jerk) are very important parameters (Eager et al. [30], Jazar [31], Smith and Christensen [32]); hence, they should be estimated as accurately as possible. One can also estimate these parameters using TDD following the steps in Fig. 1 for five different systems, including Case 1 (quasi-periodic signal), Airy function as in Oliphant [27], jerk chaotic system I as in Sprott [28], jerk chaotic system II as in Linz and Sprott [33] and the time-varying hyperbolic tangent function. Table 2 shows the NRMSE value of the third-and fourth-order derivative approximations of the selected cases for clean (without noise) signals. In the analysis, the sampling frequency of the Airy function, the jerk chaotic system I, the jerk chaotic system II and tanh(t) are chosen as 100 Hz, 20 Hz, 50 Hz and 40 Hz, respectively.
The results of higher-order derivative estimations show one common behavior that TDD successfully computes the third-, fourth-and even higher-order derivatives (not shown). In addition, when some level of noise is added to those signals, as in Table 3, the TDD method performs better at estimating higher-order derivatives.

Effect of noise
It is known that real time series are inevitably contaminated by noise; therefore, we further investigate the TDD method in the presence of low-level Gaussian noise for selected cases, where the selected cases include the first four cases (Cases 1 ∼ 4), similar to Table 1.
Subsets of principal components have maximum variance; therefore, they are also expected to have a maximum signalto-noise ratio, meaning that the power of the signal will be much larger than the embedded noise power. It should also be noted that the TDD method uses a fixed embedding dimension that is found by optimization; therefore, in this sense, PCA becomes the optimal linear coordinate transformation as described in Gibson et al. [9]. Table 3 illustrates the NRMSE value of the first two derivative approximations for Cases 1 to 4 in the presence of noise. As seen, all the methods tested give comparable results when the first-order derivative is considered except fftbased differentiation. For the first two derivatives, TVRD outperforms the other methods in Case 1, whereas Tikhonov and DF give the lowest NRMSEs in Case 4. In Cases 2 and 3, Tikhonov is in general superior to other methods. TDD produces similar results with Gibson-II for the first derivative while better results than the latter method for the second derivative. In comparison with DF, TDD yields similar NRMSEs for Case 1 and Case 3. As a result, TDD generally gives comparable estimates of first and second-order derivatives. Another important point here is that, the parameters fed to TVRD (i.e., regularization parameter) and DF (i.e., cut-off frequency and order number of DF) need to be tuned to obtain comparable results in terms of NRMSE. As for Tikhonov, regularization matrix, such as identity matrix, first order derivative operator matrix and second order derivative operator matrix [37][38][39][40], is data dependent and should be chosen prior to analysis. Therefore, one need to intuitively know what to expect in the end, which is not always easy. On top of these limitations, the performance of the TVRD and Tikhonov (solved by using LSQR algorithm [41], because of possible largescale problem encountered) are slow when compared with other methods because of the number of iterations involved to obtain converged results.
Regularization parameter for Tikhonov can be determined by heuristic L-curve method [39,42], generalized crossvalidation [43] or other methods mentioned in Hansen [42]. However, none of the existing methods to determine regularization parameter perform well for all signals [42]. The significant effect of the noise on the selected signals can also be seen from the corresponding time history plots. Fig. 16 illustrates the numerical and estimated first-and second-order time derivatives of Case 2 using TVRD, TDD, Gibson-II, C4D and fft. As seen, TDD produces smoother acceleration signals than the other tested methods except for TVRD that is not cost efficient; therefore, it also produces clearer phase portraits as described in the earlier sections.

On the performance of TDD on chaotic systems with multistability
Recently, two studies were performed to explore the dynamic behavior of chaotic systems with multistability [44,45]. One newly devised system described in Natiq et al. [44] is shown below (referred to as Natiq-I hereafter): where y 1 , y 2 , y 3 are state variables; ξ , σ , β, α and µ are system parameters; the control term f 1 (y 1 ) = cos (y 1 ).
The second system is a plasma perturbation mathematical model (henceforth denoted as Natiq-II) and is expressed as [45]: where y 1 , y 2 , y 3 are state variables; σ , β and µ are independent system parameters; a is a coefficient; the control terms f 1 (y 1 ) = exp ( −y 2 1 ) and f 2 (y 1 ) = cos (y 1 ). To simulate the Natiq-I and Natiq-II system, ξ , σ , β, µ, α, a, control terms f i and initial conditions are selected as those presented in Natiq et al. [44,45], see also Appendix. Here, sampling frequency 100 Hz is used for simulation. Figs. 17-18 exhibit the mean NRMSE (i.e. mean of NRMSE forẏ 1 ,ẏ 2 , andẏ 3 ) of first time derivative estimates by various methods for clean signals, which are produced by simulation, relative to simulated derivatives. Generally, C4D and Gibson-I yield least discrepancy. Gibson-II and TDD give similar results except for Natiq-II cases #21, 22 and 24. However, the deviation of results by using TDD from simulated derivatives is limited within 0.065%. In most of Natiq-I and Natiq-II cases, fft produces worst approximation of derivatives, though edge effect is minimized by removing estimations of derivatives in the vicinity of boundary.
By checking the signals (y 1 , y 2 and y 3 ) for Natiq-II cases #21, 22 and 24, we find that signals jump abruptly near initial condition and are strongly damped. In order to improve the performance of TDD for these cases, we propose to  mirror the signals, i.e. reverse the original signals and append the reversed signals to the original signals. This way, we not only increase the length of the signal, but also assume that the mirrored signal is periodic. And, after TDD is employed, one can eliminate the mirrored part and obtain the differentiated signal. This signal mirroring technique is similar to zero-phase-shift method for causal filters [46]. The combination of TDD with the signal mirroring technique is referred as mirrored TDD, or simply TDD m hereafter.
Again, as seen in Figs. 17-18, TDD m performs either equivalently with or better than TDD and Gibson-II in estimating derivatives.
We also apply TDD m to Cases 1 ∼ 4. Applying TDD m on Cases 1 ∼ 4 contaminated by noise (see Table 3), we find that TDD m yields same or slightly better results when compared with TDD, e.g., NRMSE ofẋ decreases from 0.121% to 0.0857% for Case 4. When applied to clean signals (see Table 1), TDD m results in that NRMSEs ofẋ andẍ drop from 0.0358% to 0.00796% and 0.0446% to 0.0194%, respectively for Case 1. NRMSE ofẋ for clean signal of Case 4 after using TDD m reduces from 0.100% to 0.0520%. For other cases (Case 2 and Case 3), TDD m gives the identical results as TDD.
TDD m reduces NRMSE ofẋ of Case 4 (both clean and noisy signals) due to that x has quasi-linear drift component (see Fig. 8a) and TDD m manages to handle traveling motions better than TDD (see the reason in Section 7).
In terms of third and fourth order time derivatives (see Table 2), TDD m improves the estimations for Airy function markedly. More specifically, NRMSE of ...
x declines from 0.179% to 0.0560%. Similar to Natiq-II cases #21, 22 and 24, time histories of second and third derivatives of Airy function are decaying in time. This is why TDD m gives rise to better estimates of higher order derivatives than TDD for Airy function.

On the performance of TDD and its variant on real experimental signals
Here, we apply TDD and TDD m together with other methods on experimental signals obtained from the same experimental campaign described in Section 3.2.
As Tikhonov in general produces high-quality estimates of derivatives for noisy signals (see Table 3), we compare results from other differentiation methods with those given by using Tikhonov.
Deviations of the results obtained by using various differentiation methods from those by Tikhonov are summarized in Fig. 19. For all experimental signals, Kalman gives results that deviate most significantly from those by means of Tikhonov (NRMSE larger than 1%). The main reason is that the motion of floes C and D differs slightly from ice floes that were instrumented with IMUs (see details in Li et al. [25]). However, Kalman optimally combines displacement measurements with acceleration measurements. Hence, the results given by Kalman are quality-assurance for estimates produced by other methods [25].
TDD yields evidently different estimates of velocities for test runs #3210 and #3220 in comparison with Tikhonov. This is attributed to quasi-linear drift motion of ice floes C and D in these two test runs (see Figs. 10c and 11c). TDD is based on SVD, which has inherent caveat for traveling motion signals [48]. In contrast, TDD m improves the estimates appreciably for test runs #3210 and #3220.
For all motion signals of ice floes studied here, TDD m gives similar estimates of velocities with the methods C4D, Gibson-I, Gibson-II, Tikhonov, TVRD and DF (NRMSE within 0.65%).
Another set of experimental signals analyzed here are ECG signals [49][50][51]. These ECG signals include 17 states measured on 45 patients, and are sampled at 360 Hz with a duration of 10 s [49]. We randomly choose 17 signals and each of them corresponds with a different state. The results are displayed in Fig. 20. It is seen that results obtained by means of TDD and TDD m are equivalent and deviate least from those produced by Tikhonov in general (NRMSE less than 1.5%). In contrast, C4D results in estimates that have largest discrepancy from the results given by Tikhonov. The other methods (Gibson-I, Gibson-II, TVRD and DF) yield derivatives with modest difference from Tikhonov.

Summary and conclusion
In this study, we introduced a simple analytical function based on the method of delays with PCA to estimate time derivatives of a given signal. In the function, we showed that the sampling frequency term scales the derivatives in PCA correctly and estimates the higher-order derivatives accurately when compared with some common numerical differentiators.
It is known that time derivative estimations change with the embedding dimension in the delay coordinates; however, when TDD is considered, we showed that the required embedding dimension that best describes the analytical derivatives is a constant value, which is equal to 5. This is particularly significant, given the simplicity of the method. In addition, unlike Gibson et al. [9], TDD does not require creating complex polynomials to estimate higher-order derivatives because we update the Hankel matrix in each derivation. Using numerical (including 38 different chaotic systems) and experimental cases, we have demonstrated the applicability of TDD to different types of signals in both the absence and the presence of some level of Gaussian noises. At the end, we have shown that TDD and its variant TDD m produce comparable results in the absence of noise and computes the derivatives better in the presence of low-level Gaussian noise.
We conclude that there is not a single method that works best in all applications in terms of accurate and fast numerical differentiation. Depending on the application, noise level and many other parameters, one method might outperform the other. Therefore, this process is really application dependent. Some methods such as total variation regularized differentiation (TVRD), Tikhonov regularization based differentiation (Tikhonov), differentiator filter (DF) either need to be tuned or a regularization matrix should be chosen to obtain comparable or better results; which means, one need to intuitively know what to expect as a result priori and this is not always easy. Also, the performances of the TVRD and Tikhonov are slow when compared with other methods that are mentioned in this paper. However, one consistent result is that the new TDD (and TDD m ) method always gives comparable results to estimate the derivatives of a signal when compared with other well established methods such as central-fourth order difference, TVRD, and Gibson's Legendre polynomials and therefore it can be used as a new numerical differentiaton method.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix. Parameter values and initial conditions for chaotic systems with multistability
Parameter values and initial conditions for Natiq-I and Natiq-II system are listed in Table A.1 and Table A.2, respectively.