Reconstructing complex system dynamics from time series: a method comparison

Modeling complex systems with large numbers of degrees of freedom has become a grand challenge over the past decades. In many situations, only a few variables are actually observed in terms of measured time series, while the majority of variables—which potentially interact with the observed ones—remain hidden. A typical approach is then to focus on the comparably few observed, macroscopic variables, assuming that they determine the key dynamics of the system, while the remaining ones are represented by noise. This naturally leads to an approximate, inverse modeling of such systems in terms of stochastic differential equations (SDEs), with great potential for applications from biology to finance and Earth system dynamics. A well-known approach to retrieve such SDEs from small sets of observed time series is to reconstruct the drift and diffusion terms of a Langevin equation from the data-derived Kramers–Moyal (KM) coefficients. For systems where interactions between the observed and the unobserved variables are crucial, the Mori–Zwanzig formalism (MZ) allows to derive generalized Langevin equations that contain non-Markovian terms representing these interactions. In a similar spirit, the empirical model reduction (EMR) approach has more recently been introduced. In this work we attempt to reconstruct the dynamical equations of motion of both synthetical and real-world processes, by comparing these three approaches in terms of their capability to reconstruct the dynamics and statistics of the underlying systems. Through rigorous investigation of several synthetical and real-world systems, we confirm that the performance of the three methods strongly depends on the intrinsic dynamics of the system at hand. For instance, statistical properties of systems exhibiting weak history-dependence but strong state-dependence of the noise forcing, can be approximated better by the KM method than by the MZ and EMR approaches. In such situations, the KM method is of a considerable advantage since it can directly approximate the state-dependent noise. However, limitations of the KM approximation arise in cases where non-Markovian effects are crucial in the dynamics of the system. In these situations, our numerical results indicate that methods that take into account interactions between observed and unobserved variables in terms of non-Markovian closure terms (i.e., the MZ and EMR approaches), perform comparatively better.


Introduction
A complex system is a system composed of highly interconnected components in which the collective property of an underlying system cannot be described by dynamical behavior of the individual parts alone.
That is to say, the emergent behavior of a system is not evident in the components by themselves, but only in the system as a whole. Examples of this include lasers, fluids, financial markets, biological and social systems, but also Earth system and in particular climate dynamics variables [1][2][3][4]. Typically, complex systems are governed by nonlinear interactions and intricate fluctuations, and to retrieve the dynamics of a system from time series of only a few observed variables, it is required to characterize and assess interactions between deterministic tendencies and random fluctuations.
For complex systems, with large numbers of degrees of freedom interacting on various time scales, deriving explicit time-evolution equations remains both conceptually and computationally challenging. Many approaches from time series analysis and stochastic modeling have been proposed to model the behavior of complex systems based on observed time series by separating the system's behavior between observed macroscopic and hidden microscopic scales. Furthermore, it is typically assumed that these scales can be separated into slow and fast dynamics, respectively. Stochastic differential equations (SDEs) are then used as a result of the impossibility to determine all necessary processes and scales in comprehensive numerical models. In SDEs the macroscopic variables are modeled explicitly while the microscopic ones are represented by noise terms. The intuition underlying such approaches is to identify a small number of variables which are assumed to characterize the underlying dynamics, and solve the governing motion equation for the reduced-dimensional system in terms of SDEs. For instance master stability function has been extensively used to analyze the stability of the synchronous state in dynamical systems (both stochastic and deterministic) [5,6].
During recent years, interest in modeling stochastic dynamics using Langevin equations (LEs) has elevated in many disciplines [7], such as investigating turbulent cascades [8], describing nano-friction fluctuations [9], or modeling molecular dynamics [10] and financial systems [11]. A commonly used, non-parametric method for deriving the deterministic drift and stochastic diffusion terms of an LE is based on the Kramers-Moyal (KM) coefficients, which correspond to the first and second order moments containing the spatial and temporal dependencies of the underlying dynamics [12,13].
In the LE-approach, the dependencies between slow and fast variables are assumed to be negligible as a consequence of the separation of time scales. However, this assumption is not always well-grounded, for instance in Brownian motion [14], when the mass of particles is comparable with that of the surrounding particles, the random force does not obey white noise behavior anymore. Generalized Langevin equations (GLEs) have been proposed as a modification of LEs with no constraint of time scale separability [15]. A GLE equation can be retrieved from times series using projection operator techniques [15][16][17] and linear response theory [12]. In this study, we obtain the GLE model of a system within the projection operator framework. This technique, developed by Mori [15] and Zwanzig [18](MZ), provides a mathematically consistent foundation for closed, reduced-dimension models. It is restrained, however, to Hamiltonian dynamical systems around thermodynamic equilibrium. This formalism applies a projection operator P that projects the whole system dynamics onto the subspace of slow variables, and projects out the remaining degree of freedom. As a consequence of this dimension reduction, non-Markovian terms arise explicitly in the resulting equation of motion, representing interactions between the slow (observed) and the fast (hidden) variables.
Another dimension-reduction approach, belonging to a class of multivariate stochastic models, is empirical model reduction (EMR) [19,20]. This methodology, which can be understood as an extension of the classical linear inverse model (LIM) approach [21], has been extensively used in modeling observed climate variables [22] and in several cases can successfully reproduce their statistical characteristics. In constructing the EMR model, it is assumed that the time evolution of a system can be approximated by information embedded in the observed time series with only a coarse knowledge about a system's dynamics.
When fitting a prescribed drift term with free parameter to a real-world series, the residual will typically not be white neither in space nor in time; i.e., it will be state-dependent and exhibit serial correlations. To cope with this problem, the EMR approach iteratively models the time derivative of the residual as a linear function of the state of the observed variable and the residual itself, until the final residual is approximately independent (see methods section below). Substantively, this stochastic model constructs the evolution of macroscopic variables by introducing hidden components representing microscopic variables. In the parameterization procedure, hidden variables are approximated by the past observed variables, which constitutes a non-Markovian closure [20].
The main purpose of this work is to systematically investigate the statistical properties of different dynamical systems using the three aforementioned methods (KM, MZ, and EMR). In section 2, we shortly introduce the employed reduction methods. In section 3 we present a detailed comparison of the performance of these methods based on different kinds of synthetic and real-world time series, and discuss them briefly. Concluding remarks are provided in section 4.

Langevin equation
In the early twentieth century Paul Langevin proposed a quantitative description of random motion of colloidal particles suspended in a fluid, which has been known as a key problem of non-equilibrium statistical mechanics. This theory's applicability later has been extended to express the dynamical behavior of varieties of macroscopic systems without genuine particle ontologies. The original Langevin equation is a stochastic differential equation, which represents the time evolution of a subset of degree of freedoms containing both frictional and random forces that are associated with the fluctuation-dissipation theorem (FDT) [12].
Consider a system for which the evolution of the macroscopic states x(t) obeys the following equation of motion: where f(x, t) and g(x, t)η(t) represent the deterministic force (e.g., friction and gravity) and stochastic forces (e.g., noise and chaotic particle interactions in many-body systems), respectively. Here, η(t) is conventionally a stationary, δ-correlated Gaussian process with zero mean: η(t) = 0 and The presence of δ-correlated noise indicates that the Langevin process is a Markov process. For the class of stochastic processes following equation (1), the drift f and the diffusion g can be directly estimated from the measured data without any prior knowledge about the internal dynamics of a system using the Kramers-Moyal (KM) coefficients [23,24]. The nth KM coefficient D n (x, t) can be understood in terms of the transition probability densities in the limit of dt → 0. Formally, the KM coefficients are defined as: In this study, to calculate conditional moments from time-discrete data, we employed the Nadaraya-Watson estimator [25,26]: The kernel function K is here assumed to be Gaussian and accordingly the bandwidth h is determined using Silverman's rule of thumb: h = 1.06σN The numerical discretization of the LE, in Itô's interpretation of stochastic integration, is as follows: where dW denotes the increments of a Wiener process (known as a stochastic process with stationary independent normally distributed increments). The numerical integration of the equation (4) can be easily implemented using the Euler-Maruyama scheme with accuracy of order √ Δt.
Although the KM method has been applied successfully in many fields, the accuracy of estimation can be affected by finite-time effects since the estimation of drift and diffusion coefficients is sensitive to the sampling frequency of the underlying time series [27,28]. Using the adjoint Fokker-Planck equation (AFPE) [29], we can derive D 1 and D 2 analytically for example for an Ornstein-Uhlenbeck (OU) process, and compare to the discrete-time estimates to judge on their accuracy: With a special initial condition p + (x, 0) = (x − X) we can solve the AFPE analytically. For instance, in the case of an OU process (containing 10 6 points with sampling frequency of 10 −2 ) subjected to additive and multiplicative noise, respectively, we compared the first and second order KM coefficients with those obtained from AFPE. As can be concluded from figure 1, the values of the KM coefficients estimated from discrete times series according to the equation (2) and the corresponding exact solutions are in a good agreement. One can also use the second-order expansion for moments [30] to check the robustness of estimated drift and diffusion coefficients for above systems.

Generalized Langevin equation
The Langevin equation is the simplest approximation to describe the dynamics of non-equilibrium systems. As remarked earlier, the random force is assumed to behave like a Gaussian white noise, which does not lead to a good approximation if the time scale of the macroscopic variables is not much longer than that of the microscopic variables. This issue also requires a modification of the fluctuation dissipation theorem in which the correlation function of fluctuations is proportional to the memory of the frictional force. Therefore, generalized Langevin equations (GLE) have been proposed to account for long-range correlations and memory effects of complex systems that do not exhibit strong time scale separation. This generalized equation satisfies a certain fundamental consistency condition which relates the memory function to the auto-correlation of the stochastic force.
One way to derive a GLE is by means of the Mori-Zwanzig (MZ) formalism [17,31], a projection-based dimension reduction method that redefines a set of ordinary differential equations into a reduced system with a time-independent Hamiltonian as long as the system is close to equilibrium. The MZ formalism assumes that a macroscopic system can be well described by projecting the full microscopic dynamics of a system onto the space of macroscopic variables. The resulting equation consists of three terms; the first, local term defines the self-interactions of the macroscopic variables, the second, non-local term describes memory dependencies of the macroscopic variables, and the last term represents the residual force associated with the fast variables that is typically approximated by white noise: By multiplying both sides of equation (7) with x(t 0 ) and taking the average [32][33][34], we arrive at the following expression: where R(t)x(t 0 ) = 0, because the fast variable of the underlying system is uncorrelated with the slow variable. We can rewrite equation (8) in form of an Volterra integro-diffential equation [35,36] for the autocorrelation function The memory kernel K can then be calculated by solving this equation with known dC/dt and C. Note that the discrete version of the integral , where we applied the composite left rectangular formula, with t j = t 0 + hj using equal spacing h. Thus having obtained estimates for the deterministic non-Markovian and the memory term on the right-hand side of equation (7), we are able to determine the random force. The non-Markovian deterministic part on the right-hand side of the equation (7), corresponding to the drift, is nonlinear for systems not close to equilibrium and Ωx(t) is thus replaced with F( x(t) with a nonlinear function F(x(t)). In such cases, we estimate this term using the first KM coefficient and estimate the memory kernel [34] via: Note that because we are considering discrete numerical time series, we are in practice using the following evolution equation:

Empirical model reduction
The primary goal of spatio-temporal modeling of complex systems is to characterize the properties of underlying dynamics with minimal assumptions about non-sufficient observations. Over the last few years, linear and nonlinear inverse stochastic modeling approaches have been intensively developed and applied to obtain reduced models that can explain the statistics of a full system [21,37]. The linear inverse model (LIM) approach assumes that the relevant dynamics can be decomposed into linear deterministic and additive Gaussian random fluctuations terms. Although the linear simplification can be considered as a decent tool for describing dynamical systems close to equilibrium, not surprisingly, most of real-world systems exhibit nonlinear behavior. Hence, the empirical model reduction (EMR) [19] has been proposed as a nonlinear generalization of LIM approaches. The general form of the EMR approach is as follows: first, model the increments dx k = x k+1 − x k of an observed variable x as a nonlinear function of x plus a residual, typically via a least-squares minimization yielding the parameters of f: Second, model the residual as a linear function of x k and r 0 k and repeat this until the remaining residual has vanishing autocorrelation: Here, dr n k equals r n k+1 − r n k with time index k and n denoting the regression level. The presence of these hidden variables that explicitly depend on the past values of slow variables brings forth 'memory' effects. Incorporating these adaptive number of memory steps resembles the MZ formalism. Opposed to the MZ method, which provides solutions for the dynamics of a system considering the full memory via the kernel K(t) function, the EMR offers approximate solutions based on the iterative procedure described above. We refer to Kondrashov et al [20] for further details and in particular explanations of the relationship between EMR and the MZ formalism.  (13)) with Δt = 0.01 whose statistical features are shown in red color.
In equation (12), in fact, we substitute the deterministic term f by the first KM coefficient. This term can be a polynomial with different orders depending on the process dynamics. For instance, to reconstruct a stochastic process that takes place in a double-well potential V(x), we first estimate the deterministic part via the first KM coefficient and then approximate the stochastic term using the recursive procedure.

Results and discussion
In the following, we present our results of KM, MZ, and EMR performance for reconstructing various dynamical systems with different levels of complexity. The performance of these three methods is examined in terms of statistical properties such as probability density functions (PDFs) and ACFs of large sets of time series simulated with the derived SDE models.
The OU process is known to be mean-reverting, i.e., the drift coefficient controls the forcing of the state variable x(t) back to its mean μ. We generate 10 6 data points with θ = 1, μ = 0, σ = 0.5 and time sampling Δt = 0.01. In order to numerically integrate equation (13), we apply the Euler-Maruyama method. We recall that for stochastic integration, there are two different approaches, Itô and Stratonovich. In the case of the OU process with an additive noise, these two interpretations are equivalent and in this study we mostly employ the Itô method.
In this study the deterministic terms of simulated time series constructed by the three aforementioned inverse modeling methods (KM, MZ, and EMR) are estimated directly from the first KM coefficient. Figure 2 displays the statistical properties of the time series generated using the original OU process as well as corresponding, randomly chosen time series obtained from the three different methods, in comparison with the original time series. The PDFs are obtained as averages over 1000 simulated time series and the error bars indicate the deviations ±σ around the averages for the 1000 realizations. We calculate the mean squared error (MSE) as an error metric to evaluate the performance of the KM, MZ, and EMR methods in reconstructing the statistical properties of underlying systems. The summary of MSE values can be found in tabels 1 and 2. For the specific settings investigated in the case of equation (13), all three SDE models perfectly reproduce the statistic of the underlying dynamics. Here the auto-correlation exponentially decays and the memory coefficients of GLE are zero except at τ = 1, indicative of a Markovian process.
Dynamical processes with multiplicative noise are ubiquitous in the natural sciences and beyond [38][39][40]. For a system subject to an additive noise the deterministic potential corresponds to stochastic steady-state potentials through V s = V d σ , where σ is the intensity of fluctuations that are independent of the system state x. It implies that, in response to the stochastic force, the system exhibits stochastic fluctuations from its deterministic attractor. In the case of a multiplicative noise, a noise-induced drift becomes visible and consequently a new attractor basin is generated that does not exist in the absence of state-dependent fluctuations. In equation (13) we substitute the additive noise with a multiplicative noise (i.e., the diffusion term is multiplied by 1 + x 2 ) whose intensity depends on the instantaneous value of the state As it can be concluded from figure 3, all three methods work very well in reproducing the statistical features of the linear system also when exposed to symmetric multiplicative noise.
For more comparison, we analyze a system subject to asymmetric noise, which results in a skewed distribution. We thus multiply the noise term in equation (13) by (1 + x) (instead of (1 + x 2 )) and evaluate the statistical properties of the system. According to figure 4, the KM model exhibits better performance in comparison to the two other models, since it can directly estimate the (in this case asymmetric) state-dependent noise from the time series.
So far we considered additive and multiplicative noise, with η itself given by Gaussian white noise. But in many physical and biological systems, fluctuations exhibit some degree of correlation that cannot be satisfied by white noise. Therefore we substitute the stochastic term of equation (13) with a first order autoregressive process (AR(1)) and investigate the performance of the three aforementioned methods to derive SDEs in the presence of colored noise.
An AR(1) process is given by: where η is a Gaussian white noise process with zero mean and constant variance. According to figure 5(b), the statistical properties of simulated time series constructed via the KM method can not perfectly follow the original ones in this case. From tables 1 and 2 it can also be concluded that MZ and EMR perform better (with smaller MSEs). This is expected, since the presence of colored noise implies a deviation from the white-noise assumption of the LE and the KM method to derive it. It is also of interest to evaluate the performance of these SDE models when the system exhibits some short-term memory in the deterministic part. For this purpose we consider a stochastic delay differential equation given by: The random force R is given by serially correlated noise produced by a second-order autoregressive (AR(2)) process in which the current value depends on the two previous values. As can be observed from figure 6, the EMR and MZ methods yield significantly better approximations of the PDF and ACF than the KM method for this system with short-term memory in the drift and colored noise. This provides an instructive example for situations where the MZ and EMR approaches clearly outperform the KM approach (as theoretically expected).

Real-world time series
In recent years, quantifying stochastic dynamics of financial time series (e.g. stock prices and stock market indices) through stochastic differential equations in order to describe their evolution has attracted considerable attention. In this study we analyze the weekly S & P500 stock index for the time span of 35 years (1950-1985) and analyze its distinctive statistical properties. Here the stock return price ln p(t+δt) represents the state variable x(t). The statistical results of the simulated time series estimated by the three inverse modeling methods is shown in figure 7. According to the KM coefficients the deterministic part of the dynamics is described by a linear function of the state variable, while the stochastic term exhibits nonlinear behavior.
As mentioned above, we exert the first KM coefficient also to approximate the deterministic term of the GLE and EMR models. In this first real-world case, the approximation of the PDFs is not perfect anymore. The results reveal that the MZ formalism outperforms the two other methods in terms of the ACF, with very good agreement to the original return series ( figure 7(b), second row, second column). It is worth mentioning that the statistical properties of financial time series exhibit a time-scale dependence and long-range correlations [41,42]. Therefore we conclude that the memory effects of recent returns occurring in different time scales (from minutes to several days) can be modeled well by the MZ technique.
Another empirical time series that exhibits uni-modal variability is the Niño-3 index [43], which is one of several El Niño-Southern oscillation (ENSO) indicators in regard to tropical Pacific sea surface temperatures (SST). During the last decades, understanding the mechanisms underlying ENSO variability and prediction of future fluctuations has attracted substantial attention [44][45][46][47]. ENSO describes variations in temperature and pressure in the eastern Pacific ocean and has significant impacts on global climate variability. We reconstruct the Niño-3 monthly sea surface temperature (SST) indices averages across (5N-5S, 150-90W) from 1891 to 2015 using the KM, MZ, and EMR inverse modeling approaches. The non-Gaussian behavior of Niño-3 indicates a nonlinear process, quantified by the positive skewness of the SST distribution, which may reside in the interaction of oceanic variables of interest and the atmospheric fast forcing [48]. This kind of interactions brings about memory effects into the system dynamics that should not be expected to be fully captured by models established based on Markovian assumptions. Figure 8(b) displays the PDFs and ACFs of the resulting simulated time series. Although all three inverse methods produce similarly skewed and heavy-tailed distributions, comparing the AFCs, MZ achieves a higher accuracy approximation than KM and EMR. We repeated this procedure for the Niño-4 index, and also in that case found that the MZ approach outperforms the other two methods. According to these results, the most significant characteristic of these real-world processes leading to ENSO variability is the presence of serial correlations connected to internal interactions between observed (slow) and unobserved (fast) variables which can be captured best by the MZ approach.
In order to investigate whether these results are prone to overfitting, we first calibrated the three SDE models on the first half of the time series and validated the parameters by comparing the resulting time series statistics with the ones of the second half of the time series for both systems (Niño-3 and S & P500, respectively). The results are presented in figures 9(a) and (b). It can be inferred that the longer-term variations in the ACF are still captured to some degree by the MZ method. For the ENSO case, the slow variations of the ACF correspond to low-frequency variability present in this time series, which is not noise but an important part of (internal) climate variability. The fact that the reproduction of these slow variability modes is less accurate when only calibrating on the first half of the time series is due to the fact that capturing these slow variability modes is harder when considering only a shorter part of the time series. For the S & P500 time series however, the comparably faster fluctuations of the ACF most likely correspond to noise. The very good approximation of the ACF when applying the MZ formalism to the full S & P500 time series could hence be the result of overfitting when deriving the full kernel K.

Synthetic time series
Up until now we studied processes with a unique mode; however, there are various stochastic dynamics in natural systems that do not exhibit uni-modality. We start with generating a synthetic process in which a particle moves in a double-well (DW) potential at ± √ θ, driven by additive Gaussian noise Here we construct the stationary time series with θ = 1, σ = 0.5 and Δt = 0.01. For all three inverse models the specific values of the drift terms are again derived from the first-order KM coefficients from the original time series. The non-Markovian term in the GLE derived via the MZ formalism, which accounts for interactions between the slow and fast variables, was estimated from the information encoded in the ACF [32,34] (equation (7)). Considering the Markov property of the underlying system (17), the kernel function K of the MZ approach is zero (as theoretically expected). The left column in figure 10 illustrates the average PDFs of model-simulated time series in comparison with original time series, for 1000 realizations, and the right column displays the associated ACFs. It can be seen that these three stochastic models can mimic the key features of the original time series very well.
In contrast, in the presence of a multiplicative noise, estimating the underlying dynamic is not trivial anymore. We substitute the additive noise in the previous example with √ 1 + x 2 . As it can be observed in

Real-world time series
Identification and understanding the triggering mechanism of the Dansgaard-Oeschger events, abrupt temperature increases during the last glacial period (roughly from 115.000 years to 12.000 years BP) has attracted considerable attention in the past years [49][50][51][52][53]. These rapid and strong warming events (up to 16k within few decades [54]), followed by slow-paced cooling phases are inferred from the content of stable isotope composition of water δ 18 O in different Greenland ice cores. These proxy records allow to empirically reconstruct the temperature profile over the last 120.000 years with high temporal resolution. In this study we investigate the high-resolution (20 years-average) Ca 2+ (interpreted as a proxy for atmospheric circulation patterns), collected from the NGRIP ice core on the GICC05 time scale [55]. Because of the substantially better signal-to-noise ratio, we focus here on the Ca 2+ time series between 60ka and 30ka b2k. We apply the KM, MZ, and EMR approaches to reproduce dynamical and statistical properties of the underlying dynamics of [Ca 2+ ].
Since the integration of the diffusion term of an SDE is not uniquely defined, two different frameworks for stochastic calculus have been proposed, namely the Itô and the Stratonovich scheme (see appendix A for further details). Up to now, we have interpreted the SDEs in the Itô sense, for which a consistent discretization scheme is the Euler-Maruyama method with straightforward white-noise increments.
In contrast, in the Stratonovich calculus white noise is approximated by continuously fluctuating noise with finite memory, which may be more suitable for approximating real-world time series. Indeed, it has been argued that the Stratonovich scheme is more appropriate for continuous physical systems, such as ocean and atmospheric circulation systems [56]. A well-defined SDE discretization scheme that is consistent with the Stratonovich interpretation is the Heun method, which has been shown to perform with higher precision for a dynamical system model of paleoclimate variability [57].
We compared the performances of both stochastic calculi and corresponding discretization schemes (Euler-Maruyama and Heun, respectively) for the Ca 2+ ice core time series. According to figure 12, the Stratonovich calculus performs better than the Itô calculus in terms of approximating the PDF. This likely corresponds to the intrinsic properties of the Stratonovich interpretation in which fast quantities with finite correlation time, as observed in continuous physical systems such as climate variables, can be approximated as white noise. The better performance obtained using the Stratonovich calculus is in agreement with observations made in earlier studies [56,57]. Figure 13 shows the statistical properties (PDFs and ACFs) of the observed and simulated [Ca 2+ ] time series in the interval 60.000 years to 30.000 years before the year 2000 AD. Because of the large amplitude of the [Ca 2+ ] concentration variations, the calculations were conducted in natural logarithmic scale [50]. The drift term, again directly derived from the KM coefficient for all three inverse methods, exhibits a nonlinear behavior with two stable equilibrium states, which reflects the switching mechanism between cold stadials and warmer interstadials.
The results presented in figure 13 show that the modeled time series for all three methods could accurately reproduce the bimodality of the observed PDFs, although it should be noted that the LE derived with the KM approach approximates the observed PDF better than the GLE and EMR approaches: the depth of the two potential wells in MZ and EMR are shallower than the observed PDF suggests. It is clear from the results illustrated in the right column of the figure 10(b) that the MZ

Conclusions
In the last decades, there have been considerable methodological advances in understanding and quantitatively modeling the behavior of complex systems. Among others, stochastic differential equations (SDEs) are a promising approach for this challenge in situations where only a few variables are actually measured. For some typical example settings, we have studied the performance of three methods to derive SDEs-the Kramer-Moyal approach to derive Langevin equations (LE) with potentially multiplicative noise, the Mori-Zwanzig approach to derive generalized Langevin equation (GLE) including a non-Markovian term, and the empirical model reduction (EMR) approach to derive GLEs-for various synthetic and real-world time series. Since the microscopic dynamics of complex systems are often not accessible, these methods regard the unobserved, fast variables as random fluctuations and investigate the properties of underlying systems from the evolution of only a few macroscopic, comparably slow observables. In this study corresponding numerical simulations of all three inverse methods have been examined in terms of PDFs and ACFs of the simulated time series, as metrics for assessing the models' performance. We generally observed a nearly optimal performance of all three approaches for unimodal Markovian systems. For unimodal, non-Markovian systems, the MZ and EMR strongly outperformed the KM approach as theoretically expected. It is thus not possible to draw a firm conclusion about which SDE deviation method is in general superior in comparison to the others. According to our results, the performance of SDE models strongly relies on the effects of memory on the underlying dynamics. We could show that LEs (as derived by the KM approach) obtain better results for systems with weak memory contribution but asymmetric multiplicative noise, since it can directly estimate the state-dependent noise with higher precision. On the other hand, for systems with memory effects and colored-noise forcing, it is essential to take into account the non-Markovian closure terms. Hence, the MZ and EMR approaches can be considered more reliable in reconstructing the dynamics of systems exhibiting strong memory effects. In these two methods, the interactions between observed and unobserved variables are taken into consideration in terms of memory effects. That is, the EMR approach incorporates adaptive numbers of memory steps, whilst the MZ method considers the full memory of a system via the kernel K. With regards to prospective investigations, we suggest that the adaptability of the three investigated SDE models in reproducing dynamical characteristics of multi-variate time series should be further investigated. For example, Boers et al [50] introduced a two-dimensional stochastic delay differential equation model as an approximation of the GLE to analyze the temperature and dust proxy records from Greenland ice cores. Their results revealed the role of the coupling strength between the two variables, where the model has problems in reproducing the bimodality in absence of coupling terms, and also shows that the explicit consideration of memory terms helps to reproduce the asymmetry observed in those temperature and dust proxy records. Furthermore, it should be of interest and high practical relevance for real-world systems to consider an adaptive stochastic differential equation framework that can reconstruct the underlying dynamics of complex systems driven by Lévy noise.