Finite-range viscoelastic subdiffusion in disordered systems with inclusion of inertial effects

This work justifies the paradigmatic importance of viscoelastic subdiffusion in random environments for cellular biological systems. This model displays several remarkable features, which makes it an attractive paradigm to explain the physical nature of biological subdiffusion. In particular, it combines viscoelasticity with distinct non-ergodic features. We extend this model to make it suitable for the subdiffusion of lipids in disordered biological membranes upon including the inertial effects. For lipids, the inertial effects occur in the range of picoseconds, and a power-law decaying viscoelastic memory extends over the range of several nanoseconds. Thus, in the absence of disorder, diffusion would become normal on a time scale beyond this memory range. However, both experimentally and in some molecular-dynamical simulations, the time range of lipid subdiffusion extends far beyond the viscoelastic memory range. We study three 1d models of correlated quenched Gaussian disorder to explain the puzzle: singular short-range (exponentially correlated), smooth short-range (Gaussian-correlated), and smooth long-range (power-law correlated) disorder. For a moderate disorder strength, transient viscoelastic subdiffusion changes into the subdiffusion caused by the randomness of the environment. It is characterized by a time-dependent power-law exponent of subdiffusion, which can show nonmonotonous behavior, in agreement with some recent molecular-dynamical simulations. Moreover, the spatial distribution of test particles in this disorder-dominated regime is shown to be a non-Gaussian, exponential power distribution, which also correlates well with molecular-dynamical findings and experiments. Furthermore, this subdiffusion is nonergodic with single-trajectory averages showing a broad scatter, in agreement with experimental observations for subdiffusion of various particles in living cells.

The viscoelastic nature of subdiffusive processes in the cytosol of biological cells in its functional liquid-like state received a solid experimental support [4,11,14,15,17,22,24,28,86,[88][89][90]. It was, however, also challenged [19,56,57,91] due to a broad distribution of diffusion coefficients derived from single-trajectory averages because free viscoelastic subdiffusion is ergodic [55,64,65,92]. Hence, diffusion coefficient derived from single-trajectories should not be broadly distributed unless the test particles are broadly distributed in sizes, if diffusion is ergodic. Moreover, the single-trajectory and ensemble averages should coincide. This is, however, frequently not the case. One needs to explain a combination of viscoelastic subdiffusion with non-ergodic features like a large scattering of single-trajectory averages. For living cells, it seems obvious that viscoelastic subdiffusion, as a basic pertinent mechanism, should be combined with trapping caused by elements of the cytoskeleton and molecular crowding. This general idea was until now expressed within three different modeling approaches: (i) by a combination of FBM with CTRW via a subordination, i.e., introducing a randomized time in FBM [86]; (ii) by considering FBM living on a fractal structure [17], and (iii) by considering viscoelastic GLE subdiffusion in random potentials [87] modeling a quenched disorder. The third approach appeals as more physical and fundamental, allowing for various modifications and huge variations, including natural generalizations beyond thermal equilibrium [65].
Indeed, locally memoryless diffusion in disordered media modeled by a random potential or random force-field provides the foundation of several anomalous diffusion theories [52,93]. For example, the model of CTRW with power-law distributed residence time and divergent MRT in traps comes naturally from the model of exponential energy U(x) disorder with energy fluctuations characterized by a root-mean-square (RMS) amplitude σ = δU 2 (x) 1/2 , yielding α = k B T /σ < 1 for σ > k B T [50,51,54]. Furthermore, uncorrelated random Gaussian force f (x) model, f (x)f (x ′ ) ∝ δ(x − x ′ ) yields a Brownian motion of the potential U(x) = − x f (x ′ )dx ′ in space, with growing variance, δU 2 (x) ∝ x. It provides a continuous space generalization of the original lattice Sinai model [94] as a Langevin-equation-based subdiffusion in such a fluctuating potential. It yields δx 2 (t) ∝ | ln(t/t 0 )| a with a = 4 [93,95,96], i.e., a logarithmically slow subdiffusion. The model of Gaussian energy disorder naturally emerges due to the central limit theorem [93] in many physical systems ranging from organic photoconductors [97,98] and supercooled liquids [99,100] to diffusion of regulatory proteins on DNA tracks [36][37][38][101][102][103][104][105]. Here, Sinai model should be contrasted with the model of uncorrelated or short-range correlated random stationary Gaussian potentials with a finite energy RMS fluctuation, σ, and normalized autocorrelation function g(x) = U(x ′ + x)U(x ′ ) /σ 2 , which seems to be more appropriate for such systems. It was, however, long time thought to lead just to normal diffusion with a renormalized diffusion coefficient [97,99,106,107]. This theoretical result is known as de Gennes-Bässler-Zwanzig renormalization of diffusion and rate processes by disorder [107]. However, numerical results on the overdamped Langevin diffusion [36,[79][80][81][82][83] and Monte Carlo jump diffusion [9,108] revealed a typical subdiffusion even for a short-range correlated disorder, like g(x) = exp(−|x|/λ), on the spatial scale [36,83] strongly exceeding the correlation length λ, when σ exceeds k B T by several times. Also experiments with colloidal particles in optically created random Gaussian potentials confirm the existence of such transient subdiffusion [8,9]. This subdiffusion occurs paradoxically much faster in absolute terms than the normal diffusion expected from the de Gennes-Bässler-Zwanzig renormalization [36,83]. Its non-ergodic origin is by now well understood [36,83]. Remarkably, de Gennes-Bässler-Zwanzig renormalization is mathematically valid asymptotically for any stationary ergodic Gaussian potential with decaying spatial correlations, g(x) → 0, x → ∞ [36]. However, it becomes practically irrelevant and even misleading for σ ≫ k B T . Moreover, strikingly enough, transient subdiffusion emerges also for a potential, which is uncorrelated on the sites of a lattice with a discretization step or lattice period ∆x and continuous in between [83]. It can also last very long with λ → ∆x in its spatial range estimate [83].
Another recent surprise was provided by the emergence of a Sinai-like logarithmic diffusion with b ∼ 1 − 4 in such stationary Gaussian potentials for σ > 5 k B T [83]. All these remarkable features were explained within a simple scaling theory [83] inspired by one developed for the continuous-space Sinai diffusion by Bouchaud at al. [93,95]. It explains also the origin of a very long transient regime of standard, power-law-scaling ensemble subdiffusion with α ≈ 2k B T /σ eff and σ eff ≈ (1.24 − 1.52)σ depending on the model of g(x) [83]. This dependence looks remarkably similar to the case of exponential disorder, and the single-trajectory averages are also broadly scattered. Generally, α(t) is time-dependent, with an extended (for σ > 3k B T ) intermediate regime of α(t) ≈ 2k B T /σ eff = const, which turns into a very slow logarithmic increase, α(t) ∼ 2(k B T /σ) 2 ln(t/t 0 ), while approaching the asymptotic regime of normal diffusion [83]. For these reasons, we find frequently subdiffusion in random Gaussian potentials in the case of a sufficiently strong disorder, at odds with the outdated dogma of simple renormalization [107]. In this respect, it is pertinent to note that for proteins nonspecifically bound to DNA, disorder strength can reach 103,105], i.e., it is not small at all, contrary to what is often stipulated [102]. To think in terms of normal diffusion with a renormalized diffusion coefficient in such a situation is very misleading [36,83]. Indeed, for a typical D σ=0 = 3 µm 2 /s [109], D σ ≈ 4 × 10 −5 nm 2 /s for σ = 5 k B T , which means that diffusion over 2 nm would take over one day [83]. Subdiffusional search is orders of magnitude faster [36][37][38] than this renormalization predicts. In a combination with a local energy bias caused by the disorder correlations [83], this paradoxically fast subdiffusion provides a way to resolve long-standing speed-stability paradox [105]. Such a local energy bias caused by correlated disorder can indeed direct search to a target binding site and greatly accelerate it [83,104].
The first systematic study of overdamped viscoelastic subdiffusion in random Gaussian potentials was done in Ref. [87] for two fundamental models of correlations: the already mentioned exponentially decaying correlations (Ornstein-Uhlenbeck process in space) and power-law decaying correlations with infinite correlation range. An earlier study was also done in [110] for a peculiar model of g(x) and a weakly corrugated parabolic potential. One of the significant results of [87] is that a small disorder, σ < 2 k B T , makes an almost negligible impact on viscoelastic subdiffusion on the ensemble level, i.e., viscoelasticity wins over disorder. However, the disorder can introduce a substantial scatter of single-trajectory averages already for σ ∼ 1 − 2 k B T . Another significant result is that for a stronger σ > 2 k B T disorder such a diffusion starts to look on the ensemble level as diffusion caused by Gaussian disorder (in the absence if viscoelastic memory effects), i.e., the disorder seems to win over viscoelasticity. It can easily be mistaken for diffusion caused by CTRW with divergent MRT [87].
However, residence times are described by a generalized lognormal distribution, and an appropriate scaling time of subdiffusion is expressed through the fractional friction coefficient [87].
In this paper, we include inertial effects in viscoelastic motion [64,65], which are pertinent, e.g., for subdiffusion in lipid systems [30,32,34]. Next, we are looking for a physical explanation of unusual non-Gaussian, exponential power distribution P (x, t) ∝ exp −|x/x χ (t)| χ(t) of spreading the test particles positions revealed in viscoelastic subdiffusion in cytosol of living cells [28] (χ was fixed to the value χ = 1 of the Laplace distribution in this paper while fitting the experimental data) and in biological membranes with a generally time-dependent power-law exponent χ(t) χ(t) ∼ 1.4−2.2 [33,34] and a time-dependent width x χ (t). We believe that the medium's disorder causes it. Indeed, Langevin's memoryless diffusion in Gaussian potentials yields such a distribution with χ(t) ∼ 1.4 − 2; see Fig. 2 of Supplementary Material in [36]. Also experimental distributions of colloidal particle displacements in optically created random Gaussian potentials look similar, cf. Fig. 11 in [8]. For viscoelastic subdiffusion in random Gaussian potentials, it was, however, not studied until now. Next, viscoelastic subdiffusion in pure homogeneous lipid bilayers has a finite time range. For example, for the cases studied in Ref. [32] subdiffusion regime with α ≈ 0.6 lasted until about 10 ns and then crossed over into normal diffusion. However, in lipid membranes with cholesterol and disordered lipid gel phases, crossover occurs not to normal diffusion but rather to another subdiffusion regime with α ≈ 0.8 that lasts until 100 ns [32,34]. It is natural to suppose this regime is due to the medium's disorder, rather than viscoelasticity, or presents a combination of both.
Moreover, a non-monotonic behavior of α(t) was found in such complex lipid systems crowded besides with proteins [34,111]. Namely, after reaching a maximum α(t) can drop for a while and then increase again. In this paper, we conceive and investigate a pertinent basic model capable to qualitatively explain these puzzling features as a combination of transient viscoelastic subdiffusion and subdiffusion caused by the disorder.

Model and Theory
The model is based on subdiffusion governed by a Kubo-Zwanzig GLE [70,71,112,113] mẍ in a zero-mean random stationary Gaussian potential U(x) characterized by a stationary ACF g(x) and RMS σ. It yields a quenched random force, f (x) = −dU(x)/dx. In Eq. (1), m is the mass of a particle, η(t) is a frictional memory kernel and ξ(t) is a zero-mean thermal Gaussian noise force, which is completely characterized by its autocorrelation function (ACF) ξ(t ′ )ξ(t) . The friction and noise are related by the fluctuation-dissipation relation, ξ(t ′ )ξ(t) = k B T η(|t − t ′ |), reflecting thermal fluctuation-dissipation theorem (FDT) [70]. The memory kernel is assumed to have a power law decay, η(t) = η α exp(−ν h t)/[t α Γ(1 − α)], with an exponential memory cutoff ν h . Here, η α is the fractional friction coefficient and τ h = 1/ν h is a crossover time to normal diffusion, in the absence of quenched random force, f (x) = 0. Then, the natural time unit is the velocity relaxation time scale τ v = (m/η α ) 1/(2−α) entering the stationary velocity autocorrelation (VACF), which in the limit τ h → ∞ reads [64,65,75,114] K It provides a very good approximation within our model for t ≪ τ h and τ h ≫ τ v , which is assumed. Here, v T = k B T /m is thermal velocity, and E a (z) = E a,b=1 (z) is the Mittag-Leffler function, E a,b (z) = ∞ n=0 z n /Γ(an + b). The law of diffusion is given by the twice-integrated VACF, for initial Maxwell distribution of velocities and t ≪ τ h . Initially, diffusion is ballistic, where D α is the fractional diffusion coefficient obeying a generalized Einstein relation, D α = k B T /η α , which also expresses FDT. Such a course-grained for t ≫ τ v process is nothing else FBM and this FBM description holds for τ v ≪ t ≪ τ h . For t ≫ τ h , the normal diffusion regime gradually emerges. In this work, we scale time in The potential is considered on a lattice with spatial grid size ∆x. It presents randomly generated energy values on the lattice sites, which are continuously connected by parabolic splines, so that that the quenched random force f (x) is piece-wise linear in space [115]. A spectral algorithm for the generation of such a random potential is described in [115]. It was successfully used in many papers and requires to specify some large spurious periodization length L and g(x). We consider three models of correlation decay: (1) g(x) = exp(−|x|/λ) (short-range singular disorder), (2) g(x) = 1/[1 + x 2 /λ 2 ] γ/2 with γ = 1 and g(x) ∼ (λ/|x|) γ at x ≫ λ (long-range nonsingular disorder with diverging correlation length), and (3) g(x) = exp(−x 2 /λ 2 ) (shortrange non-singular disorder) for two values λ = 10 and λ = 100. The model of power-law correlated disorder with γ = 1 emerges due to a charge-dipole interaction, e.g., in the context of charge diffusion in molecularly doped polymers [98], where typically σ ∼ 4 in units of room k B T r ∼ 0.025 eV. The electrostatic nature of this model of correlated disorder qualifies it as an important physical model. The models of exponentially correlated disorder (Ornstein-Uhlenbeck process in space) and Gaussian correlations are basic models of generic interest. Exponential correlations lead to a singular model, while a Gaussian decay provides a non-singular model of short-range correlations. Typical realizations of corresponding random potentials are shown in Fig. 1. Notice a very rugged character of potential fluctuations in the case of exponential correlations and a strong local bias on a typical scale of λ in the cases of exponential and power-law correlations. In this respect, the character of a smooth disorder is not changed upon making ∆x ever smaller, provided ∆x ≪ λ: there are typically about one or two local potential minima per λ length. The Ornstein-Uhlenbeck process in space (exponential correlations) is very different: there are huge many local minima and maxima within its correlation length and their number increases upon diminishing ∆x. Indeed, this process can only be defined on a lattice with some finite ∆x given its singular character because for this process the force RMS f 2 (x) 1/2 = σ 2/(∆xλ) → ∞ diverge with ∆x → 0 [36,83,87]. Power-law-decaying and Gauss-decaying correlations yield smooth U(x) [83,87]. In these cases, when ∆x → 0. Since the analytical treatment of diffusion and transport processes of strongly non-Markovian GLE dynamics in random correlated potentials is not feasible, we investigated it numerically.

Numerical approach
The numerical approach is based on approximating the memory kernel by a sum of exponentials and a hyper-dimensional Markovian embedding of the low-dimensional GLE dynamics. It is very well developed both in application to sub-diffusive [64,65,87] and super-diffusive GLE [116][117][118], including FLE dynamics. This numerically accurate approach is detailed in Appendix A.

Ensemble vs. trajectory averages, ergodicity and its breaking
Notice that ergodicity of random processes can be understood in various senses [119], with ergodicity in the mean value serving as the basic central notion [119]. The ergodicity of a diffusion process x(t) is considered in the current literature as ergodicity of x(t) in its squared increments [55]. Namely, it is considered as a mean-value ergodicity of the stochastic process defined by the squared position increments, Here, t ′ is the running time of this process and t is a forward time-increment of the i-th trajectory counted from this running time. The time-average of these squared position increments over t ′ is defined as δx 2 , as a running trajectory average within a time window ∆T w = T w − t ≫ t. It depends on t and T w . In 1d, the single- with growing t. Often, the definition involves an additional gamma-function factor , which we will, however, not use for the singletrajectory averages. The empiric ensemble averaging presents a more common concept, M will be mostly omitted in the following while assuming that is sufficiently large. Moreover, t ′ is set to zero in the corresponding ensemble averages, while sampling initial particles positions randomly in space. Notice that only for the stationary in wide sense increments this average does not depend on t ′ . Diffusion is considered ergodic if both averages exist and coincide, δx 2 From a mathematical point of view, these averages can be identical only for processes with stationary increments. Otherwise, they cannot be expected to coincide [119,120]. Nevertheless, from a physical point of view both averages always make sense and should be considered different and mutually complementary characteristics of diffusion processes. Evidently, they are not expected to ever coincide in disordered media with quenched disorder, where the position increment for any particular trajectory is never stationary. By Khinchin theorem [119,120], a sufficient condition for ergodicity of a stationary random process in the mean value is the decay of its ensemble stationary autocorrelation function (ACF) to zero (in the limit M → ∞, which is equivalent to the ensembleaveraging in a proper mathematical sense using the probability density function). Slutsky's theorem poses an even weaker condition: it is necessary and sufficient that the time-average of the stationary ensemble ACF decays to zero in the limit of infinite averaging interval [119,120]. For the discussed ergodicity of diffusion processes, a more complex sufficient condition necessarily emerges, and it can be formulated in terms of four-points correlation functions of increments [119]. However, for Gaussian processes, it reduces to decay of the (infinite) ensemble autocorrelation function of increments to zero [119,120]. It is the case of FBM (a singular non-differentiable stochastic process with stationary increments) and its inertial generalization [64,92] based on GLE, including FLE, with a well defined (by contrast with FBM) velocity v(t) =ẋ(t). We consider it in this paper as a basic process, which is ergodic in the absence of an external potential. However, when a trapping (e.g., bistable) potential is present, the limit of sufficiently large t ′ → ∞ must be considered. Otherwise, both averages can never coincide even for perfectly ergodic processes, requiring some relaxation time to reach a stationary limit within a potential trapping domain. This is also the reason why a meanergodic, or even correlation-ergodic process can be profoundly non-ergodic, e.g., from the viewpoint of statistics of transitions between two trapping subdomains [68,69,121]. different from one expected from the relaxation dynamics in such trapping domains. In such a case, swift transitions between two trapping domains can occur in the background of a slow relaxation dynamics within the whole domain of bistability [64,65,69]. Here, the main preconditions of the rate theory [107] are violated, and, nevertheless, it can remain useful and predictive, to a certain extent [48,64,65]. In periodic potentials, such an inertial generalization of FBM is also asymptotically ergodic [64]. However, the pace of establishing this ergodic asymptotics heavily depends on the potential amplitude in units of the thermal energy k B T , and this asymptotics is not necessarily achievable in practice [64]. These remarks already explain why viscoelastic subdiffusion in complex environments is generally not ergodic, and a scatter of single-trajectory averages does not contradict the viscoelastic nature of subdiffusion in complex inhomogenous fluids, including cytosol of biological cells.
On the practical level, we neither ever have infinitely long trajectories nor infinitely large ensembles. Hence, both averages never coincide empirically exactly even for ergodic process. However, for a genuinely ergodic process, they do agree well for sufficiently large ∆T w ≫ t and M ≫ 1, which is the case of FBM and potential-free GLE subdiffusion [55,64,65,92]. For this, however, ∆T w must exceed t by a factor of at least 100. Notice that some authors consider even T w ∼ t or ∆T w ∼ 0 in discussion of ergodic properties [55]. It makes, strictly speaking, a little sense in this context. This point is crucial because most of the existing thus far experimental confirmations of aging in anomalous diffusion in living cells have T w , which at best is only one hundred times larger than the corresponding t. Such proofs lack a profound statistical significance, especially considering that also M is, at best, several hundred in most experiments, as we discuss below in the context of our model.

Ensemble averages
We first performed calculation of the ensemble averages with M = 10 5 particles in each case (cf. Appendix A for pertinent details) using three models of correlation decay and two values of λ = 10 and λ = 100. The memory kernel approximation is chosen in such a way that FLE subdifussion with α = 0.6 should last in the absence of potential until about t = 10 3 . Then, it gradually changes into asymptotically normal diffusion, see the cases of free diffusion in Figs. 2, 3 (a). This behavior is convenient to characterize by a time-dependent power law exponent α(t) = d ln δx 2 (t) /d ln t, see in Figs. 2, 3 (b). Initially it is always ballistic, α = 2, δx 2 (t) ≈ (v T t) 2 , given initially thermally distributed velocity variable in this work. Then, after some transient an FBM regime with α eff ≈ 0.6 establishes for t ≥ 10 and lasts until t ∼ 10 3 − 10 4 . Then, significant deviations from α = 0.6 start. The power exponent grows and the regime of normal diffusion gradually establishes until the end of simulations at t max = 10 7 . This behavior strongly reminds one for diffusion of lipids in 1,2-Distearoyl-sn-glycero-3-phosphocholine (DSPC), 1-stearoyl-2-oleoyl-sn-glycero-3-phosphocholine (SOPC), and 1,2-Dioleoyl-sn-glycero-3-phosphocholine (DOPC) lipid bilayers in [32,34], see Fig. 8 in [34]. Therein, FBM and FLE subdiffusive regime with α = 0.6 lasts until about 10 ns and then changes gradually into normal diffusion. With τ v = 0.323 ps (Appendix B), our t = 10 4 corresponds to 3.23 ns and t = 10 7 to 3.23 µs.
The presence of a random potential radically changes this simple picture. Within the time-range of purely viscoelastic subdiffusion (t < 10 3 ) disorder practically does not influence it on the ensemble level, see in Figs. 2, 3, (a). It is especially surprising for a very rough exponentially correlated disorder with λ = 10 and σ = 2 in Fig. 2 (a) (see a typical realization of such a disorder in Fig. 1 (a)), and also for a power-law correlated disorder for σ = 4 and λ = 10. Naive expectation based on the de Gennes-Bässler-Zwanzig renormalization is that the diffusional spread should be suppressed by the factor exp[−σ 2 /(k B T ) 2 ], which is approximately 0.01832 for σ = 2 and 1.12535 × 10 −7 for σ = 4. However, this expectation is not justified; see in Fig. 2 (a). Time-dependent α(t) shows some dynamics in Fig. 2 (b). However, changes on the level of δx 2 (t) are not easy to detect in part (a). This striking feature was described earlier within a bit different model (no inertial effects, diffusion is initially normal due to a normal friction component in GLE) in [87]. Also, in periodic potentials, GLE subdiffusion is asymptotically insensitive to the presence of potential [64,65]. For a sinusoidal potential, it can be even proven rigorously within a quantum-mechanical setting while considering the sub-Ohmic model of generalized Brownian motion [114], which corresponds to the considered viscoelastic GLE subdiffusion [65]. In periodic potentials, however, transient effects can last very long, depending on the potential amplitude [64,65]. In the present case, no related transients can be seen on the ensemble level, even for σ = 4 in Fig. 2 (a), which is quite surprising. The influence on the level of single-trajectory averages is, however, in the case λ = 10, profound, see below. Next, transition to normal diffusion regime does not occur until t max = 10 7 . A new anomalous diffusion regime caused by disorder is gradually established for t > 10 5 , with vibrant transient features, which depend on the type of correlation decay, σ, and λ.
Indeed, for exponential correlations and σ = 2, α(t) ≈ 0.55 − 0.57 < 0.6 for t = 10 2 − 10 4 in Fig. 2 (b) (full red line therein). For t > 10 4 it starts to grow slowly and reaches α ≈ 0.774 at t = 3.2 × 10 6 . In this respect, it is worth noting that in the case of overdamped normal diffusion in such a potential, α(t) reaches a minimal value α ≈ 0.70 [36,83] before it starts to grow logarithmically slow in time until unity. We expect such a slow growth also in the case under study. However, the asymptotic regime of normal diffusion is numerically out of reach. Also in the case of Gaussian correlation decay, α(t) first drops slightly below 0.6, reaching its minimum α min ≈ 0.50 at t ≈ 2×10 3 and then slowly grows until α ≈ 0.77 at t = 3.2×10 6 . Especially interesting is the case of power-law correlations with divergent correlation length because in this case subdiffusion caused by the correlated Gaussian disorder alone (without viscoelastic memory effects) is the longest [83]. In this case, α(t) stays close to 0.6 for t between 10 and 10 5 . Then, it grows and stays nearly constant α(t) ≈ 0.72 from t = 10 6 till the end of the simulations. In this respect, α ≈ 0.70 is about the minimal value of transient subdiffusion caused also by Gaussian power-law correlated disorder [83]. In the case of normal overdamped diffusion in considered random potentials, the spatial range of nonergodic transient subdiffusion is determined by a nonergodicity length L erg , which is a typical length scale, where the statistical weight function w(x) = exp[−U(x)/(k B T )] lacks self-averaging [36,83]. For σ = 2, L erg ≈ 35λ in the case of exponential correlations, L erg ≈ 52λ in the case of Gauss-decaying correlations, and L erg ≈ 167λ in the case of power-law decaying correlations with γ = 0.8, see Table I in [83]. Hence, in the case of power-law correlations (here γ = 1 vs. 0.8 in [83]) one can expect that subdiffusion will last at least until about δx ∼ 1.67 × 10 3 x 0 for λ = 10 and δx ∼ 1.67 × 10 4 x 0 for λ = 100, which for x 0 = 0.026 nm (Appendix B) would correspond to 43.42 nm and 434.2 nm, correspondingly. With increasing σ, L erg grows super-exponentially fast. Already for σ = 4, L erg ∼ 10 6 λ for a short-range correlated disorder [83]. Then, the corresponding nonergodicity length reaches practically a macroscale, L erg ≈ 0.26 mm and 2.6 mm at λ = 10 and λ = 100, correspondingly. In this respect, the case of power law correlation in Fig. 2 with σ = 4 is interesting. From t = 10 4 on in part (b), α(t) ∼ 0.44 − 0.47 with α min ≈ 0.44 at the end of simulations, where it seems to slightly decline further. It agrees well with α min ≈ 0.43 for power-law correlations in Fig. 3 (b) of Ref. [83] obtained for memoryless diffusion. The case of λ = 100 in Fig. 3 is interesting for some lipid-proteins systems. For such a large λ and a smooth disorder, the time range of viscoelastic diffusion is restricted virtually by motion within one local minimum of the random potential, cf. Fig. 1  (b), of a typical spatial size λ. In this case, α(t) dynamics in Fig. 3 (b) shows an interesting novel non-monotonic feature in all studied cases. Namely, it first starts to grow, like in the potential-free case. However, after a maximum is reached, α(t) drops. For example, for exponential correlations (in this singular case, a very rough multiple- extrema structure of potential holds also on a spatial scale much smaller than λ) and σ = 2, α max (t) ≈ 0.763 is reached at t = 5.4 × 10 5 . After this, it drops to α(t) ≈ 0.717 at t = 3.17 × 10 6 . Given the results in [83], one can predict that it will further drop to α ≈ 0.70 and then start to grow logarithmically slow to the asymptotically normal value. Furthermore, for Gauss-decaying correlations and σ = 2, α max (t) ≈ 0.760 is reached at t = 2.84 × 10 5 , and it drops to α(t) ≈ 0.648 at t = 3.17 × 10 6 . For this model of the disorder, α min ≈ 0.6, see in Fig. 3 (d) in [83]. Hence, it is expected that with increasing time, it will drop further to this minimum value and then gradually increase to the asymptotically normal value. In case of power-law correlations and σ = 2, α max (t) ≈ 0.842 is reached at t = 5.57 × 10 5 , and it drops to α(t) ≈ 0.768 at t = 3.17 × 10 6 . One expects that it will further drop to α min = 0.70 and then gradually grow. Interestingly, a reminiscent nonmonotonous behavior of α(t) was found both for lipids and proteins in MD simulations of disordered lipid-protein systems, see Figs. 15, 16 in [34], and references therein. Next, for σ = 4 and exponential correlations, α min ≈ 0.375 in the case of normal diffusion [36] (see inset in Fig. 1 therein) and it can stay for a very long time nearly constant. In Fig. 3 (b), the minimal value reached at t = 3.17 × 10 6 for exponential correlations is α ≈ 0.456. The true minimum is hence still not reached. For power-law correlations in this figure, the minimum at the end of simulations is α ≈ 0.529. It is well above the expected minimal α min ≈ 0.43, which is still not reached. Also, for Gaussian correlations, α ≈ 0.406 at the end is still above the expected minimum. After it will be reached, α(t) is expected to stay nearly constant for a long time and then logarithmically slow grow until unity. These predicted features are, however, beyond any numerical study currently possible.
It must be emphasized that in our estimations we tacitly assume that on the time scale exceeding the long-time cutoff of the memory kernel τ h we can think in terms of normal diffusion with an effective friction coefficient η eff = ∞ 0 η(t)dt, which takes place in a random potential. The validity of this assumption should be checked in the subsequent research. One cannot exclude that a synergy of viscoelastic memory effects and nonergodic effects due to the randomness of potential will tremendously increase nonergodicity length.

Distribution of particle positions
A very intriguing question emerges on the origin of the exponential power distribution Such a distribution is commonly found in viscoelastic biological subdiffusion, e.g., for RNA-protein complexes in various cells with a fixed power exponent χ = 1 in Ref. [28]. Furthermore, crowded protein-lipid systems also display this distribution with χ = 1.4 − 1.6 [33,34]. Very important in this respect is that normal diffusion in disordered Gaussian potentials typically displays this distribution with a time-dependent χ(t), see Fig. 2 of Supplemental Material in Ref. [36], where χ ∼ 1.4 − 1.6 for σ = 2 − 4. Hence, it is expected that the viscoelastic subdiffusion of a restricted temporal range in Gaussian potentials will display this striking feature. Indeed, our simulations in Fig. 4 confirm this expectation. They reveal the following common features. Initial delta-distribution spreads first approximately as a Gaussian, χ = 2, see in panel (c). Then, during the viscoelastic stage, χ fluctuates around χ = 2, and can even reach χ ≈ 2.3, see the case of power-law correlations in panel (c). It drops below χ = 2, beyond the temporal range of viscoelastic correlations, where the very occurrence of subdiffusion is on the count of the randomness of potential. For σ = 2, χ(t) relaxes down to χ min ≈ 1.59 for several models of correlations under study, see in panel (a) for power-law correlations. For σ = 4, the minimal value of χ in parts (b), (c) is χ = 1.45. However, the real minimum, in this case, is still not achieved until the end of the simulations. The relaxation law is inverse logarithmic with t c ≈ 253.34 and r ≈ 0.0842 for σ = 4 in the case of power-law correlations, see in Fig. 4 (c). After reaching the minimum, it is expected that χ will stay nearly constant for a while and then it will gradually increase to χ = 2 asymptotically, while the normal diffusion limit will gradually be achieved. To realize this feature, see the case σ = 1 in the inset of Fig. 2 of Supplemental Material in Ref. [36]. In the case λ = 100, the relaxation behavior in Fig. 4 (c) is shifted to the right (not shown). For example, for σ = 2 and power-law correlations, χ ≈ 1.736 at the end of the simulations, and for σ = 4 it arrives at χ ≈ 1.704, i.e., indeed χ min is still far from being reached in both cases.

Trajectory averages
Next, we proceed with studying single-trajectory averages, which are of special interest because disorder generally implies a broken ergodicity. For this, we use T w = 2 × 10 6 and evaluate single-averages until t max = 0.1T w . The results are shown in Fig. 5 for σ = 2, three models of correlation decay and two values of λ. Consider first the case λ = 10 in parts (a,c,e). First of all, in the initial regime, t < 2, (including thermal ballistic diffusion), self-averaging occurs for every trajectory. No scatter is present. In this regime, ensemble and trajectory averages coincide. Thus, diffusion is ergodic. For t > 2 in the cases of exponential and Gaussian correlations and t > 10 for power-law correlations, the scatter becomes visible. It is noticeable that it becomes enhanced for t > 10 3 , beyond the range of viscoelastic memory effects. Hence, we consider two different time intervals (i) [10, 10 3 ] and (ii) t > 10 3 to characterize diffusional properties . Straight lines in lower insets display best fits with Eq. (6). In both cases, fitting parameters are given in Table 1.
with a pair of random variables (D, α) for each separate trajectory defined on the corresponding time intervals. The results are depicted in upper and lower insets in parts (a,c,e), correspondingly. They display a striking feature. Namely, while one of the variables can be considered completely random, another is tightly related by or α(D) = α 0 − α 1 ln(D/D 0 ) in the case of a strong scatter, see lower insets in Fig. 5, (a-f), with D 0 , α 0 , α 1 in Table I with α 2 = α 0 + α 1 , b 1 = α 1 /D 0 . Indeed, the upper insets in Fig. 5, (a,c,e) display such a linear dependence for 10 < t < 1000, however, with α 2 , b 1 , which are not related to α 0 , α 1 , D 0 , see in Table 1.
Next, the ensemble-averaged time-averaged (EATA) variance of particle positions lies essentially below the corresponding ensemble-averaged variance, cf. Fig. 5 (a,c,e). Also the corresponding values of α are different. For example, in the upper inset in (a), α EATA ≈ 0.512 for EATA, while α ≈ 0.571 for the ensemble-average therein. Likewise, in the lower inset (a), α EATA ≈ 0.727 for EATA and α ≈ 0.608 for the ensemble-average. Interestingly, a very similar feature, i.e., a small scatter in viscoelastic FBM regime (t < 10 ns) with α EATA ≈ 0.52 − 0.54 and a strong scatter with α EATA ≈ 0.79 − 0.82 was revealed in a DSPC lipid-cholosterol mixture in Ref. [32], see also Fig. 10 in review [34]. The case λ = 100 implies a new feature. Here, practically no scatter appears until t = 10 3 for power-law and Gaussian correlations, i.e., smooth disorder, cf. panels (d), and (f) in Fig. 5, correspondingly. For rough exponentially correlated disorder, some very small scatter occurs yet for t < 10 3 , see in part (b). However, in all cases, a  shown in (a,c). Hence, this is not genuine aging, but rather a transient statistics effect. The best fits with constants (full lines) shown in (a,c) look almost perfect, upon taking error bars into account. Also, CVs power-law decay with exponent a shown in (b,d) until T w reaches about 100t in each case. After this, CV saturates around 0.2 and remains significant. Hence, no self-averaging occurs with a further growing T w . Time-averages remain random variables. Diffusion is not ergodic.

Aging
Finally, we address the issue of ergodicity, its breaking, and the phenomenon of aging. The very fact that the ensemble average and EATA are distinct implies that ergodicity is broken. Next, the ensemble-averaged δx 2 i (t) Tw M for a fixed t and sufficiently large M = 200 shows first a power-law decrease with increasing T w (a spurious "aging"), cf. Fig. 6, a, c, for exponential and Gaussian correlations, correspondingly. However, for a statistically significant T w , it just fluctuates around a mean value (significantly different from δx 2 (t) M ) with root-mean-squared (RMS) amplitude of fluctuations which very slowly decays with T w . The corresponding coefficient of variation (ratio of this RMS to the mean value), square of which is named the ergodicity breaking parameter or EBP within this context [55], EBP = CV 2 , is shown in Fig. 6, b, d. CV does initially decay as a power law, CV ∼ 1/T a w with a ∼ 0.241 − 0.319. However, this decay is only transient. For T w > 100t, it saturates around CV ∼ 0.2 or EBP ∼ 0.04. Diffusion is non-ergodic. However, aging is only transient and spurious. It can be a typical experimental situation, upon a critical appraisal.
Indeed, in Ref. [86] diffusion of insulin granules shows aging behavior in Fig. 2B therein. However, maximal T w /t in this figure is 100 (lowest curve therein). With a further increase of T w (our notations are different), it saturates [86]. For the upper curve in the discussed figure, the maximal value of T w /t is 25. Next, also a protein membrane diffusion in Ref. [27] displays aging, at the first look, see Fig. 1 (f) in Ref. [27] and also Fig. 6 (f) in review [34]. However, a careful examination of the experimental data points in these figures reveals that D i,sgl (T w ) M is nearly constant for T w ∼ 0.33 − 3.33 s evaluated for M = 600 trajectories longer than 3.33 seconds (our notations are different), contrary to the fit made, which overlooks this striking feature. It looks similar to one in our Fig. 6 (a,c).

Conclusions
The model of viscoelastic subdiffusion of finite temporal range in random Gaussian potentials developed in this paper shows several remarkable features, which are consistent with many experimental observations and with the viscoelastic nature of cytosol and lipid membranes. First, it is consistent with fractional Brownian motion anti-correlations revealed in many experiments. Second, experimentally claimed powerlaws are often even looking visually not quite as power laws (straight lines in double logarithmic coordinates). A generalized log-normal distribution, which features systems with Gaussian disorder, can provide a better fit in many situations [87]. Third, Gaussian disorder yields a non-Gaussian exponential power distribution of particle positions. Similar distributions are often measured experimentally and might be confused for exponential Laplace distribution. Using a variable power exponent χ instead of the fixed χ = 1 value (Laplace distribution) in fitting experimental data can help to reveal this remarkable feature, which can be often overlooked. Fourth, subdiffusion's viscoelastic nature is entirely consistent with a broad scatter of single-trajectory averages, which can reach four orders of magnitude in anomalous diffusion coefficient, even for a moderate disorder strength, σ = 2k B T . Simultaneously, the ensemble averages are only weakly sensitive to the presence of disorder, which is also a vital feature. The fifth, powerlaw exponent of subdiffusion α(t) can show non-monotonous time-dependence similar to one found in molecular-dynamic simulations of crowed lipid-protein systems. Sixth, single-trajectory averages show a transient aging behavior and no aging when the timeaveraging interval exceeds the corresponding time more than a hundred times. However, scatter remains also for a further increasing time-averaging interval. Hence, such diffusion is not ergodic. Ensemble and ensemble-averaged time-averages are significantly different in general. However, aging is transient and spurious. In this respect, typical published experimental data on aging is consistent with our observations, as we also detailed in this work. All these features are quite robust and universal concerning models of correlation decay.
These profound features establish our model as a promising exploratory framework for subdiffusion in complex fluid-like disordered environments. Of course, generalizations to other models of quenched disorder and higher dimensions are highly desirable and welcome. We believe that the presented results, due to their generality, will help to explain the bulk of pertinent experimental data and inspire a subsequent research work.

Acknowledgment
Funding of this research by the Deutsche Forschungsgemeinschaft (German Research Foundation), Grant GO 2052/3-2 is gratefully acknowledged. We acknowledge also support by the Regional Computer Centre Erlangen, Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities, as well as University of Potsdam (Germany), which kindly provided GPU high-performance computational facilities for doing this work.

Appendix A. Numerical approach
We detail the numerical approach in this Appendix. It is based on approximating the power-law memory kernel by a sum of exponentials (a Prony series expansion [122][123][124][125]) and hyper-dimensional Markovian embedding of underlying non-Markovian dynamics [64,65,116,117]. The method is numerically accurate. The results coincide within the numerical precision tolerance with the analytical results available in case of linear dynamics.
The memory kernel approximation reads [64,65] The sum of exponentials obeys a fractal scaling with a scaling parameter b. It approximates the power-law decay [52,64,65,126] of this memory kernel in an almost optimal way [127]. The choice of ν 0 is related to the time step of simulation ∆t. To avoid numerical instability, ν 0 ∆t should be smaller than one. The power-law regime extends in this approximation from a short time (high-frequency) cutoff, ν −1 0 , to a large time (small frequency) cutoff, τ h = τ l b N −1 . The choice of N is dictated by the maximal time t max of simulations: τ h should exceed t max by at least several times, if one wishes to simulate FLE dynamics on the whole time scale.
In this paper, we focus, however, on somewhat different setup. Namely, τ h is quite finite and defines a maximal time range of GLE subdiffusion. In this setup, subdiffusion occurring for t ≫ τ h is caused by the medium's disorder. The accuracy of the approximation between two cutoffs is controlled by the scaling parameter b > 1. The smaller b, the better the accuracy. However, a larger N is then required. In this work, we use α = 0.6 in all simulations, having in mind some lipid systems. With b = 5 and C α (b) = 1.0807, approximation (A.1) provides about 1% accuracy for t between 10 −2 and 10 and ν 0 = 100, see in Fig. A1. It still provides a sufficient accuracy of 5% until t = 10 3 , i.e., over 5 time decades. Hence, with time integration step ∆t = 0.005 in our stochastic simulation it provides a good approximation of the corresponding FLE dynamics until t c = 10 3 . Then, the approximation becomes worse and for t > t c a gradual transition to normal diffusion occurs. However, Fig. 2, a and b reveal that subdiffusion with α(t) ≈ 0.6 holds until t = 10 4 in the potential-free case. Increasing N to N = 18 provides a better than 1% accuracy from t = 10 −2 until t = 10 6 . This choice has accuracy of 2% until t = 10 7 -the maximal time in our simulations. Moreover, choosing b = 2 and C 0.6 (2) = 0.4654518 would provide numerical accuracy of power law approximation better than 0.001% over more than eight time decades for a quite moderate N = 60. This is the reason why approximation in Eq. (A.1) with ν i ∝ 1/b i is practically much better than a concurrent expansion based on a scaling pertinent to eigen-mode expansions of polymer dynamics with k i = const and ν i ∝ i p , with some constant p [128]. For example, in the case of Rouse model of polymeric chain, p = 2. The latter one needs, e.g., about N = 100 of polymeric eigen-modes to achieve an acceptable power law approximation just over two time decades. Numerically, it is simply not feasible [87] to cover much larger time scales, like in our research, including this paper.
For doing Markovian embedding, one introduces a set of N viscoelastic forces u i such that the corresponding embedding dynamics in the hyperspace of dimension D = N + 2 reads [64,65] x(t) = v(t) shows that with lowering b to b = 2 and increasing N to N = 60 one can increase relative accuracy of approximation to better than 10 −5 or 0.001%. It means that our embedding method can be made numerically exact in quite moderate embedding dimensions. Numerical accuracy of several per cents is, however, sufficient in most stochastic simulations. Stochastic numerical results correspond to the same matching approximate kernels. They agree very well with quasi-analytical results for diffusion at all times, and also for VACF at sufficiently short times until statistical noise starts to dominate for small values of VACF. In the inset (a), the exact analytical result (2) is compared with two quasi-analytical approximations. Here, the absolute value of VACF is plotted.
δ ij δ(t − t ′ ). The initial u i (0) are sampled as mutually independent Gaussian variables with zero mean and variances u 2 i (0) = k B T k i [64,65]. Eq. (A.2) corresponds to a Maxwell-Langevin model of stochastic viscoelastic dynamics developed in our previous work [64,65] and used in numerous papers. The first two equations in Eq. (A.2) are just equations of motion expressing Newton's second law. The last N equations describe the relaxation of viscoelastic force modes of environment influenced by the particle motion. Such an equation corresponds to Maxwell's idea that viscoelasticity can be described within a framework of an elastic force u characterized by some elastic constant k, which can weaken in time and relax to zero with rate ν [129]. It is how Maxwell derived phenomenon viscosity from the phenomenon elasticity in his classical work [129] by course-graining over the times largely exceeding 1/ν. The last thermal noise term in Eq. (A.2) ensures that this viscoelastic force is consistent with thermodynamics and FDT [65], i.e., it models a thermal environment at local temperature T and relaxes to equilibrium with rate ν for a localized particle, v = 0. However, when a particle moves, it disturbs this local equilibrium, what creates a long-lasting memory in the environment for slowly relaxing modes u i . These modes remember where the particle was before. Thereby, a slowly relaxing local deformation is created, which attracts and temporary traps the particle (viscoelastic caging). An analogy with polaron picture in solids is most useful here to grasp the physical mechanism of viscoelastic subdiffusion [64,65]. Notice that this approach is straightforward to generalize [116][117][118] to include the effects of hydrodynamic memory [130,131], and a normal friction component [65,87]. The latter corresponds, e.g., the water content of cytosol. We do not consider them here.
Initially, particles were always prepared with their velocities Maxwell-distributed at temperature T and localized sharply at some random, uniformly sampled in the spacial interval [−L/2, L/2] positions x i (0). It is a non-equilibrium initial preparation. Particles are typically subjected to some local bias f (x i (0)) and explore first some local environment. They can, e.g., be either trapped in some local potential minima or start at the tops of potential local hills, see in Fig. 1.
In numerical simulations, we use the second-order stochastic Heun algorithm [132] implemented in CUDA and run simulations on professional graphical processor units (GPUs) with double precision. In ensemble simulations, we run M = 10 5 particles in parallel. With ν 0 = 100, ∆t = 0.005 and N = 11 viscoelastic modes, simulations require about 6.5 days to arrive at t max = 10 7 on Titan V processors, and five times longer on older Kepler K20 GPUs. Embedding with N = 18 on Titan V GPU requires about two weeks for the same integration time step. It sets the practical restrictions in simulations. We first test our approach on calculating VACF K v (t) and δx 2 (t) in the potential-free case and comparing the results with exact result in equations (2), (3) for FLE dynamics. Results for normalized K v (t) are shown in Fig. A2 (a). For α = 0.6, K v (t) changes sign only once at t 1 ≈ 1.76 and remains negative for t > t 1 . It is different, e.g., from the case α = 0.5, where the sign oscillations occur exactly three times [87]. At t 2 ≈ 3.04 it reaches first a minimum K v (t 1 ) ≈ −0.21286 v 2 T , and then increases and reaches a local maximum K v (t 1 ) ≈ −0.00275 v 2 T at t 2 ≈ 7.88, not crossing zero. Then, it again decreases to a local minimum K v (t 1 ) ≈ −0.00984 v 2 T at t 3 ≈ 11.88, and after this it gradually approaches zero in accordance with power-law t −1.4 . It has to be mentioned that such fine features are not detectable in stochastic numerical simulations using M = 10 5 because of pure statistical relative error of the order 1/ √ M ≈ 0.00316 or 0.316%. Because of this reason, the numerical results in Fig.  A2 (a) sink in the statistical errors already for t > 6. Hence, a trustful comparison of stochastic numerical and (semi)-analytical results in part (a) can be made only for t < 6. In part (b), however, the agreement between the exact and (semi)-analytical results, from one side, and numerical results, from another side, is nearly perfect for all times. To resolve the discussed fine features in part (a), we also plot (semi-)analytical results obtained by numerical inversion of the Laplace-transformed with the Laplace-transformed memory kernel approximated by Laplace-transformed (A.1). In this case,K v (s) is a rational function of s, and it can be quasi-analytically inverted, while finding the roots of the corresponding polynomial of s in denominator of this rational function numerically. However, for practical purposes it is best to use a Gaver-Stehfest series [133,134] for the inversion of Laplace-transform with an adaptable (arbitrary in principle) numerical precision [135], as we did earlier in many papers, e.g., in [64]. This inversion of Laplace-transform allows us to arrive at numerically accurate results in many cases, including exact and approximate memory kernel in this paper. In the insert of Fig. A2 (a), where the exact (without cutoff) |K v (t)/v 2 T | and its two approximations are plotted on a doubly logarithmic scale, one can see that the approximation with ν 0 = 5000 and N = 18 excellently reproduces K v (t) on the whole time span depicted. The approximation with ν 0 = 100 and N = 18 also well reproduces all the features mentioned above. However, it makes some discrepancies from the exact FLE result near to the extrema of K v (t) in part (a) visible. Stochastic numerics in the main plot in Fig. A2 for ν 0 = 100, ν 0 = 5000 (done in this case with a much finer time-step ∆t = 10 −5 ), and N = 11 also excellently agree with (semi)-analytical results obtained by numerical inversion of (A.3) [part (a)] and 2K v (s)/s 2 [part (b)]. The discussed small discrepancies in a small intermediate range of t in part (a) does not influence the results already for t > 12, where the overdamped FBM regime is established in the absence of external potential. For this reason, we used embedding with ν 0 = 100, which allows for a much larger ∆t and makes feasible much large t max in simulations. It is a typical trade-off between numerical accuracy and feasibility in numerical simulations. Moreover, we do not consider rigorously an FLE model in this paper, but rather a subdiffusive GLE model with memory cutoffs.