Fréchet differentiable drift dependence of Perron-Frobenius and Koopman operators for non-deterministic dynamics

We consider Perron-Frobenius and Koopman operators associated to time-inhomogeneous ordinary stochastic differential equations, and establish their Fréchet differentiability with respect to the drift. This result relies on a similar differentiability result for pathwise expectations of path functionals of the solution of the stochastic differential equation, which we establish using Girsanov's formula. We demonstrate the significance of our result in the context of dynamical systems and operator theory, by proving continuously differentiable drift dependence of the simple eigen- and singular values and the corresponding eigen- and singular functions of the stochastic Perron-Frobenius and Koopman operators.


Introduction
In the study of dynamical systems, and of differential equations in particular, one important aspect is the sensitivity of the solution with respect to the data. For the deterministic case, there exist classical results on the dependence on the initial data and on the driving velocity field. The same dependencies have also been studied for non-deterministic systems; see, e.g., [FGP10, AJKW17] for Fréchet-type dependencies on the initial data, and [FLL+99, GM05, Mon13, DG14] for the dependence of path functionals or their expectations on changes in the drift. This paper builds on the approach of [FLL+99], which establishes a Gâteaux-type dependence on the data by proving the existence of directional derivatives with respect to the drift. We use it to establish the Fréchet-type dependence of the solution operator with respect to an additive change of drift in a sufficiently smooth setting: for a suitable observable $g$, we provide in Theorem 3.1 the Fréchet derivative at $\gamma = 0$ of the non-linear functional $u^x_g(\gamma) := \mathbb{E}[g(X^\gamma) \mid X^\gamma_0 = x]$ with respect to additive perturbations $\gamma$; above, $X^\gamma$ denotes the solution of the perturbed stochastic differential equation (2.7) below.
Our main motivation for considering the above result is the study of global long-term properties of dynamical systems, such as their stationary distribution or the rate of mixing. Such properties are strongly related to spectral objects of the so-called transfer operators associated to the dynamics [Pie94, SKMA08, SSA09, FGTW16, FS17]. These are infinite-dimensional linear operators describing the evolution of distributions and observables under the non-linear, stochastic dynamics. For instance, for a dynamical system given by the mapping $\Phi : \mathbb{R}^d \to \mathbb{R}^d$, the Koopman operator, acting on observables $g : \mathbb{R}^d \to \mathbb{R}$, is given by $U g(x) = g(\Phi(x))$. If the dynamics is non-deterministic, e.g., if $\Phi(x)$ is the value of a stochastic process $X$ at some time $t$, then $U g(x) = \mathbb{E}[g(\Phi(x))] = \mathbb{E}^x[g(X_t)]$. If we now allow for non-deterministic initial conditions $X_0$ with some distribution $f$, i.e., $X_0 \sim f$, the Perron-Frobenius operator $P$ is defined by $\Phi(X_0) \sim Pf$. These two operators are adjoints on the appropriate spaces [LM94], and allow for far-reaching analysis of the underlying dynamics; such analysis continues to be intensively exploited in applications such as (geophysical) fluid dynamics [FPET07, DFH+09, FLS10, FHRvS15, AM17], molecular dynamics [DDJS96, SFHD99, PWS+11, SS13, BPN14, BKK+17], or stability and control [MMM13, MM16, PBK18]. In particular, smoothness or differentiability of spectral objects of the transfer operator might allow for more efficient optimization of fluid mixing processes [FGTW16].
One of our main results in this paper is the differentiable dependence of these transfer operators on the time-dependent drift function, i.e., on the driving velocity field. As we consider the operators in their induced norm topology on $L^2$-spaces, we will consider genuinely non-deterministic dynamics. The reason is that these spaces are too "large" for the transfer operators associated with deterministic systems to depend even continuously on the drift with respect to the induced operator norm, as the following example shows.
Example (Discontinuous drift-dependence for deterministic dynamics). Let us consider the time-$t$ map $\Phi$ of some deterministic differential equation, and the time-$t$ map $\tilde\Phi$ of a slight perturbation of that differential equation. Let $\Phi$ and $\tilde\Phi$ be such that the associated Koopman operators $U : g \mapsto g \circ \Phi$ and $\tilde U : g \mapsto g \circ \tilde\Phi$ are well-defined on $L^2(\Lambda)$, where $\Lambda$ is the Lebesgue measure; cf. [LM94, Wal00] for details. For $x$ with $\Phi(x) \neq \tilde\Phi(x)$, let us consider the function sequence $f_n = \Lambda(A_n)^{-1/2} \mathbf{1}_{A_n}$, where $\{A_n\}_{n \in \mathbb{N}}$ is a sequence of balls with $A_n \downarrow \{\Phi(x)\}$, and $\mathbf{1}_A$ denotes the characteristic function of a set $A$. Now, since the supports of $U f_n$ and $\tilde U f_n$ are eventually disjoint, it is easy to see that there will be some $N$ such that $\|U f_n - \tilde U f_n\|_{L^2(\Lambda)} = \sqrt{2}$ for all $n \ge N$, however small the perturbation in the drift is chosen to be. This shows that $U$ cannot depend continuously on the drift.
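The mechanism of the example can be made concrete in one dimension. The following minimal sketch assumes, purely for illustration, that both time-$t$ maps are translations, $\Phi(x) = x + c$ and $\tilde\Phi(x) = x + c + \varepsilon$, and that $A_n$ is an interval of radius $r$; the function name `koopman_gap` is hypothetical. Translations preserve Lebesgue measure, so the $L^2$ distance can be computed exactly from the overlap of two intervals. For fixed $\varepsilon > 0$ the distance saturates at $\sqrt{2}$ as soon as $r \le \varepsilon / 2$, no matter how small $\varepsilon$ is.

```python
import math


def koopman_gap(eps: float, r: float) -> float:
    """L2 distance between U f and U~ f for f = (2r)^(-1/2) * 1_{[-r, r]}.

    Illustrative assumption: U f = f(. + c) and U~ f = f(. + c + eps),
    i.e. both maps are translations; the common shift c cancels.
    """
    overlap = max(0.0, 2.0 * r - abs(eps))   # length of the common support
    sym_diff = 2.0 * (2.0 * r - overlap)     # measure where exactly one function is nonzero
    return math.sqrt(sym_diff / (2.0 * r))   # both functions take the value (2r)^(-1/2)


eps = 1e-3  # size of the drift perturbation
for r in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"r = {r:g}: gap = {koopman_gap(eps, r):.6f}")
```

The printed gap grows as the support shrinks and stays at $\sqrt{2}$ once the two supports are disjoint, which is the obstruction to continuity in the operator norm.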
The prior example implies that continuous differentiability cannot hold in the deterministic case. Combining this observation with duality arguments from Lemma A.1 and Remark A.2 shows that the same assertion also holds true for the Perron-Frobenius operator on $L^2(\Lambda)$. Note that the main reason for non-continuous drift-dependence in the example is that functions with highly localized supports are mapped by $U$ and $P$ to functions with highly localized supports. However, for non-deterministic systems driven by non-degenerate noise, the noise has the effect of spreading the support of the initial conditions, thus enabling a smooth drift-dependence of the associated transfer operators. We also note that, while the prior example rules out smooth drift-dependence of transfer operators associated with deterministic dynamics on $L^p$-spaces, there could be other spaces where this property is retained [Bal18]. Such spaces, however, would require norms that "punish" increasingly localized densities [GL06, Thi12]. In addition, single trajectories or realizations of both deterministic ordinary and non-deterministic differential equations can be shown to depend smoothly on the drift, by using the integral form of the differential equation and the implicit function theorem. As this will not be of importance further on, we leave the details to the reader.
We would like to remark that smooth dependence of invariant measures, often called linear response, has been considered in specific cases. For instance, Butterley and Liverani [BL07] show differentiability of SRB measures corresponding to Anosov flows with respect to one-dimensional parameters, and give an exhaustive report on the work that has been done previously in this field. For a survey on linear response results for deterministic systems (maps), we refer to [Bal14]. Results for non-deterministic systems arise in the context of random compositions of maps [BRS17], or for stochastic ordinary and partial differential equations in the weak topology [HM10]. The latter reference shows Gâteaux-type pointwise differentiability of the transfer operators acting on smooth functions with respect to a real parameter, where also the (constant-in-space) diffusion matrix can vary with the parameter. It belongs to the class of approaches where functional-analytic tools are used to derive quantitative perturbation results for transfer operators and their dominant eigenmodes [KL99, Gal15, Sed18, GG19]. Our improvement over these results is that we consider (i) Fréchet differentiability in the infinite-dimensional space of velocity fields driving the stochastic differential equations, (ii) of the associated transfer operator in its strong norm topology, and thus obtain differentiability of all isolated eigenvalues.
One question that we do not address in this paper is differentiability with respect to the diffusion matrix. In this paper, we rely on Girsanov's formula in order to obtain an expression for the Radon-Nikodym derivative between two path measures; this expression is crucial to our estimates for computing the Fréchet derivative. Girsanov's formula applies in the case where the drift term is changed, and is a fundamental result in the theory of stochastic differential equations. To the best of our knowledge, an analogous formula for changes of the diffusion term does not exist. This is because the set of possible changes that one can apply to the diffusion matrix while still obtaining a mutually absolutely continuous path measure is severely constrained. Consider the simple case of a constant zero drift term and identity diffusion matrix. In this case, the path measure is Wiener measure on path space, which is a Gaussian measure. If one scales the identity diffusion matrix by a constant whose absolute value is not 1, then the resulting path measure is not mutually absolutely continuous with respect to Wiener measure [Bog98, Example 2.7.4]. Thus, no Radon-Nikodym derivative exists, and the approach we follow for changes of the drift cannot be applied for such changes of the diffusion matrix, even in this simple case.
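The obstruction in this simple case can be seen through the quadratic variation: almost surely, the quadratic variation of $cW$ on $[0, T]$ equals $c^2 T$, so for $|c| \neq 1$ the laws of $W$ and $cW$ concentrate on disjoint sets of paths. The following sketch illustrates this numerically with the realized quadratic variation on a fine grid. The grid size, the seed, the value $c = 2$, and the helper names are illustrative assumptions, not part of the paper.

```python
import math
import random


def brownian_path(n_steps: int, T: float, rng: random.Random) -> list:
    """Sample a Wiener path on a uniform grid of [0, T] with n_steps increments."""
    dt = T / n_steps
    w, path = 0.0, [0.0]
    for _ in range(n_steps):
        w += rng.gauss(0.0, math.sqrt(dt))
        path.append(w)
    return path


def realized_qv(path: list) -> float:
    """Sum of squared increments of a discretely sampled path."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


rng = random.Random(0)   # fixed seed for reproducibility
T, c = 1.0, 2.0          # time horizon and diffusion scaling (illustrative values)
W = brownian_path(100_000, T, rng)
qv_W = realized_qv(W)
qv_cW = realized_qv([c * w for w in W])
print(qv_W, qv_cW)       # close to T and c^2 * T, respectively
```

A path functional that reads off the quadratic variation therefore separates the two path measures, which is why no Radon-Nikodym derivative can exist between them.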
In this paper, we first introduce the setting which we are working in, state our assumptions, and introduce some key auxiliary results in Section 2. Then, we will extend ideas from [FLL + 99] to establish Fréchet differentiability of pathwise expectations with respect to the drift in Theorem 3.1 of Section 3. Section 4 uses the quantitative remainder estimates from the preceding section to show our main result: differentiable drift-dependence of the transfer operators associated with non-deterministic dynamics governed by stochastic differential equations, Theorem 4.5. This is then applied in Section 5 to carry over the smooth dependence on the drift to isolated eigenvalues and corresponding eigenfunctions of transfer operators in Theorem 5.1. We conclude by pointing out two immediate consequences for dynamical systems. For convenience, we collect some relevant results on stochastic processes and transition kernels in the appendices.

Prerequisites from stochastic analysis
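Stochastic differential equations. For the rest of this work we are concerned with the following objects. We fix $T > 0$ and $\Omega := C([0, T]; \mathbb{R}^d) := \{f : [0, T] \to \mathbb{R}^d \mid f \text{ continuous}\}$. Further, let $\mathcal{F}$ be a σ-algebra on $\Omega$, and $\mathbb{P}$ be a probability measure on $(\Omega, \mathcal{F})$. For a filtration $(\mathcal{F}_t)_{t \in [0,T]}$ with $\mathcal{F}_t \subseteq \mathcal{F}$ for every $t \in [0, T]$, let $W = (W_t, \mathcal{F}_t)_{t \in [0,T]}$ be the $d$-dimensional Wiener process with respect to $\mathbb{P}$. We use the notation $W = (W_t, \mathcal{F}_t)_{t \in [0,T]}$ to emphasize that $W_t$ is $\mathcal{F}_t$-measurable for every $t \in [0, T]$. We define a measurable functional $h$ acting on $[0, T] \times \Omega$ to be adapted to the filtration $(\mathcal{F}_t)_{t \in [0,T]}$ if $h(t, \cdot)$ is $\mathcal{F}_t$-measurable for every $t \in [0, T]$. Let $b : [0, T] \times \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : [0, T] \times \mathbb{R}^d \to \mathbb{R}^{d \times d}$ be measurable functions such that the following stochastic differential equation (SDE) admits a unique strong solution (cf. Theorem A.3):

$$\mathrm{d}X_t = b(t, X_t)\,\mathrm{d}t + \sigma(t, X_t)\,\mathrm{d}W_t. \tag{2.1}$$

Fix a deterministic initial condition $x \in \mathbb{R}^d$. Let $\mathbb{P}^x$ denote the conditioning of $\mathbb{P}$ to the subset $\{f : [0, T] \to \mathbb{R}^d \mid f(0) = x\}$ of $\Omega$. For simplicity, we shall assume that $b$ and $\sigma$ satisfy the conditions (A1) and (A2) (the Lipschitz continuity (2.2) and linear growth conditions (2.3)), which we recall below for convenience.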
Assumption 2.1. There exist constants 0 < L, λ σ < ∞ that do not depend on y, y ′ , or t, such that (A1) The functions b and σ satisfy the Lipschitz conditions (A2) The functions b and σ satisfy the growth conditions where ξ ⊤ denotes the transpose of ξ.
(A4) It holds that (2.5) (Here and in the following, the subscript 's' denotes the time input.) Remark 2.2. The last condition (A4) can be derived from the other assumptions, even under the more general condition of locally Lipschitz drift and diffusion matrix. However, for the sake of simplicity, in Lemma A.4 a proof is given for globally Lipschitz drift, change of drift, and diffusion.
Norms for the drift. Let us consider the vector space Above, we allow n ∈ N to vary, so that by using the 'stacked' vector representations of matrices, we can consider C to contain both vector-and matrix-valued functions whose entries satisfy the Lipschitz continuity and growth conditions. In particular, the drift b and diffusion σ introduced earlier belong to C. The essential idea is that C contains functions from [0, T ] × R d to some Euclidean space such that each component of the image is Lipschitz continuous and grows linearly. In particular, if γ ∈ C, then admits a unique strong solution X γ , by Theorem A.3. Using this notation, we have that X 0 denotes the unique strong solution of the unperturbed SDE (2.1). Stochastics notation. Let E x denote the expectation operator with respect to P x , and let µ γ,x := P x • (X γ ) −1 denote the law of the solution of (2.7). Using this notation, it follows that µ 0,x is the law of the solution of the unperturbed SDE (2.1). We shall write E γ,x (E 0,x ) to denote the expectation with respect to µ γ, We shall say that f : holds. Utilizing this notion we set We shall omit the dependence on the parameter µ 0,x and simply write f L ∞ below. Note that (L ∞ (µ 0,x ), · L ∞ ) is a normed vector space. Define the function class V := C ∩ L ∞ (µ 0,x ), and equip V with the following norm: In Lemma A.5, we show that (V, · V ) is a Banach space.
Key auxiliary results. Now, let $x \in \mathbb{R}^d$ be arbitrary, and let us consider the map $u^x_g : V \to \mathbb{R}$, for a suitable observable $g$ specified below in Theorem 3.1 (C2), defined by

$$u^x_g(\gamma) := \mathbb{E}^x[g(X^\gamma)]. \tag{2.10}$$

We wish to establish that $u^x_g$ is a Fréchet differentiable map from $(V, \|\cdot\|_V)$ to $(\mathbb{R}, |\cdot|)$. From now on, we will consider $\gamma \in V$.
First, we recall a result from [LS01] which will be crucial for the following steps. Note that in the result below, we allow for the Wiener process W to be of different dimension than the diffusion processes ξ and η. The result below is adapted from [LS01, Section 7.6.4].
Remark 2.3. The main purpose of Theorem 2.4 is that it allows us to express probabilistic objects-like expectation values-involving the process X γ with respect to the law of the unperturbed process X 0 . This will be key to establishing the differentiability of (2.10) with respect to γ.
T ] × X to R n , and σ is a measurable function from [0, T ] × X to R n×k , such that the following assumptions are fulfilled: (B1) The system of algebraic equations and satisfy (A1) and (A2), so that both (2.11) and (2.12) have unique strong solutions ξ and η respectively.
(B3) It holds that Then, the law µ ξ of ξ is absolutely continuous with respect to the law µ η of η, and (2.13) Remark 2.5. Following the steps of Lemma A.4 one can prove that (B3) can be derived from (B2) and (A3). Further, since in our , and thus Assumption 2.1 implies (B2); the same applies for b t := b t + γ t and σ. Finally, as we assume σ(t, x) to be invertible for all (t, x), (B1) is satisfied as well.
From Girsanov's formula (2.13) and its specification (2.16) to our case below, it is known that the term inside the exponential is described by the continuous semimartingale (2.14) and the associated quadratic variation process (2.15). To make the proof of our main result below easier to read, we first state two technical lemmas. We defer the proofs to Appendix A.2.
Lemma 2.6. The exponential martingale $(Z^\gamma_t(X^0))_{t \in [0,T]}$ described by (2.16) is square integrable with respect to $\mathbb{P}^x$.
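As a sanity check of the change of measure and of the square integrability, consider the simplest scalar case. The sketch below assumes $d = 1$, $b = 0$, $\sigma = 1$, and a constant perturbation $\gamma$, so that $X^\gamma_T = x + \gamma T + W_T$ and the exponential martingale reduces to the classical weight $Z^\gamma_T = \exp(\gamma W_T - \gamma^2 T / 2)$. These choices, the observable $g = \tanh$ of the endpoint, and the helper `gauss_expect` are illustrative assumptions. Expectations over $W_T \sim N(0, T)$ are computed by deterministic quadrature instead of sampling.

```python
import math


def gauss_expect(h, var, n=4001, width=10.0):
    """Approximate E[h(Z)] for Z ~ N(0, var) with the trapezoidal rule."""
    sd = math.sqrt(var)
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        z = -width * sd + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * h(z) * math.exp(-z * z / (2.0 * var))
    return total * step / math.sqrt(2.0 * math.pi * var)


T, x, gamma = 1.0, 0.3, 0.7   # illustrative values
g = math.tanh                 # bounded observable of the endpoint X_T


def Z(w):
    """Girsanov weight exp(gamma * W_T - gamma^2 * T / 2), assuming sigma = 1."""
    return math.exp(gamma * w - 0.5 * gamma ** 2 * T)


perturbed = gauss_expect(lambda w: g(x + gamma * T + w), T)   # E[g(X^gamma_T)]
reweighted = gauss_expect(lambda w: g(x + w) * Z(w), T)       # E[g(X^0_T) Z_T]
second_moment = gauss_expect(lambda w: Z(w) ** 2, T)          # E[Z_T^2]
print(perturbed, reweighted)
print(second_moment, math.exp(gamma ** 2 * T))
```

In this special case the second moment is $\exp(\gamma^2 T)$, which is finite and tends to $1$ as $\gamma \to 0$.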
The following result will allow us to bound the series expansion of Z γ t from above and pass to the limit γ → 0 for the differentiability result.

Fréchet differentiability of expected path functionals
In order to prove the Fréchet differentiability of the non-linear function $u^x_g(\gamma) = \mathbb{E}^x[g(X^\gamma)]$ with respect to $\gamma$ at $\gamma = 0$, we propose a derivative (3.1) and show the required convergence. We do so by first shifting all the $\gamma$-dependence to a new stochastic process $Z^\gamma$ by Theorem 2.4 (see also Remark 2.3) and then employing Lemma 2.6 and Lemma 2.8 to pass to the limit.
Theorem 3.1. Let $x \in \mathbb{R}^d$. Suppose that Assumption 2.1 above and the following hold true: Then the Fréchet derivative of $u^x_g : V \to \mathbb{R}$ at $\gamma = 0$ exists, and is given by (3.1).

Remark 3.2 (Admissible observables). Before proceeding to the proof, let us examine which observables $g$ satisfy condition (C2) above.
(i) If $\gamma \in V$, then $\gamma$ is almost surely bounded on $[0, T] \times \mathbb{R}^d$ and continuous, hence bounded everywhere. Therefore, by reversing the roles of $\tilde b$ and $b$ in Theorem 2.4, we obtain that $\mu^{0,x}$ and $\mu^{\gamma,x}$ are in fact mutually locally equivalent. Thus, since both $\mu^{0,x}$ and $\mu^{\gamma,x}$ are probability (in particular, finite) measures, $g \in L^\infty(\mu^{0,x})$ satisfies (C2).
(ii) In Sections 4 and 5, we shall focus on observables of the kind g =g • π t , whereg : R d → R and π t : Ω → R d is given by π t (X) = X t , the coordinate projection for some 0 ≤ t ≤ T . For such observables, we have where π # t denotes the push-forward by π t , i.e., π # t µ 0,x = µ 0,x • π −1 t . The push-forward of the path-measure µ 0,x is the distribution of the process X 0 at time t, i.e., π # t µ 0,x (dy) = k t (x, y, 0)dy, where the transition kernel k t is introduced in (4.1) below. Similarly as in case (i), g satisfies the condition (C2) for anyg ∈ L ∞ (R d , Λ), with Λ being the Lebesgue measure.
Proof of Theorem 3.1. By Remark 2.5, applying Theorem 2.4 yields the formula (2.16) for the Radon-Nikodym derivative of µ γ,x with respect to µ 0,x . Given the definition (2.10) of the map u x g , it follows that where in the last equality we used the notation E 0,x to denote the expectation of functionals of X 0 , or equivalently the expectation with respect to µ 0,x . The square integrability of Z γ t with respect to µ 0,x follows from Lemma 2.6. We T 2 converges to zero as γ L ∞ converges to zero. Using the definition of Z γ T , the series expansion of the exponential, the triangle inequality, and the inequality (a + b) 2 ≤ 2(a 2 + b 2 ), we have uniformly in x. By Lemma 2.8, we have the following estimate Together with (A.6), an estimate for M γ 2 T , we obtain where the right-hand side is O( γ 2 L ∞ ), and hence decreases to zero as γ L ∞ decreases to zero. Note that the parameter λ σ √ T determines the convergence rate. To complete the proof, we observe that together with the Cauchy-Schwarz inequality implies which shows that the Fréchet derivative of u x g at zero exists and is given by Remark 3.3 (Continuity of the differential). Instead of calling the above "differentiation in β = 0" we could also call it "differentiation in β = b" and denote X 0 by X b instead. We chose 0 here for simplicity and the interpretation of being unperturbed. The different notation would lead to the following representation of the derivative It also poses the question whether we have continuity in b and thus continuous differentiability everywhere. This is indeed the case as the following proposition shows.
Proposition 3.4. Let us assume that gM γ belongs to L 2 (µ b,x ) and that the assumptions of Theorem 3.1 hold for X 0 replaced by X b . Then the map Sketch of proof. Under the stated hypotheses, one way to show continuous differentiability everywhere is to use the same strategy as for the proof of Theorem 3.1. By (3.8), Therefore, then continuity of the derivative for fixed γ will follow. Note that assumptions (C1) and (C2) in Theorem 3.1 readily imply (i).
To show that (ii) is satisfied, we modify the steps in the proof of Theorem 3.1, by using (3.3) with (A.8) and (A.6) to bound $|1 - Z^{b'}_T|^2$. Uniformity of the estimates in $\|\gamma\|_V$ implies the claim.
Remark 3.5 (Time-t-observables). For a fixed but otherwise arbitrary $0 \le t \le T$, let us examine the structure of the derivative (3.1) for the observables $\tilde g \circ \pi_t$ discussed in Remark 3.2 (ii).
As π t is F t -measurable, g =g • π t is also F t -measurable, and as in the proof of Theorem 3.1 we have where we used in the third and fifth equalities that g dµ = g dµ| Ft for F t -measurable g. This leads to Note the difference in the equation above with respect to (3.1): the observableg now does not depend on the entire path X 0 , only on its value at time t, and the Itô integral goes up to t instead of up to T .
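The time-$t$ form of the derivative can be checked numerically in a scalar special case. The sketch below assumes $d = 1$, $b = 0$, $\sigma = 1$, and the constant direction $\gamma \equiv 1$, for which the Itô integral up to time $t$ reduces to $W_t$ and the derivative is expected to take the form $\mathbb{E}[\tilde g(x + W_t)\, W_t]$. This reduced form, the observable $\tilde g = \tanh$, and the helper `gauss_expect` are illustrative assumptions. The formula is compared with a central finite difference of $\varepsilon \mapsto \mathbb{E}[\tilde g(x + \varepsilon t + W_t)]$.

```python
import math


def gauss_expect(h, var, n=4001, width=10.0):
    """Approximate E[h(Z)] for Z ~ N(0, var) with the trapezoidal rule."""
    sd = math.sqrt(var)
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        z = -width * sd + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * h(z) * math.exp(-z * z / (2.0 * var))
    return total * step / math.sqrt(2.0 * math.pi * var)


t, x = 0.5, 0.3        # illustrative values
g_tilde = math.tanh    # bounded observable of the state at time t


def u(eps):
    """E[g~(X_t)] for dX = eps dt + dW, X_0 = x, i.e. X_t = x + eps * t + W_t."""
    return gauss_expect(lambda w: g_tilde(x + eps * t + w), t)


h = 1e-4
finite_difference = (u(h) - u(-h)) / (2.0 * h)
girsanov_formula = gauss_expect(lambda w: g_tilde(x + w) * w, t)   # E[g~(X^0_t) W_t]
print(finite_difference, girsanov_formula)
```

The two numbers agree up to the discretization error of the finite difference. Note that the formula on the right only involves the unperturbed process, and no derivative of $\tilde g$.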
The preceding results extend to the case where we augment the SDE (2.1) with normal reflecting boundary conditions for some smooth, bounded domain $X$. In this case, the reflecting boundary conditions are encoded by the local time process associated to the boundary; see, e.g., [RY99, Chapter IX, §2, Exercise 2.14] for a one-dimensional example, or [Pil14, Theorem 2.4.1] for the general case. Recall that Girsanov's formula describes the Radon-Nikodym derivative of two mutually equivalent probability measures $\tilde{\mathbb{P}}$ and $\mathbb{P}$. Since the boundary conditions are invariant under changes between mutually equivalent probability measures, the preceding analysis carries over to the case of diffusions with reflection.

Fréchet differentiability of transfer operators
In this section, we use the results above to show the differentiability of the Perron-Frobenius operator and the Koopman operator with respect to the drift.
We shall consider a compact subset $X \subset \mathbb{R}^d$ with $C^{1+\delta}$ boundary for some $\delta > 0$. In addition to Assumption 2.1, let $\sigma \in C^{2+\alpha}([0, T] \times X; \mathbb{R}^{d \times d})$ and $b, \gamma \in C^{1+\alpha}([0, T] \times X; \mathbb{R}^d)$ for some $\alpha > 0$. Let us consider the processes $X^0$ and $X^\gamma$, governed by (2.1) and (2.7) on $X$ with reflecting boundary conditions [Pil14], respectively. For the rest of this section, let us fix some $t \in (0, T]$. Recall from Remark 3.2 that $X^\gamma_t$, the process $X^\gamma$ at time $t$, has the distribution $\mu^{\gamma,x}_t := \pi_t^\# \mu^{\gamma,x}$, where $\pi_t^\#$ is the push-forward operator associated with the coordinate projection ("time-t-evaluation functional") $\pi_t$. Under the given assumptions, the distribution is absolutely continuous with respect to the Lebesgue measure $\Lambda$ on $X$ with density $k_t$, i.e.,

$$\pi_t^\# \mu^{\gamma,x}(\mathrm{d}y) =: \mu^{\gamma,x}_t(\mathrm{d}y) =: k_t(x, y, \gamma)\,\mathrm{d}y, \qquad k : (t, x, y, \gamma) \mapsto k_t(x, y, \gamma).$$

The function $k$ is called the transition kernel. Consider the following lemma, which is proven in Appendix B.
The Perron-Frobenius operator $P_t(\gamma)$ and Koopman operator $U_t(\gamma)$ can be expressed with the transition kernel according to

$$P_t(\gamma) f(y) = \int_X k_t(x, y, \gamma)\, f(x)\,\mathrm{d}x, \qquad U_t(\gamma) g(x) = \int_X k_t(x, y, \gamma)\, g(y)\,\mathrm{d}y,$$

for an initial density $f \in L^2(\Lambda)$ and an observable $g \in L^2(\Lambda)$ at initial time $t = 0$. By (2.10), the Koopman operator has the useful representation

$$U_t(\gamma) g(x) = \mathbb{E}^x[g(X^\gamma_t)] = u^x_g(\gamma),$$

where we abuse the notation $u^x_g(\gamma)$ to denote $u^x_{g \circ \pi_t}(\gamma)$ if $g \in L^2(\Lambda)$. Using Theorem 3.1, this implies "point-wise" differentiability of $\gamma \mapsto U_t(\gamma) g(x)$ for fixed $x$ and $g$. In the following we will lift this property to the operator level. Using the transition kernel, duality arguments, and uniformity of certain bounds in $x \in X$, we will be able to extend our result to the Perron-Frobenius operator and Koopman operator, where the differentiable dependence is with respect to the operator norm. To this end, we will first remove the dependence on $g$ in Lemma 4.2, then the dependence on $x$ in Lemma 4.4, and bring all this together in Theorem 4.5. Below, we write $X^*$ to denote the topological dual of a given vector space $X$.
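The kernel representation and the duality between the two operators can be illustrated on a grid. In the sketch below, the domain $[0, 1]$, the midpoint quadrature, and the row-normalized Gaussian bump standing in for $k_t(x, y, \gamma)$ are all hypothetical choices made for illustration; in particular, the stand-in kernel is not derived from any SDE. The Koopman operator integrates the kernel against the observable in $y$, the Perron-Frobenius operator integrates it against the density in $x$.

```python
import math

n = 60
h = 1.0 / n
grid = [(i + 0.5) * h for i in range(n)]   # midpoints of a uniform partition of [0, 1]


def raw_kernel(x, y, s=0.1):
    """Unnormalized Gaussian bump, a hypothetical stand-in for a transition density."""
    return math.exp(-((y - x) ** 2) / (2.0 * s * s))


# k[i][j] approximates k_t(x_i, y_j); each row is normalized so that sum_j k[i][j] * h = 1.
k = []
for x in grid:
    row = [raw_kernel(x, y) for y in grid]
    mass = sum(row) * h
    k.append([v / mass for v in row])


def koopman(g):
    """(U g)(x_i) = sum_j k(x_i, y_j) g(y_j) h."""
    return [sum(k[i][j] * g[j] for j in range(n)) * h for i in range(n)]


def perron_frobenius(f):
    """(P f)(y_j) = sum_i k(x_i, y_j) f(x_i) h."""
    return [sum(k[i][j] * f[i] for i in range(n)) * h for j in range(n)]


def inner(u, v):
    """Discrete L^2 inner product with respect to the Lebesgue measure."""
    return sum(a * b for a, b in zip(u, v)) * h


f = [1.0 + math.sin(2.0 * math.pi * x) for x in grid]   # initial density (mass one)
g = [math.cos(3.0 * x) for x in grid]                   # observable

lhs = inner(perron_frobenius(f), g)
rhs = inner(f, koopman(g))
print(lhs, rhs)   # duality: <P f, g> = <f, U g>
```

In this discretization the two operators are represented by a matrix and its transpose, so the duality holds up to rounding; the row normalization makes the Koopman operator preserve constants and the Perron-Frobenius operator preserve mass.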
Proof. In order to lift our point-wise result Theorem 3.1 to the operator level, we use some quantitative bounds derived in the proof of that theorem. By Lemma 4.1, if g ∈ L 2 (Λ) then g ∈ L 2 µ 0,x t holds, and we have for the derivative of u x g with respect to γ at γ = 0. The equations (3.6) and (3.7) imply the existence of the residual where ♦ = V or ♦ = L ∞ µ 0,x . Now, by substituting equation (3.6) and (3.2) in (4.5), and by applying (4.2) and (3.5), it follows that there exists a constant C independent of x, g and γ, and a positive function q(γ) independent of x and g such that where q(γ) γ ♦ vanishes for γ ♦ → 0. Thus, if we consider u x g (γ) and its derivative (4.4) as linear functionals on g ∈ L 2 (Λ), i.e., then dividing (4.5) by g L 2 (Λ) , taking the supremum over all g = 0, and using (4.6), we obtain (4.7) Using that |q(γ)| γ ♦ → 0 as γ ♦ → 0 proves the claim.
Further, we get for the derivative By using that the bounds (4.6) and (4.7) are uniform with respect to x, we remove the x-dependence of the previous lemma in the following result.
Proof. Considering Lemma 4.2, we still need to get rid of the evaluation in x. Fortunately, all relevant bounds are uniform in x. First, we use the Riesz isomorphism R to identify L 2 (Λ) * with L 2 (Λ), which leads to and further guarantees the existence of the following operator in γ, Note that the operator is a bounded linear operator. Since R is a linear isomorphism, Lemma 4.2 and in particular (4.7) provide Finally, differentiability of γ → k t (·, ·, γ) , V → L 2 (Λ ⊗ Λ) is proven by observing that and noting that |q(γ)| γ ♦ → 0 as γ ♦ → 0.
We recall that the Perron-Frobenius operator and the kernel are related by $P_t(\gamma) f(y) = \int_X k_t(x, y, \gamma)\, f(x)\,\mathrm{d}x$. The differentiability of $P_t(\gamma)$ is now a simple consequence of Lemma 4.4. Given two vector spaces $U_1$ and $U_2$, we write $L(U_1, U_2)$ to denote the space of bounded linear operators from $U_1$ to $U_2$.
Remark 4.6. We could prove the differentiability of the Koopman operator U t (γ) analogously to Theorem 4.5. Alternatively, we can use the reasoning from Lemma A.1 and Remark A.2 to deduce the differentiability of the Koopman operator with respect to γ.

Differentiability of isolated eigenvalues
Using a classical result, stated e.g. in [Klo17, p.2] and originally derived in [Ros55] using the implicit function theorem, we can state the following result on smooth drift dependence of the eigenvalues and eigenvectors of the transfer operators. Note that all non-zero eigenvalues are automatically isolated, as the transfer operator is compact; see [FK17, Lemma 30] and the remark following the proof (note that this result assumes stronger regularity conditions on the data). Recall that the space $V$ from Section 2 equipped with the norm $\|\cdot\|_V$ defined in (2.9) is a Banach space, cf. Lemma A.5 in the appendix.
Theorem 5.1. Let us assume that $\lambda_0$ is a simple and isolated eigenvalue with eigenvector $f_0$ of the unperturbed linear and bounded operator $P_t(0)$ that belongs to $L(L^2(\Lambda), L^2(\Lambda))$. Then, there exists a neighborhood $U \subset V$ of the constant function $0$ such that for all $\gamma \in U$ the operators $P_t(\gamma)$ have an isolated eigenvalue $\lambda_\gamma$ close to $\lambda_0$. Further, the mappings $\gamma \mapsto \lambda_\gamma$ and $\gamma \mapsto f_\gamma$ that send the function $\gamma$ to its corresponding eigenvalue and eigenvector, respectively, are continuously Fréchet differentiable. In the case of the eigenvector, which is only unique up to scaling, the scaling can be chosen such that the given map is differentiable.
As with Theorem 4.5, one can show the analogous result for the Koopman operator.
Proof. By Theorem 4.5, $\gamma \mapsto P_t(\gamma)$ is Fréchet differentiable, so we can consider $P_t(\gamma)$ as an additive perturbation of $P_t(0)$ in the operator space $L$. Using the idea of [Ros55], nicely stated in [Klo17, p.1], we can deduce the existence of a neighborhood $\mathcal{U} \subset L$ of $P_t(0)$ and of mappings $m_1 : P_t(\gamma) \mapsto \lambda_\gamma$ and $m_2 : P_t(\gamma) \mapsto f_\gamma$ such that $m_1$ and $m_2$ are analytic on $\mathcal{U}$. The proof in [Klo17] utilizes the implicit function theorem to conclude this. By Theorem 4.5, $\ell : \gamma \mapsto P_t(\gamma)$ is a continuously differentiable mapping. Thus, there exists a neighborhood $U \subset V$ of $0$ such that the mappings $n_1 = m_1 \circ \ell : U \to \mathbb{R}$, $\gamma \mapsto \lambda_\gamma$, and $n_2 = m_2 \circ \ell : U \to L^2(\Lambda)$, $\gamma \mapsto f_\gamma$, are continuously Fréchet differentiable.
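A finite-dimensional analogue may help to see what differentiability of a simple eigenvalue means in practice. For a matrix $A$ with simple eigenvalue $\lambda$, right eigenvector $r$ and left eigenvector $l$, the standard first-order perturbation formula reads $\frac{\mathrm{d}}{\mathrm{d}s} \lambda(A + sB)\big|_{s=0} = \frac{l^\top B r}{l^\top r}$. The sketch below checks this against a finite difference for a $2 \times 2$ example; the matrices `A` and `B` are hypothetical and only stand in for $P_t(0)$ and for a first-order change of the operator.

```python
import math


def dominant_eigenvalue(m):
    """Largest eigenvalue of a real 2x2 matrix with real spectrum (closed form)."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))


def perturbed(m, p, s):
    """Entrywise m + s * p."""
    return [[m[i][j] + s * p[i][j] for j in range(2)] for i in range(2)]


A = [[0.9, 0.2], [0.1, 0.8]]      # column-stochastic, eigenvalues 1 and 0.7
B = [[0.0, 0.1], [0.05, -0.02]]   # arbitrary perturbation direction

lam = dominant_eigenvalue(A)
right = (A[0][1], lam - A[0][0])  # solves A r = lam r
left = (A[1][0], lam - A[0][0])   # solves l^T A = lam l^T

numerator = sum(left[i] * B[i][j] * right[j] for i in range(2) for j in range(2))
denominator = left[0] * right[0] + left[1] * right[1]
first_order = numerator / denominator

s = 1e-6
finite_difference = (
    dominant_eigenvalue(perturbed(A, B, s)) - dominant_eigenvalue(perturbed(A, B, -s))
) / (2.0 * s)
print(first_order, finite_difference)
```

The simplicity of the eigenvalue enters through the denominator $l^\top r$, which is non-zero exactly in that case.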
Remark 5.2 (Changing reference measures). For non-stationary dynamics it is natural to consider $P_t : L^p(\mu_0) \to L^p(\mu_t)$, cf. [Fro13, Den17], where $\mu_0$ is some given initial distribution and $\mu_t$ is the push-forward of $\mu_0$ by the dynamics. The $\gamma$-dependence of $\mu_t = \mu^\gamma_t$, however, poses the problem that it is not trivial to compare the different $P_t(\gamma) f \in L^p(\mu^\gamma_t)$ with one another. Thus, the natural approach here is to work on the "common space" $L^p(\Lambda)$. Results that consider the case where the perturbed operators map to different spaces [Kol06, ZP07, MNP13] are more involved, and beyond the scope of the present paper.
Finally, let us discuss two applications that illustrate the significance of the preceding results in the context of dynamical systems.

Singular values and functions. As both $P_t(\gamma)$ and $U_t(\gamma)$ are differentiable with respect to $\gamma$ by Theorem 4.5, it follows that their concatenations $P_t(\gamma) U_t(\gamma)$ and $U_t(\gamma) P_t(\gamma)$ are also differentiable with respect to $\gamma$. Since $P_t(\gamma)$ and $U_t(\gamma)$ are adjoints (cf. (4.3) or [LM94]), this gives the differentiability of $P_t(\gamma)^* P_t(\gamma)$ and $P_t(\gamma) P_t(\gamma)^*$, thus of the simple singular values, and thus of the corresponding right and left singular vectors of $P_t(\gamma)$ and $U_t(\gamma)$. These are of particular interest for non-autonomous systems, as studies by Froyland et al. [FSM10, Fro13] have shown that they can be connected to finite-time persistent dynamical structures called coherent sets. It should be noted that these studies consider the operator $P_t$ with respect to changing reference measures, as discussed above in Remark 5.2. Nevertheless, our results apply to the large class of systems with divergence-free velocity fields (modeling incompressible flows) and with diffusion coefficients independent of the spatial variable $x$. For such systems the Lebesgue measure is an invariant reference measure.
Periodically forced systems. For periodically forced non-autonomous systems, i.e., where b(t+ T, ·) = b(t, ·), γ(t + T, ·) = γ(t, ·), and σ(t + T, ·) = σ(t, ·) for some T > 0, Theorem 5.1 shows smooth drift-dependence of the long-term behavior, e.g., of ergodic averages of the typē To see this, let P t 0 ,t 1 (γ) denote the Perron-Frobenius operator for the SDE (2.7) running from time t 0 to t 1 , and let {f γ t } t∈[0,T ) be the stationary family of densities, i.e., P s,t (γ)f γ s = f γ t mod T . In particular, P s,s+T (γ)f γ s = f γ s , and the eigenfunctions are differentiable with respect to γ. As the process X γ is clearly irreducible due to non-degenerate noise, the dominant eigenvalue 1 of P s,s+T (γ) is simple. By Birkhoff's individual ergodic theorem x, which is continuously differentiable in γ.
Remark A.2. More important than the simple and short proof given below are some special cases we are interested in. For W = L(V * , U * ) and the linear isometry j : A → A * the above lemma implies the differentiability of the adjoint family (A * (z)) z∈Z . If in this special case U, V are also reflexive spaces, then the reverse direction of the lemma holds as well.
Proof. The claims simply follow from the chain rule for Fréchet derivatives and the rule for differentiating linear operators.
The next result considers existence and uniqueness of strong solutions. It is adapted from [LS01, Section 4.4.2, Corollary].
Proof. We verify that · V indeed defines a norm on V := C ∩ L ∞ (µ 0,x ). Note that the linear growth condition (2.3) is a weaker condition than the boundedness condition (2.8), and that Lipschitz continuity combines with (2.8) to ensure that any f ∈ C ∩ L ∞ (µ 0,x ) is everywhere bounded. Thus, by the Lipschitz continuity and boundedness conditions, we have f V = 0 if and only if f is constant and equal to zero. Using the definition of V and its norm, we observe that if f, g ∈ V then f + g V ≤ f V + g V , and also that αf V = |α| f V for all α ∈ R. Observe also that V consists of all functions whose · V -norm is finite. To show that V is closed under Both quantities on the right-hand side are finite for any n ∈ N, so f ∈ V and hence f is Lipschitz continuous. Since L ∞ (µ 0,x ) contains discontinuous functions, it follows that V is a normed vector space and a proper subset of L ∞ (µ 0,x ). The completeness can be proven using adaptations of the arguments above.
A.2. Proofs of Lemmas 2.6 and 2.8

Recall that we use $\langle M \rangle$ to denote the quadratic variation process of a continuous local martingale $M$, as in (2.15). The following result is given in [Kaz94, Section 1.2, Theorem 1.5].

B. Properties of the Transition Kernel
Via the connection between the SDE (2.1) and the Fokker-Planck partial differential equation (B.1) below, one can investigate properties of the transition kernel both by stochastic calculus and the theory of partial differential equations (PDEs  [Iva82a,Iva82b] apply in our case. Thus, we shall derive properties of the transition kernel associated with the SDE (2.7) on a bounded domain X with sufficiently smooth boundary and reflecting boundary conditions from estimates stemming from the theory of parabolic PDEs. In particular, we will be following the steps of Friedman [Fri08] and sharpen some of his estimates.
In what follows, we will establish a bound of this type for Γ N as well. To this end, we will need a kind of an analogue of [Fri08, Lemma 1.4.3], with respect to integration on the boundary ∂X.
Now, let us construct the fundamental solution Γ N of the problem with Neumann boundary conditions. The general form of the problem is stated in [Fri08, (5.3.1)-(5.3.3)], but note that his functions β(x, t), f (x, t), g(x, t) are identically zero in our case, such that we arrive at We will now show that ϕ can be expressed in the kernel form ϕ(x, t) = and that Ψ satisfies a bound of type (B.3). Following the derivation of (5.2.12) we can show that for x ∈ ∂X the bound ∂Γ(x, t; ξ, τ ) ∂ν(x, t) ≤ const.
With this, (B.10), and Lemma B.1 again we obtain for the integral in (B.12) with respect to y, σ that t 0 ∂X Θ(y, σ; x, z, t) dS y dσ ≤ const.
for some µ * > 1/2 arbitrary. With this bound and (B.3), writing (B.12) and (B.13) in the form u(x, t) = X Γ N (x, t; z, 0)f 0 (z) dz, we arrive at the desired bound which implies a uniform bound in x for any fixed t > 0.