Linear and nonlinear transport equations with coordinate-wise increasing velocity fields

We consider linear and nonlinear transport equations with irregular velocity fields, motivated by models coming from mean field games. The velocity fields are assumed to be increasing in each coordinate, and the divergence therefore fails, in general, to be absolutely continuous with respect to the Lebesgue measure. For such velocity fields, the well-posedness of first- and second-order linear transport equations in Lebesgue spaces is established, as well as the existence and uniqueness of regular ODE and SDE Lagrangian flows. These results are then applied to the study of certain nonconservative, nonlinear systems of transport type, which are used to model mean field games in a finite state space. A notion of weak solution is identified for which unique minimal and maximal solutions exist; these do not coincide in general. A selection-by-noise result is established for a relevant example, demonstrating that different types of noise can select any of the admissible solutions in the vanishing-noise limit.


Introduction
This paper has two main purposes. First, we develop a well-posedness theory for first- and second-order linear transport equations for a new class of irregular velocity fields, as well as for the corresponding ODE and SDE regular Lagrangian flows. We then apply these results to the study of certain nonlinear transport systems, motivated in particular by applications to mean field games (MFG) with a finite state space.
The divergence of such a vector field is bounded from below, which means that, formally, the flow does not concentrate on null sets. This indicates that the two problems (1.1) and (1.2) are amenable to a solution theory in Lebesgue spaces.
On the other hand, the measure div b is not in general absolutely continuous with respect to Lebesgue measure, and this leads to the formation of vacuum for t > s. The existing theory of renormalized solutions, initiated by DiPerna and the first author [28] for Sobolev velocity fields and extended to the case b ∈ BV_loc and div b ∈ L^∞ by Ambrosio [2], therefore does not cover our present situation. In particular, the two problems (1.1) and (1.2) cannot be treated with a unified theory, because (1.1) cannot be understood in the distributional sense if div b is not absolutely continuous with respect to Lebesgue measure and u ∈ L^1_loc. Nevertheless, we exploit the dual relationship between the two problems, and provide a link to the forward, regular Lagrangian flow (1.4). Analogous results are also proved for the degenerate, second-order equations (1.5), where b satisfies (1.3), a(t, x) = (1/2)σ(t, x)σ(t, x)^T is a nonnegative, symmetric matrix with σ ∈ R^{d×m}; and for the SDE flow (1.7) d_t Φ_{t,s}(x) = b(t, Φ_{t,s}(x))dt + σ(t, Φ_{t,s}(x))dW_t, where W is an m-dimensional Brownian motion.
We then turn to the study of nonlinear, nonconservative systems of transport type that take the form (1.8). Here, u, f, and g are vector-valued, with f ∈ R^d and g, u ∈ R^m for some integers m, d ≥ 1, f and g are local functions of (t, x, u) ∈ [0, T] × R^d × R^m, and the equation reads coordinate-by-coordinate, for each i = 1, 2, . . ., m. The primary motivation for the consideration of (1.8) comes from the study of mean field games (MFG). These are models for large populations of interacting rational agents, who strategize in order to optimize an outcome, based on the collective behavior of the remaining population, while subject to environmental influences. The master equation for mean field games with a (finite) discrete state space takes the general form of the system (1.8) with d = m, as described in [44]; see also [10, 33]. Alternatively, systems of the form (1.8) arise upon exploiting dimension-reduction techniques for continuum-state MFG models in which the various data depend on the probability distribution of players through a finite number of observables, i.e.
µ ↦ u(t, ∫ Φ dµ) for probability measures µ and some given continuous R^d-valued function Φ.
This connection is explored by the authors and Lasry in [41]. We note also that the special case where d = m, f(t, x, u) = −u, and g(t, x, u) = 0 leads to the system (1.9) −∂_t u + u • ∇u = 0, which arises in certain models describing the flow of compressible gases at low density with negligible pressure [36, 46]. The nonlinear equation (1.8) can formally be connected to a system of characteristic ODEs on R^d × R^m. In order to draw the analogy with MFG PDE systems and the master equation in a continuum state space (see for instance [15]), it is convenient to represent the characteristics as the forward-backward system (1.10) −∂_s U_{s,t}(x) = g(s, X_{s,t}(x), U_{s,t}(x)), U_{T,t}(x) = u_T(X_{T,t}(x)). If f, g, and u_T are smooth and the interval [t, T] is sufficiently small, then (1.10) can be uniquely solved, and the unique, smooth solution u of (1.8) is given by u(t, x) := U_{t,t}(x). The argument fails for arbitrarily long time intervals, in view of the coupling between X and the terminal condition for U. The monotone regime is explored in [10, 44], i.e., d = m and (−g, f) : R^d × R^d → R^d × R^d and u_T : R^d → R^d are smooth and monotone (one of which is strictly monotone). In that case, (1.10), and therefore (1.8), can be uniquely solved on any time interval. This regime is exactly analogous to the monotonicity condition of Lasry and the first author for MFG systems with a continuum state space [38-40], and, as in that setting, strong regularity and stability results can be established for (1.8). The monotone regime also allows for a well-posed notion of global weak solutions of (1.8), even when f, g, and u_T fail to be smooth [9].
If there exist functions H : R^d × R^d → R and v_T : R^d → R such that (g, f) = (∇_x, ∇_p)H and u_T = ∇v_T (the so-called "potential regime"), then, formally, one expects u(t, •) = ∇v(t, •), where v solves the Hamilton-Jacobi equation (1.11), for which global weak solutions can be understood through the theory of viscosity solutions [22]. Weak solutions of (1.8) can then be indirectly understood as the distributional derivative of v, an approach taken in [19]. In the special case where d = m = 1, (1.8) can be studied with the theory of entropy solutions of scalar conservation laws [37].
In this paper, using the theory developed here for linear equations, we identify new regimes of assumptions on f, g, and u_T for which a notion of weak solution can be identified in any dimensions d, m ≥ 1. Under a certain ordering structure, the existence of unique maximal and minimal solutions is established; these do not coincide in general. This nonuniqueness is further explored from the viewpoint of stochastic selection, and we prove, for a specific but informative example, that any member of the family of solutions can be distinguished by a certain vanishing-noise limit, indicating that the choice of a mean field game equilibrium is very sensitive to the manner in which low-level, systemic noise is introduced to the model.

Summary of main results
We list the main results of the paper here in an informal setting. More precise statements and discussions can be found in the body of the paper.
The assumption that b satisfies the (semi-)increasing condition (1.3) implies that b ∈ BV_loc. We emphasize again, however, that the measure div b will in general have a singular part with respect to Lebesgue measure, and we therefore cannot appeal to the existing results on renormalized solutions to transport equations with irregular velocity fields. We do not give a full account of the vast literature on such problems, but refer the reader to the thorough surveys [3-6] and the references therein.
Our approach to the first-order transport problem is to study the well-posedness of the regular Lagrangian flow for (1.4) directly, rather than through PDE methods. The assumption (1.3) on b allows for a comparison principle with respect to the partial order (1.12): for x, y ∈ R^d, x ≤ y ⇔ x_i ≤ y_i for all i = 1, 2, . . ., d.
A careful regularization procedure then leads to the existence of a minimal and a maximal flow, which coincide a.e., and we have the following result (see Section 3 below for more precise statements). Theorem 1.1. Assume b satisfies (1.3). Then, for a.e. x ∈ R^d, there exists a unique absolutely continuous solution of (1.4), and there exists a constant C > 0 such that, for all 0 ≤ s ≤ t ≤ T, |φ_{t,s}^{-1}(A)| ≤ C|A| for all measurable A ⊂ R^d.
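The no-concentration estimate in Theorem 1.1 can be checked by hand on the model field b(x) = sgn x, which reappears in the discussion of the backward flow later in the paper; the grid computation below is purely illustrative (the interval A and the grid are arbitrary choices, not taken from the paper).

```python
import numpy as np

# Forward flow of dx/dt = sgn(x): phi_t(x) = x + t*sgn(x) for x != 0.
# A vacuum region (-t, t) opens up, yet no mass concentrates: the preimage
# of any interval A satisfies |phi_t^{-1}(A)| <= |A|  (here C = 1 works).
def phi(t, x):
    return x + t * np.sign(x)

t = 1.0
grid = np.linspace(-2.0, 2.0, 400001)          # fine grid on [-2, 2]
dx = grid[1] - grid[0]
A_lo, A_hi = 0.5, 1.5                          # an arbitrary target interval A
image = phi(t, grid)
# Measure of the preimage of A under the forward flow, approximated on the grid:
preimage_measure = dx * np.count_nonzero((image >= A_lo) & (image <= A_hi))
# Points landing strictly inside the vacuum region (-t, t): only x = 0 remains.
vacuum_points = np.count_nonzero(np.abs(image) < 0.999 * t)
```

The preimage of A = [0.5, 1.5] is (0, 0.5], of measure 0.5 ≤ |A| = 1, while the open interval (−t, t) receives no mass except the rest point 0: vacuum forms, but the flow never compresses.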
We also obtain analogous results for the SDE (1.7). Degenerate linear parabolic equations and SDEs with irregular data have been studied in a number of works that generalize the DiPerna-Lions theory and the Ambrosio superposition principle; see [20, 29, 42, 43, 55]. A common source of difficulty is the dependence of σ on the spatial variable, even if it is smooth. This is the case, for instance, when b ∈ BV_loc and div b ∈ L^∞, as treated by Figalli [29], or when b satisfies a one-sided Lipschitz condition from below, as considered by the authors in [45]; in both settings, the results are constrained to σ(t, x) = σ(t), constant in R^d. In our present setting, we can relax the spatial dependence, and we assume that σ is Lipschitz and satisfies (1.13) σ_{ik}(t, x) = σ_{ik}(t, x_i) for all (t, x) ∈ [0, T] × R^d, i = 1, 2, . . ., d, k = 1, . . ., m.
Then, in Section 4, we turn to the study of the linear transport equations (1.1)-(1.2), as well as the second-order equations (1.5)-(1.6), which can be related to the ODE and SDE flows of Section 3.
Solutions of the nonconservative equation (1.1) cannot be understood in the sense of distributions, because the measure div b can have a singular part. We nevertheless provide a PDE-based characterization of solutions that are increasing or decreasing with respect to the partial order (1.12). In order to give meaning to the ill-defined product b • ∇u in this context, we introduce mollifications of a one-sided nature that lead to commutator errors which are shown to possess a sign; this is to be compared with the renormalization theory initiated in [28], in which the commutator errors are shown to converge to zero with the convolution parameter. This leads to notions of sub- and supersolution of (1.1), which are proved to satisfy a comparison principle. The solution operator for (1.1) can thus be alternatively characterized in terms of these regularizations, and, moreover, the regular Lagrangian flow (1.4) can be recovered as the (vector-valued) solution of the terminal value problem (1.1) with u_T(x) = x. These two viewpoints on the transport equation and the ODE flow are instrumental in our understanding of the nonlinear equations to follow.
The continuity equation (1.2) (or the Fokker-Planck equation (1.6) in the second-order case) can then be related to the nonconservative equation through duality. Importantly, arbitrary distributional solutions of (1.2) are not unique in general. We prove that, if f_0 ≥ 0, then there exist unique nonnegative distributional solutions of (1.2) and (1.6), which coincide with the duality solutions. Moreover, this result is proved independently of the superposition principle; instead, we use the duality with the nonconservative equation, and the characterization of its solutions in terms of one-sided regularizations.
A consequence of the uniqueness of nonnegative distributional solutions of (1.2) is that, if f and |f| both satisfy (1.2) in the sense of distributions, then f is the "good" (duality) solution (see Corollary 4.1 below). We do not know whether this property characterizes the duality solution, or, in other words, whether the duality solution satisfies the renormalization property in general. This should be compared with [45], where the authors resolve the same questions for half-Lipschitz velocity fields.
The paper concludes in Section 5 with the study of the nonlinear equation (1.8) and the associated system (1.10). We operate under the assumption that the discontinuous nonlinearities f and g satisfy (1.14), for some C ∈ L^1_+([0, T]) and all i, j = 1, 2, . . ., d and k, ℓ = 1, 2, . . ., m. Observe that (1.14) is satisfied with C ≡ 0 by the particular example of the Burgers-like equation (1.9). We develop a theory for solutions of (1.8) that are decreasing with respect to the partial order (1.12). The first observation is that the decreasing property is formally propagated by the solution operator. On the other hand, shocks form in finite time, and so, even if u_T, f, and g are smooth, u(t, •) will in general develop discontinuities for some t < T, requiring a notion of weak solution. We next note that, under the above assumptions, if u is decreasing, then the velocity field b(t, x) := f(t, x, u(t, x)) satisfies (1.3). Solutions of the nonlinear equation (1.8) can then be understood as fixed points of the linear problem (1.1), and at the same time through the system of forward-backward characteristics (1.10), using the theory of regular Lagrangian flows from Section 3.
Theorem 1.3. Assume f and g satisfy (1.14) and u_T is decreasing. Then there exist maximal and minimal decreasing solutions u^+ and u^- of (1.8) in the fixed-point sense. If u is any other such solution, then u^- ≤ u ≤ u^+.
Continuous decreasing solutions of (1.8) satisfy a comparison principle, which in particular implies that u^+ lies below every continuous supersolution and u^- lies above every continuous subsolution. In general, u^- and u^+ do not coincide, which must then be a consequence of the formation of discontinuities. This nonuniqueness of solutions is closely related to the problem of multiplying distributions of limited regularity. Indeed, in the equation (1.8), the product f(t, x, u) • ∇u cannot be defined in a way that is stable with respect to regularizations, in view of the fact that, in general, u ∈ BV_loc and ∇u is a locally finite measure.
The well-posedness of both strong and weak solutions of MFG master equations in the continuum-state-space setting has been explored under various sets of monotonicity assumptions [1, 8, 11, 15, 16, 18, 31, 32, 34, 35, 47]. The approach in our setting, which appeals to Tarski's fixed point theorem for increasing functions on lattices [54], has also been taken in the continuum-state-space setting, where maximal and minimal solutions were found under related assumptions; see for instance [25-27, 48]. The partial order used in [48] comes from the notion of stochastic dominance for probability measures. We note that, for the equation (1.8) posed on an infinite-dimensional Hilbert space of L^2 random variables, the partial order (1.12) is related to the analogous notion of stochastic dominance for random variables, and we aim to study the infinite-dimensional version of (1.8) in future work.
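The role of Tarski's theorem can be illustrated on a toy finite lattice. The map F below is a hypothetical example, not taken from the paper: it is coordinatewise increasing on {0, . . ., N}² and admits a whole family of fixed points, and iterating from the bottom and top elements of the lattice produces the minimal and maximal ones.

```python
def iterate_to_fixed_point(F, start, max_iter=1000):
    """Iterate a coordinatewise-increasing map on a finite lattice.  Starting
    from the bottom (resp. top) element yields the least (resp. greatest)
    fixed point whose existence Tarski's theorem guarantees."""
    x = start
    for _ in range(max_iter):
        fx = F(x)
        if fx == x:
            return x
        x = fx
    raise RuntimeError("no convergence")

N = 10
# Illustrative coordinatewise-increasing map on {0,...,N}^2: every diagonal
# point (c, c) is fixed, so the fixed point is badly non-unique.
def F(p):
    a, b = p
    return (min(max((a + b) // 2, 0), N), min(max((a + b + 1) // 2, 0), N))

least = iterate_to_fixed_point(F, (0, 0))      # minimal fixed point
greatest = iterate_to_fixed_point(F, (N, N))   # maximal fixed point
middle = iterate_to_fixed_point(F, (3, 8))     # an intermediate fixed point
```

Intermediate starting points converge to intermediate fixed points, mirroring the picture of a family of solutions squeezed between u^- and u^+.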
We explore the nonuniqueness issue in more detail for the Burgers equation (1.9) in one dimension, where the decreasing terminal value has a single discontinuity at 0. It turns out that (1.15) admits infinitely many fixed-point solutions, consisting of a shock traveling with variable speed between 0 and 1. Of course, (1.15) can be reframed as a scalar conservation law, whose unique entropy solution is the shock-wave solution with speed 1/2. We note that the notion of entropy solution does not extend to the nonconservative equations (1.8) or (1.9). We characterize the family of fixed-point solutions of (1.15) as limits under distinct types of regularizations of the equation (1.15).
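The family of shock fixed points can be sketched formally as follows; the computation assumes the terminal datum u_T = 1_{(−∞,0)} and paraphrases the discussion above, and is not a substitute for the precise statement of (1.15).

```latex
% Formal sketch, assuming u_T = \mathbf{1}_{(-\infty,0)} and equation (1.9).
% For each c \in [0,1], consider the shock profile
\[
  u^{(c)}(t,x) \;=\; \mathbf{1}_{\{x < c\,(T-t)\}},
  \qquad u^{(c)}(T,\cdot) = u_T .
\]
% Away from the shock curve \xi(t) = c\,(T-t), both states are constant, so
% -\partial_t u + u\,\partial_x u = 0 holds trivially.  The characteristics
% \dot X = -u travel with speed -1 in the left state and 0 in the right
% state, and these bracket the shock velocity \dot\xi(t) = -c precisely when
% c \in [0,1]; each u^{(c)} is therefore consistent with the fixed-point /
% characteristic formulation.  Rewriting (1.9) in conservative form,
\[
  -\partial_t u + \tfrac12\,\partial_x\!\left(u^2\right) = 0,
\]
% the Rankine--Hugoniot condition \dot\xi = -\tfrac12\,(u_- + u_+) selects
% the single speed c = \tfrac12, i.e. the entropy solution mentioned above.
```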
Here u^ε is the unique classical solution of (1.16). We interpret Theorem 1.4 as a selection-by-noise result for the nonunique problem (1.15). Indeed, the result can be reformulated on the level of the system (1.10), which, for (1.16), becomes the forward-backward system of SDEs (1.17). Various selection methods have been proposed to study mean field games models that do not admit unique solutions, and we refer in particular to [19, 24] for problems involving stochastic selection. Our result is distinguished by the consideration of several different descriptions of small noise, each of which selects a different solution of the deterministic problem in the vanishing-noise limit.

A note on velocity fields with a one-sided Lipschitz condition
Let us remark that the regime in which b satisfies (1.3) shares many similarities with the setting in which b is half-Lipschitz from below, that is, b satisfies (1.18). A key commonality between the two settings is that div b is bounded from below, but not necessarily absolutely continuous with respect to Lebesgue measure. Transport equations and flows for velocity fields satisfying (1.18) have been studied from a variety of different viewpoints [12, 13, 17, 21, 50-52], and in [45] the authors obtain very similar results to those described above regarding the existence, uniqueness, and stability of the regular Lagrangian flow forward in time, as well as proving well-posedness and studying properties and characterizations of solutions to the problems (1.1) and (1.2) in Lebesgue spaces.
A key difference between the two regimes is the behavior of the flow (1.4) in the compressive direction, that is, backward in time. For velocity fields satisfying the half-Lipschitz condition (1.18), the backward ODE is uniquely solvable for all x ∈ R^d. Moreover, the resulting backward flow is Lipschitz continuous, and it can be identified as the left inverse of the forward, regular Lagrangian flow; see [45] for more precise statements, as well as new characterizations of the time-reversed versions of (1.1) and (1.2).
On the other hand, when b satisfies (1.3), the backward problem (1.4) is not in general uniquely solvable for every x ∈ R^d, nor is it true that a globally Lipschitz flow can always be found. We note that, even in examples where the backward flow has a unique solution for Lebesgue-a.e. x ∈ R^d, neither the stability nor the solvability of the time-reversed versions of (1.1)-(1.2) in Lebesgue spaces can be expected to hold, because, in general, any backward flow solving (1.4) will concentrate on sets of Lebesgue measure zero. For a detailed discussion and examples, see subsections 3.4 and 4.3 below.

Notation
Given a bounded function φ : R^d → R, φ_* and φ^* denote its lower and upper semicontinuous envelopes, and, if φ is R^m-valued, the same notation is used coordinate-by-coordinate.
We often denote arbitrary function spaces on R^d by X(R^d) = X when there is no ambiguity about the domain. For p ∈ [1, ∞], L^p_w and L^p_{w-⋆} denote the spaces of p-integrable functions with, respectively, the weak and weak-⋆ topologies. L^p_loc denotes the space of locally p-integrable functions with the topology of local L^p-convergence, and L^p_{loc,w} and L^p_{loc,w-⋆} are understood accordingly. The notation 1 denotes the vector (1, 1, . . ., 1) in Euclidean space, the dimension being clear from context. Given two sets A and B, A△B := (A\B) ∪ (B\A).

Preliminary results
This section contains a collection of results regarding vector-valued notions of increasing/decreasing, as well as a vector-valued maximum principle.

Properties of increasing functions
We first introduce a partial order on R^d that is used throughout the paper. Definition 2.1. For x, y ∈ R^d, we write (2.1) x ≤ y if x_i ≤ y_i for all i = 1, 2, . . ., d.
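As a trivial but occasionally useful check, the order (2.1) can be encoded directly; the snippet is purely illustrative and emphasizes that the order is genuinely partial.

```python
import numpy as np

def leq(x, y):
    """The partial order (2.1): x <= y iff x_i <= y_i for every coordinate i."""
    return bool(np.all(np.asarray(x) <= np.asarray(y)))

# The order is partial, not total: (1, 0) and (0, 1) are incomparable.
comparable = leq((1, 0), (0, 1)) or leq((0, 1), (1, 0))
```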
For any sequence y_n → x, if y_n ≰ x, then y_n may be replaced with a value y'_n such that y'_n ≤ x, y'_n ≤ y_n, and φ_i(y'_n) ≤ φ_i(x). A similar argument holds if y_n ≱ x for all n, and the claim follows.
Lemma 2.1 implies that each component of an increasing function φ : R^d → R^m has limits from both the "left" and the "right." We will call an increasing function φ càdlàg if each component φ_i, i = 1, 2, . . ., m, is continuous from the "right" in the sense of the partial order. Remark 2.1. Given a nonnegative measure µ, the repartition function of µ is an example of a càdlàg increasing function, but such functions do not cover the full range of increasing functions if d ≥ 2; indeed, they are distinguished by the fact that the mixed derivatives ∂_{x_{ℓ_1}} · · · ∂_{x_{ℓ_k}} φ, for any distinct set (ℓ_j)_{j=1}^k ⊂ {1, 2, . . ., d}, are still measures. Consider a smooth surface Γ, and let n be the normal vector to Γ that, at all points of Γ, points inward to U_+. If n ≥ 0 everywhere on Γ, in the sense of (2.1), then φ = 1_{U_+ ∪ Γ} is a càdlàg increasing function that is not a repartition function.

ABV functions
In one dimension, functions of bounded variation can be written as differences of nondecreasing functions. With respect to the partial order (2.1), the generalization of this notion gives rise to a strict subspace of BV.
where the supremum is taken over all curves γ. For example, C^{0,1} ⊂ ABV. It is straightforward to see that ABV = BV when d = 1.
Remark 2.2. Several generalizations of the notion of finite variation to multiple dimensions, besides the space BV, exist in the literature, and the one in Definition 2.2 is due to Arzelà [7]. It gives a strictly smaller subspace than BV (as we show below). This notion of variation, along with several others, seems not to have had the same ubiquity in the theory of PDEs as the usual notion of BV, but it is particularly relevant in this paper. More details about ABV functions, and many other notions of variation in multiple dimensions, can be found in [14]. The function [0, ∞) ∋ r ↦ φ(x + r1) is increasing and is thus differentiable almost everywhere in [0, ∞).
The finiteness of the above expression for almost every x ∈ R d is then a consequence of Fubini's theorem.
The argument for the lim inf is identical.
Remark 2.3. In view of Lemmas 2.1 and 2.3, an increasing function is equal almost everywhere to a càdlàg function, and we assume for the rest of the paper that increasing functions are càdlàg.

One-sided regularizations
It will be convenient at several points in the paper to use regularizations of discontinuous functions that enjoy certain ordering properties. We specify two such regularization procedures here, each of which has merit in different situations.
We first discuss the inf- and sup-convolutions, given for a measurable function φ by (2.2). The following are either well-known properties or easy to check.
As a consequence, because φ is increasing, and similarly φ_ε(x) ≤ φ_*(x), part (a) is established. Now assume that 0 < ε < δ. Using again that φ is increasing, we find, and similarly, φ_δ ≤ φ_ε. The convergence statements in part (b) then easily follow. Finally, part (c) is seen upon computing. Remark 2.5. An analogue of part (c) in Lemma 2.5 can also be seen for the sup- and inf-convolutions φ^ε and φ_ε in (2.2): for all R > 0, there exists C_R > 0, depending on the linear growth of φ, such that the estimate holds for all x ∈ B_R. The advantage of the one-sided mollifications φ * ρ^ε and φ * ρ_ε is that the constant C_R can be replaced by a universal constant that does not depend on the growth of φ, which will be convenient when we consider φ depending on an additional time parameter in an L^1 way. On the other hand, the sup- and inf-convolutions are very flexible one-sided regularizations even when φ is not itself increasing.
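The precise formulas (2.2) are not reproduced in this excerpt; as a stand-in, the following sketch uses the standard quadratic sup- and inf-convolutions, which exhibit the same one-sided behavior: they approximate a discontinuous function from above and below, monotonically in ε.

```python
import numpy as np

# Quadratic sup-/inf-convolutions (a standard stand-in for (2.2)):
#   phi^eps(x) = sup_y [ phi(y) - |x - y|^2 / (2 eps) ]   (upper, >= phi)
#   phi_eps(x) = inf_y [ phi(y) + |x - y|^2 / (2 eps) ]   (lower, <= phi)
def sup_conv(phi_vals, grid, eps):
    return np.array([np.max(phi_vals - (x - grid) ** 2 / (2 * eps)) for x in grid])

def inf_conv(phi_vals, grid, eps):
    return np.array([np.min(phi_vals + (x - grid) ** 2 / (2 * eps)) for x in grid])

grid = np.linspace(-1, 1, 201)
phi = np.where(grid < 0, 1.0, 0.0)   # a decreasing step function
upper = sup_conv(phi, grid, 1e-2)    # squeezes phi from above
lower = inf_conv(phi, grid, 1e-2)    # squeezes phi from below
```

As ε decreases the upper regularization decreases and the lower one increases, recovering the semicontinuous envelopes in the limit.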

A maximum principle
The following multi-dimensional maximum principle is used at various points in the paper.
Proof. We prove the result first under the additional assumption that the infimum is attained and that, for all t ∈ [0, T] and i ∈ {1, 2, . . ., M}, x ↦ V_i(t, x) attains a global maximum on R^d. In that case, we obtain a contradiction in view of the assumption on d_i, and we conclude in this case. We now turn to the general case. For x ∈ R^d, define ν(x) := (1 + |x|²)^{1/2}, and note that Dν and D²ν are globally bounded on R^d. Then standard arguments yield that, for all i ∈ {1, 2, . . ., M} and β > 0, lim_{β→0} β sup_{x∈S_β} ν(x) = 0. It follows that there exists a smooth bounded function with the required maximum property. Fix δ > 0 and a constant C > 0 to be determined. Then, for all i ∈ {1, 2, . . ., M}, Ṽ_i(0, •) ≤ −δ, and, for all t ∈ [0, T], Ṽ_i(t, •) attains a global maximum. From the boundedness of the coefficients and of the V_i, and from the nonnegativity of d_i, it follows that there exists C̄ > 0, depending only on the bounds of the coefficients and the V_i, such that, taking C > C̄ and letting β be sufficiently small relative to δ, we see that d̃_i > 0. From the first step, we conclude that, if β is sufficiently small, then the bound holds for all (t, x) ∈ [0, T] × R^d and i ∈ {1, 2, . . ., M}. Sending first β → 0 and then δ → 0 yields the result.

The regular Lagrangian flow
The object of this section is to study the forward-in-time flow (3.1). The second condition reads equivalently as stated below. Moving forward, for convenience, we define the positive, increasing, absolutely continuous functions appearing in (3.3). Remark 3.1. The condition (3.2) involves a choice of basis on R^d. Indeed, the results of the paper continue to hold if b is replaced by Ab(t, A^T x) for a d × d orthogonal matrix A.
The precise interpretation of the problem (3.1), in which b is discontinuous, is through a differential inclusion. Namely, at every discontinuity x of b(t, •), we have b_*(t, x) ≠ b^*(t, x), and so the natural formulation is (3.4). Remark 3.2. In order to make the notation less cumbersome, for a function φ : [0, T] × R^d → R, we will always denote by φ_* and φ^* the lower and upper semicontinuous envelopes of φ in the space variable only. Remark 3.3. The formulation in terms of boxes [α_i, β_i] containing all limit points of b(t, y) as y → x is slightly weaker than the standard Filippov regularization [30], where boxes are replaced with general convex sets.
Before developing the general theory, it is useful to record the following a priori ODE bounds, which are an easy consequence of Grönwall's lemma.
Then there exists a constant C > 0, depending only on T > 0, such that C 0 (r)dr.
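The Grönwall-type bound behind these a priori estimates can be sanity-checked numerically. The specific linear-growth bound |b(t, x)| ≤ C_0(t)(1 + |x|) used below is an assumption made for this sketch, not a quotation of (3.2); it yields 1 + |x(t)| ≤ (1 + |x(s)|) exp(∫_s^t C_0(r) dr).

```python
import numpy as np

# Sanity check of the Gronwall bound: if |b(t, x)| <= C0(t)(1 + |x|), then any
# solution of dx/dt = b(t, x) satisfies
#     1 + |x(t)| <= (1 + |x(s)|) * exp(int_s^t C0(r) dr).
# Illustrative worst case: b(t, x) = C0 * (1 + x) with constant C0, x(0) >= 0.
C0, x0, T, n = 0.7, 1.0, 2.0, 20000
dt = T / n
x = x0
for _ in range(n):
    x += dt * C0 * (1.0 + x)                  # explicit Euler
bound = (1.0 + x0) * np.exp(C0 * T) - 1.0     # Gronwall upper bound at t = T
```

For this worst-case drift the bound is attained exactly by the continuous solution, and the explicit Euler iterate stays just underneath it.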

A comparison principle
The following comparison result for ODEs is at the heart of much of the analysis in this paper. It leads to the existence and uniqueness of the regular Lagrangian flow, as well as to stable notions of solutions to the transport and continuity equations with velocity fields satisfying (3.2).
) for all R > 0, and ) be such that, with respect to the partial order (2.1), X(0) ≤ Y(0) and Proof. The continuity of X and Y implies that there exists R > 0 such that we may assume without loss of generality the normalization below, and define t_0 := inf{t ∈ (0, T] : there exists i ∈ {1, 2, . . ., d} such that e^{−ψ_1(t)}∆_i(t) − δe^{ψ_2(t)} > 0}.
We have t 0 > 0. Assume by contradiction that t 0 < T , and let i be such that the maximum is attained.

Maximal and minimal flows
The comparison principle from the previous subsection is now used to establish the existence of maximal and minimal (with respect to the order (2.1)) semicontinuous solutions of the differential inclusion (3.4) for vector fields satisfying ∂_{x_i} b_j ≥ 0 for i ≠ j.
Then there exist solutions φ^+ and φ^- of (3.4), absolutely continuous in time, such that (a) for all 0 ≤ s ≤ t ≤ T, φ^+_{t,s} and φ^-_{t,s} are increasing in the sense of (2.1), and φ^+_{t,s} is right-continuous. When d = 1, the condition on the off-diagonal derivatives of b becomes vacuous, and we recover the existence of unique minimal and maximal solutions under the sole assumption that b is locally bounded.

Uniqueness and stability of the regular Lagrangian flow
We now use the full assumption (3.2), and in particular the lower bounds on ∂_{x_i} b_i for i = 1, 2, . . ., d, which were not needed to prove the existence of the maximal and minimal solutions of (3.4). In particular, we prove that φ^+ and φ^- are equal almost everywhere, giving rise to a unique regular Lagrangian flow.
where ω_1 is as in (3.3). Then b satisfies (3.2) with C_1 = 0 and with a possibly different C_0 ∈ L^1_+, and φ^± are the corresponding maximal and minimal flows. We may therefore assume without loss of generality that b(t, •) is increasing for t ∈ [0, T].
We first prove the statement involving the lower semicontinuous envelope of φ^+_{t,s}; in view of Proposition 3.1(d), we need only prove the opposite inequality (φ^+_{t,s})_* ≤ φ^-_{t,s}. Here the one-sided mollifiers ρ^ε and ρ_ε are defined in (2.3), and φ^{+,ε} and φ^{-,ε} denote the corresponding flows. Arguing exactly as in the proof of Proposition 3.1, appealing to Lemma 2.5, we may then appeal to Lemma 3.2 and find that, for all t ∈ [s, T], sending first ε → 0 and then δ → 0 gives (φ^+_{t,s})_* ≤ φ^-_{t,s}, as desired. We now prove the final statement. For 0 < ε < δ, we have the corresponding inequalities for t ∈ [s, T], and so, sending δ → 0, then, for arbitrary ε > 0 and 0 ≤ s ≤ t ≤ T, sending ε → 0 and appealing to dominated convergence (with a similar argument for the reverse inequality), and recalling that increasing functions are continuous almost everywhere, it follows from Proposition 3.2 that φ^+_{t,s} and φ^-_{t,s} are equal almost everywhere, for each 0 ≤ s ≤ t ≤ T. We thus finally arrive at the almost-everywhere unique solvability of the ODE (3.1), and the identification of the unique regular Lagrangian flow.
Theorem 3.1. For every s ∈ [0, T] and almost every x ∈ R^d, there exists a unique absolutely continuous solution [s, T] ∋ t ↦ φ_{t,s}(x) of the differential inclusion (3.4). Moreover, φ_{t,s} satisfies the regular Lagrangian property, and there exists C > 0, depending only on T, such that the estimate holds for all R > 0. In particular, φ is the unique absolutely continuous solution of the integral equation. Finally, if 0 ≤ r ≤ s ≤ t ≤ T, then the composition φ_{t,s} • φ_{s,r} is well-defined a.e. and is equal to φ_{t,r}. Remark 3.5. Taking f = 1_A in the estimate above, for some A ⊂ R^d of finite measure, we find the regular Lagrangian property |φ_{t,s}^{-1}(A)| ≤ e^{d(ω_1(t)−ω_1(s))}|A|.
Proof of Theorem 3.1. Let s ∈ [0, T] be fixed and, for N ∈ N, define t^N_n = s + n(T − s)/N.
Let ε > 0, and let b^ε and φ^{+,ε} be as in the proof of Proposition 3.1. The change of variables formula then applies. The set {φ^{+,ε}_{t,s}(x) : x ∈ supp f, ε > 0} is uniformly bounded in view of Lemma 3.1, and so the bounded convergence theorem applies upon taking ε → 0. This implies that the map f ↦ f • φ_{t,s} extends continuously to L^1(R^d). The local statement follows from the finite speed of propagation implied by the a priori estimates in Lemma 3.1.
It now follows easily that b(•, φ_{•,s}) belongs to L^1_loc([0, T] × R^d), and the uniqueness statement for absolutely continuous solutions of the integral equation follows from Proposition 3.2 and the fact that φ = φ^+ = φ^- a.e.
Finally, the composition φ_{t,s} • φ_{s,r} is justified because φ_{s,r} ∈ L^1_loc, and its equality to φ_{t,r} a.e. is a consequence of the a.e. uniqueness of the ODE and the flow properties in Proposition 3.1(c).
We now demonstrate that any regularization of b leads to the flow φ in the limit, not only the one-sided regularizations used above. Theorem 3.2. Assume (b^ε)_{ε>0} is a family satisfying (3.2) uniformly in ε and, for every ε, the stated convergence holds. Moreover, for all t ∈ [s, T], φ^ε_{t,s} is increasing, and thus BV. This, along with the a priori estimates in Lemma 3.1, implies that, along some subsequence ε_n → 0, φ^{ε_n}_{•,s} converges to some ψ_{•,s}, which then also satisfies the regular Lagrangian property (3.8).
We now use (3.8) to deduce that, for any R > 0, there exists R' > 0, independent of ε, such that the estimate holds for a.e. r ∈ [s, T] and any δ > ε. Taking first ε_n → 0 (using the Lipschitz continuity of b^δ) and then δ → 0, we find that the right-hand side above converges to zero. By Lemma 3.1, we may use the dominated convergence theorem, and send n → ∞ in the integral equation. We also have, by Minkowski's inequality, that (3.9) is satisfied in the integral sense for a.e. x ∈ R^d. By Theorem 3.1, ψ = φ, and therefore the convergence statements hold for the full family as ε → 0.
Another means of regularization is the addition of stochastic noise. In particular, if s ∈ [0, T], x ∈ R^d, ε > 0, and W : Ω × [0, T] → R^d is a Wiener process defined on a given probability space (Ω, F, P), then there exists a unique strong solution of the SDE (3.10). This is true even if b is merely locally bounded; see [23, 56].
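A minimal sketch of this stochastic regularization, assuming the model drift b(x) = sgn x (an illustrative choice, echoing the example in subsection 3.4 rather than quoting (3.10)) and using Euler-Maruyama with a common Brownian path for every initial point; the noisy flow remains increasing, mirroring the monotonicity of the regular Lagrangian flow.

```python
import numpy as np

# Euler-Maruyama for dX = b(X) dt + sqrt(2*eps) dW with the illustrative
# increasing drift b(x) = sgn(x).  A single ("common") Brownian increment is
# applied to all initial points, so the discrete flow map stays increasing.
rng = np.random.default_rng(0)
eps, T, n = 0.05, 1.0, 1000
dt = T / n
x = np.linspace(-1.0, 1.0, 21)         # ordered initial conditions
for _ in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))  # the same increment for every x
    x = x + dt * np.sign(x) + np.sqrt(2 * eps) * dW
```

Each Euler step y ↦ y + dt·sgn(y) + const is nondecreasing in y, so the ordering of the initial points is preserved pathwise, for any realization of the noise.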
The following is then proved exactly as for Theorem 3.2.
Theorem 3.3.Assume b satisfies (3.2) and let φ ε be the solution of (3.10).Then, for all s ∈ [0, T ], with probability one, as Proof.The transformation φt,s (x) = φ t,s (x) − √ 2ε(W t − W s ) leads to the equation The exact same arguments as those above show that φε t,s is increasing for all 0 ≤ s ≤ t ≤ T , satisfies the regular Lagrange property (3.8), and, for any fixed Brownian path W , the a priori estimates of Lemma 3.1 may be applied.Arguing as in the proof of Theorem 3.
)), and, for almost every x ∈ R d , [0, t] ∋ s → φ t,s (x) is the unique absolutely continuous solution of
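For the reader's convenience, the transformation used in the proof of Theorem 3.3 can be written out. Assuming the SDE (3.10) has the additive form dφ ε t,s (x) = b(t, φ ε t,s (x)) dt + √2ε dW t, which is the form consistent with the scaling of the displayed transformation, a formal computation gives:

```latex
\tilde\phi^{\varepsilon}_{t,s}(x) := \phi^{\varepsilon}_{t,s}(x) - \sqrt{2\varepsilon}\,(W_t - W_s)
\quad\Longrightarrow\quad
\frac{d}{dt}\,\tilde\phi^{\varepsilon}_{t,s}(x)
  = b\big(t,\ \tilde\phi^{\varepsilon}_{t,s}(x) + \sqrt{2\varepsilon}\,(W_t - W_s)\big),
\qquad \tilde\phi^{\varepsilon}_{s,s}(x) = x.
```

For each fixed Brownian path, the shifted field (t, y) ↦ b(t, y + √2ε (W t − W s )) is still coordinate-wise increasing in y, which is why the deterministic theory of this section applies pathwise.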

Some remarks on the backward flow
We discuss next the question of solvability of the backward flow: for fixed s ∈ [0, T ], this is the terminal value problem Formally, the lower bound on div b suggests that the backward flow should concentrate positive measure sets to null sets. As an example when d = 1, if b(t, x) = sgn x, then the backward flow is given by and so all trajectories eventually concentrate at x = 0. This situation can be generalized to multiple dimensions when b satisfies the half-Lipschitz property The condition (3.12) then implies the existence of a unique, Lipschitz, concentrating solution of the backward flow (3.11). This condition was used by Filippov [30] to build unique solutions of differential inclusions; see also [21, 50, 51] and the recent work of the authors [45].
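In the one-dimensional example b(t, x) = sgn x, the backward flow can be written down explicitly; the following standard computation is included as a sketch:

```latex
\phi_{s,t}(x) = \operatorname{sgn}(x)\,\max\{|x| - (t-s),\,0\},
\qquad 0 \le s \le t \le T,
```

so every trajectory reaches the origin by time t − s = |x| and remains there; in particular, the interval [−r, r], of measure 2r, is sent to the null set {0} as soon as t − s ≥ r.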
The situation is very different when b satisfies (3.2). The differential inclusion corresponding to (3.11) then takes the form As it turns out, there is no unique flow, and, in some cases, no Lipschitz flow.
Since b is independent of t, we formally write the flow φ t,s as φ t−s .
Recall that (3.13) is a weaker notion of solution than the classical Filippov formulation (see Remark 3.3). In Example 3.1 above, this means b(0, x 2 ) is taken to be the smallest convex set containing all limit points of b(y) as y → (0, x 2 ), that is, the line segment connecting (−1, −1) and (1, 1). The unique such solution is then the flow φ (0) . However, we can construct an example where even the Filippov flow is not unique.
In fact, φ (k) −τ is not continuous for any τ > 0 and k = 1, 2, 3. This can be seen by considering the flows starting from (x 1 , x 2 ) < (0, 0) with In particular, given τ > 0 and ε > 0, for k = 1, 2, 3, In [52], Poupaud and Rascle explore the connection between the uniqueness (for every x ∈ R d ) of Filippov solutions of (3.11) and a stable notion of measure-valued solutions to the continuity equation. As Example 3.2 shows, we cannot take this approach in analyzing (3.14), because there are three distinct measure-valued solutions when f 0 = δ 0 .

Stochastic differential equations
For b satisfying (3.2), we now consider the stochastic flow for the SDE (3.15) Here, for some m ∈ N, W : [0, T ] × Ω → R m is an m-dimensional Wiener process on a given probability space (Ω, F , P), and, for (t, x) Because σ may be degenerate, the addition of noise does not necessarily imply the existence and uniqueness of a strong solution. In reproducing the theory for the ODE (3.1), we are therefore led to the condition and, for a.e. t ∈ [0, T ] and all i = 1, 2, . . . , d, j = 1, 2, . . . , m, We assume, in addition to (3.2), a bounded oscillation condition for b: , and for a.e. (t, x, y) While not strictly necessary, this helps simplify some of the arguments by allowing certain regularizations of b to be globally Lipschitz.
The reformulation of (3.15) as a differential inclusion then takes the form (e) For a.e. x ∈ R d , (3.15) admits a unique integral solution, which is equal a.e. to Φ + •,s and Φ − •,s .
Proof of Theorem 3.5. Let b ε and b ε be the sup- and inf-convolutions of b as in the proof of Proposition 3.1, and define the stochastic flows Φ +,ε and Φ −,ε by Part (d), and then, as a consequence, (e), is proved similarly to the proof of Proposition 3.2 and Theorem 3.1, using now instead the one-sided mollifiers ρ ε and ρ ε to regularize b. Note that, in view of the assumption (3.17), for some and similarly for b * ρ ε . This allows for the use of the comparison Lemma 3.3 in the argument, which requires global Lipschitz regularity in view of the lack of finite speed of propagation. Finally, (f) follows exactly as in the proof of Theorem 3.1.

Linear transport equations with increasing drift
For velocity fields b satisfying (3.2), we discuss the terminal value problem for the nonconservative equation (4.1), as well as the initial value problem for the conservative equation The lower bound on div b implied by (3.2), which gave rise to the regular Lagrangian property for the unique flow φ in the previous section, here allows for a theory of weak solutions of (4.1) and (4.2) in L p -spaces, due to the expansive property of the flow.

The nonconservative equation
It is important to note that solutions u of (4.1) taking values in L p loc cannot be understood in the sense of distributions. This is because, under the assumption (3.2), div b is a measure that need not be absolutely continuous with respect to Lebesgue measure. Instead, we identify unique solution operators for (4.1) that are continuous on Lebesgue spaces and stable under regularizations. This is done through the relationship with the flow in the previous section, and also by characterizing the solutions using "one-sided" regularizations for increasing or decreasing solutions.

Representation formula
When b and u T are smooth, the unique solution of the terminal value problem for (4.1) with terminal value u T is u(t, x) = u T (φ T,t (x)), where φ is the flow corresponding to the ODE (3.1). In view of Theorem 3.1, this formula extends to u T ∈ L p loc , 1 ≤ p ≤ ∞, if the assumption on b is relaxed to (3.2). We thus identify a family of solution operators for (4.1) that are continuous on L p and evolve continuously in time. (a) For all 0 ≤ s ≤ t ≤ T , S(t, s) is continuous on L p loc (R d ) for 1 ≤ p ≤ ∞, and there exists C > 0 depending only on T such that, for any R > 0, ∥S(s, t)u∥ L p (B R ) ≤ e d(ω 1 (t)−ω 1 (s)) ∥u∥ L p (B R+C(1+R)(ω 0 (t)−ω 0 (s)) ) .
loc for some p < ∞, and u ε is the corresponding solution of (4.1), then, as ε → 0, u ε converges to S(t, T )u T strongly in C([0, T ], L p loc (R d )). The same statement is true if Proof. Properties (a) and (b) follow immediately from Theorem 3.1, while properties (c) and (d) follow from Theorem 3.2 and the dominated convergence theorem (see also Theorem 3.4). For the statement involving the viscous equation (4.4), we remark that, in that case, u ε is given by u ε (t, x) = u ε T (φ ε T,t (x)), where φ ε is the stochastic flow corresponding to (3.10). The proof is then finished by Theorem 3.3. More precisely, for ε > 0, we have the easy a priori bound It follows from a diagonal argument that there exist ε n → 0 and a family of continuous linear operators on L p such that S(r, s)S(s, t) = S(r, t) for r ≤ s ≤ t, and, for all t ∈ [0, T ], In particular, u := S(•, t)u ∈ C([0, T ], L p w (R d )), and so, for any s ∈ [0, t], On the other hand, We then find that ∥u(s + h, •)∥ L p → ∥u(s, •)∥ L p , which, coupled with the weak convergence, means that u(s + h, •) → u(s, •) strongly in L p if p > 1 (and so for all p locally). A similar argument holds with h replaced by −h.
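For smooth b and u T, the representation formula u(t, x) = u T (φ T,t (x)) used above can be verified directly from the flow equation; a formal sketch:

```latex
% Differentiating the flow identity \phi_{T,t}(\phi_{t,s}(x)) = \phi_{T,s}(x) in t yields
% \partial_t \phi_{T,t}(y) + D_x\phi_{T,t}(y)\, b(t,y) = 0, and hence, for u(t,x) = u_T(\phi_{T,t}(x)),
\partial_t u(t,x) + b(t,x)\cdot\nabla u(t,x)
  = \nabla u_T(\phi_{T,t}(x)) \cdot \big[\partial_t \phi_{T,t}(x) + D_x\phi_{T,t}(x)\,b(t,x)\big] = 0,
\qquad u(T,\cdot) = u_T.
```

The point of this subsection is that the right-hand side formula survives, via Theorem 3.1, when b is merely as in (3.2) and the chain rule computation above is no longer available.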
The following renormalization property for the solution operator S(s, t), 0 ≤ s ≤ t ≤ T , is then immediate from the formula.

Characterizing increasing/decreasing solutions
The solution of the transport equation in the previous subsection was characterized as the unique limit under arbitrary regularizations, as well as through the formula involving the regular Lagrangian flow. The remainder of the section is dedicated to further ways of characterizing the solution, in particular at the level of the equation (4.1) itself. This will become useful in the study of nonlinear equations in the final section.
We first observe that, if u T is increasing/decreasing with respect to the partial order (2.1), then so is u(t, •) for all t ∈ [0, T ]. While this is immediately clear from the formula u(t, •) = u T ∘ φ T,t and the fact that φ T,t is increasing, it can also be seen directly from the equation. Indeed, if u is a smooth solution of (4.1) and v i := ∂ x i u, then Therefore, if v i ≥ 0 (or v i ≤ 0) when t = T for all i = 1, 2, . . . , d, then the same is true for t < T by the maximum principle, Lemma 2.6. The result for general b satisfying (3.2) follows from approximating b and using the limiting result in Theorem 4.1. By linearity, we have thus established the following: Proposition 4.2. For all 0 ≤ s ≤ t ≤ T , S(s, t) : ABV → ABV .
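The system satisfied by the partial derivatives can be recorded explicitly; for smooth b and u, with v i := ∂ x i u as above, a formal sketch:

```latex
% Differentiating \partial_t u + b\cdot\nabla u = 0 in x_i:
\partial_t v_i + b\cdot\nabla v_i + \sum_{j=1}^d (\partial_{x_i} b_j)\, v_j = 0,
\qquad i = 1,\dots,d.
% Since b(t,\cdot) is coordinate-wise increasing, \partial_{x_i} b_j \ge 0 for all i, j,
% so the zero-order coupling matrix is nonnegative, and the componentwise sign of
% (v_1,\dots,v_d) at t = T propagates to t < T by the maximum principle (Lemma 2.6).
```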
In particular, since ABV is densely contained in L p loc (R d ) and S(s, t) is continuous on L p loc , Proposition 4.2 implies that belonging to ABV is a suitable criterion for the propagation of compactness. Notice, for instance, that this provides another proof of the fact that the convergence statements in Theorem 4.1 hold with respect to strong convergence in C([0, T ], L p loc ) for p < ∞. We now demonstrate how the propagation of the increasing or decreasing property leads to a method for characterizing solutions of (4.1), independently of the solution formula. The idea is to regularize u in a one-sided manner, as in subsection 2.3.
Remark 4.2. A well-posed notion of sub- and supersolutions can be defined where u is approximated using, instead of the one-sided mollifiers, the method of inf- and sup-convolution: In particular, for any decreasing u T : R d → R, there exists a unique solution u of (4.1) in the sense of Definition 4.1, which is given by u(t, •) = S(t, T )u T , and which is continuous a.e. in [0, T ] × R d .
Proof. Set u ε = u * ρ ε and v ε = v * ρ ε ; then, combining the inequalities for u ε and v ε given by Definition 4.1, we obtain Let β : R → R be smooth and increasing. Multiplying the above inequality by the positive term β We may then take β(r) = r + p (arguing with an extra layer of regularization if p = 1) and find that, in the sense of distributions, Let ψ be smooth and decreasing, such that ψ = 1 on [0, 1] and ψ = 0 on [2, ∞), and set ψ(t, x) := ψ(|x|/R(t)) for some R : [0, T ] → [0, ∞) to be determined. Using ψ as a test function, we discover where in the last line we used the fact that div b ≥ 0 and ψ ≥ 0. Using the fact that ψ ′ ≤ 0, with ψ ′ ≠ 0 only if r ∈ [1, 2], we find that For t 0 ∈ [0, T ], this is made nonnegative on [t 0 , T ] by choosing, for any fixed R > 0, and so We may then choose functions ψ that approximate 1 [0,1] from above, and then, by the monotone convergence theorem, The proof of (4.6) is finished upon sending ε → 0 and setting The comparison inequality (4.6) implies uniqueness for a solution with terminal condition u T , and so it remains to show that S(t, T )u T is a solution in the sense of Definition 4.1.
Let (b δ ) δ>0 be a family of smooth approximations of b satisfying (3.2) with C 1 = 0 and uniform C 0 , and let u δ T = u T * γ δ for a family of standard mollifiers (γ δ ) δ>0 .Note then that, for δ > 0, u δ T is decreasing, and, as δ → 0, u δ T → u T in L p loc for all p < ∞.Let u δ be the corresponding solution of the terminal value problem (4.1).It follows that u δ (t, •) is decreasing for all t ∈ [0, T ].
For any δ > 0 and ε > 0, where the last inequality follows from the fact that ρ ε (x − y) ≠ 0 only if x ≤ y, b δ (t, •) is increasing, and ∇u δ ≤ 0. Sending δ → 0, it follows from Theorem 4.1 that, in the sense of distributions, if u = S(•, T )u T and It follows that S(•, T )u T is a subsolution. A similar argument shows that it is a supersolution, and therefore the unique solution in the sense of Definition 4.1. Now, for some M > 0, define ψ(t) = exp(M ω 1 (t)) and ũ δ (t, x) := u δ (t, x + ψ(t)1).
For any fixed R > 0, there exists M > 0 such that, in view of the linear growth of b δ given by (3.2) uniformly in δ, and the fact that u δ is decreasing in the spatial variable, for any Since M is independent of δ, we may send δ → 0 and conclude that ũ(t, •) is decreasing, and therefore continuous a.e. in that set by Lemma 2.3. The transformation leading from u to ũ preserves null sets, and we conclude that u is continuous almost everywhere.
Remark 4.3. The idea behind Definition 4.1 is to establish a sign for the commutator between convolution and differentiation along irregular vector fields, as compared with the work of DiPerna and the first author [28], where the commutator is shown to be small for Sobolev vector fields. We must take convolution kernels with a specific one-sided structure in order to analyze the commutators; a less crude example of this idea is seen in the work of Ambrosio [2] for general BV velocity fields.
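The one-sided commutator computation behind this remark can be sketched explicitly. For smooth decreasing u, increasing b(t, ·), and a mollifier ρ ε with ρ ε (x − y) ≠ 0 only if x ≤ y:

```latex
b\cdot\nabla (u * \rho_\varepsilon)(x) - \big((b\cdot\nabla u) * \rho_\varepsilon\big)(x)
 = \int \rho_\varepsilon(x - y)\,\big[b(t,x) - b(t,y)\big]\cdot \nabla u(y)\,dy \;\ge\; 0,
```

since, on the support of the integrand, x ≤ y, so b(t, x) − b(t, y) ≤ 0 componentwise, while ∇u(y) ≤ 0. It is this one-sided sign of the commutator, rather than its smallness as in [28], that drives Definition 4.1.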
A different notion of sub- and supersolutions, which also selects S(•, T )u T as the unique solution of (4.1), can be obtained by instead regularizing b. Recall that we have assumed without loss of generality that b is increasing. Then a decreasing function u can be said to be a sub (super)solution if, for all ε > 0, in the sense of distributions, where b ε ≤ b ≤ b ε are one-sided regularizations of b, for example the one-sided mollifiers or the inf- and sup-convolutions. The notion of sub- and supersolution in Definition 4.1 turns out to be more amenable to the study of the nonlinear systems in Section 5.
Remark 4.4. Throughout this section, we have studied the setting where u T , and therefore u(t, •) for t < T , are decreasing. The same analysis can be carried out for increasing solutions, in which case the inequalities in (4.12) are reversed. Note, in particular, that the solution flows φ t,s (x) are vector-valued solutions in this sense; i.e.
We now observe that the full family of solution operators S(s, t) : L p loc → L p loc , 0 ≤ s ≤ t ≤ T , can be constructed independently of the ODE flows φ s,t , which can then be recovered with the theory of renormalization.Indeed, Definition 4.1, and its counterpart for increasing solutions, can be used to define S(s, t)ū for any ū ∈ ABV .The density of ABV in L p loc (R d ), and the L p -continuity of S(s, t), then allow the solution operators to be continuously extended to L p loc .It is, however, not clear whether solutions of (4.1) can be characterized for arbitrary u T ∈ L p loc , other than by the formula (4.3) or as limits of solutions to regularized equations.

Lower order terms
We briefly explain how to extend the above results to equations with additional lower-order terms, as in (4.7) Then there exists a unique function u ∈ C([0, T ], L p loc ) with the following properties: (a) There exist C 2 ∈ L 1 + ([0, T ]) and, for R > 0, a modulus ω R : [0, T ] → [0, ∞), depending only on the assumptions in (3.2) and (4.8), such that, for R > 0, (b) Let (b ε ) ε>0 , (c ε ) ε>0 , and (d ε ) ε>0 be families satisfying (3.2) and (4.8) uniformly in ε, such that, as ε → 0, (b ε , c ε , d ε ) → (b, c, d) a.e. Let (u ε T ) ε>0 be a family of smooth functions approximating u T in L p loc , and let u ε be the corresponding solution of (4.7). Then, as ε → 0, u ε converges strongly to u in C([0, T ], L p loc ). The same statement is true if u ε solves (4.9)

Analogous statements hold when
T , and u ε as in the statement of part (b), if (φ ε t,s (x)) s,t∈[0,T ] denotes the corresponding smooth flow, we have the formula The convergence statements, and thus the formula in part (c), are then proved just as in Theorem 3.2. In particular, arguing just as in that proof, we have , in view of Lemma 3.1, and so we conclude parts (b) and (c) by the dominated convergence theorem. The L p -estimates in part (a) are proved just as before, either from the regularized equation itself or from (4.10), using the lower bound on the divergence of b.
Sub- and supersolutions can be characterized when u T is increasing/decreasing under the additional assumption that A solution is both a sub- and supersolution.
Finally, the following is proved almost identically to Theorem 4.2.
In particular, for any decreasing u T ∈ L p loc (R d ), there exists a unique solution u of (4.7) in the sense of Definition 4.2, which is given by (4.10), and which is continuous a.e. in [0, T ] × R d .

The conservative equation
In contrast with the theory for the nonconservative equation, solutions of (4.2) belonging to Lebesgue spaces can be understood in the sense of distributions. However, under the general assumption (3.2), they are not in general unique, as the simple example b(x) = sgn x on R shows. Drawing once more an analogy with the setting of half-Lipschitz velocity fields studied in [12, 13, 45], the "good" (stable) solution of (4.2) is identified using a particular solution formula, and, in [45], this is shown to coincide with the pushforward of the initial density by the regular Lagrangian flow. As discussed in subsection 3.4 above, the former strategy is unavailable here; however, in view of the theory for the nonconservative equation and the forward regular Lagrangian flow built in the previous section, we may define solutions by duality.

Duality solutions
For 0 ≤ s ≤ t ≤ T , denote by S * (t, s) the adjoint of the solution operator S(s, t). Theorem 4.5. Let 1 ≤ p ≤ ∞ and f 0 ∈ L p loc , and define and f is a distributional solution of (4.2) and where φ is the flow obtained in Section 3.
The same is true if f ε is taken to be the unique smooth solution of and analogous convergence statements hold in the weak-⋆ sense if p = ∞.
Proof. The identity (4.14) follows immediately from (4.3); observe that it is well defined for f 0 ∈ L p loc in view of the regular Lagrangian property (3.8).
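The duality defining f = S * (·, 0)f 0 can be spelled out: for a test function u, the representation formula (4.3) gives S(0, t)u = u ∘ φ t,0 , and hence, as a sketch,

```latex
\int_{\mathbb{R}^d} f(t,x)\,u(x)\,dx
 = \int_{\mathbb{R}^d} f_0(x)\,\big(S(0,t)u\big)(x)\,dx
 = \int_{\mathbb{R}^d} f_0(x)\,u\big(\phi_{t,0}(x)\big)\,dx,
```

that is, f(t, ·) = (φ t,0 ) # f 0 is the pushforward of the initial density by the forward regular Lagrangian flow, which is meaningful in L p precisely because of the no-concentration property (3.8).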
We now prove the convergence statements; it suffices to consider p < ∞, since L ∞ loc ⊂ L p loc for any p < ∞. Let (b ε ) ε>0 , (f ε 0 ) ε>0 , and (f ε ) ε>0 be as in the statement. In view of the lower bound on the divergence, it is straightforward to prove a priori L p bounds. Namely, for all R > 0, there exists a modulus of continuity ω R : [0, ∞) → [0, ∞), depending only on T and C 0 from (3.2), such that, for all ε > 0 and t ∈ [0, T ], (4.15) It follows that there exists a subsequence (ε n ) n∈N along which f ε n converges weakly as n → ∞; using the fact that b ε converges strongly to b in L p loc for any p < ∞, we find that f is a distributional solution of (4.2), and thus, moreover, f ∈ C([0, T ], L p loc,w (R d )). The weak-⋆ convergence statements when p = ∞ are proved analogously.
Fix u ∈ C 1 c (R d ) and t 0 ∈ [0, T ], and let u ε denote the solution of (4.1) with terminal value u ε (t 0 , •) = u; in view of the results of the previous subsection, Exploiting the duality of the equations and integrating by parts gives Sending n → ∞ for ε = ε n gives the identity It follows that f (t 0 , •) = S * (t 0 , 0)f 0 , and we therefore have the full convergence as ε → 0. An identical argument proves the convergence statement for the vanishing viscosity limit. Finally, the fact that f is continuous from [0, T ] into L p loc (R d ) with the strong topology is a consequence of the continuity of the upper bound in the L p -estimate (4.15) (see Remark 4.1). Remark 4.5. It is clear from the proof above that, when 1 < p < ∞, the initial functions f ε 0 need only converge weakly in L p loc to f 0 as ε → 0.
Remark 4.6. It is not clear whether the convergence of regularizations f ε to the duality solution f of (4.2) can be upgraded to strong convergence, except when d = 1, in which case (3.2) coincides with a half-Lipschitz condition on b (see [45]).

Nonnegative solutions and renormalization
It is by now standard in the theory of the continuity equation (4.2) that nonnegative solutions are unique when the measure f 0 is concentrated on sets where the ODE (3.1) has a unique solution; see [2, 3]. In view of Theorem 3.1, we then have the following: We present here an alternative proof using the characterization of f as the duality solution, in order to emphasize again that the theory in this section can be developed independently of the analysis of the ODE (3.1) in the previous section.
Proof of Theorem 4.6. By Theorem 4.5, f = S * (•, 0)f 0 is a distributional solution, and its nonnegativity can be seen through an approximation argument, since weak convergence preserves nonnegativity.
Assume now that f ∈ C([0, T ], L p loc (R d )) is a nonnegative distributional solution of (4.2). Fix t 0 ∈ [0, T ] and a decreasing function ū : R d → R, and let u = S(•, t 0 )ū. Then u(t, •) is decreasing for t ∈ [0, t 0 ] and is a solution of (4.1) in the sense of Definition 4.1. In particular, u ε = u * ρ ε and u ε = u * ρ ε satisfy respectively where we have used the nonnegativity of ψ and f . Similarly, Sending ε → 0, we conclude that By linearity, the same is true for all increasing ū as well, and therefore, by a density argument, for all ū ∈ L p ′ loc (R d ). In particular, we may take ū ∈ C c (R d ), in which case S(•, t 0 )ū is supported in [0, T ] × B R for some R > 0, by the finite speed of propagation property. To conclude, we may then take ψ ∈ C 1 c (R d ) such that ψ ≡ 1 in B R , and therefore also ∇ψ = 0 in B R .
Corollary 4.1. Assume that f and |f | are both distributional solutions of (4.2). Then f is the unique duality solution of (4.2); that is, f (t, •) = S * (t, 0)f (0, •). Corollary 4.1 gives a sufficient criterion for a distributional solution to be the correct duality solution. However, we do not know whether this condition is necessary. In other words, it is an open question whether |S * (t, s)f | = S * (t, s)|f | for any f ∈ L p loc . This renormalization property is equivalent to a kind of injectivity for the forward flow φ t,s , which we describe with the next result.
Then the following statements are equivalent: Let B ⊂ R d be measurable. Then, by Theorem 4.5, It follows that (S * (t, s)ρ)1 B = 0 a.e. if and only if (S * (t, s)1 A )1 B = 0 a.e., whence (4.16). It now follows that (b) is equivalent to (4.17), and so this is equivalent to (S * (t, s)f ) ± = S * (t, s)f ± , and thus (a).
Either of the two renormalization properties would follow from the strong convergence of regularizations in Theorem 4.5.However, we do not know at this time whether the strong convergence actually holds; see Remark 4.6.
The characterization in part (b) of Proposition 4.4 is a reformulation of the renormalization property in terms of the injectivity of the flow. For instance, we have the following. Lemma 4.1. Assume that b satisfies (3.2), let 0 ≤ s ≤ t ≤ T and f ∈ L p loc , and suppose that Then the renormalization property in Proposition 4.4 is satisfied.
Proof. It suffices to establish part (b) in Proposition 4.4. Let A ± = {±f > 0}. Then, for any measurable Let B be any finite-measure set contained in We conclude that φ ♯ t,s 1 A + (y) and φ ♯ t,s 1 A − (y) are both positive only if φ −1 t,s ({y}) intersects both A + and A − . In view of the assumption of the lemma, the set of such y has measure 0. This establishes property (b) of Proposition 4.4.
Remark 4.7. Observe that, when d = 1, φ t,s is actually injective for any 0 ≤ s ≤ t ≤ T . Notice also that, for the drifts introduced in Examples 3.1 and 3.2, the corresponding flow has the property that φ −1 t,s ({y}) is at most a singleton for any 0 ≤ s ≤ t ≤ T and a.e. y ∈ R d .
In view of the regular Lagrangian property, a kind of injectivity of the flow can be seen for suitably ordered sets. Suppose that x, y ∈ R d , x ≤ y, and φ t,s (x) = φ t,s (y). Then it cannot be true that x i < y i for all i = 1, 2, . . . , d. If this were the case, then φ t,s would be constant on the cube [x, y], which violates the regular Lagrangian property. The following then follows from Lemma 4.1.
Proposition 4.5.Assume that f ∈ L p loc (R d ) and, for a.e.x, y ∈ R d such that f (x) > 0 and f (y) < 0, either x < y or y < x.Then renormalization is satisfied for S * (t, s)f for all 0 ≤ s ≤ t ≤ T .
The condition on f in Proposition 4.5 is satisfied if there exist cubes

Some remarks on "time-reversed" equations
As discussed in subsection 3.4, for velocity fields b satisfying (3.2), there is no satisfactory notion of reverse flow for the ODE (3.1). Nevertheless, we can indirectly make sense of the backward Jacobian det(D x φ 0,t (x)), which, formally, should be the solution of (4.2) with f 0 = 1, that is, (4.18) J(t, •) := S * (t, 0)1.
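In the one-dimensional example b(x) = sgn x, this backward Jacobian can be computed by hand (a sketch): the forward flow is φ t,0 (x) = x + t sgn x for x ≠ 0, so, for any test function u,

```latex
\int_{\mathbb{R}} J(t,x)\,u(x)\,dx
  = \int_{\mathbb{R}} u\big(x + t\,\operatorname{sgn}(x)\big)\,dx
  = \int_{\{|y| > t\}} u(y)\,dy,
\qquad\text{i.e.}\qquad
J(t,\cdot) = \mathbf{1}_{\{|x| > t\}}.
```

Thus J vanishes exactly on the vacuum region (−t, t) created by the forward flow and is bounded, consistent with the lower bound on div b.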
2) uniformly in ε and converge a.e. to b as ε → 0, and if φ ε 0,t is the solution of Proof. Items (a) and (b) follow immediately from Theorem 4.2. To prove (c), let b ε and φ ε 0,t be as in the statement, and let f ε be the solution of (4.2) with drift b ε and initial condition f ε 0 = f * ρ ε , where ρ ε is a standard mollifier. Then, for (t, x) The statement follows upon sending ε → 0 and appealing to the weak convergence of f ε and det(D x φ ε 0,t ). Continuing the formal discussion from above, note that, if f 0 and b are smooth, then v(t, x) = f 0 (φ 0,t (x)) solves the initial value problem for the transport equation The time direction in (4.19) is forward, in contrast with (4.1), where it is backward. We therefore cannot appeal to the theory for that equation. Nevertheless, if f 0 ∈ L ∞ and b satisfies (3.2), then a candidate for the solution of (4.19) is which, by Proposition 4.6(c), is a well-defined bounded function. Note, however, that studying the stability properties of the formula (4.20) is complicated by the fact that J and S * (t, 0)f 0 are stable only under weak convergence in C([0, T ], L p loc ). To complement (4.19), we also consider the terminal value problem for the continuity equation: The formula in this case should be (4.22) g(t, x) = g T (φ T,t (x)) det(D x φ T,t (x)).
In fact, both terms in the product have meaning: u(t, x) := g T (φ T,t (x)) is the solution of (4.1) with terminal value g T , and det(D x φ T,t (x)) is well-defined almost everywhere by Lemma 2.3 and the fact that φ T,t is increasing. Furthermore, by regularizing b and taking weak distributional limits, it turns out that det(D x φ T,t (x)) is a measure bounded from below. However, u is not continuous in general, and so it is not possible to make sense of the product in (4.22). This is exactly what leads to multiple measure-valued solutions of the equation in general; see the discussion of Example 3.2 above.

Second-order equations
We finish this section by briefly demonstrating how the first-order results can be extended to the second-order equations where b satisfies (3.2) and (3.17), and a : [0, T ] × R d → S d is given by a(t, x) = 1/2 σ(t, x)σ(t, x) T and σ is a matrix-valued function satisfying (3.16). Observe that this means that (4.25) a ij (t, x) depends only on (t, x i , x j ) ∈ [0, T ] × R 2 for all i, j = 1, 2, . . . , d. (c) For any where Φ denotes the stochastic flow from Theorem 3.5.
Proof.The argument follows almost exactly as in the first order case (Theorem 4.1), using the stability and uniqueness results in Theorem 3.5 for the SDE (3.15) (recall that we are assuming the bounded-oscillation condition (3.17) in addition to (3.2)).Upon regularizing the velocity field b, the formal a priori L p estimate in part (a) can be made rigorous, which, in particular, gives boundedness of the solution operator on L p for all p ∈ [1, ∞), uniformly in ε > 0, so that the initial datum u 0 can always be assumed to belong to C c (R d ) without loss of generality.The existence and uniqueness of the strong limit and its identification with the formula in part (c) are then a consequence of Theorem 3.5.
Remark 4.8. If u T is increasing (decreasing), then the same is true of u(t, •) for all t ∈ [0, T ], which, again, can be checked with the representation formula (4.26), or by differentiating the equation and using (4.25). For now, we do not discuss the question of characterizing solutions in a PDE sense.
Remark 4.9. Similar results can be obtained for the equation with lower-order terms (4.27) Theorem 4.8. For every f 0 ∈ L p , 1 ≤ p < ∞, there exists a distributional solution of (4.24) with the following properties: (a) f is obtained uniquely from weak limits in C([0, T ], L p ) upon replacing b with a regularization, or from vanishing viscosity limits, and the resulting solution operator is bounded on L p , with bound depending only on the assumptions on b, for all p ∈ [1, ∞). (c) For all t ∈ [0, T ], (4.28) where Φ t,0 is the stochastic flow from Theorem 3.5. Thus, if f 0 is a probability density and X 0 is a random variable independent of the Wiener process with density f 0 , it follows that f (t, •) is the probability density function of Φ t,0 (X 0 ) (which is absolutely continuous in view of the regular Lagrangian property).
Proof.The proofs of parts (a)-(c) proceed similarly to the proof of Theorem 4.5, by first regularizing b, proving uniform L p -estimates, and passing to the limit, exploiting the uniqueness results for (3.15).The uniqueness of nonnegative solutions in part (d) now follows from the Ambrosio-Figalli-Trevisan superposition principle; see for instance [29,55].

Nonlinear transport systems
We now turn to the study of the nonlinear transport systems discussed in the introduction, that is, (5.1) where, for some integer m ≥ 1, u : We also consider the associated forward-backward system of characteristics posed for fixed (t, x)

Weak solutions
We introduce assumptions on f , g, and u T that, upon freezing u, make the equation (5.1) exactly of the form of the nonconservative linear equations studied in the previous section. This then leads to a natural notion of weak solution via a fixed point operator. Assume (5.3) and, for some C 0 ∈ L 1 + ([0, T ]), a.e. t ∈ [0, T ], and all i, j ∈ {1, 2, . . . , d} and k, ℓ ∈ {1, 2, . . . , m}, Under these assumptions, any solution operator for (5.1) should preserve the decreasing property of solutions, which we show formally assuming the data are smooth. Proposition 5.1. Assume f , g, and u T are smooth with uniformly bounded derivatives, f and g satisfy (5.3), and u T : R d → R m is decreasing with respect to (2.1). If u is a smooth solution of the terminal value problem (5.1), then, for all t ∈ [0, T ], u(t, •) is decreasing.
We now make the connection with the linear transport equation theory of the previous sections. In particular, note that if u(t, •) is decreasing for all t ∈ [0, T ], then b(t, x) := f (t, x, u(t, x)) satisfies the assumptions of (3.2), while c(t) = C 0 (t) and d(t, x) = g(t, x, u(t, x)) + C 0 (t)u(t, x) satisfy (4.11). is decreasing in x, satisfies (5.4), and, formally, solves the equation (5.1) with f and g replaced by Proof of Theorem 5.1. As described above, we assume without loss of generality that C 0 ≡ 0. For some C > 0 to be determined, set We define a map S on L as follows: for u ∈ L, let v := S(u) be the solution, as in Theorem 4.3, of the linear transport equation Then, in view of the solution formula (4.10) and the bounds (5.3) on f and g, there exists a sufficiently large C, depending on u T , such that S maps L into L.
We now note that L forms a complete lattice under the partial order that is, every subset of L has a greatest lower bound and least upper bound, which is a consequence of the uniformly bounded linear growth of solutions in L. Suppose now that u, ũ ∈ L satisfy u ≤ ũ under the order (5.6), and set b(t, x) := f (t, x, u(t, x)) and d(t, x) := g(t, x, u(t, x)).
Then (5.3) with C 0 ≡ 0 implies that b(t, x) ≥ f (t, x, ũ(t, x)) and d(t, x) ≤ g(t, x, ũ(t, x)), and so, in particular, v := S(u) and ṽ := S(ũ) are respectively a sub- and supersolution of the linear equation (4.7) with c ≡ 0. It follows from Theorem 4.4 that v ≤ ṽ, and therefore S is increasing on the complete lattice L with respect to the partial order (5.6). The existence of unique maximal and minimal solutions is now a consequence of the Tarski lattice-theoretical fixed point theorem [54]. The continuity a.e. in [0, T ] × R d of u + and u − now follows from Theorem 4.4. Observe now that any version of the maximal solution u + is a solution in the sense of Definition 5.1. Because u + (t, •) is decreasing, it is continuous a.e., and its maximal version is upper semicontinuous, which is therefore also the unique maximal solution in the everywhere-pointwise sense. A similar argument shows that the minimal everywhere-pointwise solution u − is lower semicontinuous in the spatial variable, and we conclude.
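For the reader's convenience, we recall the fixed point theorem of Tarski [54] in the form used above:

```latex
\text{If } (L,\le) \text{ is a complete lattice and } S : L \to L \text{ is order-preserving,}
\quad\text{then}\quad
\operatorname{Fix}(S) := \{u \in L : S(u) = u\}
\quad\text{is a nonempty complete lattice.}
```

In particular, Fix(S) contains a greatest element u + and a least element u − , which is exactly what produces the maximal and minimal solutions above; no continuity of S is required, only monotonicity.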
The fixed point theorem of [54] used above further characterizes u + and u − as u + = sup {u ∈ L : S(u) ≥ u} and u − = inf {u ∈ L : S(u) ≤ u} , where the sup and inf are understood with respect to the partial order (5.6).We alternatively characterize the maximal and minimal solutions in terms of appropriately defined sub and supersolutions of the equation (5.1).
Recall that ρ ε and ρ ε are the one-sided mollifying functions from subsection 2.3. We then define a notion of sub- and supersolution. Definition 5.2. Assume f and g satisfy (5.3) with C 0 ≡ 0 and u T is decreasing. A function u : In other words, sub- and supersolutions in the sense of Definition 5.2 are sub- and supersolutions of the corresponding linear equations identified in Definition 5.1. It follows that a function which is both a sub- and supersolution is in fact a solution.
Proof. We prove only the subsolution statement, as the other is proved analogously. It is clear that max{u, v} = u ∨ v, and so a simple application of the chain rule gives the corresponding inequality for u ∨ v. In particular, for any smooth positive test function, the weak formulation holds, where the last term is understood as the pairing between locally finite measures and continuous functions, because f(t, x, (u ∨ v)(t, x)) is BV in x. We may then approximate max(·, ·) with such functions Ψ and determine that, in the distributional sense, u ∨ v satisfies the subsolution inequality. For δ > 0, we convolve both sides of this inequality with the one-sided mollifier ρ_δ and obtain, in view of (5.3), the mollified inequality. For fixed δ, as ε → 0, (u^ε ∨ v^ε) * ρ_δ → (u ∨ v) * ρ_δ locally uniformly, and we may therefore send ε → 0 in the above inequality, again using f ∈ BV, to obtain the desired subsolution inequality for (u ∨ v) * ρ_δ.
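Schematically, this is the standard fact that the maximum of two subsolutions is again a subsolution. The sign conventions below are an assumption for illustration; they must be read off from equation (5.1), whose displayed form is not reproduced in this excerpt:

```latex
% Schematic statement (sign conventions are an assumption here):
% if u and v satisfy, in the sense of distributions,
%   \partial_t u + f(t,x,u)\cdot Du \ge -g(t,x,u),
% and likewise for v, then w := u \vee v satisfies
\[
  \partial_t w + f(t,x,w)\cdot Dw \;\ge\; -g(t,x,w)
  \quad\text{in } \mathcal{D}'\big((0,T)\times\mathbb{R}^d\big),
\]
% the key points being that w = u a.e. on \{u > v\}, w = v a.e. on
% \{v > u\}, and the chain rule for BV functions controls the
% contribution of the set \{u = v\}.
```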
Proposition 5.3. Let f and g satisfy (5.3) with C_0 ≡ 0, and assume u_T has linear growth and is decreasing. Let u^+ and u^− be the maximal and minimal solutions from Theorem 5.1. Then, in the sense of Definition 5.2, u^+ is the pointwise maximum of all subsolutions, and u^− is the pointwise minimum of all supersolutions.
Remark 5.1. In view of Lemma 5.2, the maximum/minimum in Proposition 5.3 may be restricted to sub/supersolutions that are continuous a.e. in [0, T] × R^d.
Proof of Proposition 5.3. We prove only the statement for u^+, since the proof for u^− is analogous. Let ũ^+(t, x) := sup{u(t, x) : u is a subsolution}.
Because u^+ is itself a subsolution, we clearly have u^+ ≤ ũ^+. Suppose now that u is a subsolution of (5.1), and let v be the solution of the associated linear equation. It then holds that u and v are respectively a subsolution and a solution of the linear equation (4.7) with b(t, x) = f(t, x, u(t, x)), c(t, x) = 0, and d(t, x) = g(t, x, u(t, x)). Then Theorem 4.4 implies u ≤ v. Note also that v belongs to the lattice L from the proof of Theorem 5.1, because v(T, ·) = u_T. If S is the fixed-point map from the proof of that theorem, we then set w = S(v), which, by a similar argument, satisfies w ∈ L and S(v) = w ≥ v. By the characterization of u^+ via the Tarski fixed point theorem, we must have v ≤ u^+, hence u ≤ u^+, and therefore ũ^+ ≤ u^+.
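In summary, the proof rests on the following chain of inequalities, with S the fixed-point map and u^+ characterized as recalled after Theorem 5.1:

```latex
\[
  u \;\le\; v, \qquad S(v) \;\ge\; v
  \;\;\Longrightarrow\;\;
  v \;\le\; \sup\{w \in L : S(w) \ge w\} \;=\; u^+
  \;\;\Longrightarrow\;\;
  u \;\le\; u^+,
\]
% and taking the supremum over all subsolutions u gives
% \tilde u^+ \le u^+.
```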

Continuous solutions
We now investigate when the maximal and minimal solutions u^+ and u^− identified in the previous subsection coincide. In this subsection, we prove a comparison principle for continuous sub- and supersolutions, so that, in particular, u^+ = u^− if both are continuous. As we will see by example in the forthcoming subsections, this can fail in general if the assumption of continuity is dropped. We first introduce a different but equivalent notion of solution for continuous solutions, using the theory of viscosity solutions. Throughout this subsection we assume, in addition to (5.3), that

(5.7) f and g are uniformly Lipschitz continuous and bounded, and C_0 ≥ 0 is constant.
Theorem 5.2. Assume that f and g satisfy (5.3) and (5.7), and let u and v be respectively a bounded sub- and supersolution of (5.1) in the sense of Definition 5.3 such that either u is continuous and v is lower-semicontinuous, or u is upper-semicontinuous and v is continuous. If u(T, ·) ≤ v(T, ·), then u(t, ·) ≤ v(t, ·) for all t ∈ [0, T].
Proof. The proofs of the two statements are almost identical, so we prove only the statement when u is continuous and v is lower-semicontinuous. We first prove the result under the additional assumption that, for some δ > 0, u is a subsolution of the equation perturbed by δ; then, if λ is sufficiently large and β is sufficiently small, depending on δ, we have φ(T) < 0, and therefore t̄ < T. We next claim that t̄ < 0 for all sufficiently large λ and small β, which implies φ(t) ≤ 0 for all t ∈ [0, T] and hence gives the result. If this were not true and 0 ≤ t̄ < T, then choose i, x̂, and ŷ attaining the supremum in the definition of φ(t̄). We then have

(5.10) u_j(t̄, x̂) − v_j(t̄, ŷ) ≤ u_i(t̄, x̂) − v_i(t̄, ŷ) for all j = 1, 2, . . ., m,

and, from the fact that u and v are decreasing,

(5.11) λ(x̂_j − ŷ_j) + β x̂_j ≤ 0 and λ(x̂_j − ŷ_j) − β ŷ_j ≤ 0 for all j = 1, 2, . . ., d.
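The inequalities (5.11) are consistent with a doubling-of-variables penalization of the following form; this is a plausible reconstruction for illustration, since the displayed definition of φ does not survive in this excerpt:

```latex
% A penalization consistent with (5.11) (an assumption for
% illustration; the paper's exact definition of \varphi is omitted
% in this extraction):
\[
  \varphi(t) \;=\; \max_{1\le i\le m}\,\sup_{x,y\in\mathbb{R}^d}
  \Big[\, u_i(t,x) - v_i(t,y)
    - \tfrac{\lambda}{2}|x-y|^2
    - \tfrac{\beta}{2}\big(|x|^2 + |y|^2\big) \Big].
\]
% At a maximum point (\hat x, \hat y), the smooth penalty touches
% u_i(t,\cdot) from above at \hat x; since u_i(t,\cdot) is
% decreasing, each coordinate-wise derivative is nonpositive:
%   \lambda(\hat x_j - \hat y_j) + \beta\hat x_j \le 0,
% and the analogous argument at \hat y for the decreasing v gives
%   \lambda(\hat x_j - \hat y_j) - \beta\hat y_j \le 0,
% which is exactly (5.11).
```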
Recall that the system, and, in particular, the equation for X, is viewed as a differential inclusion, where, for all s ∈ [t, T],

X_{s,t}(x) = x − (s − t) if x < c(t),    X_{s,t}(x) = x − c(t)(s − t)/(T − t) if x = c(t),    X_{s,t}(x) = x if x > c(t).
We note that, for any solution (U, X), we must have X_{s,t}(x) = x − (s − t) for x < 0 and X_{s,t}(x) = x for x ≥ T − t. However, for x ∈ [0, T − t], there is ambiguity in the speed at which X_{s,t}(x) travels: it can move with speed −1, 0, or anything in between, where in the latter case the characteristic is constrained to end at X_{T,t}(x) = 0. The precise value x ∈ [0, 1] for which X_{T,t}(x) = 0 thus encodes the choice of the shock-wave speed c(t) in the definition of u_c.
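The three regimes just described can be summarized as a differential inclusion for the characteristic speed. This is a schematic restatement; writing the region boundaries at the running time s, rather than at the initial time t, is an assumption:

```latex
\[
  \frac{d}{ds}\,X_{s,t}(x) \;\in\;
  \begin{cases}
    \{-1\}, & X_{s,t}(x) < 0,\\[2pt]
    [-1,\,0], & 0 \le X_{s,t}(x) \le T - s,\\[2pt]
    \{0\}, & X_{s,t}(x) > T - s,
  \end{cases}
  \qquad s \in [t, T],
\]
% with the additional constraint, in the intermediate regime, that
% the characteristic ends at the shock: X_{T,t}(x) = 0.
```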

Stochastic selection
An important feature of the theory of entropy solutions of scalar conservation laws is their stability under regularizations, and in particular under vanishing viscosity limits. In the above context, the entropy solution (5.21) of (5.19) arises as the strong limit in C([0, T], L^1_loc(R^d)), as ε → 0, of the unique smooth solution u^ε of (5.25), where

(5.26) u^ε_T : R^d → R^d is smooth and lim_{ε→0} u^ε_T = u_T.

By contrast, we show here that any solution u_c can arise as a limit of suitably regularized equations.
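For orientation, a vanishing viscosity approximation of the kind referred to in (5.25) takes, in the classical setting, the following schematic form. This is a generic sketch, not a verbatim reproduction of (5.25); the sign conventions and the way f and g enter are assumptions here:

```latex
\[
  \partial_t u^\varepsilon
  + f(t,x,u^\varepsilon)\cdot D u^\varepsilon
  + g(t,x,u^\varepsilon)
  \;=\; \varepsilon\,\Delta u^\varepsilon
  \quad\text{in } (0,T)\times\mathbb{R}^d,
  \qquad u^\varepsilon(T,\cdot) = u^\varepsilon_T,
\]
% a terminal-value problem whose smooth solutions converge, as
% \varepsilon \to 0, to the entropy solution in the classical theory.
```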
Remark 5.4. The restrictions that c′′ ∈ L^1 and that c′ lie strictly within (−1, 0) are in place to ensure that θ_c ∈ W^{1,1} and −∞ < θ_c < ∞, so that the equation for v^ε is well-posed. Achieving speeds c that are only Lipschitz, and for which c′ is allowed to attain the endpoint values −1 or 0, is possible by letting θ_c = θ^ε_c depend suitably on ε.

Theorem 3.4.
2, we may then extract a subsequence ε_n → 0 as n → ∞ such that φ̃^{ε_n}, and therefore φ^{ε_n}, converges in C([s, T], L^p_loc(R^d)) and in L^p_loc(R^d, C([s, T])). The same arguments as in Theorem 3.2 may then be used to conclude that any such limit is the unique regular Lagrangian flow φ from Theorem 3.1. If b is smooth, then ∂_s φ_{t,s}(x) = −b(s, φ_{t,s}(x)). It is then straightforward to show that all of the same theory above can be developed for the terminal-value problem for the flow corresponding to −b. For every t ∈ [0, T] and p

Theorem 3.5.
Assume b satisfies (3.2) and (3.17), and σ satisfies (3.16). Then the following hold with probability one, for all s ∈ [0, T]. (a) There exist solutions Φ^+_{·,s} and Φ^−_{·,s} of (3.18) such that, for all t ∈ [s, T], Φ^+_{t,s} is upper-semicontinuous and increasing and Φ^−_{t,s} is lower-semicontinuous and increasing. (b) Any other solution Φ of (3.18) satisfies Φ^ε_{s,s} = Id. Parts (a), (b), and (c) are then established exactly as in the proof of Proposition 3.1, making use of the comparison result of Lemma 3.3 above.

Theorem 4.1.
Assume that b satisfies (3.2), and define the solution operators in (4.3) using the flow constructed in Section 3. Then the following hold:

Remark 4.1.
The uniqueness of the semigroup is a consequence of the uniqueness of the flow established in the previous section. Note, however, that, solely under the assumption div b ≥ −C_1 for some C_1 ∈ L^1_+([0, T]), any weakly-limiting family of solution operators must lead to solutions in C([0, T], L^p_loc(R^d)), in the strong topology.

Definition 4.1 is then generalized as follows:
and if u is the solution identified in Theorem 4.7 of (4.23) in [0, t] × R^d with terminal value u(t, ·) = u, then

∫ f(t, x)u(x) dx = ∫ f_0(x)u(0, x) dx.