A MICROSCOPIC MODEL FOR THE BURGERS EQUATION AND LONGEST INCREASING SUBSEQUENCES

We introduce an interacting random process related to Ulam's problem, that of finding the limit of the normalized length of the longest increasing subsequence of a random permutation. The process describes the evolution of a configuration of sticks on the sites of the one-dimensional integer lattice. Our main result is a hydrodynamic scaling limit: The empirical stick profile converges to a weak solution of the inviscid Burgers equation under a scaling of lattice space and time. The stick process is also an alternative view of Hammersley's particle system that Aldous and Diaconis used to give a new solution to Ulam's problem. Along the way to the scaling limit we produce another independent solution to this question. The heart of the proof is that individual paths of the stochastic process evolve under a semigroup action which under the scaling turns into the corresponding action for the Burgers equation, known as the Lax formula. In a separate appendix we use the Lax formula to give an existence and uniqueness proof for scalar conservation laws with initial data given by a Radon measure.

1. The results. This paper constructs a Markov process related both to longest increasing subsequences of random permutations and to the Burgers equation, a first-order nonlinear partial differential equation. The process lives on the sites $i$ of the one-dimensional integer lattice $\mathbb{Z}$. Its state is a configuration $\eta = (\eta_i)_{i \in \mathbb{Z}}$ of nonnegative variables $\eta_i \in [0, \infty)$ that we picture as vertical sticks on the lattice sites. The state evolves in continuous time according to the following rule:
(1.1) At each site $i$, at rate equal to $\eta_i$, a random stick piece $u$ uniformly distributed on $[0, \eta_i]$ is broken off $\eta_i$ and added on to $\eta_{i+1}$.
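Rule (1.1) can be illustrated by a small event-driven simulation. The following is only a sketch under stated assumptions, not the construction used in this paper: it runs on a finite ring of sites (a periodic boundary that is not part of the model on $\mathbb{Z}$), and the names `simulate_sticks`, `eta`, `t_max` and `rng` are hypothetical.

```python
import random

def simulate_sticks(eta, t_max, rng):
    """Run rule (1.1) on a ring of len(eta) sites up to time t_max.

    Assumption (not from the paper): periodic boundary, so the piece broken
    off the last site is added to site 0 and total stick mass is conserved.
    """
    eta = list(eta)
    n = len(eta)
    t = 0.0
    while True:
        total = sum(eta)               # total event rate = sum of stick lengths
        if total <= 0.0:
            break
        t += rng.expovariate(total)    # waiting time until the next break
        if t > t_max:
            break
        # Choose site i with probability eta[i] / total.
        x = rng.random() * total
        i = 0
        while i < n - 1 and x >= eta[i]:
            x -= eta[i]
            i += 1
        u = rng.random() * eta[i]      # piece uniform on [0, eta_i)
        eta[i] -= u                    # break the piece off site i ...
        eta[(i + 1) % n] += u          # ... and add it to the right neighbour
    return eta
```

Choosing the site with probability proportional to $\eta_i$ and then cutting at a uniform height reproduces "rate $\eta_i$ at site $i$" because the superposition of the site clocks has total rate $\sum_i \eta_i$.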
Our main result is a hydrodynamic scaling limit: The process is scaled by speeding up time and shrinking lattice distance by a factor $N$. In the limit $N \to \infty$ the empirical profile of the stick configuration converges to a weak solution of the Burgers equation. The longest increasing subsequences appear in a rigorous construction of the process and also in the proof of the scaling limit. Motivation for the paper is discussed in Section 2. First we present our theorems, beginning with the existence of the process. According to the conventional interpretation of particle system generators (see Liggett's monograph [Lg]), an obvious formalization of rule (1.1) is the generator
$$Lf(\eta) = \sum_{i \in \mathbb{Z}} \int_0^{\eta_i} \bigl[ f(\eta^{u,i,i+1}) - f(\eta) \bigr]\, du,$$
where the configuration $\eta^{u,i,j}$ is defined, for $i, j \in \mathbb{Z}$ and $0 \le u \le \eta_i$, by
$$\eta^{u,i,j}_k = \begin{cases} \eta_i - u, & k = i, \\ \eta_j + u, & k = j, \\ \eta_k, & k \ne i, j. \end{cases}$$
$Lf$ is certainly well-defined for bounded continuous cylinder functions $f$ (functions depending on finitely many sticks only) on $[0, \infty)^{\mathbb{Z}}$. However, we shall not define the process through the generator, but by explicitly constructing the probability distributions of the process on the path space. Rule (1.1) shows that the process is totally asymmetric in the sense that stick mass moves only to the right. It is fairly clear that, for the process to be well-defined from an initial configuration $(\eta_i)$, we cannot allow arbitrarily fast growth of $\eta_i$ as $i \to -\infty$. We take the state space to be
$$Y = \Bigl\{ \eta \in [0, \infty)^{\mathbb{Z}} : \lim_{n \to -\infty} n^{-2} \sum_{i=n}^{-1} \eta_i = 0 \Bigr\}.$$
This is the largest possible state space for which our construction of the process works. It matches the class of admissible initial profiles of the partial differential equation, compare with (1.7) below.
We topologize Y with a new metric r defined below. A topology stronger than the usual product topology is necessary for uniform control over the sticks to the left of the origin and because Y is not even a closed subset of [0, ∞) Z . For real numbers a and b, let r 0 (a, b) = |a − b| ∧ 1, and for η, ζ ∈ Y set r(η, ζ) = sup n≤−1 It is not hard to see that (Y, r) is a complete separable metric space. A partial order on Y is defined stickwise: η ≥ ζ if η i ≥ ζ i for all i ∈ Z. It may be intuitively plausible from (1.1) that the process is attractive, and this will be proved by a coupling. By Feller continuity we mean that E η f(η(t)) is a continuous function of η for t > 0 and f ∈ C b (Y ), the space of bounded continuous functions on Y . As usual D([0, ∞), Y ) denotes the space of right-continuous Y -valued trajectories η(·) with left limits.
Theorem 1. There is a Feller continuous attractive Markov process on $Y$ with paths in $D([0, \infty), Y)$ such that
$$E^{\eta} f(\eta(t)) - f(\eta) = \int_0^t E^{\eta} Lf(\eta(s))\, ds \tag{1.4}$$
holds for all bounded continuous cylinder functions $f$ on $Y$ and all initial states $\eta \in Y$. The i.i.d. exponential distributions on $(\eta_i)_{i \in \mathbb{Z}}$ are invariant for the process.
Next we turn to the Burgers equation. In one space dimension this is the nonlinear conservation law
$$\partial_t u + \partial_x (u^2) = 0, \tag{1.5}$$
where $u(x, t)$ is a real-valued function defined for $(x, t) \in \mathbb{R} \times (0, \infty)$. The solutions of this equation may develop discontinuities even for smooth initial data. Hence in general we can hope to solve it only in a weak sense instead of finding a function $u(x, t)$ that is everywhere differentiable and satisfies (1.5) as it stands. Moreover, it turns out that our scaling limit does not require the initial data to even be a function, so we wish to allow as general initial conditions as possible. The following definition of weak solution turns out to be the right one for our purposes. Recall that the Radon measures on $\mathbb{R}$ are those nonnegative Borel measures under which bounded sets have finite measure, and that their vague topology is defined by declaring that $\nu_n \to \nu$ if $\nu_n(\phi) \to \nu(\phi)$ for all functions $\phi \in C_0(\mathbb{R})$ (compactly supported, continuous).
Definition 1. Let m 0 be a Radon measure on R. A measurable function u(x, t) on R×(0, ∞) is a weak solution of (1.5) with initial data m 0 if the following conditions are satisfied: (i) For a fixed t > 0, u(x, t) is right-continuous as a function of x.
(iv) For all $\phi \in C_0^1(\mathbb{R})$ (compactly supported, continuously differentiable) and $t > 0$,
Theorem 2. Let $m_0$ be a Radon measure on $\mathbb{R}$ such that
Then there exists a function $u(x, t)$ that satisfies Definition 1 and has the following additional properties:
(i) For a fixed $t > 0$, $u(x, t)$ is continuous as a function of $x$ except for countably many jumps, and $u(x-, t) \ge u(x+, t) = u(x, t) \ge 0$ for all $x$.
(ii) Among the solutions satisfying Definition 1, u(x, t) is uniquely characterized as the one with minimal flux over time: If v(x, t) is another weak solution satisfying (i)-(iv) of Definition 1, then for t > 0 and all x ∈ R such that m 0 {x} = 0, If equality holds in (1.8) for a.a. x ∈ R (in particular, for all x such that m 0 {x} = 0), then u(x, t) = v(x, t) for a.a. x. (iii) In case m 0 (dx) = u 0 (x)dx for some u 0 ∈ L ∞ (R), then u(x, t) is the unique entropy solution characterized by the existence of a constant E > 0 such that for all t > 0, x ∈ R, and a > 0, The uniqueness statement (last sentence in (ii)) is an immediate consequence of the form of the right-hand side of (1.6). A fairly involved proof shows that the entropy condition (1.9) guarantees uniqueness with L ∞ (R) initial data, see Theorem 16.11 in Smoller's monograph [Sm]. We prove Theorem 2 in the Appendix by proving the analogous result for a more general scalar conservation law in one space dimension.
Given $m_0$ satisfying (1.7), we make the following assumption on the sequence $\{\mu_0^N\}_{N=1}^{\infty}$ of initial distributions on the stick configurations:
(1.10) For all $-\infty < a < b < \infty$ and $\varepsilon > 0$,
Hidden in the assumption that $\mu_0^N$ be a measure on $Y$ is of course the condition that
$$\lim_{n \to -\infty} n^{-2} \sum_{i=n}^{-1} \eta_i = 0 \tag{1.11}$$
holds $\mu_0^N$-a.s. Additionally, our proof of the scaling limit requires a certain uniformity:
(1.12) There is a number $b \in \mathbb{R}$ such that for all $\varepsilon > 0$ we can find $q$ and $N_0$ that satisfy
Let $P^N$ be the distribution of the process on $D([0, \infty), Y)$ when the initial distribution is $\mu_0^N$, and write $\eta^N(t)$ for the process. The empirical measure $\alpha_t^N$ is the random Radon measure defined by
$$\alpha_t^N = N^{-1} \sum_{i \in \mathbb{Z}} \eta_i^N(t)\, \delta_{i/N}.$$
In other words, the integral of a test function $\phi \in C_0(\mathbb{R})$ against $\alpha_t^N$ is given by
$$\alpha_t^N(\phi) = N^{-1} \sum_{i \in \mathbb{Z}} \eta_i^N(t)\, \phi(i/N).$$
The definition of $\alpha_t^N$ incorporates the space scaling: The sticks now reside on the sites of $N^{-1}\mathbb{Z}$. The time scaling is introduced by explicitly multiplying $t$ by $N$. Assumption (1.10) says that $\alpha_0^N \to m_0$ in probability as $N \to \infty$, and our main theorem asserts that this law of large numbers is propagated by the stick evolution:
Theorem 3. Assume (1.7) and (1.10)-(1.12), and let $u(x, t)$ be the solution given in Theorem 2. Then for each $t > 0$, $\alpha_{Nt}^N \to u(x, t)\,dx$ in probability as $N \to \infty$, in the vague topology of Radon measures on $\mathbb{R}$. Precisely, for each $\phi \in C_0(\mathbb{R})$ and
Remark 1.13. Here are two examples of initial distributions satisfying (1.10)-(1.12) ($\eta_i^N$ denotes the stick $\eta_i$ as a random variable under $\mu_0^N$):
(i) For any $m_0$ satisfying (1.7), the deterministic initial sticks $\eta_i^N = N m_0[i/N, (i+1)/N)$ satisfy (1.10)-(1.12).
(ii) Suppose m 0 (dx) = u 0 (x)dx for a function u 0 ∈ L ∞ loc (R), and that u * (x) = ess sup x<y<0 u 0 (y) grows at most sublinearly: lim x→−∞ |x| −1 u * (x) = 0. Then we may take (η N i ) i∈Z independent exponentially distributed random variables with expectations E[η N i ] = Nm 0 [i/N, (i + 1)/N ). These claims are verified in Section 10. Furthermore, we show by example that the independent exponential sticks described in (ii) can fail to lie in Y if u * (x) grows too fast as x → −∞.
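The deterministic initial sticks of Remark 1.13(i) and the pairing of the empirical measure with a test function can be checked numerically. This is a minimal sketch, assuming that $m_0$ has no atoms and is described by a distribution function `U0` with $m_0[a, b) = U_0(b) - U_0(a)$; the helper names `initial_sticks` and `empirical_integral` and the finite index window are hypothetical.

```python
def initial_sticks(U0, N, i_min, i_max):
    """Deterministic sticks eta_i = N * m0[i/N, (i+1)/N) for i_min <= i < i_max.

    U0 is a hypothetical helper: a distribution function of m0, so that
    m0[a, b) = U0(b) - U0(a) (assumes m0 has no atoms, for simplicity).
    """
    return {i: N * (U0((i + 1) / N) - U0(i / N)) for i in range(i_min, i_max)}

def empirical_integral(eta, N, phi):
    """alpha^N(phi) = N^{-1} * sum_i eta_i * phi(i / N)."""
    return sum(e * phi(i / N) for i, e in eta.items()) / N

# Example: m0 = Lebesgue measure on [0, 1], tent test function supported on [0, 1].
U0 = lambda x: min(max(x, 0.0), 1.0)
phi = lambda x: max(0.0, 1.0 - abs(2.0 * x - 1.0))
N = 1000
eta = initial_sticks(U0, N, -N, 2 * N)
approx = empirical_integral(eta, N, phi)   # close to the exact integral 1/2
```

For this choice the sum is a Riemann sum for $\int \phi\, dm_0 = 1/2$, which is the convergence $\alpha_0^N \to m_0$ that assumption (1.10) asks for.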
These are the main results. The rest of the paper is organized as follows: In Section 2 we briefly discuss the general themes touched on by the paper. Section 3 describes Hammersley's particle process and Section 4 contains a string of technical results about it. Section 5 constructs the stick process in terms of Hammersley's process. Section 6 proves that the stick process is attractive through an alternative construction, utilized also in Section 7 to verify that i.i.d. exponential distributions are invariant. Section 8 proves the hydrodynamic scaling limit to an equation that contains an unknown parameter, namely the value $c = \lim_{n \to \infty} n^{-1/2} L_n$ where $L_n$ is the length of the longest increasing subsequence of a random permutation on $n$ symbols. In Section 9 we calculate $c = 2$ by combining the scaling limit of Section 8 with a judiciously chosen explicit solution of the Burgers equation. Section 10 proves the claims made in Remark 1.13 and presents examples to illustrate the need for the assumptions we have made. The Appendix develops the existence and uniqueness theorem for the p.d.e.
There is some independence among the sections: The Appendix can be read independently of everything else, and everything else can be read without the attractiveness and invariance proofs of Sections 6 and 7. The reader who wishes to understand how Theorem 3 and $c = 2$ are proved without wading through technicalities can read Sections 2 and 3, the first paragraph of Section 5, Section 8 without the proofs, and then Section 9. The complete proof of Theorem 1 runs from Section 3 to Section 7.
2. The context of the paper. Motivation for this paper comes from two sources: (1) Hydrodynamic scaling limits for asymmetric particle systems and (2) Ulam's problem, or the study of the longest increasing subsequence of a random permutation.
The asymmetric simple exclusion and zero range processes are the two interacting particle systems that have been studied as microscopic models for nonlinear conservation laws. A law of large numbers of the type of our Theorem 3 was first obtained by Rost [Ro] in 1981 for the totally asymmetric simple exclusion process. His result was valid only for the initial profile u 0 (x) = I (−∞,0) (x). Techniques developed over a decade, and in 1991 Rezakhanlou [Re] published results that admitted a general bounded initial profile u 0 ∈ L ∞ (R d ) and covered both the exclusion and the zero range process in several space dimensions. In comparison, our Theorem 3 admits an even more general initial profile, but our approach confines us to one space dimension.
Of particular interest is the study of the particle system at locations where the solution u(x, t) of the macroscopic equation has a discontinuity, or a shock. A recent review of such results is provided in [Fe]. We propose the stick model as an addition to the arsenal of models on which such questions can be studied. If these properties of the stick model are accessible, they can be compared with the wealth of results obtained for the exclusion process to see to what extent the results are model-specific. The stick model was originally developed to study microscopic mechanisms for the porous medium equation. See [ES,FIS,SU] for accounts of this work.
Ulam's problem is the evaluation of $c = \lim_{n \to \infty} n^{-1/2} L_n$ where $L_n$ is the length of the longest increasing subsequence of a random permutation on $n$ symbols. By now there are several proofs of $c = 2$. Our paper is partly inspired by the proof of Aldous and Diaconis [AD]. They turn Hammersley's [Ha] representation of $L_n$ in terms of a planar Poisson point process into an interacting particle system which they analyze in the spirit of Rost to deduce $c = 2$. This particle system, named after Hammersley in [AD], is turned in our paper into the stick model by mapping the interparticle distances to stick lengths. This connection is the same as the one between the simple exclusion and zero range processes that has been used by Kipnis [Ki] among others.
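The quantity $n^{-1/2} L_n$ is easy to sample. The sketch below is an illustration only and plays no role in the proofs: it computes $L_n$ with the standard patience-sorting method (a technique swapped in here for convenience, not taken from this paper), and the function names are hypothetical.

```python
import bisect
import random

def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[j] = smallest possible last element of an increasing run of length j + 1
    for x in seq:
        j = bisect.bisect_left(tails, x)
        if j == len(tails):
            tails.append(x)      # x extends the longest run found so far
        else:
            tails[j] = x         # x gives a smaller tail for runs of length j + 1
    return len(tails)

def normalized_lis(n, rng):
    """n^{-1/2} * L_n for one uniformly random permutation of n symbols."""
    perm = list(range(n))
    rng.shuffle(perm)
    return lis_length(perm) / n ** 0.5
```

For moderate $n$ the sampled values sit somewhat below 2, since the convergence to $c = 2$ is slow.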
To show the connection of the stick model with Ulam's problem, let us outline the proof of Theorem 3. At the heart of the proof is a link between the microscopic description (the stick model) and the macroscopic description (the p.d.e.) in terms of "distribution functions" of the two measures $\alpha_{Nt}^N$ and $u(x, t)\,dx$. Microscopically this is given by a configuration of particles $z(t) = (z_k(t))_{k \in \mathbb{Z}}$ on $\mathbb{R}$ that satisfy
$$z_{i+1}(t) - z_i(t) = \eta_i(t) \quad \text{for all } i \in \mathbb{Z}. \tag{2.1}$$
The macroscopic equivalent is a function $U(x, t)$ that satisfies
$$U(b, t) - U(a, t) = \int_a^b u(x, t)\, dx \tag{2.2}$$
for $a < b$. The particle dynamics $z(t)$ corresponding to the stick dynamics of Theorem 1 can be defined by
$$z_k(t) = \inf_{i \le k} \bigl\{ z_i + \Gamma((z_i, 0), t, k - i) \bigr\}. \tag{2.3}$$
Here $(z_i)$ is the initial particle configuration and $\Gamma((z_i, 0), t, k - i)$ is a random variable defined through Hammersley's point process representation of $L_n$ (precise definition follows in Section 3). Formula (2.3) identifies $z(t)$ as Hammersley's particle process. On the other hand, if $u(x, t)$ is the solution described in Theorem 2 then $U(x, t)$ satisfies
$$U(x, t) = \inf_{q \le x} \Bigl\{ U_0(q) + \frac{(x - q)^2}{4t} \Bigr\}. \tag{2.4}$$
By assumption (1.10), $N^{-1} z_{[Nq]} \to U_0(q)$ as $N \to \infty$. Thus the proof of Theorem 3 boils down to showing that $N^{-1} \Gamma((z_{[Nq]}, 0), Nt, [Nx] - [Nq])$ converges to $(x - q)^2/4t$. The proof of $c = 2$ is hidden in the identification of this limit, to which Section 9 is devoted. In the actual proof the process $z(t)$ and the function $U(x, t)$ become the primary objects, and $\eta(t)$ and $u(x, t)$ are defined by (2.1)-(2.2). The formal similarity of (2.3) and (2.4) acquires depth through various parallel properties of the evolutions. For example, there is a semigroup property in the obvious sense:
$$z_k(t) = \inf_{i \le k} \bigl\{ z_i(s) + \Gamma((z_i(s), s), t, k - i) \bigr\} \quad \text{and} \quad U(x, t) = \inf_{q \le x} \Bigl\{ U(q, s) + \frac{(x - q)^2}{4(t - s)} \Bigr\}$$
for any $0 \le s \le t$. What is most intriguing is that the semigroup (2.3) operates at the level of paths, or individual realizations, of the stochastic process, yet it matches with the action on the macroscopic level where all randomness has been scaled away. We conclude with some observations about the macroscopic equations of the three processes, the stick, the exclusion, and the zero range.
The equation of the totally asymmetric exclusion process,
$$\partial_t \rho + \partial_x [\rho(1 - \rho)] = 0, \tag{2.5}$$
is also often called the Burgers equation because the formula $\rho = 1/2 - u$ transforms between weak solutions of (1.5) and (2.5). But since this connection does not preserve nonnegativity ($u \ge 0$, $\rho \ge 0$), it does not link the stick and exclusion processes.
As the mass of a particle model comes in discrete units there is a minimal rate at which mass leaves an occupied site, while for the stick model there is no such positive lower bound. This simple observation manifests itself in the speed of propagation of the macroscopic equation. The equation of the zero range process is
Here $c(k)$ is the rate at which a single particle leaves a site when there are $k$ particles present, and $\nu_\rho$ is the equilibrium measure with expectation $\rho$. The basic hypothesis for the hydrodynamic limit is that $c : \mathbb{N} \to [0, \infty)$ be a bounded nondecreasing function with $0 = c(0) < c(1)$ (see [Re]). A computation shows that $f'(0) = c(1)$, while for the stick model the corresponding derivative is $(d/du)(u^2)|_{u=0} = 0$. By Remark A17 of the Appendix, the source solution of (2.6) travels with speed $c = c(1) > 0$, while the left endpoint of the source solution for (1.5) never leaves the origin. Similarly for (2.5), there is a nonzero speed $(d/d\rho)[\rho(1 - \rho)]|_{\rho=0} = 1$.
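The Lax formula for (1.5), $U(x, t) = \inf_{q \le x} \{ U_0(q) + (x - q)^2/4t \}$, can be evaluated numerically and compared with the source solution mentioned above. The sketch below is an illustration under assumptions: the infimum is replaced by a minimum over a uniform grid with a numerical cutoff `q_min`, and the closed form $U(x, t) = \min(M, x^2/4t)$ for $x \ge 0$ is obtained by evaluating the infimum by hand for $m_0 = M\delta_0$. All names are hypothetical.

```python
def lax_U(U0, x, t, q_min, n_grid=20001):
    """Approximate U(x, t) = inf_{q <= x} { U0(q) + (x - q)^2 / (4 t) }
    by minimizing over a uniform grid on [q_min, x] (q_min is a numerical cutoff)."""
    best = float("inf")
    for j in range(n_grid):
        q = q_min + (x - q_min) * j / (n_grid - 1)
        best = min(best, U0(q) + (x - q) ** 2 / (4.0 * t))
    return best

# Source solution: m0 = M * delta_0, so U0(q) = M for q >= 0 and 0 for q < 0.
M = 1.0
U0 = lambda q: M if q >= 0.0 else 0.0

def source_exact(x, t):
    """Closed form for the source solution: U = min(M, x^2 / (4 t)) for x >= 0."""
    return 0.0 if x < 0 else min(M, x * x / (4.0 * t))
```

Differentiating the closed form in $x$ gives $u(x, t) = x/2t$ on $0 < x < 2\sqrt{Mt}$ and $0$ elsewhere, so the profile is anchored at the origin for all time, in line with the remark on the left endpoint of the source solution.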
3. The particle picture. To prove Theorem 1 we construct the stick process in terms of a related particle process. The state of the particle process is a sequence $z = (z_k)_{k \in \mathbb{Z}}$ of particle locations on $\mathbb{R}$, labeled so that $z_{k+1} \ge z_k$ for all $k$. The connection with the stick configuration is that $\eta_i = z_{i+1} - z_i$ for all $i$. Thus to match $(Y, r)$ we define the state space of the particle process to be
$$Z = \bigl\{ z \in \mathbb{R}^{\mathbb{Z}} : z_{i+1} \ge z_i \text{ for all } i, \text{ and } \lim_{i \to -\infty} i^{-2} z_i = 0 \bigr\}$$
with the metric $s$. It is clear that as a metric space $(Y, r)$ is equivalent to $(Z_\beta, s)$ for any $\beta \in \mathbb{R}$, where $Z_\beta = \{ z \in Z : z_0 = \beta \}$. The evolution of the particle configuration is defined in terms of a rate 1 Poisson point process on $\mathbb{R} \times (0, \infty)$. Fix a realization of such a process, in other words, a simple point measure on $\mathbb{R} \times (0, \infty)$. For $0 \le s < t$ and $-\infty < a < b < \infty$, consider all the up-right paths of points in the rectangle $(a, b] \times (s, t]$: These are finite sequences $(x_1, t_1), \ldots, (x_\ell, t_\ell)$ of points of the point process contained in $(a, b] \times (s, t]$ such that $x_1 < x_2 < \cdots < x_\ell$ and $t_1 < t_2 < \cdots < t_\ell$. Define $L^{\nearrow}((a, s), (b, t))$ to be the maximal number of points on such a path. (This notation is from [AD].) An inverse to this quantity is defined by
$$\Gamma((a, s), t, k) = \inf\bigl\{ h \ge 0 : L^{\nearrow}((a, s), (a + h, t)) \ge k \bigr\}.$$
In other words, $\Gamma((a, s), t, k)$ is the horizontal distance needed for building an up-right path of $k$ points starting at $(a, s)$, with vertical distance $t - s$ at our disposal. Suppose $\Gamma((a, s), t, k) = h$ and $(x_1, t_1), \ldots, (x_k, t_k)$ is an up-right path of $k$ points in $(a, a + h] \times (s, t]$. Let $\gamma$ be the piecewise linear curve obtained by connecting $(a, s)$ to $(x_1, t_1)$, $(x_1, t_1)$ to $(x_2, t_2)$, and so on up to $(x_k, t_k)$, and then $(x_k, t_k)$ to $(x_k, t)$.
Thus $\gamma$ connects the horizontal time-$s$ and time-$t$ lines. We call $\gamma$ an up-right curve that realizes $\Gamma((a, s), t, k)$, and say that $\gamma$ contains an up-right path of $k$ points. Given an initial configuration $(z_i) \in Z$, the positions of the particles at time $t > 0$ are defined by
$$z_k(t) = \inf_{i \le k} \bigl\{ z_i + \Gamma((z_i, 0), t, k - i) \bigr\}. \tag{3.2}$$
In the next section we prove rigorously that this evolution is well-defined for a.e. realization of the Poisson point process, and that it defines a Feller process on the path space $D([0, \infty), Z)$. But first we wish to point out that this is the process that Aldous and Diaconis [AD] call Hammersley's particle process: Set
$$N(x, t) = \sup\{ k : z_k(t) \le x \}.$$
Then $N(y, t) - N(x, t)$ is the number of particles in $(x, y]$ at time $t$, and $N(\cdot, t)$ evolves by the rule
$$N(x, t) = \sup_{-\infty < y \le x} \bigl\{ N(y, 0) + L^{\nearrow}((y, 0), (x, t)) \bigr\},$$
which is precisely formula (10) in [AD]. In [AD] the reader can also find illuminating pictures of typical paths of the particles. We regard $z$ as a particle configuration rather than as a Radon measure on $\mathbb{R}$ to retain the flexibility of having infinitely many particles in a bounded interval, or even at a single location.
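The definitions of this section can be tried out on a finite point set. The following is a sketch under assumptions rather than the construction analyzed in Section 4: the point process is replaced by a finite list of points with distinct coordinates, there are only finitely many particles (so the infimum over $i \le k$ runs over the available indices), and the names `upright_L`, `gamma`, `evolve` and `sticks` are hypothetical.

```python
import bisect
import math

def upright_L(points, a, s, b, t):
    """Maximal number of points on an up-right path in the rectangle (a, b] x (s, t].

    Assumes all x-coordinates and all time coordinates of the points are distinct.
    """
    inside = sorted((x, u) for (x, u) in points if a < x <= b and s < u <= t)
    tails = []  # patience sorting on the time coordinates, in order of increasing x
    for _, u in inside:
        j = bisect.bisect_left(tails, u)
        if j == len(tails):
            tails.append(u)
        else:
            tails[j] = u
    return len(tails)

def gamma(points, a, s, t, k):
    """Gamma((a, s), t, k) = inf{ h >= 0 : L((a, s), (a + h, t)) >= k }."""
    if k <= 0:
        return 0.0
    # L only changes when the right edge a + h passes a point, so scan those x's.
    for x in sorted(x for (x, u) in points if x > a and s < u <= t):
        if upright_L(points, a, s, x, t) >= k:
            return x - a
    return math.inf

def evolve(z, points, t):
    """z_k(t) = min over i <= k of z_i + Gamma((z_i, 0), t, k - i), for a finite list z.

    Assumption: there are no particles to the left of z[0], so the infimum
    runs over the finitely many available indices only.
    """
    return [min(z[i] + gamma(points, z[i], 0.0, t, k - i) for i in range(k + 1))
            for k in range(len(z))]

def sticks(z):
    """Stick lengths eta_i = z_{i+1} - z_i of a particle configuration."""
    return [z[i + 1] - z[i] for i in range(len(z) - 1)]
```

With particles at 0 and 5 and points $(3, 0.5)$, $(1, 1)$, $(2, 2)$, the second particle is pulled first to 3 and then to 1, which is what the variational formula returns in one step; the stick between the two particles shrinks from 5 to 1.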
4. The particle process as a Feller process on $D([0, \infty), Z)$. As the particle evolution is defined in terms of $\Gamma$, and $\Gamma$ contains the same information as $L^{\nearrow}$, we could from now on express everything in terms of $\Gamma$ and entirely forget $L^{\nearrow}$. But we choose not to do this, for $L^{\nearrow}$ is convenient to work with, and through $L^{\nearrow}$ we emphasize the connection of our paper with past work on increasing subsequences and related problems. We write $P$ for all probabilities involving the Poisson point process, and $P^z$ when the probability space of the point process is augmented by the choice of an initial configuration $z \in Z$. These easily obtained bounds are fundamental to all that follows:
Lemma 4.1. For any $s \ge 0$, $a \in \mathbb{R}$, $\tau, h > 0$, and $k \in \mathbb{Z}_+$:
Proof. Given that there are $j$ ($\ge k$) points in the rectangle $(a, a + h] \times (s, s + \tau]$, the probability of having an up-right path of length $\ge k$ among them is at most $\binom{j}{k} (k!)^{-1}$. (Because then one of the $\binom{j}{k}$ possible $k$-sets must be an up-right path, and given the $x$-coordinates of a $k$-set, only one of the $k!$ equally likely orderings of the $t$-coordinates turns it into an up-right path.) Since $j$ has Poisson distribution with expectation $\tau h$,
$$P\bigl\{ L^{\nearrow}((a, s), (a + h, s + \tau)) \ge k \bigr\} \le \sum_{j \ge k} \frac{e^{-\tau h} (\tau h)^j}{j!} \binom{j}{k} \frac{1}{k!} = \frac{(\tau h)^k}{(k!)^2},$$
proving (4.2). For (4.3), use $(k!)^{-1} \le (e/k)^k$ and then take $k = \lceil \beta_0 \sqrt{\tau h} \rceil$, the smallest integer above $\beta_0 \sqrt{\tau h}$.
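The counting step in the proof above rests on the identity $\sum_{j \ge k} e^{-\lambda} \lambda^j (j!)^{-1} \binom{j}{k} (k!)^{-1} = \lambda^k/(k!)^2$ with $\lambda = \tau h$. A quick numerical check, as a sketch only: the series is truncated at a hypothetical cutoff `j_max`, and the function names are not from the paper.

```python
import math

def chain_bound(lam, k):
    """Closed form lam^k / (k!)^2, where lam stands for tau * h."""
    return lam ** k / math.factorial(k) ** 2

def poisson_union_sum(lam, k, j_max=200):
    """Sum over k <= j <= j_max of P(Poisson(lam) = j) * C(j, k) / k!."""
    total = 0.0
    for j in range(k, j_max + 1):
        # Poisson probability computed in log form to avoid overflow for large j.
        pmf = math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
        total += pmf * math.comb(j, k) / math.factorial(k)
    return total
```

The two functions agree up to rounding and truncation error. The closed form can exceed 1 when $k$ is small relative to $\sqrt{\tau h}$, so the bound carries information only for $k$ of order $\sqrt{\tau h}$ or larger, which is the regime in which (4.3) is applied.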
As the first application we check that (3.2) defines an evolution in Z.
Proposition 4.4. Let z ∈ Z and define z(t) = (z k (t)) k∈Z by (3.2) for 0 < t < ∞. Then the following holds for almost every realization of the point process: For all t > 0, z(t) ∈ Z and for each k holds for i = i ± (k, t) but fails for all i < i − (k, t) and i > i + (k, t).
Proof. Fix T > 0 and let ε > 0 be arbitrary but small enough so that Then by (4.3) By Borel-Cantelli and the monotonicity of L , there exists a P z -almost surely finite random variable J such that Fix a realization of the point process for which J > −∞. By hypothesis and then by (4.7) Thus only a finite range of i's come into question as possible minimizers in (3.2), and the existence of i ± (k, t) follows.
To quantify this finite range of i's, pick k 0 < 0 so that 2εk 2 > z k for k ≤ k 0 .
and consequently by (4.7) By the monotonicity of L , (4.6) remains valid if ε is decreased. In other words, this argument can be repeated for arbitrarily small ε > 0, with new values of j 0 and k 0 , but without changing the realization of the point process or the value of J . Then (4.9) shows that |z To get a single P z -null set outside of which these properties hold simultaneously for all t, let T ∞ along a countable set.
To establish P z -a.s. properties of the evolution, fix an initial configuration z ∈ Z and a realization of the point process such that the statement of Proposition 4.4 is valid. It is obvious from (3.2), but important, that With a little more work we see that (P z -a.s.) each realization of the particle evolution satisfies a semigroup property: Proof. Let i < j = i − (k, t). Then (4.15) cannot happen, hence by Lemma 4.13 neither can (4.14), and so i cannot equal i − (k, s).
By arguments with up-right curves similar to those employed above, we leave it to the reader to prove another property we shall need later: We now turn to the regularity of the paths z(t). First we show that the trajectory of a single particle is in the space D([0, ∞), R) (P z -a.s.), and then utilize the monotonicity built into (z k (t)) k∈Z to extend this to the whole configuration.
Lemma 4.18. For a fixed k, z k (t) is right-continuous and has left limits as a function of t.
Proof. By (4.10) limits exist from both left and right, and we shall be done after showing that for some ε Fix t 1 > t 0 and let If j 1 = k we can stop here, so suppose j 1 < k. If j satisfies the requirement in braces, then for some i ≤ j there is an up-right curve from (z i , 0) to (z k (t 1 ), t 1 ) through (z j (t 0 ), t 0 ), and hence i − (k, t 1 ) ≤ j. In particular, j 1 above is finite. Let contains no point process points}.
Since a realization of the point process is a Radon measure, ε > 0. Let t ∈ (t 0 , t 0 +ε). By Corollary 4.16 (with time zero replaced by time t 0 ) there is no j < j 1 such that Proposition 4.21. The path z(·) is an element of D([0, ∞), Z).
Proof. Fix t 0 . We need to show that s(z(t), z(t 0 )) → 0 as t t 0 . Let 0 < ε < 1. Since z, z(t 0 + 1) ∈ Z, we may pick k 0 < 0 so that By a similar argument we show that A few words about technical details that the reader is welcome to skip. Z is a measurable subset of [−∞, +∞) Z in the natural product σ-field, and the Borel field of Z coincides with its relative σ-field as a subspace of [−∞, +∞) Z . P is a Borel probability measure on the space M s of simple point measures ρ on R × (0, ∞), endowed with the vague topology. We leave it to the reader to convince herself that , on which z(t) ∈ Z and the other properties proved in this section hold for all 0 ≤ t < ∞. A can be defined by requiring that, for each fixed T , (4.6) holds for all small enough ε with some finite J . For (z, ρ) / ∈ A we redefine the evolution by z(t) ≡ z for some fixed element z ∈ Z, so that z(t) becomes a measurable Z-valued function of (z, ρ) ∈ Z × M s . Since the redefined z(·) is constant on the exceptional set where Proposition 4.21 fails, we have in fact produced a map (z, ρ) → z(·) from Z × M s into D([0, ∞), Z). This is again measurable since the Borel field of D([0, ∞), Z) is generated by the projections z(·) → z(t), 0 ≤ t < ∞.
Thus we have constructed the distribution of z(·) under P z as a Borel probability measure on D([0, ∞), Z). From Proposition 4.11 it follows that this particle process is Markovian. As the last result of this section we prove that the transition probabilities are Feller continuous.
Write w(t) for the evolution defined by (3.2) for an initial configuration w. The point process gives a natural coupling of the particle evolutions z(t) and w(t).
Pick δ 1 ∈ (0, 1) so that the event (4.27) the set contains no point process points has P (z,w) -probability at least 1 − ε 1 /2. (Note that the events appearing above do not depend on w so for them P (z,w) -probability is the same as P z -probability.) Pick δ > 0 so that whenever s(z, w) ≤ δ, For the remainder of the proof, pick and fix w ∈ Z so that s(z, w) ≤ δ, and a realization of the point process for which J ≥ k 0 and which has no points in Since realizations not satisfying these requirements have P (z,w) -probability less than ε 1 , the proof is completed by showing that s(z(t), w(t)) ≤ ε 0 .
It follows from (4.25), (4.26), (4.28), and from δ 1 < 1 that for i ≤ k 1 hence by (4.24) and (4.26) for |k| ≤ |k 0 |, This implies that for |k| ≤ |k 0 |, The same statement holds for z too, as it is certainly true that s(z, z) ≤ δ. Now observe from (4.28) that for k 1 ≤ i ≤ |k 0 |, z i and w i lie in the vertical strip that contains no point process points, and consequently when computing z k (t) and w k (t) by (4.31), the same up-right paths are available. Hence we have For k < k 0 similar reasoning gives Combining (4.25), (4.32), (4.33), and the fact that From this lemma it follows easily that E z [f(z(t))] is a continuous function of z for f ∈ C b (Z), or that the transition probability of the particle process is Feller continuous. 5. From the particle process to the stick process. Given an initial stick configuration η and a number β, define a particle configurationz β =z β (η) by A stick configurationη =η(z) is defined in terms of a particle configuration z bỹ These formulas extend to continuous mappings between D([0, ∞), Y ) and where the choice of β is immaterial asη(z β (η)) = η for all β and the distribution of the Poisson point process is invariant under translations. It follows from the development of the previous section that this defines the stick process η(t) as a Y -valued Markov process with Feller continuous transition probabilities. The remainder of this section is devoted to the proof of (1.4). We begin with some sharper estimates on the probabilities of the particle process.
and C is a constant independent of everything else.
Proof. The case λ k (z) = 0 is trivial: All particles for i ≤ k − 1 stay piled at the origin for all time. Assuming λ = λ k (z) > 0, it follows from (5.5) that For the event Inside the last sum, the first factor of the nth term is bounded by e √ t λ 2n ≤ 2 −n , as (n!) −1 ≤ (e/n) n and t ≤ t k (z), and the second factor is bounded by a constant uniformly over n.
These events depend both on the initial condition z and on the point process in We abbreviatedη(τ ) =η(z(τ )). The integration variable r in the second term of the middle formula is the x-coordinate of the point process point in [z −m−1 , z m+1 ]× (0, τ ], given that there is exactly one. Thus for a constant C depending only on f ∞ , Let A be a constant large enough so that τ = A −3 satisfies τ < [2 e 2 A] −1 . Increase A further so that n = t/τ is an integer. Let s j = jτ , j = 0, . . . , n. Fix an initial configuration η and set z =z β (η) for some β we need not specify. By the Markov property (5.16) By the monotonicity of the particle locations, and by Corollary 5.10 E z [ |ξ| p ] < ∞ for all p. Clearly |Lf(η(s))| ≤ 2 f ∞ δ(z(s)), so by the right continuity of paths and dominated convergence, the first term of the last formula in (5.16) converges, as τ → 0, to The second term vanishes as τ → 0 by the same bound in terms of ξ. The proof will be complete once we have shown that the third term vanishes as A → ∞. By (5.14) and (5.15) Adding up these terms and using τ = A −3 and n = t A 3 gives (the value of the constant C has obviously been changing from line to line) and used the inequality ψ ≥ λ(z(s)) for 0 ≤ s ≤ t. By Corollary 5.10 E z [ |ψ| p ] < ∞ for all p < ∞, thus the last line of (5.18) vanishes as A → ∞.
6. Attractiveness of the stick process. To prove attractiveness we present an alternative construction of the stick process, again in terms of Poisson point processes but this time with no reference to Hammersley's particle system. This second construction has the advantage of being more directly connected with the stick dynamics and it reduces the proof of attractiveness to a mere observation. However, the moment estimates needed for this construction seem hard to come by without the help of the first construction in terms of the particle system. The technical problem is to control the amount of stick mass moving in from the left, so we perform the construction in two steps: Step 1. We use the new construction to define the dynamics of (ζ k (t)) −M ≤k<∞ for a fixed M < ∞, or equivalently, we set ζ k (t) = 0 for all t ≥ 0 and k < −M.
The difficulty is simply defined away: There is no mass to the left of site −M that needs controlling.
Step 2. We let $M \nearrow \infty$ and use the original construction to show that the processes $\zeta^M(\cdot)$ constructed in Step 1 converge weakly to the stick process defined by (5.2).
Fix initial configurations η and z =z 0 (η) for the duration of the section. As before, the particle dynamics z(·) is defined by (3.2), with the corresponding stick dynamicsη Write z M (·) for the particle dynamics defined by (3.2) with initial configuration z M . Note that z M k (t) ≥ z −M and z M k (t) ≥ z k (t) for all k, and z M k (t) = z −M for k ≤ −M. The corresponding stick dynamics is defined byη M k (t) = z M k+1 (t) − z M k (t), and we haveη M k (t) = 0 for all k ≤ −M − 1 and all t ≥ 0. The stick processesη(·) andη M (·) have distributions P η and P η M on D([0, ∞), Y ), respectively, defined by (5.2).
This was the old construction performed in the previous section. Next we explain the new construction that results in a sequence of processes $\zeta^M(\cdot)$ with distributions $Q^{\eta^M}$ on $D([0, \infty), Y)$. Start by giving each site $k \in \mathbb{Z}$ a realization $A_k$ of a rate 1 Poisson point process on $(0, \infty) \times [0, \infty)$. Points of these processes are denoted by $(t, b)$ with the interpretation that $t \in (0, \infty)$ stands for time and $b \in [0, \infty)$ for the vertical height of a stick. To avoid conflicts, assume that all the $t$-coordinates of $A_k$ are distinct from those of $A_\ell$ for $k \ne \ell$.
Consider a fixed M ∈ Z + for a while. Set ζ M k (t) = 0 for all t ≥ 0, if k < −M. The evolution of ζ M k (t) for k ≥ −M is determined by the following rule: Starting at any time t, ζ M k (·) remains equal to ζ M k (t) until the first time s > t when either (1) site k receives a stick piece from site k − 1 or (2) there is a point (s, b) ∈ A k such that b < ζ M k (t), which sets ζ M k (s) = b and gives the remaining piece ζ M k (t) − b to site k + 1.
Here is the precise inductive definition. Starting with the initial configuration η M , define The stick at site −M only decreases with time as it receives no stick mass from the left. This generates a sequence of jump times 0 . Now the induction step for an arbitrary k > −M: Given from the step for k − 1 are times 0 = τ k−1,0 < τ k−1,1 < τ k−1,2 < . . .
∞ and numbers u k−1,n > 0, n = 1, 2, 3, .... These specify that site k receives from site k − 1 a piece of length u k−1,n at time for t ∈ (τ k−1,n , τ k−1,n+1 ), and then This specifies the evolution of ζ M k (·). As the last part of the induction step, construct the input for the next step by setting τ k,0 = 0, We shall prove that, as M → ∞, the processes ζ M (·) converge to the stick processη(·): Since z M k (t) z k (t) a.s. as M ∞ for all t and k, it is clear that the finitedimensional distributions of P η M converge to the finite-dimensional distributions of the stick process P η . Thus to conclude weak convergence of Q η M to P η we need to prove (i) that Proof. All sticks remain zero for all time to the left of −M under both Q η M and P η M , hence it suffices to show that (ζ M k (·)) −M ≤k≤K = (η M k (·)) −M ≤k≤K in distribution, for any fixed K < ∞. Fix a number κ > 0, and consider the compact space Take κ large enough for X to contain the initial configuration (η M k ) −M ≤k≤K chosen above. Since the dynamics moves stick mass only to the right and no stick mass is entering from the left of site −M, X is closed under the dynamics. With finite total stick mass there are a.s. only finitely many stick-cutting events in any finite time interval, so we can view both (ζ M k (·)) −M ≤k≤K and (η M k (·)) −M ≤k≤K as Markov jump processes on X, with a common initial state. To see that these processes coincide it is then enough to observe that they have a common strong generator whose domain is all of C(X). Forη M (·) this follows from noting that estimates (5.14)-(5.15) hold uniformly over X. We leave to the reader the analogous calculation for ζ M (·).
Next a compactness criterion whose proof we leave to the reader. Proof. By standard compactness criteria (see for example Thm 7.2 in Ch. 3 of [EK]), we need to check two things: (i) For each t and ε > 0 there exists a compact set K ⊂ Y such that (ii) For every ε > 0 and T there exists a δ > 0 and M 0 < ∞ such that s, t ∈ [t i−1 , t i ) for some i = 1, . . . , n } and {t i } ranges over all partitions such that 0 = t 0 < t 1 < · · · < t n−1 < T ≤ t n , min 0≤i≤n−1 (t i+1 − t i ) > δ, and n ≥ 1.
Proof of (i): By the tightness of the distribution of z(t) on Z, pick a compact K 1 ⊂ Z such that P z {z(t) ∈ K 1 } > 1 − ε/4. An obvious modification of the compactness criterion of Lemma 6.5 applies to Z, hence n −2 β n → 0 as n → −∞, where β n = sup w∈K 1 |w n |. Pick a k > 0 so that a compact subset of Y . In the next calculation, use the following facts: as we set z 0 = 0 at the outset. Thus Proof of (ii): We need only consider δ ≤ 1 so set τ = T +1. For any s, t appearing inside the braces in (6.7) and any k 0 > 0, Thus (6.8) where w (η M (·)) −k 0 ≤k≤k 0 , δ, T is the modulus of continuity defined as in (6.7) but with the metric Choose k 0 large enough so that the first and second term after ≤ in (6.8) are both ≤ ε/3. This bound is uniform in M, hence it remains to find M 0 and δ so that By Proposition 4.4, J is a P z -a.s. finite random variable. On the event { J ≥ −M 0 }, by Corollary 4.16 and Lemma 4.17. Pick M 0 large enough so that P z { J < −M 0 } < ε/6. Then for M ≥ M 0 , The last probability no longer varies with M, hence can be made ≤ ε/6 by choosing δ small enough.
We have proved Proposition 6.1. It remains to make explicit the coupling that proves attractiveness. Proposition 6.9. Given initial configurations η, ζ ∈ Y such that η ≤ ζ, there is a joint distribution P (η,ζ) whose marginals are the stick processes started with η and ζ and for which P (η,ζ) { η(t) ≤ ζ(t) } = 1 for all t.
Proof. Use the new construction described in this section to produce processes η M (·) and ζ M (·) with initial conditions η M and ζ M , respectively, and couple them through common point processes {A k } k∈Z . Let P M denote the joint distribution of (η M (·), ζ M (·)). It is clear from the construction that η M k (t) ≤ ζ M k (t) holds for all M, k, and t, for a.e. realization of {A k } k∈Z . The same reasoning shows in fact that P M +1 stochastically dominates P M . Thus P M converges to a distribution P that by the earlier part of this section has the right marginals, and we take it to be P (η,ζ) .
7. The invariance of exponentially distributed sticks. Let ν be the probability distribution on Y under which the (η i ) i∈Z are i.i.d. exponential random variables with common mean β −1 . Write S(t) for the semigroup of the stick process on Y , and pick and fix a cylinder function f ∈ C b (Y ). We wish to prove (7.1) ν S(t)f = ν(f).
Pick K so that f(η) = f(η −K , . . . , η K ) and let M > K. Write S M (t) for the semigroup of the process (ζ M k (·)) −M ≤k≤M considered in the proof of Lemma 6.2.
Since S M (t) can be restricted to the compact space X κ −M,M where it has the strong generator L M = L −M,M of (6.4), we have By Proposition 6.1, S M (t)f → S(t)f as M → ∞, pointwise and boundedly. Hence to prove (7.1) it suffices to show that Consider the ith term of the first sum. Write ν̄ for the marginal distribution of (η j ) j ≠ i,i+1 . Then Let M → ∞ and apply Proposition 6.1. This proves (7.2), and completes the proof of Theorem 1.
(See [Du], section 6.7, or [Ha].) It is also clear that this convergence, being really a statement about the distribution of L , is valid even if the base point (a, s) varies with (τ, h). It follows readily that, for any sequence r N ∈ R, any t > 0 and −∞ < a < b < ∞, The symbol c is reserved for the constant defined by (8.1), except in the Appendix, where it denotes another important constant. In this section we prove the following proposition: Proposition 8.3. Assume (1.7) and (1.10)-(1.12) as in the statement of Theorem 3. Let u(x, t) be the solution given by Theorem A1 to the equation with initial condition m 0 . Then for each t > 0, α N N t → u(x, t)dx in probability as N → ∞, in the vague topology of Radon measures on R.
The value c = 2 is calculated and the proof of Theorem 3 thereby completed in the next section.
Fix a function U 0 (x) such that m 0 [a, b) = U 0 (b)−U 0 (a) for all −∞ < a < b < ∞. From the initial stick configuration η N distributed according to µ N 0 , define initial particle configurations by z N =z N U 0 (0) (η N ), as in formula (5.1). Then assumption (1.10) guarantees that for all x ∈ R. As before, the particle process z N (t) is a.s. defined by (3.2) as a function of the initial configuration z N (now random) and the Poisson point process on R × (0, ∞). The stick process is then defined by For the remainder of the section we assume that everything is defined on some common probability space and write P for all probabilities. By the Appendix, the formula solves (8.4) in the following sense: For each t > 0, the solution u(x, t) is the Radon-Nikodym derivative dm t /dx of the measure m t defined by m t [a, b) = U(b, t) − U(a, t). To prove Proposition 8.3 it suffices to show that α N N t [a, b) → U(b, t)−U(a, t) in probability for any fixed −∞ < a < b < ∞. By (8.6) this follows from having By the continuity of U(x, t) in x, it suffices to take k(N) = [Nx] in (8.8). It is this statement that we shall now prove for a fixed (x, t).
Recall from (3.2) that Lemma 8.11. Given ε > 0, there exist r < x and N 0 such that, for N ≥ N 0 , Proof. Choose ε 0 > 0 such that (2 t ε 0 ) −1/2 ≥ 2 e 2 . Let b be the number given by assumption (1.12), and set Now choose r and N 1 so that P (A N,q ) > 1 − ε/3 whenever N ≥ N 1 and q ≤ r (by (1.12)), Pick N 2 so that P (B N ) > 1 − ε/3 and e −2N (x−r)+2 < ε/3 whenever N ≥ N 2 . Set N 0 = N 1 ∨ N 2 and let N ≥ N 0 . Then (This follows directly from the definition of A N,r if x ≤ b, and by writing z N Summing over j ≤ [Nr] gives The next lemma gives the lower bound and completes the proof of Proposition 8.3.
Before proceeding we describe a coupling useful for a restricted class of initial distributions.
There is a coupling of the two processes such that, almost surely, η(t) ≤ ζ(t) for all t ≥ 0, and η i (t) = ζ i (t) for all t ≥ 0 and i < 0.
Proof. We could use the coupling of Proposition 6.9, but in this case we can do with the simpler particle construction of Section 5. Let w(t) and z(t) be the particle processes through which ζ(t) and η(t) are defined by We couple w(·) and z(·) through a common point process on R×(0, ∞) and through their initial configurations, defined by w =z 0 (ζ) (recall (5.1)) and z i = w i for i ≤ 0, z i = z 0 = w 0 = 0 for i > 0. One can argue directly from (3.2) that z i (t) = w i (t) whenever z i (t) < 0, otherwise z i (t) = 0 ≤ w i (t). This gives the conclusion via (9.3).
Apply this to the following setting: Let (η i ) i<0 be i.i.d. exponential random variables with expectations E[η i ] = 1, and η i = 0 for i ≥ 0. This defines initial stick distributions µ = µ N 0 (the same for all N, hence we drop the superscript N from η) that satisfy assumptions (1.10)-(1.12) for the initial profile u 0 . Let ζ(t) be a stationary stick process such that (ζ i (t)) i∈Z are i.i.d. exponential variables with expectation one for each t ≥ 0. Then Lemma 9.2 implies that the sticks (η i (t)) i<0 stay in equilibrium for all time, and gives uniform moment bounds on η i (t) for i ≥ 0. Take The first term after the equality sign, t, comes from (1/2) E[(η −1 (s))²] = 1, utilizing the fact that η −1 (s) = ζ −1 (s) in distribution. Fix b ≥ c² t/4 + 2, take m = [Nb], replace t by Nt, and divide by N in (9.4) to get Next let N → ∞ in (9.5). By Proposition 8.3 and (9.1), in probability. By the coupling of Lemma 9.2, uniformly over N. Thus the expectations on the left-hand side of (9.5) converge, and we have (9.6) It remains to argue that Use the coupling of Lemma 9.2 to show that E[η k+1 (t)] ≤ E[η k (t)] for all k.
10. Remarks on the hypotheses. First we discuss the assumptions (1.10)-(1.12) imposed on the initial stick distributions.
Proof of Remark 1.13. Case (i) is trivial. Let us show that (1.10) is satisfied in case (ii). Let β = ess sup a−1≤x≤b |u 0 (x)|. Then Next we prove (1.12); the similar argument for (1.11) is left to the reader. Since for n ≤ i < 0, the variables (η N −j ) 0<j≤|n| are stochastically dominated by i.i.d. exponential variables (ξ j ) 0<j≤|n| with expectations Eξ j = u * (n/N). Given ε > 0, pick q < 0 so that u * (x) ≤ ε|x|/2 for x ≤ q. By exponentiation and Markov's inequality, where we set I(x) = x − 1 − log x and used the fact that I is strictly increasing and positive for x > 1. (I is the Cramér rate function of the exp(1)-distribution from basic large deviation theory.) Thus we have which is < ε for large enough N. This completes the proof for Remark 1.13.
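The exponentiation step in this proof can be checked numerically. A minimal sketch, assuming the standard form of the bound P{ξ 1 + · · · + ξ n ≥ nx} ≤ exp(−n I(x)) for i.i.d. exp(1) variables and x > 1, and using the exact Erlang tail of the sum for comparison; the function names are illustrative and do not come from the paper.

```python
import math

def rate(x):
    """Cramer rate function I(x) = x - 1 - log x of the exp(1)-distribution."""
    return x - 1.0 - math.log(x)

def gamma_tail(n, a):
    """Exact P(xi_1 + ... + xi_n >= a) for i.i.d. exp(1) variables (Erlang tail)."""
    return math.exp(-a) * sum(a ** k / math.factorial(k) for k in range(n))

def chernoff_bound(n, x):
    """Bound exp(-n I(x)) obtained by exponentiation and Markov's inequality."""
    return math.exp(-n * rate(x))

# Compare the exact tail with the bound for a few values of n and x > 1.
checks = [(n, x, gamma_tail(n, n * x), chernoff_bound(n, x))
          for n in (5, 20, 50) for x in (1.5, 2.0, 3.0)]
```

Each tuple in checks lists n, x, the exact tail probability, and the bound; the exact tail never exceeds the bound, and both decay exponentially in n because I(x) > 0 for x > 1.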
Next some examples where desirable properties fail.
Example 10.1. As soon as there is a singularity in m 0 , we cannot in general hope that the initial law of large numbers (1.10) holds for the independent exponential sticks of Remark 1.13(ii): η N i has the exp(1)-distribution for all N, contradicting (1.10).
Example 10.2. In this example the independent exponentially distributed initial sticks with expectations E[η N i ] = Nm 0 [i/N, (i + 1)/N ) fail to lie in Y , though m 0 satisfies (1.7) and has a locally bounded density, as smooth as desired.
Example 10.3. As the last example in this series, we give a deterministic initial configuration that satisfies (1.10)-(1.11) but not (1.12), and for which the proof we gave in Section 8 fails.
The macroscopic profile is constant: u 0 (x) = u(x, t) = 1 for all x and t, and U 0 (x) = U(x, t) = x. Let β N and N be sequences such that both β N /N and N /N increase to ∞ as N ↗ ∞, and N < (Nβ N ) 1/2 < β N for all N. By (8.1) and the fact that c = 2, in probability as N → ∞. Pick and fix t ≥ 1. Then, since N (Nβ N t/2) −1/2 ≤ √2, as N → ∞. Choose a subsequence N j ↗ ∞ along which the above probabilities sum. By Borel-Cantelli there exists an a.s. finite random variable J such that, for Define the initial particles (that implicitly define the initial sticks) by This contradicts the convergence N −1 z N 0 (Nt) → U(0, t) = 0, since P {J ≤ j 0 } can be made arbitrarily close to 1 by choosing j 0 large enough. In this example (1.12) fails because, for large N and with probability 1,
Next we look at the restriction m[x, 0) = o(x²) (x → −∞) placed on the initial macroscopic profile. This condition prevents the solution from becoming infinite in finite time. If it fails, mass moving in from the left can accumulate too fast. Suppose ε > 0 and u 0 (x) = ε|x| for x < 0, u 0 (x) = 0 for x ≥ 0. Then for t < (2ε) −1 the solution is and it ceases to exist by time t = (2ε) −1 .
Finally we show that the space Z is the largest possible state space for the particle dynamics obeying (3.2), in the sense that if the process starts outside Z, there is a finite time after which all the particles are at −∞. Consequently, our construction for the stick process does not work for initial configurations outside Y .
Proposition 10.4. Suppose z i ≤ −εi² for infinitely many i, for some ε > 0. Then there exists a t > 0 such that P z { z k (t) = −∞ for all k } = 1.
Proof. By (8.1) and the translation invariance of the Poisson point process, there are constants c 0 , c 1 , ε 0 ∈ (0, ∞) such that independently of x ∈ R. Pick and fix t so that c 0 tε/2 ≥ 1. Let k ∈ Z be arbitrary. By the hypothesis, we can choose a sequence j n ↘ −∞ such that In particular, now z j n − z j n+1 ≥ (ε/2)(k − j n+1 )². Pick n 0 so that t(ε/2)(k − j n+1 )² ≥ c 1 for n ≥ n 0 . Then for n ≥ n 0 , These events are independent. Hence a.s.
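The blow-up time (2ε) −1 of the example with u 0 (x) = ε|x| for x < 0 can be seen directly in the variational formula. The sketch below rests on assumptions that are not spelled out at this point of the text: the flux is f(u) = u² (the case c = 2), so that g(z) = z²/4; the formula reads U(x, t) = inf over q ≤ x of U 0 (q) + t g((x − q)/t); and U 0 (q) = −εq²/2 for q < 0, U 0 (q) = 0 for q ≥ 0. The closed form for x < 0 is obtained here by minimizing the quadratic in q and is offered as a check, not as a quotation of the paper's display.

```python
def objective(q, x, t, eps):
    """U0(q) + t*g((x - q)/t) with U0(q) = -eps*q^2/2 for q < 0 and g(z) = z^2/4."""
    u0 = -0.5 * eps * q * q if q < 0 else 0.0
    return u0 + (x - q) ** 2 / (4.0 * t)

def lax_value(x, t, eps, span=50.0, steps=50000):
    """Grid approximation of the infimum of the objective over q in [x - span, x]."""
    h = span / steps
    return min(objective(x - j * h, x, t, eps) for j in range(steps + 1))

def closed_form(x, t, eps):
    """Value of the infimum for x < 0 and t < 1/(2*eps), from minimizing the quadratic."""
    return -eps * x * x / (2.0 * (1.0 - 2.0 * eps * t))

eps = 0.5                       # blow-up time 1/(2*eps) = 1
x = -1.0
before = lax_value(x, 0.5, eps)          # t below the blow-up time
after_far = objective(-100.0, x, 2.0, eps)     # t above the blow-up time
after_farther = objective(-1000.0, x, 2.0, eps)
```

Before the blow-up time the coefficient 1/(4t) − ε/2 of q² is positive and the infimum is attained at q = x/(1 − 2εt), giving the density ε|x|/(1 − 2εt) for x < 0. After the blow-up time the coefficient is negative, so the objective decreases without bound as q → −∞ and the infimum is −∞.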
Appendix. In this final section of the paper we prove Theorem 2 and deduce formula (8.7) that was used in Section 8 to prove the scaling limit. There is nothing to be gained by restricting ourselves to the Burgers equation (1.5), hence we work in the setting and with the notation of section 3 of Lax's lectures [La2]. We give an existence proof and a uniqueness criterion for a nonlinear scalar conservation law in one space variable, with initial data given by a Radon measure. Such a result does not seem to be available in the literature, so it may have some independent interest. The equation we study is (A1) ∂ t u + ∂ x f(u) = 0, where f ∈ C²(R) satisfies f(0) = 0 (a convenient normalization with no effect on the equation). Let a(u) = f′(u). We assume that a′(u) > 0 everywhere so that, in particular, f is strictly convex. The convex dual of f is given by g(z) = sup u {uz − f(u)}. Alternatively, g is characterized by where, by definition, b is the inverse function of a (b(a(u)) = u) defined on the image of a and c = a(0). The function b is strictly increasing, hence g is strictly convex. Furthermore, g is strictly decreasing on (−∞, c], strictly increasing on [c, ∞), and lim z→∞ g(z) ∧ b(z) = ∞. We make two further assumptions, namely Assumption (A3) guarantees that both g and b are finite on (c 0 , ∞) for some c 0 < c. Assumption (A4) is needed only for the uniqueness proof where it enables us to deal with a solution v(x, t) that can take negative values. If we restrict our attention to nonnegative solutions (A4) is not needed, for (A3) together with f(0) = 0 and convexity already implies that − inf u≥0 f(u) < ∞.
The initial profile is a Radon measure m 0 on R that satisfies = 0 for all α > 0.

Pick and fix a left-continuous function
The infimum is not affected by letting q vary over all of R, because no q > x − ct can give a smaller value than q = x − ct. The first step is to establish that U(x, t) is finite and that the infimum in (A6) is always achieved (proofs follow after the main results).
For x ∈ R and t > 0, set q(x, t) = q + (x, t) (this is merely a convention, we could just as well work with q − (x, t)) and then We shall show that q(x, t) is jointly measurable as a function of (x, t) and hence so is u(x, t). It is fairly immediate that U(x, t) ≤ U(y, t) whenever x < y, and then we have a Lebesgue-Stieltjes measure m t defined by m t [x, y) = U(y, t) − U(x, t). The connection with u(x, t) is that, for t > 0, This is the solution we are looking for. Here is the existence theorem.
(i) For a fixed t > 0, u(x, t) is continuous as a function of x except for countably many jumps, and (ii) For 0 < s < t and −∞ < a < b < ∞, (iv) For all φ ∈ C 1 0 (R) (compactly supported, continuously differentiable) and t > 0, Items (ii) and (iii) above guarantee that the integrals in (A10) are well-defined, and (A10) itself says that u(x, t) is a weak solution of (A1) with initial data m 0 . Now we turn to uniqueness. As is well-known, the initial profile does not always uniquely specify a solution, but some additional conditions are needed to rule out all but one solution. It is by now classical that, with L ∞ initial data, a unique solution is characterized by the following entropy condition: There exists a constant E > 0 such that for all t > 0, x ∈ R, and a > 0.
In the general case we can characterize u(x, t) as the solution with minimal flux accumulated over time. For each x, this quantity is the amount of mass that has left the interval (−∞, x) by time t, and it is given by (This equality and the existence of the integral will be proved later.) Theorem A2. Assume (A3)-(A5). Suppose v(x, t) is a measurable function on R × (0, ∞) that satisfies items (ii)-(iv) of Theorem A1 and is right-continuous as a function of x, for each fixed t > 0. Then f(v(x, τ )) is locally integrable as a function of τ ∈ [0, ∞) for all x ∈ R, and if m 0 {x} = 0 we have for all t > 0. If equality holds in (A14) for a.e. x ∈ R (in particular, for all x such that m 0 {x} = 0), then u(x, t) = v(x, t) for a.e. x.
Some remarks and an example before we turn to the proofs.
Remark A15. The use of formula (A6) to solve conservation laws has been known since the 1950's, as evidenced by Lax's paper [La1]. In the context of Hamilton-Jacobi equations formula (A6) is known as the Lax formula (see Sect. 11.1 in [Li]). The novelty of our treatment is in the particular class of initial conditions we cover.
Remark A16. It is clear that Theorem 2 follows from Theorems A1 and A2 and Proposition A12. Formula (8.7) on which the proof of the scaling limit is based is a special case of (A6), as the convex dual of f(u) = c²u²/4 is g(z) = z²/c². The solution (9.1) can be derived from (A6) and (A8) by calculus.
Remark A17. Formulas (A6) and (A8) are really expressing the solution for general initial data in terms of the source solution, that is, the solution whose initial measure m 0 equals δ 0 , a unit mass at the origin. For α ∈ [0, ∞] and t > 0 let γ t be the unique number in [0, ∞] satisfying g(c + γ t /t) = α/t. Set Then for α < ∞, w α (x, t) solves (A1) with m 0 = αδ 0 , and is the solution described in Theorem A1. In general, if the total mass m 0 (R) = α ∈ [0, ∞], then (A6) and (A8) are equivalent to In the terminology of convex analysis, U( · , t) is the infimal convolution of U 0 and W α ( · , t) (see [Rf]).
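The convex dual quoted in Remark A16 can be verified in two ways. A minimal sketch, assuming the usual definition g(z) = sup over u of zu − f(u) and the identity g(z) = z b(z) − f(b(z)) with b the inverse of a = f′; the helper names are illustrative.

```python
def f(u, c=2.0):
    """Flux f(u) = c^2 u^2 / 4."""
    return c * c * u * u / 4.0

def b(z, c=2.0):
    """Inverse of a(u) = f'(u) = c^2 u / 2."""
    return 2.0 * z / (c * c)

def g_from_b(z, c=2.0):
    """Dual computed from the identity g(z) = z*b(z) - f(b(z))."""
    return z * b(z, c) - f(b(z, c), c)

def g_sup(z, c=2.0, lo=-50.0, hi=50.0, steps=200000):
    """Dual computed as a grid maximum of z*u - f(u) over u in [lo, hi]."""
    h = (hi - lo) / steps
    return max(z * (lo + j * h) - f(lo + j * h, c) for j in range(steps + 1))

def g_exact(z, c=2.0):
    """Closed form stated in Remark A16."""
    return z * z / (c * c)

zs = [-3.0, -1.0, 0.0, 0.5, 2.0]
errors_identity = [abs(g_from_b(z) - g_exact(z)) for z in zs]
errors_sup = [abs(g_sup(z) - g_exact(z)) for z in zs]
```

Both error lists are zero up to rounding and grid resolution for c = 2, and the same check goes through for other positive values of c as long as the maximizer 2z/c² lies inside the grid.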
Example A18. The source solution for the equation ∂ t u + ∂ x (u 2 ) = 0 for a unit mass m 0 = δ 0 is given by Furthermore, for any β > 0, the function also satisfies all the requirements of Theorem A1 except nonnegativity. The flux test (A14) distinguishes v β (x, t) from u(x, t). Yet each v β (x, t) is an entropy solution in the sense of (A11), so the entropy condition alone is not a sufficient uniqueness criterion for singular initial data.
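The source solution of Example A18 can be recomputed from the variational formula. The sketch below assumes f(u) = u², hence a(u) = 2u, b(z) = z/2, g(z) = z²/4 and c = a(0) = 0, and takes U 0 to be the left-continuous distribution function of δ 0 . The candidate closed forms U(x, t) = min(x²/(4t), 1) for x > 0 (zero otherwise) and u(x, t) = x/(2t) on 0 < x < 2√t are derived here under these assumptions and should be read as a consistency check rather than as a quotation of the paper's display.

```python
import math

def U0_point_mass(q):
    """Left-continuous distribution function of a unit mass at the origin."""
    return 1.0 if q > 0 else 0.0

def lax_U(x, t, h=1e-3, span=10.0):
    """Grid approximation of the infimum over q <= x of U0(q) + t*g((x - q)/t), g(z) = z^2/4."""
    steps = int(round(span / h))
    # With q = x - j*h the term t*g((x - q)/t) equals (j*h)^2 / (4t).
    return min(U0_point_mass(x - j * h) + (j * h) ** 2 / (4.0 * t)
               for j in range(steps + 1))

def U_candidate(x, t):
    """Candidate closed form of U(x, t) for the unit point mass."""
    if x <= 0:
        return 0.0
    return min(x * x / (4.0 * t), 1.0)

def u_candidate(x, t):
    """Candidate density: x/(2t) on (0, 2*sqrt(t)) and zero elsewhere."""
    return x / (2.0 * t) if 0.0 < x < 2.0 * math.sqrt(t) else 0.0

t = 1.0
points = [-1.0, 0.5, 1.0, 1.9, 3.0]
gaps = [abs(lax_U(x, t) - U_candidate(x, t)) for x in points]
total_mass = lax_U(3.0, t) - lax_U(-1.0, t)
```

The grid values agree with the candidate up to the grid resolution, and the total mass U(3, 1) − U(−1, 1) equals one, as it should for a solution started from a unit mass.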
The remainder of the section works through the proofs.
Proof of Lemma A7. First we show that which implies that the infimum in (A6) is a finite number. Pick ε ∈ (0, t). By assumption (A5), which increases to ∞ as q ↘ −∞. By (A19) it suffices to consider only q ranging over some bounded interval. Since U 0 (q) is nondecreasing and left-continuous, the expression U 0 (q) + tg((x − q)/t) is lower semicontinuous as a function of q. Consequently the infimum in (A6) is achieved at some q, and then also at the finite values
Recall that we defined q(x, t) = q + (x, t) for t > 0. Set q(x, 0) = x.
(v) For a fixed t > 0, q(x, t) is increasing and right-continuous as a function of x. As a function on R × [0, ∞), q(x, t) is upper semicontinuous and locally bounded, hence in particular jointly measurable.
But the right-hand side can be made arbitrarily large by choice of z. Thus no such ε > 0 can exist, and we must have that lim inf q j ≥ x.
(v) That q( · , t) is increasing follows from item (i); right-continuity follows from letting y ↘ x in the inequality U 0 (q(y, t)) + t g((y − q(y, t))/t) ≤ U 0 (q) + t g((y − q)/t) (q ≤ x − ct arbitrary) and from noting that lim inf y↘x q(y, t) ≥ q(x, t) by item (i).
For upper semicontinuity we show that q(x, t) ≥ β whenever (x n , t n ) → (x, t) in R × [0, ∞) and q(x n , t n ) ≥ β for all n. In case x n ↗ x, apply (i), (ii), and (iv): In case x n ↘ x, we have by (i) for all m > n; hence by (ii) or (iv) by the right-continuity of q( · , t).
For local boundedness we show that q(x, τ ) is bounded on any rectangle [a, b] × [0, t]. An upper bound comes from q(x, τ ) ≤ x − cτ . Since q( · , τ ) is increasing, the lower bound comes from showing that inf 0≤τ ≤t q(a, τ ) > −∞. Suppose q(a, τ ) → −∞ as τ → σ in [0, t]. If σ = 0, then q(a, τ ) → a by (iv). For σ > 0 the argument used in the proof of Lemma A7 shows that
Proof of (A9). Fix t > 0, let −∞ < x < y < ∞, and set q x = q(x, t) and q y = q(y, t). Then for some ξ ∈ ((x − q x )/t, (y − q x )/t) by the mean value theorem. By the local boundedness of b this shows that m t is absolutely continuous with respect to Lebesgue measure. Letting y ↘ x shows that by Lebesgue's differentiation theorem and the continuity of b.
Conversely, take x close enough to y so that (x − q y )/t > c 0 . Then from for a.e. y.
Proof of items (i) and (ii) of Theorem A1. Direct consequences of item (v) of Lemma A20 and the continuity of b.
As a first step towards solving the equation with u(x, t), we show that the equation is valid off the {t = 0} boundary.
Lemma A25. For 0 < t 0 < t 1 and φ ∈ C 1 0 (R), Proof. Let t 0 = s 0 < s 1 < · · · < s n = t 1 be a partition. Integrate by parts and utilize the shorthand q i = q(x, s i ) to write Taking into account that g(z) = zb(z) − f(b(z)) and adding and subtracting terms turns the above sum into Since q i−1 and q i are minimizers in (A6) for U(x, s i−1 ) and U(x, s i ), respectively, we have (A27) α i (x) ≤ 0 and β i (x) ≥ 0.
Note that (f ∘ b)′(z) = z b′(z) and use the mean value theorem to rewrite A i (x) as x − q i σ i for some numbers σ i , θ i ∈ (s i−1 , s i ). As x varies in the compact support of φ and 0 < t 0 ≤ s i−1 < s i ≤ t 1 , the q i stay bounded and the arguments of b are contained in a fixed compact set. Thus by the continuity of b′, uniformly over x. Similarly one sees that Combining what we have done so far, For n ∈ Z + take s i = t 0 + i(t 1 − t 0 )/n and set and utilizing (A27) we can write It follows from item (ii) of Lemma A20 that whenever q − (x, τ ) = q + (x, τ ). In particular, φ′(x)Ψ n (x, τ ) → φ′(x)f(u(x, τ )) a.e. by Lemma A20(i) and (v). Since Ψ n (x, τ ) ≤ I {φ′ ≠ 0}×[t 0 ,t 1 ] (x, τ ) · sup z∈K |f(b(z))| for a certain compact set K, we may apply dominated convergence to (A28) and conclude that φ′(x)f(u(x, τ )) dx dτ ≥ 0.
In particular, this says that m t → m 0 vaguely as t → 0, so the second term in (A26) converges to ∫ φ dm 0 as t 0 → 0. To get the required convergence on the right-hand side of (A26) and thereby prove item (iv) of Theorem A1, we need local integrability of f(u(x, t)) up to the boundary {t = 0}, as stated in Theorem A1(iii).
Proof. Proofs of (i) and (ii) follow straightforwardly from (A6) and (A8) and the monotonicity of b. For (iii) we need to show that q k (x, t) ↘ q(x, t). Let q̄ = lim k→∞ q k (x, t). Let q ≤ x − ct be arbitrary, and note that q̄ ≤ q k (x, t) ≤ x − ct. Use the minimizing property of q k (x, t) and the hypothesis to write Letting k ↗ ∞ and comparing the first and last lines gives showing that q̄ is a minimizer for U(x, t) and thereby must satisfy q̄ ≤ q(x, t). By item (ii), q̄ ≥ q(x, t).
Completion of the proof of Theorem A1. Item (iii) is a part of (A33) proved above, and this justifies taking the t 0 ↘ 0 limit in (A26) to obtain (A10).
Next we develop the uniqueness results, beginning with the fact that we have the entropy solution for L ∞ initial data.