Front location determines convergence rate to traveling waves

We propose a novel method for establishing the convergence rates of solutions to reaction-diffusion equations to traveling waves. The analysis is based on the study of the traveling wave shape defect function introduced in [2]. It turns out that the convergence rate is controlled by the distance between the "phantom front location" for the shape defect function and the true front location of the solution. Curiously, the convergence to a traveling wave itself has a pulled nature, regardless of whether the traveling wave is of pushed, pulled, or pushmi-pullyu type. In addition to providing new results, this approach dramatically simplifies the proof in the Fisher-KPP case and gives a unified, succinct explanation for the known algebraic rates of convergence in the Fisher-KPP case and the exponential rates in the pushed case.


Introduction
We consider the long-time behavior of solutions to reaction-diffusion equations of the form $u_t = u_{xx} + f(u)$, (1.1) with a nonlinearity $f \in C^2([0,1])$ that satisfies $f(0) = f(1) = 0$, $f'(0) > 0$, $f(u) > 0$ for $u \in (0,1)$. (1.2) In addition, we normalize the nonlinearity so that $f'(0) = 1$. (1.3) This condition can be achieved by a simple space-time rescaling and is not an extra assumption on $f(u)$. Reaction-diffusion equations of the form (1.1) are used in a wide variety of settings to understand how the interplay of diffusive spreading and growth gives rise to front propagation and invasions. Our interest is in precisely quantifying this behavior.

Convergence in shape to a traveling wave
Traveling waves are solutions to (1.1) of the form $u(t,x) = U_c(x - ct)$, with a profile $U_c(x)$ such that $-cU_c' = U_c'' + f(U_c)$ (1.4) and $0 < U_c(x) < 1$ for all $x \in \mathbb{R}$, $U_c(-\infty) = 1$, $U_c(+\infty) = 0$. (1.5) Solutions to (1.4)-(1.5) are unique only up to translation, so we often fix the choice of the wave by the normalization $U_c(0) = 1/2$. (1.6) Another natural normalization is mentioned in Section 2; see (2.7) below. For nonlinearities satisfying (1.2), there exists a minimal speed $c_* > 0$ such that traveling waves exist if and only if $c \ge c_*$ [21].
The normalization (1.3) implies that $c_* \ge 2$. We denote the profile of the wave corresponding to the minimal front speed $c_*$ as $U_*(x)$. The study of the long-time behavior of the solutions to (1.1) with initial conditions that decay rapidly as $x \to +\infty$ goes back to the original papers [17,24]. To be concrete and avoid some additional technicalities, we momentarily consider the case where the initial condition for (1.1) is a step function: $u_0(x) = u(0,x) = \mathbb{1}(x \le 0)$. (1.7) It is well known that this assumption may be greatly relaxed, as long as $u_0(x)$ decays sufficiently rapidly as $x \to +\infty$; see [8,13] for a recent detailed analysis of this issue. It was shown in the original KPP paper [24] that the solution $u(t,x)$ to (1.1) converges to $U_*(x)$ in shape. That is, there exists a reference frame $m(t)$ such that $u(t, x + m(t)) - U_*(x) = o(1)$, as $t \to +\infty$. (1.8) We will refer to $m(t)$ as the front location. Note that, strictly speaking, it is only defined up to an $o(1)$ term as $t \to +\infty$. Moreover, the KPP paper showed that the front location $m(t)$ has the asymptotics $m(t) = c_* t + o(t)$, as $t \to +\infty$. (1.9) The extraordinarily innovative proof in [24] relies on, in modern terminology, an intersection number argument and can be extended not only to all Lipschitz $f(u)$ that satisfy (1.2), but to a much larger class of nonlinearities. In that sense, both (1.8) and (1.9) are fairly universal results.
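The convergence in shape (1.8) and the linear spreading (1.9) can be observed numerically. The following is a minimal sketch and not part of the analysis in this paper: it assumes that (1.1) is the standard equation $u_t = u_{xx} + f(u)$ with the Fisher-KPP nonlinearity $f(u) = u - u^2$, starts from the step initial condition (1.7), and uses the level set $u = 1/2$ as a stand-in for $m(t)$. The function name, the explicit finite-difference scheme, the truncated interval, and the grid parameters are all illustrative choices.

```python
import numpy as np

def simulate_front(T=10.0, dx=0.1, dt=0.004, x_min=-20.0, x_max=60.0):
    """Explicit finite differences for u_t = u_xx + u(1 - u) from step data.

    Returns the sampled times and the front location m(t), taken here to be
    the point where u crosses 1/2 (linear interpolation between grid nodes).
    The time step satisfies dt <= dx**2 / 2, needed for stability.
    """
    x = np.arange(x_min, x_max + dx, dx)
    u = (x <= 0.0).astype(float)          # step initial condition
    n_steps = int(round(T / dt))
    times, fronts = [], []
    for k in range(1, n_steps + 1):
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        u = u + dt * (lap + u * (1.0 - u))
        u[0], u[-1] = 1.0, 0.0            # far-field values on the truncated domain
        if k % 250 == 0:                  # record once per unit of time
            i = int(np.argmax(u < 0.5))   # first node with u < 1/2
            frac = (u[i - 1] - 0.5) / (u[i - 1] - u[i])
            times.append(k * dt)
            fronts.append(x[i - 1] + frac * dx)
    return np.array(times), np.array(fronts)

if __name__ == "__main__":
    times, fronts = simulate_front()
    for t, m in zip(times, fronts):
        # m(t)/t should slowly approach the minimal speed 2 from below
        print(f"t = {t:5.1f}   m(t) = {m:7.3f}   m(t)/t = {m / t:.3f}")
```

On such short time horizons the ratio $m(t)/t$ is still visibly below $2$, which is one way to see that the $o(t)$ correction in (1.9) is not negligible in practice.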

Front location and convergence rates in the pushed and pulled cases
On the other hand, both the precise character of the $o(t)$ correction to the front location in (1.9) and the rate of the "convergence in shape" in (1.8) depend heavily on the profile of the nonlinearity $f(u)$, as neither can be easily obtained from the intersection number arguments. The results quantifying these convergence rates and making the asymptotics of the front location $m(t)$ more precise than (1.9) are more modern and are very different in what are known as the "pushed" and "pulled" regimes. Recall that, informally, front propagation is pushed if it is "bulk dominated" and is pulled if it is "tail dominated". For positive nonlinearities that satisfy (1.2)-(1.3), the spreading speed for the linearized problem is $c_{\rm lin} = 2$. We will give a more refined definition below, but for the moment the reader can think that propagation is pushed if $c_* > c_{\rm lin} = 2$ and pulled if $c_* = c_{\rm lin} = 2$. Contemporary arguments to establish convergence rates in the pushed case are spectral in nature, while those for pulled fronts are motivated in great part by the connection to branching Brownian motion and other log-correlated random fields, and typically use entirely different techniques.
When the front is pushed, so that $c_* > 2$, its location has the asymptotics $m(t) = c_* t + x_0 + o(1)$, as $t \to +\infty$, (1.11) with some $x_0 \in \mathbb{R}$. Moreover, the convergence rate in (1.8) is exponential [15,34]: $u(t, x + m(t)) - U_*(x) = O(e^{-\omega t})$, as $t \to +\infty$, (1.12) with some $\omega > 0$. The proofs of (1.11)-(1.12) in [15,34], as well as the later extensions to other "pushed fronts" problems, are based on spectral gap arguments and provide implicit estimates on the exponential rate $\omega > 0$ of convergence in (1.12). On the other hand, when $f(u)$ is of the Fisher-KPP type, so that, in addition to (1.2), it satisfies $f(u) \le f'(0)u$, for all $0 < u < 1$, (1.13) the propagation is pulled and spreading is dominated by the region far ahead of the front. Under this assumption, when the normalization (1.3) is adopted, the minimal speed is $c_* = c_{\rm lin} = 2$ and the front location has the asymptotics $m(t) = 2t - \frac{3}{2}\log t + x_0 + o(1)$, as $t \to +\infty$, (1.14) with some $x_0 \in \mathbb{R}$, first established in the pioneering works by Bramson [11,12] via the connection with branching Brownian motion. The Bramson asymptotics were revisited in [1,2,4,7,19,22,25,29,31,36], including in some more general pulled settings, and also refined in [8,9,19,20,30]. However, unlike in the pushed case, where the front location asymptotics (1.11) were sufficient for the convergence rate estimate (1.12), obtaining a convergence rate in (1.8) for the Fisher-KPP nonlinearities required much finer asymptotics than given by the Bramson result (1.14). To this end, Graham improved in [20] the Bramson asymptotics for the Fisher-KPP nonlinearities to the finer expansion (1.15), which involves two constants $x_0, x_1 \in \mathbb{R}$. This confirmed a series of formal predictions in [9,14], partly proved in [23,30].
The "very fine" asymptotics in (1.15) lead to a convergence bound of the form $u(t, x + m(t)) - U_*(x) = O(1/t)$ after using an asymptotic expansion based on (1.15) that approximately solves (1.1). It was also shown in [20] that this rate cannot be improved for the Fisher-KPP nonlinearities. We note that, with different assumptions on the initial data that rule out (1.7) and its compact perturbations, faster convergence rates were proven by Avery and Scheel [6]. While the Bramson asymptotics (1.14) hold for all Fisher-KPP reactions, they do not hold for all nonlinearities that satisfy (1.2)-(1.3) for which $c_* = 2$. As was shown in [2,19], there is a class of nonlinearities $f(u)$ such that the front location asymptotics are not (1.14) but $m(t) = 2t - \frac{1}{2}\log t + x_0 + o(1)$. Informally, this happens when $f(u)$ is exactly at the pushed-pulled transition. We refer to these as "pushmi-pullyu" fronts. Thus, the distinction between various regimes of propagation cannot be made based solely on whether the propagation speed is predicted by the linearization (1.10) or not. It turns out that it should be made based both on the propagation speed and the asymptotic behavior of the traveling wave as $x \to +\infty$. Let us, therefore, define terminology for the three classes roughly discussed above. We remind the reader that $f(u)$ satisfies (1.2)-(1.3).
• A traveling wave is pushed if $c_* > 2$.
• A traveling wave is pulled if $c_* = 2$ and there is some $A_0 > 0$ such that $U_*(x) \sim A_0 x e^{-x}$, as $x \to +\infty$. (1.16)
• A traveling wave is pushmi-pullyu if $c_* = 2$ and there is $A_1 > 0$ such that $U_*(x) \sim A_1 e^{-x}$, as $x \to +\infty$. (1.17)
We refer the reader to [2,5,8,9,14,18,19,35] for a more in-depth discussion. We often abuse terminology and refer to the nonlinearity itself as being "pushed," "pulled," or "pushmi-pullyu." A simple linearization argument shows that the two asymptotics in (1.16)-(1.17) are the only possibilities when $c_* = 2$, so the cases above are exhaustive. Intuitively, once the normalization (1.3) is fixed, "large" nonlinearities $f$ correspond to pushed fronts, "small" ones correspond to pulled fronts, and the boundary case corresponds to pushmi-pullyu fronts.
There are two important points to make before discussing our results. First, while convergence rates have been established in the Fisher-KPP and pushed cases, nothing quantitative is known for the intermediate cases; that is, pushmi-pullyu nonlinearities and pulled nonlinearities not satisfying the Fisher-KPP condition (1.13). Second, the arguments used to establish convergence rates in the Fisher-KPP and pushed regimes are quite different. This indicates the difficulty in closing the gap: establishing sharp rates in the transitional cases and developing a cohesive understanding of convergence rates in all cases.

An informal statement of the results
Our interest here is to complete and unify the separate pictures for the pulled, pushed, and pushmi-pullyu cases described above. Despite very different approaches to the proof of convergence to the traveling wave in the pushed and pulled cases, one can see one common feature in the original KPP results (1.8)-(1.9) and in the pushed case (1.11)-(1.12). Namely, the obtained rate of convergence of $u(t,x)$ to $U_*(x)$ is much finer than the corresponding obtained rate of convergence for the front location. To see this, one needs only to compare (1.8) to (1.9) in the pulled case and (1.11) to (1.12) in the pushed case.
Here, we recover and explain this philosophy that "rough front location asymptotics gives a finer rate of convergence to a traveling wave." We introduce a novel approach to quantifying the convergence rate in (1.8) that provides one simple explanation both for the exponential and algebraic rates in the pushed and pulled cases, respectively. Roughly, we prove, under some technical assumptions, the bound (1.18) (cf. Theorem 2.1): an algebraic $O(1/t)$ rate when $c_* = 2$ and an explicit exponential rate when $c_* > 2$. As we have mentioned, in the case $c_* = 2$, the convergence rate in (1.18) has been established in [20] for the Fisher-KPP nonlinearities based on the very fine asymptotics (1.15). The proof here is completely different and avoids (1.15) altogether. For the other pulled and pushmi-pullyu cases the rate in (1.18) is, to the best of our knowledge, new, as is the explicit rate in the pushed case.
To explain the approach to the proof of the convergence rates in (1.18), we need to recall the notion of the shape defect function introduced in [2]. It is well known that the traveling wave solutions to (1.1) are monotonically decreasing. Thus, there is a $C^1(0,1)$ function $\eta(u)$ so that $-U_*' = \eta(U_*)$. (1.19) It is easy to see that $\eta(u) > 0$ for all $u \in (0,1)$ and $\eta(0) = \eta(1) = 0$. (1.20) We call $\eta(u)$ the "traveling wave profile function." We define the shape defect function to be $w(t,x) = -u_x(t,x) - \eta(u(t,x))$. (1.21) This, in a sense, represents how close the solution $u(t,x)$ is to solving (1.19) and is a measure of the "distance in shape" between $u(t,x)$ and the profile $U_*(x)$. A major advantage here is that we do not a priori need to know which shift of $U_*$ is the closest one in order to use $w$ to obtain bounds on $u(t,x) - U_*(x)$. Imprecisely, one finds the two-sided comparison (1.22) between $w$ and $u - U_*$, in which the second inequality holds up to the appropriate shift. We note that related quantities were used in [16,27,32,36]; see [2] for a more detailed discussion.
The main idea of this work is to estimate $w(t,x)$ directly through its evolution equation $w_t = w_{xx} + Q(u)w + \eta''(u)w^2$, (1.23) where, by [2, equation (4.1)], $Q(u) = f'(u) + 2\eta(u)\eta''(u)$, for all $u \in (0,1)$, (1.24) and use that information to read off the rate of convergence of $u(t,x)$ to the traveling wave profile $U_*(x)$. As we see below, the nonlinearity $Q(u)$ satisfies (1.25) and, for a large class of nonlinearities, we also have (1.26); see Lemma 5.1.
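For the reader's convenience, here is a short formal derivation of the evolution equation for the shape defect function. It is a sketch under the assumptions that (1.1) is $u_t = u_{xx} + f(u)$, that $w = -u_x - \eta(u)$, and that the profile function satisfies $-U_*' = \eta(U_*)$; all functions are treated as smooth enough to differentiate freely.

```latex
% Step 1: an identity for eta. Insert -U_*' = eta(U_*) and
% U_*'' = eta(U_*) eta'(U_*) into -c_* U_*' = U_*'' + f(U_*):
\[
  f(u) = c_*\eta(u) - \eta(u)\eta'(u),
  \qquad\text{so}\qquad
  f'(u)\eta(u) - \eta'(u) f(u) = -\eta(u)^2\eta''(u).
\]
% Step 2: differentiate w = -u_x - eta(u) and use u_t - u_xx = f(u):
\begin{align*}
  w_t - w_{xx}
  &= -\bigl(u_t - u_{xx}\bigr)_x - \eta'(u)\bigl(u_t - u_{xx}\bigr) + \eta''(u)\,u_x^2 \\
  &= -f'(u)\,u_x - \eta'(u) f(u) + \eta''(u)\,u_x^2 .
\end{align*}
% Step 3: substitute u_x = -(w + eta(u)) and collect powers of w:
\begin{align*}
  w_t - w_{xx}
  &= \bigl[f'(u) + 2\eta(u)\eta''(u)\bigr] w + \eta''(u)\,w^2
     + \bigl[f'(u)\eta(u) - \eta'(u) f(u) + \eta(u)^2\eta''(u)\bigr] \\
  &= Q(u)\,w + \eta''(u)\,w^2,
  \qquad Q(u) = f'(u) + 2\eta(u)\eta''(u),
\end{align*}
% since the last bracket vanishes by Step 1.
```

In particular, when $Q(u) \le 1$, $\eta''(u) \le 0$, and $w \ge 0$, this computation gives $w_t - w_{xx} \le w$, which is the form in which the equation is used for comparison arguments.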
A key informal observation is that if $u(t,x)$ is a solution to (1.1), there is a "phantom front" location $m_w(t)$ that is far behind the true front $m(t)$ and is where the shape defect function $w(t,x)$ "wants" to have its front. The phantom front location of $w$ can be read off its equation (1.23). Surprisingly, the evolution of $w(t,x)$ in (1.23) turns out to be "Fisher-KPP-like," regardless of whether the solution $u(t,x)$ to (1.1) itself is of the pushed, pulled, or pushmi-pullyu nature. This is the main and, to us, unexpected unifying element of all three cases. The simple reason behind this pulled nature of $w(t,x)$ is that, because of (1.25)-(1.26), ahead of the front it satisfies $w_t \approx w_{xx} + w$, (1.27) which is exactly the same linearized problem as for the Fisher-KPP equation.
The second new key point is that the distance between the true and the phantom fronts controls the rate of convergence in (1.18), once again, regardless of whether the front is pushed or pulled. More precisely, at an informal level, the main result of this paper is that the convergence rate in (1.18) comes from the estimate (1.28)-(1.29), which bounds the distance to the wave first by the size of $w$ at the front and then by a quantity determined by the lag $D(t) = m(t) - m_w(t)$; here the first approximation follows from (1.22) and the second comes from the "Fisher-KPP like" nature of (1.27); see also (2.18) below. In particular, this explains why one needs only "rough" asymptotics for $m(t)$ and $m_w(t)$ to get an "exponentially finer" convergence rate in (1.18). In order to pass from (1.29) to (1.18), we show that, as long as $f(u)$ satisfies (1.2)-(1.3) and some additional technical assumptions, the front location and the phantom front location have the following behavior as $t \to +\infty$, up to $O(1)$ terms: $m(t) = c_* t$ and $m_w(t) = 2t - \frac{3}{2}\log t$ in the pushed case; $m(t) = 2t - \frac{1}{2}\log t$ and $m_w(t) = 2t - \frac{3}{2}\log t$ in the pushmi-pullyu case; $m(t) = 2t - \frac{3}{2}\log t$ and $m_w(t) = 2t - \frac{5}{2}\log t$ in the pulled case. (1.30) Using (1.29) and (1.30) leads directly to (1.18). The asymptotics for $m(t)$ in (1.30) in all three cases are already known, and to a better precision than stated in (1.30), with the pushmi-pullyu case analyzed recently in [2] and formally predicted in [8,14,26]. Our main goal here is to explain what the phantom front location $m_w(t)$ is, how (1.29) comes about, and how the asymptotics of $m_w(t)$ in (1.30) can be computed. We emphasize that, unlike [20,30], which analyzed the Fisher-KPP case, we use only the $O(1)$-precise asymptotics for $m(t)$ and not anything finer to get the convergence rates in (1.18).
In all of the three cases in (1.30), the analysis of the phantom front location $m_w(t)$ for the shape defect function is based on typical techniques for the Fisher-KPP equations (pulled fronts). This leads to the surprising conclusion that, for a large class of nonlinearities, the convergence of the shifted solution $u(t, x + m(t))$ to $U_*(x)$ is a pulled phenomenon, regardless of the pushed, pulled, or pushmi-pullyu character of the spreading of $u(t,x)$ itself. The reader may notice that the phantom front asymptotics $m_w(t)$ in (1.30) have the Bramson form (1.14), which is a signature of the pulled fronts, precisely when $m(t)$ is not pulled. On the other hand, in the pulled case it is the front location $m(t)$ itself that has the Bramson asymptotics (1.14), while the phantom front position $m_w(t)$ has an extra $\log t$ delay relative to this location. This will be explained below. Of course, without such a delay between $m(t)$ and $m_w(t)$, we would have $D(t) = O(1)$ and (1.29) would be useless!
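The bookkeeping behind the lag between the two fronts can be spelled out in a few lines. The sketch below is illustrative only: it uses the leading-order front locations quoted in the text (with all $O(1)$ terms dropped), the function names are hypothetical, and it evaluates $e^{-D(t)}$ only in the two regimes with $c_* = 2$, where the lag is logarithmic and the heuristic reproduces the $1/t$ rate. In the pushed regime the lag grows linearly in $t$, so Gaussian corrections to the tail of $w$ matter and $e^{-D(t)}$ alone should not be read as the rate.

```python
import math

# Leading-order front locations quoted in the text, O(1) terms dropped.
def true_front(t, regime, c_star=2.0):
    """Front location m(t) of the solution u."""
    if regime == "pushed":
        return c_star * t
    if regime == "pushmi-pullyu":
        return 2.0 * t - 0.5 * math.log(t)
    if regime == "pulled":
        return 2.0 * t - 1.5 * math.log(t)
    raise ValueError(f"unknown regime: {regime}")

def phantom_front(t, regime):
    """Phantom front location m_w(t) of the shape defect function w."""
    if regime in ("pushed", "pushmi-pullyu"):
        return 2.0 * t - 1.5 * math.log(t)   # Bramson form
    if regime == "pulled":
        return 2.0 * t - 2.5 * math.log(t)   # extra log t delay
    raise ValueError(f"unknown regime: {regime}")

def lag(t, regime, c_star=2.0):
    """Lag D(t) = m(t) - m_w(t) of the phantom front behind the true front."""
    return true_front(t, regime, c_star) - phantom_front(t, regime)

if __name__ == "__main__":
    for regime in ("pushmi-pullyu", "pulled"):
        for t in (10.0, 100.0, 1000.0):
            # In both c_* = 2 regimes D(t) = log t, so exp(-D(t)) = 1/t.
            print(f"{regime:14s} t = {t:7.1f}  exp(-D) = {math.exp(-lag(t, regime)):.6f}")
```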
We hope to convince the reader that the scheme outlined above is exceedingly simple to put into practice, beyond the situations we consider in the present paper. Once one starts to work directly with the shape defect function $w(t,x)$ and has the intuition (1.29), the convergence proof is straightforward. In particular, the sometimes heavy computations, such as in the proof of Lemma 4.3 below, should not obfuscate this basic fact. We do not consider more general problems here because our interest is in the simplest possible presentation to illustrate the meaning behind the convergence rates.

Organization of the paper
To better illustrate the method, we first focus on the "Hadeler-Rothe" family of nonlinearities $f$ given by (2.1) below. In Section 2, we give a statement of our main result, Theorem 2.1, which establishes (1.18) in this context. This section also contains an expanded discussion both of the proof and of the sharpness of our bounds. The proof of Theorem 2.1, given in Section 3, relies on estimates of the shape defect function in Theorem 3.3, which are proved in Section 4.
In order to analyze the evolution equation (1.23) for $w$, we require some properties of the traveling wave profile function $\eta(u)$ and the nonlinearity $Q(u)$ that appears in (1.23). They are established in Section 5 in some generality, not just for the Hadeler-Rothe nonlinearities. Following this, Section 6 contains an extension of the convergence rates (1.18) to the general case. The key observation is that the proof of Theorem 2.1 uses the particular form of the Hadeler-Rothe nonlinearities essentially only through these properties of $Q$ and $\eta$. General versions of Theorem 2.1 are formulated there, in Theorems 6.1 and 6.2.

Convergence rates for the Hadeler-Rothe nonlinearities
To fix the ideas in a simple setting, we look in detail at the special class of the so-called Hadeler-Rothe nonlinearities. They have the form (2.1), with some $n \ge 2$ and $\chi \ge 0$. The traveling waves for such nonlinearities were discussed in detail in [21,28] for $n = 2$ and in [14] for $n > 2$. The classical Fisher-KPP nonlinearity $f(u) = u - u^2$ is a special case of (2.1) with $\chi = 0$ and $n = 2$. It was shown in [14,21,28] for nonlinearities of the form (2.1) that there is a pushed-to-pulled transition at $\chi = 1$; see (2.2). Moreover, the traveling wave profile function is explicit for $\chi \ge 1$ and is given by (2.3); see [2, Proposition A.2]. Hence, when $\chi \ge 1$, the traveling waves have the purely exponential asymptotics (2.4) (cf. (1.17)), with some $\varepsilon, A_1 > 0$. When $0 \le \chi < 1$, no such explicit expression is possible for $\eta(u)$ because $U_*$ has the pulled asymptotics (2.5), with some $\varepsilon > 0$ and $A_0 > 0$. The decay rate $\lambda_0 > 0$ in (2.4) and (2.5) is the largest root of $\lambda^2 - c_*\lambda + 1 = 0$. (2.6) Let us mention that, after a spatial shift, we may assume that $B_0 = 0$, so that (2.5) takes the simpler form (2.7). This is another natural normalization that we will sometimes use below as an alternative to (1.6).
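As a sanity check on the explicit profile function in the range $\chi \ge 1$, one can verify numerically the identity $c_*\eta(u) = \eta(u)\eta'(u) + f(u)$, which results from inserting $-U_*' = \eta(U_*)$ into the standard traveling wave equation $-c_*U_*' = U_*'' + f(U_*)$. The sketch below rests on assumptions that are not taken from the displays of this section: it uses one common parametrization of the Hadeler-Rothe family, $f(u) = (u - u^n)(1 + \chi n u^{n-1})$, together with the candidates $\eta(u) = \sqrt{\chi}\,(u - u^n)$ and $c_* = \sqrt{\chi} + 1/\sqrt{\chi}$. These were chosen because they reduce to $f(u) = u - u^2$ at $\chi = 0$, $n = 2$ and give $c_* = 2$ at $\chi = 1$; the function names are hypothetical.

```python
import numpy as np

def hadeler_rothe(u, chi, n):
    """Assumed parametrization: f(u) = (u - u^n)(1 + chi * n * u^(n-1))."""
    return (u - u**n) * (1.0 + chi * n * u**(n - 1))

def eta_explicit(u, chi, n):
    """Candidate profile function for chi >= 1: eta(u) = sqrt(chi) (u - u^n)."""
    return np.sqrt(chi) * (u - u**n)

def wave_residual(chi, n, num=2001):
    """Max over [0, 1] of |c eta - eta eta' - f|, with c = sqrt(chi) + 1/sqrt(chi)."""
    u = np.linspace(0.0, 1.0, num)
    c = np.sqrt(chi) + 1.0 / np.sqrt(chi)
    eta = eta_explicit(u, chi, n)
    d_eta = np.sqrt(chi) * (1.0 - n * u**(n - 1))   # derivative of eta
    return float(np.max(np.abs(c * eta - eta * d_eta - hadeler_rothe(u, chi, n))))

if __name__ == "__main__":
    for chi, n in [(1.0, 2), (2.0, 2), (4.0, 3)]:
        print(f"chi = {chi}, n = {n}: residual = {wave_residual(chi, n):.2e}")
```

Under these assumptions $\eta(u) \approx \sqrt{\chi}\,u$ near $u = 0$, so the wave decays like $e^{-\sqrt{\chi}x}$, and $\sqrt{\chi}$ is indeed the largest root of $\lambda^2 - c_*\lambda + 1 = 0$ when $\chi \ge 1$.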
The corresponding front location asymptotics for the solutions to (1.1) with a rapidly decaying initial condition were established in [2]: there exists $x_0$, depending on the initial condition $u_0$, so that (2.8) holds as $t \to \infty$. It is convenient to recall the asymptotic behavior of $U_*$ as $x \to -\infty$ as well; see (2.9). Here, $\lambda_1$ is the nonnegative root of (2.10). Notice that, due to (2.1), the value of $f'(1)$ entering (2.10) is explicit.

The main result for the Hadeler-Rothe nonlinearities
In this section, we state the convergence rates in (1.18) for the Hadeler-Rothe nonlinearities of the form (2.1). For simplicity, we take an initial condition $u(0,x) = u_0(x)$ such that $0 \le u_0(x) \le 1$ for all $x \in \mathbb{R}$, and there exists some $L_0 \in \mathbb{R}$ so that $u_0(x) = 0$ if $x \ge L_0$, and $w_0(x) = w(0,x) \ge 0$, for all $x \in \mathbb{R}$. (2.11) The non-negativity assumption on $w(0,x)$ simply encodes that the initial condition $u_0(x)$ is "steeper" than $U_*(x)$. In particular, it follows from (2.11) that $u_0(x)$ is monotonically decreasing and $u_0(x) \to 1$ as $x \to -\infty$. The comparison principle and (1.23) then yield that $u(t,x)$ remains steeper than $U_*(x)$ for all $t > 0$, in the sense that $w(t,x) > 0$, for all $t > 0$, $x \in \mathbb{R}$. (2.12) A typical example of such an initial condition is $u_0(x) = \mathbb{1}(x \le 0)$. We believe that the non-negativity assumption on $w(0,x)$ can be relaxed by using results such as those of Angenent in [3] or Roquejoffre in [33] to show that $w(t,x)$ "eventually" becomes nonnegative, at least on every compact set. We adopt this assumption to avoid the related technicalities.
Our main result for the Hadeler-Rothe nonlinearities is as follows.
Theorem 2.1. Assume that $f(u)$ is given by (2.1) with some $\chi \ge 0$ and $n \ge 2$. Let $c_*$ be given by (2.2). Then: (i) if $0 \le \chi \le 1$, the algebraic bound (2.13) holds; (ii) if $\chi > 1$, then for any $\Lambda > 0$, the exponential bound (2.14) holds. As will be seen from the proof, convergence occurs in a (stronger) weighted $L^\infty$-norm, but we opt for the simpler statement here.
The main ingredients in Theorem 2.1 are knowledge of the true front location $m(t)$ as well as the behavior of $Q$ and $\eta$ in (1.26). In this sense, we use the form (2.1) in a rather weak way. We provide a full discussion of the general case in Section 6 and formulate broader versions of Theorem 2.1 there; see Theorems 6.1 and 6.2.
Interestingly, unlike the classical results in [15,16,34] for pushed waves, the estimate (2.14) does not depend on $f'(1)$. Actually, a similar argument using our methods yields a messier global estimate whose exponent involves $f'(1)$. However, the $f'(1)$ term in the exponential merely reflects the "slowness" with which $U_*$ converges to $1$ on the left. We choose to present the "at and beyond the front" estimate (2.14) above because it is a better representation of the mechanism that pulls $u(t,x)$ towards $U_*(x)$. In particular, it reflects the aforementioned pulled nature of the convergence of the solution to the wave in shape, regardless of whether the wave itself is pushed or pulled.

Discussion of the proof
A very useful observation is that, for the Hadeler-Rothe nonlinearities, (1.26) holds and the traveling wave profile function $\eta(u)$ is concave. Proposition 2.2. Assume that $f(u)$ has the form (2.1). Then, for any $\chi \ge 0$ and $n \ge 2$, the bounds (1.26) hold and $\eta$ is concave. A more precise version is stated in Lemma 4.1. Proposition 2.2 follows immediately from the explicit expression (2.3) for $\eta(u)$ when $\chi \ge 1$. Otherwise, it is proved in Lemma 4.1. Its generality, beyond the Hadeler-Rothe class, is discussed in Section 6.
Proposition 2.2 is nearly enough to understand the phantom front $m_w(t)$ as we have, at highest order, $w_t \approx w_{xx} + w$ (2.15) ahead of the front. Remarkably, this is exactly the same as the linearization for the classical Fisher-KPP equation $u_t = u_{xx} + u - u^2$. This would suggest that $m_w(t)$ should be given by the standard Bramson asymptotics (1.14) for the Fisher-KPP case. However, it has been observed that the Bramson shift may be sensitive to lower order terms ahead of the front for nonlinearities that are not better than Lipschitz near $u = 0$ [10]. In that case, (2.15) may not be a faithful approximation to (1.23). It is, thus, crucial to understand the regularity of $\eta$ near $u = 0$. As a consequence, we consider two cases depending on this regularity.
The pushed and pushmi-pullyu cases: $\chi \ge 1$
Consider first the pushed and pushmi-pullyu cases, where $\eta$ is given explicitly by (2.3) and is smooth at $u = 0$. In this case, $Q$ has the expansion (2.16) near $u = 0$. Recall that $n \ge 2$. Hence, we expect that, ahead of the front of $u(t,x)$, the shape defect function $w(t,x)$ does behave approximately as a solution to the Fisher-KPP-like problem (2.17) when $\chi \ge 1$. An informal consequence of [22] is that $w(t,x)$, being bounded and approximately satisfying (2.17) where it is small, "wants to have a front" at the location $m_w(t) = 2t - \frac{3}{2}\log t$ and should have the approximate form (2.18). On the other hand, $w(t,x)$ is governed by $u(t,x)$, which has its front at the position $m(t) = c_* t$ in the pushed case $\chi > 1$, and at $m(t) = 2t - \frac{1}{2}\log t$ in the pushmi-pullyu case $\chi = 1$ [2]. Hence, we have, up to lower order terms, $D(t) = m(t) - m_w(t) = (c_* - 2)t + \frac{3}{2}\log t$ if $\chi > 1$, and $D(t) = \log t$ if $\chi = 1$. According to (2.18), this produces the bound (2.19) which, along with (1.22), yields Theorem 2.1.
Let us note that the explicit form of $\eta$, beyond Proposition 2.2, is not needed here, because the key estimate used above, that is, the right-hand side of (2.16), follows directly from the traveling wave asymptotics (2.4) and (2.20) below. Indeed, we can see that, whenever (2.4) holds, we have, for some $\alpha > 0$, $\eta(u) \sim u + O(u^{1+\alpha})$.
The pulled case: $0 \le \chi < 1$
For $0 \le \chi < 1$, we do not have an explicit expression for $\eta(u)$ or $Q(u)$. To understand the behavior of $Q(u)$ for $u \ll 1$ in this range of $\chi$, we can, at least informally, deduce the behavior of $\eta$ and its derivatives from (2.5).
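Using (1.19), we can write two useful identities involving $\eta$; see (2.20)-(2.21). From these, we immediately observe the informal asymptotics (2.22) for $\eta$ and its derivatives near $u = 0$. These are made precise in Lemma 3.4 below. Therefore, when $0 \le \chi < 1$, the function $Q(u)$ defined in (1.24) has the asymptotics $Q(u) \approx 1 - \frac{2}{\log^2 u}$, as $u \to 0$.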
Thus, a good approximation to $w(t,x)$ is by a solution to a modification of (2.17), namely (2.23). Using, once again very informally, the main result of [10], we see that the shape defect function $w(t,x)$ "wants to have its front" at the location $m_w(t) = 2t - \frac{5}{2}\log t$, while the front of $u(t,x)$ is at the Bramson position $m(t) = 2t - \frac{3}{2}\log t$, as follows from [2]. Thus, for $0 \le \chi < 1$, we have $D(t) = \log t$ and (2.19) again yields the $O(1/t)$ convergence rate in (1.18). The above informal arguments indicate that, as we have already mentioned, the behavior of the shape defect function $w(t,x)$ is always a pulled phenomenon regardless of the pushed, pulled, or pushmi-pullyu spreading of $u(t,x)$ itself.
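The logarithmic correction to $Q$ in the pulled case can be made concrete on a model profile. The computation below is a heuristic sketch, not the proof: it assumes $Q(u) = f'(u) + 2\eta(u)\eta''(u)$ and replaces the wave by its leading-order tail $U(x) = xe^{-x}$ for $x > 1$ (that is, pulled asymptotics with the constants normalized to $A_0 = 1$, $B_0 = 0$ and all corrections dropped).

```latex
% Model tail U(x) = x e^{-x}, x > 1, with eta defined by -U' = eta(U):
\begin{align*}
  \eta(U(x))   &= -U'(x) = (x-1)e^{-x} = U(x)\Bigl(1 - \frac{1}{x}\Bigr), \\
  \eta'(U(x))  &= -\frac{U''(x)}{U'(x)} = \frac{x-2}{x-1}, \\
  \eta''(U(x)) &= \frac{1}{U'(x)}\,\frac{d}{dx}\Bigl[\frac{x-2}{x-1}\Bigr]
                = -\frac{e^{x}}{(x-1)^{3}} .
\end{align*}
% Hence the product eta * eta'' is explicit along the tail:
\[
  2\,\eta(U(x))\,\eta''(U(x)) = -\frac{2}{(x-1)^{2}} .
\]
% Since u = x e^{-x} gives log(1/u) = x - log x, we have x - 1 ~ log(1/u), and,
% with f'(u) = 1 + O(u),
\[
  Q(u) = f'(u) + 2\eta(u)\eta''(u) \approx 1 - \frac{2}{\log^{2}u},
  \qquad u \to 0 .
\]
```

The same computation with the purely exponential tail $U(x) = e^{-x}$ gives $\eta(u) = u$ and $\eta'' = 0$, so no logarithmic term appears; this is the informal reason why the extra delay of the phantom front is specific to the pulled case.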

Sharpness of Theorem 2.1
It appears that this approach leads to matching lower bounds. This is easiest to see in the pushed case. Indeed, fixing $\varepsilon, \delta \ll 1$, $R \gg 1$, and $T \gg 1$, it is straightforward to write down an explicit subsolution to (1.23) for $t \ge T$ in a frame moving with an additional $\varepsilon t$ shift. This shift allows us to use the approximation $Q \approx 1$ because it puts us in the regime where $u \ll 1$. Up to further adjusting $\delta$, it is easy to check that $w$ lies above this subsolution initially. From this, a simple ODE argument gives the lower bound. The arguments in the pulled and pushmi-pullyu cases will be more involved. We, nonetheless, expect them to proceed in a fairly straightforward manner using the shape defect function.

Estimates on the shape defect function
One of the main technical points of this paper is that the proof of Theorem 2.1 requires understanding the front location asymptotics for $u(t,x)$ only up to $O(1)$ as $t \to +\infty$. For the Hadeler-Rothe nonlinearities, this is provided by Proposition 3.1. This claim holds, of course, for a much wider class of nonlinearities; see [2,19] for a discussion. The next lemma gives preliminary control on how quickly $u(t,x)$ tends to its limits as $x \to \pm\infty$. Lemma 3.2. With $m(t)$ as in Proposition 3.1 and $w(t,x)$ satisfying (2.12), there is $C > 0$ so that $u(t, x + m(t)) \ge U_*(x + C)$ for all $x < 0$, and $u(t, x + m(t)) \le U_*(x - C)$ for all $x > 0$.
By a simple ODE comparison argument using (1.19), (1.21), and (2.12), we obtain the steepness comparison (3.2) between $u(t,\cdot)$ and translates of $U_*$. Then Lemma 3.2 follows directly from Proposition 3.1. The proof is omitted.
The main step allowing us to deduce the bounds in Theorem 2.1 is the following estimate on the shape defect function at the front location m(t).
We note that the $\varepsilon$ in cases (i) and (ii) can almost certainly be removed with a more careful proof. Our focus in this paper, however, is not on the sharpest possible behavior on the left, as $x \to -\infty$.
While the statements in Theorem 3.3(i)-(ii) for the pulled and pushmi-pullyu cases are slightly different, the proofs, postponed until Section 4, are nearly identical. They are based on the intuition discussed in Section 2.2: the equation for $w(t,x)$ wants to spread slower than the equation for $u(t,x)$. The statement of Theorem 3.3(iii) in the pushed case and its proof, presented in Section 4.1, are different because we can use an elementary estimate "out-of-the-box".

Preliminary bounds on η
We now make the behavior of $\eta(u)$ near $u = 0$, stated informally in (2.22), precise. Lemma 3.4 (Asymptotics of $\eta(u)$ in the pulled case). Assume that $f \in C^2([0,1])$ and satisfies (1.2)-(1.3). Suppose that the profile $U_*(x)$ has the asymptotics (2.5) as $x \to +\infty$. Then there exists $C > 0$ so that, for $u \in (0, 1/100)$, the estimates (i)-(iii) hold. We note that this lemma does not require the specific form (2.1) of $f$. The parts (i)-(ii) will be used to deduce Theorem 2.1 from Theorem 3.3. The property (iii) is not required for that proof but will be needed in the proof of Theorem 3.3 itself.
Proof. We use the normalization of $U_*(x)$ in which $B_0 = 0$ in (2.5). Consider first the claim (i). Fix $u \in (0, 1/100)$ and $x_u$ such that $U_*(x_u) = u$. We deduce from (2.5) with $B_0 = 0$ an expansion (3.3) of $x_u$ in terms of $u$. Using the definition of $\eta(u)$, we find the expression (3.4) of $\eta(u)$ in terms of $x_u$. The claim (i) then follows from inserting (3.3) into (3.4) and using a straightforward expansion. We omit the proofs of (ii) and (iii) as they proceed by similar arguments.
Lemma 3.5 (Asymptotics of $\eta$ in the pushed and pushmi-pullyu cases). Assume that $f \in C^2([0,1])$ and satisfies (1.2)-(1.3). Suppose that the profile $U_*$ has the asymptotics (2.4) as $x \to +\infty$. Then, there exist $\alpha > 0$ and $C > 0$ such that the corresponding estimates on $\eta$ hold for all $u \ge 0$. The proof is omitted as it is a simpler version of the proof of Lemma 3.4.

The proof of Theorem 2.1
The first steps of the proof for both cases (i) and (ii) can be handled simultaneously. As $u(t,x)$ is monotonic in $x$, we may define $\sigma(t)$ by $u(t, \sigma(t)) = U_*(0)$. (3.5) We shift to the corresponding moving frame: let $\tilde u(t,x) = u(t, x + \sigma(t))$ and $\tilde w(t,x) = w(t, x + \sigma(t))$. It follows from Proposition 3.1 that $|\sigma(t) - m(t)| \le C$. We may then apply Theorem 3.3 with $\sigma(t)$ in place of $m(t)$, at the expense of changing the constants.
To use Theorem 3.3, we need to bound the smallness of the difference $s(t,x) = \tilde u(t,x) - U_*(x)$ in terms of the smallness of the shape defect function $\tilde w(t,x)$. Note that, by the choice of $\sigma(t)$ in (3.5), $s(t,0) = 0$, for all $t > 0$. (3.6) We also point out that, by the steepness comparison (3.2), we have $s(t,x) \le 0$ when $x > 0$, and $s(t,x) \ge 0$ when $x < 0$. (3.7) In order to relate $s(t,x)$ to $\tilde w(t,x)$, note that, for each fixed $t$, $s(t,x)$ satisfies the following ODE in $x$: $s_x + \eta'(\xi(t,x))\,s = -\tilde w$. (3.8) Here, $\xi(t,x)$ is an intermediate point between $\tilde u(t,x)$ and $U_*(x)$ given by the mean value theorem. Integrating (3.8) from $x = 0$ and using (3.6) yields the representation (3.9) of $s$ in terms of $\tilde w$. From here, the main points of the proof are exactly the same in each case (i)-(ii); however, due to the difference in the precise asymptotics in Theorem 3.3 in these two cases, we have no choice but to write up each case separately.
We now consider the case $x \le 0$. Due to (3.7), we need only obtain an upper bound on $s(t,x)$. The argument is essentially the same as for $x \ge 0$. The main differences are the asymptotics of $\eta'(u)$ near $u \approx 1$ and of $U_*(x)$ and $\tilde w(t,x)$ as $x \to -\infty$. Unlike before, we need not separate into the two cases, as the behavior at the back is the same both for $0 \le \chi < 1$ and $\chi = 1$.
First, notice that $1 - Ce^{\lambda_1 x} \le U_*(x) \le \xi(t,x)$ for $x \le 0$, and that a corresponding bound on $\eta'$ near $u = 1$ holds. The combination of these two inequalities leads to (3.13), where $\varepsilon$ is as in (2.9). We use (3.9) and then (3.13) and Theorem 3.

Proof of Theorem 2.1(ii)
We proceed as above. By the Harnack inequality, it suffices to consider the case $L = 0$, so that $x \ge 0$. Again, due to (3.7), we need only establish a lower bound on $s(t,x)$. Next, note that Lemma 3.5 provides, for some $p > 1$, the control on $\eta'$ needed in (3.9). We then find, from (3.9) and Theorem 3.3(iii), once again with $m(t) = c_* t + x_0$, a chain of estimates in which the second to last step uses that $\exp\{-y^2/4t\} \le 1$ and the last inequality uses that $\lambda_0 > c_*/2$, which follows from (2.6). This concludes the proof.

The proof of Theorem 3.3
Before we begin, we state one final lemma about the behavior of $\eta$ and $Q$, defined in (1.19) and (1.24), respectively. This is the key and essentially only place in this manuscript where we use the form (2.1) of the Hadeler-Rothe nonlinearities $f(u)$. Lemma 4.1 gives the bound (4.1) together with refined bounds involving a correction term $R(u)$: for any $\delta_0, \delta_1 \in (0, 1/100)$ with $\delta_1$ sufficiently small, there are $r_0 > 0$ and $r_1 > 0$ such that (4.2)-(4.3) hold. The constant $C$ depends only on $\chi$ and $n$. The constants $r_0$ and $r_1$ depend on $\chi$, $n$, $\delta_0$, and $\delta_1$.
Let us make two comments. First, the term $2/\log^2 u$ in (4.3) is crucial for the coefficient $5/2$ in the phantom front location that appears in (1.30) in the pulled case. Second, the form (2.1) of $f$ is mainly used to prove the bound (4.1). Indeed, the estimate (4.3) follows directly from Lemma 3.4 and the definition (1.24) of $Q$. The proof of Lemma 4.1 is found in Section 5.

The pushed case: the proof of Theorem 3.3(iii)
We begin with the pushed case as it is the simplest. From (1.23), Lemma 4.1, and (2.12), we find $w_t - w_{xx} \le w$. Hence, $e^{-t}w$ is a subsolution of the heat equation and we find, by (2.11), that $w(t,x)$ is bounded by $e^{t}$ times the solution of the heat equation with the initial condition $w_0$. As $u_0(x) = 0$ for $x \ge L_0$, we also have $w_0(x) = 0$ for $x \ge L_0$, and we can assume without loss of generality that $L_0 = 0$. We obtain, for $x \ge 0$, a Gaussian upper bound on $w(t,x)$. The result follows by changing variables $x \to x + m(t) = x + c_* t + x_0$.
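The heat kernel computation behind the pushed case can be written out as follows. This is a sketch under stated assumptions rather than a transcription of the displays above: it assumes $w_t - w_{xx} \le w$ with $w \ge 0$, that $w_0$ is a nonnegative finite measure supported in $\{y \le 0\}$ (for monotone $u_0$ with limits $1$ and $0$ its total mass is at most $1$, since $w_0 \le -u_0'$), and it drops the $O(1)$ shift $x_0$ in the front location.

```latex
% Comparison with e^{t} times the heat flow of w_0:
\[
  w(t,x) \le \frac{e^{t}}{\sqrt{4\pi t}} \int_{y \le 0}
      \exp\Bigl\{-\frac{(x-y)^{2}}{4t}\Bigr\}\, w_0(dy).
\]
% For x >= 0 and y <= 0 we have (x + c_* t - y)^2 >= (x + c_* t)^2, so, in the
% moving frame x -> x + c_* t,
\begin{align*}
  w(t, x + c_* t)
  &\le \frac{e^{t}}{\sqrt{4\pi t}}
       \exp\Bigl\{-\frac{(x + c_* t)^{2}}{4t}\Bigr\} \int w_0(dy) \\
  &\le \frac{1}{\sqrt{4\pi t}}
       \exp\Bigl\{-\Bigl(\frac{c_*^{2}}{4} - 1\Bigr)t
                  - \frac{c_*}{2}\,x - \frac{x^{2}}{4t}\Bigr\} .
\end{align*}
% The exponent can be rewritten through the lag (c_* - 2) t of the linear front 2t:
\[
  \frac{c_*^{2}}{4} - 1 = (c_* - 2) + \frac{(c_* - 2)^{2}}{4} > 0
  \quad\text{for } c_* > 2 .
\]
```

The spatial decay rate $c_*/2$ in this bound is slower than the decay rate $\lambda_0$ of the wave itself whenever $\lambda_0 > c_*/2$, which is the property of $\lambda_0$ used when this estimate is converted into a bound on $u(t, x + m(t)) - U_*(x)$.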

The pushmi-pullyu case: the proof of Theorem 3.3(ii)
We now turn to the pushmi-pullyu case $\chi = 1$. In that case, the front location is $m(t) = 2t - \frac{1}{2}\log t + x_0 + o(1)$. We recall the following estimate to the right of $m(t)$ when $\chi = 1$.
Lemma 4.2. For any $t$ sufficiently large and any $L$, the corresponding upper bound on $w$ to the right of $m(t) - L$ holds. We omit this proof as it is essentially the same as [2, Lemma 6.6]. In view of Lemma 4.2, we need only consider the behavior of $w(t,x)$ behind the position $m(t) - L$. We do this via the construction of a supersolution. Changing to the moving frame and applying Lemma 4.1 to (1.23), we find, for any $\varepsilon > 0$, a differential inequality for $\tilde w$ on $x < 0$. Above we have potentially increased $L$ so that, by Proposition 3.1, $u > 1 - \delta_1$, with $\delta_1$ as in Lemma 4.1, for $x < 0$. We next remove an integrating factor. Let $\lambda_{1,\varepsilon}$ be the positive root of (4.5) (cf. (2.10)), and let $z(t,x) = e^{-\lambda_{1,\varepsilon}x}\tilde w(t,x)$; we obtain the differential inequality (4.6). Before constructing a supersolution for (4.6), we note the following boundary conditions. First, due to Lemma 4.2, we have the bound (4.7) at $x = 0$. Second, due to Lemma 3.2 and parabolic regularity theory, we have the bound (4.8) for any $x < 0$. As a result, if we can produce a supersolution $\bar z(t,x)$ for (4.6), defined for $t \ge T$ and $x \in [-\delta t, 0]$, that satisfies the boundary conditions and the initial condition (4.9) at $t = T$, then we would conclude, via the comparison principle, that $z(t,x) \le \bar z(t,x)$ for $t \ge T$ and $x \in [-\delta t, 0]$. Let us note that $\lambda_{1,\varepsilon} < \lambda_1$ due to (4.5). We define the function $\bar z(t,x)$ by $\bar z(t,x) = A/t$ for $x < 0$ and $t > T$.
It is clearly possible to choose $A$, depending on $L$, $\delta$, and $T > 0$, so that the conditions in (4.8)-(4.9) are satisfied. It remains to check that $\bar z$ is a supersolution of (4.6). A direct computation verifies the required inequality for any $x \in (-\delta t, 0)$, as long as we increase $T$ if necessary. Hence, $\bar z$ is a supersolution for (4.6). We deduce that $\tilde w(t,x) \le \frac{A}{t}e^{\lambda_{1,\varepsilon}x}$, for $t > T$ and $-\delta t \le x \le 0$.

The pulled case: the proof of Theorem 3.3(i)
When $0 \le \chi < 1$, the front is located at the position Exactly the same argument as in the proof of Theorem 3.3(ii) can be applied to control the behavior of $w(t,x)$ for $x < m(t)$. Thus, we only need to control $w(t, x + m(t))$ for $x > 0$. This is done by the following.
Lemma 4.3. Under the assumptions of Theorem 3.3(i), we have Ct for all $x > 0$.
Before starting the proof, let us make the following comment. As discussed in the introduction, the convergence rate of $w(t,x)$ is controlled by the lag $D(t)$ of the phantom front $m_w(t)$ behind the true front $m(t)$, as in (1.28)-(1.29). When $0 \le \chi < 1$, the phantom front $m_w(t)$ is given by (4.4) and $m(t)$ by (4.10). On the other hand, a naive linearization such as (2.17) would produce the incorrect estimate $m_w(t) \sim 2t - (3/2)\log t$, which would lead to $D(t) \sim O(1)$, and a bound in the spirit of (2.19) on the convergence rate would be useless. Thus, the lag comes solely from the non-zero term $R(u)$ in (4.3). We have to use this estimate in an essential way to obtain any convergence rate in (1.18) in the pulled case, let alone a sharp one.
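To make the comparison concrete, here is the bookkeeping for the two candidate phantom front locations. It is a sketch only: additive $O(1)$ terms are suppressed, the form $m(t) = 2t - \tfrac{3}{2}\log t$ of the pulled front is assumed, the coefficient $5/2$ is the one quoted after Lemma 4.1, and the heuristic that the convergence rate is of order $e^{-D(t)}$ is our reading of (1.28)-(1.29), not a statement proved here.

```latex
% Assumed forms, with O(1) corrections dropped.
\begin{align*}
  m(t) &= 2t - \tfrac{3}{2}\log t, \\
  \text{naive linearization:}\quad m_w(t) &= 2t - \tfrac{3}{2}\log t
     &&\Longrightarrow\quad D(t) = m(t) - m_w(t) = O(1), \\
  \text{with the term } R(u):\quad m_w(t) &= 2t - \tfrac{5}{2}\log t
     &&\Longrightarrow\quad D(t) = \log t, \qquad e^{-D(t)} = \frac{1}{t}.
\end{align*}
```

In the first scenario $e^{-D(t)}$ does not decay at all, which is why the naive linearization gives no rate.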
Proof. First, for $L$ and $T > 0$ to be determined, we let and define $\tilde u$ similarly. Then, recalling Lemma 4.1, since $\eta''(u) \le 0$, we find We remove an exponential, $z(t,x) = e^{x} w(t,x)$, to obtain We now define a supersolution to (4.12) for $t \ge 1$ and $x \in \mathbb{R}$ as follows. For $B \ge 1$ and $T \ge 1$ to be chosen, let where we have defined Let us set The proof of Lemma 4.3 will be finished if we show that $w(t,x) \le A\,\overline{w}(t,x)$, with some $A > 0$.
Before we proceed, let us explain where (4.15) comes from. First, from (2.23), we expect $w$ to "look like" the solution of The traveling wave solution of this equation has the asymptotics $x^2 e^{-x}$ as $x \to +\infty$ [10], which motivates a multiplicative factor $x^2$ in (4.15), as we have already removed an exponential factor in (4.13). On the other hand, "far to the right," we should have a Gaussian behavior, which motivates the $\exp\{-x^2/4t\}$ type term in (4.15). In addition, as we have mentioned above, we expect the phantom front location $m_w(t)$ to be near the front location for (4.16), which is known to be at the position given by (4.4). Thus, the lag between the true and the phantom fronts is $D(t) \sim \log t$.
Because of that, we expect $w \sim O(1/t)$. This explains the multiplicative factor $\theta(t)$ in (4.15). The other terms in (4.15) are simply technical; in particular, the $B$ and $T$ factors allow us to verify the supersolution condition and to "fit" $\overline{w}$ above $w$ initially. By the comparison principle applied to the linear differential inequality (4.14) for $z(t,x)$, we will have shown that $w(t,x) \le A\,\overline{w}(t,x)$, for $t \ge 1$ and $x \in \mathbb{R}$, with some $A > 0$, if we show the following: (i) the initial comparison holds: (ii) the function $\overline{w}(t,x)$ has the form ) are important because they allow us to make the matching between $\theta(t)$ and $e^{-x}\zeta(t,x)$ somewhere in the interval $(1, 10)$ as the minimum of two super-solutions. This is crucial because, as $\zeta(t,x)$ vanishes at $x = -B$, it cannot be a super-solution for $x < 0$, and, as we will see, $\theta(t)$ is not a super-solution for $x > 10$. This is depicted in Figure 4.1.
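The matching step rests on the standard fact that the minimum of two supersolutions is again a supersolution, and that each piece is needed only where it is actually the smaller one. The following is a generic sketch of that principle, not the precise conditions (ii)-(iii) of the proof; the operator $\mathcal{L}$, the functions $z_1, z_2$, and the points $a < b$ are hypothetical placeholders.

```latex
% Generic gluing principle; \mathcal{L}, z_1, z_2, a, b are illustrative names.
\begin{align*}
  &\mathcal{L} z := z_t - z_{xx} - \beta(t,x)\, z_x - \gamma(t,x)\, z, \\
  &\mathcal{L} z_1 \ge 0 \ \text{ for } x < b, \qquad
   \mathcal{L} z_2 \ge 0 \ \text{ for } x > a, \qquad a < b, \\
  &z_1 \le z_2 \ \text{ for } x \le a, \qquad
   z_2 \le z_1 \ \text{ for } x \ge b, \\
  &\Longrightarrow\quad \overline{z} := \min\{z_1, z_2\}
   \ \text{ satisfies the comparison principle for } \mathcal{L} \text{ on all of } \mathbb{R}.
\end{align*}
```

Under these ordering conditions, $\overline{z} = z_1$ for $x \le a$ and $\overline{z} = z_2$ for $x \ge b$, so neither function is used in the region where it fails to be a supersolution. This matches the roles described above, with the matching taking place in the interval $(1, 10)$.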
We now check conditions (i)-(v). The initial comparison (4.17) is easy to check using well-known bounds on parabolic equations. In particular, $w(t,x)$ is bounded, up to a large multiplicative constant, by a Gaussian in $x$, for each fixed $t > 0$. Hence, after increasing $T$, independent of all parameters, and increasing $A$, depending on $L$ and $B$, the bound (4.17) must hold. Recall that $L$ appears in the change of variables (4.11).
Next, we notice that (ii) is clear by observation if B is sufficiently large.Similarly, after increasing T (depending only on B), (iii) is also clear by observation.
To see that (iv) is satisfied requires us to increase L (independent of all parameters) and apply Proposition 3.1 with any δ 1 sufficiently small to find that ũ(t, x) ≥ 1 − δ 1 for all t ≥ 1, x ≤ 10.
Then, from Lemma 4.1, we have Thus, up to increasing $T$, depending only on $\delta_1 > 0$, we have Therefore, (iv) holds.
We now check (v), which is a computationally tedious condition to verify, even though the computations are completely elementary. First, we compute: Noticing that $\dot\theta/\theta = -1/(t+T)$ and $\dot\theta/\sqrt{\theta} = -\sqrt{\theta}/(t+T)$, cancelling the obvious terms, and then grouping terms by the growth in $x$ yields Since $\theta \le 1$, we have, up to increasing $T$ (independent of all parameters), Using Young's inequality and then increasing $T$ (independent of all parameters), we arrive at
At this point, we can see why the right-hand side of (4.20) should be positive. Recall that, according to Lemma 4.1 (equation (4.2)), the term $R(\tilde u) \ge r_0 > 0$ when $\tilde u$ is not too small. Hence, it should dominate the next-to-last term on the right side of (4.20) in that region if $B$ is large. On the other hand, for $\tilde u$ small, the term $R(\tilde u)$ looks like $2/\log^2(\tilde u)$, according to (4.3). Moreover, as $\tilde u(t,x) \approx U_*(x)$ and $U_*(x)$ has the asymptotics (2.5), we have $\log^2(\tilde u) \approx x^2$. Thus, once again, $R(\tilde u)$ dominates the next-to-last term on the right side of (4.20).
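The two identities for $\theta$ quoted in the computation above are consistent with a choice of the form $\theta(t) = \theta_0/(t+T)$. The constant $\theta_0$ is hypothetical, since the definition of $\theta$ is not reproduced in this passage. Under that assumption, the check is immediate:

```latex
% Assumed form of \theta; \theta_0 > 0 is an illustrative constant.
\begin{align*}
  \theta(t) &= \frac{\theta_0}{t+T}, \qquad
  \dot\theta(t) = -\frac{\theta_0}{(t+T)^2} = -\frac{\theta(t)}{t+T}, \\
  \frac{\dot\theta}{\theta} &= -\frac{1}{t+T}, \qquad
  \frac{\dot\theta}{\sqrt{\theta}} = -\frac{\sqrt{\theta}}{t+T}.
\end{align*}
```

Conversely, integrating $\dot\theta/\theta = -1/(t+T)$ shows that any positive solution has this form, and the bound $\theta \le 1$ for $t \ge 0$ holds as soon as $\theta_0 \le T$.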
We make the discussion above more precise. Let us fix $\delta_1 > 0$ as in Lemma 4.1. We claim that, up to increasing $L$ (depending on $\delta_1$), we have for all $t \ge 1$, with a constant $C_L$ that depends on $L$. The first alternative above is due to Proposition 3.1. The second alternative follows from [22, Proposition 3.1] and its proof, as well as an application of the comparison principle. We first consider the "large" $\tilde u$ regime (and, thus, $x$ "not too far on the right"). If $\tilde u \ge \delta_0$, then $R(\tilde u) \ge r_0$ due to (4.2) and we find up to increasing $B$ further if necessary so that $2/B^2 < r_0$. In particular, we then have, from (4.20), as desired.
Next we consider the "small" $\tilde u$ regime (and, thus, the "large" $x$ regime). Note that, by (4.21), if $\tilde u \le \delta_0$, then In particular, this case is restricted to $x$ that is very large, after possibly decreasing $\delta_0$. We begin by estimating $R(\tilde u)$ using (4.3). For the quadratic term, we apply (4.21) to find Then, using that $(1+z)^{-2} \ge 1 - 2z$ for all $z \ge -1$, we obtain A similar argument, using the inequality yields a bound for the second term in $R(\tilde u)$: Using these in (4.20), we find After decreasing $\delta_0$ (which, by (4.22), increases the lower bound for $x$), we find There is only one negative term above. Applying Young's inequality with $p = 3/2$ and $q = 3$ yields Hence, we have which is positive after further decreasing $\delta_0$ (which, by (4.22), increases $x$). This concludes the proof of (v) and, thus, the proof of the lemma.
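For completeness, the elementary inequality for $(1+z)^{-2}$ and the form of Young's inequality with exponents $p = 3/2$, $q = 3$ invoked in the proof can be checked directly. The symbols $a$ and $b$ below are generic nonnegative numbers, not quantities from the proof.

```latex
% a, b are generic placeholders for the two factors being separated.
\begin{align*}
  (1+z)^{-2} - (1 - 2z)
    &= \frac{1 - (1-2z)(1+z)^2}{(1+z)^2}
     = \frac{z^2 (3 + 2z)}{(1+z)^2} \;\ge\; 0
     \qquad \text{for } z > -1, \\
  ab &\le \frac{a^{p}}{p} + \frac{b^{q}}{q}
      = \frac{2}{3}\, a^{3/2} + \frac{1}{3}\, b^{3},
     \qquad a, b \ge 0, \quad \frac{1}{p} + \frac{1}{q} = \frac{2}{3} + \frac{1}{3} = 1.
\end{align*}
```

The first line uses the expansion $(1-2z)(1+z)^2 = 1 - 3z^2 - 2z^3$.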
Proofs of the bounds on η and Q

Concavity of η: Proposition 2.2
We make two observations. First, arguing as in Lemma 3.4, it is easy to check that, for any $f$, its traveling wave profile function $\eta$ satisfies Second, Proposition 2.2 follows from the following more general result.
This yields the first alternative in (4.2) in the pulled case. We now investigate the second alternative in (4.2). Notice that (5.10) The second equality above follows from (1.19) and (2.9), while the third is due to (2.10). The inequality uses the particular form of $f$. This concludes the proof.

The general case
In this section, we discuss the convergence rate when f satisfies (1.2) and the normalization (1.3) but does not necessarily have the Hadeler-Rothe form (2.1).
Let us begin by recalling the proof of the convergence rates in the Hadeler-Rothe case (Theorem 2.1).The main lemma is the estimate on w (Theorem 3.3).The argument to deduce Theorem 2.1 from Theorem 3.3 relies only on the behavior of η near u = 0, which is established in full generality in Lemmas 3.4 and 3.5.
In the proof of Theorem 3.3, there are exactly two places where we use the assumption (2.1) on the form of $f$ rather than just the assumptions (1.2)-(1.3): the $O(1)$ asymptotics for the front location of $u(t,x)$ (Proposition 3.1) and the bounds on $Q$ (Lemma 4.1). The final conclusion of Lemma 4.1, that is, the expansion (4.3), holds for any pulled front as it merely reflects the linear factor in (1.16). Hence, the supersolutions for $w$ constructed in each case in the proof of Theorem 3.3 remain valid in general if we take the front asymptotics of $u$ and the behavior of $\eta$ as assumptions. The arguments above therefore yield the following:
Theorem 6.1. Suppose that $u$ solves (1.1) with $f$ satisfying (1.2)-(1.3) and initial data $u_0$ satisfying (2.11). Assume further that the traveling wave profile function $\eta$ and the associated quantity $Q$, respectively defined in (1.19) and (1.24), satisfy (4.1)-(4.2). (Here, we are only assuming the positivity of $r_1$, not necessarily the limiting behavior as $\delta_1 \to 0$ stated below (4.2).) Finally, suppose the front asymptotics of $u$ are given by if $U_*$ is pushed, in the sense of (3.1), with the definition of pushed, pulled, and pushmi-pullyu given in (1.16)-(1.17).
Then there is $\sigma : [0, \infty) \to \mathbb{R}$ such that, whenever $c_* = 2$, and, for any $\Lambda > 0$, whenever $c_* > 2$,

The assumptions in Theorem 6.1
In this section, we discuss the three main assumptions in Theorem 6.1: (6.1), (4.1), and (4.2). Briefly, the front asymptotics (6.1) of $u$ are nearly known in complete generality, so this is quite a weak assumption, and the refined bounds (4.2) on $Q$ may be side-stepped by alternate arguments at the expense of a slightly less precise convergence rate. Thus, the main assumption to be checked in practice is (4.1), that is, that $\eta'' \le 0$ and $Q \le 1$. We formulate a version of Theorem 6.1 that assumes only (4.1) in Theorem 6.2 below. We also discuss here the feasibility of (4.1).
The assumption (6.1). In fact, (6.1) is nearly established in full generality. The pulled and pushed asymptotics in (6.1) are completely proved: see [36, Lemma 5.2] for the pushed case and [19] for the pulled case. The statement in [19] additionally requires $f'(1) < 0$, although this can likely be removed via a comparison argument with solutions to (1.1) with appropriately chosen $\overline{f}$ and $\underline{f}$ in place of $f$. We do not pursue this further here. The pushmi-pullyu case is more delicate. If $f$ has the particular form (5.2), this is established in [2], but it is otherwise still open. The most general result is [19], in which the asymptotics It is not hard to track the effect of the $o(\log t)$ term in our computations to see that the informally derived convergence rate (1.29) holds: for every $\varepsilon > 0$, Another argument leading to (6.3) is sketched in greater detail below, see (6.5) and its discussion.
We expect that, in many applications, either the assumptions of Lemma 5.1 would hold, or the inequalities in (4.1) are checkable or can be sidestepped using ad hoc adjustments to our approach here.It is easy to derive several differential equations relating η and Q to f that are useful for understanding η and Q, although we do not discuss this further here.
The assumption (4.2).This assumption is not used in the argument of the pushed setting (see Section 4.1).Hence, we need only address the pulled and pushmi-pullyu cases.
In the pushmi-pullyu case, we outline an argument below that yields nearly the same conclusion albeit without either inequality in assumption (4.2).Hence, we focus our discussion mainly on the pulled case, where (4.2) plays a greater role.
In the pulled setting, the first inequality in (4.2) holds automatically due to the concavity of η (4.1) (see (5.9) and the arguments surrounding it).The second inequality in (4.2) is equivalent to Q(1) < 0, which holds if and only if f ′ (1) < 0 (see (5.10)).

Figure 4.1: A depiction of the conditions (ii) and (iii) and their relationship to w.