Local Well Posedness of Quasi-Linear Systems Generalizing KdV

In this article we prove local well-posedness of quasilinear dispersive systems of PDE generalizing KdV. These results adapt the ideas of Kenig-Ponce-Vega from quasilinear Schrödinger equations to third order dispersive problems. The main ingredient of the proof is a local smoothing estimate for a general linear problem, which allows us to proceed via the artificial viscosity method.

This way (1.1) is a generalized KdV with the dispersive relation a that depends on the unknown and its derivatives $(u, \partial_x u, \partial_x^2 u)$.
Moreover, well-posedness of (1.1) allows one to study fully non-linear dispersive systems of the form: if the non-linear function f satisfies appropriate assumptions similar to those for (1.1). Indeed, differentiating this equation and letting $u = (v, \partial_x v)$ allows one to reduce the problem to a quasi-linear one.
Well-posedness of the semi-linear analogues of (1.1), where the top order coefficient a ≡ 1, particularly when b ≡ 0 and c is a polynomial, is quite well understood, with Local Smoothing and Strichartz estimates playing a significant role, cf. [13], [15] and references therein. While quasi-linear dispersive equations are of interest in physical applications, in particular to water waves with variable dispersion, they are far less understood.
Well-posedness has been established for quasilinear dispersive equations with special algebraic structure for which conservation laws have been found, for example the following shallow water wave equation: $u_t - u_{txx} + 3uu_x = 2u_x u_{xx} + u u_{xxx}$; see [3] and references therein. However, finding such conservation laws is not possible in general for (1.1).
A major advance for the well-posedness of a scalar (1.1) was the work of Craig-Kappeler-Strauss [5]. However, in addition to being restricted to a scalar equation, they had to make a technical assumption on the favorable sign of the coefficient b, which was not natural in light of the semi-linear results in [15]. The pioneering work of Kenig-Ponce-Vega [14] for the quasilinear Schrödinger equation suggested a method to prove the well-posedness of (1.1) under more general assumptions on the coefficients.
The method of Kenig et al. was a modification of the energy method, which was a successful approach to the well-posedness of quasi-linear wave equations in high regularity Sobolev spaces in the 70's, cf. [10]. Namely, the energy method relies on estimates of the form $\|u\|_{H^s} = O(\|u_0\|_{H^s})$, which are proved via integration by parts, symmetry of the top order terms, Sobolev embedding and the Gronwall inequality. However, the energy method cannot in general work for (1.1) without modification, due to the infinite speed of propagation. Overcoming this difficulty of controlling the size of a solution by the data occupies most of this paper. The heart of the matter can already be seen when trying to prove $L^2$ well-posedness for a linear case of (1.1), regularized by a parabolic term for $0 \le \varepsilon \le 1$.
The standard energy method is available to prove an $L^2$ estimate for $t \ge 0$ only when $b \ge 0$. In general, however, modifications of the energy method are needed. Below we review several works that motivated the approach used in this paper.
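To make the role of the sign of b concrete, here is a minimal worked computation for a scalar model. The model equation and its sign convention are assumptions made for illustration only (the display (1.2) is not reproduced in this text); the convention is chosen so that $b \ge 0$ is the favorable sign.

```latex
% Illustrative scalar model (assumed sign convention, not the paper's (1.2)):
%   \partial_t u + \partial_x^3 u - b(x)\,\partial_x^2 u = 0 .
% Multiply by u and integrate in x. The dispersive term drops out:
\[
\int u\,\partial_x^3 u\,dx = -\int \partial_x u\,\partial_x^2 u\,dx
  = -\tfrac12\int \partial_x\big((\partial_x u)^2\big)\,dx = 0 .
\]
% The second order term is integrated by parts twice:
\[
\int b\,u\,\partial_x^2 u\,dx
  = -\int b\,(\partial_x u)^2\,dx + \tfrac12\int (\partial_x^2 b)\,u^2\,dx .
\]
% Hence, if b >= 0, the first term on the right can be discarded:
\[
\tfrac12\,\frac{d}{dt}\|u\|_{L^2}^2
  = -\int b\,(\partial_x u)^2\,dx + \tfrac12\int (\partial_x^2 b)\,u^2\,dx
  \le \tfrac12\,\|\partial_x^2 b\|_{L^\infty}\,\|u\|_{L^2}^2 .
\]
```

When b changes sign, the term $-\int b\,(\partial_x u)^2\,dx$ is no longer controlled by $\|u\|_{L^2}^2$, which is the obstruction the modified energy arguments have to remove.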
In the case of a = Id, c symmetric and an integrable b, Kenig-Staffilani in [15], motivated by [9], were able to cancel the $b\partial_x^2$ term with a change of variables $v(x,t) = \Phi(x,t)u(x,t)$ for an appropriate $\Phi$. After this change of variables a standard energy argument (as explained above) works for v and hence gives the $L^2$ estimate for $I = [0,T]$ small enough where $\|f(x,t)\|_{L^1_I L^2_x} = \int_I \|f(t)\|_{L^2_x}\,dt$, and likewise for all other space-time norms from now on. A similar argument was at the heart of the energy estimate that Lim-Ponce used to prove well-posedness of a quasilinear Schrödinger equation in one dimension in [17]. However, this argument does not seem to work for a general system (1.2) with merely a symmetric top order $a\partial_x^3$, or for the Schrödinger equation in more than one space dimension. As we show in Section 2, the symmetry of the coefficient a is sharp for the estimate (1.3), in the sense that without this assumption the estimate can be false. This suggested proceeding via a Local Smoothing estimate argument, with the side benefit of capturing the regularization effect for equations (1.2) and (1.1).
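The cancellation by a change of variables can be sketched in the simplest scalar setting. The model below (a = 1, time independent integrable b, and the sign in front of b) is an assumption for illustration and is not claimed to be the exact setup of [15]; $\psi = 1/\Phi$ is a name introduced here.

```latex
% Illustrative scalar model (assumed): \partial_t u + \partial_x^3 u - b(x)\,\partial_x^2 u = 0,
% with b integrable and independent of t. Write u = \psi v, so that \Phi = 1/\psi.
\[
\partial_x^3(\psi v) = \psi\,\partial_x^3 v + 3\psi'\,\partial_x^2 v
   + 3\psi''\,\partial_x v + \psi''' v ,
\qquad
\partial_x^2(\psi v) = \psi\,\partial_x^2 v + 2\psi'\,\partial_x v + \psi'' v .
\]
% The coefficient of \partial_x^2 v in the equation for v is 3\psi' - b\psi.
% It vanishes for the choice
\[
\psi(x) = \exp\Big(\tfrac13\int_{-\infty}^{x} b(y)\,dy\Big),
\qquad
e^{-\|b\|_{L^1}/3} \le \psi \le e^{\|b\|_{L^1}/3} .
\]
% After dividing by \psi, the equation for v has the form
%   \partial_t v + \partial_x^3 v + (\text{first and zeroth order terms in } v) = 0 .
```

Integrability of b is what keeps $\psi$ bounded above and below, so the $L^2$ norms of u and v are comparable.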
Kenig-Ponce-Vega in [14] proved well-posedness of the quasi-linear Schrödinger equation in higher space dimensions by proving a Local Smoothing estimate, generalizing the case of a time independent variable coefficient linear Schrödinger equation in [7] and [6]. Local Smoothing for (1.2) means that for δ > 1 where we use the notation $\langle x\rangle = (1+|x|^2)^{1/2}$ and interpret $\widehat{\langle\partial_x\rangle u}(\xi) = \langle\xi\rangle\hat u(\xi)$ as a Fourier multiplier and hence (1.4) means that the solution of (1.2) is 1 derivative more regular than $u_0$ and 2 derivatives more regular than f, at the cost of weights. Note that this effect is local, as (1.2) is time reversible and by (1.3) cannot gain smoothness globally.
Local Smoothing was first proven for KdV in [11], [16] and for the linear Schrödinger equation $\partial_t u + iLu = f$ with $L = \Delta$ in [4], [20], [21], [12], where the gain is 1/2 derivative relative to $u_0$ and 1 derivative relative to f respectively. This was generalized to $L = a_{ij}(x)\partial_i\partial_j + b(x)\nabla$ in [6] and [7]. In [8] Doi showed that, roughly speaking, under appropriate asymptotic flatness of the coefficients of L above, local smoothing is equivalent to the non-trapping of the bicharacteristic flow generated by the principal symbol of the operator L. Note that in one spatial dimension, which is the relevant setting for this paper, the non-trapping condition is automatic by the coefficient assumption (NL1) and we will omit it.
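For orientation, the classical weighted identity behind local smoothing can be written out for the constant coefficient Airy equation. This is a standard computation given as a sketch; the weight w is a choice made here and is not taken from the paper.

```latex
% Airy equation \partial_t u + \partial_x^3 u = 0, real valued u.
% Weight (a choice made for this sketch), bounded and increasing for \delta > 1/2:
\[
w(x) = \int_{-\infty}^{x} \langle y\rangle^{-2\delta}\,dy ,
\qquad w'(x) = \langle x\rangle^{-2\delta} .
\]
% Multiplying the equation by w u and integrating by parts gives
\[
\frac{d}{dt}\int w\,u^2\,dx + 3\int w'\,(\partial_x u)^2\,dx = \int w'''\,u^2\,dx .
\]
% Integrating in t over [0,T] and using conservation of the L^2 norm:
\[
3\int_0^T\!\!\int \langle x\rangle^{-2\delta}(\partial_x u)^2\,dx\,dt
 \le \big(\|w\|_{L^\infty} + T\,\|w'''\|_{L^\infty}\big)\,\|u_0\|_{L^2}^2 .
\]
```

The left hand side controls one derivative of u in a weighted space-time norm by the $L^2$ norm of the data, which is the gain of one derivative relative to $u_0$ described above.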
Our proof of well-posedness also uses Local Smoothing to prove the energy estimate; however, as (1.1) is of higher order than the Schrödinger equation, the argument of [7] has to be modified. The use of this method to prove well-posedness of higher order dispersive systems will be addressed in subsequent work.
To motivate the function space we use to prove well-posedness, we note that in order to prove the main linear estimate (1.3) a decay of the coefficient b is necessary. This phenomenon is similar to the Mizohata condition, which shows the necessity of $\sup_{x,\,t,\,|\omega|=1} \int_0^t \Im b(x+s\omega)\cdot\omega\,ds < \infty$ for the $L^2$ well-posedness of the Schrödinger equation $\partial_t u + i\Delta u + b(x)\nabla u = 0$, cf. [19], and we prove a corresponding result for (1.2) explicitly in Section 2. On the non-linear level of (1.1) this suggests an $L^1$ condition on u, and weighted Sobolev spaces provide a natural way to ensure an $L^1$ condition in an $L^2$-based Sobolev space. This motivates working with the weighted Sobolev spaces $H^{s,2}$ for $s \in \mathbb{Z}^+$, which we define as follows: which is a typical choice to ensure extra decay of the lower order term $b\partial_x^2$ as explained above.
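The claim that a weight turns an $L^2$ bound into an $L^1$ bound is a one line Cauchy-Schwarz computation, sketched here in one space dimension with the weight $\langle x\rangle$ (corresponding to the simplification δ' = 1 used in this paper):

```latex
\[
\|u\|_{L^1(\mathbb{R})}
 = \int \langle x\rangle^{-1}\,\langle x\rangle\,|u(x)|\,dx
 \le \Big(\int \frac{dx}{1+x^2}\Big)^{1/2}\,\|\langle x\rangle u\|_{L^2(\mathbb{R})}
 = \sqrt{\pi}\;\|\langle x\rangle u\|_{L^2(\mathbb{R})} .
\]
```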

Coefficient assumptions.
We state the precise assumptions on the coefficients a-d of the equation (1.1): Moreover, it is uniformly positive definite in $D_M$ at time 0, that is for every M > 0 there exists a constant $\lambda_M > 0$

(NL2) Regularity. Let $J \in \mathbb{Z}^+$ be a given positive integer. We assume that all the coefficients $a(x,t,z)$-$d(x,t,z) \in C^1_t B^J_x C^J_z$, with the space $B^J$ of $C^J$ functions with all J derivatives bounded. Specifically, there exists a function $(J, M) \mapsto C_{J,M}$, increasing in J and M > 0 separately, such that for each J and M the coefficients have norms bounded by $C_{J,M}$, e.g. for a that means $\sup_{0\le\alpha\le 1;\ 0\le\beta+|\gamma|\le J}$

(NL3) Asymptotic flatness and decay of linear parts. There exists $\delta' > \tfrac12$ and a constant $C_0$, such that and $c^\dagger$ are the antisymmetric parts of b and c respectively, defined as $c^\dagger_{ij} = \tfrac12(c_{ij} - c_{ji})$

While any $\delta' > \tfrac12$ works in the (NL) assumptions, provided the definition of $H^{s,2}$ is modified with the weight $\langle x\rangle$ replaced by $\langle x\rangle^{\delta'}$, for simplicity we set $\delta' = 1$.
Note that, unlike the constants $\lambda_M$ in (NL1), which depend only on the data $u_0$, $C_{J,M}$ in (NL2) is a family of constants that depend on the smoothness of the coefficients J and the size of the solution M. This is a natural assumption, as we consider nonlinear equations and the coefficients may grow with the size of the solution. The size of the solution is a priori unknown and is one of the quantities to be estimated. Thus when using (NL2) we will be explicit in the choice of J and M to avoid a possible circularity.
The proof is based on the artificial viscosity regularization of the initial value problem (1.1) for a parameter $0 < \varepsilon \le 1$: where $N_u$ is the operator from (1.1). We then construct solutions $u^\varepsilon$ for the regularized problem and show that the solution is $u = \lim_{\varepsilon\to 0} u^\varepsilon$ in the desired topology.
Main Theorem. Specifically, by well-posedness we mean the following: (1) Existence. Let R > 0 be given. Then there exists a T > 0, such that if the

is not sharp, but holds under very general assumptions on the coefficients and with no smallness assumption on the data. While preparing this work for publication, I learned of a parametrix-based approach to the well-posedness of the quasilinear Schrödinger equation for small data in [18], in Sobolev spaces of lower regularity than in [14]. Adapting this approach to (1.1) may allow one to lower the regularity for small data.
The proof of Theorem 1.1 also shows that (1.1) has a Local Smoothing effect, that is, in addition to lying in $X_I$, the solution satisfies $u \in L^2_I H^{9,2}(\langle x\rangle^{-4}dx)$.
A simple transformation $x \to -x$ in the equation (1.1) shows that the dispersive property (NL1) can be taken with the opposite sign. That is, if in addition to (NL2)-(NL3), for every M > 0 there exists a constant $\lambda_M > 0$ such that (NL1) holds with the opposite sign, then Theorem 1.1 holds.
We note that the continuous dependence in Theorem 1.1 is the best we can hope for, as (1.1) is quasilinear.
Finally, the arguments in the proof of Theorem 1.1 can be used to extend the persistence of regularity to $H^{s,k}$ rather than $H^{s,2}$.
The rest of the paper is organized as follows. In Section 2 we prove the main linear estimate (1.3). In Section 3 we prove the well-posedness of (1.6) for a time independent of the regularization ε. Finally, in Section 4 we construct the solution of (1.1) and prove continuous dependence.
Notation. When estimating with multiplicative constants, we often write $A \lesssim_{x,y} B$ to mean $A \le C(x,y)B$, where the constant C(x,y) may change from line to line. As dependence on the integers (8, 2) and on the constant in the Sobolev embedding $H^1(\mathbb{R}) \hookrightarrow L^\infty(\mathbb{R})$ occurs frequently, we do not explicitly state dependence on them.
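For reference, the one dimensional Sobolev embedding mentioned in the notation has an elementary proof with an explicit constant. The sketch below assumes the normalization $\|u\|_{H^1}^2 = \|u\|_{L^2}^2 + \|\partial_x u\|_{L^2}^2$, which may differ from the paper's by a harmless constant.

```latex
% For u in the Schwartz class (the general case follows by density):
\[
|u(x)|^2 = 2\,\mathrm{Re}\int_{-\infty}^{x} \overline{u(y)}\,\partial_y u(y)\,dy
 \le 2\,\|u\|_{L^2}\,\|\partial_x u\|_{L^2}
 \le \|u\|_{L^2}^2 + \|\partial_x u\|_{L^2}^2 ,
\]
% so that
\[
\|u\|_{L^\infty(\mathbb{R})} \le \|u\|_{H^1(\mathbb{R})} .
\]
```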
x with the space $B^k$ of bounded $C^k$ functions have the following bounds:

(L3) Asymptotic flatness and decay.
By a solution of (1.2) we mean a classical solution and hence by the Sobolev

Regularity of a-d, particularly with weights in (L3), determines the Sobolev exponent of $H^{8,2}$ in Theorem 1.1, where for simplicity we set δ = δ' = 1. Proving analogues of Theorem 2.1 with coefficients rougher than above would lower the regularity in Theorem 1.1.
Note that for the applications of Theorem 2.1, the constant $C_1$ will depend on the solution of the non-linear problem, while $C_0$ will only depend on the data. As we will use the constant $A = A(\lambda, C_0)$ from Theorem 2.1 to control the size of the solution and in turn $C_1$, it is crucial for A not to depend on $C_1$.
It is not difficult to prove an $H^s$ version of (1.3) and (1.4), provided the coefficients are more regular than in (L2), by differentiating (1.2), using Theorem 2.1 and choosing T small to control lower order terms. However, for some estimates in the proof of Theorem 1.1 we will not have such control of the coefficients, and we omit the $H^s$ estimate.
The remark follows from Theorem 2.1 by a simple change of variables in the equation. Indeed, sending $(x,t) \to (-x,-t)$ preserves assumptions (L1)-(L3), while changing the sign of the time.
Finally, before proceeding with the proof of Theorem 2.1 in Subsection 2.3, we motivate the coefficient assumptions (L1)-(L3) by showing the necessity of symmetry of a and decay of b for Theorem 2.1.
2.1. Symmetry of the top order. Similar to hyperbolic systems, symmetry of (1.1) is necessary for the well-posedness. Namely, we consider the following constant coefficient linear system that violates the symmetry in (NL1), but satisfies Taking a Fourier transform in space this equation reduces to an ODE, for which the explicit solution is We then take the data $\hat u_0(\xi)^T = (0, \langle\xi\rangle^{-s-1})$ and a computation shows Therefore, (2.1) is ill-posed in $H^s$ for any s and hence in any $H^{s,2}$.
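Since the system (2.1) is not reproduced in this text, the following is a hypothetical example of the same mechanism: a 2 by 2 constant coefficient system with a Jordan block as top order coefficient. The matrix A is an assumption chosen for illustration and need not coincide with the one used in (2.1).

```latex
% Hypothetical non-symmetric system: \partial_t u + A\,\partial_x^3 u = 0, with
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \mathrm{Id} + N,
\qquad N^2 = 0 .
\]
% Fourier transform in x: \partial_t \hat u = i\xi^3 A\,\hat u, hence
\[
\hat u(\xi,t) = e^{i\xi^3 t}\,\big(\mathrm{Id} + i\xi^3 t\,N\big)\,\hat u_0(\xi) .
\]
% With the data \hat u_0(\xi)^T = (0, \langle\xi\rangle^{-s-1}):
\[
\hat u(\xi,t)^T = e^{i\xi^3 t}\,\big(i\xi^3 t\,\langle\xi\rangle^{-s-1},\ \langle\xi\rangle^{-s-1}\big),
\]
\[
\|u_0\|_{H^s}^2 = \int \langle\xi\rangle^{-2}\,d\xi = \pi,
\qquad
\|u(t)\|_{H^s}^2 \ge t^2\int \xi^6\,\langle\xi\rangle^{-2}\,d\xi = \infty
\quad (t \ne 0) .
\]
```

The polynomial growth in ξ produced by the nilpotent part N is what destroys the $H^s$ bound; a symmetric A is diagonalizable with real eigenvalues and produces only unimodular factors.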

2.2. Necessity of decay of the coefficient b. Here we show that a slightly weaker form of asymptotic flatness (L3) is necessary for (1.3) in the following special case. More precisely, we show that for a solution u of is necessary for (1.3).
We show necessity by a WKB method, similar to an argument of [19] for a Schrödinger equation. Let $u = e^{i\phi(x,t,\xi)} v(x,t,\xi)$, where $\phi = x\xi^2 + t\xi^6$. Then for an For which we get the explicit solution by the method of characteristics We further define Now assume that (2.3) does not hold. Then there exist $x_0$, $t_0$ and $r_0$ such that $\int_0^{t_0} b(x_0 + r_0\cdot s)\,r_0\,ds \ge 6\log(3A)$. Moreover, by rescaling $t_0$ we can assume that $r_0 = 1$. Hence for x near $x_0$ (1)), which is a contradiction for ξ large enough.
2.3. Proof of Theorem 2.1. To proceed we first change the dependent variable (or gauge it) and then proceed with the energy method for the gauged problem.
We use $\langle x\rangle^{-2\delta}$ to define a multiplicative change of variables as follows: Because the integral above is convergent and $|\partial_x^\alpha \langle x\rangle^\beta| \le C(\alpha,\beta)\langle x\rangle^{\beta-\alpha}$ we get $\phi \in B^\infty$, i.e. it is bounded and all of its derivatives are bounded. Moreover, the definition of $\phi$ implies that $e^{\phi}$ and $e^{-\phi}$ are also in $B^\infty$, and hence Likewise, by the product rule and duality, Inverting (2.7) we write (1.2) as: Then v satisfies the following "gauged" system where the commutator with $e^{\phi}$ is a third order differential operator with $B^\infty$ coefficients, with norms controlled by N and δ.
We now proceed with the energy estimates. Taking a dot product of (2.9) with v and integrating in x we get Note that $L^*$, the adjoint of the operator L, is where $a^T_{ij} = a_{ji}$ and l.o.t. are terms having fewer than 3 derivatives of v. Hence integration by parts of (2.10) gives x $a_{ji} + \partial_x^2 b_{ji} - \partial_x c_{ji} + d_{ij} + d_{ji}$ We now claim that Theorem 2.1 reduces to the following proposition.

Proposition 1. There exist $A = A(C_0, \lambda, \delta)$ and $T = T(C_1, C_0, \lambda, \delta)$, such that for $0 \le t \le T' \le T \le 1$, the solution of (2.9) satisfies:

Indeed, discarding the second term on the right hand side of (2.12) and applying the Gronwall inequality imply v(x, t) 2 Thus by the Cauchy-Schwarz inequality we get for I = [0, x ) which is (1.3) after we use the comparability of the norms of u and v. Integrating (2.12) and using in (2.13) gives (1.4) after the comparability of the norms.
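The form of the Gronwall inequality used in arguments of this type is recalled below. The names y, g and C are generic placeholders introduced for this sketch and are not the paper's quantities.

```latex
% Let y \ge 0 be absolutely continuous on [0,T], g \ge 0 integrable, C \ge 0, and
%   y'(t) \le C\,y(t) + g(t)  for a.e. t in [0,T].
% Then (e^{-Ct}y)' \le e^{-Ct} g, and integrating from 0 to t gives
\[
y(t) \le e^{Ct}\Big(y(0) + \int_0^t e^{-Cs}\,g(s)\,ds\Big)
      \le e^{Ct}\Big(y(0) + \int_0^t g(s)\,ds\Big),
\qquad 0 \le t \le T .
\]
```

With $y = \|v(t)\|_{L^2}^2$ and $T \le 1$, the factor $e^{Ct}$ is absorbed into the constant of the estimate.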
Proof of Proposition 1. We estimate (2.11) term by term in reverse order: Trivially, Hence interpolation of $H^{3/2}$ between $L^2$ and $H^2$, the Cauchy inequality and (1.5) give While for $I_4$, integrating by parts gives We use (2.16) for $I_3$ By Cauchy-Schwarz and (2.15) we estimate where $C^T_{ij} = C_{ji}$. Finally, by (2.14) Summing together the estimates of $I_1$-$I_6$ we complete the proof of Proposition 1 and hence of Theorem 2.1.

3. Wellposedness of the regularized problem. By the Duhamel principle, solving (1.6) in a sufficiently regular Sobolev space is equivalent to finding a fixed point of the operator where the parabolic operator $e^{-\varepsilon t\partial_x^4}$ is defined as a Fourier multiplier: Well-posedness of (1.6) for a short time dependent on the parameter ε is very similar to [14], where the same regularization is used for a quasi-linear Schrödinger equation. However, as (1.1) is of higher order than the Schrödinger equation, we provide the proof in Proposition 2.
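The semigroup and the Duhamel formula referred to here have the following standard form. The forcing term G is a generic placeholder introduced for this sketch; in the paper its role is played by the remaining terms of (1.6), which are not reproduced in this text.

```latex
% Fourier multiplier definition of the parabolic semigroup, t \ge 0:
\[
\widehat{e^{-\varepsilon t\partial_x^4} g}\,(\xi) = e^{-\varepsilon t\,\xi^4}\,\hat g(\xi) .
\]
% Duhamel formula for \partial_t u + \varepsilon\,\partial_x^4 u = G,\ u(0) = u_0
% (G is a placeholder for all remaining terms):
\[
u(t) = e^{-\varepsilon t\partial_x^4} u_0
     + \int_0^t e^{-\varepsilon (t-s)\partial_x^4}\,G(s)\,ds .
\]
```

When G depends on u, a solution is a fixed point of the map defined by the right hand side, which is the setting of the contraction argument.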

3.1. Preliminary estimates.
Lemma 3.1. For any positive integer s, $0 < T \le 1$ and $t \in I = [0,T]$ the following estimates hold By changing the Y norm in (1.8) by a multiplicative constant, for the rest of the paper we treat the constant C(8) from Lemma 3.1 as 1 for simplicity.
Proof. By Plancherel and boundedness of $e^{-\alpha}$ for $\alpha \ge 0$: To proceed with weights, note that multiplying the PDE for $v = e^{-\varepsilon t\partial_x^4}u_0$ by $\langle x\rangle^2$ and commuting derivatives with weights, we get $\langle x\rangle^{-1}$ is a differential operator of order 3 with $B^\infty$ coefficients. Hence by the Duhamel formula Therefore, by (3.3) and the Minkowski inequality By the boundedness of (pseudo)differential operators, $\|E_3\langle x\rangle v\|_{H^s} \lesssim_s \|\langle x\rangle v\|_{H^{s+3}}$. By pseudo-differential calculus (or commuting $\langle x\rangle$ with derivatives by hand) and Cauchy-Schwarz Using the elementary calculus fact that $\alpha^{3/2}e^{-\alpha}$ is bounded for $\alpha \ge 0$ and the explicit formula for the semigroup $e^{-\varepsilon t\partial_x^4}$, $t \ge 0$, we get Using this estimate instead of (3.3) with weights finishes the proof.
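The way the parabolic semigroup trades derivatives for negative powers of εt can be made explicit. The following is a sketch of the standard computation for a general number k of derivatives; it is not a reconstruction of the displays omitted above.

```latex
% By Plancherel, for k \ge 1 and t > 0:
\[
\|\partial_x^k e^{-\varepsilon t\partial_x^4} g\|_{L^2}
 \le \sup_{\xi}\,|\xi|^k e^{-\varepsilon t\xi^4}\;\|g\|_{L^2} .
\]
% Substituting \alpha = \varepsilon t\,\xi^4 and using
%   \sup_{\alpha \ge 0} \alpha^p e^{-\alpha} = (p/e)^p  (attained at \alpha = p):
\[
\sup_{\xi}\,|\xi|^k e^{-\varepsilon t\xi^4}
 = (\varepsilon t)^{-k/4}\,\sup_{\alpha\ge 0}\alpha^{k/4}e^{-\alpha}
 = \Big(\frac{k}{4e}\Big)^{k/4}(\varepsilon t)^{-k/4} .
\]
```

For k = 3 the singularity $(\varepsilon t)^{-3/4}$ is integrable at t = 0, which is what allows a third order term to be placed under the Duhamel integral for fixed ε > 0.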
We also need the following estimate for $N(u) \equiv N_u(u)$ from (1.7):

Lemma 3.2 (Moser estimate). Let $u, v \in C^0_I H^{s,2}$ for s = 8 or 9 be given with max{

Proof. The lemma follows by elementary calculus and the Sobolev embedding. The constants in the s = 9 case depend on M, because it is impossible to get terms like $\partial_x^{s+5}u \cdot \partial_x^{s+6}u$ by differentiating N(u) s + 3 times.

3.2. Short time well-posedness. We now set up the following notation: with R from (1.8) and A from Theorem 3.5.
We do the contraction argument in the following closed subset of $C^0_I H^{8,2}$ for $I = [0,T]$.

(1.6) has a unique solution $u^\varepsilon$ in $X^M_{I_\varepsilon}$. We note that (3.1) is locally well-posed in $C^0_t H^4$ using the same proof. However, for the arguments in Section 3.3 and beyond we need $u^\varepsilon$ to be in $X^M_{I_\varepsilon}$.
Proof. Let $u \in X^M_{I_\varepsilon}$. Then by the Minkowski inequality and (3.2), followed by (1.8), (3.6) and (3.7) we get: Likewise, using Lemmas 3.1 and 3.2 for the difference gives the contraction property.
Corollary 1. As we proved Proposition 2 by the contraction mapping argument, we automatically get continuous (and Lipschitz) dependence on data. That is, the flow map $(u_0, f) \mapsto u$ is continuous from Y to $X^M_{I_\varepsilon}$. Moreover, (3.1) has a persistence of regularity property.
For the persistence of regularity, let the data $(u_0, f)$ in addition to satisfying (1.8) satisfy for some s > 8 To see this, redo the boundedness argument in the proof of Proposition 2 for $H^{9,2}$, partitioning [0, T] into identical intervals of length $T_9 = T(\varepsilon, M, C_{12,M})$ for using (3.6a). For higher norms proceed inductively, redoing (3.6a) to estimate $u^\varepsilon$

(1.7). We will show in Lemma 3.3 that the coefficient bounds (L1)-(L3) for this linear equation do not depend on the particular v, but only on the (NL1)-(NL3) bounds, data for t = 0 and the bounds M, C(M) for t = 0.
In particular an application of Theorem 2.1 for $u = v = u^\varepsilon$ would imply for We then pursue the same strategy for $\partial_x^s u^\varepsilon$ and $\langle x\rangle^2\partial_x^{s-6}u^\varepsilon$ for $s = 6, \ldots, 14$. Namely, we differentiate (1.1) s times and account for quasi-linear interactions.
with a the same as in (1.1), Similarly, $c_s$ explicitly depend on the number s, $(\partial_x^\alpha u^\varepsilon)_{\alpha\le 4}$ and up to 2 derivatives of a, b; and $d_s$ depend on s, $(\partial_x^\alpha u^\varepsilon)_{\alpha\le 5}$ and up to 3 derivatives of a-c respectively. The reason for these formulas is that differentiation is "linearizing", so we can only get $a\partial_x^{s+3}u^\varepsilon$ by applying all derivatives to $u^\varepsilon$, and there are only a few nonlinear ways to get terms of order higher than s.
Once we establish in Lemma 3.3 the (L1)-(L3) coefficient estimates for (3.10) that are uniform when evaluated at $v \in X^M_{T'}$, then for the solution $u^\varepsilon$ satisfying $u^\varepsilon \in C^0_I H^{8+5,2} \cap C^1_I H^{4+5,2}$ (3.11) we would apply Theorem 2.1 to (3.10) to get provided $T' \le T$. Note that (3.11) is needed to ensure that (3.10) is valid classically for $6 \le s \le 14$.
Finally, we multiply (3.10) by $\langle x\rangle^2$ to rewrite it as where $a, b_s, c_s, d_s, F_s$ are identical to (3.10) by construction and $\langle x\rangle^{-1}$ is an order 3 differential operator with $L^\infty$ coefficients. Hence as long as the coefficient estimates are established and (3.11) is valid, Theorem 2.1 implies Therefore, establishing a uniform estimate for $\|u^\varepsilon\|_{X^M_I}$ reduces to the following:
• Find uniform bounds on the coefficients of (1.1) and (3.10) to use Theorem 2.1 (note the remark after Theorem 2.1).
• Control the terms not involving data ($F_s$ and

Proof. To simplify notation, let $v = (u, \partial_x u, \partial_x^2 u)$.

Using the Fundamental Theorem of Calculus we get using (NL2), (NL3) and Sobolev embedding Similarly,

The first two estimates follow immediately from the construction of $F_s$ and Sobolev embedding.
For the second, by construction $E_3$ is a differential operator with coefficients bounded by $C(s)C_{s,M}(1 + M^4)$ and hence where the last line follows by interpolation in (3.5).

Proof. Note that Proposition 2 holds on $[T', T' + T_\varepsilon]$ with data $(u^\varepsilon(T'), f)$ at $t = T'$, as long as $\|(u^\varepsilon(T'), f)\|_Y < \frac{M}{2}$, which will hold for any $T' \in I$ as long as (3.15) is valid. When this is the case, $u^\varepsilon$ can be extended to solve (1.6) on $[0, T' + T_\varepsilon]$ in $X^M_{[0,T'+T_\varepsilon]}$, and arguing inductively we extend $u^\varepsilon$ to $I = [0,T]$. Thus it suffices to prove (3.15).
We first complete the proof under the assumption that (3.11) is valid.
By Lemma 3.3, estimates (3.9), (3.12) and (3.14) are valid. Adding these inequalities and using the equivalence of norms (1.5) we get Using Lemma 3.4 we get which, after choosing T small enough, proves (3.15).
Applying Theorem 3.5 to $u^\varepsilon_m$, for which (3.11) applies, allows us to recover (3.15) for u in the limit as $m \to \infty$, as the constants A and T do not depend on m.
4. Removing regularization. We first construct a solution u of (1.1) in a topology weaker than $C^0_I H^{8,2} \cap C^1_I H^{4,2}$.

Proposition 3. There exist a T > 0 with $I = [0,T]$ and a sequence of solutions $u^{\varepsilon_n}$ of (1.6) in $X^M_I$ for $\varepsilon_n \to 0$, such that $u^{\varepsilon_n} \to u$ in $C^0_I H^{7,2} \cap C^1_I H^{4,2}$.

Proof. Take a difference of solutions $u^\varepsilon$ and $u^{\varepsilon'} \in X^M_I$ of (1.6) for $0 < \varepsilon' < \varepsilon \le 1$: and similarly for c, d.
By an argument identical to Lemma 3.3, for $u, v \in X^M_I$, $L_{u,v}$ satisfies (L1)-(L3) with bounds $C_0$, $C_1$ dependent on the same parameters as in Lemma 3.3. Hence for an appropriate T Theorem 2.1 applied to (4.1) gives We then use that $\|u^\varepsilon - u^{\varepsilon'}\|_{C^0_I H^{8,2}} \le 2M$ and for 0 < 1 ≤ 1 interpolate $H^{13}$ between $L^2$ and $H^{8+6} \subset H^{8,2}$ to get Hence by completeness of $C^0_I H^{13}_x$ we can take a sequence $\varepsilon_n \to 0$ to construct $u = \lim_{\varepsilon_n\to 0} u^{\varepsilon_n}$.
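The interpolation step has the following standard form, stated here for generic exponents $0 \le s \le m$ as a sketch; w is a placeholder for the difference of two regularized solutions.

```latex
% Hoelder's inequality on the Fourier side with exponents m/(m-s) and m/s:
\[
\|w\|_{H^s}^2 = \int \big(|\hat w|^2\big)^{1-\frac{s}{m}}
                    \big(\langle\xi\rangle^{2m}|\hat w|^2\big)^{\frac{s}{m}}\,d\xi
 \le \|w\|_{L^2}^{2(1-\frac{s}{m})}\;\|w\|_{H^m}^{\frac{2s}{m}},
\qquad 0 \le s \le m .
\]
% Consequence: if \|w\|_{H^m} \le 2M uniformly and \|w\|_{L^2} \to 0, then
\[
\|w\|_{H^s} \le (2M)^{\frac{s}{m}}\;\|w\|_{L^2}^{1-\frac{s}{m}} \longrightarrow 0
\qquad\text{for every } s < m .
\]
```

This is why the limit is first obtained with a loss of regularity relative to the uniform bound: convergence holds in every norm strictly below the top one.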
Note that $L^\infty_I H^{14} = (L^1_I H^{-14})^*$, and hence the closed unit ball in it is weak-* compact by the Banach-Alaoglu theorem. As a consequence, the sequence $u^{\varepsilon_n} \in X^M_I \subset L^\infty_I H^{8,2}_x$ has a weak limit $u^{\varepsilon_n} \rightharpoonup v \in L^\infty_I H^{14}_x$ up to a subsequence. Hence, for a.e. x, there is a further subsequence such that $u^{\varepsilon_n}(x) \to v(x)$. But by uniqueness of limits v = u. A similar argument for $L^\infty$ Using the Fatou lemma gives $\|u\|_{L^\infty}$

For strong convergence in $C^0_I H^{8-1,2}$ we multiply (4.1) by $\langle x\rangle^2$ and rewrite it as is an order 3 differential operator with coefficients bounded by $4 + C_{1,M}(1 + M)$. Applying Theorem 2.1 all terms are treated as in the proof of (4.3), except we use (3.5) to interpolate x → 0 as $\varepsilon, \varepsilon' \to 0$. Interpolating $H^{8-1}$ between $L^2$ and $H^8$ implies Finally, using (4.1) and Lemma 3.
Moreover, for uniqueness of the limit we use (4.3) for $u^\varepsilon = u$ and $u^{\varepsilon'} = u'$ with ε = ε' = 0. The regularity in Proposition 3 is enough for Lemma 3.3 to be valid. Moreover, as Theorem 2.1 is valid on [−T, T] when ε = 0, we get uniqueness on [−T, T].
We now aim to recover the loss of the derivative in Proposition 3 following the regularization method of Bona-Smith [2]. That is, we regularize the data with a parameter κ and then apply Theorem 2.1 to the difference at the top level of regularity, keeping careful track of κ.
Lemma 4.1. Let $K \Subset H^{14}$ be a compact set. Then for all κ > 0 and any $u_0 \in K$, $u_{0,\kappa} \in \mathcal{S}$ satisfies with the convergence rate dependent on K.
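Estimates of Bona-Smith type can be illustrated with a Fourier cutoff. The regularization below (the cutoff function χ and the resulting $u_{0,\kappa}$) is an assumption made for this sketch; the paper's definition of $u_{0,\kappa}$ and the precise statement of Lemma 4.1 are not reproduced in this text.

```latex
% Assumed regularization: \chi Schwartz, \chi = 1 on |\eta| \le 1, and
%   \hat u_{0,\kappa}(\xi) = \chi(\kappa\xi)\,\hat u_0(\xi), \quad 0 < \kappa \le 1 .
% Higher norms blow up at a controlled rate, using
%   \langle\xi\rangle \le \kappa^{-1}\langle\kappa\xi\rangle for 0 < \kappa \le 1:
\[
\|u_{0,\kappa}\|_{H^{s+j}}
 \le \kappa^{-j}\,\sup_{\eta}\langle\eta\rangle^{j}|\chi(\eta)|\;\|u_0\|_{H^s} .
\]
% Lower norms of the difference converge at a rate, since 1 - \chi(\kappa\xi)
% vanishes for |\xi| \le 1/\kappa and 1 \le \kappa^{j}\langle\xi\rangle^{j} elsewhere:
\[
\|u_0 - u_{0,\kappa}\|_{H^{s-j}}^2
 \le \|1-\chi\|_{L^\infty}^2\,\kappa^{2j}
     \int_{|\xi| \ge 1/\kappa}\langle\xi\rangle^{2s}|\hat u_0(\xi)|^2\,d\xi
 = o(\kappa^{2j}) \quad (\kappa \to 0) .
\]
```

The tail integral tends to zero for each fixed $u_0 \in H^s$, and uniformly for $u_0$ in a compact subset of $H^s$, which is consistent with a convergence rate that depends on K.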

Remark 2. Lemmata 4.1 and 4.2 hold for
for all j ≥ 0 and similarly for $K \Subset C^0_I H^{4,2}$. For $L^1 H^{8,2}$ redo the lemmas using for o(1) convergence rate, which follows from Lebesgue dominated convergence in t and ξ.
Finally, for $f \in C^0_I H^{4,2}$ we use the continuity of f to see that f(I) is a compact set in $H^{4,2}$.
Then considering (1.6) with data $(u^\varepsilon_\kappa(kT), f(t + kT))$ Lemma 3.3 is valid on $I_k$ and Lemma 3.4 can be refined to hold for $0 \le s \le 8 + j$. Namely, there is no way to get terms involving $\partial_x^{14}u^\varepsilon \cdot \partial_x^{14+j}u^\varepsilon$ by differentiating $N_{u^\varepsilon}(u^\varepsilon)$ 17 times. This gives with $C = A(\|(u^\varepsilon_\kappa, f_\kappa)\|_Y)$ from Theorem 2.1. Choosing T small enough and inducting on k finishes the proof.
Theorem 4.3. $u^\varepsilon_\kappa \to u^\varepsilon$ as $\kappa \to 0$ in $L^\infty_I H^{14}_x$ uniformly in $0 < \varepsilon \le 1$. Moreover, this convergence is uniform for $u^\varepsilon$ coming from data in a compact set $K \Subset B_R(0)$, where $B_R(0)$ is the ball of radius R centered at 0 in Y.
We then subtract (3.10) for $u^\varepsilon_\kappa$ from (3.10) for $u^\varepsilon$ with s = 14 and rewrite it as where $N_{14}$ is the operator from Lemma 3.3 for s = 14 with coefficients evaluated at $u^\varepsilon$ and $F_{14} = F_{14}(x, t, u^\varepsilon, \ldots, \partial_x^{13}u^\varepsilon)$, and likewise $N_{14,\kappa}$ and $F_{14}^\kappa$ are evaluated at $u^\varepsilon_\kappa$ respectively. Therefore, assuming (3.11), we apply Theorem 2.1: $\equiv I_1 + I_2 + I_3$ We estimate the right hand side term by term. By Proposition 4, $I_1 = o(1)$ as $\kappa \to 0$. For $I_2$ we write by the Fundamental Theorem of Calculus, Hence by Sobolev For $I_3$, we proceed as for $I_2$ using the Fundamental Theorem of Calculus Hence by Cauchy-Schwarz and Sobolev: Thus by (4.9) and (4.7) Combining the estimates for terms $I_1$-$I_3$, (4.9) and using the equivalence of norms (1.5) gives $u^\varepsilon_\kappa \to u^\varepsilon$ in $C^0_I H^{14}$ provided (3.11) is valid.
To finish the proof without (3.11), we proceed as in the proof of Theorem 3.5, except that when we regularize the data $(u_{0,m}, f_m) \to (u_0, f)$ in Y, we use compactness of $K = \{(u_{0,m}, f_m)\} \cup \{(u_0, f)\}$ in Y to ensure that the convergence rate $u^\varepsilon_{m,\kappa} \to u^\varepsilon$