GRADIENT BOUNDS FOR STRONGLY SINGULAR OR DEGENERATE PARABOLIC SYSTEMS

. We consider weak solutions u : Ω T → R N to parabolic systems of the type u t − div A


Introduction
In this paper, we are interested in the regularity of weak solutions u : Ω T → R N of strongly degenerate or singular parabolic systems of the type ∂ t u i − div A i (x, t, Du) = f i for i ∈ {1, . . ., N} in Ω T = Ω × (0, T ). (1.1) Here and in the following Ω is a bounded and open subset of R n (n ≥ 2), T > 0 and N ≥ 1.
The exact assumptions on the vector field A = (A 1 , . . ., A N ) : Ω T × R N n → R N n and the vectorvalued inhomogeneity f = (f 1 , . . ., f N ), N ≥ 1, will be discussed in detail later.As our main result we will show that weak solutions are locally Lipschitz-continuous in the spatial directions, i.e. that the spatial gradient Du is locally bounded.The chosen terminology can be illustrated by the model system of equations.In fact, in the prototypical system we have in mind, the vector field A is given by for some p > 1.First, we note that any time-independent 1-Lipschitz continuous function u is a solution of the homogeneous model system.Accordingly, well established methods from regularity theory, using the second weak spatial derivatives of u, cannot be utilized.We will circumvent this difficulty by approximation.Before we specify in detail the structure of the considered system of equations, we briefly discuss some results already available in the literature.So far, most progress has been made for the associated elliptic, i.e. time-independent, problem.L. Brasco [4] proved the Lipschitz continuity of weak solutions.A. Clop, R. Giova, F. Hatami and A. Passarelli di Napoli [9] generalized this result for systems (N ≥ 2).The two previous works arise from the study of strongly degenerate functionals.These can be regarded as asymptotically convex functionals, i.e.
1 functionals having a p-Laplacian type structure only at infinity.Such class of functionals has been widely investigated, since the local Lipschitz regularity result by Chipot and Evans [8].In particular, we mention generalizations allowing super-and sub-quadratic growth [22,25,29], as well as lower order terms [28].Extensions to various other contexts can be found in the non-exhaustive list [12]- [14], [16]- [20], [32].The first result concerning gradient continuity was achieved by F. Santambrogio and V. Vespri [31].The authors restricted themselves to equations and the dimension n = 2 and proved that a gradient dependent function of the form g(Du) is continuous whenever g is continuous and vanishes on the set of degeneracy, i.e. on the unit ball in case of the prototypical equation.Subsequently, M. Colombo and A. Figalli [10,11] extended this result to arbitrary dimensions n ≥ 2. A similar result in the vectorial setting was recently shown by V. Bögelein, F. Duzaar, R. Giova and A. Passarelli di Napoli [6].L. Mons [27] extended this result to systems satisfying more general structural conditions.On the contrary, for the time-dependent problem, little is known so far.The first author [2] succeeded in showing a fractional higher differentiability result.Furthermore, A. Gentile and A. Passarelli di Napoli as well as the first author and A. Passarelli di Napoli obtained a higher differentiability result in [21] and [3] respectively.
In this paper we prove the boundedness of the spatial gradient under quite general assumptions on the vector field A. In particular, this can be seen as a parabolic analogue of Brasco's Lipschitz-continuity result.Our main result reads as follows.For notation and definitions we refer to Section 2.
In the subcritical case 1 < p ≤ 2n n+2 , the weak solutions to (1.1) may not be bounded (see [15,] and note that the p-Laplacian satisfies our growth assumptions).Therefore, the extra assumption u ∈ L ∞ (Q R (z 0 ), R N ) in Theorem 1.1 (b) is natural.
Note that some widely degenerate systems can be interpreted, similarly to functionals, as asymptotically regular systems.Therefore, in the special case that A(x, t, Du) ≡ A(Du) ≈ |Du| p−2 Du for |Du| ≫ 1 and p > 2n n+2 we recover the results obtained in [24,5].
Finally, we explain the choice of the parameter n, which is defined by for some β > 0. If n = 2 and 1 < p < 2 we choose β in such a way that 0 < β < 4(p − 1) 2 − p .(1.4)This choice ensures that in the case n = 2 we have p > 2n n+2 for any p > 1.Indeed, this condition will be decisive to carry out the proof in Subsection 6.3.
1.1.Structural conditions on the vector field A. To specify the structure of the vector field A : Ω T × R N n → R N n , we consider a function which is convex with respect to the s-variable and vanishes for all s ∈ [0, 1].Furthermore, we shall assume that the partial map s → F (x, t, s) is in C 1 (R + ) ∩ C 2 (R + \ {1}) for almost every (x, t) ∈ Ω T , while for every (t, s) ∈ (0, T ) × [0, ∞) the map x → F (x, t, s) is differentiable almost everywhere.We additionally suppose that there exist an exponent p > 1 and some positive constants L, C 1 > 1 and K such that, for all s > 1 and for almost every x, y ∈ Ω and t ∈ (0, T ), the function F satisfies the following growth assumptions: Then, the vector field A : Ω T × R N n → R N n is defined by A(x, t, ξ) := ∂sF (x,t,|ξ|) |ξ| Note that for the prototypical case where a : Ω T → R + and x → a(x, t) is a Lipschitz continuous function that is bounded from below by a positive constant, one can easily deduce that the growth conditions (H 1 )−(H 4 ) are fulfilled.If furthermore a(x, t) ≡ 1, then we recover the vector field in (1.2).
1.2.Strategy of the proof.Our approach is inspired by [9,12,16].The proof will be achieved by a parabolic Moser iteration technique.However, the implementation is quite subtle due to the degeneracy of the differential operator.Since weak solutions may not be twice weakly differentiable with respect to the x-variable, we approximate the original system (1.1).The approximation is chosen in such a way that the regularized vector field A ε , ε > 0, satisfies standard p-growth assumptions.However, proceeding at this point with a standard Moser iteration, the constants would blow up as ε ↓ 0. This problem will be overcome by the choice of an appropriate test function in the derivation of the Caccioppoli-type inequality.At this point it is helpful to realize that for δ > 0 large enough and |Du ε | > 1 + δ the p-growth conditions of the approximating systems are satisfied independently of ε.Therefore, we choose a test function vanishing on the set {|Du ε | ≤ 1 + δ}.We note that the choice of such a test function requires the existence of second weak spatial derivatives, which is the reason why we had to introduce the approximating solutions.This test function allows to obtain a Caccioppolitype inequality and in turn we set up a kind of Moser iteration scheme.Note that, similarly to the treatment of the parabolic p-Laplacian, in the Moser iteration we have to distinguish between three different regimes of the parameter p.The first one is the superquadratic case p ≥ 2, the second one is the subquadratic supercritical case 2n n+2 < p < 2 and finally we have the subcritical case 1 < p ≤ 2n n+2 .In the last case, we need to ensure that also the approximating solutions u ε are uniformly bounded with respect to ε.This is achieved by a maximum principle.In all three previous cases, we prove the boundedness of the spatial gradient Du ε of the approximating solutions together with quantitative estimates.Here it is worthwhile to note that our quantitative estimates are uniform in ε.Since the approximating solutions converge strongly in L p to the original solution, the statement of Theorem 1.1 follows after passing to the limit as ε ↓ 0.
1.3.Plan of the paper.In Subsection 2.1, we introduce the notation adopted throughout the paper.In Subsections 2.2 and 2.3, we recall some basic facts on the used function spaces and the regularization in time.In the following Subsections 2.4 and 2.5, the approximating vector field is defined and its growth properties are obtained.Thereby, we distinguish the subquadratic and superquadratic cases.Subsection 2.6 is devoted to some algebraic inequalities needed for our purposes.
In Section 3, we define the approximating problems and prove a strong convergence result in L p .As a by-product, we obtain the uniqueness of the weak solutions to a Cauchy-Dirichlet problem associated with system (1.1).
In Section 4, we establish a maximum principle for solutions to the approximating problem.This allows us to show that the approximating solutions are essentially bounded, provided that the original solution is itself essentially bounded.
In Section 5 we lay the groundwork for Section 6, where we prove local L ∞ -bounds for the spatial gradient of the approximating solutions.The proof is divided into several subsections.In Subsection 6.1 we establish a suitable Caccioppoli-type inequality, from which we derive a reverse Hölder-type inequality in Subsection 6.2.In these subsections, we need to address separately the three regimes of the parameter p that we referred to above.For this reason, the Moser iteration procedure is split into two steps: in Subsection 6.3 we deal with the supercritical case, while in Subsection 6.4 we consider the subcritical one.
Finally, in Section 7 we give the proof of Theorem 1.1.
Acknowledgements.Part of this paper was written while P. Ambrosio was visiting the University of Salzburg in November and December 2022.He is thankful to Verena Bögelein and the host institution for their kind invitation and constant support.The authors gratefully acknowledge fruitful discussions with Verena Bögelein and Antonia Passarelli di Napoli, who provided valuable comments and suggestions during the preparation of the manuscript.This work has been partially supported by the FWF-Project P36295-N.P. Ambrosio is a member of the GNAMPA group of INdAM that partially supported his research through the INdAM−GNAMPA 2024 Project "Fenomeno di Lavrentiev, Bounded Slope Condition e regolarità per minimi di funzionali integrali con crescite non standard e lagrangiane non uniformemente convesse" (CUP: E53C23001670001).

Preliminaries
2.1.Notation and essential tools.In this paper we shall denote by C or c a general positive constant that may vary on different occasions, even within the same line of estimates.Relevant dependencies on parameters and special constants will be suitably emphasized using parentheses or subscripts.The norm we use on the Euclidean spaces R k will be the standard Euclidean one and it will be denoted by |•|.In particular, for the vectors ξ, η ∈ R N n , we write ξ, η for the usual inner product and |ξ| := ξ, ξ 1 2 for the corresponding Euclidean norm.For points in space-time, we will frequently use abbreviations like z = (x, t) or z 0 = (x 0 , t 0 ), for spatial variables x, x 0 ∈ R n and times t, t 0 ∈ R. We also denote by B ̺ (x 0 ) = {x ∈ R n : |x − x 0 | < ̺} the n-dimensional open ball with radius ̺ > 0 and center x 0 ∈ R n ; when not important, or clear from the context, we shall omit to denote the center as follows: B ̺ ≡ B ̺ (x 0 ).Unless otherwise stated, different balls in the same context will have the same center.Moreover, we use the notation for the backward parabolic cylinder with vertex (x 0 , t 0 ) and width ̺.We shall sometimes omit the dependence on the vertex when all cylinders occurring in a proof share the same vertex.
For a general cylinder Q = B × (t 0 , t 1 ), where B ⊂ R n and t 0 < t 1 , we denote by where 1 denoting the standard, non-negative, radially symmetric mollifier in R n+1 .Obviously, here the function υ is meant to be extended by the k-dimensional null vector outside Q.
In this work, we define a weak solution to (1.1) as follows: We conclude this first part of the preliminaries by recalling the following iteration lemma, which is a standard tool for "reabsorbing" certain terms and can be found, for example, in [23,Lemma 6 for all ρ 0 ≤ ρ < r ≤ ρ 1 , for some σ > 0, ϑ ∈ [0, 1) and a non-negative constant C.Then, there exists a constant κ ≡ κ(σ, ϑ) > 0 such that for all ρ 0 ≤ ρ < r ≤ ρ 1 we have

Orlicz spaces.
Here we recall some basic properties of Orlicz spaces that will be needed later on (for more details, we refer to [1]).Let Ψ : [0, ∞) → [0, ∞) be a Young function, i.e.Ψ(0) = 0, Ψ is increasing and convex.If Σ is an open subset of R k , we define the Orlicz space L Ψ (Σ) generated by the Young function Ψ as the set of the measurable functions v : for some λ > 0. When equipped with the Luxemburg norm The Zygmund space L q log α L(Σ), for 1 ≤ q < ∞, α ∈ R (α ≥ 0 for q = 1), is defined as the Orlicz space L Ψ (Σ) generated by the Young function Ψ(s) ≃ s q log α (e+ s) for every s ≥ s 0 ≥ 0. Therefore, a measurable function v on Σ belongs to L q log α L(Σ) if Moreover, we record that for the function holds for every v ∈ L q log α L(Σ) and some θ = θ(q) > 0 (see [9]).

Steklov averages.
In this section, we recall the definition and some elementary properties of Steklov averages.Let us denote a domain in space time by Q ′ := Ω ′ × I, where Ω ′ ⊆ Ω is a bounded domain and I := (t 1 , t 2 ) ⊆ (0, T ).For every h ∈ (0, where k ∈ N, the Steklov average [v] h (•, t) is defined by for x ∈ Ω ′ .This definition implies, for almost every The proof of the following result is straightforward from the theory of Lebesgue spaces (see [15,Lemma 3.2]).
For further needs, we now recall the well-known Steklov average formulation of (1.1) in 2.4.Approximation of the function F for 1 < p ≤ 2. As already mentioned, we assume that for almost every (x, t) Moreover, we notice that in the case 1 < p < 2 both bounds for ∂ ss F (x, t, •) in (H 3 ) blow up as s → 1 + .This very singular behavior of F must be avoided, since we need to use the second derivative ∂ ss F to establish a local bound for the spatial gradient of the weak solutions to (1.1).Therefore, for 1 < p < 2 and for almost every (x, t) ∈ Ω T we approximate the partial map s → F (x, t, s) by smoothing it around the singularity of ∂ ss F (x, t, •), in such a way that the resulting approximation F ε (x, t, •) coincides with F (x, t, •) outside a small neighborhood of the singularity s = 1.Thus, in this section we will assume that 1 < p ≤ 2, unless otherwise stated.For ε ∈ 0, 1   2   we define the function where η 1 ∈ C ∞ 0 ((−1, 1)) denotes the standard, non-negative, radially symmetric mollifier in R. Keeping this definition in mind, in what follows we will show that an approximation of F (x, t, •) is given by Fε (x, t, s) := F (x, t, v ε (s) + 1), ε ∈ 0, 1 2 .Firstly, we need to prove that Fε satisfies non-degenerate growth conditions for 0 < ε < 1  2 .Therefore, we begin our analysis by studying the growth of this function.
By assumption, we know that for almost every (x, t) ∈ Ω T the map s → Fε (x, t, s) belongs to C 2 (R + ).For later purposes, we now note that one can easily check that for all s ≥ 0 and for some universal constant C > 0. Furthermore, from the growth assumption (H 1 ) and from definition (2.3) we can deduce that for all s ≥ 1 + 2ε and for almost every (x, t) ∈ Ω T .As for the derivatives of Fε with respect to the s-variable, for almost every (x, t) ∈ Ω T we have and and from assumptions (H 2 ) and (H 3 ) it follows that ∂ s Fε , ∂ ss Fε ≥ 0.Moreover, combining (2.5), (2.6), (H 2 ) and the fact that v ε (s) ≤ max{2ε, s − 1} ≤ s for every s > 1 and every ε ∈ 0, 1 2 , for almost every (x, t) ∈ Ω T we obtain for any s > 0 and any ε ∈ 0, 1  2 .As for the second derivative ∂ ss Fε , combining (2.5), (2.7), (H 2 ) and (H 3 ), for every s > 0 and for almost every (x, t) ∈ Ω T we find Now, using the fact v ε (s) ≤ 2ε for s < 1+2ε, we can estimate the second term on the right-hand side of (2.8) as follows: (2.9) In order to deal with the first term on the right-hand side of (2.8), we distinguish between the cases 1 < s < 1 + 2ε and s ≥ 1 + 2ε.In the first case, we have v ε (s) ≥ ε and therefore we get (2.10) If, on the other hand, s ≥ 1 + 2ε, then we have v ε (s) = s − 1 due to (2.4).In this case, using that (s − 1) 2 ≥ ε 2 4 (1 + s 2 ) we obtain (2.11) Joining estimates (2.8)−(2.11),for every s > 0 and for almost every (x, t) ∈ Ω T we then have where c ≡ c(p, C 1 ) > 0. This concludes the necessary growth estimates of Fε .Now, in order to prove that the function Fε is indeed a good approximation of F , it remains to analyze the behavior of Fε as ε ց 0. To this end, we notice that (2.4) immediately implies for s / ∈ (1, 1 + 2ε).
(2.12) Furthermore, for s ∈ [1, 1+2ε] we can estimate the difference of these two derivatives as follows: Let us explicitly observe that the last term tends to zero as ε ց 0. To ensure the convergence result of Lemma 3.4 below, we need to accelerate the rate of convergence.For this reason, for any 1 < p < 2 we now define the new approximation Collecting the above conclusions, we note that F ε has the following properties: Lemma 2.4.For every ε ∈ (0, 2 1−p ), almost every (x, t) ∈ Ω T and every s ∈ R + 0 , we have However, notice that 2 1−p ≥ 1 2 whenever 1 < p ≤ 2, which implies that the previous lemma holds for any ε ∈ (0, 1/2).
2.5.Approximation of the vector field A. With the approximation (2.14) of F and Lemma 2.4 in mind, we can define for every ε ∈ (0, 1/2) the vector field (2.15) We thus approximate the structure function A by means of the vector fields A ε , in order to be allowed to apply some results from the theory of singular or degenerate parabolic systems to the weak solutions of problem (P ε ), introduced in Section 3. Therefore, we now need to check whether A ε fulfills non-degenerate growth conditions.This is what we will do hereafter.
From the growth conditions of ∂ s F ε and from the structure of the approximation, for any p > 1 and for any ε ∈ (0, 1  2 ) we immediately obtain for every ξ ∈ R N n and for almost every (x, t) ∈ Ω T .As for the spatial gradient of A ε , by the assumption (H 4 ) we have (2.17) for every ξ ∈ R N n , for every ε ∈ (0, 1  2 ) and for almost every (x, t) ∈ Ω T .Now we want to examine the structure of D ξ A ε (x, t, ξ).To this end, for any ξ ∈ R N n \ {0} and for any ε ∈ (0, 1  2 ) we define the bilinear form and observe that The next lemma provides the relevant ellipticity and boundedness properties of the bilinear form A ε (x, t, ξ).Lemma 2.5.Let 1 < p < ∞ and ε ∈ (0, 1  2 ).Then, there exists a positive constant (2.19) ) Proof.From (2.15) it follows that In order to prove the assertion, we distinguish between two cases.When ∂ s h ε (x, t, |ξ|) < 0, from Lemma 2.4 and the growth assumption (H 2 ) we obtain where we have applied the inequality for s > 0 and p > 2.
This proves the upper bound in (2.19), and the one in (2.20) is an immediate consequence.Moreover, using the Cauchy-Schwarz inequality, we get thus proving the lower bound in (2.19).To obtain the lower bound in (2.20), we observe that for some positive constant c ≡ c(p, C 1 , δ).
In the case ∂ s h ε (x, t, |ξ|) ≥ 0, applying the Cauchy-Schwarz inequality again, from Lemma 2.4 and the growth condition (H 3 ) we get where we have used the inequality (s − 1) p−2 , which holds for every s ≥ 0 and every p > 2. To obtain the lower bound in (2.19), we neglect the term ∂ s h ε (x, t, |ξ|) and use the fact that for every p > 1.
Finally, to establish the bounds in (2.20), one can argue as above.This time, for |ξ| ≥ 1 + δ 2 we are allowed to use the growth assumptions (H 2 ) and (H 3 ) also in the case 2 |ξ| 2 by zero from below and make use of After doing this, we get the desired estimates by means of the following inequalities, which hold for and (2.22) From the previous lemma, it follows that the bilinear form (λ, ζ) → A ε (x, t, ξ)(λ, ζ) defines a scalar product on the Euclidean space R N n 2 .As for the modulus of A ε , we get the following result: Lemma 2.6.Let 1 < p < ∞ and ε ∈ (0, 1  2 ).Then, there exists a positive constant C ≡ C(p, C 1 , ε) such that, for any ξ ∈ R N n \ {0} and any λ, ζ ∈ R N n 2 , we have Now, by (H 2 ) and (H 3 ), in the case p > 2 we get In the case 1 < p ≤ 2, we use the estimates from Lemma 2.4 to find that We thus obtain the first conclusion of this lemma.Finally, due to equality (2.12), if |ξ| ≥ 1 + δ 2 we only need to use the assumptions (H 2 ) and (H 3 ) together with the inequalities (2.21) and (2.22) to obtain from (2.23) These inequalities conclude the proof.
2.6.Algebraic inequalities.In this section, we gather some relevant algebraic inequalities that will be needed later on.We start with an elementary assertion, which will be used in the Moser iteration.
Lemma 2.7. (2.25) Proof.The proof of the second inequality is similar to the one of Lemma 2.3. in [7].Thus we only prove the first inequality: For δ ∈ (0, 1] we define the auxiliary function The following two lemmas are concerned with auxiliary estimates for the functions A ε and G δ defined above. Lemma 2.8.Let 1 < p < ∞, δ > 0 and ε ∈ (0, 1  2 ).Then, there exists a positive constant Proof.Here we argue as in [6,Lemma 2.8].The inequality on the left-hand side of (2.20) implies that It remains to estimate the integral in the right-hand side of (2.26).To this end, we distinguish whether or not this implies a bound from below in the form Thus, we obtain that In the case | ξ| > |ξ|, we estimate |ξ s | from below as follows Therefore, for s ∈ 3|ξ|+1+δ/2 4|ξ| , 1 we get which yields Using the previous lemma, we can easily achieve the following result: Then, for every ν > 0 and almost every x ∈ Ω we have Therefore, the claimed inequality immediately follows from Lemma 2.8, by the positivity of the right-hand side.
Thus, we only need to consider the case where either |ξ| > 1 + δ or | ξ| > 1 + δ.In order to do this, we first assume that |ξ| ≥ | ξ|.Note that this implies while in the case 1 < p < 2 we use Young's inequality with exponents where we have used the fact that 0 < ε < Absorbing the last term on the right-hand side into the left-hand side, we obtain the desired result.

A family of regularized parabolic problems
In this section, we let u be a weak solution of (1.1) and introduce the approximating Cauchy-Dirichlet problem (P ε ), where ε ∈ (0, 1  2 ) is the approximation parameter.Denoting by u ε the unique weak solution of this problem, we will prove the strong convergence G δ (Du ε ) → G δ (Du) in L p as ε → 0. As an easy consequence of this convergence, we then establish the uniqueness of weak solutions to an initial-boundary value problem associated with (1.1).
To set up the approximating problem, for ε ∈ (0, 1 2 ) we define the truncated vector-valued function f ε by (3.1)Moreover, we consider a space-time cylinder Q ′ := Ω ′ × I, where Ω ′ ⊆ Ω is a bounded domain and I := (t 1 , t 2 ) ⊆ (0, T ).In what follows, it will suffice to assume that In this framework, we identify a function as a weak solution of the Cauchy-Dirichlet problem if and only if u ε is a weak solution of (P ε ) 1 and, moreover, Remark 3.2.We know that the regularized parabolic system (P ε ) 1 fulfills standard p-growth conditions by virtue of (2.18) and Lemma 2.5.The existence of a unique weak solution to (P ε ) can be inferred from the classic existence theory, cf.[26, Chapter 2, Theorem 1.2 and Remark 1.2].By a difference quotient method, one can show that Du ε is locally bounded and that u ε admits second weak spatial derivatives in L 2 loc , see [15,Chapter 8].For this to hold in the subcritical case 1 < p ≤ 2n n+2 , we additionally have to require that u ε belongs to L r loc (Q ′ , R N ), where r ≥ 2 satisfies n(p − 2) + rp > 0 (see again [15,Chapter 8]).Since, in the subcritical case, we always assume that u ε ∈ L ∞ loc (Q ′ , R N ), the latter requirement is trivially fulfilled.Theorem 3.3.Let p > 1 and u ε be the weak solution of problem Moreover, assume that u ε satisfies the requirements of Remark 3.2 in the case We now prove the following strong convergence result: be a weak solution of (1.1) and assume that is the unique energy solution of problem (P ε ).Then, there exists a constant C ≡ C(p, n, δ) ∈ (0, min{ 1  2 , ( δ 4 ) p−1 }) such that for every ε ∈ (0, C), the estimate holds for some positive constant C ≡ C(p, C 1 , δ).In particular, this estimate implies that Proof.We observe that (u ε − u) ∈ L p (I; W 1,p 0 (Ω ′ , R N )) by the lateral boundary condition.Unfortunately, we cannot test systems (1.1) and (P ε ) 1 with the function u ε − u, since weak time derivatives might not exist.Therefore, we resort to the equivalent Steklov averages formulations of (1.1) and (P ε ) 1 , thus obtaining for every t ∈ I = (t 1 , t 2 ), for every h ∈ (0, t 2 − t 1 ) and every φ ∈ W 1,p 0 (Ω ′ , R N ) ∩ L 2 (Ω ′ , R N ).Then, for a fixed time slice Ω ′ × {t} we can choose the function φ(•, t) defined by as a test function in (3.4).For any fixed τ ∈ I, the term involving the time derivatives yields for every h ∈ (0, τ − t 1 ).Now we pass to the limit as h → 0. Using Lemma 2.3 and taking into account the growth conditions of A ε and the fact that u ε = u on Ω ′ × {t 1 } in the L 2 -sense, from (3.4) and (3.6) we obtain the following inequality for every τ ∈ I. Taking the supremum over τ ∈ I, we thus obtain We now apply Young's inequality with exponents (2, 2) to control the last integral as follows Joining the two previous estimates, we get In order to estimate the first integral on the right-hand side of (3.7), we now distinguish whether or not p ≥ 2. If p ≥ 2, then we have which implies In what follows, we will denote by C a general positive constant that only depends on p, C 1 and δ.Using the previous estimate, Lemma 2.9 with ν := n 2(n+2) < 1 2 and Young's inequality with exponents (p, p p−1 ), we have where we have also exploited the facts that ε < ε 1−ν and ε 1−ν < ε ν , since 0 < ε < 1 2 .In the case 1 < p < 2, we need to use (2.13) and (2.14), which imply for every ξ ∈ R N n \ {0} and for almost every (x, t) ∈ Ω ′ × I. Using the above estimate with ξ = Du and arguing as in the case p > 2, we now obtain from (3.7) that for any 0 < ε < min{ 1 2 , ( δ 4 ) p−1 }.Now, from the proof of [6, Lemma 2.3] we have that Using this information to estimate the right-hand side of both (3.8) and (3.9), for every p > 1 we get At this point, notice that since ν := n 2(n+2) > 0. Also, recall that ε is small enough to have 0 < ε < min{ 1 2 , ( δ 4 ) p−1 }.From this and from (3.10) it follows that there exists a constant C ≡ C(p, n, δ) ∈ (0, min{ 1  2 , ( δ 4 ) p−1 }) such that, for every ε ∈ (0, C), we have This allows to absorb the second term on the right-hand side of the last estimate into the left-hand side, so that we finally obtain the desired conclusion.
We are now in a position to prove the uniqueness of weak solutions to the Cauchy-Dirichlet problem where Theorem 3.5.Let Ω ⊂ R n be a bounded domain and let f ∈ L 2 (Ω T , R N ).Then, the Cauchy-Dirichlet problem (3.11) admits at most one weak solution.
)) be a weak solution of (3.11) and let be the unique weak solution to (P ε ) with Q ′ := Ω ′ × I = Ω T .Then, by Lemma 3.4, for every δ > 0 there exists a constant C ∈ (0, min{ 1 2 , ( δ 4 ) p−1 }) such that for any ε ∈ (0, C) we have for some positive constant C that is independent of ε.Notice that the right-hand side of (3.12) converges to zero as ε → 0 + .Moreover, we have Combining estimates (3.12) and (3.13), we infer that This implies the uniqueness of the weak solutions to problem (3.11), by virtue of the uniqueness of limits in Lebesgue spaces and the uniqueness of the energy solutions to problem (P ε ).

Maximum principle for the homogeneous system
In this section, we want to establish a maximum principle for the homogeneous system in the case 1 < p ≤ 2n n+2 .For the proof, we need to assume that f = 0 and note that the assumed boundedness of u ε in Remark 3.2 is implied by the maximum principle.Therefore, this assumption is not restrictive whenever f ≡ 0. By (3.1), we then have f ε = 0 for every ε ∈ (0, 1  2 ).Keeping in mind the notation introduced in Section 3, we now set and denote by w the N-dimensional vector whose components are all equal to k.Notice that w is a trivial solution of system (1.1) with f = 0.Moreover, one can easily check that, for every i ∈ {1, . . ., N} and for almost every t ∈ I := (t 1 , t 2 ), we have Using the Steklov averages formulations of (P ε ) 1 and (1.1) with w instead of u, and arguing as in the proof of Lemma 3.4 with (4.1) in place of (3.5), one can easily obtain for every τ ∈ I. Since A ε (x, t, •) is a monotone vector field and Dw = A ε (x, t, Dw) = 0, we can omit the latter integral, thus obtaining Therefore, for every i ∈ {1, . . ., N} and every ε ∈ (0, 1 2 ), we have Similarly, but using the function 1), we get u i ε (x, τ ) ≥ −k for any i ∈ {1, . . ., N}, for any ε ∈ (0, 1  2 ) and for almost every (x, τ ) ∈ Q ′ .Thus, we can conclude that for all ε ∈ 0, 1 2 .

Weak differentiability
Here we derive some higher differentiability results that will be useful in the following.These results involve the spatial gradient of the weak solution to problem (P ε ) with To begin with, for each γ ≥ 0 and a > 0 we consider the function Φ γ,a (w) := w 2 (a + w) γ−2 , w ≥ 0. (5.1) For this function, one can easily check that from which we can immediately deduce for every w ∈ (0, a).Using these results, we obtain the following lemma.
Lemma 5.1.Let p > 1, ε ∈ (0, 1  2 ), γ ≥ 0 and a > 0.Moreover, assume that Notice that for m ∈ R + the function Φ γ,a,m (w) := min{Φ γ,a (w), m}, w > 0, is Lipschitz continuous.By Theorem 3.3 (see also Remark 3.2 for the case 1 < p ≤ 2n n+2 ), we have that Then, by virtue of the chain rule in Sobolev spaces, we obtain This is sufficient to prove the assertion.Moreover, we have Now we focus on the map Using the notation (2.1), we obtain the following result: Proof.Integrating by parts, for 0 < ̺ ≪ 1, for every φ ∈ C ∞ 0 (Q R (z 0 )) and every i ∈ {1, . . ., n} we obtain In order to pass to the limit as ̺ ց 0 under the integral signs, we need to estimate the above integrands.From (2.16) it follows that for some positive constant c ≡ c(p, C 1 , ε).As for the first integrand on the right-hand side of (5.4), thanks to (2.17) we get Moreover, by Lemma 2.6 we obtain where C ≡ C(p, C 1 , ε) > 0. Now we use a well-known result on the convergence of mollified functions to deduce that ), as ̺ → 0 + .Combining this with estimates (5.5)−(5.7)and applying the Generalized Lebesgue's Dominated Convergence Theorem (see [30,page 92,Theorem 17]) to both sides of (5.4), we find that ) and every i ∈ {1, . . ., n}.This implies the assertion.In addition, from (5.8) we obtain that for every i ∈ {1, . . ., n}.
For further needs, we now introduce the auxiliary function H λ : R N n → R + defined by where λ > 0 is a parameter.For this function we record the following result, whose proof is omitted, since it is similar to that of Lemma 5.1.

Local boundedness of Du ε
As before, by u we denote a weak solution of (1.1) and we let u ε be the unique weak solution of the regularized problem Our aim in this section is to establish local L ∞ estimates for Du ε with constants independent of ε.We will achieve this result by using the Moser iteration technique, which is based on Caccioppolitype inequalities.We will obtain this kind of inequalities by first differentiating the system of differential equations, and then testing the resulting equation with a suitable power of the weak solution itself.The groundwork for doing this has been laid in Section 5.
To move forward, we now fix δ > 0 and γ ≥ 0.Moreover, to shorten our notation we set and we drop the subscript ε for the weak solution u ε and the subscripts γ, a for the function Φ γ,a defined in (5.1).Therefore, from now on we will simply write u and Φ in place of u ε and Φ γ,a respectively, unless otherwise stated.To simplify our notation even more, we additionally introduce the function P : Q R (z 0 ) → R + 0 defined by P := (|Du| − a) + , and its "mollified" version with an intentional abuse of the notation (2.1) on the left-hand side of (6.1).By the elementary properties of Sobolev functions, the weak spatial gradient of P is given by Step 1: Caccioppoli-type inequalities.In order to prove the local boundedness of Du, we now test the weak formulation of system (P ε ) 1 with the function D i φ̺ , where ) is a non-negative cut-off function that will be specified later.We thus obtain At this stage, we fix ).With such a choice of ψ and integrating by parts, for the term involving the time derivative we get Now, for γ > 0 we estimate the inner integral from above and below as follows: and Recalling that ∂ t χ ≤ 0, ∂ t ω ≥ 0 and χ, ω ≥ 0, and plugging the two previous inequalities into (6.4),we obtain Inserting this into (6.3) and letting ̺ ց 0 yields the following inequality where the terms J 0 -J 9 are defined by For convenience of notation, we now abbreviate . In what follows, we will estimate each of the last nine terms above.We first prove that J 2 is non-negative, thus we can drop it in the following.We limit ourselves to dealing with the case p > 2, since for 1 < p ≤ 2 the same result follows in a similar way.For p > 2 we have where δ mk and δ ℓj denote the Kronecker delta.Using the above equality and the expression for D|Du|, we then obtain almost everywhere in the set Q r ∩ {|Du| > a}.Now, the Cauchy-Schwarz inequality implies that Combining this with the fact that ∂ s F (x, t, |Du|), ∂ ss F (x, t, |Du|) ≥ 0, from (6.6) we infer that almost everywhere in Q r ∩ {|Du| > a}.Furthermore, we obviously have Thus, taking into account that ψ ≥ 0 and Φ ′ (w) = (a + w) γ−3 w(γw + 2a) ≥ 0 for every w ≥ 0, we arrive at the desired conclusion, that is J 2 ≥ 0.
We now deal with the terms J 1 and J 3 .We let 0 < ε < min{ 1 2 , ( δ 4 ) p−1 }.By Lemma 2.5 we have Using the Cauchy-Schwarz inequality together with Young's inequality and again Lemma 2.5, we get where where, in the last line, we have used that Now we estimate the second integral on the right-hand side of (6.10).Using Young's inequality with exponents (2, 2) as well as inequalities (5.2) and ( 5.3) with w = P , for every ζ > 0 we obtain where the last inequality is due to the fact that for every p > 1 on the set Q r ∩ {a < |Du| < 2a}.We now recall that, by virtue of Theorem 3.3, we have At this point, joining estimates (6.10)−(6.13),we obtain We now turn our attention to J 7 and J 9 .Using (2.17) and Young's inequality, we find Similarly, we have Finally we estimate J 8 .Using estimate (2.17) and equality (6.2), we obtain Now we deal with the second integral on the right-hand side of (6.17).Using the fact that |Du| ≤ 2P in Q r ∩ {|Du| ≥ 2a}, the inequality (5.3) with w = P as well as Young's inequality, we get We now estimate the first term on the right-hand side of (6.17) by resorting to the same procedure leading to (6.13).Thus, for every σ > 0 we have Letting σ ց 0 in the last inequality, we then obtain At this point, combining estimates (6.17)−(6.19),we find Now, observe that from inequality (6.5) it follows that since J 2 ≥ 0. Plugging estimates (6.7)−(6.9),(6.14)−(6.16) and (6.20) into (6.21) and choosing κ = 1 4c 2 (n+4) , we arrive at where c ≡ c(n, p, δ, C 1 , K) > 1.
At this stage, we perform a particular choice of the function χ.For a fixed time τ ∈ (t 1 −r 2 , t 1 ) and ϑ ∈ (0, t 1 − τ ), we define the Lipschitz continuous function χ by Therefore, we have Hence, letting ϑ ց 0 and taking the supremum over τ ∈ (t 1 − r 2 , t 1 ), estimate (6.22) turns into In summary, we have so far obtained the following result.
holds true for some constants In what follows, we will write again u and Φ in place of u ε and Φ γ,a respectively, unless otherwise specified.First, we estimate the second integral on the left-hand side of (6.23) from below.To this end, we note that From this identity, we infer where, in the second to last line, we have used the fact that Using estimate (6.24) in combination with Young's inequality, we then obtain from which we deduce Now we consider the case 1 < p ≤ 2n n+2 .This implies Our goal now is to reduce the exponent of H δ 2 (Du) in the second integral on the right-hand side of (6.23).To this end, we observe that In addition, for |Du| > a we have Using the above identity to estimate the last integral of (6.27) and taking into account that η ∈ C ∞ 0 (B r (x 1 )) and ω is independent of the x-variable, we obtain ˆQr(z1) (6.30) Combining (6.23) and (6.29), we obtain in the case 1 < p ≤ 2n n+2 the following inequality For convenience of notation, we now use the indicator function to indicate that the terms multiplied by it need to be taken into account only in the subcritical case 1 < p ≤ 2n n+2 .Integrating (6.25) over Q r (z 1 ) and combining the resulting inequality with (6.31) in the case 1 < p ≤ 2n n+2 , after some algebraic manipulation, for every p > 1 we get the following estimate: Proposition 6.2 (Caccioppoli-type inequality).Let p > 1, γ > 0, δ > 0, b = 1 + δ and 0 < ε < min{ 1  2 , ( δ 4 ) p−1 }.Moreover, let p be defined as in (6.30) and assume that is the unique energy solution of problem (P ε ) with Q ′ = Q R (z 0 ) ⋐ Ω T and u a weak solution of (1.1), satisfying the additional assumption of Remark 3.2 if 1 < p ≤ 2n n+2 .Then, for any parabolic cylinder Q r (z 1 ) ⋐ Q R (z 0 ) with r ∈ (0, 1) and any cut-off functions η ∈ C ∞ 0 (B r (x 1 ), [0, 1]) holds true for some constant k > 1 depending on n, p, δ, C 1 and K. and the Sobolev embedding theorem on the time slices B t := B r (x 1 ) × {t} for almost every t ∈ (t 1 − r 2 , t 1 ), we obtain where C S ≡ C S (N, n, p) ≥ 1.Now, using the definitions of H δ and b, the properties of ω and η, and applying Proposition 6.2, we can estimate the second mean value on the right-hand side of (6.32) as follows: where c ≡ c(n, p, δ, C 1 , K) > 1 and Inserting the preceding inequality into (6.32),integrating with respect to time over (t 1 − r 2 , t 1 ) and using Proposition 6.2 again, we obtain where For a fixed µ > 0 that will be chosen later, we now split the cylinder Q r (z 1 ) as follows where θ ≡ θ(n) > 0. Joining (6.33) − (6.35) and (6.38), after some algebraic manipulation we arrive at where At this point, we consider estimate (6.33) in the case f L n+2 log α L(Qr(z 1 )) = 0 and the above inequality if f L n+2 log α L(Qr(z 1 )) > 0. In the latter case, choosing µ such that from (6.39) we obtain where the constants on the right-hand side are of the type for some constant C ≡ C(N, n, n, α, p, δ, C 1 , K) > 1.Moreover, from (6.33) it immediately follows that inequality (6.40) holds true also when f L n+2 log α L(Qr(z 1 )) = 0.
Let us now consider the case 1 < p ≤ 2n n+2 .Arguing exactly as above, this time we arrive at where the constants C 4 and C 5 are of the same type as in (6.41).

6.3.
Step 3: the iteration for p > 2n n+2 .Thanks to Proposition 6.3, we can now start the Moser iteration procedure.Once again, we shall keep both the assumptions and the notations used in Propositions 6.2 and 6.3.At this step, we assume that p > 2n n+2 .By virtue of (1.3) and (1.4), this implies that p > 2n n+2 .We define by induction a sequence {γ k } k∈N 0 by letting γ 0 = 0 and Notice that γ k > 0 for every k ∈ N, since p > 2n n+2 .Furthermore, this sequence diverges to +∞ as k → ∞ and by induction we have In addition, one can easily check that

.43)
Now, for k ∈ N 0 and s ∈ (0, 1) we set Note that this is possible since Using equality (6.43) and replacing γ, η and ω with γ k , η k and ω k respectively, we obtain from (6.42) the following recursive reverse Hölder-type inequality which holds for any k ∈ N 0 .Exploiting the properties of η k and ω k as well as the definitions of r k , Q k , b, H δ and p, we obtain for every k where, in the last line, we have used the fact that (6.45) In addition, one can easily deduce that for any k ∈ N 0 .Joining the last four inequalities and using the fact that r k = 2r k+1 −sr ≤ 2r k+1 and r k ≤ r for every k ∈ N 0 , we find for any k ∈ N 0 , where C ≡ C(N, n, n, α, p, δ, C 1 , K) > 1 and Θ ≡ Θ(n, α) > 0. To shorten our notation, we now set so that the last inequality turns into for k ∈ N 0 .Iterating the above estimate, we obtain for any k ∈ N.This inequality can be rewritten as follows: (6.50) Next, we observe that where One can easily check that p − ϕ ≥ 0 and ϕ ≤ p whenever p > 2n n+2 .Using this information and the fact that c s > 1, we can apply inequality (2.24) to obtain the following estimate: Now we use that 1 < γ j + p ≤ p( n+2 n ) j for any j ∈ N 0 to estimate the following product by means of (2.24) and (2.25):  for any k ∈ N, with c ≡ c(N, n, n, α, p, δ, C 1 , K) > 0. Since the constant c is independent of k ∈ N, recalling (6.48), (6.51), (6.52) and (6.53) and passing to the limit as k → ∞ in both sides of (  , where, in the second to last line, we have applied Young's inequality with exponents ( ϕ 2−p , ϕ ϕ−2+p ).Note that the preceding inequality holds for any s ∈ (0, 1).Hence, we can absorb the essential supremum on the right-hand side using Lemma 2.2 with ρ 0 = sr and ρ 1 = r.This yields ess sup .
We have thus proved the following result, which ensures the desired local L ∞ -bound of Du ε for all p > 2n n+2 .
Theorem 6.4.Let p > 2n n+2 , δ > 0 and 0 < ε < min{ 1 2 , ( δ 4 ) p−1 }, where n is defined according to (1.3) − (1.4).Moreover, assume that is the unique energy solution of problem (P ε ) with Q ′ = Q R (z 0 ) ⋐ Ω T and u a weak solution of (1.1).Then, for any parabolic cylinder Q r (z 1 ) ⋐ Q R (z 0 ) with r ∈ (0, 1) and any s ∈ (0, 1), we have ess sup Step 4: the iteration for 1 < p ≤ 2n n+2 .We now start the iteration procedure for 1 < p ≤ 2n n+2 , keeping both the assumptions and the notations used in Propositions 6.2 and 6.3.Note that, by the choice of β in (1.4), the condition 1 < p ≤ 2n n+2 implies that n ≥ 3, and therefore n = n.We will however restrict ourselves to the case f = 0. We again define by induction a sequence {γ k } k∈N 0 by letting γ 0 = 0 and This sequence diverges to +∞ as k → ∞ and, by induction, we have Moreover, one can easily check that which holds for any k ∈ N 0 .Exploiting the properties of η k and ω k as well as the definition of Q k , for every k ∈ N 0 we get r 2 Qr(z 1 ) where, in the last line, we have used (6.45).Combining the two previous inequalities with (6.44) and (6.46), and using the fact that r k ≤ 2r k+1 and r k ≤ r for every k ∈ N 0 , we find for any k ∈ N 0 , where C ≡ C(N, n, α, p, δ, C 1 , K) > 1 and Θ ≡ Θ(n, α) > 0. To shorten our notation, we now set (6.63) Thus, recalling the definition (6.48), the preceding inequality turns into for k ≥ 0. Note that the constant c ε depends on ε and r through the quotient Iterating the above estimate, we obtain for any k ∈ N.This inequality can be rewritten as follows: Now we use that 1 < γ j + p = p( n+2 n ) j for any j ∈ N 0 to estimate the following product by means of (2.24) and (2.25): Finally, let us consider the case f = 0 and 1 < p ≤ 2n n+2 .Arguing as above, but this time using Theorem 6.5 instead of Theorem 6.4, we get ess sup |Du| ≤ ess sup This concludes the proof.

6
.57) Combining estimates (6.54)−(6.57)and recalling the definitions of c s and c r in (6.49), we obtain from (