On the H\"older regularity of signed solutions to a doubly nonlinear equation

We establish the interior and boundary H\"older continuity of possibly sign-changing solutions to a class of doubly nonlinear parabolic equations whose prototype is \[ \partial_t\big(|u|^{p-2}u\big)-\Delta_p u=0,\quad p>1. \] The proof relies on the property of expansion of positivity and the method of intrinsic scaling, all of which are realized by De Giorgi's iteration. Our approach, while emphasizing the distinct roles of sub(super)-solutions, is flexible enough to obtain the H\"older regularity of solutions to initial-boundary value problems of Dirichlet type or Neumann type in a cylindrical domain, up to the parabolic boundary. In addition, based on the expansion of positivity, we are able to give an alternative proof of Harnack's inequality for non-negative solutions. Moreover, as a consequence of the interior estimates, we also obtain a Liouville-type result.


INTRODUCTION AND MAIN RESULTS
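Let E be an open set in R^N. For T > 0 let E_T denote the cylindrical domain E × (0, T]. We shall consider quasi-linear, parabolic partial differential equations of the form
\[ \partial_t\big(|u|^{p-2}u\big)-\operatorname{div}A(x,t,u,Du)=0\quad\text{weakly in } E_T, \tag{1.1} \]
where the function A(x, t, u, ζ) : E_T × R^{N+1} → R^N is only assumed to be measurable with respect to (x, t) ∈ E_T for all (u, ζ) ∈ R × R^N, continuous with respect to (u, ζ) for a.e. (x, t) ∈ E_T, and subject to the structure conditions
\[ A(x,t,u,\zeta)\cdot\zeta\ge C_o|\zeta|^p,\qquad |A(x,t,u,\zeta)|\le C_1|\zeta|^{p-1}\quad\text{for a.e. }(x,t)\in E_T,\ \forall\,u\in\mathbb{R},\ \forall\,\zeta\in\mathbb{R}^N, \tag{1.2} \]
where C_o and C_1 are given positive constants, and p > 1. The prototype equation is
\[ \partial_t\big(|u|^{p-2}u\big)-\Delta_p u=0. \tag{1.3} \]
Here Δ_p u := div(|Du|^{p−2} Du) is the p-Laplace operator. When p = 2 the prototype becomes the heat equation.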
The motivation for studying such an equation is discussed in Section 1.3. First, however, we present our main results on the interior regularity in Section 1.1 and on the boundary regularity in Section 1.2.
When we speak of the structural data, we refer to the set of parameters {p, N, C_o, C_1}. We also write γ for a generic positive constant that can be quantitatively determined a priori only in terms of the data and that can change from line to line. For ϱ > 0 and θ > 0 we use backward cylinders of the form Q_ϱ(θ) = K_ϱ × (−θϱ^p, 0], where K_ϱ denotes a cube; if θ = 1, we simply write Q_ϱ.
We postpone the formal definition of local weak solution to Section 1.4. It is worth noting here, however, that local boundedness of local weak solutions is inherent in our notion of local solution (cf. Section 1.4.2). Thus we may always work with locally bounded solutions. Now we state our main result concerning the interior Hölder continuity of weak solutions to (1.1), subject to the structure conditions (1.2).
Theorem 1.1. Let u be a bounded, local, weak solution to (1.1)-(1.2) in E_T. Then u is locally Hölder continuous in E_T. More precisely, there exist constants γ > 1 and β ∈ (0, 1) that can be determined a priori only in terms of the data, such that for every compact set K ⊂ E_T,
1.2. Boundary Regularity. We will establish regularity of weak solutions to (1.1)-(1.2) up to the lateral boundary S_T := ∂E × (0, T], provided the solution satisfies proper Dirichlet or Neumann boundary data and ∂E possesses certain geometry or smoothness. Likewise, regularity of weak solutions up to the initial level t = 0 can also be obtained, provided the given initial value is regular enough.
The arguments employed will be local in nature. As a result, it suffices to require the boundary data to be taken just on a portion of the parabolic boundary. Nevertheless, we choose to present the results globally for simplicity, in terms of initial-boundary value problems.
To this end, let us first consider formally the following initial-boundary value problem of Dirichlet type:
there exist α_* ∈ (0, 1) and ϱ_o > 0, such that for all x_o ∈ ∂E, for every cube K_ϱ(x_o) and 0 < ϱ ≤ ϱ_o, there holds
Intuitively, this means one can place an exterior cone whose vertex is attached to x_o (uniformly with respect to x_o).
Next, we consider the Neumann problem. In order to deal with possible variational data on S_T, we assume ∂E is of class C^1, such that the outward unit normal, which we denote by n, is defined on ∂E. Let us consider the initial-boundary value problem of Neumann type: A(x, t, u, Du) · n = ψ(x, t, u) on S_T, where the structure conditions (1.2) and the initial condition (I) are retained. On the Neumann datum ψ we assume for simplicity that, for some absolute constant C_2, there holds
(N) |ψ(x, t, u)| ≤ C_2 for a.e. (x, t, u) ∈ S_T × R.
More general conditions should also work (cf. Sec. 2, Chap. II, [5]). The formal definitions of weak solutions to (1.4) and (1.6) will be given in Section 1.4. Now we are ready to present the results concerning regularity of solutions to (1.4) or (1.6) up to the parabolic boundary Γ.

Near S_T - Neumann Type Data.
Theorem 1.4. Let u be a bounded weak solution to the Neumann problem (1.6). Assume ∂E is of class C^1 and (N) holds. Then u is Hölder continuous in any compact set K ⊂ E_T. More precisely, there exist constants γ > 1 and β ∈ (0, 1) determined by the data, C_2, dist(K; {t = 0}) and the structure of ∂E, such that for every pair of points (x_1, t_1), (x_2, t_2) ∈ K.
1.3. Novelty and Significance. The equation (1.1)-(1.2) has been referred to as a doubly nonlinear parabolic equation in the literature, due to its nonlinearity in both the solution and its spatial gradient. It is a particular form of the more general doubly nonlinear equation (1.7). The interest in such an equation stems from its mathematical structure, which helps in understanding doubly nonlinear phenomena that generate mixed types of degeneracy and/or singularity in partial differential equations, and from its connection to physical models, including dynamics of glaciers ([26]), shallow water flows ([2,9,14]) and friction dominated flow in a gas network ([23]). In particular, the prototype equation (1.3) is naturally connected to the nonlinear eigenvalue problem −Δ_p u = λ|u|^{p−2}u (cf. [25]), which plays an important role in nonlinear potential theory.
The equation (1.1) -(1.2) has been observed by Trudinger ([27]), via Moser's iteration, to possess a Harnack inequality for non-negative solutions, analogous to the one for the heat equation. See also [12,18]. Such a Harnack inequality has been used to establish the interior Hölder regularity for non-negative solutions in [20,21].
Our main contribution is to remove the sign restriction on solutions for the Hölder regularity to hold. The Harnack inequality seems not applicable in this setting due to changing signs of solutions and the power-like nonlinearity with respect to the solution itself. Instead, we employ a more basic tool, the expansion of positivity, to handle the current situation. Our approach emphasizes the different roles played by sub-solutions and super-solutions. As a by-product, the expansion of positivity also leads to an alternative proof of the Harnack inequality; see Appendix B. The interior estimates also give us a Liouville-type result for global solutions, which seems new in the literature. Moreover, our approach is flexible enough to obtain the Hölder regularity of solutions to the initial-boundary value problems of both Dirichlet type and Neumann type, up to the parabolic boundary. As far as we know, the boundary regularity has never been dealt with in the literature, even in the case of non-negative solutions.
Our proofs of Hölder regularity, interior or boundary, all unfold along two main cases, i.e., when the solution is close to zero or when it is away from zero, through comparisons between the oscillation and the supremum/infimum of the solution. In the first case, we take advantage of the scaling invariance of the equation and obtain the expansion of positivity (Proposition 4.1) without intrinsic scaling techniques. This treatment parallels the classical parabolic theory (p = 2) in [22], the new input being that we need to trace the competition between the oscillation and the extrema of the solution (see Remark 4.4). In the second case, by contrast, the solution behaves like a solution to the parabolic p-Laplacian equation, i.e. u_t = Δ_p u. Thus this latter case hinges upon the possibility to treat such a degenerate (p > 2) or singular (1 < p < 2) equation, for which we exploit the existing theory in [5,8].
The Hölder regularity for doubly nonlinear equations has also been considered in [15,16,17,28,29], under various conditions on the structure of the equation. The local regularity theory for the doubly nonlinear equation (1.7) seems fragmented and deserves further investigation.
This guarantees that all the integrals in (1.9) are convergent.
A function u that is both a local weak sub-solution and a local weak super-solution to (1.1) -(1.2) is a local weak solution.

Notion of Parabolicity and Local Boundedness of Solutions. For any
Accordingly, we notice that Using ( the function k ± (u − k) ± is a local weak sub(super)-solution, for all k ∈ R.
We will give a proof of this claim in Appendix A. In particular, when u is a local weak solution, u + and u − are non-negative, local weak sub-solutions to (1.1) -(1.2). Since it has been shown that non-negative, local sub-solutions are locally bounded, we may always work with locally bounded solutions. See [12,18] in this regard.

Notion of Solution to the Dirichlet Problem. A function
for all non-negative test functions . Moreover, settingp := min{2, p}, the initial datum is taken in the sense that for any compact The Dirichlet datum g is attained under u ≤ (≥)g on ∂E in the sense that the traces of (u − g) ± vanish as functions in W 1,p (E) for a.e. t ∈ (0, T ], i.e. (u − g) ± ∈ L p (0, T ; W 1,p o (E)). Notice that no a priori information is assumed on the smoothness of ∂E.
A function u that is both a weak sub-solution and a weak super-solution to (1.4) is a weak solution.

Notion of Solution to the Neumann Problem. A function
is a weak sub(super)-solution to (1.6), if for every compact set K ⊂ R N and every sub-interval Here dσ denotes the surface measure on ∂E. The Neumann datum ψ is reflected in the boundary integral on the right-hand side. Moreover, the initial datum is taken as in the Dirichlet problem.
A function u that is both a weak sub-solution and a weak super-solution to (1.6) is a weak solution.

SOME TECHNICAL TOOLS
For k, w ∈ R we define two quantities
Note that g_±(w, k) ≥ 0. For b ∈ R and α > 0, we will embolden b^α to denote the signed α-power of b as
The following lemma can be found in the literature; cf. [1, Lemma 2.2] for α ∈ (0, 1) and [13, inequality (2.4)] for α > 1.
Lemma 2.1. For any α > 0, there exists a constant γ = γ(α) such that, for all a, b ∈ R, the following inequality holds true: Based on Lemma 2.1, we prove the following.
Lemma 2.2. There exists a constant γ = γ(p) such that, for all w, k ∈ R, the following inequality holds true:
Proof. We only consider g_−, since the estimate for g_+ is analogous. If k ≤ w, then g_−(w, k) = 0 = (w − k)_−. Therefore it is enough to consider w, k ∈ R with w < k. Here, we have
Note that p − 2 > −1 and therefore the integral on the right-hand side exists. With Lemma 2.1 we thus obtain
In the last line we have used ½(|w| + |k|) ≤ |w| + ½|k + w| ≤ 2(|w| + |k|). This proves the lower bound for g_−(w, k). For the upper bound we again apply Lemma 2.1 and obtain
This finishes the proof of the lemma.
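For orientation, the objects entering Lemmas 2.1 and 2.2 are commonly written as follows in the literature on doubly nonlinear equations. This is a sketch under the assumption that the definitions used here follow that standard form; the formulas below are recalled, not copied from the displays of this section, and equation tags are deliberately omitted.

```latex
% Assumed standard forms; not taken verbatim from the source.
% Signed alpha-power of b (set in boldface in the text):
\[
  \boldsymbol{b}^{\alpha} :=
  \begin{cases}
    |b|^{\alpha-1}\, b, & b \neq 0,\\
    0, & b = 0.
  \end{cases}
\]
% Auxiliary quantities g_\pm (non-negative by construction):
\[
  g_\pm(w,k) := \pm (p-1)\int_k^w |s|^{p-2}(s-k)_\pm \,\mathrm{d}s .
\]
% Two-sided bound of the type stated in Lemma 2.1:
\[
  \tfrac1\gamma \big|\boldsymbol{b}^{\alpha}-\boldsymbol{a}^{\alpha}\big|
  \le (|a|+|b|)^{\alpha-1}|b-a|
  \le \gamma \big|\boldsymbol{b}^{\alpha}-\boldsymbol{a}^{\alpha}\big| .
\]
% Two-sided bound of the type stated in Lemma 2.2:
\[
  \tfrac1\gamma (|w|+|k|)^{p-2}(w-k)_\pm^2
  \le g_\pm(w,k)
  \le \gamma (|w|+|k|)^{p-2}(w-k)_\pm^2 .
\]
```

As a consistency check under these assumptions, for p = 2 the integral evaluates to g_±(w, k) = ½(w − k)_±^2, so the last bound holds with γ = 2.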
The time derivative of a weak solution exists in the sense of distributions only. However, we often need to use u in the test function, and thus the term u_t appears in the integral weak formulation, which is not granted by the preset notion of solution. In order to overcome the lack of regularity in the time variable, we define the following mollification in time:
Properties of this mollification can for instance be found in [19].
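A mollification that is frequently used for this purpose is the exponential one. The sketch below assumes that this is the mollification meant here; both the bracket notation and the formula are assumptions, to be checked against [19].

```latex
% Assumed form of the time mollification (exponential mollification).
\[
  [\![v]\!]_h(x,t) := \frac1h \int_0^t e^{\frac{s-t}{h}}\, v(x,s)\,\mathrm{d}s,
  \qquad h>0 .
\]
% Differentiating in t yields the identity that makes the time derivative
% of the mollified function available in the weak formulation:
\[
  \partial_t [\![v]\!]_h = \frac1h\big(v - [\![v]\!]_h\big).
\]
```

The second identity is the reason such a mollification is convenient: the time derivative of the mollified function is an ordinary function, with no derivative of v required.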

ENERGY ESTIMATES
In this section we exploit the property of weak sub(super)-solutions in order to deduce certain energy estimates. We emphasize the different roles played by sub-solutions and super-solutions. When we state "u is a sub(super)-solution..." and use " ± " or " ∓ " in what follows, we mean the sub-solution corresponds to the upper sign and the super-solution corresponds to the lower sign in the statement.
First of all, we present energy estimates for local weak sub(super)-solutions defined in Section 1.4.1.
Proof. We only consider the case of a local weak sub-solution. Recall the definition (2.1) for the α-power of a possibly negative number and define w_h via w_h^{p−1} := [[u^{p−1}]]_h, where [[·]]_h denotes the time mollification from (2.2). From the weak form of the differential inequality for sub-solutions we deduce the mollified version (cf. [19])
Now, we choose in (3.1) the testing function
In the following we omit in the notation the reference to the center z_o = (x_o, t_o). Moreover, we observe that
which can be seen by substituting σ := s^{p−1}. Note that the mapping R ∋ s → φ(s) = s^{p−1} (a signed power in the sense of (2.1)) is increasing with derivative φ′(s) = (p − 1)|s|^{p−2} (s ≠ 0 in the case p < 2). For the integral in (3.1) containing the time derivative we compute
Here we used in the second line the identity
and the fact that the map τ → (τ^{1/(p−1)} − k)_+ is a monotone increasing function, implying that the term in the second last line of the above inequality is non-negative. Since w_h^{p−1} = [[u^{p−1}]]_h → u^{p−1} in L^{p/(p−1)}(Ω_T) we can pass to the limit h ↓ 0 in the integral on the right-hand side. We therefore get
with the obvious meaning of I_ε and II_ε. We now pass to the limit ε ↓ 0. For the term I_ε we obtain for any t_o − S < t_1 < t_2 < t_o that
Next, we observe that the boundary term in (3.1) disappears as h ↓ 0, since by construction ϕ(·, 0) ≡ 0 on E, i.e. we have
It remains to consider the diffusion term. After passing to the limit h ↓ 0, we use the ellipticity and growth assumption (1.2) for the vector-field A, and subsequently apply Young's inequality to the integral containing (u − k)_+ and D(u − k)_+. In this way we obtain one term that can be absorbed into the term arising from the ellipticity condition; the other one is shifted later on to the right-hand side.
Combining the preceding estimates and letting ε ↓ 0 we arrive at
The constant γ in the first integral on the right-hand side depends only on p, C_o and C_1. At this point, a standard argument finishes the proof. We first pass in the last inequality to the limit t_1 ↓ t_o − S. This poses no problem since u ∈ C(0, T; L^p_loc(E)). We obtain
Here, we discard the second integral on the left-hand side and then take the essential supremum with respect to t_2 ∈ (t_o − S, t_o). This leads to an estimate of the essential supremum of the first integral. On the other hand, discarding the first integral and passing to the limit t_2 ↑ t_o we deduce a similar estimate for the second integral on the left-hand side. Together, this gives ess sup_{t_o−S<t<t_o} ∫_{K_R×{t}}
This finishes the proof of the energy estimate.
Next, we consider the situation near the initial level t = 0 when a continuous datum u o is prescribed. We work in a cylinder K R (x o ) × (0, S) ⊂ E T , which lies on the bottom of E T . Let ζ be a non-negative, piecewise smooth cutoff function that is independent of t and vanishes on ∂K R (x o ). The conclusion of Proposition 3.1 holds in any cylinder satisfying Then in view of the initial datum u o being taken in the topology of Lp loc (E) and letting t 1 ↓ 0, it is not hard to verify that the space integral on the right-hand side at the time level t 1 will tend to zero. Consequently, we arrive at (3.4) and every non-negative, piecewise smooth cutoff function ζ independent of t and vanishing on ∂K R (x o ), there holds Now we turn our attention to the energy estimates near S T . We first deal with Dirichlet data. When we run the calculation in the proof of Proposition 3.1 within we need to assume certain restrictions on the level k, i.e., In such a way, the test functions in (3.2) . This fact does not require any smoothness of ∂E (cf. [11, Lemma 2.1]). As a result, we have Finally, we deal with the energy estimates for the Neumann problem (1.6). Like before, we consider the problem in Q R, . For a cutoff function ζ as in Proposition 3.3, a similar procedure as in the proof of Proposition 3.1 will give us that ess sup Now we make use of (N), apply the trace inequality (cf. [4, Proposition 18.1]) for each time slice and then integrate in time, and use Young's inequality to estimate the boundary integral: Hence, collecting the above estimates we arrive at

EXPANSION OF POSITIVITY
We first introduce the notation that is used throughout this section. For a compact set K ⊂ R N and a cylinder Q We also assume (x o , t o ) ∈ Q, such that the forward cylinder Next we state our main proposition of this section, which will be the main ingredient in the proof of Theorem 1.1.

Proposition 4.1. Let u be a locally bounded, local, weak sub(super)-solution to
There exist constants ξ, δ and η in (0, 1) depending only on the data and α, such that either The proof of Proposition 4.1 is a straightforward consequence of Lemmas 4.1 -4.3 in the following sections. Before presenting proofs, some remarks are in order.
Remark 4.1. By repeated applications of Proposition 4.1, we could conclude that for an arbitrary A > 1, there exists someη ∈ (0, 1) depending on α, the data and also on A, such that provided this cylinder is included in Q and |µ ± | < ξM , where ξ is the parameter from Proposition 4.1.
Remark 4.2. Proposition 4.1 exhibits the spread of pointwise positivity both in time and in space. Nonetheless, in the proof of Hölder regularity we only need the positivity in a smaller cube than the one of the initial measure information assigned. Incidentally, Proposition 4.1 can be exploited to give an alternative proof of Harnack's inequality for non-negative solutions (cf. [12,18,27]). We will present it in Appendix B. We mention that a proposition of similar type has been used in [24] to deal with the Hölder regularity for the porous medium type equation.
Remark 4.4. Up to a proper adjustment of coefficients, the prototype equation of (1.1)-(1.2) can be written in a formal way as |u|^{p−2} u_t = Δ_p u. Seemingly the equation is homogeneous in u, and cylinders of the type Q_ϱ = K_ϱ × (−ϱ^p, 0] appear to be the correct ones to examine the equation. However, a careful inspection of the proofs of Lemmas 4.1-4.3 will reveal a subtle yet crucial difference between the role of u in the absolute value on the left-hand side and that of the others. Loosely speaking, |u| represents the extrema, while the other u's stand for the oscillation. Notice also that when we apply Proposition 4.1 to prove Theorem 1.1, the typical M will be aω for some a in (0, 1). From this point of view, the above equation can be interpreted, in a probably improper but heuristic manner, as a balance between the extrema [µ] and the oscillation [ω], assuming the symbols are self-suggestive. This hints that the correct cylinder to examine the equation is actually of the form Q_ϱ(θ), where the time scaling by θ reflects the competition between [ω] and [µ] via either-or alternatives. A closer inspection of the proofs of Lemma 4.1 and Lemma 4.3 shows that θ ≃ 1 is required, while the proof of Lemma 4.2 uses only a one-sided comparison between θ and 1. This explains why the quantities µ^± enter into Proposition 4.1 via an either-or form, such that they are comparable with ξω for ξ = 2η when p > 2, whereas ξ = 8 suffices when 1 < p < 2.
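The heuristic of Remark 4.4 can be made concrete by a dimensional count. The following is a sketch only: the bracket symbols [µ], [ω], [t], [ϱ] stand for typical sizes of the extrema, the oscillation, the time scale and the space scale, and the explicit formula for θ is an assumption inferred from the formal equation |u|^{p−2} u_t = Δ_p u rather than a formula taken from the text.

```latex
% Heuristic balance for |u|^{p-2} u_t = \Delta_p u (assumed reading of the remark):
% |u| ~ [\mu] (extrema), the remaining occurrences of u ~ [\omega] (oscillation).
\[
  [\mu]^{p-2}\,\frac{[\omega]}{[t]} \;\approx\; \frac{[\omega]^{p-1}}{[\varrho]^{p}}
  \quad\Longrightarrow\quad
  [t] \;\approx\; \Big(\frac{[\mu]}{[\omega]}\Big)^{p-2}[\varrho]^{p}
  \;=:\; \theta\,[\varrho]^{p}.
\]
% When extrema and oscillation are comparable, [\mu] \approx [\omega],
% one gets \theta \approx 1, i.e. the cylinder K_\varrho \times (-\varrho^p, 0].
```

Read this way, the regime θ ≃ 1 corresponds to the solution being close to zero relative to its oscillation.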

Propagation of Positivity in Measure.
Lemma 4.1. Let M > 0 and α ∈ (0, 1). Then, there exist δ and ε in (0, 1), depending only on the data and α, such that whenever u is a locally bounded, local, weak sub(super)-solution to Proof. We only show the case of super-solutions, the other case of sub-solutions being similar.
Assume (x o , t o ) = (0, 0) and |µ − | ≤ 8M . Otherwise there is nothing to prove. Use the energy estimate in Proposition 3.1 in the cylinder Q = K ̺ × (0, δ̺ p ], with k = µ − + M and choose a standard non-negative cutoff function ζ(x, t) ≡ ζ(x) independent of time that equals 1 on K (1−σ)̺ with σ ∈ (0, 1) to be chosen later and vanishes on ∂K ̺ satisfying |Dζ| ≤ (σ̺) −1 ; in such a case, we have for all 0 < t < δ̺ p , that In order to estimate the first integral on the right-hand side, we take into consideration the measure theoretical information at the initial time t = 0 and the fact u ≥ µ − . This leads tô The second term on the right-hand side of (4.3) is estimated bÿ The left-hand side of the energy estimate (4.3) can be bounded from below bŷ K̺×{t}ˆk u with ε ∈ (0, 1 2 ) to be chosen later. Due to Lemma 2.2 and the fact that Collecting all the above estimates yields that for a constant γ = γ(p, C o , C 1 ). The fractional number in the preceding inequality can be rewritten in the form At this stage, we need to bound I ε from above. Keeping in mind |µ − | ≤ 8M and |k ε | ≤ 9M and applying Lemma 2.1, we havê This allows us to choose the various parameters quantitatively. Indeed, we may choose ε ∈ (0, 1) small enough such that This fixes ε as a constant depending only on p and α. Next, we define σ := α 8N . Finally, we choose δ ∈ (0, 1) small enough so that Note that this specifies δ as a constant depending on the data and α. With these choices we have This proves the asserted propagation of positivity (4.2), as long as There exists γ > 0 depending only on the data and α, such that for any positive integer j * , if while in the case p > 2, the same conclusion holds provided |µ ± | < εM 2 −j * .
Proof. We only show the case of super-solutions, the case of sub-solutions being similar. Moreover, we assume (x o , t o ) = (0, 0). We employ the energy estimate in Proposition 3.1 in K 8̺ × (0, δ̺ p ] with levels and introduce a cutoff function ζ in K 8̺ (independent of t) that is equal to 1 in K 4̺ and vanishes on ∂K 8̺ , such that |Dζ| ≤ ̺ −1 . Then, we obtain Now we treat the individual terms of the right side separately. We begin with the first one. Due to Lemma 2.2 we have for a constant γ depending only on p. This implies in particular that holds true in any case. In the second integral appearing on the right-hand side of the energy estimate, we utilize the bound (u − k j ) − ≤ εM 2 −j . Therefore, in all cases the above estimate yields¨ Next, we apply [5, Chapter I, Lemma 2.2] slice wise to u(·, t) for t ∈ (0, δ̺ p ] over the cube K ̺ , for levels k j+1 < k j . Taking into account the measure theoretical information this gives Here we used in the last line the short hand notation A j (t) := u(·, t) < k j ∩ K 4̺ . We now integrate the last inequality with respect to t over (0, δ̺ p ] and apply Hölder's inequality in time.
With the abbreviation A_j = {u < k_j} ∩ Q this procedure leads to
Recall that δ depends on the data and α. Therefore γ depends only on the data and α. Now take the power p/(p−1) on both sides of the above inequality to obtain
Sum these inequalities over j = 0, …, j_* − 1 to obtain
From this we conclude
This completes the proof.
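The slice-wise step in the proof above rests on a De Giorgi-type isoperimetric inequality. The block below recalls its usual form, assuming that this is the statement of [5, Chapter I, Lemma 2.2]; the constant γ is assumed to depend only on N.

```latex
% Recalled form (an assumption about the cited lemma).
% For v \in W^{1,1}(K_\varrho) and levels k < l:
\[
  (l-k)\,\big|\{v>l\}\cap K_\varrho\big|
  \;\le\; \frac{\gamma\,\varrho^{N+1}}{\big|\{v<k\}\cap K_\varrho\big|}
  \int_{\{k<v<l\}\cap K_\varrho} |Dv|\,\mathrm{d}x .
\]
% Applying the same inequality to -v controls the sub-level set instead:
\[
  (l-k)\,\big|\{v<k\}\cap K_\varrho\big|
  \;\le\; \frac{\gamma\,\varrho^{N+1}}{\big|\{v>l\}\cap K_\varrho\big|}
  \int_{\{k<v<l\}\cap K_\varrho} |Dv|\,\mathrm{d}x .
\]
```

In the second form, measure-theoretical information of the type |{v > l} ∩ K_ϱ| ≥ α|K_ϱ| bounds the denominator from below, which is how such information enters a shrinking argument.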

4.3. A De Giorgi-type Lemma. Here we prove a De Giorgi-type lemma on cylinders of the form Q_ϱ(θ). In the application θ will be a universal constant depending only on the data; in particular, θ will be independent of the solution.
There exists a constant ν ∈ (0, 1) depending only on the data and θ, such that if then either We prove the case of super-solutions only, the case of sub-solutions being similar. Assume (x o , t o ) = (0, 0) and |µ − | ≤ 8M . Otherwise there is nothing to prove. In order to employ the energy estimate in Proposition 3.1, we notice first that due to Lemma 2.2 we have for any non-negative piecewise smooth cutoff function ζ vanishing on the parabolic boundary of Q ̺ (θ). In order to use this energy estimate, we set (4.5) Recall that Q ̺n (θ) = K n × (−θ̺ p n , 0] and Q̺ n (θ) = K n × (−θ̺ p n , 0]. Introduce the cutoff function 0 ≤ ζ ≤ 1 vanishing on the parabolic boundary of Q n and equal to identity in Q n , such that |Dζ| ≤ γ 2 n ̺ and |ζ t | ≤ γ 2 pn θ̺ p .
In this setting, the energy estimate may be written as where γ depends on the data and θ. Here, we used On the other hand, we recall |µ − | ≤ 8M , so that u ≤k n implies |u| + |k n | ≤ 18M and |u| + |k n | ≥ k n − u ≥ k n −k n = 2 −(n+3) M . Inserting this above, we find that Now setting 0 ≤ φ ≤ 1 to be a cutoff function which vanishes on the parabolic boundary of Q n and equals the identity in Q n+1 , an application of the Hölder inequality and the Sobolev imbedding [5, Chapter I, Proposition 3.1] gives that In the second last line we used the above energy estimate. In terms of Y n = |A n |/|Q n |, this can be rewritten as for a constant γ depending only on the data and with b ≡ 2 . Hence, by [5, Chapter I, Lemma 4.1], there exists a positive constant ν depending only on the data, such that Y n → 0 if we require that Y o ≤ ν, which is the same as assuming Since Y n → 0 in the limit n → ∞ we have  (0, 0). By δ, ε ∈ (0, 1) and γ > 0 we denote the corresponding constants from Lemma 4.1 and Lemma 4.2 depending on the data and α and by ν ∈ (0, 1) we denote the constant from Lemma 4.3 applied with θ = δ. Then, ν depends on the data and α. Next, we choose an integer j * in such a way that γ Then, j * depends only on the data and α. We let ξ = 8 if 1 < p ≤ 2 and ξ = ε2 −j * if p > 2.
In the following we may assume that |µ^−| ≤ ξM, since otherwise there is nothing to prove. Applying in turn Lemma 4.1 and Lemma 4.2 we infer that
This proves the assertion of Proposition 4.1 for η = ε/2^{j_*+1}, which depends only on the data and α. Let us point out that in fact we have chosen ξ = 2η when p > 2.
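The iteration lemma [5, Chapter I, Lemma 4.1] invoked in the proof of Lemma 4.3 is a statement about fast geometric convergence. Assuming it has the usual form (if Y_{n+1} ≤ C b^n Y_n^{1+α} with C, b > 1, α > 0 and Y_0 ≤ C^{−1/α} b^{−1/α²}, then Y_n → 0), the following Python sketch iterates the extremal case with equality. The function names and the sample constants are hypothetical and serve only as an illustration.

```python
def smallness_threshold(C, b, alpha):
    """Largest admissible Y_0 in the assumed form of the lemma."""
    return C ** (-1.0 / alpha) * b ** (-1.0 / alpha ** 2)


def iterate_recursion(y0, C, b, alpha, steps):
    """Iterate the extremal case Y_{n+1} = C * b**n * Y_n**(1 + alpha)."""
    ys = [y0]
    for n in range(steps):
        ys.append(C * b ** n * ys[-1] ** (1.0 + alpha))
    return ys


if __name__ == "__main__":
    C, b, alpha = 2.0, 4.0, 0.5  # hypothetical sample constants
    y0 = smallness_threshold(C, b, alpha)
    # Starting at the threshold, the sequence decays like b**(-n / alpha).
    print(iterate_recursion(y0, C, b, alpha, 8))
    # Starting well above the threshold, the recursion gives no decay.
    print(iterate_recursion(1.0, C, b, alpha, 6))
```

Starting exactly at the threshold, the extremal sequence equals Y_0 b^{−n/α}, which shows that the smallness condition on Y_0 is what drives the convergence.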

PROOF
We may assume that (x_o, t_o) coincides with the origin. Set
Our proof unfolds along two main cases, namely
(5.1) when u is near zero: µ^− ≤ ω and µ^+ ≥ −ω; when u is away from zero: µ^− > ω or µ^+ < −ω.
Note that (5.1)_1 is equivalent to the condition that −2ω ≤ µ^− ≤ µ^+ ≤ 2ω and therefore |µ^±| ≤ 2ω. When this case holds, we deal with it in Section 5.2 via a simple application of Proposition 4.1, thanks to the possibility to choose ξ = 8. No intrinsic scaling whatsoever is used in Section 5.2, though it seems unavoidable when we deal with the second case (5.1)_2 in Section 5.3.
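The equivalence claimed for the first case of (5.1) follows from a one-line computation. The sketch below assumes that ω = µ^+ − µ^−, with µ^± the essential supremum and infimum of u over the reference cylinder; this normalization is not visible in the text above and is an assumption.

```latex
% Assumption: \omega = \mu^+ - \mu^- \ge 0.
\[
  \mu^- \le \omega \;\Longleftrightarrow\; \mu^+ = \mu^- + \omega \le 2\omega,
  \qquad
  \mu^+ \ge -\omega \;\Longleftrightarrow\; \mu^- = \mu^+ - \omega \ge -2\omega .
\]
% Hence the first case of (5.1) holds if and only if
% -2\omega \le \mu^- \le \mu^+ \le 2\omega, and then |\mu^\pm| \le 2\omega.
```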

Reduction of Oscillation Near Zero.
In this section assume that the first case in (5.1) holds. Observe that one of the following must be true: either
Since both cases can be treated similarly, we restrict ourselves to the case (5.2). As mentioned above, |µ^±| ≤ 2ω always holds. An application of Proposition 4.1 (note also Remark 4.1) gives η ∈ (0, 1) depending only on the data, such that
This yields a reduction of oscillation, i.e. we have
Now we may proceed by induction. Suppose that for i = 1, 2, …, j − 1 we have built
For all the indices i = 1, 2, …, j − 1, we always assume the first case in (5.1) holds, i.e., µ_i^− ≤ ω_i and µ_i^+ ≥ −ω_i. In this way the argument at the beginning can be repeated and we have for all i = 1, 2, …, j, ess osc
Consequently, iterating the above recursive inequality we obtain for all i = 1, 2, …, j,
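Iterating a reduction of oscillation is the standard route to a Hölder exponent. As a sketch, assume each step shrinks the radius by a factor lam in (0, 1) and the oscillation by a factor 1 − eta; both factors, and the names below, are hypothetical placeholders rather than the constants of this section. Then (1 − eta)^i = (lam^i)^beta with beta = log(1 − eta)/log(lam), and the step-wise bound is dominated by a power of r/rho.

```python
import math


def holder_exponent(eta, lam):
    """Return beta > 0 such that (1 - eta) == lam ** beta, for eta, lam in (0, 1)."""
    return math.log(1.0 - eta) / math.log(lam)


def stepwise_bound(omega, rho, r, eta, lam):
    """Oscillation bound on radius r (0 < r <= rho) from the iterated reduction.

    Uses the largest index i with lam**i * rho >= r, so that the cylinder of
    radius r sits inside the i-th cylinder of the (hypothetical) nested family.
    """
    i = math.floor(math.log(r / rho) / math.log(lam))
    return (1.0 - eta) ** i * omega


def holder_bound(omega, rho, r, eta, lam):
    """Power-type bound omega * (r / rho)**beta / (1 - eta) dominating the above."""
    beta = holder_exponent(eta, lam)
    return omega * (r / rho) ** beta / (1.0 - eta)


if __name__ == "__main__":
    eta, lam = 0.25, 0.5  # hypothetical reduction and shrink factors
    print(holder_exponent(eta, lam))
    for r in (0.3, 0.1, 0.01):
        print(r, stepwise_bound(1.0, 1.0, r, eta, lam), holder_bound(1.0, 1.0, r, eta, lam))
```

The factor 1/(1 − eta) in `holder_bound` accounts for radii that fall strictly between two consecutive radii of the nested family.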

Reduction of Oscillation Away From Zero.
In this section, let us suppose j is the first index satisfying the second case in (5.1), i.e., either µ_j^− > ω_j or µ_j^+ < −ω_j. Let us treat for instance µ_j^− > ω_j, for the other case is analogous. We observe that since j is the first index for this to happen, one should have µ_{j−1}^− ≤ ω_{j−1}. Moreover, one estimates
As a result, we have
The condition (5.4) indicates that starting from j the equation (1.1) resembles the parabolic p-Laplacian type equation in Q_j. Therefore, the reduction of oscillation hinges upon the possibility to treat the parabolic p-Laplacian type equation. To render this technically, we drop the suffix j from our notation temporarily for simplicity, and introduce v := u/µ^−. It is straightforward to verify that v satisfies
where, for (x, t) ∈ Q, v ∈ R and ζ ∈ R^N, we have defined
which is subject to the structure conditions
Moreover,
In order to use the known regularity theory for the parabolic p-Laplacian (see [5,8] for an account of the theory), it turns out to be more convenient to consider the equation satisfied by w := v^{p−1}, i.e.
where for (x, t) ∈ Q, y ∈ R and ζ ∈ R N , we have defined In the last line we used the abbreviation y def = min max y, 1 2 , 2 p . It is easy to see that w belongs to the same kind of functional space (1.8) as u and v due to (5.5) which yields 1 ≤ w ≤ 2 p−1 in Q. Employing (5.5) again one can verify that there exist absolute positive constants C o = γ o (p)C o and C 1 = γ 1 (p)C 1 , such that (5.7) A(x, t, y, ζ) · ζ ≥ C o |ζ| p and | A(x, t, y, ζ)| ≤ C 1 |ζ| p−1 , for a.e. (x, t) ∈ Q, any y ∈ R, and any ζ ∈ R N . In other words, w is a local weak solution to the parabolic p-Laplacian type equation. First proved in [6] for p > 2 and then in [3] for 1 < p < 2, the power-like oscillation decay for solutions to this kind of degenerate or singular parabolic equation is well known by now. The proofs in [3,6] exploit the idea of intrinsic scaling. We state the conclusion in the following proposition in a form that favors our application, and refer to the monograph [5] for a comprehensive treatment of this issue. If for some constant σ in (0, 1), there holds then, there exist constants β 1 in (0, 1) and γ > 1 depending only on the data N, p, C o , C 1 and σ, such that for all 0 < r < ̺, we have ess osc Remark 5.1. This proposition has been stated for all p > 1. However the proofs in [6] for p > 2 and in [3] for 1 < p < 2 are remarkably different. We mention a recent attempt in [24] to find a unified approach.
To use this proposition properly when 1 < p < 2, we first check that the condition (5.8) is satisfied. Indeed, recalling v = u/µ^−, w = v^{p−1} and ω = ess osc_Q u, we first use (5.5) and the mean value theorem to obtain
Since ess osc_Q v = ω/µ^−, this amounts to
Then by (5.4), we have
Thus the condition (5.8) in Proposition 5.1 is fulfilled for σ = 1. As a result, the conclusion of Proposition 5.1 is obtained. Moreover, the above lower bound of ω actually allows us to obtain the set inclusion
Using this set inclusion and rephrasing the oscillation decay of Proposition 5.1 in terms of u, we have for all 0 < r < ϱ, ess osc
Now we revert to using the suffix j. The above oscillation estimate reads: for all 0 < r < ϱ_j, we have (5.9) ess osc
Combining (5.3) and (5.9), we arrive at the desired conclusion, i.e., for all 0 < r < ϱ, there holds ess osc
A proper rescaling gives the oscillation decay in Remark 1.1 and finishes the proof of Theorem 1.1 in the case 1 < p < 2.
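The mean value theorem step used above to compare the oscillations of v and w = v^{p−1} can be spelled out as follows. The sketch assumes 1 ≤ v ≤ 2 a.e. in Q, which is what the bound 1 ≤ w ≤ 2^{p−1} encodes; the explicit constants are a worked illustration, not the constants of the text.

```latex
% Assumption: 1 \le v \le 2 a.e. in Q.
% For v_1, v_2 \in [1,2] there is \xi between v_1 and v_2 with
\[
  v_1^{p-1} - v_2^{p-1} = (p-1)\,\xi^{p-2}\,(v_1 - v_2), \qquad 1 \le \xi \le 2 .
\]
% Since \min\{1,2^{p-2}\} \le \xi^{p-2} \le \max\{1,2^{p-2}\} and s \mapsto s^{p-1}
% is increasing, taking essential supremum and infimum gives
\[
  (p-1)\min\{1,2^{p-2}\}\,\operatorname*{ess\,osc}_{Q} v
  \;\le\; \operatorname*{ess\,osc}_{Q} w
  \;\le\; (p-1)\max\{1,2^{p-2}\}\,\operatorname*{ess\,osc}_{Q} v .
\]
```

For 1 < p < 2 the lower constant is (p − 1)2^{p−2} and the upper constant is p − 1, so the oscillation of w is comparable to ω/µ^− up to factors depending only on p.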
6. PROOF OF THEOREM 1.1 WHEN p > 2
Let A ≥ 1 be a constant to be determined later in terms of the data, and let ϱ > 0 be so small that
We may assume that (x_o, t_o) coincides with the origin. Set
As in the case 1 < p < 2, our proof unfolds along two main cases, namely
(6.1) when u is near zero: µ^− ≤ ξω and µ^+ ≥ −ξω; when u is away from zero: µ^− > ξω or µ^+ < −ξω.
Strictly speaking, the above ξ should be ξ/8 for ξ chosen as in Proposition 4.1 (note also Remark 4.1) depending on the data, A and α = ½ν, where ν is the absolute constant determined in Lemma 4.3 with θ = 1 there. It will be clear shortly from the proof where the various dependences come from. Meanwhile, we will keep using ξ to denote ξ/8 for ease of notation, bearing in mind the actual meaning of ξ; this substitution will not spoil our reasoning in the following.
When p > 2, the number ξ from Proposition 4.1 is in general a very small number. This brings additional technical complications to reducing the oscillation in the first case of (6.1). As we will see in Section 6.2 and Section 6.3, the method of intrinsic scaling is employed. In Section 6.5, by contrast, where we deal with the second case of (6.1), the treatment more or less parallels Section 5.3 for 1 < p < 2.

Reduction of Oscillation Near Zero-Part I.
In this section we assume the first case of (6.1) holds. We work with u as a super-solution near its infimum.
Suppose that for somet ∈ − (A − 1)̺ p , 0 , where ν is the absolute constant appearing in Lemma 4.3. Taking M = 1 4 ω, then according to Lemma 4.3, we have u ≥ µ − + 1 8 ω a.e. in (0,t) + Q 1 2 ̺ , since the other alternative, i.e., |µ − | ≥ 2ω, does not hold due to (6.1) 1 . An application of Proposition 4.1 (note also Remark 4.1) applied with 2 p A instead of A gives ξ, η ∈ (0, 1) depending only on the data and A, such that either |µ − | > ξω or If the above line holds, we immediately obtain a reduction of oscillation, i.e. we have The case µ − > ξω does not hold due to (6.1) 1 . Therefore it remains to deal with the case µ − < −ξω. Due to the restriction on µ + in (6.1) 1 , we must also have µ − > −2ω. Thus, we proceed further with the assumptions In the next lemma we establish that the pointwise information in (6.4) 2 propagates to the top of the cylinder Q o . Lemma 6.1. Suppose the hypothesis (6.4) holds. Then there exists a constant η 1 ∈ (0, 1) depending on ξ, A and the data, such that As a result, we have a reduction of oscillation Proof. For ease of notation we sett− ( 1 2 ̺) p = 0. Define k n ,k n , ̺ n ,̺ n , K n and K n , according to (4.5) (cf. the proof of Lemma 4.3) with M and ̺ replaced by 2η 1 ω and 1 2 ̺ respectively, for some 0 < η 1 < 1 8 ξ and θ > 0 to be determined later. The only difference is that now the cylinders Q n and Q n are of forward type whose vertices are attached to the origin, i.e., Q n = K n × (0, θ̺ p n ] and Q n = K n × (0, θ̺ p n ]. Since we know the "initial datum" at t = 0 as in (6.4) 2 , we may choose a cutoff function ζ in K n independent of t, such that it equals 1 on K n and vanishes on ∂K n , satisfying |Dζ| ≤ γ2 n ̺ −1 . Note that the boundary term at t = 0 on the right-hand side of the energy inequality vanishes on K n , since a.e. on K 1 2 ̺ . This requires 2η 1 < 1 8 . In this way, the terms on the right-hand side of the energy estimate in Proposition 3.1 involving the initial time and ζ t vanish. 
Thus, using the condition $-2\omega < \mu^- < -\xi\omega$, which leads to a lower bound for the sup-term in the energy estimate, we obtain
Now setting $\zeta$ to be a cutoff function which vanishes on the parabolic boundary of $Q_n$ and equals identity in $Q_{n+1}$, an application of the Sobolev imbedding [5, Chapter I, Proposition 3.1] with $q = p\frac{N+2}{N}$ and $m = 2$ gives that
Setting $Y_n = |A_n|/|Q_n|$, we arrive at
The meaning of $b$ is clear in this context. Note that $b$ and $\gamma$ depend only on the data. Hence by [5, Chapter I, Lemma 4.1], there exists a constant $\nu_o \in (0,1)$ depending only on the data, such that $Y_n \to 0$ in the limit $n \to \infty$ if we require that
To finish the proof, we fix $\theta = 2^pA$ and choose $\eta_1$ so small that $\nu_o$
Together with the former bound for $\eta_1$ determined in the course of the proof, we have to require that
This proves the asserted claim.
6.3. Reduction of Oscillation Near Zero-Part II.
In this section we still assume the first case of (6.1) holds. However, now we work with $u$ as a sub-solution near its supremum.
Due to the restriction on $\mu^-$ in (6.1)$_1$, we must have $\mu^+ \le 2\omega$. Thus our assumptions for the following Sections 6.3.1-6.3.3 are
A similar consideration as in Lemma 4.1 then gives
The quotient of integrals on the right-hand side can be rewritten as
We estimate by using $\xi\omega \le \mu^+ \le 2\omega$ and $k \ge \frac12\xi\omega$ to obtain the bound $I_\varepsilon \le \gamma\varepsilon$, where $\gamma$ depends only on $p$. Inserting this above leads to the inequality
This fixes $\bar\varepsilon$ in dependence on $p$ and $\nu$. Then we fix $\sigma := \frac{\nu}{16N}$ and choose $\delta$ small enough to have $\frac{\gamma\delta}{\sigma^p} \le \frac1{16}\nu$. Finally, the parameter $\varepsilon$ is chosen such that $\delta\varepsilon^{2-p} \ge 1$. The proof can now be finished by redefining $\bar\varepsilon\varepsilon$ as $\varepsilon$.
Since $\bar t$ is arbitrary, we actually obtain the measure theoretical information

6.3.2. Shrinking the Measure Near the Supremum.
By $\varepsilon \in (0,1)$ we denote the constant from Lemma 6.2 depending only on the data. The number $A$ is still to be determined. We choose $A$ in the form $A = 2^{j_*(p-2)} + 1$ with some $j_*$ to be fixed later and define $Q_\varrho(\theta) = K_\varrho \times (-\theta\varrho^p, 0]$ with $\theta = 2^{j_*(p-2)}$.
Lemma 6.3. Suppose (6.7) and (6.9) hold. There exists $\gamma > 0$ depending only on the data, such that for any positive integer $j_*$, we have
Proof. For $j = 0, \dots, j_*-1$ we employ the energy estimate in Proposition 3.1 in the cylinder $K_{2\varrho} \times (-\theta\varrho^p, 0]$ with levels $k_j = \mu^+ - 2^{-j}\varepsilon\omega$ and a time independent cutoff function $\zeta(x,t) \equiv \zeta(x)$, such that $\zeta$ equals 1 in $K_\varrho$, vanishes on $\partial K_{2\varrho}$, and such that $|D\zeta| \le 2\varrho^{-1}$. Then, we obtain
The first term on the right is estimated by Lemma 2.2 and using $\xi\omega \le \mu^+ \le 2\omega$. We obtain
In the second last line we used the fact that the parameter $\varepsilon$ is already fixed in dependence on the data. Hence, the above energy estimate yields
Next, we apply [5, Chapter I, Lemma 2.2] slicewise to $u(\cdot,t)$ for $t \in (-\theta\varrho^p, 0]$ over the cube $K_\varrho$, for levels $k_{j+1} > k_j$, and take into account the measure theoretical information from (6.9), i.e. that $\big|\{u(\cdot,t) \le \mu^+ - \varepsilon\omega\} \cap K_\varrho\big| \ge \frac14\nu|K_\varrho|$ for all $t \in (-\theta\varrho^p, 0]$. This leads to
In the last line we used the abbreviation $A_j(t) := \{u(\cdot,t) < k_j\} \cap K_\varrho$. We now integrate the preceding inequality with respect to $t$ over $(-\theta\varrho^p, 0]$ and apply Hölder's inequality slice-wise.
Now take the power $\frac{p}{p-1}$ on both sides. This gives
To finish the proof, we proceed exactly as in the proof of Lemma 4.2. We add the inequalities with respect to $j$ from $0$ to $j_*-1$ and obtain
from which we deduce the claim, i.e. that
This completes the proof.
Introduce the cutoff functions $\zeta$ vanishing on the parabolic boundary of $Q_n$ and equal to identity in $\widetilde Q_n$, such that $|D\zeta| \le \gamma\frac{2^n}{\varrho}$ and $|\zeta_t| \le \gamma\frac{2^{pn}}{\theta\varrho^p}$. Thus, using again the condition $\xi\omega \le \mu^+ \le 2\omega$ to estimate the terms in the energy inequality (see the proof of Lemma 4.3) we obtain
where we abbreviated $A_n = \{u > k_n\} \cap Q_n$.
The constant $\gamma$ depends on the data and $\xi$. The latter dependence enters through the estimate from below of the sup-term in the energy inequality. Note that $\xi$ is already determined in dependence on the data. Now setting $\zeta$ to be a cutoff function which vanishes on the parabolic boundary of $Q_n$ and equals identity in $Q_{n+1}$, an application of the Sobolev imbedding [5, Chapter I, Proposition 3.1] and the preceding estimate imply that
Setting $Y_n = |A_n|/|Q_n|$, we arrive at
where $b = 4^p$ and $\gamma$ only depends on the data. Hence by [5, Chapter I, Lemma 4.1], there exists a constant $\nu_1 \in (0,1)$ depending only on the data, such that $Y_n \to 0$ if we require the smallness condition $Y_o \le \nu_1$.
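The fast geometric convergence invoked here can be checked numerically. The following is a minimal sketch, assuming the standard form of [5, Chapter I, Lemma 4.1]: if $Y_{n+1} \le C b^n Y_n^{1+\alpha}$ with $C, b > 1$ and $\alpha > 0$, then $Y_n \to 0$ provided $Y_o \le C^{-1/\alpha} b^{-1/\alpha^2}$. The function names and the sample values of `C` and `alpha` are hypothetical and are not the constants of the proof; only $b = 4^p$ comes from the text, with the sample choice $p = 3$.

```python
def critical_threshold(C, b, alpha):
    """Smallness threshold C**(-1/alpha) * b**(-1/alpha**2) for Y_0."""
    return C ** (-1.0 / alpha) * b ** (-1.0 / alpha ** 2)


def iterate_worst_case(Y0, C, b, alpha, steps=30):
    """Iterate the recursion with equality, Y_{n+1} = C * b**n * Y_n**(1 + alpha).

    Y_n models a measure ratio, so it cannot exceed 1; the loop stops as soon
    as the worst-case recursion would push it above 1.
    """
    Y = [Y0]
    for n in range(steps):
        nxt = C * b ** n * Y[-1] ** (1.0 + alpha)
        Y.append(nxt)
        if nxt > 1.0:
            break
    return Y


if __name__ == "__main__":
    # Hypothetical sample constants: C = 10, alpha = 1/2; b = 4**p with p = 3.
    C, b, alpha = 10.0, 4.0 ** 3, 0.5
    nu = critical_threshold(C, b, alpha)
    print("threshold:", nu)
    print("below threshold, last value:", iterate_worst_case(0.5 * nu, C, b, alpha)[-1])
    print("above threshold, last value:", iterate_worst_case(10 * nu, C, b, alpha)[-1])
```

Starting below the threshold, the worst-case sequence decreases to zero even though the factor $b^n$ grows; starting well above it, the same recursion eventually exceeds 1, which is why the smallness of $Y_o$ is essential.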
In this way the argument in the previous sections can be repeated, and we have for all $i = 1, 2, \dots, j$, ess osc
Consequently, iterating this recursive inequality we obtain for all $i = 1, 2, \dots, j$, (6.10) ess osc
6.5. Reduction of Oscillation Away From Zero.
In this section, let us suppose $j$ is the first index satisfying the second case in (6.1), i.e. either $\mu_j^- > \xi\omega_j$ or $\mu_j^+ < -\xi\omega_j$. Let us treat for instance $\mu_j^- > \xi\omega_j$, for the other case is analogous. We observe that since $j$ is the first index for this to happen, one should have $\mu_{j-1}^- \le \xi\omega_{j-1}$. Moreover, one estimates
As a result, we have
The condition (6.11) indicates that starting from $j$ the equation (1.1) resembles the parabolic $p$-Laplacian type equation in $Q_j$. As in the case $1 < p < 2$ (cf. Section 5.3), we drop the suffix $j$ from our notation for simplicity, and introduce $v := u/\mu^-$.
As in Section 5.3, $v$ satisfies (1.1)-(1.2) with $A(x,t,u,Du)$ replaced by some properly defined $\bar A(x,t,v,Dv)$, which is subject to the structural conditions (1.2). Moreover,
As in Section 5.3, it turns out to be more convenient to consider the equation satisfied by $w := v^{p-1}$:
where similarly as in (5.3) we define the vector-field A by A(x, t, y, ζ) = Ā x, t, y
for a.e. $(x,t) \in Q$, any $y \in \mathbb{R}$ and any $\zeta \in \mathbb{R}^N$. This time $y$ is defined by
It is easy to see that $w$ is in the same kind of functional space (1.8) as $u$ and $v$ due to (6.12). Employing (6.12) again, we verify exactly as in Section 5.3 that there exist absolute positive
for a.e. $(x,t) \in Q$, any $y \in \mathbb{R}$, and any $\zeta \in \mathbb{R}^N$. Note that $\xi$ is already fixed in dependence on the data. This shows that $w$ is a local weak solution to the parabolic $p$-Laplacian type equation in $Q$. We intend to use Proposition 5.1. In order to do so, we first check that the condition (5.8) is satisfied.
Indeed, recalling v = u/µ − , w = v p−1 and ω = ess osc Q u, we first use (6.12) and the mean value theorem to obtain Since ess osc Q v = ω/µ − , this amounts to Then by (6.11), we have Thus we only need to take σ ≤ c p−2 , such that Q σ̺ (θ) ⊂ Q ̺ and the condition (5.8) in Proposition 5.1 is fulfilled. As a result, the conclusion of Proposition 5.1 is obtained. Moreover, the above upper bound of ω actually allows us to obtain the set inclusion Using this set inclusion and rephrasing the oscillation decay in Proposition 5.1 in terms of u, we have for all 0 < r < ̺, that the oscillation decay estimate ess osc holds true. Now we revert to using the suffix j. The above oscillation decay then reads (6.13) ess osc whenever 0 < r < ̺ j . Combining (6.10) and (6.13), we arrive at the desired conclusion, i.e., for all 0 < r < ̺ we have ess osc A proper rescaling gives the oscillation decay in Remark 1.1 and completes the proof of Theorem 1.1.
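The passage from an iterated reduction of oscillation to a power-type decay can be illustrated as follows. This is a schematic sketch only, assuming a clean recursion $\omega_i \le \eta\,\omega_{i-1}$ on cylinders of radii $\varrho_i = \lambda^i\varrho$ and ignoring the intrinsic time scaling; the function names and the sample values $\eta = \frac78$, $\lambda = \frac14$ are hypothetical. Under these assumptions one gets the bound $\frac{\omega}{\eta}\big(\frac{r}{\varrho}\big)^{\beta}$ with $\beta = \ln\eta/\ln\lambda$.

```python
import math


def holder_exponent(eta, lam):
    """Exponent beta solving lam**beta == eta, for eta and lam in (0, 1)."""
    return math.log(eta) / math.log(lam)


def oscillation_after_iteration(r, rho, omega, eta, lam):
    """Bound eta**i * omega, where i is the largest index with lam**i * rho >= r."""
    i = 0
    while lam ** (i + 1) * rho >= r:
        i += 1
    return eta ** i * omega


def holder_bound(r, rho, omega, eta, lam):
    """Power-type bound (omega / eta) * (r / rho)**beta implied by the iteration."""
    return (omega / eta) * (r / rho) ** holder_exponent(eta, lam)


if __name__ == "__main__":
    eta, lam = 7.0 / 8.0, 0.25  # hypothetical sample values
    print("beta =", holder_exponent(eta, lam))
    for r in (0.9, 0.3, 0.01):
        print(r, oscillation_after_iteration(r, 1.0, 1.0, eta, lam),
              holder_bound(r, 1.0, 1.0, eta, lam))
```

The comparison works because $r > \lambda^{i+1}\varrho$ gives $(r/\varrho)^\beta > \eta^{i+1}$, so the discrete bound $\eta^i\omega$ never exceeds the power-type bound.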

PROOF OF BOUNDARY REGULARITY
The proofs of Theorems 1.2-1.4 present many similarities with the interior case. However, contrary to the interior case we do not need to distinguish between the cases $p < 2$ and $p > 2$. All the technical tools needed near the parabolic boundary have been presented previously. Therefore, we will only sketch the proofs, while keeping reference to the tools and strategies used in the interior and highlighting the main modifications.
0) is attached to the bottom of $E_T$. We may assume $x_o = 0$ and set

Proof of Theorem 1.2. Consider the cylinder of forward type
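As in the proof of interior regularity, there are two main cases to consider, namely
\[
\begin{cases}
\text{when $u$ is near zero: } \mu^- \le \omega \text{ and } \mu^+ \ge -\omega;\\
\text{when $u$ is away from zero: } \mu^- > \omega \text{ or } \mu^+ < -\omega.
\end{cases}
\tag{7.1}
\]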
Let us suppose the first case holds, which implies |µ ± | ≤ 2ω. The proof continues with a comparison to the initial datum u o , i.e., we may assume For otherwise, we would arrive at Let us assume for instance the second inequality with µ − holds and work with u as a supersolution. Therefore, we let θ ∈ (0, 1) to be chosen later and definek n , ̺ n ,̺ n , K n and K n , as in the proof of Lemma 4.3 according to (4.5), with M replaced by 1 4 ω. The only difference is that now the cylinders Q n and Q n are of forward type whose vertices are attached to the origin, i.e., Q n = K n × (0, θ̺ p n ] and Q n = K n × (0, θ̺ p n ]. With these choices, we may apply the energy estimates in Proposition 3.2 within Q n , since the levels k n are admissible according to (3.4), i.e., Using |µ − | ≤ 2ω, a similar analysis as in the proof of Lemma 4.3 leads us to the analogue of (4.6), i.e. to the energy estimate where we have abbreviated A n = u > k n ∩ Q n .
Note that in (4.6) we only have to replace $M$ by $\frac14\omega$. Now we are in a situation similar to Lemma 6.1. More precisely, (6.5) holds true with $\eta_1 = 1$ and, on the right-hand side, $2^{pn}$ replaced by $4^{pn}$. Now, applying the Sobolev imbedding as in the proof of Lemma 6.1 and rewriting the resulting estimate in terms of $Y_n = |A_n|/|Q_n|$, we arrive at
where $\gamma$ and $b$ are positive constants depending only on the data. Hence by [5, Chapter I, Lemma 4.1], there exists a constant $\nu_o \in (0,1)$ depending only on the data, such that $Y_n \to 0$ as $n \to \infty$ if we require that
Upon choosing $\theta = \nu_o$, the above line is automatically satisfied and as a result we obtain
This in turn yields a reduction of oscillation of the form $\operatorname*{ess\,osc}_{Q_1} u \le \frac78\omega$.
Consequently, taking the initial datum into consideration, we obtain (see [5, Chapter III, Lemma 11.1] for the corresponding estimate for weak solutions to parabolic $p$-Laplacian equations)
Now we may proceed by induction. Define $\lambda = \frac12\theta^{\frac1p}$, and suppose that up to $i = 1, 2, \dots, j-1$ we have built sequences
For all the indices $i = 1, 2, \dots, j-1$, we always assume the first alternative of (7.1), i.e. that $\mu_i^- \le \operatorname*{ess\,osc}_{Q_i} u$ and $\mu_i^+ \ge -\operatorname*{ess\,osc}_{Q_i} u$ hold true. In this way the above argument can be repeated and we have for all $i = 1, 2, \dots, j$ the reduction of oscillation ess osc Qi u ≤ max 7 8 ess osc
Consequently, iterating the above recursive inequality, we obtain for all $i = 1, 2, \dots, j$, ess osc
In what follows, let us suppose $j$ is the first index satisfying either $\mu_j^- > \omega_j$ or $\mu_j^+ < -\omega_j$. Let us treat for instance $\mu_j^- > \omega_j$, for the other case is analogous. As in Section 5.3 we use the fact that $j$ is the first index for this to happen to obtain $\omega_j < \mu_j^- \le \frac97\omega_j$; cf. the proof of (5.4). Then, for simplicity we drop the suffix $j$ from our notation temporarily, and introduce $v := u/\mu^-$ and $w := v^{p-1}$ in $Q = K_\varrho \times (0, \varrho^p]$. In this way, the function $w$ will satisfy the parabolic $p$-Laplacian type equation (5.6)-(5.7). Moreover, it attains the initial datum $w_o := (u_o/\mu^-)^{p-1}$ in the sense of $L^2(K_\varrho)$. Next, we state a proposition concerning the regularity of solutions to the parabolic $p$-Laplacian type equation up to the initial time (cf. [5]).
Proposition 7.1. Let $p > 1$ and $\sigma \in (0,1)$. Suppose $w$ is a bounded, local, weak solution to (5.6) in $Q := Q_\varrho$ such that the structure conditions (5.7) are in force, and such that $w(\cdot,t) \to w_o$ as $t \downarrow 0$ in the sense of $L^2(K_\varrho)$. Assume $w_o$ is continuous in $K_\varrho$ with modulus of continuity $\omega_{w_o}(\cdot)$. Let $\omega = \operatorname*{ess\,osc}_Q w$ and $\theta = \omega^{2-p}$.
Then, there exist constants $\beta_1 \in (0,1)$ and $\gamma > 1$ depending only on the data $N, p, C_o, C_1$ and $\sigma$ (but independent of $w$), such that there holds: whenever we have ess osc
then the oscillation decay estimate ess osc
holds true for all $0 < r < \varrho$. Here, we use the notation $Q_r(\theta) = B_r \times (0, \theta r^p]$.
As in Sections 5.3 and 6.5 (distinguishing the cases $1 < p < 2$ and $p > 2$), one quickly checks that there exist absolute constants $c, C > 0$, such that $c \le \omega \le C$ and the condition in Proposition 7.1 is fulfilled for some proper $\sigma$. Moreover, the lower/upper bounds of $\omega$ actually allow us to obtain the set inclusion
Using this set inclusion and rephrasing the oscillation decay of Proposition 7.1 in terms of $u$, we have ess osc
whenever $0 < r < \varrho$; here we argue similarly to the proof of (6.13). Now we revert to using the suffix $j$. The above oscillation estimate then reads as ess osc
for all $0 < r < \varrho_j$. Combining the above two cases, we arrive at the desired conclusion, i.e., for all $0 < r < \varrho$, there holds ess osc
Observe that we may replace $\varrho$ by any $\tilde\varrho \in (r, \varrho)$. In particular, we may set $\tilde\varrho = \sqrt{r\varrho}$. In this way, we end up with ess osc
when $u$ is away from zero: $\mu^- > \xi\omega$ or $\mu^+ < -\xi\omega$.
Here ξ is the positive constant fixed in Proposition 4.1 through dependence on the data and α = α * , whereas α * comes from the property of positive geometric density of ∂E. Let us suppose the first case holds. The proof continues with a comparison to the boundary datum g, i.e., we may assume For otherwise, we would arrive at ess osc Qo u ≤ 2 ess osc Qo∩ST g.
Let us suppose for instance the second inequality holds. To proceed, we turn our attention to the energy estimates in Proposition 3.2 for super-solutions. Since (u − k) − vanishes on Q o ∩ S T for all k ≤ µ − + 1 4 ω (i.e. k satisfies (3.5) 2 with Q R,S replaced by Q o ), we may extend all integrals in the energy estimates to zero outside of E T . The extended (u − k) − will be denoted by the same symbol and it is still a member of the functional space in (1.8) within Q o .
The proofs of Lemma 4.2 and Lemma 4.3 can be carried over to the current situation with properly chosen parameters, bearing in mind that we have assumed $\partial E$ fulfills the property of positive geometric density (1.5), and therefore for any $k \le \mu^- + \frac14\omega$, we have
Thus the conclusion of Proposition 4.1 can be reached. As a result the oscillation is reduced in the case $1 < p < 2$ under the condition (7.3)$_1$ with $\xi = 1$ just like in Section 5, whereas this is true for $p > 2$ only when $|\mu^-| < \xi\omega$ with some very small $\xi$. Hence, for $p > 2$ one still needs to handle the situation when $\mu^- < -\xi\omega$, since this is not excluded in (7.3)$_1$. As in Section 6, there seems to be some technical complication due to the smallness of the parameter $\xi$ in the case $p > 2$. However, the property (7.4) offers considerable simplification. Indeed, we do not need to split the proof into two parts. Our current hypothesis to continue consists of (7.4) and $-2\omega < \mu^- < -\xi\omega$, as we have assumed $\mu^+ \ge -\xi\omega$ in (7.3)$_1$. They are analogues of (6.7) and (6.9) formulated near the infimum instead of the supremum, with which one can run the machinery employed in Lemma 6.3 and Lemma 6.4. The only difference is that Lemma 6.3 and Lemma 6.4 have been presented in terms of sub-solutions near the supremum, whereas now one needs to reproduce similar arguments in terms of super-solutions near the infimum. Therefore we can reduce the oscillation under the condition (7.3)$_1$ for $p > 2$ as well.
Next, we can proceed by induction just like the interior case until a certain index j, when the second case of (7.3) happens for the first time. Starting from j, the equation will behave like the parabolic p-Laplacian type equation within Q j ∩ E T . We may render this point technically just like in the interior case. Accordingly, we need the following result near the lateral boundary (cf. [5]). Then, there exist constants β 1 in (0, 1) and γ > 1 depending only on the data N, p, C o , C 1 and σ (but independent of w), such that there holds: if (7.5) ess osc then, the oscillation decay estimate ess osc holds true for all 0 < r < ̺.
We refrain from further elaboration due to the similarity of the arguments. The proof may be concluded as in the previous section.
Remark 7.1. We have omitted actual computations due to similarities with the proof of interior regularity. Nevertheless the conclusions in the interior, such as Proposition 4.1, can be applied directly to the current situation without repeating their proofs, thanks to their emphasis on the distinct roles of sub-solutions and super-solutions, provided we can extend u properly to the outside of E T and generate sub(super)-solutions across the lateral boundary. In this regard, we refer to Lemma A.2 for such extensions. This observation is also essentially the gist in the proofs of Theorem 1.2 -1.3. Similar calculations have to be reproduced mainly due to the variant energy estimates in Proposition 3.2 -3.3 that have incorporated either initial data or Dirichlet data. In particular, a key ingredient -the Sobolev imbedding (cf. [5, Chapter I, Proposition 3.1]) -was used in all these situations, assuming the functions (u − k) ± ζ p vanish on the lateral boundary of the domain of integration. This assumption in turn is fulfilled either by choosing a proper cutoff function ζ or by restricting the value of the level k according to the Dirichlet data as in (3.5) or the initial data as in (3.4).
The main difference in the current situation lies in that such a Sobolev imbedding cannot be used because in general the functions $(u-k)_\pm\zeta^p$ under the conditions of Proposition 3.4 do not vanish on $S_T$. However, a similar Sobolev imbedding (cf. [5, Chapter I, Proposition 3.2]) that does not require functions to vanish on the boundary still holds for the functional space
It is worth noting that the imbedding constant now depends on $N$, the structure of $\partial E$ and the ratio $T/|E|^{\frac pN}$, which is invariant for cylinders of the type $Q_\varrho = K_\varrho \times (-\varrho^p, 0]$ and $Q_\varrho \cap E_T$ as well, provided $\partial E$ is smooth enough. As an example, we exhibit in the following how to modify the proof of Lemma 4.3 technically. Based on Proposition 3.4 and with notation similar to that in Lemma 4.3, with the interior cylinders replaced by their intersection with $E_T$, the energy estimate (4.6) becomes, assuming $|\mu^-| \le 8M$,
where we have abbreviated
The term with $\gamma_*$ comes from the extra term generated by the Neumann datum $\psi$, and $\gamma_*$ depends on $C_2$ through (N). We may assume that the first term on the right dominates the second, for otherwise we would have $M \le \big(\frac{\gamma_*}{\gamma}\big)^{\frac1p}\varrho$. When $p > 2$, the first integral on the left may be estimated from below by
When $1 < p < 2$, we introduce $\hat k_n = \frac34 k_{n+1} + \frac14 k_n < \tilde k_n$ and estimate
In all cases, the energy estimate becomes
Then one may proceed to use the previously mentioned Sobolev imbedding (cf. [5, Chapter I, Proposition 3.2]) to establish a recursive inequality of fast geometric convergence for $Y_n = |A_n|/|Q_n \cap E_T|$ as in Lemma 4.3.
As shown above, whenever we use the energy estimate in Proposition 3.4, the extra term containing $C_2$ is always absorbed into other terms via the assumption $M > \big(\frac{\gamma_*}{\gamma}\big)^{\frac1p}\varrho$, and as such it contributes to the proof of Hölder regularity only through an extra control on the oscillation via $\omega \le \gamma\varrho$ with $\gamma$ depending also on $C_2$. This remark also holds for the proof of a result like Lemma 4.1.
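The absorption dichotomy can be made concrete with a small sketch. It assumes, purely for illustration, that the two competing right-hand side terms scale like $\gamma M^p/\varrho^p$ and $\gamma_*$ respectively (the displayed energy estimate is not reproduced here); the function names and sample numbers are hypothetical.

```python
def neumann_threshold(gamma, gamma_star, p, rho):
    """Critical level (gamma_star / gamma)**(1/p) * rho for M."""
    return (gamma_star / gamma) ** (1.0 / p) * rho


def first_term_dominates(M, gamma, gamma_star, p, rho):
    """True when gamma * (M / rho)**p exceeds gamma_star, i.e. the Neumann term is absorbed."""
    return gamma * (M / rho) ** p > gamma_star


if __name__ == "__main__":
    gamma, gamma_star, p, rho = 2.0, 5.0, 3.0, 0.1  # hypothetical sample values
    M_crit = neumann_threshold(gamma, gamma_star, p, rho)
    print("critical M:", M_crit)
    print(first_term_dominates(2.0 * M_crit, gamma, gamma_star, p, rho))
    print(first_term_dominates(0.5 * M_crit, gamma, gamma_star, p, rho))
```

Either the first term dominates and the argument proceeds as in the interior case, or $M$ is bounded by a multiple of $\varrho$, which is itself an acceptable oscillation control.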
Finally, we remark that the use of De Giorgi's isoperimetric inequality (cf. [5, Chapter I, Lemma 2.2]) is permitted for all convex domains. This is not restrictive in our case upon a local flattening of ∂E. In other words, since ∂E is of class C 1 , the portion of ∂E within K R (x o ) can be represented in a local coordinate system as part of the hyperplane x N = 0 and Without loss of generality we may assume that the weak formulation in Section 1.4.4 is written in such a coordinate system. Consequently, the energy estimate in Proposition 3.4 is written with K R (x o ) ∩ E and Q R,S ∩ E T replaced by K + R and Q + R,S respectively. Thus the machinery used in Lemma 4.2 can be reproduced with modifications as indicated above.
As usual, another main component of the induction argument will be a corresponding result from the regularity theory for the parabolic $p$-Laplacian type equation (cf. [5]), which we record in the following. If for some constant $\sigma \in (0,1)$, there holds (7.6) ess osc
then, there exist constants $\beta_1 \in (0,1)$ and $\gamma > 1$ depending only on the data $N, p, C_o, C_1, C_2, \sigma$ and the structure of $\partial E$, such that for all $0 < r < \varrho$, we have ess osc
Proof. Without loss of generality, let $u$ be a local weak sub-solution to (1.1)-(1.2). We show that $k + (u-k)_+$ is a local weak sub-solution to (1.1)-(1.2). Write down the mollified equation (3.1) and follow the introduction of $Q_{R,S} \Subset E_T$, the function $w_h$ and the piecewise smooth functions $\zeta$ and $\psi_\varepsilon$ in the proof of Proposition 3.1. Instead of (3.2), we choose here, for some $\sigma > 0$, the test function
\[
Q_{R,S} \ni (x,t) \mapsto \varphi(x,t) = \zeta^p(x,t)\,\psi_\varepsilon(t)\,\frac{\big(u(x,t)-k\big)_+}{\big(u(x,t)-k\big)_+ + \sigma}.
\tag{A.1}
\]
Like in the proof of Proposition 3.1, we treat the various terms in (3.1). First of all, we consider the time part. We havë where we have defined h + (u, σ, k) def = (p − 1) k p−1 +ˆu k |s| p−2 (s − k) + (s − k) + + σ ds .
Note that $\lim_{\sigma\downarrow0} h_+(u(x,t),\sigma,k) = [k + (u-k)_+]^{p-1}$. As in the proof of Proposition 3.1, we have used the fact that the second line in the above estimates has a non-negative contribution, due to (3.3) and the fact that the map $s \mapsto \frac{(s^{\frac{1}{p-1}}-k)_+}{(s^{\frac{1}{p-1}}-k)_+ + \sigma}$ is a monotone increasing function. We now send $h \downarrow 0$ and then $\varepsilon \downarrow 0$, as in the proof of Proposition 3.1, to obtain $\partial_t\zeta^p\, h_+(u,\sigma,k)\,dxdt$.
Finally we send σ ↓ 0 to finish the proof.
The above Lemma A.1 has an analog near the lateral boundary $S_T$. Suppose $u$ is a sub(super)-solution to (1.4). The cylinder $Q_{R,S} = K_R(x_o) \times (t_o - S, t_o)$ has its vertex $(x_o, t_o)$ attached to $S_T$ and the level $k$ satisfies (3.5). We define the following truncated extension of $u$ in $Q_{R,S}$:
An extension of $A$ can be defined as
In this way, the extended $A$ is a Carathéodory function satisfying (1.2) with $C_o$ and $C_1$ replaced by $\min\{1, C_o\}$ and $\max\{1, C_1\}$ respectively. Furthermore, we have
Lemma A.2. Suppose $u$ is a sub(super)-solution to (1.4) with (1.2) and the level $k$ satisfies (3.5). Let $u_k^\pm$ be defined as above. Then $u_k^\pm$ is a local weak sub(super)-solution to (1.1) with the extended $A$ in $Q_{R,S}$.
Proof. The calculations are similar to the proof of Lemma A.1. One only has to notice that due to our choice of $k$ satisfying (3.5), the test function (A.1) is still admissible if we extend it to zero in $Q_{R,S} \setminus E_T$ (cf. [11, Lemma 2.1]). Notice also that this extension does not require any smoothness of the lateral boundary $S_T$ a priori. In this way, all the subsequent integrals are carried over into the whole $Q_{R,S}$.
Remark A.1. We point out that Lemma A.2 exhibits an important character of parabolic equations. So-extended sub(super)-solutions across the lateral boundary often play a basic role in investigating the boundary regularity of solutions on rough domains. See for instance [10,11] in this regard. There exist constants δ and η in (0, 1) depending only on the data and α, such that u ≥ ηM a.e. in K 2̺ (x o ) × t o + δ( 1 2 ̺) p , t o + δ̺ p . It is worth mentioning that a similar remark as Remark 4.1 also holds under current circumstance. The following Harnack's inequality has been shown in [12,18,27]. However, we give an alternative proof based on Proposition 4.1 and Theorem 1.1. Proof. We only prove the right-hand inequality, as the left-hand one is a direct consequence (cf. To this end, we introduce, for τ ∈ (0, 1), the family of nested cylinders {Q τ } and the families of non-negative numbers {M τ } and {N τ } as follows: where σ > 1 is to be chosen. The two functions [0, 1) ∋ τ → M τ , N τ are increasing, and M o = N o = 1 since v(0, 0) = 1. Moreover, N τ → ∞ as τ → 1 whereas M τ is bounded since v is locally bounded. Therefore the equation M τ = N τ has roots and we denote the largest one as τ * . By the continuity of v, there exists (y, s) ∈ Q τ * , such that v(y, s) = M τ * = N τ * = (1 − τ * ) −σ . Moreover, Therefore by the definition of τ * , Now let ε * ∈ (0, 1) and set r = ε * R. By Theorem 1.1 (note also Remark 1.1), for all r < R and for all x ∈ K r (y), we have v(x, s) − v(y, s) ≥ −γM * r R provided we choose ε * so small that
From this we may start employing Proposition B.1 with $\alpha = 1$ to conclude that there exist positive constants $\eta$ and $\delta$ as indicated, such that $v \ge \eta M$ in the cylinder
Repeating this process, we conclude that for any positive integer $n$, $v \ge \eta^n M$ in the cylinder $Q^{(n)} \overset{\mathrm{def}}{=} K_{2^n r}(y) \times \big(s + \delta(2^{n-1}r)^p,\, s + \delta(2^n r)^p\big]$.
We may assume $\varepsilon_*(1-\tau_*)$ is a negative, integral power of 2. Then choose $n$ such that $2^n r = 2$.