Singular orthotropic functionals with nonstandard growth conditions

We pursue the study of a model convex functional with orthotropic structure and nonstandard growth conditions, this time focusing on the sub-quadratic case. We prove that bounded local minimizers are locally Lipschitz. No restriction on the ratio between the highest and the lowest growth rates is needed. The result also holds in the presence of a non-autonomous lower order term, under sharp integrability assumptions. Finally, we prove higher differentiability of bounded local minimizers as well.

1. Introduction

1.1. Overview. In this paper we expand on the gradient regularity theory for minimizers of functionals from the Calculus of Variations, having an orthotropic structure, in the nonstandard growth case. This may be seen as a follow-up to our previous papers [9] and [12].
In the superquadratic case, i.e. for 2 ≤ p_1 ≤ · · · ≤ p_N and for f ≡ 0, it has been recently proved in [9] that any local minimizer U ∈ W^{1,p}_loc(Ω) ∩ L^∞(Ω) is such that U_{x_i} ∈ W^{1,2}_loc(Ω), for i = 1, . . . , N. The main goal of this paper is to address the same kind of regularity issues, again for bounded local minimizers, this time in the subquadratic case 1 < p_1 ≤ · · · ≤ p_N ≤ 2. However, we will obtain some regularity results which actually hold in the full range 1 < p_1 ≤ · · · ≤ p_N < ∞, see the next section for more details.
We recall that u ∈ W^{1,p}_loc(Ω) is a local minimizer of F_p if, for every Ω′ ⋐ Ω, it minimizes F_p on Ω′ among all functions ϕ such that ϕ − u ∈ W^{1,p}_0(Ω′) ∩ L^∞(Ω′). Here we denote by W^{1,p}_0(Ω′) the completion of C^∞_0(Ω′) with respect to the natural norm of this anisotropic Sobolev space. By convexity of F_p, we have that u is a local minimizer if and only if it is a local weak solution in W^{1,p}_loc(Ω) ∩ L^∞_loc(Ω) of the associated Euler-Lagrange equation, which is quasilinear. This can be seen as a particular instance of an elliptic equation in the wide context of Musielak-Orlicz spaces, see [17] for a comprehensive study on the subject. We emphasize that in this paper we only consider bounded minimizers u. As a consequence, we discard a priori all the counterexamples to regularity arising in the literature on nonstandard growth variational problems, see [33,35,45]. For completeness, we mention that the boundedness of minimizers in this setting has already been extensively studied, see [32] for the homogeneous case f ≡ 0 and [19,20] for the non-homogeneous one. We also ignore the problem of the existence of a minimizer in W^{1,p}, for which we would also need to assume that f belongs to a suitable dual Sobolev space. Here instead, we assume a priori that a bounded minimizer U exists and focus on identifying sharp conditions on the function f needed to obtain its Lipschitz continuity and higher differentiability.
The main feature of all our regularity results will be that we do not need to impose any restriction on the ratio p_N/p_1.
We refer the reader to our previous papers [9,11,12] for an introduction to the realm of gradient regularity for minimizers of orthotropic functionals (see also [24] for an approach based on viscosity methods). We just recall here that already for the standard growth case p_1 = p_N = p, the superquadratic case p > 2 is much more involved than the case of the model functional

    ∫_{Ω′} |∇u|^p dx,    for u ∈ W^{1,p}_loc(Ω) and Ω′ ⋐ Ω,

as far as the regularity of the gradient of local minimizers is concerned. On the other hand, the subquadratic case 1 < p < 2 is simpler, in a sense: the Lipschitz continuity is a consequence of a general result due to Fonseca and Fusco [31, Theorem 2.2], as observed in the introduction of [11].
In particular, it seems natural to try to adapt the techniques used in [31], since they allow one to establish the Lipschitz regularity for the subquadratic case when p_1 = p_N = p < 2. However, we stress that in the case p_1 ≠ p_N, our functional pertains to the class of variational problems with nonstandard growth conditions, following the terminology of Marcellini in [43,44]. It then couples in a nontrivial way the difficulties coming from the two situations: orthotropic structure and nonstandard growth conditions. Thus, even if we will prove Lipschitz regularity with a proof inspired by [31, Theorem 2.2], nontrivial adaptations and intermediate results will be needed.
Finally, it is worth recalling that, in spite of a large number of papers and contributions on nonstandard growth problems (including for example [5,7,15,16,29,30,36,37,38,47,50]), a complete gradient regularity theory is still missing, even for the case of orthotropic structures. Moreover, we recall that also the case of basic regularity (i.e. C^{0,α} estimates, Harnack inequalities, and an extension of De Giorgi's regularity theory) is still not fully understood for local minimizers of F_p (see for example [1,3] and [42] for some results).

1.2. Main results. Our first result is a higher integrability statement, which is valid without any restriction on the exponents p_i. As we will see in a while, this will be instrumental to the two main regularity results of this paper. In what follows, we use the notation where ( • )_+ stands for the positive part. This function naturally arises from the principal part of F_p. It encodes in a natural way the full summability information for each component of the gradient. A similar idea has been considered for example in the papers [2,22,23], dealing with the so-called double phase problems.
Observe that the assumption γ > N is sharp (in the scale of Lebesgue spaces) to obtain the Lipschitz continuity of U. Actually, this is already true when It is a remarkable fact that, even in the orthotropic case with nonstandard growth conditions, this universal assumption on f still leads to Lipschitz continuity. We refer the reader to [4] for a wide class of variational problems (not including orthotropic structures, however) where this same condition is known to guarantee Lipschitz continuity of local minimizers.

1.3. Comparison with known results. In the homogeneous case f ≡ 0, Proposition 1.1 can be obtained as a consequence of [40, Lemma 4.2]. In the superquadratic case p_1 ≥ 2 and still for f ≡ 0, an alternative proof can also be found in [9, Proposition 6.1]. We present here a new proof that takes into account the presence of the forcing term f. Our argument is certainly more elementary than the one in [9], and arguably more natural in our setting than the one in [40], in the sense that it strongly relies on some tools that will be repeatedly used in the proofs of the other main results, see the next section for further comments.
Theorem L is the counterpart for the subquadratic case of our previous result [9, Theorem 1.1], which deals with the superquadratic case. We shall explain in the next section why the two situations require different arguments. We point out that experts in the field may recognize Theorem L (and [9, Theorem 1.1], as well) as a particular case of the main result in [40], at least in the homogeneous case f ≡ 0. However, it turns out that the proof of [40, Proposition 2.1] is affected by a crucial flaw; we refer the reader to [9, Remark 1.4] for a detailed discussion on this delicate point. In any case, it is fair to admit that some other parts of Lieberman's paper [40] have been an important source of inspiration for the proof of Proposition 1.1.
In [28, Corollary 3.4], the authors prove the local Lipschitz continuity of local minimizers (not a priori bounded) of the following functional Observe that such a functional has an orthotropic structure, with nonstandard subquadratic growth conditions, exactly as our F_p. However, it should be noticed that the functional (1.4) is neither degenerate nor singular: this is the crucial difference with our case. Indeed, the Hessian of the function This property fails to be satisfied by our functional, where the integrand is given by Even worse, in contrast with the general framework of [28], in our situation there are no continuous functions even for large values of z. Indeed, D^2 G(z) is given by the diagonal matrix and each entry on the diagonal blows up as the corresponding component of z vanishes.
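The blow-up of the Hessian can be made concrete with a small numerical sketch. It assumes the model integrand G(z) = Σ_i |z_i|^{p_i}/p_i (an assumption made here for illustration), for which D^2 G(z) is diagonal with entries (p_i − 1)|z_i|^{p_i−2}. The function name `hessian_diagonal` and the chosen exponents are hypothetical.

```python
def hessian_diagonal(z, p):
    """Diagonal entries (p_i - 1) * |z_i|**(p_i - 2) of D^2 G(z).

    Assumes the model integrand G(z) = sum_i |z_i|**p_i / p_i and z_i != 0.
    """
    return [(p_i - 1.0) * abs(z_i) ** (p_i - 2.0) for z_i, p_i in zip(z, p)]


# Hypothetical subquadratic exponents 1 < p_1 <= p_2 <= p_3 < 2.
exponents = [1.2, 1.5, 1.8]

# |z| stays large (two big components) while the first component shrinks:
# the first diagonal entry grows without bound.
for t in (1.0, 1e-2, 1e-4):
    z = [t, 10.0, 100.0]
    print(t, hessian_diagonal(z, exponents))
```

The first entry grows as the first component tends to zero even though |z| remains large, which is the sense in which no bound depending only on |z| can hold for this assumed integrand.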
As for Theorem S, we observe that this may be seen as a generalization of the following classical result for the p−Laplacian: Ω) (see for example [26]).The reader may notice that for p 1 = p N = p our assumption (1.3) boils down to Since for 1 < p < 2 we have 1 + 2/p < p ′ , this is a weaker requirement when compared with the classical result recalled above.This is not surprising, since we are now assuming that u is a priori bounded.Such an assumption is responsible for this new feature.In the standard growth case, this has been recently observed in [21,Theorem 1.2].Higher differentiability of local minimizers is a well-studied problem: for the specific case of orthotropic functionals with subquadratic nonstandard growth, some prior results can be found for example in [5,Theorem 3], [6, Corollary 1] and [16,Theorem 2].
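The elementary inequality 1 + 2/p < p′ for 1 < p < 2 quoted above can be checked directly: with p′ = p/(p − 1), the difference p′ − (1 + 2/p) equals (2 − p)/(p(p − 1)), which is positive exactly when p < 2. The following sketch (function names are illustrative) evaluates both sides on a few sample values.

```python
def conjugate(p):
    """Hoelder conjugate exponent p' = p / (p - 1), for p > 1."""
    return p / (p - 1.0)


def gap(p):
    """Difference p' - (1 + 2/p); equals (2 - p) / (p * (p - 1))."""
    return conjugate(p) - (1.0 + 2.0 / p)


for p in (1.1, 1.5, 1.9, 2.0):
    print(p, 1.0 + 2.0 / p, conjugate(p), gap(p))
```

The gap shrinks to zero as p approaches 2, where the two exponents coincide.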
Finally, in the superquadratic case p_1 ≥ 2, as already recalled, the counterpart of Theorem S has been obtained in [9, Corollary 7.1], for f ≡ 0. In the case of a right-hand side f ≢ 0, some results have been obtained in [14, Theorem 1.1] and [47, Corollary 2].
Remark 1.2 (On the C^1 regularity). In dimension N = 2, the C^1 regularity of a Lipschitz local minimizer essentially follows from [25, Theorem 1.1], both in the case p_1 ≤ 2 and p_1 ≥ 2, provided that f ≡ 0. This assertion is detailed in [8], where the "mixed" case p_1 ≤ 2 ≤ p_2 is considered, as well. For a slightly different approach, see [41] when p_1 ≥ 2 and [48] when p_1 = p_2 < 2: these references still deal with the case f ≡ 0. In the non-homogeneous case, the strategy followed in [10] (and originally written for p_1 = p_2 and f ≡ 0) could be adapted to more general situations, provided f satisfies suitable differentiability and summability conditions. In any case, the C^1 regularity of local minimizers when N ≥ 3 is entirely open, even for p_1 = p_N and f ≡ 0.
1.4. Structure of the proofs. The proofs of Proposition 1.1, Theorem L and Theorem S are based on a classical three-step strategy. We first approximate our local minimizer U ∈ W^{1,p}_loc(Ω) ∩ L^∞_loc(Ω) by a sequence of minimizers {u_ε}_{ε>0} of regularized functionals F_{p,ε} having good smoothing properties. We next obtain uniform a priori estimates on these minimizers. Finally, we pass to the limit in order to transfer these a priori bounds to U. The first step is usually quite easy: it is sufficient to perturb the initial functional by adding some uniformly convex ε-perturbation and possibly smooth out the coefficients, for example by replacing f with its mollification f_ε. This regularization strategy allows us to avoid the usual difference quotient method, and the technicalities that go with its use in the nonstandard growth setting. However, here we face a first difficulty: remember that we are not assuming f to be in the correct dual Sobolev space. This also entails that we do not have a good a priori L^∞ estimate at our disposal. Thus, such an approximation has to be handled with great care. We circumvent this technical difficulty by adding a nonlinear lower order term in the regularized functional, which forces the minimizers u_ε to be bounded, with a uniform L^∞ bound which only depends on the local L^∞ norm of U (see Lemma 2.4). This is a technical aspect of the proof, which we believe to be of independent interest.
The core of the matter is next to establish the a priori estimates for the gradient of u ε , the minimizer of the regularized functional F p,ε .As for the estimates leading to Proposition 1.1 and Theorem L, these are achieved by means of Moser-type schemes: a slow one and a fast one, respectively.
The cornerstone of these schemes is a Caccioppoli inequality for power functions of the gradients (see Proposition 3.1). In a simplified way, for every α ≥ 0 this reads as where G is the same function as in (1.5). For simplicity, we put f ≡ 0 and write u in place of u_ε. Observe that on the left-hand side of (1.6) we have a weighted gradient of a power of G(∇u): the weights |u_{x_i}|^{p_i−2} are the typical feature of degenerate/singular orthotropic functionals. The main difficulty in getting regularity results out of this estimate is precisely due to their presence. In contrast with the Caccioppoli inequality previously obtained in [9, Lemma 3.1] to handle the superquadratic case p_1 ≥ 2, now these weights do not pop up on the right-hand side. This is a crucial ingredient of the estimate: indeed, no control from above would be possible on Not surprisingly, the proof of (1.6) relies on the differentiated Euler-Lagrange equation, which is nothing but the equation solved by the components of ∇u_ε. In a nutshell, the idea to reach such an estimate not containing the nasty weights |u_{x_i}|^{p_i−2} on the right-hand side is that of using an integration by parts trick: this allows us to trade the presence of the term D^2 G for the more tractable one DG. This idea is certainly not new in the context of singular variational problems: it goes back at least to [46], and has since become standard in the field.
However, as natural as this idea may appear, its technical implementation in our context requires some effort: in particular, a careful choice of the test functions for the differentiated equation has to be made. Such a choice must reflect the algebraic structure of the operator, in a sense. Without entering too much into the details, we refer to the proof of Proposition 3.1 below. The choice of the correct test functions here has been suggested to us by [40], even if our choice seems to be simpler and more natural.
The Caccioppoli inequality (1.6) is first used in the proof of the higher integrability result of Proposition 1.1. More specifically, it allows us to obtain a self-improving estimate of the type This is the slow Moser iteration scheme we were referring to above: by iterating (1.7) a finite number of times, we can conclude that G(∇u) (and thus ∇u itself) can be estimated in L^q, for every finite q ≥ 1. Observe that the additive integrability gain at each step and the presence of the factor β^2 on the right-hand side make the previous scheme unsuitable for being iterated infinitely many times. This explains why we cannot reach the limiting case ∇u ∈ L^∞ with this approach.
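The contrast between a slow (additive) and a fast (multiplicative) gain of integrability can be sketched by counting iteration steps. The starting exponent, the additive gain 0.5 and the multiplicative factor 1.5 below are hypothetical values chosen for illustration; they are not the exponents of the actual schemes.

```python
def steps_additive(q_start, q_target, gain):
    """Steps needed to reach q_target when each step adds `gain` to the exponent."""
    steps, q = 0, q_start
    while q < q_target:
        q += gain
        steps += 1
    return steps


def steps_multiplicative(q_start, q_target, factor):
    """Steps needed to reach q_target when each step multiplies the exponent by `factor`."""
    steps, q = 0, q_start
    while q < q_target:
        q *= factor
        steps += 1
    return steps


for target in (10.0, 100.0, 1000.0):
    print(target,
          steps_additive(2.0, target, 0.5),
          steps_multiplicative(2.0, target, 1.5))
```

With an additive gain the number of steps grows linearly in the target exponent, so every finite q is reached in finitely many steps, while the multiplicative scheme needs only logarithmically many. In the slow scheme each step also carries a constant growing with the exponent, which is why the limit q → ∞ is out of reach there.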
Estimates like (1.7) are quite typical in Regularity Theory, both in the contexts of standard and nonstandard growth problems (among others, see for example [27, Proposition 3.1] and [18, Theorem 3], respectively). Usually, they are obtained by coupling an integration by parts with a Caccioppoli inequality for the gradient, like the inequality (1.6) at our disposal. The L^∞ bound on the solution is used to treat the solution itself as a constant in the estimates.
We stress here that in this part of the proofs we do not need the restriction p_N ≤ 2. Thus, in particular we can extend and simplify the higher integrability result we previously obtained¹ in [9, Proposition 4.3].
With the aid of (1.7), in the case p_N ≤ 2 we can transpose to our situation the typical absorption trick which lies at the basis of the Lipschitz estimate for the standard p-Laplacian (see for example [27, Section 3]). Up to some nontrivial technical issues, this consists in observing that when G(∇u) ≥ 1, we have and thus The weights |u_{x_i}|^{p_i−2} have then been absorbed into a suitable power function of the gradient. In this sense, in the case p_N ≤ 2, the presence of the weights |u_{x_i}|^{p_i−2} on the left-hand side of (1.7) helps more than it hurts.
¹ There is however a subtle detail here: the result in [9] was obtained through a complicated self-improving iterative scheme (inspired by that of [7, Theorem 1.1]), which was not of Moser type. Actually, this was much more sophisticated and could be roughly described as follows: improvement of integrability of N − 1 components of the gradient entails that the missing one improves its integrability, as well.
At this point, by joining (1.8) and (1.7), the orthotropic nature of the problem completely disappears and we simply fall into the realm of nonstandard growth problems. A standard application of the Sobolev inequality then makes it possible to launch a standard Moser iterative scheme (i.e. a fast one, with a multiplicative gain of integrability at each step). This allows us to reach an L^∞-L^q estimate on G(∇u), after infinitely many iterations. This is not the end of the story. Indeed, we still have to pay attention to a detail which is quite typical of the nonstandard growth case: the exponent q in this a priori estimate could be too large. However, this preliminary estimate can be "rectified" by combining the higher integrability result of Proposition 1.1 with an interpolation trick which decreases the initial integrability requirement on G(∇u). We then finally get an L^∞-L^1 estimate on G(∇u), as desired.
In contrast to Proposition 1.1 and Theorem L, the proof of Theorem S does not rely on the Caccioppoli inequality of Proposition 3.1. The proof follows the same idea as in the case of the result for the familiar p-Laplacian, for the case 1 < p < 2: we test the differentiated equation with the partial derivative u_{x_k} itself and perform an integration by parts as in Naumann's trick [46]. Again, this allows us to avoid using the undesired upper bound on the Hessian D^2 G. In order to conclude, one has to control from above terms of the form Observe that for every k ≠ i, the two terms are completely decoupled. However, when p_i ≤ 2, we can simply estimate this term from above by Young's inequality The first term is absorbed on the left-hand side, while the second term can be estimated from above by means of an integrability estimate (here we rely again on the information provided by Proposition 1.1). This explains why we require p_N ≤ 2 in the statement of Theorem S.

1.5. Plan of the paper. In Section 2, we present the approximation scheme and some basic material used throughout the paper. Section 3 contains the crucial Caccioppoli-type inequality for the gradient (Proposition 3.1). The latter is exploited in Section 4 to perform the slow Moser iteration leading to the higher integrability estimate needed in Proposition 1.1. The Lipschitz bound related to Theorem L is proved in Section 5, while Section 6 is devoted to the proof of the higher differentiability estimates corresponding to Theorem S. Then, in Section 7, we eventually prove our three main results by passing to the limit in the approximation scheme. Finally, for completeness, we include in Appendix A the proof of a maximum principle ensuring the uniform boundedness of the approximating sequence.
Acknowledgments. This work was finalized during a stay of P. B. and L. B. at the Institute of Applied Mathematics and Mechanics of the University of Warsaw, in July 2022. Iwona Chlebicka and Anna Zatorska-Goldstein are gratefully acknowledged for their kind invitation and the nice working atmosphere provided during the whole stay.
C. L. is a member of the Gruppo Nazionale per l'Analisi Matematica, la Probabilità e le loro Applicazioni (GNAMPA) of the Istituto Nazionale di Alta Matematica (INdAM).The three authors gratefully acknowledge the financial support of the projects FAR 2019 and FAR 2020 of the University of Ferrara.

2. Preliminaries
In this section, we fix

2.1. Some auxiliary functions. For every i = 1, . . . , N and ε > 0, we define (2.1) Then for every t ∈ R, we have

Proof. The second derivative of g_{i,ε} is given by In particular, by using that 0 < p_i − 1 ≤ 1, we easily get (2.2). We also observe that which proves (2.3).
Finally, (2.4) follows by writing 2 , and then using the definition of g_{i,ε} and the lower bound in (2.2).

Lemma 2.2 (Super-quadratic case). Let p_i > 2. Then for every t ∈ R, we have

Proof. The proofs of (2.5) and (2.7) are similar to those of (2.2) and (2.4) respectively and we omit them. In order to prove (2.6), we use the convexity of the map τ → |τ|^{p_i/2}. This implies We then observe that

By combining the last two inequalities, we get (2.6).
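The kind of two-sided bound on the second derivative used in the subquadratic case can be illustrated numerically. The sketch below assumes a regularization of the form g(t) = (ε + t^2)^{p/2}/p; this specific formula is a hypothetical stand-in and is not taken from the text. For 1 < p ≤ 2 one then has g″(t) = (ε + t^2)^{(p−4)/2}(ε + (p − 1)t^2), so that (p − 1)(ε + t^2)^{(p−2)/2} ≤ g″(t) ≤ (ε + t^2)^{(p−2)/2}, using only 0 < p − 1 ≤ 1.

```python
def g(t, p, eps):
    """Hypothetical regularized integrand (eps + t^2)^(p/2) / p."""
    return (eps + t * t) ** (p / 2.0) / p


def g_second(t, p, eps):
    """Closed-form second derivative of g with respect to t."""
    s = eps + t * t
    return s ** ((p - 4.0) / 2.0) * (eps + (p - 1.0) * t * t)


def two_sided_bounds(t, p, eps):
    """Lower and upper bounds for g'' valid when 1 < p <= 2."""
    w = (eps + t * t) ** ((p - 2.0) / 2.0)
    return (p - 1.0) * w, w


for t in (0.0, 1e-3, 1.0):
    lower, upper = two_sided_bounds(t, 1.5, 1e-4)
    print(t, lower, g_second(t, 1.5, 1e-4), upper)
```

Unlike the unregularized power |t|^p/p, the second derivative stays finite at t = 0 for every fixed ε > 0, which is the purpose of such a regularization.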
We also define the function which will play a crucial role in our estimates.The next result holds without any restriction on p i .
Lemma 2.3. For every z = (z_1, . . . , z_N) ∈ R^N and every i = 1, . . . , N, we have

Proof. By recalling the definition of both g_{i,ε} and G_ε, we have By using that p_N > 1 and that p_i ≤ p_N, we can estimate the last term from above as claimed.
2.2. Regularized problems. We will use an approximation scheme which is similar to that already used in our previous papers, starting from [11, Section 2]. We want to consider local minimizers of the following convex integral functional The function f is taken to belong to L^1_loc(Ω). In the rest of the paper, we fix Here by λ B we denote the ball concentric with B, scaled by a factor λ > 0. We set For every 0 < ε ≤ ε_0 and every x ∈ B, we then define We denote by ϱ_ε the usual family of Friedrichs mollifiers, supported in a ball of radius ε centered at the origin. Finally, we set and take By recalling the definition (2.1) of g_{i,ε}, we then define the regularized functional

Lemma 2.4 (Existence and regularity of a minimum for F_{p,ε}). For every 0 < ε ≤ ε_0, the problem admits a solution u_ε, which belongs to C^∞(B). Moreover, for every 0 < ε ≤ ε_0, we have We first show that we can apply [49, Theorem 9.2] and get existence of a solution to (2.12) min For this, we check the required assumptions. We first claim that for every z, ξ ∈ R^N (2.13) From (2.2) and (2.5) we get Then we observe that if p_i ≤ 2, then p_1 ≤ 2 and we can write: Thus, in both cases, (2.13) holds.
As for the lower order term, observe that the smooth function Finally, the uniform convexity of B and the smoothness of U_ε entail that the latter satisfies the bounded slope condition. Then [49, Theorem 9.2] yields the existence of a solution to (2.12). Since all the data are smooth, [49, Theorem 9.3] implies that u_ε ∈ C^∞(B). We claim that u_ε is a solution of (2.11), as well. Indeed, by using [13, Theorem 1.1], for every v ∈ U_ε + W^{1,p}_0(B) we can infer the existence of By the dominated convergence theorem and the uniform boundedness of ζ_ε, we also have This proves that there is no Lavrentiev phenomenon for F_{p,ε}, that is Thus, we get that u_ε solves (2.11), as well. Finally, the claimed L^∞ estimate readily follows from Lemma A.1 in the Appendix.
The smooth minimizer u_ε satisfies the Euler-Lagrange equation For every k = 1, . . . , N, one can insert test functions of the form ϕ_{x_k}, with ϕ ∈ C^2 compactly supported in B. By integrating by parts, we then get the equation for the partial derivatives of u_ε (2.15) As usual, by a density argument, the equation can be tested by any ϕ ∈ W^{1,p}_0(B). The first ingredient of our recipe is a simple a priori estimate, which is essentially the same as in [9, Lemma 2.1]: the only difference is the presence of the non-autonomous and nonlinear term f_ε ζ_ε(v), together with a slight modification of the function g_{i,ε}.

Lemma 2.5 (Basic energy estimate). For every 0 < ε ≤ ε_0, the following uniform estimate holds

Proof. By testing the minimality of u_ε against U_ε, we obtain The convexity of the function g_{i,ε} allows us to apply Jensen's inequality in connection with the fact that U_ε is defined by a convolution. This gives By using also that g_{i,ε}(t) ≥ |t|^{p_i}/p_i and the 1-Lipschitz character of ζ_ε, we get where C_M is a positive constant which only depends on M. We finally rely on (2.8) when p_i > 2 or the subadditivity of t → |t| This concludes the proof.
For our purposes, it is essential to have a convergence result for the minimizers {u_ε}_{0<ε<ε_0}. This is the content of the next lemma, which is an extension of [9, Lemma 2.2].

Lemma 2.6 (Convergence to a minimizer). With the same notation as above, we have for every 1 ≤ q < ∞.
Proof. The proof goes as in [9, Lemma 2.2]. We repeat the argument, since this gives us the occasion to fix some missing details in [9]. By using the uniform estimate of Lemma 2.5 and the definition of U_ε, we get that {u_ε − U_ε}_{0<ε≤ε_0} is a bounded family in W^{1,p}_0(B). Thanks to Lemma 2.4, we also have that {u_ε − U_ε}_{0<ε≤ε_0} is a bounded family in L^∞(B). From those two facts, we can infer the existence of an infinitesimal sequence and lim By recalling that U_{ε_k} has been constructed by convolution, we also have that it converges strongly in W^{1,p}(B) and almost everywhere to U. This allows us to conclude that {u_{ε_k}}_{k∈N} converges weakly and almost everywhere to u := φ + U. We need to prove that actually u = U. With this aim, we test the minimality of each u_{ε_k} against the function U_{ε_k}. Thus, by lower semicontinuity of the L^{p_i} norms, we can infer (2.18) Observe that for the convergence of the lower order term, we used that f_{ε_k} converges strongly in L^1(B), that U_{ε_k} and u_{ε_k} are equibounded in L^∞(B) and converge almost everywhere to U and u respectively, and that ζ_{ε_k} converges uniformly to the Lipschitz function ζ. This shows that The L^∞-boundedness of u_ε proved in Lemma 2.4 gives that ‖u‖_{L^∞(B)} ≤ M. A similar estimate holds for U by assumption. Since ζ(t) = t for every t ∈ [−M, M], one gets By the strict convexity of the functional F_p, the minimizer must be unique and thus we get u = U, as desired.
In order to prove (2.17), we can adapt the argument of [ For every i = 1, . . ., N , we rely on the lower semicontinuity of the L pi norm to get In connection with (2.19), this implies that The convergence of the norms, in conjunction with the weak convergence, permits to infer that (u ε k xi ) k∈N converges to U xi in L pi (B) for every i = 1, . . ., N (see for example [39,Theorem 2.11]).Moreover, since {u ε k } k∈N is bounded by M and converges almost everywhere in B to U , the dominated convergence theorem implies that (u ε k ) k∈N converges to U in L q (B) for every 1 ≤ q < ∞.
Finally, we observe that we can repeat this argument with any subsequence of the original family {u ε } ε>0 .Thus the above limit holds true for the whole family {u ε } 0<ε≤ε0 instead of {u ε k } k∈N and (2.17) follows.
The following technical result is classical in Regularity Theory. It is taken from [34, Lemma 6.1] and we state it here for the reader's convenience.

Lemma 2.7. Let 0 < r < R and let Z : [r, R] → [0, ∞) be a bounded function. Assume that for r ≤ s < t ≤ R we have with A, B, C ≥ 0, α_0 ≥ β_0 > 0 and 0 ≤ ϑ < 1. Then we have where λ is any number such that ϑ^{1/α_0} < λ < 1.
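The role of the condition ϑ^{1/α_0} < λ < 1 can be sketched as follows. The sketch assumes the usual form of the hypothesis in lemmas of this type, Z(s) ≤ A/(t − s)^{α_0} + B/(t − s)^{β_0} + C + ϑ Z(t), and the usual proof by iteration along radii t_0 = r, t_{i+1} − t_i = (1 − λ)λ^i (R − r). Iterating produces the geometric series Σ_i (ϑ λ^{−α_0})^i, which converges precisely when λ > ϑ^{1/α_0}. All function names and numerical values are illustrative.

```python
def radii(r, R, lam, k):
    """Radii t_0 = r, t_{i+1} = t_i + (1 - lam) * lam**i * (R - r), increasing to R."""
    t = [r]
    for i in range(k):
        t.append(t[-1] + (1.0 - lam) * lam ** i * (R - r))
    return t


def series_partial_sum(theta, lam, alpha0, k):
    """Partial sum of sum_i (theta / lam**alpha0)**i over i = 0, ..., k - 1."""
    ratio = theta / lam ** alpha0
    return sum(ratio ** i for i in range(k))


def series_limit(theta, lam, alpha0):
    """Limit lam**alpha0 / (lam**alpha0 - theta), valid if theta**(1/alpha0) < lam < 1."""
    return lam ** alpha0 / (lam ** alpha0 - theta)


theta, alpha0, lam = 0.5, 2.0, 0.8   # 0.8 > 0.5 ** (1 / 2), so the series converges
print(radii(1.0, 2.0, lam, 5))
print(series_partial_sum(theta, lam, alpha0, 200), series_limit(theta, lam, alpha0))
```

Choosing λ too close to ϑ^{1/α_0} makes the limit of the series large, while choosing λ close to 1 makes the increments (1 − λ)λ^i (R − r) small; the free parameter λ balances these two effects.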

3. Caccioppoli-type inequalities for the gradient
Throughout this section, we will assume that 1 < p_1 ≤ · · · ≤ p_N < ∞, without any further restriction.
In what follows, we will use the following function where G ε is the same function as in (2.9).
Proposition 3.1 (Caccioppoli inequality for power functions of the gradient).
Proof. We are going to use a trick based on integration by parts, taken from [46, Theorem 1] (see also [31]). This allows us to circumvent the use of the upper bound on the Hessian of the function G_ε. We start by fixing k ∈ {1, . . . , N} and inserting in (2.15) the test function where F is a non-negative C^1 monotone non-decreasing function, that will be specified later on. This is a feasible test function, thanks to the regularity of u_ε. Thus we get We observe that Then by integrating by parts on the right-hand side of (3.3), we obtain This is valid for every k = 1, . . . , N; we then take the sum over k.
On the left-hand side, the first term then becomes For the second term of the left-hand side in (3.4), we observe that and this is non-negative, since each g k,ε is convex and F ≥ 0. We thus obtain By (2.14), the second term of the right-hand side can be written as By an integration by parts and (2.14) again, the last term on the right-hand side of (3.5) is equal to We observe that the third term in the above sum is equal to the quantity in (3.6).Hence, (3.5) is equivalent to We first estimate I 3 : by Young's inequality, we have for every τ > 0 By taking τ = 1/2, we can absorb the term I 1 on the right-hand side and obtain from (3.7) (3.8) The term is easier to handle: we simply have In conclusion, from (3.8) we get (3.9) We now treat the three terms containing f ε : we start from The last term coincides with 1 2 F 3 while the first term is bounded from above (up to a multiplicative constant) by the third term on the right-hand side of (3.9) .Using also that (ζ ε ) ′ L ∞ (R) ≤ 1, we thus get from (3.9) (3.10) The last term F 2 contains second order derivatives of u ε that should be absorbed on the left-hand side.We proceed similarly as for I 3 and estimate it as follows thanks to Young's inequality.Here as always τ > 0 is arbitrary.By inserting this estimate in (3.10) and choosing τ = 1/2, we obtain We observe that if we set Thus we have obtained We now use Lemma 2.3 to estimate from above the right-hand side.Thus, from (3.13), we get By recalling the definition (3.1) of G ε , we observe that and from (3.14), we obtain In order to conclude, we now make the choice We observe that thanks to the definition of G ε .It follows that and thus From (3.15) we get for some C = C(N, p 1 , p N ) > 0. 
We are only left with estimating I 2 from below: recall that we have We now observe that by (3.16) and through some lengthy though elementary computations, we get We then apply (2.4) or (2.7) on the last term, so to get where δ k is the same quantity defined in (3.11).We further observe that This discussion leads us to By inserting this inequality in I 2 , we get Finally, we can use this estimate in (3.17), so as to get the desired conclusion for α > 0. The limit case α = 0 can now be simply obtained by taking the limit α goes to 0 in the previously obtained estimate, since the relevant constant remains bounded.
Proof.We start by taking β ≥ 1 and writing We observe that if we set , using (2.3) or (2.6) on the second integral, we get By recalling that g ′ k,ε (t) t ≥ 0 and using that 0 On the last term, using product rule and equation (2.14), we obtain By using that u ε is bounded and that 0 Now, (2.10) together with the fact that where in the last inequality we applied Young's inequality.By choosing τ = 1/2, we can absorb the term containing G ε (∇u ε ) β+1 and get from (3.19) For the second term on the right-hand side, we use Young's inequality: for every τ > 0, We also notice that by (3.12) we have Thus from (3.20), we get By choosing τ = 1/(2 δ), we can absorb again the term containing G ε (∇u ε ) β+1 on the right-hand side and obtain On the right-hand side, we now use the Caccioppoli inequality (3.2) with α = β − 1 ≥ 0, so to get for some C = C(N, p 1 , p N ) > 0. This finally gives for some C = C(N, p 1 , p N ) > 0. On the first term on the right-hand side, we can use Young's inequality ˆη2 dx.
By choosing τ = 1/2, we can re-absorb the term G ε (∇u ε ) β+1 .This gives We proceed in a similar way for the two terms containing f ε .By using Young's inequality with By choosing τ = 1/8, we can absorb again the terms containing the power β + 1 of G ε (∇u ε ).This finally leads to the estimate (3.18), up to renaming ϑ = β + 1 − 2/p N .This concludes the proof.

4. Uniform higher integrability
In this section, we establish a higher integrability estimate for ∇u_ε, which will eventually lead to the result of Proposition 1.1. We assume throughout the section that 1 < p_1 ≤ · · · ≤ p_N < ∞.

Proof. We take γ ≥ 2 and define the sequence of exponents We set (4.1) This in particular implies that ϑ_{i_0+1} ≤ γ < ϑ_{i_0+2}.
We now need to distinguish various cases, according to the values of p N and γ.
Case A.1. Here we assume that This is the simplest case: we get the estimate by iterating Proposition 3.2 with exponents ϑ = ϑ_i and a suitable sequence of shrinking balls.
More precisely, we fix B r and B R as in the statement and define the sequence of decreasing radii Accordingly, we take a cut-off function η i ∈ C 2 0 (B ri ) for i = 0, . . ., i 0 , such that By applying (3.18) with ϑ = ϑ i , η = η i and using the properties of the cut-off function, we get for a constant C = C(N, p 1 , p N ) > 0. By using Hölder's inequality on the right-hand side, we also get Starting from i = 0 and iterating (4.3) from 0 to i 0 , we get where we set for notational simplicity for every natural number 0 ≤ k ≤ i 0 with the notational agreement that D i0+1 = 1.Their precise expression is not very important, but we point out that D 0 and M depend only on Thanks to the assumption (4.2) and recalling that G ε ≥ 1, we get thus the desired conclusion follows from (4.4).
Case A.2.Here we assume that In light of the second assumption, we have where i 0 is the index defined in (4.1).Thus in this case, we need an extra step of the iteration, by suitably adapting the choice of the exponent ϑ.We take B r ⋐ B ̺ ⋐ B R where and define the sequence of decreasing radii We take a cut-off function By proceeding as above, we now get The last term can be estimated from above, by using again that 2/p ′ N ≤ 1.However, in order to reach the desired exponent γ, we need to apply (3.18) once more.We take a cut-off function By applying (3.18) with 3 and the cut-off function above, we get On the right hand side, the term containing G ε (∇u ε ) is under control, since by construction and the last term can be estimated by (4.5).
3 Observe that such a choice is feasible, since Case B.Here we assume that p N > 2. The proof goes exactly as before, so, for every r < R with B R ⋐ B we certainly have but now the major difference is that we need to estimate the integral on the right hand side.Indeed, in this case 2/p ′ N > 1 and we can not directly assure that this term is bounded, uniformly in ε.We need to use an interpolation trick to get a reverse L 2/p ′ N − L 1 estimate on G ε (∇u ε ).We denote ̺ = (R + r)/2 and we observe that thus by interpolation in Lebesgue spaces, we get for every .
The L 2 norm on the right-hand side can in turn be estimated by means of (3.18), observing that we thus get from (3.18) We spend this information into (4.6) and use the subadditivity of τ → τ (pN −2)/pN , so to get Remark 4.2 (Quality of the constants).For future references, it is important to notice that the two constants Γ 1 and Γ 2 in the previous statement are uniformly bounded from above, whenever there exists a constant C ≥ 1 such that On the contrary, we see from the proof above that lim γ→∞ with an exponential rate of divergence.
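For the interpolation step in Case B above, the following sketch recalls the standard interpolation inequality in Lebesgue spaces and works out the interpolation parameter for the endpoints $L^1$ and $L^2$ mentioned in the text. The symbols $v$, $E$, $s_0$, $s$, $s_1$, $\theta$ are placeholders introduced here; the exact form in which the inequality is combined with (3.18) is not reproduced.

```latex
% Interpolation in Lebesgue spaces: for 1 <= s_0 < s < s_1 <= infinity and
% theta in (0,1) defined by 1/s = theta/s_0 + (1 - theta)/s_1,
\[
  \|v\|_{L^{s}(E)} \;\le\; \|v\|_{L^{s_0}(E)}^{\theta}\,
                           \|v\|_{L^{s_1}(E)}^{1-\theta}.
\]
% Instance with s_0 = 1, s = 2/p_N', s_1 = 2 (admissible since p_N > 2
% gives 1 < 2/p_N' < 2):
\[
  \frac{p_N'}{2} = \theta + \frac{1-\theta}{2}
  \quad\Longrightarrow\quad
  \theta = p_N' - 1 = \frac{1}{p_N - 1},
  \qquad
  1 - \theta = \frac{p_N - 2}{p_N - 1}.
\]
```

As a sanity check, for $p_N = 3$ one has $p_N' = 3/2$, $2/p_N' = 4/3$ and $\theta = 1/2$, which indeed satisfies $3/4 = 1/2 + 1/4$.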

Uniform Lipschitz bound
We now establish a local $L^\infty$ estimate for $\nabla u_\varepsilon$: this time, this will lead to Theorem L.
Then for every pair of concentric balls $B_r \Subset B_R \Subset B$ and every $q > N$, we have for some $C = C(N, p_N, p_1, q) > 0$.
Proof. We will use Moser's iteration scheme, in order to get the claimed estimate. By (2.2), for every $i = 1, \dots, N$ we have thanks to the fact that $p_i \le 2$, for every $i = 1, \dots, N$. We can further estimate the last term from below as follows . By using this lower bound in (3.2), we get for some $C = C(N, p_1, p_N) > 0$. With simple algebraic manipulations, for every $\alpha \ge 0$ we have Thus from (5.1) we obtain for some $C = C(N, p_1, p_N) > 0$. By adding on both sides the term and using again that $G_\varepsilon \ge 1$, we then obtain from (5.2) possibly for a different constant $C = C(N, p_1, p_N) > 0$. Let us suppose for simplicity that $N \ge 3$.
The case $N = 2$ can be treated with minor modifications. On the left-hand side of (5.3), we then use Sobolev's inequality in $W^{1,2}(\mathbb{R}^N)$. This gives with some new constant $C = C(N, p_1, p_N) > 0$. We choose $\eta \in C^2_0(B_R)$ to be a cut-off function such that Thus we obtain from (5.4) We now take an exponent $q > N$. By Hölder's inequality on the last term, we deduce that . (5.5) Before proceeding further, we rely on Hölder's inequality to get .
Moreover, by recalling that $G_\varepsilon \ge 1$, we have By using these two facts in (5.5), we obtain . (5.6) We now set so that from (5.6), we get (5.7) We define the sequence of exponents through the following recursive relation We also define the classical sequence of shrinking radii With this notation, from (5.7) we get and the latter holds true, in view of our assumption.
By starting from i = 0 and iterating (5.8) n times, we get . (5.9) By observing that lim n→∞ n i=0 and if we take the limit as n goes to ∞ in (5.9), we end up with , for some C = C(N, p 1 , p N , q) > 0. The previous estimate holds for every r < R such that B R ⋐ B. Thus, we can now use a standard interpolation trick to rectify it and replace the L q/(q−2) norm on the right-hand side by the L 1 norm.This goes as follows: we first observe that .
Then by using Young's inequality with exponents q/2 and q/(q − 2), we get We now take s, t such that r ≤ s < t ≤ R. The previous estimate is valid by replacing r with s and R with t.Thus, with some simple algebraic manipulations, we get By relying once again on Lemma 2.7, from the last estimate we get By finally using that N ≤ G ε (∇u ε ) q , we eventually conclude the proof.
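Two standard inequalities are invoked by name in the proof above, and the sketch below recalls them in generic form. The function $v$ and the numbers $a$, $b$ are placeholders introduced here and do not denote the specific quantities of the proof; $C_N$ denotes a constant depending only on the dimension.

```latex
% Sobolev inequality in W^{1,2}(R^N), N >= 3, with 2^* = 2N/(N-2):
\[
  \left( \int_{\mathbb{R}^N} |v|^{2^*}\,dx \right)^{2/2^*}
  \;\le\; C_N \int_{\mathbb{R}^N} |\nabla v|^{2}\,dx,
  \qquad 2^* = \frac{2N}{N-2}.
\]
% Young inequality with conjugate exponents q/2 and q/(q-2), q > 2, a, b >= 0:
\[
  a\,b \;\le\; \frac{2}{q}\, a^{q/2} + \frac{q-2}{q}\, b^{q/(q-2)},
  \qquad
  \frac{2}{q} + \frac{q-2}{q} = 1 .
\]
```

The exponents $q/2$ and $q/(q-2)$ are admissible because $q > N \ge 2$ forces $q > 2$.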

Uniform higher differentiability
At last, we prove a Sobolev-type regularity result for (some nonlinear function of) $\nabla u_\varepsilon$, which will eventually allow us to establish Theorem S.
When $\gamma = 2$, the last term is simply Proof. We start by fixing $k \in \{1, \dots, N\}$ and inserting in the differentiated equation (2.15) the test function $\varphi = (u_\varepsilon)_{x_k}\,\eta^2$. Thus we get For the first term of the right-hand side, we use the same trick as in the proof of Proposition 3.1: we observe that and then integrate by parts. We integrate by parts the term $(f_\varepsilon\,(\zeta_\varepsilon)'(u_\varepsilon))_{x_k}$, as well. This yields By Young's inequality and the fact that $0 \le (\zeta_\varepsilon)' \le 1$, we can estimate the first term of the right-hand side as follows: We use (2.2) to estimate $1/g''_{k,\varepsilon}$ on the right-hand side. On account of this inequality, we get Inserting the above inequality into (6.3) and absorbing the Hessian term of the right-hand side into the left-hand side, one gets We now estimate the last term as follows: Observe that we used again that $0 \le (\zeta_\varepsilon)' \le 1$. We apply Young's inequality on the last term, so as to obtain By further using that $p_k \le 2$, we have It follows from the above inequality and (6.4) that for some $C = C(p_1, p_N) > 0$. Then take the sum over $k = 1, \dots, N$. This gives By Lemma 2.3, we have Moreover, by the definitions of $g_{k,\varepsilon}$, $G_\varepsilon$, and $G_\varepsilon$, it is easily seen that where all the constants depend only on $N$, $p_1$ and $p_N$. From (6.5), we get the third term can be absorbed in the first one, up to increasing $C$ if necessary: In the last term, we write and use Young's inequality: where in the last line, we have used again that $G_\varepsilon \ge 1$. Inserting this estimate in (6.6), we obtain On the left-hand side, we use the definition (6.1) of $V_{i,\varepsilon}$ which gives that This yields the desired conclusion when the exponent $\gamma$ in the statement of the Proposition is equal to 2. When $\gamma > 2$, we only need to apply the Hölder inequality to the last term of the right-hand side with the exponents $\gamma/(\gamma - 2)$ and $\gamma/2$. The proof is complete.
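The passage from $\gamma = 2$ to $\gamma > 2$ at the end of the proof rests on Hölder's inequality with the conjugate pair $\gamma/(\gamma-2)$ and $\gamma/2$. A generic form is sketched below; the functions $F$, $H$ and the set $E$ are placeholders introduced here, and which factor of the last term plays which role is not specified in the text.

```latex
% Hoelder inequality with exponents gamma/(gamma-2) and gamma/2, gamma > 2:
\[
  \int_E |F|\,|H|\,dx
  \;\le\;
  \left( \int_E |F|^{\frac{\gamma}{\gamma-2}}\,dx \right)^{\frac{\gamma-2}{\gamma}}
  \left( \int_E |H|^{\frac{\gamma}{2}}\,dx \right)^{\frac{2}{\gamma}},
  \qquad
  \frac{\gamma-2}{\gamma} + \frac{2}{\gamma} = 1 .
\]
```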

Proofs of the main results
We finally establish the three results presented in the Introduction by relying on the relevant a priori estimates that we have obtained in the previous sections. We thus fix a ball $B_{4R}(x_0) \Subset \Omega$ as in the statements of Proposition 1.1, Theorem L and Theorem S: we are going to use the results of the previous sections, with the choice $B = B_{2R}(x_0)$.
In particular, by using that$^{5}$ In view of Lemma 2.6, there exists an infinitesimal sequence $\{\varepsilon_k\}_{k\in\mathbb{N}}$ such that $(u_{\varepsilon_k}, \nabla u_{\varepsilon_k})$ converges to $(U, \nabla U)$ a.e. in $B$.
We then take the limit on both sides of the estimate above and use Fatou's lemma on the left. We get By Lemma 2.6, the functions $(u_\varepsilon)_{x_i}$ converge to $U_{x_i}$ in $L^{p_i}(B)$. Hence, the continuity of the map $v \in L^{p_i}(B) \mapsto |v|^{p_i} \in L^1(B)$ implies that (7.4) $\lim_{\varepsilon\to 0} \|G_0(\nabla u_\varepsilon) - G_0(\nabla U)\|_{L^1(B)} = 0$.
By using this result in (7.3), we obtain This concludes the proof, up to renaming the constant $\Gamma_2$.
We introduce the ball $B_R \Subset B$ as before. By Proposition 5.1 applied with $B_{R/4}$ and $B_{R/2}$, for every $\varepsilon \in (0, \varepsilon_0)$, we have for some $C = C(N, p_N, p_1, \gamma) > 0$. On the right-hand side, we can apply (7.1), in order to estimate the term containing $G_\varepsilon^\gamma$. This yields $\|G_\varepsilon(\nabla u_\varepsilon)\|_{L^1(B_R)}$ .
We now take the same infinitesimal sequence $\{\varepsilon_k\}_{k\in\mathbb{N}}$ as in the proof of Proposition 1.1. By using again (7.2), the lower semicontinuity of the $L^\infty$ norm with respect to almost everywhere convergence, equation (7.4) and the fact that $f_{\varepsilon_k}$ is defined from $f$ by convolution with a smooth kernel, the limit as $k$ goes to $\infty$ gives Then, as a consequence of (6.2), we have where we also used that $G_\varepsilon \ge 1$, by definition. We now rely again on (7.1), to estimate the terms containing $G_\varepsilon^\gamma$. This estimate and Young's inequality with exponents $\gamma/2$ and $\gamma/(\gamma - 2)$ give (7.5) possibly for a different constant $C = C(N, p_N, p_1) > 0$. From this estimate, we deduce that the family $\nabla V_{i,\varepsilon}$ is uniformly bounded in $L^2(B_{R/4})$, for $t = 0$.
Thus, by recalling the definition of $V_{i,\varepsilon}$, we get and the latter is uniformly bounded, thanks to Lemma 2.5. Thus, by taking the same infinitesimal sequence $\{\varepsilon_k\}_{k\in\mathbb{N}}$ as in the proof of Proposition 1.1, we have obtained that $\{V_{i,\varepsilon_k}\}_{k\in\mathbb{N}}$ is a bounded sequence in $W^{1,2}(B_{R/4})$. By appealing to the Rellich-Kondrašov Theorem, we can infer its convergence to a function $V_i \in W^{1,2}(B_{R/4})$, weakly in $W^{1,2}(B_{R/4})$ and strongly in $L^2(B_{R/4})$ (up to a subsequence). By the weak lower semicontinuity of the $L^2$ norm, (7.2) and (7.4), we obtain from (7.5) that By construction, we still have $v \in U + W^{1,1}_0(B)$ and by minimality of $u$, we get $F(u) \le F(v)$.
By the properties of ζ and the fact that G(0) = 0, we have