Convergence analysis of a Local Discontinuous Galerkin approximation for nonlinear systems with balanced Orlicz-structure

In this paper, we investigate a Local Discontinuous Galerkin (LDG) approximation for systems with balanced Orlicz-structure. We propose a new numerical flux, which yields optimal convergence rates for linear ansatz functions. In particular, our approach yields a unified treatment for problems with $(p,\delta)$-structure for arbitrary $p \in (1,\infty)$ and $\delta\ge 0$.


Introduction
We consider the numerical approximation of the nonlinear system − div A(∇u) = g − div G in Ω , u = u D on Γ D , A(∇u)n = a N on Γ N , using a Local Discontinuous Galerkin (LDG) scheme. More precisely, for given data g, G, u D and a N , we seek a vector field u = (u 1 , . . . , u d ) : Ω → R d solving (1.1). Here, Ω ⊆ R n , n ≥ 2, is a polyhedral, bounded domain with Lipschitz continuous boundary ∂Ω, which is disjointly divided into a Dirichlet part Γ D , where we assume that |Γ D | > 0, and a Neumann part Γ N , i.e., Γ D ∪ Γ N = ∂Ω and Γ D ∩ Γ N = ∅. By n : ∂Ω → S n−1 , we denote the unit normal vector field to ∂Ω pointing outward. We consider the case that A : R d×n → R d×n is a nonlinear operator having ϕ-structure for some balanced N-function ϕ (cf. Section 2.2). The relevant example falling into this class is Introducing the additional unknowns L : Ω → R d×n and A : Ω → R d×n , the system (1.1) can be re-written as a "first order" system, i.e., Discontinuous Galerkin (DG) methods for elliptic problems have been introduced in the late 90's. They are by now well-understood and rigorously analyzed in the context of linear elliptic problems (cf. [2] for the Poisson problem). In contrast to this, only few papers treat p-Laplace type problems with DG methods (cf. [11,10,17,12,23,25]), or non-conforming methods (cf. [5]). There exists even fewer numerical investigations of problems with Orlicz-structure. Steady problems with Orlicz-structure are treated, using finite element methods (FEM) in [18,16,9] and non-conforming methods in [5]. Unsteady problems are investigated in [20,27]. To the best of the author's knowledge, there are no studies of steady problems with Orlicz-structure using DG methods, except for [17,Remark 2.3], where it is mentioned that the results for the p-Laplace can be extended to balanced N-functions. Note that the convergence rates in [17] are suboptimal (cf. Remark 5.24) since the continuous solution u of (1.1) satisfies the flux formulation of (1.1) with an additional term defined on interior and Dirichlet faces (cf. (5.1)). The main purpose of this paper is to overcome this drawback. Moreover, we state our results in the context of balanced N-functions to make clear that the usual distinction between the cases p ≥ 2 and p ≤ 2 for p-Laplace problems is not needed.
To this end, we introduce a new numerical flux (cf. (3.4)), which allows us to establish convergence of DG-solutions to a weak solution of the system (1.1) for general right-hand sides g − div G (cf. Theorem 4.27), and error estimates if G = 0 (cf. Theorem 5.10, Corollary 5.22). The convergence rates are optimal for linear ansatz functions. Further, our approach yields a unified treatment of problems with (p, δ)-structure (cf. [17]), p ∈ (1, ∞), δ ≥ 0 and recovers in the DG setting the results in [19,18] (using FEM) and [5] (using special non-conforming methods). The presence of the shift in the new flux is analogous to the gradient shift in the natural distance. It takes into account the structure of the nonlinear problem (1.1) (cf. Proposition 2.17, Remark 2.30, Remark 5.24).
This paper is organized as follows: In Section 2, we introduce the employed notation, define the relevant function spaces, the basic assumptions on the nonlinear operator and its consequences, introduce discrete operators and discuss their properties. In Section 3, we define our numerical fluxes and derive the flux and the primal formulation of our problem. In Section 4, we prove the existence of DG solutions (cf. Proposition 4.19), the stability of the method, i.e., a priori estimates (cf. Proposition 4.21, Corollary 4.23), and the convergence of DG solutions (cf. Theorem 4.27). In Section 5, we derive error estimates for our problem (cf. Theorem 5.10, Corollary 5.22). These are the first convergence rates for an LDG method for systems with balanced Orlicz-structure. In Section 6, we confirm our theoretical findings via numerical experiments. For the convenience of the reader, we collect in the Appendix A known results in the DG Orlicz setting, needed in the paper, and prove some new results.

Preliminaries
2.1. Function spaces. We use c, C > 0 to denote generic constants, that may change from line to line, but are not depending on the crucial quantities. Moreover, we write f ∼ g if and only if there exists constants c, C > 0 such that c f ≤ g ≤ C f .
For k ∈ N and p ∈ [1, ∞], we will employ the customary Lebesgue spaces L p (Ω) and Sobolev spaces W k,p (Ω), where Ω ⊆ R n , n ≥ 2, is a bounded, polyhedral domain having a Lipschitz continuous boundary ∂Ω, which is disjointly divided into a Dirichlet part Γ D ⊆ ∂Ω, where we assume that |Γ D | > 0, and a Neumann part Γ N ⊆ ∂Ω, i.e., Γ D ∪Γ N = ∂Ω and Γ D ∩Γ N = ∅. We denote by · p , the norm in L p (Ω) and by · k,p , the norm in W k,p (Ω). The space W 1,p Γ D (Ω) is defined as those functions from W 1,p (Ω) whose trace vanishes on Γ D . We equip W 1,p Γ D (Ω) with the gradient norm ∇ · p . For a normed vector space X, we denote its (topological) dual space by X * . We do not distinguish between function spaces for scalar, vector-or tensor-valued functions. However, we will denote vector-valued functions by boldface letters and tensor-valued functions by capital boldface letters. For d ∈ N, the standard scalar product between two vectors u = (u 1 , . . . , u d ) , v = (v 1 , . . . , v d ) ∈ R d is denoted by u · v = d i=1 u i v i , and we use the notation |u| = √ u · u for all u ∈ R d . For d, n ∈ N, the Frobenius scalar product between two tensors P = (P ij ) i=1,...,d,j=1,...,n ,Q = (Q ij ) i=1,...,d,j=1,...,n ∈ R d×n is denoted by P : Q = d i=1 n j=1 P ij Q ij , and we write |P| = √ P : P for all P ∈ R d×n . We denote by |M |, the n-or (n − 1)-dimensional Lebesgue measure of a (Lebesgue) measurable set M ⊆ R n , n ∈ N. The mean value of a locally integrable function f over a measurable set M ⊆ Ω is denoted by f M := − M f dx := 1 |M | M f dx. Moreover, we use the notation (f, g) := Ω f g dx, whenever the right-hand side is well-defined.
2.2. Basic properties of the nonlinear operator. In the whole paper, we always assume that the nonlinear operator A : R d×n → R d×n has ϕ-structure, which will be defined now. A detailed discussion and thorough proofs can be found in [14,29,8]. A regular N-function ψ is called balanced, if there exist constants γ 1 ∈ (0, 1] and γ 2 ≥ 1 such that for all t > 0, there holds The constants γ 1 and γ 2 are called characteristics of the balanced N-function ψ, and will be denoted as (γ 1 , γ 2 ). The basic properties of balanced N-functions are collected in the following lemma (cf. [15] for related results of a slightly different approach).
Lemma 2.4. Let ψ be a balanced N-function with characteristics (γ 1 , γ 2 ). Then, the following statements apply: The N-functions ψ and ψ * satisfy the ∆ 2 -condition, and the ∆ 2 -constants of ψ and ψ * possess an upper bound depending only on the characteristics of ψ. (iii) Uniformly with respect to t > 0, we have that ψ(t) ∼ ψ (t) t ∼ ψ (t) t 2 , with constants of equivalence depending only on the characteristics of ψ.
Closely related to the nonlinear operator A : R d×n → R d×n with ϕ-structure are the functions F, F * : R d×n → R d×n , for every P ∈ R d×n defined via Remark 2.10. Note that F and F * are derived from the potentials respectively. One easily sees that these potentials are balanced N-functions.
Another important tool are the shifted N-functions (cf. [14,29,28]). For a given N-function ψ, we define the shifted N-functions ψ a : The following properties of shifted N-functions are proved in [14,29].
The connection between A, F, F * : R d×n → R d×n and ϕ a , (ϕ * ) a : R ≥0 → R ≥0 , a ≥ 0, is best explained by the following proposition (cf. [14,29]). Proposition 2.17. Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Then, uniformly with respect to P, Q ∈ R d×n , we have that (2.20) Moreover, uniformly with respect to Q ∈ R d×n , we have that with constants depending only on the characteristics of A and ϕ.
Lemma 2.28. Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Then, uniformly with respect to t ≥ 0 and Q, P ∈ R d×n , we have that with constants depending only on the characteristics of A and ϕ.
Remark 2.29 (Natural energy spaces). If A satisfies Assumption 2.7 for a balanced N-function ϕ, we see from (2.21) and (2.23) that u ∈ W 1,ϕ (Ω), L = ∇u ∈ L ϕ (Ω) and Remark 2.30 (Natural distance). If A satisfies Assumption 2.7 for a balanced N-function ϕ, we see from the previous results that for all u, w ∈ W 1,ϕ (Ω) where the constants depend only on the characteristics of A and ϕ. In the context of p-Laplace problems, the quantity F was introduced in [1], while the last expression equals the quasi-norm introduced in [4] raised to the power ρ = max {p, 2}. We refer to all three equivalent quantities as the natural distance. This name expresses the fact, that the natural distance provides the appropriate error measure for problem (1.1) rather than the W 1,p (Ω)-semi-norm.
2.3. DG spaces, jumps and averages. Let (T h ) h>0 be a family of triangulations of our domain Ω consisting of n-dimensional simplices K. Here, the parameter h > 0, refers to the maximal mesh-size, i.e., if we set h K := diam(K) for all K ∈ T h , then h := max K∈T h h K . For simplicity, we always assume, in the paper, that h ≤ 1. For a simplex K ∈ T h , we denote by ρ K > 0, the supremum of diameters of inscribed balls. We assume that there exists a constant ω 0 > 0, independent of h > 0, such that h K ρ −1 K ≤ ω 0 for all K ∈ T h . The smallest such constant is called the chunkiness of (T h ) h>0 . Note that, in the following, all constants may depend on the chunkiness ω 0 , but are independent of h > 0. For K ∈ T h , let S K denote the neighborhood of K, i.e., the patch S K is the union of all simplices of T h touching K. We assume further for our triangulation that the interior of each S K is connected. Under these assumptions, |K| ∼ |S K | uniformly in K ∈ T h and h > 0, and the number of simplices in S K and patches to which a simplex belongs to are uniformly bounded with respect to K ∈ T h and h > 0. We define the faces of T h as follows: an interior face of T h is the non-empty interior of ∂K ∩ ∂K , where K, K are two adjacent elements of T h . For the face γ := ∂K ∩ ∂K , we use the notation S γ := K ∪ K . A boundary face of T h is the non-empty interior of ∂K ∩ ∂Ω, where K is a boundary element of T h . For the face γ := ∂K ∩ ∂Ω, we use the notation S γ := K. By Γ i h , Γ D , and Γ N , we denote the interior, the Dirichlet and the Neumann faces, resp., and put Γ h := Γ i h ∪ Γ D ∪ Γ N . We assume that each K ∈ T h has at most one face from Γ D ∪ Γ N . We introduce the following scalar products on Γ h if all the integrals are well-defined. Similarly, we define ·, · Γ D , ·, · Γ N and ·, · Γ i h . We extend the notation of modulars to the sets Γ i h , Γ D , and Γ i h ∪ Γ D , i.e., we define For m ∈ N 0 and K ∈ T h , we denote by P m (K), the space of scalar, vector-valued or tensor-valued polynomials of degree at most m on K. Given a triangulation of Ω with the above properties, given an N-function ψ, and given k ∈ N 0 , we define For each face γ ∈ Γ h of a given simplex T ∈ T h , we define this interior trace by tr K γ (w h ) ∈ L ψ (γ). Then, for w h ∈ W 1,ψ (T h ) and interior faces γ ∈ Γ I shared by adjacent elements K − γ , K + γ ∈ T h , we denote by the average and the normal jump, resp., of w h on γ. Moreover, for every w h ∈ W 1,ψ (T h ) and boundary faces γ ∈ Γ D , we define boundary averages and boundary jumps, resp., via where n : ∂Ω → S d−1 denotes the unit normal vector field to Ω pointing outward.
and the (global) jump operator via (2.42) The induced Luxembourg norm of the modular M ψ,h is denoted by · M ψ,h . Note that for every u ∈ W 1,ψ Γ D (Ω), it holds m ψ,h (u) = 0 and M ψ,h (u) = ρ ψ,Ω (∇u), so M ψ,h forms an extension of the modular ρ ψ,Ω (∇ · ) on W 1,ψ Γ D (Ω) to the DG setting. In most cases, we will omit the index

Fluxes and LDG formulations
In order to obtain the LDG formulation of (1.1) for given k ∈ N, we multiply the equations in (1.2) 1 by X h ∈ X k h , Y h ∈ X k h and z h ∈ U k h , resp., use partial integration, replace in the volume integrals the fields u, L, A and G by the discrete objects u h , L h , A h and Π k h G, resp., and in the surface integrals u, A and G by the numerical fluxes 1 Note that we use for discrete gradients the notation from [13], which is different from the one used in [17].
For given boundary data u D : Γ D → R d and a N : Γ N → R d , we denote by u * D ∈ W 1,ϕ (Ω) an extension of u D , which exists if u D belongs to the trace space of W 1,ϕ (Ω), that is characterized in [24]. For the nonlinear operator A : R d×n → R d×n having ϕ-structure, we define for every a ≥ 0 and P ∈ R d×n where ϕ a , a ≥ 0, is the shifted N-function of the balanced N-function ϕ. Using these notions, the numerical fluxes are defined via 2 and resp., where α > 0 is some constant. Note that we actually would like to use the shift |∇u| in the flux (3.4), which apparently is not possible since ∇u is not known a priori.
). Thus, this choice of the flux is the natural DG equivalent of L = ∇u. Moreover, (3.7) also implies that the flux A(u h , A h , L h ) is actually only depending on A h and u h . It is a natural generalization of the corresponding fluxes for the Laplace problem (cf. [2]) and the p-Laplace problem (cf. [11]) taking into account the ϕ-structure of (1.1). Note that it coincides for ϕ(t) = t 2 with a standard flux for the Laplace problem. The usage of the operator is the key to better approximation properties (cf. Theorem 5.10, Corollary 5.22, Remark 5.24). The flux G is designed such that it yields the weak form in (3.6).
Proceeding as in [17] and, in addition, using that Π k h is self-adjoint, we arrive at the flux formulation of (1.1): For given data Next, we want to eliminate in the system (3.6) the variables L h ∈ X k h and A h ∈ X k h to derive a system only expressed in terms of the single variable u h ∈ U k h . To this end, first note that from (3.6), it follows that

7)
We have chosen to formulate the flux in this form for a more compact notation.
If we insert (3.7) and (3.8) into (3.6) 3 , we find that for every z h ∈ U k h , there holds This is the primal formulation of our system. It can be equivalently formulated as where the nonlinear operator 4. Well-posedness, stability and weak convergence In this section, for k ∈ N and α > 0, we establish the existence of a solution of (3.6), (3.9) and (3.10) (i.e., well-posedness), resp., their stability (i.e., an a priori estimate), and the weak convergence of the discrete solutions to a solution of problem (1.1). Our approach extends the treatment in [17]. Although the existence of discrete solution resorts to standard tools, the rigorous argument is involved due to the presence of the |} in the stabilization term. Hence, we present a detailed treatment. We start showing several estimates needed for the existence of discrete solutions and their stability.
with a constant c > 0 depending only on the characteristics of A and ϕ, and the chunkiness ω 0 > 0, and a constant c α > 0 additionally depending on max{1, α}.
Proof. In view of (3.7), we write D )|} to shorten the notation. Resorting to (2.40), for every u h ∈ U k h , we find that Before we estimate the terms I i , i = 3, . . . , 5, we collect the information we obtain from the terms I 1 and I 2 . From Proposition 2.17, for every u h ∈ U k h , it follows that Resorting, again, to (2.40), for every u h ∈ U k h , we observe that which together with the properties of ϕ implies for every The shift change from Lemma 2.15, (3.7), (4.5), the trace inequality (A.23), (4.6), the L ϕ -stability of Π k h in (A.12) with k = 0, and (4.7), for every . From (4.7), (4.8) and the equivalent expression for M ϕ,h in Lemma A.3, it follows that (4.9) Now we can estimate the remaining terms. Using (2.22), the ε-Young inequality (2.2), (2.1), the L ϕ -gradient stability of Π k h in (A.13), and (4.7), we have that and with the stability of R k h in (A.2) and the approximation property of Π k h in (A.21) Choosing first ε > 0 and, then, κ > 0 sufficiently small, we conclude from (4.4), (4.9)-(4.12) that the first inequality in (4.2) applies. Then, the second one follows from the first one and where we used the definition of M ϕ,h in (2.42) and the trace inequality (A.22).
Lemma 4.14. Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Then, for every ε > 0, there exists a constant c ε > 0 such that for every u h ∈ U k h , we have that with a constant c > 0 depending only on α > 0, the characteristics of A and ϕ, and the chunkiness ω 0 > 0.
Proof. In view of (3.7), we write Proceeding as in the proof of Lemma 4.1, for every u h ∈ U k h , we find that (cf. (4.9)) Proceeding as in the estimate (4.12) and, in addition, using the equivalent expression for M ϕ,h in (A.4), for every u h ∈ U k h , we obtain with constants c κ , c ε > 0 depending on α > 0. Choosing first ε > 0 and than κ > 0 sufficiently small, the last three estimates and (4.13) prove the assertion.
. Proof. Using the identities (4.3) and (4.6), for every (4.17) Resorting to the ε-Young inequality (2.2), Poincaré's inequality (A.35) and the approximation property of Π k h in (A.10) in doing so, for every u h ∈ U k h , we find that . Resorting, again, to the ε-Young inequality (2.2), using the equivalent expression for M ϕ,h in (A.4), the stability of R k h in (A.2) and the approximation property of Π k h in (A.21), for every u h ∈ U k h , we deduce that . Finally, we estimate, using the ε-Young inequality (2.2), the trace inequality (A.36), the L ϕ -gradient stability of Π k h in (A.11) and the approximation property of Π k h in (A.21) to conclude for every u h ∈ U k h that . Lemma 4.18. Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Then, for every ε > 0, there exists a constant c ε > 0 such that for every u h ∈ U k h , we have that Proof. The assertion follows, using the same tools as in the proof of Lemma 4.16. Now we have everything at our disposal to prove the existence of discrete solutions and their stability. Proposition 4.19 (Well-posedness). Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Then, for given data u D ∈ tr W 1,ϕ (Ω), g ∈ L ϕ * (Ω), G ∈ L ϕ * (Ω) and a N ∈ L ϕ * (Γ N ) as well as given k ∈ N and α > 0, h > 0, there exist u h ∈ U k h solving (3.9), and (L h , A h ) ∈ X k h × X k h such that (u h , L h , A h ) solves (3.6). Proof. Lemma 4.14 and Lemma 4.18 with ε > 0 small enough yield for every , we find that the right-hand side in (4.20) converges to infinity for u h M ϕ,h → ∞. Thus, Brouwer's fixed point theorem yields the existence of u h ∈ U k h solving (3.9). In turn, we also conclude the existence of L h ∈ X k h and A h ∈ X k h from (3.7) and (3.8), respectively, which shows the solvability of (3.6).
Proposition 4.21 (Stability). Let A satisfy Assumption 2.7 for a balanced Nfunction ϕ. Moreover, let u h ∈ U k h be a solution of (3.9) for α > 0, h > 0, and k ∈ N. Then, it holds with a constant c > 0 depending only on α > 0, the characteristics of A and ϕ, k ∈ N, and the chunkiness ω 0 > 0.
h be a solution of (3.6) for α > 0, h > 0 and k ∈ N. Then, it holds with a constant c > 0 depending only on α > 0, the characteristics of A and ϕ, and the chunkiness ω 0 > 0.

Proof. From (3.7) and (4.6), it follows that
, which together with the properties of ϕ and the equivalent expression for the modular M ϕ,h in Lemma A.4, implies that (4.24)

From (3.8) we know that
(4.25) In addition, using the shift change in (2.14) and (2.20), we find that , which, using (2.1), the discrete trace inequality (A.18), and the L ϕ -stability of Π 0 h in (A.12), implies that h be a solution of (3.6) for α > 0 and k ∈ N. Moreover, let (h n ) n∈N ⊆ R >0 be any sequence such that h n → 0 (n → ∞) and define (u n , L n , A n ) := (u hn , L hn , A hn ) ∈ U k hn × X k hn × X k hn , n ∈ N. Then, it holds where u ∈ W 1,ϕ (Ω) is the unique weak solution of (1.1), i.e., u = u D in L ϕ (Γ D ) and for every z ∈ W 1,ϕ Γ D (Ω), it holds (A(∇u), ∇z) = (g, z) + (G, ∇z) + a N , z Γ N . (4.29) Proof. Resorting to the theory of monotone operators (cf. [31]), we observe that (1.1) admits a unique weak solution, since A is strictly monotone. Thus, to prove the convergences (4.28) for the entire sequence to the unique weak solution u ∈ W 1,ϕ (Ω) of (1.1), it is sufficient to show that each sequence has a subsequence that satisfies (4.28). Then, the assertion for the entire sequence follows from the standard convergence principle. To this end, we adapt Minty's trick to our situation. Appealing to Proposition 4.21, it holds sup n∈N M ϕ,hn (u n − u * D ) < ∞, which, resorting to Lemma A.37, yields a not relabeled subsequence and a functionũ ∈ W 1,ϕ Γ D (Ω) such that (4.30) Therefore, introducing the notation u :=ũ+u * D ∈ W 1,ϕ (Ω), in particular, exploiting that G k hn (u n −u * D ) = L n −∇u * D (cf. (4.6)), (4.30) is exactly (4.28) and yields directly that u = u D in L ϕ (Γ D ). Resorting to Corollary 4.23 and the reflexivity of L ϕ * (Ω), we obtain a further not relabeled subsequence and a function A ∈ L ϕ * (Ω) such that Next, let z ∈ W 1,ϕ Γ D (Ω) be arbitrary and define z n := Π k hn z ∈ V k hn for every n ∈ N. Then, appealing to the convergence properties for Π k hn in Lemma A.27, Corollary A.8, and the trace inequality in (A.36), we obtain for n → ∞ that Apart from that, appealing to Corollary 4.23, we have that Since z n ∈ U k hn is an admissible test function in (3.6), for every n ∈ N, we have that where we used that z ⊗ n = 0 on Γ i h ∪ Γ D . This together with (4.32) and (4.33) yields for n → ∞ that for every z ∈ W 1,ϕ Γ D (Ω) For arbitrary z ∈ W 1,ϕ (Ω), we set z n := Π k hn z ∈ V k hn , n ∈ N. Then, recalling that A n = Π k hn A(L n ) in X k hn , n ∈ N, the monotonicity of A, the self-adjointness of Π k hn , and (4.34), for every n ∈ N and z ∈ W 1,ϕ (Ω), further yield that Moreover, using A {|Π 0 h L h |} (P) : P ≥ 0 for every P ∈ R d×n , we conclude from (4.36) that for every n ∈ N, there holds (4.37) Hence, by passing for n → ∞ in (4.37), taking into account (4.30)-(4.32), the convergence properties of Π k hn in Lemma A.27, Corollary A.8, the trace inequality in (A.36), the estimate (4.33), and the fact that A generates a Nemyckii operator, we conclude for every z ∈ W 1,ϕ (Ω) that where we used for the first equality sign that u−u * D =ũ ∈ W 1,ϕ Γ D (Ω) and, thus, (4.35) applies with z = u − u * D ∈ W 1,ϕ Γ D (Ω). Eventually, choosing z := u ± τz ∈ W 1,ϕ (Ω) in (4.38) for arbitrary τ ∈ (0, 1) andz ∈ W 1,ϕ Γ D (Ω), diving by τ > 0 and passing for τ → 0, for everyz ∈ W 1,ϕ Γ D (Ω), we conclude that (A−A(∇u), ∇z) = 0, which, due to (4.35), implies that u ∈ W 1,ϕ (Ω) satisfies (4.29). We are only aware of the studies [18,16,20,9,27]. Non of these contributions uses DG methods. For the subclass of nonlinear problems of p-Laplace type DG methods have been used in [11,10,17,12,23,25]. In all these investigations, only the case G = 0 is treated. Thus, to the best of the author's knowledge Theorem 4.27 is the first convergence result for a DG scheme in an Orlicz-setting for general right-hand sides g − div G.

Error estimates
In order to establish error estimates, we need to find a system similar to (3.6), which is satisfied by a solution of our original problem (1.1) in the case G = 0. Using the notation L = ∇u, A = A(L), we find that (u, L, A) ∈ W 1,ϕ (Ω) × L ϕ (Ω) × L ϕ * (Ω). If, in addition, A ∈ W 1,1 (Ω), we observe as in [17], i.e., using integration-by-parts, the boundary conditions, the properties of Π k h , the definition of the discrete gradient, and of the jump functional, that Using this and (3.9), we arrive at which is satisfied for all z h ∈ U k h . Before we prove the convergence rate in Theorem 5.10, we derive some estimates. Lemma 5.3. Let A satisfy Assumption 2.7 for a balanced N-function ϕ and k ∈ N. Moreover, let u ∈ W 1,ϕ (Ω) satisfy F(∇u) ∈ W 1,2 (Ω). Then, we have that with a constant c > 0 depending only on the characteristics of A and ϕ, and the chunkiness ω 0 > 0. The same assertion is valid for the Scott-Zhang interpolation operator Π k SZ , defined in [30], where for each face γ ∈ Γ i h ∪ Γ D the set S γ consist of at most two elements K ∈ T h , and the fact that By summation over γ ∈ Γ i h ∪Γ D , using Proposition 2.17 and Lemma A.15 in doing so, we conclude that . This proves the assertion for Π k h . Since the Scott-Zhang interpolation operator has the same properties (cf. [18,17]), the assertion for Π k SZ follows analogously.
Proof. Resorting to Poincaré's inequality on each K ∈ T h and Proposition 2.17, we find that The L ϕ -stability of Π k h in (A.12), Proposition 2.17, and again Poincaré's inequality on each K ∈ T h yield Combining (5.6) and (5.7), we conclude the first assertion. The second assertion follows analogously, if we additionally use Lemma 2.24 and (cf. [17,Lemma 4.4]) Lemma 5.8. Let A satisfy Assumption 2.7 for a balanced N-function ϕ and k ∈ N 0 . Moreover, let X ∈ L ϕ (Ω) and let u ∈ W 1,ϕ (Ω) satisfy F(∇u) ∈ W 1,2 (Ω). Then, we have that with a constant c > 0 depending only on the characteristics of A and ϕ, and the chunkiness ω 0 > 0.
Then, the assertion follows by summing with respect to γ ∈ Γ i h ∪ Γ D and resorting to Lemma 5.5 with j = 0.
Theorem 5.10. Let A satisfy Assumption 2.7 for a balanced N-function ϕ. Moreover, let u ∈ W 1,ϕ (Ω) be a solution of (1.1) which satisfies F(∇u) ∈ W 1,2 (Ω) and let u h ∈ U k h be a solution of (3.9) for α > 0 and k ∈ N. Then, we have that with a constant c > 0 depending only on the characteristics of A and ϕ, k ∈ N, the chunkiness ω 0 > 0, and α −1 > 0.
Proof. To shorten the notation, we use again L h instead of G k h u h +R k h u * D in the shifts.
so the error equation (5.2) gives us (5.11) Due to Proposition 2.17, the first term on the left-hand side of (5.11) is equivalent to while (2.5) yields that the second term on the left-hand side of (5.11) is equivalent to It is important to use that u * D ⊗ n = u ⊗ n on Γ i h ∪Γ D , which allows to replace in (5.13) u h − u * D by u h − u. Thus, (5.12) and (5.13) yield that the left-hand side of (5.11) is bounded from below by To treat the term K 1 , we exploit that, owing to Thus, Proposition 2.17, the ε-Young inequality (2.2) with ψ = ϕ |∇u| , Lemma A.15, and Lemma 5.3 yield that for all ε > 0, there exists c ε > 0 such that (5.17) To treat K 3 , we use the ε-Young inequality (2.2) for ψ = ϕ |∇u| to get 3 Using a shift change in Lemma 2.15, Corollary A.25 with w h = u − Π k h u, Proposition 2.17, Lemma A.15 and Lemma 5.8, we find that All together, choosing first ε > 0 and than κ > 0 small enough, and absorbing the terms with ε and κ in the left-hand side, we conclude the assertion.
Remark 5.24. Let us compare our results in the special case that ϕ possesses (p, δ)-structure and k = 1 with the corresponding ones in [17].
(i) [17, Theorem 4.8, Corollary 4.10 (ii)] provides sub-optimal convergence rates for p ≤ 2 as well as for p ≥ 2, while Theorem 5.10 and Corollary 5.22 prove optimal ones for all p ∈ (1, ∞). We emphasize that in [17], the cases p ≤ 2 and p ≥ 2 are treated differently, while our approach provides a unified treatment.
is the solution of (1.1). In this case, an optimal convergence rate is proved for p ≤ 2, while for p ≥ 2 only sub-optimal results are provided. Inasmuch as the solution u is a priori unknown and, in general, Π k SZ u = u = u D on Γ D , these results are of rather theoretical interest than of interest from the practical computational point of view. (iii) The assertions of Theorem 5.10 and Corollary 5.22 also hold if we replace the extension u * D ∈ W 1,ϕ (Ω) of the boundary datum u D by the approximation Π k SZ u ∈ U k h ∩W 1,ϕ (Ω) of the solution u ∈ W 1,ϕ (Ω), i.e., we define u * D := Π k SZ u. In this case, we obtain Thus, we can, once more, replace if we add on the right-hand side of (5.11) the term Similarly, we get instead of (5.15) which amounts in an additional term Consequently, in the case u * D = Π k SZ u ∈ U k h ∩ W 1,ϕ (Ω), we have to treat additionally the terms in (5.25) and (5.26). However, the Scott-Zhang interpolation operator Π k SZ has similar properties as the projection operator Π k h . Thus, we can handle these terms in the same way as the corresponding terms in the proof of Theorem 5.10. Thus, our approach also gives for u * D = Π k SZ u optimal convergences rates for all balanced N-functions ϕ.

Numerical experiments
In this section, we apply the LDG scheme (3.6) (or (3.9)), described above, to solve numerically the system (1.1) with balanced Orlicz-structure with the nonlinear operator A : R d×d → R d×d , for every P ∈ R d×d defined by A(P) := (δ + |P|) p−2 ln(1 + δ + |P|)P , where δ := 1e−3 and p ∈ (1, ∞), i.e., the operator F : R d×d → R d×d for every P ∈ R d×d is defined by We approximate the discrete solution u h ∈ U k h of the nonlinear problem (3.6) deploying the Newton line search algorithm of PETSc (version 3.16.1), cf. [3], with an absolute tolerance of τ abs = 1e−8 and a relative tolerance of τ rel = 1e−10. The linear system emerging in each Newton step is solved deploying PETSc's preconditioned biconjgate gradient stabilized method (BCGSTAB) with an incomplete LU factorizaton. For the numerical flux (3.4), we choose the parameter α > 0 according to Table 1 Table 1. Choice of the stabilization parameter α > 0. All experiments were carried out using the finite element software package FEniCS (version 2019.1.0), cf. [22]. All graphics are generated using the Matplotlib library (version 3.5.1), cf. [21].
Then, for the resulting series of triangulations T hi , i = 1, . . . , 5, we apply the above Newton scheme to compute the corresponding numerical solutions (u i , As estimation of the convergence rates, the experimental order of convergence (EOC) where for every i = 1, . . . , 5, we denote by e i either e L,i or e ,i , resp., is recorded.
Proof. The first assertion is shown in [17, (A.7)], and the second one follows from the triangle inequality.
Then, the assertion follows by summing with respect to γ ∈ Γ i h ∪ Γ D .