Goal-oriented adaptive finite element method for semilinear elliptic PDEs

We formulate and analyze a goal-oriented adaptive finite element method (GOAFEM) for a semilinear elliptic PDE and a linear goal functional. The strategy involves the finite element solution of a linearized dual problem, where the linearization is part of the adaptive strategy. Linear convergence and optimal algebraic convergence rates are shown.


INTRODUCTION
1.1. Goal-oriented adaptive FEM and contributions. While standard adaptivity aims to approximate the exact solution $u \in H^1_0(\Omega)$ of a suitable PDE at optimal rate in the energy norm (see, e.g., [Dör96; MNS00; BDD04; Ste07; CKNS08] for some seminal contributions and [FFP14] for the present model problem), goal-oriented adaptivity aims to approximate, at optimal rate, only the functional value $G(u) \in \mathbb{R}$ (also called quantity of interest in the literature). Usually, goal-oriented adaptivity is more important in practice than standard adaptivity and has therefore attracted much interest also in the mathematical literature; see, e.g., [BR03; EEHJ95; GS02; BR01] for some prominent works and [KVD19; DBR21; BGIP21; BMZ21] for some recent contributions. Unlike for standard adaptivity, there are only a few works that aim for a mathematical understanding of optimal rates for goal-oriented adaptivity; see [MS09; BET11; FFGHP16; FPZ16] for linear problems with linear goal functional and [BIP21] for a linear problem with nonlinear goal functional. The works [HPZ15; XHYM21] consider semilinear PDEs and linear goal functionals, but only prove convergence, while optimal convergence rates remain open (and can hardly be proved for the proposed algorithms). The present work proves, for the first time, optimal convergence rates for goal-oriented adaptivity for a nonlinear problem. To this end, we see, in particular, that the marking strategy used in [HPZ15; XHYM21] must be modified along the ideas of [BIP21].
The weak formulation of the so-called primal problem (2) reads as follows: Find $u \in H^1_0(\Omega)$ such that (3) holds for all test functions $v \in H^1_0(\Omega)$, where $\langle v, w \rangle := \int_\Omega v w \, dx$ denotes the $L^2(\Omega)$-scalar product and $\langle\!\langle v, w \rangle\!\rangle := \langle A \nabla v, \nabla w \rangle$ is the $A$-induced energy scalar product on $H^1_0(\Omega)$. We stress that existence and uniqueness of the solution $u \in H^1_0(\Omega)$ of (3) follow from the Browder-Minty theorem on monotone operators (see Section 2.4 for details).
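Based on conforming triangulations $\mathcal{T}_H$ of $\Omega$ and fixed polynomial degree $m \in \mathbb{N}$, let $\mathcal{X}_H := \{ v_H \in H^1_0(\Omega) \mid \forall T \in \mathcal{T}_H : v_H|_T \text{ is a polynomial of degree} \le m \}$. Then, the FEM discretization (4) of the primal problem (3) reads: Find $u_H \in \mathcal{X}_H$ such that the weak formulation holds with $u_H$ in place of $u$ for all discrete test functions $v_H \in \mathcal{X}_H$. The finite element method then approximates the sought goal quantity $G(u)$ by means of the computable quantity $G(u_H)$.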
1.3. Error control and GOAFEM algorithm. The optimal error control of the goal error $G(u) - G(u_H)$ involves the so-called (practical) dual problem: Find $z[u_H] \in H^1_0(\Omega)$ such that (5) holds, where $b'(x, t) := \partial_t b(x, t)$. Existence and uniqueness of $z[u_H]$ follow from the Lax-Milgram lemma (see Section 2.5). With the same FEM spaces as for the primal problem, the FEM discretization of the dual problem (5) reads: Find $z_H[u_H] \in \mathcal{X}_H$ such that (6) holds. The notation $z[u_H]$ emphasizes that the dual solution depends on the (exact) discrete primal solution $u_H$ (instead of the practically unavailable exact primal solution $u$); the same holds for the discrete dual solution $z_H[u_H]$.
For this setting, we derive below (see Theorem 7) the goal error estimate (7), where $\lesssim$ denotes $\le$ up to some generic multiplicative constant $C > 0$. The arising error terms are controlled by computable a posteriori error estimates; see (8).

1.4. Contributions. Let $(\mathcal{T}_\ell)_{\ell \in \mathbb{N}_0}$ be the sequence of meshes generated by the adaptive loop (9) with corresponding error estimators $\eta_\ell := \eta_\ell(u_\ell)$ and $\zeta_\ell := \zeta_\ell(z_\ell[u_\ell])$. We prove that the proposed adaptive strategy leads to linear convergence

$\eta_{\ell+n} \, [\eta_{\ell+n}^2 + \zeta_{\ell+n}^2]^{1/2} \le C_{\rm lin} \, q_{\rm lin}^n \, \eta_\ell \, [\eta_\ell^2 + \zeta_\ell^2]^{1/2}$ for all $\ell, n \in \mathbb{N}_0$, (10)

where $C_{\rm lin} > 0$ and $0 < q_{\rm lin} < 1$ are generic constants. Moreover, we prove that this estimator product converges at an algebraic rate,

$\eta_\ell \, [\eta_\ell^2 + \zeta_\ell^2]^{1/2} = \mathcal{O}\big((\#\mathcal{T}_\ell)^{-\alpha}\big)$, (11)

where the rate $\alpha = \min\{2s, s + t\}$ is optimal in the sense that $s > 0$ is any possible rate for $\eta_\ell$ and $t > 0$ is any possible rate for $\zeta_\ell$ (in the sense of the usual approximation classes [CFPP14]). We stress that this is the first optimality result on GOAFEM for a nonlinear model problem. While $\alpha = s + t$ for linear model problems [MS09; FPZ16], the slightly worse rate $\alpha = \min\{2s, s + t\}$ stems from the fact that the adaptive algorithm must also control the linearization of the dual problem. Besides the goal error estimate (7), technical key results also include Pythagoras-type quasi-orthogonalities for the semilinear model problem (2). Finally, we note that our analysis allows us to modify the marking strategies of [HPZ15; XHYM21] to ensure linear convergence as well as $\eta_\ell^2 + \zeta_\ell^2 = \mathcal{O}\big((\#\mathcal{T}_\ell)^{-\alpha}\big)$ with rate $\alpha = \min\{2s, 2t\}$.
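To make the comparison of the three rates mentioned above concrete, the following minimal sketch evaluates them for given primal and dual rates s and t. The function names are hypothetical and chosen only for illustration; the formulas are exactly those stated in the text.

```python
def goafem_rate(s: float, t: float) -> float:
    """Rate alpha = min{2s, s + t} of the estimator product (semilinear setting)."""
    return min(2 * s, s + t)


def linear_goafem_rate(s: float, t: float) -> float:
    """Rate alpha = s + t known for linear model problems [MS09; FPZ16]."""
    return s + t


def product_space_rate(s: float, t: float) -> float:
    """Rate alpha = min{2s, 2t} when marking for the square sum eta^2 + zeta^2."""
    return min(2 * s, 2 * t)


# Example: the dual problem is better approximable than the primal one (s < t).
s, t = 0.5, 1.0
rates = (goafem_rate(s, t), linear_goafem_rate(s, t), product_space_rate(s, t))
```

Since min{2s, s + t} = s + min{s, t}, the semilinear rate coincides with the linear rate s + t whenever t ≤ s, and it is never smaller than min{2s, 2t}.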
1.5. Outline. This work is organized as follows: In Section 2, the analytical preliminaries for the semilinear setting and its linearizations are presented. This includes the precise assumptions on the problem and the right-hand sides as well as well-posedness of the arising continuous and discrete problems. In Section 2.7, the key estimate (7) is proved; cf. Theorem 7. In Section 3, we formulate the GOAFEM algorithm (cf. Algorithm 17), which employs a marking strategy that respects the product structure found in (8). We proceed with stating the main results. First, Theorem 19 shows linear convergence (10) of the proposed algorithm. Second, Theorem 20 shows optimal convergence rates (11). Section 4 is devoted to the proofs of the aforementioned results, which contain the axioms of adaptivity [CFPP14] for the semilinear setting (Section 4.1), a stability result for the linearized dual problem (Section 4.2), which turns out to be important, and the necessary quasi-orthogonalities (Section 4.5). Numerical experiments in Section 5 underline our theoretical findings.

MODEL PROBLEM
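2.1. Assumptions on diffusion coefficient. The diffusion coefficient $A : \Omega \to \mathbb{R}^{d \times d}_{\rm sym}$ satisfies the following standard assumptions: (ELL) For almost all $x \in \Omega$, $A(x) \in \mathbb{R}^{d \times d}_{\rm sym}$ is a symmetric and uniformly positive definite matrix, i.e., the minimal and maximal eigenvalues satisfy $0 < \mu_0 \le \lambda_{\min}(A(x)) \le \lambda_{\max}(A(x)) \le \mu_1 < \infty$. In particular, the $A$-induced energy scalar product $\langle\!\langle v, w \rangle\!\rangle = \langle A \nabla v, \nabla w \rangle$ induces an equivalent norm $|||v||| := \langle\!\langle v, v \rangle\!\rangle^{1/2}$ on $H^1_0(\Omega)$. To guarantee later that the residual a posteriori error estimators are well-defined, we additionally require that $A|_T \in W^{1,\infty}(T)$ for all $T \in \mathcal{T}_0$, where $\mathcal{T}_0$ is the initial triangulation of the adaptive algorithm.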
While (GC) turns out to be sufficient for plain convergence of the later GOAFEM algorithm, we require the following stronger assumption for linear convergence and optimal convergence rates.
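To establish continuity of $(v, w) \mapsto \langle b(v), w \rangle$, we apply the Hölder inequality with Hölder conjugates $1 \le s, s' \le \infty$ to obtain that $|\langle b(v), w \rangle| \le \|b(v)\|_{L^{s'}(\Omega)} \, \|w\|_{L^{s}(\Omega)}$. To guarantee that $\langle b(v), w \rangle < \infty$, condition (GC) has to ensure that the embedding $H^1_0(\Omega) \hookrightarrow L^r(\Omega)$ (14) is continuous for $r = s$ and $r = n s'$.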
(v) The lower bound 2 ≤ n imposed for d ∈ {1, 2, 3} stems from the necessity of a Taylor expansion of the dual problem; cf. (50).
To guarantee later that the residual a posteriori error estimators from (59)-(60) are well-defined, we additionally require that $\boldsymbol{f}|_T, \boldsymbol{g}|_T \in H(\mathrm{div}, T)$ and $\boldsymbol{f}|_T \cdot n, \boldsymbol{g}|_T \cdot n \in L^2(\partial T)$ for all $T \in \mathcal{T}_0$, where $\mathcal{T}_0$ is the initial triangulation of the adaptive algorithm.
Together with the continuity of · , · , we infer that A is well-defined. For x ∈ Ω and ξ 1 , ξ 2 ∈ R with ξ 1 < ξ 2 , the mean value theorem proves that Together with (ELL), we thus see that where the hidden constant depends only on µ 0 from (ELL). This proves that A is strongly monotone and hence, in particular, monotone and coercive. Moreover, the solution u ∈ H 1 0 (Ω) of (3) is necessarily unique. Finally, recall from (CAR) that b is smooth in ξ. Therefore, the mapping is continuous, i.e., A is hemi-continuous. Overall, the Browder-Minty theorem (see, e.g., [Zei90,Theorem 26.A (a)-(c)]) applies and proves that the primal problem (3) admits a (unique) solution u ∈ H 1 0 (Ω). The same argument shows that the discrete primal problem (4) admits a unique solution u H ∈ X H .
2.5. Well-posedness of dual problem and goal error identity. For v, w ∈ H 1 0 (Ω), define Note that B(w, v) : Ω → R ≥0 . If v = u, we introduce the shorthand B(w) := B(w, u). With this notation, the theoretical dual problem reads as follows: To address well-posedness, we show that (GC) implies that Ω |B(w)zv| dx < ∞ for all v, w, z ∈ H 1 0 (Ω). The cases d ∈ {1, 2} are covered, e.g., in [AW15, Lemma A.1]. If d = 3, we exploit (CGC) and apply the same reasoning as for the estimate (15) to obtain that For any approximationz[u H ] ≈ z H ∈ X H , this yields the error identity Hence, we introduce the practical dual problem (5) and its discretization (6), now considered for a general argument: The same arguments as for the theoretical problem (24) apply and prove existence and uniqueness of z[w] ∈ H 1 0 (Ω) and z H [w] ∈ X H . Overall, the error identity (23) for z H = z H [u H ] then takes the following form This identity will be the starting point for proving the goal error estimate (7); see Theorem 7 below for the formal statement.
2.6. Pointwise boundedness of primal and dual solutions. In this section, we prove that imposing regularity assumptions on the right-hand side yields that the exact solution $u$, the solution of the theoretical dual problem, and the solution $z[w]$ of the practical dual problem are bounded in $L^\infty(\Omega)$. For d = 1, this is immediate, since $H^1(\Omega) \subset C(\overline{\Omega})$. For d ∈ {2, 3} and f = 0, we refer to, e.g., [BHSZ11, Theorem 2.2]. These results turn out to be crucial for the goal error estimate (Theorem 7) as well as for the numerical analysis of the proposed adaptive goal-oriented strategy (Algorithm 17). In particular, they also allow one to derive Céa-type estimates for the discrete primal and dual solutions (Propositions 11 and 12).
with a constant C = C(d, p) > 0.
In the same spirit as in Proposition 2, we are able to establish L ∞ -bounds for the solutions of the theoretical and practical dual problems (20a) and (24a).
with a constant C = C(d, p) > 0, which is, in particular, independent of w.
Proof. We argue as for Proposition 2. The case d = 1 follows from the Sobolev embedding. For d ∈ {2, 3} and for λ ≥ 0, we define the test function Using the coercivity assumption (ELL) and testing the weak formulation (20a) with ϕ + λ , we observe that Following the steps of the proof of Proposition 2 (where the latter estimate corresponds to (30)), we conclude the proof forz[w]. The same argument applies for the practical dual problem, where B(w) is replaced by b (w) ≥ 0. This concludes the proof.
2.7. Goal error estimate. The following theorem provides, up to norm equivalence, the formal statement of the goal error estimate (7).
Proof. In the case d = 1, (39) follows from the Sobolev embedding and (ELL). Moreover, note that b(0) = 0 and (17) prove that b(u) , u ≥ 0. Using (ELL), (MON), and the Hölder inequality, we obtain that Arguing as for (32) and applying the Hölder inequality, we see that where C GNS depends only on d and p . With ∇u L 2 (Ω) ≤ µ −1/2 0 |||u|||, this concludes the proof for u. The same argument (based on (4) instead of (3)) applies for u H . Furthermore, the same argument applies also for the dual problems (based on (20) and (24) instead of (3) for the theoretical and practical dual problem, respectively).
respectively. This concludes the proof.
The following lemma is one of the two main ingredients for the proof of Theorem 7.
Proof. We argue as in the proof of [BHSZ11, Theorem 3.4]. With respect to Remark 1, choose s > 1 arbitrarily for d ∈ {1, 2} and s = 2 * for d = 3. In any case, we see that Due to the smoothness assumption (CAR), we may consider the Taylor expansion Since b is smooth and u ∈ L ∞ (Ω), we obtain that where C depends only on u L ∞ (Ω) , and n. Moreover, (GC) allows to bound the remainder term, i.e., for any 0 ≤ τ ≤ 1, it holds that where C depends only on |Ω|, n, and R. The triangle inequality yields that Recall from Remark 1 that H 1 0 (Ω) → L ks (Ω) for all 1 ≤ k ≤ n by choice of s and n. Therefore, the Gagliardo-Nirenberg-Sobolev inequality proves that where the hidden constant depends only on |Ω|, d, u L ∞ (Ω) , n, and R. With with hidden constants Together with (43), this concludes the proof of (42).
The following lemma is the last missing part for establishing Theorem 7.
For the exact primal solution u, we observe that the theoretical dual problem and the practical dual problem coincide, as . Using monotonicity and the definition of the theoretical as well as practical dual problem, we obtain that Since Proposition 6 yields thatz[w] ∈ L ∞ (Ω) independently of w, we can proceed as in Remark 1(i). To this end, we choose s > 1 arbitrarily for d ∈ {1, 2} and s = 2 * for d = 3. Assumption (GC) then yields that It remains to prove that We observe that the change of variables τ → 1 − τ leads to and, hence, We only prove the second inequality of (50), but note that the first estimate follows for τ = 1 by the subsequent arguments: Due to the smoothness assumption (CAR), we may consider the Taylor expansion of Since b is smooth and u ∈ L ∞ (Ω), we obtain that where C depends only on u L ∞ (Ω) , and n. Moreover, (GC) allows us to bound the remainder term, i.e., for any 0 ≤ τ σ ≤ 1, it holds that where C depends only on |Ω|, n, and R. If d ∈ {1, 2}, note that (n − 1)s < ∞. If d = 3, it holds that (n − 1)s < 2 * . Hence, we obtain for all 2 ≤ k ≤ n − 1 that where the hidden constant depends only on norm equivalence |||·||| ∇(·) L 2 (Ω) . Arguing as for (45)-(47) above, we infer that where This concludes the proof.
Proof of Theorem 7. Since Lemma 8 guarantees |||u H ||| ≤ C bnd , we can apply Lemma 9 and Lemma 10 to w = u H to obtain that as well as Combining these estimates with the error identity (25), we prove the error estimate This concludes the proof.
The assumptions of Lemma 9 (resp. Lemma 10) also yield the validity of a Céa-type best approximation property for the discrete primal solution u H ∈ X H (resp. for the discrete dual solutionsz (16) is not Lipschitz continuous.
Proof. The Galerkin orthogonality reads Using (MON) and the Galerkin orthogonality, we observe that where C Céa := 1 + C Lip . This proves (54), where the minimum is attained due to finite dimension of X H .
Let w ∈ H 1 0 (Ω) with |||w||| ≤ M < ∞. Under the assumptions of Lemma 10, it holds that Proof. We prove the statement for the practical dual problem. With minor modifications, the same argument also applies for the theoretical dual problem. We only need to show that the bilinear form of the practical dual problem is continuous and elliptic. Then, by standard theory for Lax-Milgram-type problems, this proves the Céa lemma (57). To this end, we exploit (MON) and obtain that i.e., the bilinear form is elliptic with constant 1. In view of Remark 1, choose t > 1 arbitrarily for d ∈ {1, 2} and t = 2 * and, hence, t = 2 * /(2 * − 2) for d = 3. With (ELL) and (GC), we have that Combining the last two estimates, we prove continuity of the bilinear form with C Céa = C Céa (|Ω|, d, u L ∞ (Ω) , M, n, R, p, f, f , µ 0 ). This concludes the proof.
Remark 13. If it is a priori guaranteed that u H L ∞ (Ω) ≤ C < ∞, then the proofs of Section 2.7 simplify considerably and the use of (GC) can be avoided. By Proposition 2, we infer To establish Lemma 9, recall B(u H ) from (22). The observation (58) together with the smoothness assumption (CAR) yields that Note that (58) also establishes the crucial estimate (50) from Lemma 10 due to the local Lipschitz continuity from (CAR); see [HPZ15, Proposition 1]. However, we stress that already for lowest-order FEM, the validity of a discrete maximum principle requires assumptions on the triangulation which are not imposed for (GC) and usually not met for adaptive mesh refinement.
Remark 14. Note that (CAR) implies only that $b(x, \cdot)$ is locally Lipschitz. If we additionally assume global Lipschitz continuity, i.e., $L' := \sup_{x \in \Omega} \|b'(x, \cdot)\|_{L^\infty(\mathbb{R})} < \infty$, then the strongly monotone operator $\mathcal{A} : H^1_0(\Omega) \to H^{-1}(\Omega)$ from (16) is also Lipschitz continuous with constant $L := \max\{\mu_1, L'\}$. In particular, the problem (3) fits into the framework of the main theorem on strongly monotone operators, and the proof of Lemma 9 becomes trivial. The same applies to the proof of Lemma 10, if $b'$ is globally Lipschitz continuous.

GOAL-ORIENTED ADAPTIVE ALGORITHM AND MAIN RESULTS
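3.1. Mesh refinement. From now on, let $\mathcal{T}_0$ be a given conforming triangulation of $\Omega$. For mesh refinement, we employ newest vertex bisection (NVB); see [Ste08]. Here, $\mathbb{T}(\mathcal{T}_H)$ denotes the set of all conforming triangulations that can be obtained from $\mathcal{T}_H$ by NVB, and $\mathbb{T} := \mathbb{T}(\mathcal{T}_0)$. Throughout, each triangulation $\mathcal{T}_H \in \mathbb{T}$ is associated with the finite-dimensional FEM space $\mathcal{X}_H \subseteq H^1_0(\Omega)$ from the introduction, and, since we employ NVB, $\mathcal{T}_h \in \mathbb{T}(\mathcal{T}_H)$ implies nestedness $\mathcal{X}_H \subseteq \mathcal{X}_h$.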
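The nestedness of meshes under bisection can be illustrated in the simplest case d = 1, where refinement reduces to bisecting marked intervals. The following is only a sketch under the assumption that a 1D mesh is stored as a sorted list of node coordinates; the function name `bisect_marked` is hypothetical, and NVB for d ≥ 2 (which needs reference edges and a closure step) is not covered.

```python
def bisect_marked(nodes, marked):
    """Bisect the marked elements of a 1D mesh.

    nodes  -- sorted list of node coordinates; element i is [nodes[i], nodes[i + 1]]
    marked -- iterable of element indices to be refined
    Returns the sorted node list of the refined mesh.
    """
    marked = set(marked)
    refined = [nodes[0]]
    for i in range(len(nodes) - 1):
        left, right = nodes[i], nodes[i + 1]
        if i in marked:
            refined.append(0.5 * (left + right))  # new node at the element midpoint
        refined.append(right)
    return refined


coarse = [0.0, 0.5, 1.0]
fine = bisect_marked(coarse, marked=[0])
```

Every coarse node is kept in the refined mesh, so each piecewise polynomial on the coarse mesh is also piecewise polynomial on the fine mesh, which is the 1D analogue of the nestedness of the discrete spaces.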

3.2. A posteriori error estimators. For $\mathcal{T}_H \in \mathbb{T}$ and $T \in \mathcal{T}_H$, let (59)-(60) be the local contributions of the standard residual error estimators for the primal and the dual problem, where $[\![\cdot]\!]$ denotes the jump across edges (for d = 2) resp. faces (for d = 3) and $n$ denotes the outer unit normal vector. For d = 1, these jumps vanish, i.e., $[\![\cdot]\!] = 0$. For $\mathcal{U}_H \subseteq \mathcal{T}_H$, the estimators on $\mathcal{U}_H$ are defined by square summation of the local contributions over all $T \in \mathcal{U}_H$. The next result establishes that the error estimators (59)-(60) satisfy the following slightly relaxed axioms of adaptivity from [CFPP14]. Compared to [CFPP14], stability (A1) is slightly relaxed and reduction (A2) is simplified due to the nestedness of the discrete spaces. Furthermore, we note that well-posedness of (59)-(60) requires additional regularity assumptions on $A$, $\boldsymbol{f}$, and $\boldsymbol{g}$ (as stated in Sections 2.1 and 2.3) so that the jump terms are well-defined.
(A2) reduction: With $0 < q_{\rm red} := 2^{-1/(2d)} < 1$, there holds that, for all $v_H \in \mathcal{X}_H$ and all $w \in H^1_0(\Omega)$,
(A4) discrete reliability: For all $w \in H^1_0(\Omega)$, there exists $C_{\rm drel} > 0$ such that
The constant $C_{\rm rel}$ depends only on $d$, $\mu_0$, and uniform shape regularity of the meshes $\mathcal{T}_H \in \mathbb{T}$. $C_{\rm drel}$ depends additionally on the polynomial degree $m$, and $C_{\rm stab}[M]$ depends furthermore on $|\Omega|$, $M$, $n$, $R$, and $A$.

3.3. Goal-oriented adaptive algorithm. The following algorithm essentially coincides with that of [HPZ15]. Following [BIP21], we adapt the marking strategy to mathematically guarantee optimal convergence rates.
We stress that (62) is an immediate consequence of the goal error estimate (38) from Theorem 7 and reliability (A3), i.e., Consequently, only the convergence (63) of Proposition 18(ii) has to be proven. Replacing the assumption (GC) on the nonlinearity by the stronger assumption (CGC), we even get linear convergence, which improves Proposition 18(ii).
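To formulate our main result on optimal convergence rates, we need some additional notation. For $N \in \mathbb{N}_0$, let $\mathbb{T}_N := \{ \mathcal{T} \in \mathbb{T} \mid \#\mathcal{T} - \#\mathcal{T}_0 \le N \}$ denote the (finite) set of all refinements of $\mathcal{T}_0$ which have at most $N$ elements more than $\mathcal{T}_0$. For $s, t > 0$, we define the approximation norms $\|u\|_{\mathbb{A}_s}$ and $\|z\|_{\mathbb{A}_t}$ as in [CFPP14], i.e., as the supremum over $N \in \mathbb{N}_0$ of $(N+1)^s$ resp. $(N+1)^t$ times the minimal value of the primal estimator $\eta$ resp. the dual estimator $\zeta$ that is attainable on $\mathbb{T}_N$. In explicit terms, e.g., $\|u\|_{\mathbb{A}_s} < \infty$ means that an algebraic convergence rate $\mathcal{O}(N^{-s})$ for the error estimator $\eta$ is possible, if the optimal triangulations are chosen.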
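In experiments, an algebraic rate of this kind is usually read off from a log-log plot of the estimator against the number of elements. The sketch below fits such a rate by least squares; it is only a diagnostic and not the definition of the approximation classes. The function name `observed_rate` is hypothetical, and the data are synthetic (generated with a known rate 1/2), not results from this work.

```python
import math


def observed_rate(num_elements, estimator_values):
    """Negative least-squares slope of log(estimator) versus log(#elements)."""
    xs = [math.log(n) for n in num_elements]
    ys = [math.log(e) for e in estimator_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope


# Synthetic data with estimator = 3 * N^(-1/2); purely illustrative.
ns = [2 ** k for k in range(4, 12)]
etas = [3.0 * n ** (-0.5) for n in ns]
rate = observed_rate(ns, etas)
```

For data that decay exactly like a power of the number of elements, the fitted value reproduces the exponent up to rounding; for real computations, the pre-asymptotic range should be excluded from the fit.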
In comparison to [HPZ15] or [XHYM21], our proof of Theorem 19 avoids any $L^\infty$-bounds on the discrete solutions as well as the assumption that the initial mesh is sufficiently fine. Moreover, in contrast to [XHYM21], which proves linear convergence for the marking strategy suggested in [HPZ15] (and a multilevel correction step), we even prove optimal convergence rates without assuming Lipschitz continuity for the primal and dual operators.
Remark 21. Compared to the treatment of linear problems in [FPZ16], the marking strategy considers the combined error estimator due to the structure from (7). In addition, the proofs of the essential quasi-orthogonalities are more involved, both in the semilinear primal setting and for the combined error estimator. The combined estimator is also a key ingredient of the analysis in [BIP21]; we note that its use amounts to a slight modification of the marking strategies of [HPZ15; XHYM21], which allows us to prove convergence rates.

Remark 22. With the estimate $\eta_\ell(u_\ell) \, [\eta_\ell(u_\ell)^2 + \zeta_\ell(z_\ell[u_\ell])^2]^{1/2} \le \eta_\ell(u_\ell)^2 + \zeta_\ell(z_\ell[u_\ell])^2$, one can also consider Algorithm 17 with $\mathcal{M}_\ell := \mathcal{M}_\ell^{uz}$, which then takes the form of the standard AFEM algorithm (see, e.g., [CFPP14]) for the product space estimator. Then, Theorems 19 and 20 hold accordingly with the product replaced by the square sum and $\alpha = \min\{2s, 2t\}$, which is slightly worse than the rate $\alpha$ from Theorem 20. We refer to [BIP21] for details (in a different, but structurally similar setting).

PROOFS
In this section, we give the proofs of Propositions 15 and 18 as well as Theorems 19 and 20.

4.1. Axioms of adaptivity.
In this section, we sketch the proof of Proposition 15 and verify that the residual error estimators from Section 3.2 satisfy the (relaxed) axioms of adaptivity (A1)-(A4) from [CFPP14]. As usual for nonlinear problems, only the verification of stability (A1) requires new ideas, while (A2)-(A4) follow from standard arguments. For a triangulation T h ∈ T and an element T ∈ T h , let E(T ) be the set of its facets (i.e., nodes for d = 1, edges for d = 2, and faces for d = 3, respectively). Moreover, let denote the usual element patch. Recall that (RHS) ensures that the error estimators (59)-(60) are well-defined. To abbreviate notation, we define the primal and dual residuals for all v H ∈ X H and w ∈ H 1 0 (Ω). We stress that we do not explicitly state the dependence of the constants on the γ-shape regularity constant.
To prove stability (A1), we need the following auxiliary result: Proof. Similarly to the Taylor expansion in (44), it holds that This yields that .
With Lemma 23 at hand, stability (A1) follows as for a linear model problem [CKNS08].
Proof of stability (A1) for primal problem. With the primal residual R(v H ) from (67a), the refinement indicators read

Elementary calculus proves that
Recalling the definition of D(δ h ), we see that For the first term in (71), we use the product rule and an inverse inequality to see that where : denotes the Frobenius scalar product on R d×d and D 2 δ h is the Hessian of δ h . The jump term in (70) can be estimated by a discrete trace inequality: Collecting (70)-(73), we obtain that where the hidden constant depends only on the shape regularity of T h , and the polynomial degree m of the ansatz spaces. Together with Lemma 23, this yields that The hidden constant depends only on |Ω|, the shape regularity of T h , d, m, M , n, R, µ 0 , and A. Note that for any non-refined element This concludes the proof.
Proof of stability (A1) for dual problem. With the dual residual R * (w; v H ) from (67b), the refinement indicators read Observe that similar arguments as for the proof of stability (A1) of the primal problem lead to and, hence, Here, we only estimate the term b (w)δ h L 2 (Ω) , since the other terms follow from the arguments provided for the primal problem. To this end, choose 2 < ρ < ∞ arbitrarily if d ∈ {1, 2}. If d = 3, let ρ = 3 and, hence, ρ = 3/2. Assumption (CGC) guarantees that the Sobolev embedding (14) holds with r = 2ρ and r = 2(n − 1)ρ simultaneously. Therefore, we obtain that Arguing as for the primal problem, we see that The hidden constant depends only on |Ω|, the shape regularity of T h , d, the polynomial degree m of the ansatz spaces, M , n, R, µ 0 , and A. This concludes the proof.
The proof of reduction (A2) resembles the linear case in [CKNS08].
Proof of reduction (A2). For T ∈ T H \T h , let T h | T := T ∈ T h | T ⊆ T denote the set of its children. Note that NVB guarantees that Recall that . Applying the bisection estimate (74), we obtain that For the first term, it holds that For the second term, note that v H ∈ X H is a coarse-mesh function and, hence, smooth in the interior of T ∈ T H . Hence, all jumps in the interior of T ∈ T H vanish. This leads to Altogether, we conclude reduction (A2) for the primal estimator The same arguments apply for the dual estimator.
Sketch of proof of discrete reliability (A4). To prove discrete reliability (A4), we choose I H as the Scott-Zhang projector [SZ90] for d ∈ {2, 3}, which is a Clément-type quasiinterpolation operator, and note that I H can be chosen in such a way that [CKNS08]. Standard arguments then show that The hidden constants depend only on the dimension d, the polynomial degree m, and norm equivalence. This concludes the proof of discrete reliability (A4).
Proof. First, note that We aim to prove that To this end, note that the strategy in the proof of Proposition 10 provides a similar estimate to (52) by choosing t from Remark 1(ii) instead of s from Remark 1(i), i.e., with C dual = C dual (|Ω|, d, u L ∞ (Ω) , M, n, R, p, f, f , g, g, µ 0 ) > 0. The Hölder inequality leads us to  Then, for any choice of the marking parameters 0 < θ ≤ 1 and 1 ≤ C mark ≤ ∞, Algorithm 17 guarantees that as → ∞. Moreover, at least one of these two cases is met.
Sketch of proof. The proof is essentially verbatim to that of [BIP21, Proposition 14] and therefore only sketched. From the Céa lemma (54) for the primal problem (resp. (57) for the dual problem), the nestedness X ⊆ X +1 of the discrete spaces for all ∈ N 0 , and the stability of the dual problem (Lemma 24), it follows that there exist a priori limits Together with stability (A1) and reduction (A2), the estimator reduction principle proves that Clearly, at least one of these two cases is met. With reliability (A3), it follows that Proposition 14] for details.
Proof of Proposition 18. The proof is verbatim that of [BIP21, Proposition 1] and therefore only sketched. From (A1)-(A3), the Céa lemma (54) for the primal problem (resp. (57) for the practical dual problem), and the nestedness of the discrete spaces, there follows boundedness see [BIP21, Section 4.1] for details. Together with the convergence results of Lemma 25, this yields convergence This concludes the proof.
4.4. Auxiliary results. We continue with some preliminary results, which are needed for proving the quasi-orthogonalities and which are, hence, crucial to prove linear convergence. To this end, consider the Fréchet derivative of A at w ∈ H 1 0 (Ω), i.e., Proof. Due to the linearity of · , · in the left-hand argument, we conclude that the only contribution is due to b, i.e., for all v ∈ H 1 0 (Ω), it holds that For v ∈ H 1 0 (Ω), the Hölder inequality with arbitrary 1 < s < ∞ if d ∈ {1, 2} and s = 2 * if d = 3 proves that From the Taylor expansion (44), note that Together with Lemma 8 and u ∈ L ∞ (Ω), the assumption (GC) yields that where the hidden constants depend only on C bnd from Lemma 8, n, R from (GC) and norm equivalence. This concludes the proof.
The next lemma is an auxiliary result for establishing quasi-orthogonality. Our proof combines arguments from the linear setting [BHP17, Lemma 17] with ideas from [FFP14, Lemma 6.10]. We stress that the proof exploits the a priori convergence $|||u - u_\ell||| \to 0$ from Lemma 25.
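Proof. We only show the statement for $e_\ell$. The proof for $E_\ell$ follows by similar arguments.
To prove that $e_\ell \rightharpoonup 0$ in $H^1_0(\Omega)$, we show that each subsequence $(e_{\ell_k})_{k \in \mathbb{N}_0}$ admits a further subsequence $(e_{\ell_{k_j}})_{j \in \mathbb{N}_0}$ such that $e_{\ell_{k_j}} \rightharpoonup 0$ as $j \to \infty$. To this end, consider a subsequence $(e_{\ell_k})_{k \in \mathbb{N}_0}$ of $(e_\ell)_{\ell \in \mathbb{N}_0}$. Without loss of generality, we may assume that $e_{\ell_k} \neq 0$ for all $k \in \mathbb{N}_0$. Note that $|||e_{\ell_k}||| \le 1$. Hence, the Banach-Alaoglu theorem yields a further subsequence $(e_{\ell_{k_j}})_{j \in \mathbb{N}_0}$ satisfying weak convergence $e_{\ell_{k_j}} \rightharpoonup w_\infty \in H^1_0(\Omega)$ as $j \to \infty$. It remains to show that $w_\infty = 0$. Lemma 25 implies that $u \in \mathcal{X}_\infty$ and, hence, $e_\ell \in \mathcal{X}_\infty$ for all $\ell \in \mathbb{N}_0$. Mazur's lemma (see, e.g., [FK80, Theorem 25.2]) yields that $w_\infty \in \mathcal{X}_\infty$.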
First, the Galerkin orthogonality shows that Letting j → ∞, we infer that where the hidden constant depends only on C Lip . Hence, we get that Moreover, Lemma 26 and the triangle inequality lead to Together with a priori convergence |||u − u k j ||| → 0, we thus obtain that Note that due to (ELL) and (MON), A [u]( · ) is bounded from below, i.e., Due to the smoothness of ξ → b(t, ξ) and the L ∞ -bound for u from Proposition 2, we infer that 0 ≤ b (u) ≤ C. Hence, A [u]( · ) is a bounded linear operator and the restriction A [u]( · )| X∞ : X ∞ → X * ∞ is an isomorphism. Consequently, also the adjoint (A [u]| X∞ ) * : X * ∞ → X ∞ is an isomorphism, where we note that X ∞ is a closed subspace of the Hilbert space H 1 0 (Ω) and, hence, reflexive. Hence, for every v ∞ ∈ X ∞ , there exists v ∞ ∈ X ∞ such that This shows that w ∞ = 0 and concludes the proof.
4.5. Quasi-orthogonalities. Our proof of the crucial quasi-orthogonalities adapts that of [BHP17, Lemma 17, 18] from the linear setting in the Lax-Milgram framework to the present nonlinear setting. However, we stress that the following results all need the stronger growth condition (CGC), while our earlier results only require (GC).
While Lemma 25 guarantees a priori convergence $|||u - u_\ell||| \to 0$ of the primal problem, a priori convergence of the dual problem has to be assumed (and depends on the marking steps).
Remark 32. If d > 3, the same reasoning using the Hölder inequality still holds true, though the polynomial degree n in (CGC) becomes more constrained.

NUMERICAL EXPERIMENTS
In this section, we test and illustrate Algorithm 17 with numerical experiments for d = 1 and d = 2. We consider equation (2), where A = 1. The adaptivity parameter is set to θ = 0.5. We compare the proposed GOAFEM (Algorithm 17) with standard AFEM (adapted from, e.g., [CFPP14; CKNS08]), where mesh-refinement is driven by the primal estimator (i.e., Algorithm 17 with M := M u in step (v)) and standard AFEM driven by the product space estimator (see Remark 22).
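For orientation, the marking used by standard AFEM can be sketched as plain Dörfler marking with parameter θ applied to one list of squared refinement indicators. This is a minimal sketch and not a transcription of Algorithm 17: the function name `doerfler_mark` and the sample indicator values are hypothetical, minimal cardinality is enforced by sorting, and the combined strategy of the proposed GOAFEM is not reproduced here.

```python
def doerfler_mark(indicators_sq, theta):
    """Return a minimal-cardinality index set M such that
    theta * sum(indicators_sq) <= sum(indicators_sq[i] for i in M)."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must satisfy 0 < theta <= 1")
    total = sum(indicators_sq)
    # Greedy choice of the largest indicators yields minimal cardinality.
    order = sorted(range(len(indicators_sq)),
                   key=lambda i: indicators_sq[i], reverse=True)
    marked, accumulated = [], 0.0
    for i in order:
        if accumulated >= theta * total:
            break
        marked.append(i)
        accumulated += indicators_sq[i]
    return marked


# Hypothetical squared indicators on four elements.
eta_sq = [0.1, 0.4, 0.05, 0.45]   # primal estimator contributions
zeta_sq = [0.3, 0.0, 0.6, 0.1]    # dual estimator contributions

marked_primal = doerfler_mark(eta_sq, theta=0.5)
marked_sum = doerfler_mark([e + z for e, z in zip(eta_sq, zeta_sq)], theta=0.5)
```

In this toy example, marking by the primal indicators alone selects different elements than marking by the square sum, which also sees the elements where the dual estimator is large.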
The implementation of conforming finite elements of order $m \in \{1, 2, 3, 4\}$ is done using Legendre polynomials and Gauss-Legendre quadrature, where Gauss-Jacobi quadrature is used on the interval containing the left endpoint of the domain. For mesh refinement, 1D bisection is used. Moreover, we employ the (damped) Newton method from [AW15, Section 3] for step (i) in Algorithm 17 to approximate the nonlinear primal problem. Let $g = x^{-9/20} \in L^2(\Omega)$ and $\boldsymbol{g} = 0$ serve as the goal functions. As a reference, we use the value of the integral $G(u) = \int_0^1 \frac{\sin(\pi x)}{x^{9/20}} \, dx \approx 0.95925303932778833\ldots$.
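The reference value of the integral of sin(πx) x^(-9/20) over (0, 1) can be cross-checked independently of any finite element code. The sketch below is an assumption-laden illustration rather than the quadrature used in the experiments: instead of Gauss-Jacobi quadrature, it removes the endpoint singularity by the substitution x = t^(20/11) and then applies a composite Simpson rule. The function name and the default number of panels are hypothetical.

```python
import math


def reference_goal_value(n_panels: int = 2000) -> float:
    """Approximate the integral of sin(pi*x) * x**(-9/20) over (0, 1).

    With x = t**p and p = 20/11, one obtains x**(-9/20) dx = p dt, so the
    transformed integrand p * sin(pi * t**p) is bounded on [0, 1].
    """
    if n_panels % 2 != 0:
        raise ValueError("composite Simpson rule needs an even number of panels")
    p = 20.0 / 11.0

    def integrand(t: float) -> float:
        return p * math.sin(math.pi * t ** p)

    h = 1.0 / n_panels
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n_panels):
        weight = 4.0 if i % 2 == 1 else 2.0  # Simpson weights 1, 4, 2, ..., 4, 1
        total += weight * integrand(i * h)
    return total * h / 3.0


value = reference_goal_value()
```

The substitution plays the same role as the Jacobi weight in Gauss-Jacobi quadrature: it absorbs the algebraic singularity at the left endpoint, so that a standard rule for bounded integrands can be applied.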
In Figure 2, for polynomial degrees m = 1, 2, we compare the goal value calculated with the proposed GOAFEM algorithm to the goal evaluation of the standard AFEM implementation and of AFEM using $\eta_\ell^2 + \zeta_\ell^2$ as a marking criterion (AFEM+). In all cases, we employ an (undamped) Newton iteration following [AW15, Section 3]. The reference goal value G(u) = −0.001584951808832 is obtained by extrapolation from the calculated goal values using GOAFEM with m = 2. For m = 1, an example of the meshes generated by GOAFEM (Algorithm 17) is shown in Figure 3a, by the standard AFEM algorithm in Figure 3b, and by AFEM+ in Figure 3c. One clearly sees that, for GOAFEM and AFEM+, the singularities of both the primal and the dual problem are resolved, whereas for standard AFEM only those of the primal problem are taken into account. The meshes for m = 2 look similar (not displayed). In particular, GOAFEM and AFEM+ lead to similar results in practice, although AFEM+ is slightly inferior from the point of view of theory (see Remark 22).