Asymptotically exact a posteriori error estimates for the BDM finite element approximation of mixed Laplace eigenvalue problems

We derive optimal and asymptotically exact a posteriori error estimates for the approximation of the eigenfunction of the Laplace eigenvalue problem. To do so, we combine two results from the literature. First, we use the hypercircle techniques developed for mixed eigenvalue approximations with Raviart-Thomas finite elements. In addition, we use the post-processings introduced for the eigenvalue and eigenfunction based on mixed approximations with the Brezzi-Douglas-Marini finite element. To combine these approaches, we define a novel additional local post-processing for the fluxes that appropriately modifies the divergence without compromising the approximation properties. Consequently, the new flux can be used to derive optimal and asymptotically exact upper bounds for the eigenfunction, and optimal upper bounds for the corresponding eigenvalue. Numerical examples validate the theory and motivate the use of adaptive mesh refinement.

For the analysis concerning stability, convergence properties and a priori error estimates, see [3,9].
Since, in general, one cannot assume high regularity of the eigenfunctions on arbitrary domains [27], the need for an adaptive mesh refinement strategy is evident. Central to this approach is the derivation of an efficient and reliable a posteriori error estimator, as developed for finite element methods in general [1,35], and for eigenvalue problems in particular in [18].
In this work we consider the Laplace eigenvalue problem and approximate it using a mixed method. Several examples of this approach can be found in the literature, see [10,17,24,26,30], where adaptivity by means of residual error estimators (using an H(div) × L² norm analysis) is discussed. We particularly want to refer to [14], where a unified framework for (guaranteed) a posteriori bounds (using a proper discrete H¹ energy norm) and a detailed overview of the literature are presented. A fundamental observation when using a mixed method is that it gives access to the hypercircle theory, see [28,32], eventually leading to asymptotically exact upper bounds and local efficiency. However, unlike for standard source problems, see [13,19,23,25,36], a more involved approach is needed, since the orthogonality of the corresponding errors is no longer exactly satisfied.
For eigenvalue problems this was first introduced in [6] by means of the Raviart-Thomas finite element. To discuss details, note that we have the error identity (1), where λ, u, σ are the exact eigenvalue, eigenfunction and its gradient, λ_h, u_h, σ_h are the corresponding approximations, and u_h^{**} denotes an H¹-conforming post-processed function of u_h. The first term on the right-hand side of (1) is computable and can therefore be used to define an a posteriori estimator η. The astonishing observation in [6] was that, in the case of an approximation using the Raviart-Thomas finite element, the second term 2(σ_h − σ, ∇(u − u_h^{**})) converges with higher order. Consequently, η is an asymptotically exact upper bound for the errors on the left-hand side of (1).
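For orientation, recall the classical Prager–Synge hypercircle identity for the source problem, a standard result underlying the hypercircle theory of [28,32]; identity (1) can be read as its eigenvalue-problem analogue, perturbed because the exact datum λu is only known approximately:

```latex
% Prager–Synge: let u ∈ H^1_0(Ω) solve −Δu = f. Then for every
% v ∈ H^1_0(Ω) and every τ ∈ H(div,Ω) with −div τ = f:
\|\nabla u - \nabla v\|_0^2 + \|\nabla u - \tau\|_0^2
  = \|\nabla v - \tau\|_0^2 ,
% since the mixed term vanishes after integration by parts:
% (\nabla(v-u), \nabla u - \tau) = -(v-u, \operatorname{div}(\nabla u - \tau)) = 0.
% In the eigenvalue setting one takes v = u_h^{**} and τ = σ_h, but
% −div σ_h = λ_h u_h ≠ λu, so this orthogonality is perturbed and the
% term 2(σ_h − σ, ∇(u − u_h^{**})) appears in (1).
```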
The question was whether the same ideas can be applied when using the Brezzi-Douglas-Marini (BDM) finite element instead. Surprisingly, as observed in [5] this is not the case. However, in [4] (using ideas from [21]) the authors were able to derive optimal upper bounds but with unknown constants (in contrast to the asymptotic bounds provided by η above).
The goal of this work is to derive asymptotically exact upper bounds (for the eigenfunction) as in [6] when using the BDM finite element method. For this we use the post-processing techniques for the eigenvalue and the eigenfunction from [4], and consider modifications of the approaches from [6]. We introduce an additional (local) post-processing for the flux variable σ_h, where we correct its divergence to fit the additional term in (1), which consequently converges again with higher order. Note that the proposed method is defined for all polynomial orders k ≥ 0, but the convergence results are only improved (compared to the Raviart-Thomas finite element) for k ≥ 1, see Remark 2.
The rest of the paper is organized as follows. Section 3 discusses the problem setting and its approximation. In Sect. 4 we present the local post-processing techniques for the eigenfunction and the eigenvalue. The main results are then discussed in Sect. 5: we first recapitulate the standard a posteriori error analysis based on (1) and reveal its breakdown due to the slow convergence of the additional terms, and then introduce the novel post-processing of the flux and derive the asymptotically exact upper bound. In Sect. 6 we present two numerical examples to validate our findings. The appendix collects some additional results needed in the analysis.

Notation
We use the established notation for Sobolev spaces, i.e. L²(Ω), H¹(Ω) and H(div, Ω) for a given domain Ω. An additional zero subscript (for the latter two) indicates a vanishing trace. Further, H¹(Ω, R^d) (and similarly for other spaces) denotes the corresponding vector-valued version with d components. For ω ⊂ Ω we use (·,·)_ω and ‖·‖_{0,ω} for the inner product and the norm on L²(ω), respectively, and |·|_{s,ω} for the standard Sobolev seminorm of order s. If ω = Ω we omit the additional subscript. We write A ≲ B when there is a positive constant C, independent of the mesh parameter h (see below), such that A ≤ C B. Analogously we define A ≳ B.

Problem setting
Let Ω ⊂ R^d be a polygon or polyhedron for d = 2, 3, respectively. We consider the mixed formulation of the Laplace eigenvalue problem with homogeneous Dirichlet boundary conditions, i.e. we want to find λ ∈ R, u ∈ L²(Ω) and σ ∈ H(div, Ω) such that (2) holds. We approximate (2) by a mixed method using the BDM finite element for the approximation of σ and a piece-wise polynomial approximation of u. To this end, let T_h be a regular triangulation of Ω into triangles and tetrahedra in two and three dimensions, respectively. Let k ≥ 1 be a fixed integer (see Remark 2 for a comment regarding the lowest order case). We introduce the spaces U_h and Σ_h, where P_l(K) denotes the space of polynomials of order l ≥ 0 on K, and P_l(K, R^d) denotes the corresponding vector-valued version. An approximation of (2) then seeks λ_h ∈ R, u_h ∈ U_h and σ_h ∈ Σ_h such that (3) holds. The review article [9] (for example) states that problem (3) defines a good approximation of the continuous eigenvalue problem (2), in the sense that it does not produce any spurious modes and that eigenfunctions are approximated with the proper multiplicity. The approximation results are summarized in the following. To this end, let s > 1/2 and let (λ, u, σ) be a solution of the eigenvalue problem (2) with the regularity u ∈ H^{1+s}(Ω) and σ ∈ H(div, Ω) ∩ H^s(Ω, R^d) (for the regularity results see [20,22]). Then there exists a discrete solution of (3) satisfying the estimates (4), see [3,9], where r = min{s, k + 1}, r̄ = min{s, k + 2}, and h = max_{K∈T_h} h_K is the mesh size. If s is large enough we have r̄ = r + 1. The above estimates follow from the abstract theory of [9], [17] and [30], and the approximation results for the source problem, see [10]. It is worth mentioning that the constants in (4) are non-trivial: beside the discrete stability constants of (3), they depend in particular on the spectrum of the solution operator associated with the continuous eigenvalue problem (2).
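Consistent with the references (2a), (2b), (3a) and (3b) used below (in particular with the strong identity −div σ_h = λ_h u_h), the standard form of the continuous and discrete problems reads as follows; this is a reconstruction from the surrounding text, with the polynomial orders implied by the statement that σ_h · n has degree k + 1:

```latex
% Spaces: U_h = {v ∈ L²(Ω) : v|_K ∈ P_k(K) ∀K ∈ T_h},
%         Σ_h = {τ ∈ H(div,Ω) : τ|_K ∈ P_{k+1}(K, ℝ^d) ∀K ∈ T_h}.
% Continuous problem (2): find (λ, u, σ) ∈ ℝ × L²(Ω) × H(div,Ω):
\begin{aligned}
(\sigma, \tau) + (u, \operatorname{div} \tau) &= 0
  &&\forall \tau \in H(\operatorname{div},\Omega), &&\text{(2a)}\\
(\operatorname{div} \sigma, v) &= -\lambda (u, v)
  &&\forall v \in L^2(\Omega). &&\text{(2b)}
\end{aligned}
% Discrete problem (3): find (λ_h, u_h, σ_h) ∈ ℝ × U_h × Σ_h:
\begin{aligned}
(\sigma_h, \tau_h) + (u_h, \operatorname{div} \tau_h) &= 0
  &&\forall \tau_h \in \Sigma_h, &&\text{(3a)}\\
(\operatorname{div} \sigma_h, v_h) &= -\lambda_h (u_h, v_h)
  &&\forall v_h \in U_h. &&\text{(3b)}
\end{aligned}
% Since div Σ_h ⊆ U_h, (3b) gives −div σ_h = λ_h u_h pointwise.
```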
In addition we have the estimate (5). Here [[·]] denotes the standard jump operator, F_h the set of facets of the triangulation T_h, and h_F the diameter of a facet F ∈ F_h. Note that the above results demand a careful choice of the approximated eigenfunction u_h and the approximated gradient σ_h. An example, well established in the literature, is given by the normalization ‖u_h‖_0 = ‖u‖_0 = 1 and by choosing the sign such that (u, u_h) > 0. Note that this also fixes σ and σ_h by (2a) and (3a), respectively. For simplicity, we assume for the rest of this work that λ is a simple eigenvalue and that the above choice of sign and scaling of the continuous and discrete eigenfunctions is applied. Further, for simplicity, we will call (λ, u, σ) the solution of (2), keeping in mind that a different scaling and sign could be chosen.

Remark 1
The case of eigenvalues with higher multiplicity demands more care, particularly if an a posteriori analysis is considered. We particularly want to refer to [11], where the authors considered eigenvalue clusters using a mixed formulation.
For the convergence of the adaptive scheme they used a residual based error estimator and provided a detailed analysis. An extension to estimators based on identity (1) is still open and will be discussed in future work.

Remark 2
Although the schemes proposed in this work are computable also in the lowest order case k = 0, one does not observe any high-order convergence of the post-processed variables defined later in this work. The reason is that the Aubin-Nitsche duality technique, needed in the analysis, cannot be applied in this case.

Local post-processing for u_h and λ_h
For a sufficiently smooth solution, estimates (4) and (5) show that there is a gap of two between the orders of convergence of ‖σ − σ_h‖_0 and ‖u − u_h‖_{1,h}. In [17] an identity for the eigenvalue error is proven which, due to (4) and using r ≤ r̄, shows that the order of convergence of |λ − λ_h| is dominated by the order of the L²-error of the eigenfunction. The reduced convergence of u_h (compared to the L²-error of σ_h) is well known for mixed methods and can be improved by means of a local post-processing, see [2,34], and particularly for eigenfunctions in [15]. Consequently, using the ideas from [21], we can then also obtain an improved eigenvalue. For a given integer l ≥ 0 let Π_l denote the L²-projection onto element-wise polynomials of order l, and consider the spaces U_h^* and U_h^{**}. For the discretization of the standard source problem (i.e. the Poisson problem), it is known that the kernel inclusion property div Σ_h ⊆ U_h (see [10]) and commuting interpolation operators yield a super convergence property of the projected error ‖Π_k u − u_h‖_0. Here ρ(h) is a function that depends on the regularity of the problem and for which we have ρ(h) → 0 as h → 0; for convex domains we have ρ(h) = O(h). This super convergence of the projected error is the fundamental ingredient to derive the enhanced approximation properties of u_h^*. Unfortunately, the same technique, i.e. the one from the standard source problem, does not work for the eigenvalue problem, and an improved convergence estimate of ‖Π_k u − u_h‖_0 is more involved. This has been discussed for the lowest order case in [20], for a more general setting including eigenvalue clusters in [11], for Maxwell eigenvalue problems in [12] and for the Stokes problem for example in [21]. Unfortunately, these results are only presented using the full ‖·‖_div-norm (or the corresponding mixed norm) for Σ and Σ_h.
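A common variant of the local post-processing mentioned above (Stenberg type, see [34]) can be sketched as follows; the precise conditions used in this paper may differ in detail:

```latex
% Element-wise post-processing: on each K ∈ T_h find
% u_h^* ∈ P_{k+2}(K) such that
\begin{aligned}
\Pi_k u_h^* &= u_h &&\text{on } K,\\
(\nabla u_h^*, \nabla v_h)_K &= (\sigma_h, \nabla v_h)_K
  &&\forall v_h \in (I - \Pi_k) P_{k+2}(K).
\end{aligned}
% The second condition recovers the high-order part of u_h^* from the
% (more accurate) flux σ_h, while the projection condition pins down
% the remaining low-order part by the original approximation u_h.
```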
While such an estimate is applicable for an approximation of (3) using Raviart-Thomas finite elements, the BDM case is not covered, since (4c) and (4b) show different convergence orders, which would spoil the estimate. As the author is not aware of a detailed proof in the literature, it is given in the appendix. Note, however, that these results have already been used, for example in [4] (without proof). The resulting super convergence estimate, in combination with the techniques from [34], then yields the approximation properties of u_h^* (see also [29]). Since u_h^* is not H¹-conforming, the final post-processing step consists of the application of an averaging operator I_a : U_h^* → U_h^{**}, often also called the Oswald operator; see [31] and [16] for details on its approximation properties. We set u_h^{**} := I_a(u_h^*), for which corresponding error estimates follow by (10). We conclude this section by introducing a post-processing of the eigenvalue: as in [4,21] we define λ_h^*. The following lemma was given in [21]. Since we need some intermediate steps in the sequel, we include the proof.
Proof Since ‖u‖_0 = 1, we first rewrite the eigenvalue error using that div Σ_h ⊆ U_h and (8b); again with ‖u‖_0 = 1, the last term can be rewritten further. By the Cauchy-Schwarz inequality we then obtain a bound whose last term can, for h small enough, be moved to the left-hand side, and we conclude the proof using (10) and (4).
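As an aside on the averaging operator I_a used above: at each mesh vertex it replaces the multivalued traces of the discontinuous function by their mean. A minimal one-dimensional P1 sketch (the function name and setup are our own illustration; for H¹₀-conformity one would additionally set the boundary values to zero):

```python
def oswald_average_1d(left_vals, right_vals):
    """Oswald-type averaging for a discontinuous piecewise-linear
    function on a 1D mesh. Element i carries the trace values
    (left_vals[i], right_vals[i]) at its two endpoints; returned is
    the list of continuous nodal values obtained by averaging all
    element traces meeting at each vertex."""
    n = len(left_vals)            # number of elements
    nodal = [left_vals[0]]        # boundary vertex: single contribution
    for i in range(1, n):         # interior vertices: mean of both traces
        nodal.append(0.5 * (right_vals[i - 1] + left_vals[i]))
    nodal.append(right_vals[n - 1])  # other boundary vertex
    return nodal

# Two elements with a jump at the shared vertex: traces 2.0 and 1.0.
print(oswald_average_1d([0.0, 1.0], [2.0, 3.0]))  # → [0.0, 1.5, 3.0]
```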

A posteriori analysis
In this section we provide an a posteriori error analysis and define an appropriate error estimator. We follow [6], where the authors derived an error estimator using the variables σ_h and u_h^{**}. While this works for a mixed approximation of (2) using the Raviart-Thomas finite element of order k (as was done in [6]), it does not work in our setting. To discuss the problematic terms and to motivate our modification, we present more details in the following. Since σ = ∇u, we can expand the error accordingly. Using integration by parts, u_h^{**} ∈ H¹_0(Ω) and − div σ_h = λ_h u_h, the last term can be rewritten, which in total gives a guaranteed upper bound. In [6] the first term on the right-hand side is the (computable) proposed error estimator, whereas the second and third terms are of higher order. Compared to our setting, the problem becomes apparent from the convergence orders of the individual terms, where for simplicity, i.e. to allow a simpler comparison, we assume a smooth solution.
Whereas the second term converges with an increased rate (compared to 2k + 4), the reduced convergence order of ‖u − u_h‖_0, see equation (4a), spoils the estimate of the last term. Note that the error of u_h appears in the estimates because we used the identity − div σ_h = λ_h u_h in the above proof.
To fix this problem we propose another post-processing. Whereas the first two post-processing routines were used to increase the convergence rates of the eigenfunction and eigenvalue errors, i.e. of u_h^* (and u_h^{**}) and λ_h^*, respectively, we now aim to construct a σ_h^* with a fixed divergence constraint, rather than improved approximation properties in the L²-norm. To this end we define the space Σ_h^*, which reads as a BDM space of order k + 3 with a reduced polynomial order of the normal traces. Note that other choices of Σ_h^* are possible, see Remark 3. The basic idea is to find a σ_h^* ∈ Σ_h^* as "close" as possible to σ_h (i.e. a good approximation), such that the divergence is modified appropriately using the additional high-order normal-bubbles (i.e. functions with a vanishing normal component along the boundary of each element). Since these bubbles are defined locally, this can be done element-wise. Now let ξ_h ∈ Σ_h^* be arbitrary. Proposition 2.3.1 in [10] shows that the following degrees of freedom (applied here to ξ_h) are unisolvent: facet moments (14a), divergence moments (14b) and volume moments (14c). With these we can define the post-processed flux σ_h^* by (15). Note that since σ_h is normal-continuous, i.e. its normal trace coincides on a common facet of two neighboring elements, the boundary constraints (15a) for σ_h^* can be set locally on each element boundary separately. Further, since σ_h · n and σ_h^* · n have the same polynomial degree k + 1, the moments from (15a) yield σ_h^* · n = σ_h · n. In total this shows that one can solve for σ_h^* on each element independently, i.e. this can be done computationally very efficiently. In Remark 4 we also comment on the choice of (15b).
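One plausible concrete reading of the moments (14) and of the constraints (15) defining σ_h^*, inferred from the proof of Theorem 1 below; this is a sketch and may differ in details from the author's exact definition:

```latex
% Degrees of freedom (14) of ξ_h ∈ Σ_h^*:
\begin{aligned}
&\text{facet moments:} && \int_F (\xi_h \cdot n)\, q \,\mathrm{d}s,
   && q \in P_{k+1}(F),\ F \subset \partial K,\\
&\text{div moments:}   && \int_K \operatorname{div}(\xi_h)\, q \,\mathrm{d}x,
   && q \in P_{k+2}(K)/P_0(K),\\
&\text{vol moments:}   && \int_K \xi_h \cdot l \,\mathrm{d}x,
   && l \in H_{k+3}(K),
\end{aligned}
% with H_l(K) the interior moment space of Proposition 2.3.1 in [10].
% The post-processed flux σ_h^* ∈ Σ_h^* is then fixed element-wise by
\begin{aligned}
&\text{(15a)} && \int_F (\sigma_h^* \cdot n)\, q \,\mathrm{d}s
    = \int_F (\sigma_h \cdot n)\, q \,\mathrm{d}s,\\
&\text{(15b)} && \int_K \operatorname{div}(\sigma_h^*)\, q \,\mathrm{d}x
    = -\lambda_h \int_K u_h^*\, q \,\mathrm{d}x,\\
&\text{(15c)} && \int_K \sigma_h^* \cdot l \,\mathrm{d}x
    = \int_K \sigma_h \cdot l \,\mathrm{d}x .
\end{aligned}
```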
Proof We start with the proof of the divergence identity. Let K ∈ T_h and q_h ∈ P_{k+2}(K) be arbitrary. Testing the divergence with q_h, the second step follows from (15b) and the Gauss theorem. Using (15a) and (3b), the last integral can be rewritten, where we use (8b) in the last step. Altogether this yields the divergence identity, from which we conclude the first claim, since div σ_h^* − λ_h u_h^* ∈ P_{k+2}(K) and q_h was arbitrary. Now let I_h^* be the canonical interpolation operator onto Σ_h^* with respect to the moments (14), and let I_h be the interpolation operator onto Σ_h defined by the same moments (14), but with q_h ∈ P_k(K)/P_0(K) and l_h ∈ H_{k+1}(K) instead. First, the triangle inequality splits the error into two contributions. Since the first can be bounded by the properties of I_h^*, we continue with the latter. By the definition of the interpolation operators and similar steps as above we have I_h(σ_h^*) = σ_h, and thus the rightmost term simplifies. We continue with the other term. For this, let ψ_i^{div} be the hierarchical dual basis functions of the highest-order divergence moments from (14b), given by ∫_K div(·) q_i dx with q_i ∈ P_{k+2}(K)/P_k(K). Similarly, let ψ_i^H be the hierarchical dual basis functions of the highest-order volume moments from (14c), given by ∫_K (·) · l_i dx with l_i ∈ H_{k+3}(K)/H_{k+1}(K). An explicit construction of such basis functions can be found for example in [8,37]. Also let N^{div} and N^H be the corresponding index sets. Using (2b), (15b) and (15c), we obtain a representation which implies (using that the norms of the q_i, l_i and ψ_i^{div}, ψ_i^H are bounded) the desired bound. Since ‖u_h^*‖_0 ≤ ‖u_h^* − u‖_0 + ‖u‖_0, we conclude the proof by the approximation properties of I_h and I_h^* (see Proposition 2.5.1 in [10]), estimates (7) and (10), and by ρ(h)h^r ≲ h^{r̄} and h^{2r} ≤ h^{r̄}.

Remark 3 Instead of choosing Σ_h^* as above, one can for example also use the standard Raviart-Thomas space of order k + 2, denoted by RT_{k+2}. Since div RT_{k+2} = U_h^*, it is again possible to set − div σ_h^* = λ_h u_h^* (using the appropriate degrees of freedom). However, since the normal trace of σ_h^* is now in P_{k+2}(F) on each facet F ∈ F_h, one has to be more careful defining the facet moments. Precisely, we would now set the facet moments by means of the L²-projection of σ_h · n on the facets.

Remark 4
One might wonder why we do not use λ_h^* instead of λ_h in the definition of div σ_h^* in (15b). Indeed, as can be seen in the proof, this is a crucial choice, since we used in (16) that the mean value of the divergence is fixed by the constant normal moments (first equality) and thus coincides with Π_0(λ_h u_h) (third equality). Choosing λ_h^* in (15b) would then lead to a mismatch between the low-order and high-order parts of the divergence.
We are now in the position to define the local error estimator η(K) on each element K ∈ T_h and the corresponding global estimator η. Theorem 2 Let (λ, u, σ) be the solution of (2), let (λ_h, u_h, σ_h) be the solution of (3) and let u_h^{**} and σ_h^* be the post-processed solutions. There holds the reliability estimate, in which hot(h) is a higher-order term compared to O(h^{2r̄}) as h → 0. Further, there holds the efficiency estimate. Proof Following the same steps as at the beginning of this section, we arrive at the corresponding identity. For the last term we now have an improved bound: whereas the first term converges with an increased order, we also obtain a matching higher-order estimate for the second term.

It remains to show that hot(h) ≲ ρ(h)(h^{2r+r̄} + ρ(h)h^{2r}) is of higher order compared to h^{2r̄}. Due to the additional factor ρ(h) in the upper bound of hot(h), we only have to show that 2r̄ ≤ 2r + r̄. For the low regularity case, i.e. s = r = r̄, and the case of full regularity, i.e. r = k + 1 and r̄ = k + 2, this follows immediately. For the case r = k + 1 and r̄ = s with k + 1 < s < k + 2, we also have 2r̄ = 2s < k + 2 + s < 2(k + 1) + s = 2r + r̄, from which we conclude the proof of the reliability. The efficiency estimate follows by the triangle inequality and σ = ∇u.
Using the estimator from above we are now also able to derive an upper bound for the error of λ_h^*. To this end let η_λ be the corresponding estimator. The last two terms of η_λ are needed to measure the difference between the quantities used in η and the functions used in the definition of λ_h^*. Unfortunately, the author does not see how the definition of λ_h^* could be changed such that only σ_h^* and u_h^{**} are used, which would allow a direct estimate by η.

Theorem 3 Let (λ, u, σ) be the solution of (2), let (λ_h, u_h, σ_h) be the solution of (3) and let u_h^{**} and σ_h^* be the post-processed solutions. There holds the estimate, in which both remainder terms are of higher order compared to O(ρ(h)h^{r+r̄} + h^{2r̄}) as h → 0.
Proof Following (13), we obtain the corresponding identity. Note that the second term on the right-hand side is already of higher order, thus we only consider the remaining terms. The idea is to modify the terms including σ_h such that we can use the results of the previous theorem. Since, after applying the triangle inequality, one contribution can be bounded by the estimator from the previous theorem, we are left with an estimate for the last term on the right-hand side of (17).
In contrast to the proof of Lemma 1, we now add and subtract u_h^{**} (and not u_h^*). The resulting last term is computable and will be used in the estimator. For the first one, using u_h^{**} ∈ H¹_0(Ω), ‖u‖_0 = 1 and integration by parts, we obtain a term that can be estimated as before; thus, for h small enough, the claim follows. To show that both remainder terms are of higher order compared to O(ρ(h)h^{r+r̄} + h^{2r̄}), one follows the same steps as in the proof of Theorem 2.

Numerical examples
In this section we discuss numerical examples to validate our theoretical findings. All methods were implemented in the finite element library Netgen/NGSolve, see www.ngsolve.org and [33].

Convergence on a unit square
The first example considers the unit square domain Ω = (0, 1)². The first eigenfunction and the smallest eigenvalue of (2) are given by u = 2 sin(πx) sin(πy) and λ = 2π², respectively. We start with an initial mesh with |T_h| = 32 elements and use uniform refinement. Note that for simplicity we use a structured mesh for this example, thus h ∼ (0.5|T_h|)^{−1/2}. In Tables 1 and 2 we present several errors and their convergence rates (given in brackets) for the polynomial orders k = 1 and k = 2.
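The convergence rates reported in brackets can be computed from the errors on consecutive refinement levels. A minimal sketch (the helper name `eoc` and the sample error values are illustrative assumptions, not numbers from the tables):

```python
import math

def eoc(errors, hs):
    """Estimated order of convergence between consecutive mesh levels:
    rate_i = log(e_{i-1}/e_i) / log(h_{i-1}/h_i)."""
    return [math.log(errors[i - 1] / errors[i]) / math.log(hs[i - 1] / hs[i])
            for i in range(1, len(errors))]

# Structured mesh as in the text: h ~ (0.5 * |T_h|) ** (-1/2).
nels = [32, 128, 512]
hs = [(0.5 * n) ** -0.5 for n in nels]

# Illustrative errors decreasing with fourth order (h halves each step).
errs = [1.0e-3, 6.25e-5, 3.90625e-6]
print([round(r, 2) for r in eoc(errs, hs)])  # → [4.0, 4.0]
```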
Beside the errors we also report the higher-order term hot(h) from Theorem 2 and the efficiencies of the estimators. Since Ω is convex, we have ρ(h) ∼ h for this example, and thus expect the convergence orders predicted by the theory. In accordance with the theory, all errors converge with the optimal orders. Further, the higher-order term hot(h) converges faster than the estimator η, as predicted by Theorem 2. Note that this results in an efficiency eff converging to one, i.e. the error estimator is asymptotically exact. Also the estimator for the error of the eigenvalue converges appropriately and shows a good efficiency eff_λ. The same conclusions can be drawn for k = 2; however, the errors of the eigenvalues λ_h and λ_h^* converge so fast that they become too small and rounding errors dominate on the finest meshes. For the same reason we do not present any numbers for hot(h), since this term converges even faster, resulting in very small numbers already on coarse meshes.

Adaptive refinement on the L-shape
For the second example we choose the L-shape domain Ω = (−1, 1)² \ ([0, 1] × [−1, 0]), where the first eigenvalue is given by λ ≈ 9.63972384402, see [7]. In this example the corresponding eigenfunction is singular, thus we expect suboptimal convergence on uniformly refined meshes. We therefore solve the problem using an adaptive mesh refinement. The refinement loop is defined by the usual cycle solve → estimate → mark → refine.

Table 1 Convergence of several errors for the example on the unit square with k = 1

Table 2 Convergence of several errors for the example on the unit square with k = 2

The refinement loop is based on the local contributions η(K) as element-wise refinement indicators. In the marking step we mark an element K if η(K) ≥ (1/4) max_{K' ∈ T_h} η(K'). The refinement routine then refines all marked elements plus further elements in a closure step to guarantee a regular triangulation. In Fig. 1 we present the error history of the post-processed eigenvalue λ_h^*, its estimator η_λ and the estimator η for the eigenfunction error, for the polynomial orders k = 2, 3. We observe optimal convergence of orders O(N^{−2(k+2)}), O(N^{−2(k+2)}) and O(N^{−(k+2)}), respectively.

Proof We solve the continuous dual problem: find Θ ∈ H(div, Ω) and Ψ ∈ L²(Ω) such that (20) holds. Note that we have the regularity Θ ∈ H^s(Ω, R^d) and Ψ ∈ H^{1+s}(Ω) with s > 1/2, and there holds the stability estimate (21), see for example [20]. Together with the commuting diagram property of the BDM interpolation operator I_h and the L²-projection Π_k, see Section 2.5 in [10], and by problems (18) and (20), we obtain the desired estimate, where the last step follows from (div(σ − σ_h), Π_k Ψ) = 0. By the interpolation properties of Π_k and I_h and the stability (21) we conclude.

Lemma 3 Let (λ, u, σ) be the solution of (2), let (λ_h, u_h, σ_h) be the solution of (3) and let (ũ_h, σ̃_h) be the solution of (18). There holds the following estimate.

Proof We start with the estimate of the divergence term. By the triangle inequality we have ‖div(σ − σ̃_h)‖_0 ≤ ‖div(σ − σ_h)‖_0 + ‖div(σ_h − σ̃_h)‖_0; thus, since also ‖div(σ − σ_h)‖_0 = ‖λu − λ_h u_h‖_0, we obtain the first bound with (6) and a small enough mesh size h. For the second term we proceed similarly. The triangle inequality gives ‖σ − σ̃_h‖_0 ≤ ‖σ − σ_h‖_0 + ‖σ_h − σ̃_h‖_0.
For the latter we then have a bound by (19). We continue with the last term. With ‖Π_k u‖_0 ≤ ‖u‖_0 we can bound ‖λ_h u_h − λΠ_k u‖_0 as above with (6). Since u ∈ H^{1+s}(Ω) we can further bound ‖u − u_h‖_0 ≲ h‖∇u‖_0, which, for h small enough (i.e. bounding σ − σ̃_h), lets us conclude in total. Lemma 4 Let (λ, u, σ) be the solution of (2), let (λ_h, u_h, σ_h) be the solution of (3) and let (ũ_h, σ̃_h) be the solution of (18). There holds the following estimate. Proof Using equation (19), the proof follows by exactly the same steps as in the proof of Lemma 11 in [12] or Lemma 6.3 in [11].
Combining the above results, we obtain the super convergence property. Let (λ, u, σ) be the solution of (2), let (λ_h, u_h, σ_h) be the solution of (3) and let (ũ_h, σ̃_h) be the solution of (18). For h small enough there holds the super convergence property.