Divergence-Conforming Velocity and Vorticity Approximations for Incompressible Fluids Obtained with Minimal Facet Coupling

We introduce two new lowest order methods, a mixed method, and a hybrid discontinuous Galerkin method, for the approximation of incompressible flows. Both methods use divergence-conforming linear Brezzi–Douglas–Marini space for approximating the velocity and the lowest order Raviart–Thomas space for approximating the vorticity. Our methods are based on the physically correct viscous stress tensor of the fluid, involving the symmetric gradient of velocity (rather than the gradient), provide exactly divergence-free discrete velocity solutions, and optimal error estimates that are also pressure robust. We explain how the methods are constructed using the minimal number of coupling degrees of freedom per facet. The stability analysis of both methods are based on a Korn-like inequality for vector finite elements with continuous normal component. Numerical examples illustrate the theoretical findings and offer comparisons of condition numbers between the two new methods.


Introduction
In this work we introduce two new methods for the discretization of the steady incompressible Stokes equations in three space dimensions. Let Ω ⊂ R 3 be an open bounded domain with Lipschitz boundary ∂Ω that is split into the Dirichlet boundary Γ D and outflow boundary Γ N . The Stokes system for the fluid velocity u and the pressure p is given by where ε(u) := (∇u + ∇u T )/2 is the symmetric gradient, f : Ω → R 3 is an external body force, ν is twice the kinematic viscosity, n is the outward unit normal vector and I ∈ R 3×3 is the identity matrix. We assume that both Γ D and Γ N have positive boundary measure, and any rigid displacement vanishing on Γ D vanishes everywhere in Ω. (As usual, when Γ N is empty the pressure space must be adapted to obtain a unique pressure [18], but we omit this case for simplicity.) Next, define the viscous stress tensor [23] by σ = νε(u) and the vorticity by ω = curl u. Using them, we can rewrite the above system as Here we used the deviatoric part of the tensor τ given by dev τ := τ − 1 3 tr(τ )I , the matrix trace tr(τ ) := 3 i=1 τ ii , and the operator κ : R 3 → {τ ∈ R 3×3 : τ + τ T = 0} defined by Note the obvious identities for vector fields v and w (the first of which was already used in (2a)). We will refer to system (1) as the primal formulation and system (2) as the mixed formulation. The literature on discretizations of (1) and (2) is too vast to list here. The relatively recent quest for exactly divergence-free velocity solutions and pressure-independent a priori error estimates for velocity, often referred to as pressure robust estimates [27,30], has rejuvenated the field. A recurring theme in this vast literature, from the early non-conforming method of [10] to the more recent [29], is the desire to improve computational efficiency by minimizing inter-element coupling. However, less studied are its side effects on stability when the actual physical flux replaces the often-used simplified diffusive flux, i.e., when − div(νε(u)) replaces − div(ν∇u), (4) even though an early work [11] cautions how the lowest order method of [10] can become unstable when doing so. Such instabilities arise because the larger null space of ε necessitates increased inter-element coupling (as explained in more detail below) and are manifested in certain lowest order cases with insufficient inter-element coupling. In this work, focusing on the lowest order case, we identify new stable finite element methods, with the minimal necessary inter-element coupling, that yield exactly divergence-free and pressure robust velocities. New methods based on both the primal and the mixed formulations are designed. Yet another reason for focusing on the lowest order case is its utility in preconditioning. Roughly speaking, a common strategy for preconditioning high order Stokes discretizations involves combining local (high order) error dampers via, say block Jacobi or other smoothers, with a global (low order) error corrector such as multigrid (or even a direct solver) applied to the smaller lowest order discretization. From this point of view, it is desirable to have stable low order versions (that remain stable under (4)) of high order methods for design of preconditioners, an interesting topic which we shall not touch upon further in this paper.
To delve deeper into the mechanics of the above-mentioned instability, consider the kernel of ε, consisting of rigid displacements of the form x → a + b × x with a, b ∈ R 3 . Reasonable methods approximating the operator − div(νε(u)) produce element matrices whose nullspaces contain these rigid displacements. Ideally, when these element-wise rigid displacements are subjected to the inter-element continuity conditions of the discrete velocity space, they should equal element-wise restrictions of a global rigid displacement on Ω (which can then be eliminated by boundary conditions). However, if the inter-element coupling in the discrete velocity space is so weak to allow for the existence of a u in it that does not equal a global rigid displacement on Ω even though u| T is a rigid displacement on every mesh element T , then instabilities can arise [11].
The discrete velocity space we have in mind is the lowest order (piecewise linear) H (div)conforming Brezzi-Douglas-Marini (BDM 1 ) space [3]. (A basic premise of this paper is the unquestionable utility of H (div)-conforming velocity spaces to obtain exactly divergencefree discrete Stokes velocity fields, well established in prior works [8,9,20,21,29]). Hence, to understand how to avoid the above-mentioned instability while setting velocity in the BDM 1 space, we ask the following question: how many coupling degrees of freedom (dofs) are needed to guarantee that two rigid displacements u ± ,given respectively on two adjacent elements T ± , coincide on the common interface F = ∂ T + ∩ ∂ T − ?
The pictorial representations of the deformations created by u ± in Fig. 1lead to the answer. Three of the pictured deformations are just translations (generated by the a-vector in a+b×x). For a unit vector b, letting R b θ denote the unitary operator that performs a counterclockwise rotation by angle θ around b, it is easy to see that as θ → 0. Therefore the deformation created by the rigid displacement b × x can be viewed as an infinitesimal rotation about b. These deformations are portrayed in Fig. 1 as rotations about three linearly independent b-vectors. The first row in Fig. 1 illustrates deformations generated by piecewise rigid displacements which are given by two b-vectors coplanar with F and an a-vector normal to F. These rigid displacements are forbidden in the BDM 1 space. Indeed, recall [3] that the BDM 1 dofs on the facet F are given by the linear functionals u → F u · n q ds for all linear polynomials q on F, where n is a normal vector on F. These represent three dofs illustrated in left diagram of Fig. 2. If these three dofs coincide for two rigid displacements u ± , then the corresponding normal component must be continuous on F. This continuity forbids the above-mentioned deformations to be generated by elements of the BDM 1 space. We summarize this by saying that the rigid displacements portrayed in It remains to control the rigid displacements of the second row of Fig. 1 using three additional dofs per facet. To this end, our new methods have two additional spaces: (i) one that approximates the in-plane components of the velocity on facets, illustrated in the middle diagram of Fig. 2, used to control the first two rigid displacements in the second row of Fig. 1; and (ii) a second space, schematically indicated in the last diagram of Fig. 2, that controls the third deformation in the second row of Fig. 1. The latter deformation arises from piecewise rigid displacements of the form u ± = b ± × x with b ± collinear to n, a unit normal of F. Since curl(b ± × x) = 2b ± , we can make the two rigid displacements coincide on F by requiring continuity of nldot curl u ± . While continuity of n · curl u certainly holds if u is the exact Stokes velocity, it does not generally hold for u in BDM 1 . Hence, keeping in view that ω = curl u represents vorticity, we incorporate this constraint in our new methods by approximating vorticity ωin the lowest order Raviart-Thomas space. This is our second additional space. Its single dof per facet is shown schematically in the last diagram of Fig. 2.
In the first part of the paper we will employ these additional spaces to construct a novel HDG method to approximate (1) and present a detailed stability and error analysis. HDG methods have become popular ever since its introduction in [7] which showed how interface variables, or facet variables, can be effectively used to construct DG schemes amenable to static condensation. In the method presented here, the interface variable approximates the tangential components of the velocity. The key technical ingredient in the analysis that reflects the insight garnered from the above pictorial discussion is a discrete Korn-like inequality for the BDM 1 space (see Lemma 1 below, and the version with interface variables in Lemma 2).
The second part of this work discusses the derivation of a novel mixed method for the approximation of (2) and is motivated by our previous two papers [20,21] and the many other works on discretizing (2) such as [12][13][14][15]. In [20] we derived the "Mass-Conserving Stress-yielding" (MCS) formulation where the symmetry of σ was incorporated in a weak sense by means of a Lagrange multiplier that approximates ω = curl u. While the ω there was approximated using element-wise linear (or higher degree) functions without any interelement continuity requirements, the new mixed method we propose here will approximate ω in the lowest order Raviart-Thomas space instead. The lowest order case that was proved to be stable in [20] had nine coupling dofs per facet. We are able to reduce this number to the minimal six (the dimension of rigid displacements) in this paper. This minimal coupling was achieved earlier in [35] using a bubble-augmented velocity space which is a subspace of a degree-four vector polynomial space. Since higher degrees necessitate more expensive integration rules, we offer our simpler elements as an alternative.
Other methods that approximate the operator div(ν∇u), such as [10,21,29], are able to reduce the number of coupling dofs per facet even further. Since our focus here is on methods that approximate div(νε(u)), we restrict ourselves to a brief remark on this. Since the kernel of ∇ (applied to vector fields) is three dimensional, we expect the minimal number of coupling dofs per facet to be three when approximating div(ν∇u). A method with this minimal coupling was achieved early by [10]. To also obtain pressure robust and exactly divergence-free solutions, prior works [21,29] settled for a slightly higher five coupling dofs per facet in the lowest order case. It is now known that this can be improved by employing the technique of "relaxed H (div, Ω)-conformity," see [25,26], which results in a method with the minimal three coupling dofs per facet and yet, thanks to a simple post-processing, provides optimal convergence orders and pressure robustness. While on the subject of coupling dofs, an explanation of our focus on three-dimensional (3D) domains is in order. On two-dimensional (2D) domains, the space of rigid displacements only has three dimensions. In the lowest order 2D case, BDM 1 space provides two coupling dofs per facet edge, and the space of tangential facet velocities adds one more coupling degree of freedom. Thus the minimal facet coupling (of three dofs) needed to eliminate the rigid displacements are more immediate in 2D case when compared to the 3D case, which is why restrict to the 3D case henceforth.
The new HDG method and the new mixed method proposed in this paper both have the same coupling dofs, the same velocity convergence orders and the same structure preservation properties like pressure robustness and mass conservation. On closer comparison, two advantages of the mixed method are notable. One is its direct approximation of viscous stresses. Another is the absence of any stabilization parameters in it. In fact, in our numerical studies, the conditioning of a matrix block arising from the parameter-free mixed method was found to be better than the analogous HDG block for all ranges of the HDG stabilization parameter we considered.
Outline. We set up general notation in Sect. 2 and continue with a description of the variational framework used throughout the paper. Finite element spaces, a discrete Korn-like inequality, and resulting norm equivalences are introduced in Sect. 3. A list of interpolation operators into these spaces and their properties with references to literature can also be found there. In Sect. 4 we introduce and analyze the HDG method for the primal set of Eq. (1) and in Sect. 5 we do the same for the MCS method for the mixed set of eq. (2). Finally, in Sect. 6 we perform numerical experiments to illustrate and complement our theoretical findings.

Notation and Weak Forms
By M we denote the vector space of real 3 × 3 matrices and by K we denote the vector space of 3 × 3 skew symmetric matrices, i.e., K = skw(M), where skw τ = 1 2 (τ − τ T ) for τ ∈ M. Further, let D = dev (M). To indicate vector and matrix-valued functions on Ω, we include the range in the notation, thus while L 2 (Ω) = L 2 (Ω, R) denotes the space of square integrable and weakly differentiable R-valued functions on Ω, the corresponding vector and matrix-valued function spaces are defined by L 2 (Ω, , respectively. For anyΩ ⊆ Ω, we denote by (·, ·)Ω the inner product on L 2 (Ω) (or its vector-or matrix-valued versions). Similarly, we extend this notation and write · Ω for the corresponding L 2 -norm of a (scalar, vector, or matrix-valued) function on the domainΩ. In the caseΩ = Ω we will omit the subscript in the inner product, i.e. we have (·, ·)Ω = (·, ·) and we will use the notation · 0 = · Ω .
In addition to the differential operators we have already used, ∇, ε, curl, we understand div Φ as either In addition to the standard Sobolev spaces H m (Ω) for any m ∈ R, we shall also use the well-known spaces , to denote the spaces of functions whose trace, normal trace and tangential trace respectively vanish on Γ B , for B ∈ {D, N }. The only somewhat nonstandard Sobolev space that we shall use is where H 0,D (div, Ω) * is the dual space of H 0,D (div, Ω). In the case Γ D = ∂Ω, as proved in [21], the dual of H 0,D (div, Ω) equals H −1 (curl, Ω), so the condition that div τ ∈ H 0,D (div, Ω) * in (5) is the same as requiring that curl div τ ∈ H −1 (Ω). This explains the presence of the operator "curl div" in the name of the space in (5).
We denote by T a quasiuniform and shape regular triangulation of the domain Ω into tetrahedra. Let h denote the maximum of the diameters of all elements in T . Throughout this work we write A ∼ B when there exist two constants c, C > 0 independent of the mesh size has well as the viscosity ν such that c A ≤ B ≤ C A. Similarly, we use the notation A B if there exists a similar constant C (independent of h and ν) such that A ≤ C B. Henceforth we assume that ν is a constant. Due to quasiuniformity we have h ∼ diam(T ) for any T ∈ T . The set of element interfaces and boundaries is denoted by F . This set is further split into facets on the Dirichlet boundary, For piecewise smooth functions v on the mesh, v and {v} are functions on F whose values on each interior facet equal the jump (defined up to a sign) of v and the mean of the values of v from adjacent elements. On boundary facets, they are both defined to be the trace of v. On each element boundary, and similarly on each facet on the global boundary we denote by n the outward unit normal vector. Then the normal and tangential trace of a smooth enough vector field v is given by Accordingly, the normal trace is a scalar function and the tangential trace is a vector function. In a similar manner we introduce the normal-normal (nn) trace and the normal-tangential (nt) trace of a matrix valued function τ by τ nn := τ : n ⊗ n = n T τ n and τ nt = (τ n) t .
For anyΩ ⊆ Ω, we denote by P k (Ω) = P k (Ω, R) the set of polynomials of degree at most k, restricted toΩ. Let P k (Ω, R 3 ) and P k (Ω, M) denote the analogous vector-and matrix-valued versions whose components are in P k (Ω). With respect to these spaces we then define Π kΩ , the L 2 (Ω)-projection into the space P k (Ω) or its vector-or matrix-valued versions. We omit subscript from Π kΩ if it is clear from context. For the space of functions the restrictions of which are in P k (T ) for all T ∈ T we write simply P k (T ). The analogous convention holds for H k (T ), L 2 (F), etc.

The Finite Elements Used and Their Properties
In this preparatory section, we define the standard finite element spaces used to construct our methods, their natural interpolators, and a number of discrete norm equivalences revealing equivalent norms involving piecewise ε(·). Lemma 2 below will be used in the analysis of the HDG scheme while the analysis of the MCS scheme will additionally need Lemmas 3-4. We begin with the finite element spaces used in this paper: Note that for any τ h ∈ Σ h , on a facet F, (τ h ) nt is a constant function on F taking values in n ⊥ F , where n ⊥ F denotes the orthogonal complement of n F , a unit normal of F. This is indicated by the notation (τ h ) nt ∈ P 0 (F, n ⊥ F ) in (8d). Also any v h ∈ V h is tangential and takes values in n ⊥ F on each facet F. Note also that V h , which equals H 0,D (div, Ω) ∩ BDM 1 in the notation of §1, is the lowest order Brezzi-Douglas-Marini space while W h is the lowest order Raviart-Thomas space [3]. The space Σ h is a discontinuous version of the "nt-continuous" space introduced in [21], for which simple shape functions were exhibited there. All of these finite element spaces are obtained by mappings from a single reference finite element. (All these maps extend to curvilinear elements, although we restrict to affine equivalent elements in our analysis here.) The maps are compatible with the degrees of freedom of the spaces. (For Σ h , the appropriate map is given in [21] and compatibility with degrees of freedom is proved in [21,Lemma 5.7], while for the other spaces, the mappings are standard.) In the case of V h and W h , the maps are Piola maps which also preserve divergence-free subspaces.

A Discrete Korn-Type Inequality
Korn inequalities for piecewise functions were given in [5,Theorem 3.1]. A further refinement was given in [31, Theorem 3.1]. To describe it, let Π R denote the facet-wise L 2 projection onto R F := {t + α n × x : t ∈ n ⊥ , α ∈ R}, the space of tangential components (on a facet F) of the rigid displacements (or simply the space of two-dimensional rigid displacements on for all elements T ∈ T and u n = 0 on all facets F ∈ F 0,D }. A minor modification of the proof of [31, Theorem 3.1] shows that Here and throughout, we use · 2 T to abbreviate T ∈T · 2 T with the understanding that any derivative operators in the argument of these norms are evaluated summand by summand, e.g., the gradient and ε are evaluated element by element in (9). This notation is similarly extended to facets, so · 2 F 0,D = F∈F 0,D · 2 F . Note how normal components are controlled in (9) through the space H 1 n,D (T , R 3 ), while tangential components are controlled through the jumps u t . The next result shows that a part of the right hand side of (9) can be traded for a norm of the jump of n · curl u when u is in V h .
Proof By Pythagoras theorem, F . Hence (10) would follow once we prove that for all F ∈ F and all u h ∈ V h , To prove (11), first note that, restricted to every facet F, To simplify the numerator of the last term, let w equal u h | T for some T ∈ T F . We claim that To see why, recalling that w is linear in T (and hence in F ⊂ ∂ T ), for any x ∈ F, where we have used (3). Since r F is orthogonal to constants on F, to simplify the last term, we obtain (13). The equivalence of (11) is a consequence of the identity immediately obtained by combining (12) and (13). Indeed, by applying Cauchy-Schwarz inequality to the terms on the right hand side of (14), simple local scaling arguments give F , thus proving one side of equivalence in (11). To prove the other side, we begin by noting that curl(u h ) is constant on each element, so where we have used (14) and local scaling arguments again. Squaring both sides and applying Young's inequality, (11) is proved.

Norm Equivalences
The product space for the kinematic variables is given by For the analysis we define the norms where we have used · 2 ∂T to abbreviate T ∈T · 2 ∂ T . We will shortly establish relationships between these norms (Lemma 4). That these are all norms on U h may not be immediately obvious, but follows from Lemma 2 below (where we critically use that u h is single valued on facets). As we shall see later,~·~ε is the natural norm to analyze the new HDG method in §4, while · ε features in the analysis of the MCS method in §5. All the above norms involve the interface variable u h , so they may be referred to as "HDG-type" norms. In contrast, "DG-type" norms were used in Subsection 3.1, where Lemma 1 and (9) imply A similar discrete Korn-type inequality also holds for HDG-type norms, as seen in the next lemma.

Lemma 2 For all
The reverse inequality holds in the sense that for any Proof To prove (16), first note that on an interior facet where we have increased the right hand side to include facets on Γ N also. Similarly, since the normal component of the given ω h ∈ W h is continuous across F ∈ F 0 and zero on F ∈ F D , Using (18) and (19) in (15), we obtain the estimate (16).
To prove (17), consider a function ω h ∈ W h satisfying on the boundary of every element T ∈ T . Since curl u h is piecewise constant, by the well known degrees of freedom of the Raviart-Thomas space W h , these conditions uniquely fix an ω h ∈ W h . Then, (curl u h − ω h ) n F equals zero for F ∈ F N , equals 1 2 curl u h n F for F ∈ F 0 , and equals (curl u h ) n F for F ∈ F D , so Therefore, for this choice of ω h , we havẽ By a local scaling argument h curl u h n 2 F curl u h 2 T . Using this in the above inequality and recalling that ∇u h we complete the proof of (17).

Lemma 3 For any u h
Proof The first equivalence follows by standard scaling arguments (by equivalence of norms in the lowest order Raviart-Thomas space). Equivalence (20b) also follows by local scaling arguments and [20, eq. (4.14)]. We continue on to prove (20c). Applying the Pythagoras theorem twice, We also have, due to (20b), Here we have used an inverse inequality and the observation that derivatives of curl u h ∈ P 0 (T ) vanish. Combining (21), (22) and the continuity of the L 2 projection, we conclude that the right side of (20c) can be bounded by the left side. For the reverse inequality, Here, we used that dev ∇u h ∈ P 0 (T ), a standard approximation estimate for the L 2 projection, followed by (20b). The proof is then concluded using (21).

Lemma 4 For all
Proof This is a direct consequence of Lemma 3.

Interpolation Operators
In subsequent sections we will require the interpolation operators into the spaces in (8a)-(8e), denoted by where Σ = {τ ∈ H 1 (T , D) : τ nt = 0}. Of course, the natural interpolation for Q h , denoted by I Q : L 2 (Ω) → Q h , is simply the L 2 -orthogonal projection. The definitions and properties of the remaining interpolants are summarized in this subsection. An H (div)-interpolation into V h , denoted by I V : H 1 n,D (T ) → V h , is defined using the standard degrees of freedom (see e.g., [

3, Proposition 2.3.2]):
A well-known consequence of (24) is that for all u in the domain of I V . The interpolant I W : H 1 n,D (T ) → W h , defined by ((ω − I W ω) n , q) F = 0 for all q ∈ P 0 (F) and all F ∈ F , is also standard. The interpolation operator for the stress space I Σ : Σ → Σ h , borrowed from [21], is defined by Finally, the tangential L 2 -projection on facets, IV : To note the salient approximation properties of these interpolants, first observe that for a u ∈ H 1 (Ω, R 3 ) ∩ H 2 (T ), we have curl(u) ∈ H 1 n,D (T ). Hence (I V u, IV u t , I W curl(u)) is in U h and using standard scaling arguments and the Bramble-Hilbert lemma, we get Also recall that [21,Theorem 5.8] implies that for all σ ∈ Σ,

Derivation of the HDG Method
To derive our new HDG scheme for (6), let u, p be a sufficiently smooth exact solution of (1). (A sufficient smoothness condition is quantified in Lemma 5 below.) Let v h ∈ V h . Then, multiplying (1a) by v h and integrating by parts on each element, where we used the symmetry of ε(u) and the boundary condition (1d). Since p is smooth, v h n = 0 on F 0,D , Since v h is single-valued on all facets, v h = 0 on Γ D (see (8b)), and ε(u) is continuous across interior facets, Adding (30a)-(30c), since the tangential part of (1d) shows that ε(u) nt = 0 on Γ N . Hence we may also replace ∂ T \Γ N by ∂ T in the last term. Thus, (31a) Next, let ω = curl(u) and u = u t on each element boundary ∂ T . Then, obviously, for any test function η h ∈ W h and constant α > 0, i.e., if u, u and ω are replaced by u h , u h and ω h , respectively, then the terms on the right are consistent terms. Adding the equations (31a)-(31c), we obtain where Here and throughout, (·, ·) T = T ∈T (·, ·) T and (·, ·) ∂T = T ∈T (·, ·) ∂ T , extending our prior analogous norm notation to inner products. Of course, from (1b), we also have for all q h ∈ Q h . Equations (32a)-(32b), after replacing (u, u, ω) by (u h , u h , ω h ), yield the following discrete formulation: find (u h , u h , ω h ) ∈ U h and p h ∈ Q h such that Note that this method enforces Π 0 (u h ) t = 0 on Γ D as a consequence of how the last term of (31b) manifest in the method. Due to the Dirichlet conditions built into W h (see (8c)) the method also penalizes Π 0 curl(u h ) n Γ D through the manifestation of the consistent term (31c) in the method. System (33) may be thought of as a nonconforming HDG discretization of the standard weak form (6).
Lemma 5 (Consistency of the HDG method) Suppose the exact solution (u, p) of (6) is regular enough so that u, together with u = u t on facets and ω = curl u, satisfies (u, u, ω) ∈ U reg and suppose p ∈ Q reg . Then any Proof This follows by subtracting (33) from (32).

Pressure Robust Error Analysis of the HDG Scheme
We follow the usual mixed method approach and proceed to combine continuity and coercivity of a hdg with a discrete Stokes inf-sup condition, or the LBB [3] estimate. The latter implies the stability of (33), which also implies its unique solvability. We begin by noting that by local scaling arguments, there is a mesh-independent c 1 such that since ε(v h ) is constant on T . For the same reason, Π 0 may be introduced into the second and third terms in the definition of a hdg (u h , u h , ω h ; v h , v h , ω h ), e.g., Lemma 6 (Continuity of of a hdg ) For any (u, u Proof Inequality (37) follows from Cauchy-Schwarz inequality, while (38) follows by additionally employing (35) and (36). The estimate (39) is a consequence of 1 3 T . Lemma 7 (Coercivity of a hdg ) There is a mesh-independent α 0 > 0 such that for all α > α 0 and all (u h , u h , ω h ) ∈ U h , Proof By (36) and Young's inequality with any β > 0,

Lemma 8 (LBB condition for the HDG method) For any p h
Proof By classical results [18], there exists a u ∈ H 1 (Ω) such that Put v h = I V v and v h = IV v on each facet. Then, (42) and (25) Moreover, (as alluded to in [29]) it is easy to show that Choose η h ∈ W h as in (17) of Lemma 2. Then, by (42)- (43), concluding the proof.
Theorem 1 (Error estimates for the HDG method) Let u, u, ω, p denote the exact solution that satisfies the regularity assumption of Lemma 5 and let ((u h , u h , ω h ), p h ) ∈ U h × Q h be the discrete solution of (33). Then the errors in u h , u h , ω h can be bounded independently of the pressure error by~u Furthermore, the pressure error satisfies Proof By (25), div(I V u) = I Q div u = 0. Moreover, by (33b), div u h = 0. Hence by Lemma 6. Now we claim that~E To see this, first note that local scaling arguments give Hence the extra terms in~E h~2 ε,+ that are not iñ E h~2 ε can be bounded by applying (48) and (35) with by Lemma 2. This proves (47). Using (47) in (46), we conclude that~E h~ε ~E~ε ,+ . Combining with triangle inequality, where we have applied (28) in the last step. This proves (44). For the pressure estimate, we begin with triangle inequality and Lemma 8: To bound the numerator of the supremum, we use Lemma 5: Hence the already proved estimate (49), together with the standard L 2 projection error estimates finish the proof of (45).

An MCS Formulation with H(div)-Conforming Vorticity
In this section we derive a new mixed method for the approximation of (2), motivated by the weak formulation (7).
consider the terms on the right. When (σ h ) nt is continuous across element interfaces, the first two terms together realizes the duality pairing introduced in Sect. 2, namely div σ, v h div , per [20, Theorem 3.1]. The third term is used to impose the nt-continuity of the viscous stress (and prior works [20,21,24] provided enough rationale to employ nt-continuous finite elements for viscous stresses). Note, that a similar nt-continuous approximation of the gradient (but not the physical viscous stresses ε(u)) was also already considered in [17]. Due to the Dirichlet conditions built into V h on Γ D (see (8b)), this term is comprised only of integrals over facets in the interior and on Γ N , with the latter enforcing σ nt = 0 in Γ N as demanded by (2f). Finally, the last term above is used to weakly incorporate the symmetry constraint (2c). This technique of imposing symmetry weakly is widely used in finite elements for linear elasticity [1,2,4,6,15,19,34]. Viewing (7) in terms of div ·, · U h , we are led to the following mixed method: find and q h ∈ Q h , with the stabilizing bilinear form c(ω h , η h ) := νh 2 (div ω h , div η h ) Ω . Note that since ω h approximates the vorticity ω = curl(u), we have div ω = 0, so c(·, ·) is a consistent addition. Although the formulation (51) is very similar to the formulations from [20,21], note the following differences. First, while the nt-continuity of viscous stresses was built into the spaces in [20,21], now it is incorporated as an equation of the method by the well-known hybridization technique. Second, although we use the same local stress finite element space as in [21], we use the weak symmetric setting from [20]. In the latter, the Lagrange multiplier for the weak symmetry constraint was given by an element-wise discontinuous approximation, whereas here it is in the div-conforming W h .

Stability of the MCS Method
From the terms in (51), we anticipate that the norms · U h and · ε are more natural for the analysis of the MCS method (in contrast to the HDG method). The latter appears in the next lemma.
the following estimates hold: Proof Inequality (52a) is proved just like (39). To prove (52b), let us first note an equivalent and more compact form of div σ h ; v h , v h , η h U h obtained by integrating (50) by parts (see e.g, [20, eq. (3.11) Using (53), the fact that σ h is trace-free, the Cauchy-Schwarz inequality, and the following estimate (which follows by a local scaling argument using a specific mapping mentioned in the beginning of §3), we get where the last inequality is due to the same argument as in (23). Thus (52b) follows from Lemma 4.

Lemma 10 For any
where λ F is the linear barycentric coordinate function associated to the vertex opposite to the facet F. Since λ F S F i has a vanishing nt-trace and Π 0 dev(∇u h − κ(ω h )) ∈ D, we see that τ h = γ 0 τ 0 h + γ 1 τ 1 h , for any γ 0 , γ 1 ∈ R, is an element of Σ h . Also set q h = − div u h , so that −(div u h , q h ) = div u h 2 0 . For these choices, (55) obviously holds as long as γ i is chosen independent of h and ν. Indeed, such γ i can be chosen to also ensure that U h , so that (56) also holds. This follows from an argument which (we omit and) is similar to that detailed in [20,Lemma 6.5], proceeding simply by appropriately combining Young and Cauchy-Schwarz inequalities.
The combined bilinear form of the MCS method (51) is given by Define a norm on the product space Proof We will find the required s as a sum of three terms, each in S h , and each depending on the given r . The first term is set using The second term iss = (ντ h , 0, 0, 0, νq h ) ∈ S h , where τ h ∈ Σ h and q h ∈ Q h are as in Lemma 10 obtained using the given components u h , u h , ω h of r . The lemma gives somẽ C > 0 such that The third term is is as in Lemma 8 obtained using the given component p h of r . The lemma implies that div v h = p h and Note that to obtain the last inequality, we have also used Lemma 4. Now letting β > 0, a constant yet to be chosen, put s = βs * +s + s Δ . Then, combining (59a), (60a) and (61a), where . By (60b) and Young's inequality, To bound ρ 2 , note that by Lemma 9, To bound ρ 3 , we recall from Lemma 4 that h div there is a C > 0 such that Using these estimates for ρ i in (62), SinceC, C Δ and C are mesh-independent constants, choosing β > max(C + C Δ , C )/2 and recalling the norm equivalence of Lemma 4, we prove (57). Of course, inequality (58) follows from (59b), (60b), and (61b).

Pressure Robust Error Analysis of MCS Scheme
In addition to the spaces U reg and Q reg , the a priori error analysis will now also use a stress space with improved regularity, Σ reg := Σ sym ∩ H 1 (T , D). Note that the integrals in the terms defining B(σ, u, u, ω, p; ·) are well-defined for σ ∈ Σ reg , (u, u t , ω) ∈ U reg , and p ∈ Q reg , so B(·, ·) can be extended to such non-discrete arguments.
Proof Since σ is symmetric we have that σ : κ(η h ) = 0. Next, using the regularity assumptions, starting from (53), we get where the boundary integral vanished using (2f) given on Γ N . Next, since The final remaining term in the bilinear form is also zero since (div u, q h ) = 0 as the exact solution is divergence free.
Furthermore, the pressure error can be bounded by Proof As in the proof of Theorem 1, let then E h S h ν 1/2 h u H 2 (T ) , which is enough to yield the stated pressure-independent estimate (64): indeed, lettingĒ : using the interpolation estimates (28)- (29) to bound Ē S h . Inequality (67) obviously implies (64). Therefore we focus on proving (66) and proceed to separately inspect each term forming its left hand side. Let E j with j ∈ {σ, u, u, ω, p} denote the corresponding components of the interpolation error. Then (53) implies As (τ h ) nt is constant on each facet, we can insert Π 0 in the last term, so several applications of the Cauchy-Schwarz inequality with h 1/2 and h −1/2 weights for the boundary terms yields where we used (54) again and the interpolation estimate (28) in the last step. Next consider the symmetrically opposite term in B. Since ∇v h ∈ P 0 (T ) and E σ is orthogonal to facet-wise and element-wise constant functions [see (26) where on the right hand side of the last inequality, the first term is obtained using (20b) and Lemma 4, while the second term is obtained using (48). Thus, the interpolation estimate (29) and Lemma 2 imply The remaining terms are easy: by Cauchy-Schwarz inequality, and by the definition of I W , I Q and (25), where the last equation is due to div v h ∈ P 0 (T ). Summing up (68), (69), (70), and (71), we prove (66), and hence (64). The pressure error estimate (65) follows along the same lines as in the proof of Theorem 1.

Numerical Examples
In this last section we present a simple numerical example to provide a practical illustration of the theoretical asymptotic convergence rates as well as to compare the two new methods we presented. Both methods were implemented within the finite element library NGSolve/Netgen  a higher order of convergence whenever the dual problem shows enough regularity [3,21]. Not surprisingly therefore, we observe quadratic convergence for the L 2 -norm of the velocity error for both methods. Condition numbers For both HDG and MCS method, after static condensation within the (u h , u h , ω h )or (σ h , u h , u h , ω h )-block of the finite element matrix respectively, we obtain a symmetric and positive definite diagonal block, which we simply refer to here as the "A"blocks of the respective methods. (Of course, due to the incompressibility constraint, the entire system is still of saddle point structure.) Both the A blocks have the same non-zero structure and are expected to have condition number O(h −2 ), but they discretize slightly different operators, namely ε for the HDG method, and dev(ε) for the MCS method. As ε(u) = dev(ε(u))+ 1 3 div(u) I and the true solution is divergence-free, adding the (consistent) term 1 3 div u h div v h to the MCS bilinear form yields an A block that is directly comparable to the one of the HDG method. In Fig. 3we show approximate condition numbers (cond) of said A blocks for some of the meshes used in the previous computations and different values of α in a hdg . We see that in addition to the MCS method not being dependent on any stabilization parameter in the first place, there appears to be no possible choice of α that would make the HDG method's A block better conditioned than that of the MCS method.

Author Contributions
The authors wrote, read and approved the final version this work and is responsible of all methods implemented.
Funding The authors acknowledge support from the Austrian Science Fund (FWF) through the research program "Taming complexity in partial differential systems" (F65)-project "Automated discretization in multiphysics" (P10), the research program W1245, and the NSF grants 1912779, 2136228.

Data Availability
The datasets and scripts to generate the data with the finite element library NGSolve www. ngsolve.org are available from the corresponding author and at [28].