Matrix-oriented FEM formulation for stationary and time-dependent PDEs on x-normal domains

When numerical solution of elliptic and parabolic partial differential equations is required to be highly accurate in space, the discrete problem usually takes the form of large-scale and sparse linear systems. In this work, as an alternative, for spatial discretization we provide a Matrix-Oriented formulation of the classical Finite Element Method, called MO-FEM, of arbitrary order $k\in\mathbb{N}$. On structured 2D domains (e.g. squares or rectangles) the discrete problem is then reformulated as a Sylvester matrix equation, that we solve by the reduced approach in the associated spectral space. On a quite general class of domains, namely normal domains, and even on special surfaces, the MO-FEM yields a multiterm Sylvester matrix equation where the additional terms account for the geometric contribution of the domain shape. In particular, we obtain a sequence of these equations after time discretization of parabolic problems by the IMEX Euler method. We apply the matrix-oriented form of the Preconditioned Conjugate Gradient (MO-PCG) method to solve each multiterm Sylvester equation for MO-FEM of degree $k=1,\dots,4$ and for the lumped $\mathbb{P}_1$ case. We choose a matrix-oriented preconditioner with a single-term form that captures the spectral properties of the whole multiterm Sylvester operator. For several numerical examples, we show a gain in computational time and memory occupation wrt the classical vector approach solving large sparse linear systems by a direct method or by the vector PCG with same preconditioning. As an application, we show the advantages of the MO-FEM-PCG to approximate Turing patterns with high spatial resolution in a reaction-diffusion PDE system for battery modeling.


Introduction
We are interested in the discretisation of (i) elliptic PDEs of the form where γ ≥ 0, (ii) parabolic PDEs of the form with d u > 0 being a diffusion coefficient, and (iii) reaction-diffusion systems (RDS) of the form with d u , d v > 0 being diffusion coeffcients.Problems (1)-( 3) are endowed with either homogeneous Dirichlet or Neumann boundary conditions, problems (2)-( 3) are endowed with suitable initial conditions.The relevance of the PDE problems (1)-( 3) is well-known, as they find numerous applications across all fields of science.We stress that the RDS (3) is the playground of Turing's theory of morphogenesis [32], which encompasses extremely diverse applications such as biological patterning [2], biomembrane modelling [11], tumour growth [7], metal dealloying [10], financial risk management [4], oscillating chemical reactions [33] and the recent applications to metal electrodeposition [5] which we will consider in the present work.We focus on the approximation of Turing patterns, because, from a computational point of view, this is a challenging task since fine meshes are required in space to capture the morphological class of the pattern itself (spots, labyrinths, etc) that must be attained as steady state of the PDE dynamics for long time of integration.Among the existing methods for the spatial discretisation of problems ( 1)-( 3) we mention finite differences [17], finite elements [16], spectral methods [7], kernel methods [18] and many more.It is well known that numerical methods typically approximate the elliptic problem (1) through an algebraic system in vector form: with the vector ξ containing the coefficients of the expansion of the numerical solution in a given discrete function basis, the matrix A approximating the operator L(u) = −∆u + γu and the vector b approximating the right-hand side of (1).For the time-dependent problems (2)-(3), a common general approach is the so-called method of lines (MOL), which consists of discretising the spatial variables with a spatial method of choice, thereby producing a continuous-in-time ODE system.For problem (2), in a general setting including several spatial methods, the resulting spatially discrete problem takes the form of the following ODE system in vector form with the vector ξ = ξ(t) containing the time-dependent coefficients of the expansion of u, the matrix A u approximating the operator L(u) := −d u ∆u, the vector b(t) approximating the right-hand-side of (2), and the matrix M depends on the spatial methods (e.g. the identity matrix for finite differences, mass matrix for finite elements).Similarly, the spatially discrete formulation of the RDS (3) becomes the following (possibly nonlinear) ODE system in vector form The main computational challenge of (4)-( 6) is dimensionality.If the discrete function space has dimension d, the matrices A, A u , A v , M appearing in (4)-( 6) are of size d×d.There are special cases, as detailed below, where such matrices possess a general Kronecker structure with n ∈ N terms, e.g.
with ⊗ denoting the Kronecker product and R i , L i being matrices of lower dimension, e.g.[9,15,23,25,26,28].In such cases, since where U is such that vec(U ) = ξ, problem (4) can be reformulated as the following linear algebraic matrix equation where B is such that vec(B) = b.Problem ( 9) is called a multiterm Sylvester equation, see [31].
The solution of general multiterm Sylvester equations is mostly uncharted territory, as discussed in [30,31] and references therein.
A special case of (9) worth mentioning is the two-term case n = 2, when (9) specialises to a generalised Sylvester equation and closed-form algorithms are available, such as the Bartels-Stewart algorithm [3] or its improvement proposed by Golub and others [13].If L i , R i further fulfil suitable assumptions, even more efficient closed-form algorithms are available, based on spectral decomposition, see for instance [9], as we will also discuss in the next sections.
For the time-dependent problems ( 5)-( 6), if the matrices A u , A v , M possess a Kronecker decomposition similar to (7), problems ( 5)-( 6) can be reformulated as matrix ODE systems.In this work, we apply the Implicit-Explicit (IMEX) Euler scheme directly to (6) in vector form, then we will consider its MO counterpart.This will yield a sequence of multiterm Sylvester matrix equations as detailed in Section 6. Matrix formulations of spatial methods for PDEs were successfully carried out in some notable cases.A class of elliptic problems with convection, posed on rectangular or parallelepypedal domains, was discretised via central finite differences in matrix-oriented form in [25] and the discrete problem takes the form of a multiterm Sylvester equation.The methodology was then extended to address more general polygonal domains, see [15].Elliptic anisotropic PDEs with stochastic terms were approximated via Galerkin method in matrix-oriented form in [26], the discrete problem is a multiterm Sylvester equation.Isogeometric analysis was successfully applied to various elliptic problems, see for instance [1,23,28].On square domains, the discrete problem is a generalized Sylvester equation, see [28].On more general domains defined through splines or NURBS, a Sylvester form can still be achieved if using suitable low-rank approximations of kernels, see [23].The work in [9] addresses time-dependent problems, specifically the heat equation and RDSs, on rectangular domains, where the spatial discretisation is carried out via central finite differences in matrix-oriented form and the discrete problem takes the form of a two-term Sylvester equation, which lends itself to an extremely efficient numerical treatment based on spectral decomposition.
In the present work we consider both elliptic PDEs of the form (1) and parabolic PDE problems such as the semilinear heat equation (2) and RDSs (3), posed on a class of two-dimensional spatial domains known as normal domains.For such PDE problems we propose a Matrix-Oriented Finite Element Method for the spatial discretisation, that we will define as MO-FEM.The proposed framework advances the existing theory on matrix-oriented spatial discretisation of PDEs in several directions, as listed below.
• To the best of the author's knowledge, the present work provides the first MO formulation of the finite element method for elliptic and parabolic PDEs.The proposed theory is general and applies to a large class of basis functions, such as Lagrangian P k basis functions, k ∈ N, thereby covering arbitrarily high-order convergence in space.Special focus is given to the practical special case of lumped P 1 finite elements.
• On rectangular domains, where Cartesian-structured mesh are immediate to construct, the discrete Laplacian takes the form of a two-term Sylvester equation, in analogy with matrixoriented finite differences [9] or isogeometric analysis [28].Moreover, since central finite differences are equivalent to lumped P 1 finite elements, the matrix-oriented FD discretization of the considered PDE problems can be considered as a special case of the proposed theory.In this case we solve the two term Sylvester equation by the reduced approach in the associated spectral space, see Section 4.
• Thanks to a suitable coordinate transformation, the proposed theory applies to domains more general than rectangles, namely normal domains.To the best of the authors' knowledge, the first applications of a matrix-oriented method to non-rectangular domains are (i) the work in [23], where isogeometric analysis with suitable low rank approximation of kernels is applied to elliptic problems and (ii) the work in [15], where a matrix-oriented finite difference scheme with conformal mappings is applied to elliptic problems on polygonal domains.On normal domains, the proposed approach adopts a curved mesh that matches the (possibly curved) boundary exactly.Hence, the proposed approach combines the low dimensionality of a matrix-oriented approach with the absence of geometric error.Since normal domains can be wrapped around a cylinder, the proposed method also applies to spatial domains that are special surfaces, namely cylinders with curvilinear edges.In all these cases, he discrete problem takes the form of a multiterm Sylvester equation, where the additional terms account for domain shape.In this case, for the numerical approximation of general multiterm Sylvester equations, we propose an iterative method: a matrix-oriented preconditioned conjugate gradient method (MO-PCG) that always converges for the considered PDE problems,thanks to the involved differential operators being coercive and self-adjoint, see [27,Section 6.7].
• We provide numerical experiments that demonstrate that (i) for elliptic problems on square domains, both the reduced approach and the MO-PCG exhibit optimal spatial convergence, (ii) for elliptic and parabolic problems on x-normal domains, MO-PCG exhibits optimal convergence in space (and time, if combined with IMEX Euler), and (iii) both the reduced approach and the MO-PCG provide a significant gain in terms of computational time and memory storage in comparison to the standard vector form, solved both via a direct solver and vector PCG.
• Concerning reaction-diffusion systems, we show that the MO-FEM allows for accurate simulation of Turing patterns -obtained as asymptotic solutions-that might be prohibitive, in terms of time and memory, through standard finite elements in vector form, because fine spatial grids are required to capture the features of pattern morphology, after long-time integration, see [9].
In particular, here we solve a RDS of interest for battery modeling on some x-normal domains, cap and jar shaped, and cylindrical surfaces with curvilinear boundaries.It is worth noting that the right-hand sides of such RDSs are not low-rank, then the solution of the multiterm Sylvester equations cannot be approximated by Krylov methods such those in [30].
The paper is organized as follows.In Section 2, we introduce preliminary definitions and results, we elaborate on the classes of spatial domains to be considered, and we introduce curvilinear Cartesian-structured meshes.In Section 3, we define a general finite element method and we derive a Kronecker decomposition of the discrete differential operators, thereby considering as a practical variation lumped P 1 finite elements.
In Section 4, we introduce the MO formulation for these FEMs.on different kind of spatial 2D domains.Specifically,we discretise the stationary PDE problem (1) on a square domain and we solve the corresponding Sylvester equation by a spectral (reduced ) approach, also in the case of Lumped FEM.In Section 5, we present the discretisation of the elliptic problem (1) on x-normal domains and we present the solution of the corresponding multiterm Sylvester equations by the proposed matrix-oriented PCG method for FEM of orders k = 1, . . ., 4 in space.A comparison with the vector PCG is provided in the numerical examples.
In Section 6, we extend the proposed MO-FEM approach to the semilinear heat equation ( 2) and we apply the MO-PCG method to solve the sequence of Sylvester multiterm equations by the application of the IMEX-Euler method in time.Specifically, in Section 6.1, we present the convergence results on a cap-shaped domain and the computational performance in terms of execution time.In Section 7 we present the numerical simulations of electrochemical patterns arising in batteries for different choice of the parameters in the reaction-diffusion system (3) corresponding to the DIB model [19,29] yielding Turing patterns with spots-worms and holes.We show that these patterns can be seen as PDE solutions on cylindrical surfaces.In Section 8, we provide some concluding remarks and highlight future research directions.

Normal domains and curvilinear structured meshes
In this section we introduce the classes of domains to be considered in this work and we construct suitable curvilinear meshes whose nodes possess a Cartesian ordering and that match curved boundaries exactly.We also introduce preliminary definitions and results that we will adopt in the construction of the proposed matrix-oriented finite element method.

Square domains
To make the reader familiar with the proposed setting, we start by discretising square domains.On the one-dimensional unitary domain: K := [0, 1], we consider the equally spaced mesh K h with (N + 1) nodes, with N ∈ N.For m = 0, . . ., N , we define the m-th node as x m = m/N .Each element of K h is of the form E m = [x m , x m+1 ] for some m = 0, . . ., N − 1.On the square two-dimensional domain: Ω := [0, 1] 2 we consider the Cartesian mesh Ω h with (N + 1) × (N + 1) nodes.For m, n = 0, . . ., N , we define the node for some m, n = 0, . . ., N − 1, see Fig. 1a for an illustration.The above construction has a tensor structure:

x-normal domains
We now introduce the more general class of x-normal domains, we discretise such domains through a curvilinear mesh and we provide some related results.Consider a smooth Cartesian curve x = L(y) for y ∈ [0, 1] such that L(y) > 0 for all y ∈ [0, 1].Consider the following x-normal domain The reference domain Ω = [0, 1] 2 and the x-normal domain Ω L are linked by the diffeomorphism η : Ω → Ω L defined as follows The idea of mapping a class of domains onto the reference square in order to exploit the tensor structure of the mesh is reminiscent, for example, of the work in [15].Let x L mn := η(x mn ), m, n = 0, . . ., N be the transformed nodes and let Ω L h := η(Ω h ) be the transformed (curvilinear) mesh.For each element Q ∈ Ω h let T := η(Q) be the corresponding transformed element, see Fig. 1b for an illustration.The Jacobian J η of η is given by so that det J η (x, y) = L(y).
The inverse transformation η −1 : Ω L → Ω is given by so the Jacobian J η −1 of η −1 is given by Switching back to the original coordinates, we have which implies that We finally define the matrix which is symmetric and uniformly positive definite on Ω.
Remark 1 (Symmetric x-normal domains) The proposed theory still holds true on symmetric x-normal domains of the form where S(y) > 0, y ∈ [0, 1] is a smooth function.In this case, equations ( 12)-( 19) hold true by setting and L(y) = 2S(y).A symmetric x-normal domain Ω S of the form (20) with its discretisation Ω S h is shown in Fig. 1c.It is worth remarking that the proposed theory can be easily extended to non-symmetric x-normal domains, we do not consider this case for ease of presentation.
Remark 2 (Curvilinear cylindrical surfaces) Every x-normal domain (11) satisfying L(0) = L(1) or symmetric x-normal domain (20) satisfying S(0) = S(1) can be wrapped around a cylinder through the transformation σ : Ω L → R 3 defined by We can thus define the curvilinear cylinder Γ and its curved mesh Γ h as respectively, see Fig. 2 for an illustration.We will show that the matrix approach proposed in the next section can be applied also to special surface PDEs, see for instance [12].
Remark 3 (More general domains) The choice of normal domains is justified by the property that each entry of the matrix H(x, y) defined in ( 19) is a finite sum of separable terms.The proposed approach still applies if the coordinate transformation η defined in (12) is such that the corresponding H(x, y) retains this property.

Matrix-oriented formulation of FEM
We are now ready to (i) formulate a finite element method on the (possibly curvilinear) meshes introduced in the previous section and (ii) derive a Kronecker decomposition of the discrete differential operators involved in problems ( 1)-(3).To make the reader familiar with the notations and results, we start with the case of square domains.

Square domains
Let {ψ i } N +1 i=1 be any finite element basis functions on the one-dimensional mesh I h , e.g.piecewise Lagrange polynomials of any fixed degree.The corresponding stiffness-and mass matrices A, M ∈ R (N +1)×(N +1) in 1D are defined by for i, j = 1, . . ., N + 1, respectively.On Ω h we choose the tensor-product local Lagrange basis defined as In general we will write φ i (x, y) = φ ix (x)φ iy (y) for all i = 1, . . ., (N + 1) 2 , meaning that every 2D basis function can be uniquely decomposed as a product of 1D basis functions.The stiffness-and mass matrices A, M ∈ R (N +1) 2 ×(N +1) 2 in 2D are defined by for i, j = 1, . . ., (N + 1) 2 , respectively.The stiffness-and mass matrices A and M in 2D fulfil the following relations: x 00 x 10 x 20 x 01 x 11 x 21 x 02 x 12 x 22 (a) Unit square Ω and its approximation Ω h for N = 2.

x-normal domains: P 1 elements with lumping
The Kronecker decompositions (35) are absolutely general and encompass arbitrary choices of the FEM basis function, thereby including the case of Lagrangian spatial methods of any order.However, for the sake of practicality, we consider the special case of Lagrangian P 1 finite elements with mass and stiffness lumping, see [24].We show that such special case significantly simplifies the matrix identities (35).This has two advantages: (i) existing finite element codes can be adapted to the proposed approach with minor modifications and (ii) the resulting numerical schemes, which take the form of matrix equations, become much easier to solve.
In the remainder of this section, we specialise {ψ i } N +1 i=1 to be the standard P 1 Lagrangian (also called pyramidal) basis functions.We consider the (diagonal) lumped mass matrix M 0 ∈ R (N +1)×(N +1) in 1D defined by with I h being the element-wise interpolant operator [24] and δ ij being the Kronecker symbol.We also consider the (tri-diagonal) convection matrix C ∈ R (N +1)×(N +1) defined by Then we consider the (tri-diagonal) modified stiffness matrices A 1 , A 2 ∈ R (N +1)×(N +1) defined by for i, j = 1, . . ., N + 1.We finally define the following auxiliary diagonal matrices ) defined as follows By combining (19) and (34) we get The matrices defined by ( 33) and (40) can be understood as the mass and stiffness matrices of the anisotropic elliptic equation with H as defined in (19).Hence, by following [24], the lumped counterparts M of M and A of A are defined by for i, j = 1, . . ., (N + 1) 2 , where I h is the element-wise, vector-valued interpolant operator, see [12].So, the tensorial decompositions of (42) are carried out as follows Equations ( 43)-(44) translate to the following matrix identities: As mentioned earlier, the matrix relations (45) are simpler than (35).This is because (i) the matrices D 1 , D 2 , D 3 , M 0 are now diagonal and (ii) the only non-diagonal matrices involved, i.e.A, A 1 , A 2 , C are now tridiagonal.
Remark 4 (Homogeneous Dirichlet boundary conditions) In the presence of homogeneous Dirichlet boundary conditions, it is well-known that the boundary basis functions must be eliminated from the basis, see [16], hence all the matrices involved in (45) must be trimmed by eliminating all boundary entries.Hence, the dimension of such matrices drops from (N + 1) × (N + 1) to (N − 1) × (N − 1) and the following additional properties hold true.
• M 0 is a multiple of the identity: for m 0 := 1 N it holds that • The stiffness matrix A is Toeplitz.
• Thanks to the symmetries of the ψ i 's, C is Toeplitz and skew-symmetric, i.e.
Because the dimension of the matrices depends on the kind of boundary conditions, see Remark 4, we set In the presence of homogeneous Dirichlet boundary conditions, thanks to (46) and (47), relations (45) become Remark 5 (Recap on lumped P 1 matrix properties) We recap here the properties of the matrices appearing in (45)and ( 49): • M 0 and D 1 are diagonal and positive definite.In the Dirichlet case M 0 is multiple of the identity; • D 2 is diagonal and it is non-singular only when the curve x = L(y) is strictly monotone; • D 3 is diagonal and is singular when N is odd, or tends to being singular when N is even and approaches infinity; • A, A 1 are tridiagonal, moreover they are positive definite in the Dirichlet case and semidefinite in the Neumann case (one null eigenvalue).A is symmetric and in the Dirichlet case it is Toeplitz; • A 2 is tridiagonal non-symmetric, it is singular for N odd, or tends to being singular for N even approaching infinity; • C = tridiag([−1, 0, 1]), except for boundary entries in the Neumann case (it is the matrix of centered first derivatives).According to N being even or odd, and depending on the boundary conditions, C is singular or tends to being singular as N → +∞.

Stationary PDEs and the Sylvester equation
In this section we consider the Poisson equation on the unit square Ω = [0, 1] 2 : where γ ≥ 0 for the case of Dirichlet boundary conditions and γ > 0 for the case of Neumann boundary conditions.Since we encompass general boundary condition, we use the notation in (48) for the dimension of matrices.The general FEM discretisation of problem (50) in vector form is then with ξ being the nodal vector of the numerical solution and f being the corresponding nodal vector of f .By using (28), the linear system (51) becomes the following generalised Sylvester matrix equation: where U and F are such that vec(U ) = ξ and vec(F ) = f .Since the mass matrix M is positivedefinite, we can pre-and post-multiply both sides of (52) by M −1 : Even if M −1 A and AM −1 are not symmetric, they are diagonalizable nonetheless because A, M are both symmetric and M is positive definite.Hence, also Z 1 := M −1 A + γ 2 I and Z 2 := AM −1 + γ 2 I are diagonalizable.Then, (54) becomes and we can diagonalise Z 1 , Z 2 as follows with Λ (k) ∈ R q×q , k = 1, 2, being the diagonal matrices containing the eigenvalues λ (k) i , i = 1, . . ., q of Z 1 and Z 2 , respectively.Except special cases, the Λ (k) 's and X (k) 's must be computed numerically.Now, by setting U := X (1) −1 U X (2) and F := X (1) −1 F X (2) , (54) becomes which can be solved as follows.Let L ∈ R q×q be the matrix defined by As shown in [9], the solution to (56) in the spectral space can be expressed as: with • denoting the Hadamard product.The original variable U is thus given by We call this technique FEM reduced method.
Remark 6 (Memory performance of the reduced approach) Because U , F , L, X (1) , X (2)  are the only full matrices involved in the computation, the memory occupation of the proposed approach is 5N 2 +O(kN ) floating point numbers for any k.On the other hand, since the large matrices A and M are (2k + 1) 2 -diagonal, where k is the polynomial order of the method, and f and ξ are both full vectors, the vector formulation (51) has a memory occupation of ((2k + 1) 2 + 2)N 2 + O(kN ) floating point numbers.

Special case: Lagrangian P 1 elements with Dirichlet boundary conditions
In the special case of Lagrangian P 1 elements with Dirichlet boundary conditions, we are able to solve the matrix equation (54) in closed form without computing spectral decompositions numerically.In fact, since M and A are both symmetric, positive definite, tridiagonal Toeplitz matrices, there exist α, β > 0 such that M = αA + βI.In this specific case, we have This implies that (i) M −1 and A commute, (ii) M −1 and A share the same eigenvectors and (iii) M −1 A = AM −1 is diagonalizable and shares the same eigenvectors of M −1 and A. We can thus write with Λ ∈ R (N −1)×(N −1) being the diagonal matrix containing the eigenvalues λ i , i = 1, . . ., N − 1 of Z 1 = Z 2 .Now, by setting U := X −1 U X and F := X −1 F X, (54) becomes in the spectral space.We are left to show that X and Λ are known in closed form.In fact, as shown in [8], the eigenvalues and eigenvectors of A are given by Hence, by using (60) the entries of Λ are given by while the entries of X (common basis of eigenvectors of A, Z 1 and Z 2 ) are given in (64).Hence, U is given by (59) with Λ (1) = Λ (2) = Λ defined by (65) and X (1) = X (2) = X defined by (64).
For k = 1, we further compare the aforementioned methods with the reduced method in closed form (62).Here we consider a sequence of 7 meshes Γ i with N i = 24 • 2 i for all i = 0, . . ., 6.Such N i 's are compatible with P k elements for all k = 1, . . ., 4. The numerical results are shown in Fig. 3. On the finest mesh (N = 1536), the reduced approach (56) for k = 1 is approximately 1.38 times quicker than the direct (vector) method, 1.58 times quicker for k = 2, 1.70 times quicker for k = 3, and 1.96 times quicker for k = 4.The reduced approach in closed form (62) (only k = 1) is 21.73 times quicker than the vector formulation.Furthermore, on equal meshes, the methods produce the same solutions up to rounding errors (they are equivalent to each other) and exhibit optimal convergence in space, with the case k = 2 being superconvergent (fourth order), as we can see in Fig. 3, right plot.This and all the following experiments are carried out in MATLAB R2019a on a HP Z230 Tower Workstation with Intel Core i7-440 CPU and 16GB RAM.The timings were just taken once and not averaged with several measurements.

Stationary PDEs on x-normal domains and multiterm Sylvester equations
We consider the following stationary PDE problem on an x-normal domain Ω L : with γ ≥ 0 in the presence of zero Dirichlet boundary conditions and γ > 0 in the presence of zero Neumann boundary conditions.The FEM discretisation of problem (67) in vector form is then with A and M as defined in (32).The lumped counterpart of (68) is with M and A in (42).Here, ξ ∈ R q 2 is the nodal vector of the numerical solution and f ∈ R q 2 being the corresponding nodal vector of f .By using (35), the linear system (68) becomes the following multiterm Sylvester matrix equation: Similarly, by using (49) the "lumped" linear system (69) translates to: Thanks to (47), in the case of zero Dirichlet boundary conditions, (71) can be simplified as: We now propose an iterative strategy that can be applied for the solution of the multiterm matrix equations (70), ( 71) and (72).For the systems (68) and (69), the well known Preconditioned Conjugate Gradient method (PCG) [14] would be a suitable choice since the matrices A + γ M and A + γ M are symmetric and positive definite.Here, we propose its matrix oriented version to solve the corresponding Sylvester multiterm equations ( 70) and (72).To this end, we define the following matrix operators: for all U ∈ R q×q .We also consider suitable preconditioning operators P, P : R q×q → R q×q for the FEM and lumped FEM, respectively, whose choice will be discussed later.With these settings, the matrix-oriented formulation of the PCG for systems (68) and ( 69), that we define as MO-PCG, is given by ; )) ; where the and are omitted for ease of presentation.For the initial guess U (0) we can choose for instance U (0) = 0.
For illustrative purposes, we consider the following stopping criterion.If err v is the absolute error obtained by the direct method solving the linear systems (vector formulation) (68) or (69), we stop the MO-PCG iterations when the increment fulfils where • F denotes the Frobenius norm.

Memory performance of the PCG approach
Because U (s) , R (s) , Z (s) , Q (s) , F are the only full matrices involved in the computation, the memory occupation of the proposed approach is 5N 2 + O(kN ), with N being the grid size and k is the FEM polynomial order On the other hand, the vector -Kronecker formulation (51) has a memory occupation of ((2k + 1) 2 + 2)N 2 + O(kN ) for matrix storage, where k is the polynomial order of the method, as discussed in Remark 6. the convergence is optimal of order k + 1, for k = 2, we observe superconvergence of order 4. PCG is cheaper in execution time and for all k and for all N stops after two iterations, see also Table 1.
Table 1: Poisson equation (66) on the square Ω = [0, 1] 2 : computational times on the finest mesh (N = 1536) for all k = 1, 2, 3, 4 (see Fig. 4, left plot) and the respective time ratios between the direct method for the vector form (51) and the MO-PCG method (77)-( 78 method for the vector approach and the gap increases with N , as we can see in Fig. 4, left plot.The detailed time comparisons on the finest mesh are shown in Table 1.For all k, the methods exhibit optimal convergence ((k + 1)-th order), with the case k = 2 being superconvergent (fourth order) as we can see in Fig. 4, right plot.On the finest mesh (N = 1536, k = 4) errors near the machine precision affect the convergence order, with MO-PCG being more accurate.For all k and for all N , PCG terminates with two iterations.In conclusion, the MO-PCG method outperforms also the reduced matrix approach (56) when the eigenvalue decompositions (55) are computed numerically.

Example 2: Poisson equation on curved domain
Consider the cap-shaped symmetric x-normal domain (shown in the next Fig.6).On Ω S we consider the Poisson equation with zero Dirichlet boundary conditions: were f (x, y) is chosen in such a way that the exact solution is we omit the cumbersome expression of such f (x, y).We consider the P k elements (68), k = 1, . . ., 4, and the lumped P 1 elements in vector form (69) and we solve again these classical linear systems via the MATLAB direct solver mldivide.Hence, we compare the perfomance in terms of computational time with the matrix-oriented PCG method (77)-( 78) with preconditioners as in (80), respectively, that solve the multiterm Sylvester equations arising by the MO-FEM.By applying all methods on a sequence of 6 meshes Ω S i , i = 0, . . ., 5 with N i = 24 • 2 i for all i = 0, . . ., 5 we find optimal quadratic convergence in L 2 (Ω S ) and almost equal errors, as reported in Fig. 5, upper right plot.By comparing the computational times, as shown in Fig. 5, upper left plot, the MO-PCG is competitive only for k = 1 and moderate meshsizes N < 768.Note that the no lumped FEM is slightly better than the lumped version.On finer meshes, say for N ≥ 800, the number of iterations required by MO-PCG increases dramatically for k ≥ 2, so affecting its global performance.We guess that different preconditioners could improve this behaviour, but this study is outside the scope of the present work.On the other hand, we find in this particular experiment that, the PCG method in its classical vector form is much slower the our MO-PCG, as discussed before.In fact, by applying the built-in function pcg of Matlab with the same preconditioner, we find that MO-PCG is much quicker and the gap increases with N .For example, for k = 1 without lumping, the speedup factor is: 1.48 for N = 24 and 30.82 for N = 768.As a final remark, it is worth noting that in any case, as N and k increase, MO-PCG becomes more competitive than the Direct (Kronecker) solver in terms of memory occupation, as explained before in more detail.
In the next section we will show that the MO-PCG approach will prove far more convenient in the case of time-dependent PDEs.

Time-dependent PDEs
On a general x-normal domain Ω L we consider the following semilinear heat equation  3) is completely analogous and we omit the details.The P k and lumped P 1 FEM spatial discretisations of (84) in vector form are as follows respectively, where ξ(t) ∈ R q 2 , with q as defined in (48), is the time-dependent nodal vector of the spatially discrete solution.A full discretisation can be obtained by applying the IMEX Euler timestepping scheme with timestep τ > 0 to (85) and (86), which yields respectively, where N T := T τ , ξ (k) is the nodal vector of the fully discrete solution at time t k := kτ and f (k) := f (ξ (k) ), see [12].Following [9], the fully discrete scheme (88) can be further accelerated through an a-priori LU factorisation of the coefficient matrices (in combination with symamd reordering of such matrix to further increase sparsity).In matrix-oriented form, (87) and (88) become Table 2: Test 1 -Semilinear heat equation (97) on the cap-shaped domain: P k finite elements, k = 1, 2, 3, 4 and lumped P 1 elements.We show the convergence rates in space and time obtained by solving the vector formulations (87) and (88) by the direct method and the corresponding MO-FEM by the MO-PCG method (89).The convergence rates are optimal for all k = 1, 2,  1) and tol = 1.This choice of timesteps allows to highlight optimal convergence in L 2 (Ω S ) norm (i.e.(k + 1)-th order in space and first order in time).We have confined this test to these N values because the timestep τ i,k would become too small for larger values of N when k > 1.The results are shown in Table 2.
In Test 2, we consider N = 480, 960, 1920, fixed τ = 1e-2.This test is more representative of typical user-case scenarios where high spatial resolution is required.The obtained results indicate a significant advantage of MO-PCG and are shown in Table 3 for all k and for all N .At each timestep, MO-PCG converges with just one iteration, except with lumped P 1 elements, where up to two PCG iterations per timestep are required.

Applications to pattern formation in battery modeling
We now consider the following reaction-diffusion model in two variables η : Ω × [0, T ] → R and θ : Ω × [0, T ] → [0, 1], endowed with zero Neumann boundary conditions, on an arbitrary compact domain where d θ > 0 is the diffusion coefficient, ρ > 0 is a space-time rescaling factor and the kinetics are with α, γ, A 1 , A 2 , B, C, D, k 2 , k 3 positive parameters.
The PDE system (98)-( 100) is known as DIB model and has been introduced for the first time in [5] to describe electrodeposition processes.Under suitable choices of the parameters and of the domain Ω, this model was shown to possess a variety of spatially structured solutions, known as Turing patterns, see for example [20,21].An interesting application in battery modeling is reported in [19,22].Turing patterns are obtained as stationary solutions of (98)-(100) and then their numerical approximation requires highly spatial accuracy for longtime integration, this motivates the  97) on the cap-shaped domain (82), solved through P k finite elements, k = 1, 2, 3, 4 and lumped P 1 elements.For each meshsize N , we apply the IMEX Euler method with timestep τ = 0.01.In all cases, the MO-PCG approach is quicker than the vector (direct) approach, especially for k = 1 and k = 4.The gap increases with N , as shown by comparing the time ratios.MO-PCG always converges with one single iteration, except with lumped P 1 elements, where two iterations are required.100) is [9] where finite differences and several time solvers have been proposed on square domains.In [9], the Sylvester matrix equations obtained at each time step have been approximated by the reduced approach, similar to the one in Section 4. For example the IMEX Euler yielded the rEuler method, that revealed much more efficient than its classical vector approach.On the other hand, domain geometry was also proven to play an important role in pattern selection, as shown also in [19,22], for this reason efficient solvers that can be applied on domains as general as possible are need.Towards this aim, here we propose the matrix oriented FEM spatial approximation and the MO-PCG approach presented in Setion 6 with preconditioner (95) to deal in particular with some x-normal domains.We will present two kind of simulations, first on the cap shaped domain introduced in (82) and then on the jar-domain shown in Fig. 2a that correspond to the curvilinear cylinder in 2b.In both cases we will consider domains of increasing effective domain size given by A = ρ|Ω|, where |Ω| is the area of the domain in (98).In fact, as shown in [19,22] there exists a sufficiently large A * such that for A > A * the intrinsic Turing pattern corresponding to the fixed model parameters arises (see also [19] for more details), otherwise only a portion of it can be approximated, giving rise to doubts about its classification.

N k
To solve on domains of large sizes, we exploit the meaning and the role of the parameter ρ in (98), as follows.By introducing new variables ( x, y) = √ ρ(x, y), the chain rule yields Hence, we define Ω ρ := √ ρΩ and T ρ := ρT .Hence, ρ acts as a rescaling parameter in space and time.
In the following simulations we always solve 110 in the reference domanis Ω cap shaped e jar shaped with final time time T and timespet τ such that T ρ = 300 and τ ρ = 5e − 3 which guarantees the stability of the IMEX Euler method.The corresponding numerical solutions will be plotted in the rescaled domain Ω ρ for T ρ = 300.In all the experiments fix the following model parameters: The initial data are given by θ 0 (x, y) = θ e + 10 −4 rand(x, y) and η 0 (x, y) = η e + 10 −4 rand(x, y) and are small spatially random perturbations of the homogeneous equilibrium (η e , θ e ) := (0, 0.5).

Cap-shaped domain
In this example, we consider the cap-shaped domain (82) and we choose A 2 = 30, B = 25, C = 7, that, according to the segmentation results in [29], can yield mixed spots-worms Turing patterns at the steady state.We solve the PDE RDS system with P k elements (k = 1, 2, 3, 4) and lumped P 1 elements in space and by the IMEX Euler, as in the previous section.We compare the vector approach solving the sequence of linear systems in (87)-(88) by the direct method (that we will call "vector method") with the MO-PCG approach (90), solving the multiterm Sylvester equations arising at each time step for this choice of the domain.
We solve the DIB model (98) with different combinations of ρ, N x and N y as listed in Table 4, that is for domains of larger area A = |Ω ρ |.In all the computations, we discretise the x dimension with N x nodes and the y dimension with N y nodes, with N x = 2N y , which reflects the aspect ratio of the domain.In Fig. 6, for each simulation, we report the final patterns obtained by the vector approach and by the MO-PCG, together with the respective increments η (k+1) − η (k) F as a function of time, with • F Frobenius norm.If such increment decreases over time and tends to an almost small stationary value, we deduce that the numerical solution is converging to a steady state.
For increasing values of the effective domain size A the solution morphology changes and a pattern with more structures is attained, as shows in Fig. 6,(a)-(c) corresponding to the values (a)-(c) in Table 4, respectively.In simulation (a) a good pattern is attained by both methods, but its morphology is not completely expressed.The vector and the matrix approach seem to be equivalent in this case also in terms of computational times (see Table 4).To capture the true Turing morphology a larger domain Ω and a sufficiently fine mesh is required, otherwise phantom patterns could be obtained.This is exactly what happens for the simulation in case (b), corresponding to the second row of both Fig. 6 and Table 4, where the same mesh of case (a) yields a "pixelated" pattern.Hence, for the same domain, in simulation (c) a finer mesh is used and both methods are finally able to attain a "complete" pattern.Note that, in vector form at each time step we solve a problem of dimension N x • N y = 200 • 400 = 8 • 10 4 , by the MO-PCG instead we solve a sequence of t = 6 • 10 4 rectangular multiterm Sylvester equations of size 400 × 200.Moreover, this example shows that the matrix-oriented PCG algorithm (77)-( 78) can successfully solve rectangular Sylvester equations.As shown in Table 4, the time ratios indicate that the matrix PCG approach (89)-(90) tends to become quicker than the vector-direct approach (88) with significant advantage only for P 4 and lumped and no lumped P 1 elements.For k = 2, 3, we guess that a different preconditioner could improve the results of the MO-PCG method.

Jar-shaped domain
Thanks to the results in the previous test, here we solve the model only with P 1 elements, both with and without lumping.To further explore the robustness of the matrix PCG approach w.r.t.domain complexity and mesh distortion, we consider the jar-shaped domain in Fig. 2a.We fix the parameters A 2 = 1, B = 30, C = 3 for the DIB model which are known to produce Turing patterns with holes (also called reversed spots) [29] and again we solve for increasing A with different combinations of ρ, N x and N y as listed in Table 5.In all the computations, we consider N x = 3N y , which reflects the aspect ratio of the domain.The timestep and the final time are as in the previous test on the cap-shaped domain.We show the P 1 solutions in Fig. 7.As we can see in the figure, in case (a) on the smallest domain only few holes arise in the pattern.In case (b) on the larger domain, more structures arise in the pattern, but the mesh is too coarse and a phantom pattern arises.In case (c), the intrinsic Turing pattern is well-resolved for N x = 600 and N y = 200.The MO-PCG and the vector solutions are very similar, but the MO approach converges in significant less time (see Table 5).
We conclude by remarking that, since the jar-shaped domain in Fig. 2a can be transformed to the cylinder Γ shown in Fig. 2b, then the solutions shown in Fig. 7 can be interpreted, after the coordinate transformation (21), as solutions to the surface DIB model, that is (98)-(100) where the Laplace operator ∆ is replaced by the Laplace-Beltrami operator ∆ Γ on the cylinder Γ.As an example, we report in Fig. 8 the solution in (a) wrapped on a curvilinear cylinder.The application of the model on a cylindrical surface can be of applicative interest as shown in [6], in which the authors consider the use of cylindrical Zn sponges as a means of limiting the shape change and dendrite formation issues in Zn-based rechargeable batteries.6 and performance comparison between the vector approach (87)-(88) based on LU decomposition and the matrix PCG method (90).When the time ratio r t is close to 1, the methods take approximately the same time; r t > 1 indicates that MO-PCG is quicker.The MO-PCG is less expensive as N x and N y increase.Simulation in (b) yields a phantom pattern.The last column shows the amount of iterations required by MO-PCG for each of the two PDEs of the model.4.

A
Table 5: Reaction-diffusion model ( 98)-(100) with parameters as in Section 7.2 on the jar-shaped domain in Fig. 2a: performance comparison between the vector method (87)-(88) with LU decomposition and the matrix PCG method (90).The vector PCG approach tends to become quicker than the vector approach as N x and N y increase.5.In case (a) on the smallest domain only few holes arise in the pattern.In case (b), for ρ = 20000 on the larger domain the mesh is too coarse and a phantom pattern arises.In case (c), the intrinsic Turing pattern is well-resolved for N x = 600 and N y = 200.MO-PCG and vector solutions are very similar, but the MO approach converges in significant less time (see Table 5).

Conclusions
In this work we have provided a matrix-oriented formulation for Lagrangian finite elements of arbitrarily high order k ∈ N on x-normal domains.The proposed approach applies to both elliptic and parabolic PDE problems.The discrete problems take the form of a matrix equation (or a sequence of matrix Sylvester equations in the time-dependent case) of much smaller dimension that is mathematically equivalent to the much larger standard linear systems in Kronecker form.The proposed approach adopts a curvilinear structured mesh that eliminates geometric boundary error.Moreover, through a coordinate transformation, our approach applies also to special surface domains, namely cylinders with arbitrary curved boundaries.On square domains, the discrete problems take the form of a generalised two-term Sylvester equation that we solve efficiently through a spectral approach for all k.In this sense, our work extends the findings in [9], based on classical finite differences, to the case of high order FEM in space.On general x-normal domains, the discrete problem takes the form of (a sequence of) multiterm Sylvester equations which we solve through a matrix-oriented PCG method with matrix-oriented preconditioner that is quick to evaluate thanks to is single-term form.On one hand, such solver is always quicker than the classical PCG in vector form.On the other hand, we show by several numerical tests that it is quicker than the optimised direct solver mldivide of MATLAB in the case of (i) time-dependent PDEs on general x-normal domains and (ii) elliptic PDEs on square domains.In terms of memory occupation, the matrix-oriented PCG method always improves on any direct or iterative solver that relies on the full storage of the Kronecker matrix, and the gap increases both with the number of gridpoints N and the polynomial order k of the method.The only case in which we could not find any speedup in the MO-PCG approach is that of elliptic problems on non-square domains with P k elements, k = 1.This opens the quest for efficient preconditioners, which will be addressed in future studies.Special consideration deserves the application to reaction-diffusion systems, where the simulation of fine-grained patterns requires high spatial resolution, which translates into computationally intense simulations both in time and memory.Our experiments for the approximation of Turing patterns arising in batteries as solution of the DIB morphochemical model provide encouraging results in this direction and justify the matrix approach in terms of execution times and storage.The best performance gains were observed with P 4 and lumped P 1 elements.Also in this case, we believe that further performance gains can be found through the development of more efficient preconditioners and more efficient solvers for multiterm Sylvester equations, such as a truncated PCG [30].These aspects will be addressed in future studies.

Figure 1 :
Figure 1: Pictorial illustration of the classes of spatial domains considered in this work, together with their curvilinear meshes.
(a) Symmetric x-normal domain Ω S of the form(20) with S(y) = 1 + 1 2 sin 2πy with its mesh Ω S h .(b)Curvilinear cylinder Γ defined in(22) with its mesh Γ h corresponding to the 2D x-normal domain in (a).

Figure 2 :
Figure 2: An x-normal domain Ω S and its mesh Ω S h are transformed into the curvilinear cylinder Γ and its mesh Γ h , respectively.The transformation σ in (21) joins the top-and bottom edges of Ω S , highlighted in red.

Figure 4 :
Figure 4: Poisson equation (66) on the square Ω = [0, 1] 2 .FEM of order k = 1, 2, 3, 4: comparison between the "vector" formulation (51) (dashed lines) solved by the direct method and the MO-FEM solved by the MO-PCG (continuous lines).Left plot: computational times.Right plot: Convergence behaviour of the two approaches for all k.Dotted lines indicate slopes 2,4 and 5.For k = 1, 3, 4, the convergence is optimal of order k + 1, for k = 2, we observe superconvergence of order 4. PCG is cheaper in execution time and for all k and for all N stops after two iterations, see also Table1.

Figure 5 :
Figure 5: Poisson equation (83) on the cap-shaped domain Ω S defined in (82).Continuous lines indicate MO-PCG, while dashed lines indicate the vector (Kronecker) formulations solved by the direct method.In terms of computational time, MO-PCG is competitive only for k = 1, without lumping (upper left panel).Both approaches exhibit optimal convergence for all FEM order k, with the case k = 2 being superconvergent, round-off error becomes dominant on the finest mesh for k = 4 (upper-right and lower-left panels).Number of iterations required by the MO-PCG increases with k and N , lower-right panel.

with d u > 0 ,
endowed with either homogeneous Dirichlet boundary conditions u |∂Ω S = 0 or homogeneous Neumann boundary conditions (∇u • n) |∂Ω S = 0.The treatment of RDSs of the form (

Figure 6 :
Figure 6: DIB model (98)-(100) on the cap-shaped domain, lumped P 1 solutions.The solutions are plotted on the rescaled domain Ω ρ .Each row corresponds to the (A, ρ, N x , N y ) combination in Table 4.For ρ = 400 (a) only few structures arise in the pattern.For ρ = 20000: in (b) N x = 200, N y = 100 are not sufficient to resolve the pattern structure, that instead is well-resolved for N x = 400 and N y = 200 in (c).Smaller values of N x and N y do not provide sufficient spatial approximation on the larger domain and the pattern appear grainy (b).The patterns obtained by the MO-PCG method and the vector (direct) method are very similar, they are stationary solutions obtained at T ρ = 300, are shown by the increment dynamics (left subplots).To compare the execution times by the lumped P 1 approximation see Table4.

Figure 7 :
Figure 7: DIB model (98)-(100) on the jar-shaped domain, P 1 solutions.Values of (ρ, N x , N y ) are given in Table5.In case (a) on the smallest domain only few holes arise in the pattern.In case (b), for ρ = 20000 on the larger domain the mesh is too coarse and a phantom pattern arises.In case (c), the intrinsic Turing pattern is well-resolved for N x = 600 and N y = 200.MO-PCG and vector solutions are very similar, but the MO approach converges in significant less time (see Table5).

Figure 8 :
Figure 8: Numerical solution of Fig. 7 (a) (ρ = 20000, N x = 600, N y = 200) mapped onto the cylinder Γ in Fig. 2b.It can be interpreted as the solution of the DIB surface reaction-diffusion model on Γ.