Computational Algorithms for Solving Spectral / hp Stabilized Incompressible Flow Problems

In this paper we implement the element-by-element preconditioner and inexact Newton-Krylov methods (developed in the past) for solving stabilized computational fluid dynamics (CFD) problems with spectral methods. Two different approaches are implemented for speeding up the process of solving both steady and unsteady incompressible Navier-Stokes equations. The first approach concerns the application of a scalable preconditioner namely the element by element LU preconditioner, while the second concerns the application of Newton-Krylov (NK) methods for solving non-linear problems. We obtain good agreement with benchmark results on standard CFD problems for various Reynolds numbers. We solve the Kovasznay flow and flow past a cylinder at Re-100 with this approach. We also utilize the Newton-Krylov algorithm to solve (in parallel) important model problems such as flow past a circular obstacle in a Newtonian flow field, three dimensional driven cavity, flow past a three dimensional cylinder with different immersion lengths. We explore the scalability and robustness of the formulations for both approaches and obtain very good speedup. Effective implementations of these procedures demonstrate for relatively coarse macro-meshes the power of higher order methods in obtaining highly accurate results in CFD. While the procedures adopted in the paper have been explored in the past the novelty lies with applications with higher order methods which have been known to be computationally intensive.


Introduction
In both computational research and practical applications there is continuing interest in solving incompressible fluid flow problems efficiently.Different approaches have been used for solving the Navier-Stokes equations.The finite element community has been beset with various difficulties such as satisfaction of the Ladyshenskaya Babuska Brezzi (LBB) condition between the velocity and pressure spaces (Reddy, 2010), presence of the zero pressure block in the weak form Galerkin formulation, spurious pressure and velocity fluctuations, along with non-convergence for moderate to high Reynolds number flows.Large scale applications with usage of LBB compatible finite element pairs however can be found in Elman (2005) and Benzi (2005) among others who also introduce techniques to solve the saddle point problem.
To alleviate some of the above mentioned difficulties, various approaches have been proposed.One formulation is the penalty finite element method, which treats the continuity equation as a constraint and adds this restriction on to the momentum equations.Penalty finite element methods provide acceptable results on the velocity fields.The pressures are post computed from the velocities via the incompressibility constraint.Using such techniques however to solve complex problems are beset with difficulties primarily resulting from the high condition numbers associated with the discrete form of the penalty terms which can engender convergence issues with high values of the penalty parameter.More specifically, for large scale problems, a high value of the Penalty parameter is required to satisfy the incompressibility condition and this leads to non-convergence for complex problems.Attractive alternatives that exist for solving incompressible flow utilize splitting techniques of Chorin type (Chorin, 1968) which split the equations pre-discretization.Post discretization splitting techniques requiring the usage of Yosida method (Dembo, 1982) have also been found to perform very well for large scale problems.
A separate formulation namely the Least squares finite element procedures (LSFEM) has recently been proposed as an alternative formulation for solving Navier-Stokes equations.LSFEM formulation provides a variational setting for the Navier-Stokes equations, and a symmetric positive definite matrix, which can be solved with conjugate gradient solvers combining Jacobi or multi-grid preconditioning (Quarteroni, 1999).Another advantage of the LSFEM is that it provides a parameter free formulation for solving incompressible flow.One of the main drawbacks of LSFEM is it often results in extensive loss of mass.LSFEM procedures also fail for problems in contraction regions resulting in spurious results.An attractive alternative to the techniques mentioned above (other than those highlighted), is a stabilized formulation in the context of streamline upwind Petrov Galerkin (SUPG), and pressure stabilized Petrov Galerkin formulations (PSPG) for incompressible flow (often mentioned as SUPS formulation in literature).The problems associated with the illconditioning of the discrete form of the Galerkin finite element formulation are alleviated with the addition of consistent terms into the variational formulation for solving the Navier-Stokes equations (Brooks, 1982;Brooks, 1980;Hughes, 1979;Hughes, 1982).In general the SUPS formulation for solving incompressible flow can be interpreted as a special case of the generic residual based variational multiscale (RBVMS) formulation of Hughes (1995) with the omission of two terms.Since the SUPS formulation has been studied extensively and provides excellent results for solutions of CFD problems we adhere to this specialized formulation in this article.We utilize the p-version FEM or the spectral/hp method for solving CFD problems.Stabilized finite element formulations have been used for solving incompressible flow with higher order spectral approximations in (Gervasio, 1998;Ranjan, 2016).However effective preconditioning techniques for these methods and efficient methods to solve large scale problems were not discussed.
Lower order finite element formulations have been used extensively to solve incompressible flow problems.There has been an interest lately to usage of higher order spectral approximations to solve CFD problems since they provide spectral (exponential) accuracy to known analytical solutions available.This allows the usage of coarse macro meshes for solving problems rather than extensively refined low order finite element meshes to capture evolving fields of interest.Due to the increase in the number of degrees of freedom of spectral element dicretizations, higher order spectral methods take a long time to converge and consequently have high computation times even on multiprocessors.On the other hand lower order finite element implementations converge faster, and it is known p version of the finite element method is slow to converge.Further higher order spectral methods have problems in maintaining monotonicity in the presence of singularities.
For two dimensional problems spectral implementations the quadrature step requires an O(p 6 ) operation where as for complicated three dimensional solutions, there is an increase in the quadrature requirements (due to increase in the degrees of freedom per element).To control the relatively high computation times for solving complicated problems, one has to resort to non-linear solvers and effective preconditioning techniques.In this context using effective preconditioners and Newton-Krylov algorithms are of interest for solving large linear systems generated from spectral discretizations of the Navier-Stokes equations.

Literature Review
A review of sparse preconditioners has been provided in Meister et al. (2001) for hyperbolic conservation laws.Details on the construction of the incomplete block LU preconditioner are provided by Sturler et al. (2005).Issues of the singularity of approximate inverse preconditioners have been outlined in Wang et al. (2005).Performances of the incomplete LU and Cholesky preconditioned iterative methods on GPU's have been explored by Naumov et al. (2011).Parallel implementations with Domain decomposition algorithms are presented by Barth et al. (1998).Theory and procedural development of non-linear solution techniques with the Newtons method is presented in Dembo et al. (1982) and Eisenstat et al. (1996).Applications of the Inexact Newton-Krylov methods for solving large computational fluid dynamics problems can be found in Shadid et al. (1997) where they solved the thermal convection in a square cavity, lid driven cavity flow at Reynolds number (Re) 10, 000 and backward facing step problems at Re-800 with stabilized methods.Elias et al. (2006) solved the two dimensional problems of flow inside a borewall well and flow through an abrupt contraction region with Inexact Newton-Krylov methods.Further developments in the area include solutions of the lid driven cavity at a range of Re-100 through Re-1000, backward facing step for Reynolds numbers of Re-100 and Re-500.Elias et al. (2006) also obtained the solutions of viscoplastic flows with the inexact Newton-Krylov methods for three-dimensional flows through a channel with a sudden expansion and the three-dimensional driven cavity problem (2006).Edge based data structures were found to perform better in solving complex CFD problems with stabilized methods.However scalability of edge based data structures on large processor counts was found to deteriorate and for the case of large processor counts an improvement in performance as compared to an element by element procedure was found to be marginal at best.Parallel implementations for solving Inexact Newton-Krylov procedures have been demonstrated in Hwang et al. (2005).They implemented the backtracking procedure for solving the driven cavity problem at Re-10, 000 and the backward facing step problem at Re-100, Re-500, Re-700 and Re-800.The approaches described above have been mostly confined to either a finite difference or a lower order finite element method implementation for solving Navier-Stokes equations other than multigrid preconditioning that was implemented for solving LSFEM incompressible flow by Ranjan (2012).In this paper we apply established algorithms for solving high order spectral discretized equations which are computationally intense problems otherwise.While the procedures that are employed are standard the effort is directed towards the development of efficient computational frameworks for obtaining accurate results for Navier-Stokes equations on relatively coarse levels of discretization with higher order methods which have been known to be computationally intense problems.

Incompressible Flow Equations
Let us consider the flow of a Newtonian fluid with density ρ, and viscosity µ.Let Ω ⊆ R N and t ∈ [0, T ] be the spatial and temporal domains respectively, where N is the number of space dimensions.Let Γ denote the boundary of Ω.The equations of fluid motion that describe the unsteady Navier-Stokes governing incompressible flows are provided by where, ρ and u, f, and σ are the density, velocity, body force, and stress tensor respectively.The stress tensor is written as a sum of the pressure and viscous components as Here, I is the identity tensor, µ = ρν, p is the pressure and u is the fluid velocity.The part of the boundary at which the velocity is assumed to be specified is denoted by Γ g : The above set of equations require a divergence free initial velocity field to complete the problem specification.

Finite Element Formulation
Consider a spectral/hp element discretization of Ω into subdomains, Ω e , k = 1, 2, ...., n el where n el is the number of spectral/hp elements into which the domain is divided.Based on this discretization, for velocity and pressure, we define the trail discrete function spaces ℑ hp u and ℑ hp p , and the weighting function spaces; W hp u and P hp p defined above.These spaces are selected, by taking the Dirichlet boundary conditions into account, as subspaces of ] n el and where ] is the finite-dimensional function space over Ω.We write the stabilized Galerkin formulation of eq 1-2 as follows, find u hp ∈ ℑ hp u and p hp ∈ ℑ hp p such that ∀ w hp ∈ W hp u , q hp ∈ P hp p ; ] As can be seen from Eqn. 6, three stabilizing terms are added to the standard Galerkin formulation.In Eqn. 6 the first three terms and the right hand side constitute the classical Galerkin formulation of the problem.The surface integrals on the right are obtained from the weak form development as the natural boundary conditions.The first series of elementlevel integrals comprise the SUPS stabilization terms, which are added to the variational formulation (Mittal, 1997;Sampath, 2001).The second series of integrals comprise the least squares terms on the incompressibility.The stabilization parameter τ is defined as Here, h denotes the element size, which is computed based on the equivalent diameter of the element.We have tested using the stabilization parameters for lower order finite element implementations for the solutions of incompressible Navier-Stokes equations in the interest of lower computation times.Stabilization parameters typically reduce in magnitude for higher order discretizations and increase the time the linear solver takes to converge.The second stabilization parameter δ on the continuity equation was set to zero.The spatial discretization of Eq. 6 leads to the following set of non-linear algebraic equations, Here, v is the vector of unknown nodal values of v hp , and p is the vector of nodal values of p hp , and the mass matrices have been specified with derivatives of the velocity components.The matrices N(v), K, and G are derived, respectively, from the advective, viscous, and pressure terms.The vectors F and E are due to boundary conditions.The subscripts κ 1 and κ 2 identify the SUPG and PSPG contributions, respectively.The various matrices forming the discrete finite element equations are provided.The definition of the terms in the stiffness matrices are outlined below; And the forcing functions are defined as This stabilization of the Galerkin formulation presented in this paper is a generalization of the Galerkin Least squares (GLS) formulation, and the SUPS procedure employed for incompressible flows.With such stabilization procedures described above, it is possible to use elements that have equal order interpolation functions for the velocity and pressure, and are otherwise unstable (Brooks, 1982).There are various ways to linearise the non-linear terms in stabilized finite element methodology.We follow the approach adopted in (Tezduyar, 1992) where the non-linear terms are allowed to lag behind one non-linear iteration.This is mentioned as the iteration lagged stabilization terms update.

Preconditioning
In this section, we present the use of preconditioners in solving a large scale element structured linear system: using Krylov subspace methods.In the above equation {x} denotes the solution of the linear system.We say that the element A admits an element structure ε if Here ε is a set of elements, A e is a dense square coefficient matrix (element matrix), P e is a the boolean connectivity matrix that maps the coordinates of the element stiffness matrix to the global matrix A. Note that A(x) is a non-linear operator in general which has been linearized prior to the construction of the linear system.It has been mentioned in Benzi et al. (1999) that effective preconditioners are required to solve sparse systems efficiently using Krylov-subspace methods.Some preconditioners exploit knowledge of the element structure.These so-called element by element (EBE) preconditioners operate exclusively on the element matrices (Gustafsson, 1986;Nour-Omid, 1985).The global matrix A is never assembled (computed).Typically, one computes an approximation to A −1 by computing the LU-factorizations of agglomerates of elements, and then by combining the results.However, these pre-conditioners are block element by element because they entail the factorizations of blocks of elements.They can be applied to a vector using high-throughput dense vector multiplications.Unfortunately, their quality is unsatisfactory to tackle more difficult linear systems.
The total time spent on preconditioning, excluding the factorization, is the product of the number of iterations and the execution time per iteration.That implies that it is equally important to reduce the number of iterations as increasing the computational performance.A compromise should be reached between reducing the iteration count, by forming a more accurate factorization for Incomplete LU (ILU) based factorizations and increasing the computational performance, to allow for dense operations.Current research appears to be focused on reducing the iteration count (Vannieuwenhoven, 2010).In this work, we seek to develop a true element-by-element preconditioner which completes the LU factorization of the element stiffness matrix in its dense form and applies it to the huge linear system obtained.Another problem with the development of preconditioners is the scalability of the factorizations while computing the agglomerates of elements.
Let us denote the non-linear problem comprised of the element stiffness matrix contributions by equation 20.Then the preconditioning operation is defined as Here, M is the preconditioning matrix.Let us denote the global stiffness matrix with The preconditioning operation is obtained as a series of backsolves after the factorizations of the matrix has been achieved.It is to be noted that the factorizations have been achieved for the non-linear operator that has been linearized prior to the factorization step.
Let us denote by the subscript pe the element stiffness matrix for a preconditioning element matrix in the domain to generate the non-linear system as where {x pe } is the restriction of the non-linear unknown vector, {x} on the element nodes and {b pe } is the restriction of the residual on the element.In the construction of the non-linear terms for the element stiffness matrix, we utilize a damping of the non-linear system.Thus, the elements of the stiffness matrices are obtained as In the above equation α denotes a constant which is used to damp the non-linear solution vector.We varied the damping coefficient α from 0.40-0.50.We consider a spectral element with a relatively high p level as a domain for which the preconditioning matrix was constructed.The storage of the matrix elements in sparse matrix format allows for affordable memory allocations, as well as a reduction in the operation count in construction of the LU factors that can be expensive for high p levels (p 6 ) operation in 2D or 3D).Linearization of the linear system of equations was achieved with the help of the method of successive substitution (or Picard method).Picard method has been known to be a robust technique which has a large radius of convergence and is based on the fixed point (contraction mapping) theorem.The specification of Dirichlet boundary conditions on the interface nodes between different spectral elements allowed for a communication free implementation with the preconditioned value contributions at the interface replaced by the corresponding residuals.

Kovasznay Flow
We proposed that the two constants in the general solutions, λ 1 and λ 2 , tend to 2π when the Reynolds number is very small.We utilize the solution provided with the above assumptions as the analytical solutions for the Navier-Stokes equations.
The analytical solutions are provided by where λ = Re/2 − (Re 2 /4 + 4π 2 ) 1 2 and p 0 is the reference pressure (an arbitrary constant).We assigned the reference pressure to zero for the purpose of this simulation.The SUPS spectral element formulation was used to formulate the problem and the discretization was performed with the spectral/hp element method.Nodal expansions (spectral elements) were used to obtain the discrete model.The exact solution was prescribed on the boundaries for the u and v components of the velocities.Pressure was assigned based on the reference pressure, set on the whole left face.Nonlinear convergence was declared when the nonlinear residual was reduced to a value of 10 −4 for both the velocities and the pressure.An under-relaxation factor of 0.50 was found to be necessary to affect convergence.For this problem the resulting system is ill-conditioned.The number of non-linear iterations for convergence was typically 14.The mesh used for the analysis is presented in Figure 1(a).A uniform polynomial expansion of order 5 was used in each element.The discrete mesh resulted in 25, 921 nodes, with three degrees of freedom per node for the finest spectral/hp mesh.Figure 1(a) and 1(b) present the spectral/hp mesh, u velocity contour plot over the domain of interest, and 1(c) and 1(d) present the v-velocity and pressure contour plots over the domain for a Reynolds number of 40.Spectral or exponential convergence of the velocity components (u & v) and pressure (p) are presented in Figure 2. Higher order convergence is evident for the three variables.For faster convergence and in the interest of low solve times we adhere to the stabilization parameter presented in eq. 7.
The iterations required to convergence for the un-preconditioned element-by-element solution (EBE) to the problem vs. the LU preconditioned (EBE-LU) operator is provided in Table 1.Notations for Tables 1-3 are as follows: Index denotes the non-linear iteration, Re the Reynolds number, N p for the number of processors, EBE the number of iterations required by the EBE procedure, EBE-LU for number of iterations required by EBE-LU preconditioned problem, and Time for the execution time (in secs).Ndof denotes the number of degrees of freedom.'-' is used to denote cases where one algorithm converged to the non-linear tolerance specified earlier than the other and thus those non-linear iterations are redundant for the first algorithm (or in cases where there is no comparison possible).The degrees of freedom for the problem were set to 77763.The EBE-LU algorithm was found to be 2× faster (in execution time) than the EBE algorithm on 16 processors.In terms of iterations required for convergence the EBE required 1.953 more iterations to converge than the EBE-LU algorithm.The metrics presented also provide a measure of the compute times for the back-solve operations in the EBE-LU algorithm.The EBE-LU back solve operations were found to contribute a negligible amount of time to the total execution time to reach converged results.
To check the speedup in parallel for the EBE-LU preconditioned system we increased the number of processors from 16 through 120.Table 2 provides the parallel performance measurements for the Kovasznay flow problem.Here p level denotes the order of the spectral/hp polynomial used for analysis, and N do f denotes the number of degrees of freedom in the problem and speedup is where, t N p1 denotes the execution time on N p1 processors, and t N pi is the execution time on N pi processors, where N p1 ≤ N pi (e.g.N p1 =16 and N pi =60, 120).
Increasing the number of processors, there is a slight increase in the number of iterations to obtain converged results.This is because of the slight differences in numerics while performing back solve and factorization processes per block in each processor.Thus we measured the execution time (in seconds) for one iteration per processor for the problem.On 16 processors this value was found to be 0.0619.On higher processor counts of 60 and 120 the values ranged from 0.002768 through 0.00136.Varying the number of processors from 16 to 60 provided superlinear increase in performance due to effective cache memory utilization.However varying the number of processors from 60 through 120 gave a speedup of 2.08 which is slightly higher than linear speedup.Verification of the algorithm was achieved using the L ∞ norm of the difference of the numerical results from the analytical solutions for the problem.The L ∞ norm was found to be 0.000407 and 0.000291 for the u, and v components of the velocity and 0.0320 for the pressure, respectively for a polynomial order of 5 for the problem.
6. Inexact Newton-Krylov Methods Upon implementaton of the EBE-LU preconditioner for solving the incompressible Navier-Stokes equations, we further explore using non-linear solvers for both two-and three-dimensional incompressible flows.We consider both steady state and transient analysis.Let us consider the discrete non-linear problem denoted by: where F : R n → R n is a nonlinear mapping with the following properties: (1) There exists an x * ∈ R n with F(x * ) = 0.
(2) F is continuously differentiable in a neighborhood of x * .
In the particular case of solution of computational fluid dynamics problems, x = {u, p} denotes the vector of nodal variables for the velocity field and pressures over the domain of interest.For moderate to high Reynolds numbers the nonlinear convective term is dominant and the solution procedures have to be able to accommodate the strong non-linearities.
We assume F is continuously differentiable in ℜ n sd where n sd is the number of spatial dimensions.Let us denote the Jacobian matrix by J ∈ ℜ n sd .Newton's method for solving the non-linear problem is a classical algorithm and can be formulated as follows (Elias, 2004;Elias, 2006) Newton Algorithm for k=0 step 1 until convergence solve Newton's method was designed for solving problems where the initial guess is close to the non-linear solution.Among the disadvantages of Newton's method are the requirements that the linear system be solved exactly, to machine accuracy which can be computationally expensive.Computing an exact solution using a direct method can also be prohibitively expensive if the number of unknowns is large and may not be justified when x k is far from the solution.In spectral computations where there are high computational costs involved with quadratures, it is imperative to resort to efficient procedures for solving this discrete system of equations.In this case Inexact Newton methods are used to compute some approximate solution leading to the following algorithm for k=0 step 1 until convergence Find some η k ∈ [0, 1] and s k that satisfy or a damped version of the final update to the new solution vector as: set x k+1 = x k + αs k where α is the damping parameter.For some η k where ∥ • ∥ is the norm of choice.This new formulation allows the use of an iterative linear algebra method, one first chooses η k and then applies the iterative solver to the algorithm until an s k is determined so the residual norm satisfies the convergence criterion.In this work η k is often called the forcing term, since its role is to force the residual of the above equation to be sufficiently small.In general, the non-linear forcing sequence needs to be specified to drive the solution toward the solution of the non-linear problem as In our implementation, we use the element-by-element (EBE) BiCGSTAB method to compute the s k .There exists several choices for the evaluation of the forcing function ∥η k ∥ as mentioned in Eisenstat et al. (1996).The forcing term is chosen to be a constant η k = 10 −04 .Other choices require an additional evaluation of the Jacobian, which is expensive in practical computations and in implementation procedures.
The derivations of the convergence behaviour for a generic non-linear problem have been outlined in Dembo et al. (1982).
The basic requirement for the convergence of the iterative method is that F is continuously differentiable in the neighbourhood of x * ∈ R n for which F(x * ) is non singular and that F ′ is Lipschitz continuous at x * with constant λ, i.e.
The construction of the tangent stiffness matrix required for the algorithm follows standard procedures in finite element analysis.We follow through some of the developments for obtaining the tangent stiffness matrices for inexact Newton-Krylov methods.The non-linear tangent stiffness matrix terms are evaluated from the residual equations based on Taylor series approximations.Let us denote the residual of the linearised non-linear system by the following: We expand the residual about the (known) solution at the r th iteration.
Ordering the terms of second order and higher we obtain since we seek to find the solution for which {R{x}} = 0, we obtain the following equation where the tangent stiffness matrix is defined as In the following sections we present the results obtained with the Inexact Newton-Krylov methods with the procedures described earlier for solving benchmark incompressible flow problems.
Flow Past a 2-D Cylinder at Different Reynolds Numbers After demonstration of the scalability and effectiveness of spectral/hp based implementations for solving incompressible flow problems, we extended the computational framework for the solutions of larger problems.We consider (as a second example) both steady and unsteady flow past a circular obstacle in a Newtonian flow field at a Reynolds number of 40 and 100 respectively.We consider a large computational domain of dimensions [46 × 41] for the steady state example.
A circular cylinder with unit diameter obstructs the flowfield with the center located at [15.5, 20.5].The presence of the cylinder was modeled as a porous domain with a low permeability.On account of the large computational domain the flow at the top, inlet and bottom are unperturbed from the presence of the cylinder and are prescribed to have a velocity of u-1.0 and v-0.A datum pressure of 0.0 was prescribed at the inlet face to anchor the pressure.The domain was discretized into [38 × 32] elements, with a p level of six in each element which resulted in a discrete system with a total of 132, 591 degrees of freedom.The length of the wake and drag obtained from the present results were in excellent agreement with benchmark results.The number of iterations to convergence for the flow past a cylinder problem has been presented in Table 3 with columns following the column Re-40.The number of processors was set to 32.The 'forcing term' η k for the inaccurate solve step was taken as a uniform value of 10 −04 for the problem.The problem took 12 non-linear iterations to converge.However these iterations were considerably cheaper (3.4 times) compared to the exact Newton solution of the problem.
Next we consider the flow past a cylinder for unsteady state analysis at Re = 100.The computational domain was taken of dimensions [28,16].The center of the cylinder of diameter 1.0 was located at [8,8].Reynolds number was based on the diameter of the cylinder, viscosity of the fluid, and the inflow free stream velocity at the inlet.The contour plot of the u-component of the velocity field for unsteady analysis has been presented in Figure 3. Also the v-component of the velocity in the near vicinity of the cylinder has been presented in Figure 4.

Three Dimensional Lid-driven Cavity Flow
We further test the NK algorithm for solving a benchmark three-dimensional problem with spectral/hp element method.
As the problem of interest, we choose the three-dimensional leaky lid-driven cavity flow problem.The domain comprises of a cavity of unit dimensions that is uniformly discretized into 14 × 14 × 14 spectral elements with p level of three in each element.The side walls of the cavity are specified to have no-slip and no-penetration boundary conditions.The top wall is set to move in a distinct orthogonal direction (either X, Y or Z) with a velocity of unity.The Reynolds number is based on the velocity of the moving lid, the viscosity of the fluid, and the length of an edge of the cavity.We consider the problem for two different Reynolds numbers of Re-100 and 400 respectively.The problem has been solved by Jiang et al. (1998) in three-dimensions and benchmark results have been provided.We provide the agreement between the present results and benchmark results of Jiang et al. (1998) in Figure 5.There is excellent agreement between the present results and reference results in literature for this problem for both Reynolds numbers.
Figure 6 presents the convergence in the L 2 norm of the residuals of the linearized problem for three-dimensional driven cavity flow for Reynolds number 400.The convergence history for two different linear steps (1 and 3) are provided.The forcing term for the linear convergence was taken as a value of 0.01 for Re-100 and 10 −4 for Re-400 respectively.The linearized system took from six through 96 iterations to converge for the Re-100 problem with a total of 318, 028 degrees of freedom.For the Reynolds number of 400 it took 22 iterations to converge to the non-linear tolerance specified.Nonlinear convergence was declared when the L 2 norm of the velocity reduced by a factor of 10 −04 when compared to the L 2 norm of the initial residual.In comparison, the LSFEM technique utilized by Jiang et al. (1998) utilized a 50 × 50 × 50 mesh with 5 degrees of freedom per element.This resulted in a total number of 625, 000 degrees of freedom.The efficacy of the spectral discretization is clear from the dof count which was found to be half of that found with lower order methods.
Fully Immersed Cylinder in 3-D We consider the flow past a cylinder for three-dimensional analysis.Earlier example considered the flow past a two dimensional cylinder in a large computational domain.Such flows provide faithful representations of the flow metrics for the actual three-dimensional problem since the domain is large in size and the flow can be assumed to be symmetric in the third dimension.Further for large computational domains the end effects are negligible in flow computations.In other situations where the cylinder lies in a relatively confined domain there are considerable interactions of the perturbed flow past the cylinder with the boundary layers and ignoring the third dimension will provide inaccurate results.
To consider such three dimensional effects we examine flow past a three dimensional domain of dimensions 14 × 12 × 2 at a Reynolds number of Re-200.The axial length of the cylinder is considered as 2 units.Reynolds number was based on the diameter of the cylinder taken as unity and density and viscosity of the fluid.We placed the center of the cylinder at the location of [5, 6, 0] with a domain height of 2 units.This places the cylinder symmetrically from the walls of the computational domain.A free stream velocity of u = 1 was specified on the inlet, front, and back face boundaries.
No-penetration boundary condition was specified at the bottom, top, front, back, and left faces.The exit was assumed to be traction free t y = 0.No velocity component in the third dimension was specified at all faces other than the exit face which was assumed to be traction free with t z = 0.The exit face was specified to be traction free in all three components of t x = t y = t z = 0. Pressure was specified as the datum pressure at the inlet to the computational domain of p = 0 at the left face.
The problem was discretized into 28 × 28 × 4 mesh with a p level = 4 in each element.This resulted in a discrete linear system with 868292 degrees of freedom.A space-time decoupled finite element formulation was utilized for solving the non-linear system in time.The inexact-newton krylov method was used to accelerate convergence of the problem to a relative linear tolerance of 10 −05 .The problem was simulated on 158 processors till an end time of 82.The time increment for the simulation was set at δ t = 0.1.This required solving the above linear system over 820 time steps for generating the simulation results.Figure 7 presents the iso-contour plots of the pressure at development of periodic steady state.Let us consider the flow past a partially immersed cylinder in 3D with Inexact Newton-Krylov methods.We consider a computational domain of dimensions [14 × 12 × 4].The height of the cylinder was considered to be h=2 units from the bottom face of the domain.Reynolds number considered for analysis was taken as Re-200.Reynolds number was based on the diameter of the cylinder and the velocity of free stream of u = 1.Inlet to the computational domain was considered on the left face of the domain.Similar to the fully immersed cylinder analysis we placed the center of the cylinder at the location of [5, 6, 0] with a domain height of 2 units.This places the cylinder symmetrically at the center from the walls of the computational domain.We define the immersion length of the cylinder in the domain as follows Here, h denotes the height of the cylinder, and H denotes the total height of the computational domain.On the boundaries of the immersed cylinder in the free stream Γ h we prescribe values of the velocity components (u.v.w) = {0, 0, 0}.A free stream velocity of u = 1 was specified on the inlet, front, and back face boundaries.No-penetration boundary condition was specified at the bottom, top, front, back, and left faces.The exit was assumed to be traction free t y = 0.No velocity component in the third dimension was specified at all faces other than the exit face which was assumed to be traction free with t z = 0.The exit face was specified to be traction free in all three components of t x = t y = t z = 0. Pressure was specified as the datum pressure at the inlet to the computational domain of p = 0 at the left face.The immersion length thus considered for analysis was determined as 0.5.We subject the three-dimensional problem for analysis.Fluctuations on the lift coefficient and the z-component of the force on the cylinder were metrics of interest.
We determine the coefficients with the help of integration of the forces on the sides of the cylinder.Let us denote the direction cosines of the inward normal with (n y , −n x , n z ).Let S denote the surface of the cylinder.The drag, lift and tangential forces were calculated with the above formulae The stress tensor is evaluated for Newtonian fluids as The force coefficients were evaluated with the help of the above formulae The diameter of the cylinder D was taken as the characteristic dimension for determination of the forces, the velocity of the free stream was considered as U. Since the cylinder was considered partially immersed in the domain considerable (non-trivial) forces were found for each of the variables.The mean drag coefficient for the partially immersed cylinder was found to be 1.20.Fluctuations on the mean c z were found to be negligible.The mean z component of the force coefficient was found to be 0.27.An over-lay plot of the u-contours have been presented in Figure 10.From the figure an envelope over the cylinder domain was clearly identified.A similar plot for the v-contours of the fluid flow have been presented in Figure 11.A trace of the temporal evolution of the mean lift coefficient with time has been presented in Figure 12.It is evident from the figure that the coefficient of lift with time reaches a periodic steady state by the time instant t=50.Periodic shedding from the bottom of the cylinder were found be developed and shedding was found to be periodic from the lift.

Conclusion
We provide two distinct procedures for the solutions of incompressible Navier-Stokes equations with stabilized formulations.The first method describes the development of an LU based preconditioner for solving spectral discretizations of Navier-Stokes equations.The second method involved development of an Inexact Newton-Krylov methods for solving incompressible flow problems.Spectral accuracy to known analytical solutions were verified.While inexact newton-krylov methods are expected to perform better with low order implementations of the problems solved, the element by element preconditioner developed is better suited for spectral implementations where higher order implementations provide the most suitable framework.We utilize the algorithms developed to solve an array of problems in CFD.Performance metrics for both algorithms were demonstrated with a scalability analysis performed on a range of processor counts varying from 12 through 160 processors.We provide a computational framework for solving problems which involve higher order discretizations in CFD computations.
consider a two-dimensional, steady flow in Ω = [−0.5,1.5] × [−0.5, 1.5].We use the Kovasznay et al. ?solution to the problem.Kovasznay provided the analytical solution to the Navier-Stokes equations in two dimensions.The equations were linearised by introducing the stream function.Following the introduction of the stream function and its exponential functional representation, the quadratic terms in the Navier-Stokes equations vanish.Further in the derivation, it was

Figure 8
presents the iso-contour plots of the w velocity component at the same instant of time.The periodic fluctuating component of the lift for the three dimensional computations have been presented in Figure9.From the fluctuating lift coefficient one derives the Strouhal number of S t = 0.169.A two dimensional component of the simulation was obtained after prescribing the top and bottom walls of the domain to have free stream velocity of u = 1, and a no penetration velocity component of v = 0.The 2-D domain was considered of size[14 × 12]  with the center of the cylinder located at[5, 6].The value of the Strouhal number obtained from the two-dimensional simulation was found to be S t = 0.166.The two dimensional simulation validated the development of the full three-dimensional flow field.Partially Immersed Cylinder in 3-D

Table 1 .
Kovaszany Flow Iterations to Convergence

Table 3 .
Newton Krylov Performance Metrics