Projection-based reduced order modeling and data-driven artificial viscosity closures for incompressible fluid flows

Projection-based reduced order models rely on offline-online model decomposition, where the data-based energetic spatial basis is used in the expensive offline stage to obtain equations of reduced states that evolve in time during the inexpensive online stage. The online stage requires a solution method for the dynamic evolution of the coupled system of pressure and velocity states for incompressible fluid flows. The first contribution of this article is to demonstrate the applicability of the incremental pressure correction scheme for the dynamic evolution of pressure and velocity states. The evolution of a large number of these reduced states in the online stage can be expensive. In contrast, the accuracy significantly decreases if only a few reduced states are considered while not accounting for the interactions between unresolved and resolved states. The second contribution of this article is to compare three closure model forms based on global, modal and tensor artificial viscosity approximation to account for these interactions. The unknown model parameters are determined using two calibration techniques: least squares minimization of error in energy approximation and closure term approximation. This article demonstrates that an appropriate selection of solution methods and data-driven artificial viscosity closure models is essential for consistently accurate dynamics forecasting of incompressible fluid flows.


Introduction
Over the past decade, improvements in computational hardware and advancements in physical simulations have pushed the envelope of complexity of scientific applications that can be modeled with adequate accuracy.High-fidelity simulations of many challenging applications demand extensive computational resources and long computational time.The expensive nature of these simulations prohibits their use in the context of multi-query applications, real-time simulation and dynamics forecasting.In such situations, reduced order models (ROMs) are an attractive alternative as they can potentially simulate engineering systems at a lower computational overhead without a significant loss in accuracy.ROMs often rely on offline-online task decomposition, where the computationally expensive component is ideally performed in the offline stage and typically carried out only once.In contrast, the inexpensive online stage is performed when the model is deployed in the actual application.
Projection-based ROMs [1,2] are the most popular among model reduction approaches.These models rely on high-fidelity data to obtain spatial basis, using modal decomposition methods such as proper orthogonal decomposition (POD) [3,4] or reduced basis [5], and optimally representing the data in terms of reduced states.Using this basis representation in the relevant physical equation and projecting the solution to a lower-dimensional subspace, a set of equations for the evolution of reduced states is obtained.The reduced states are classified as resolved states, which are the most energetic states whose dynamics are predicted, and unresolved states, which are less energetic states whose dynamics are neglected.These equations can be solved in the online stage of ROMs to get the dynamic evolution of the solution.The cost of the online stage can scale highly with the number of resolved states; for example, in the case of Navier-stokes equations, the cost can scale as O(r 3 ) where r is the number of resolved states.A lower number of resolved states must be considered for such applications.This selection comes at the cost of more unresolved states, which negatively influences the dynamics of the resolved states.Several approaches have been considered in ROM literature to account for the interactions between resolved and unresolved states either directly [2,6,7] or indirectly [8,9].Closure models are direct approaches that add additional terms to the dynamic evolution of the resolved states to account for their interaction with unresolved states.A comprehensive summary of several approaches under this category is given in [10].Common indirect approaches include stabilization methods [8,9,[11][12][13]] that typically use Petrov-Galerkin projection, which implicitly adds terms to the dynamic evolution of reduced states to account for the interaction of unresolved states.Some of these Petrov-Galerkin projection methods can also be considered a variational multiscale method [8,14] and categorized as closure models.These Petrov-Galerkin projection ROMs often represent the dynamics of resolved states based on the optimality conditions [11,12].
In this article, we focus on the closure models to account for the influence of unresolved states on the resolved states.These models can be broadly classified into physics-driven and data-driven models.Physics-driven models [2,15,16] typically use physical insight to add terms to equations of the resolved states to account for their interaction with unresolved states.For example, artificial/eddy viscosity model forms [2,7,16,17] are based on the assumption that unresolved states are primarily responsible for dissipating energy from the system.Augmented versions of these models based on modal artificial viscosity enable energy dissipation characteristics to vary for individual resolved states [6,15,16] and provide more accurate dynamics of resolved states [18].Alternate models based on the theory developed in the context of large eddy simulation (LES) [19], variational multiscale method [14,20] and regularization [21] have also been developed.The model parameters in these models are typically obtained using several calibration techniques like manual tuning, statistical averages [16,18], data-assimilation [22][23][24] and least squares minimization [16].With the growing interest in modern machine learning algorithms, data-driven models for closure modeling in ROMs [10,25] and LES [26][27][28][29] are becoming popular.These data-driven models rely on raw data to determine the approximate operators or correction to the operators in the resulting dynamic system.Often, these models assume a model form, such as models with a linear term [30], models with both linear and quadratic terms [25,31] and neural network model forms [32].As the development and deployment of ROMs are typically in a data-rich environment, data-driven closure models can directly leverage the data used for constructing the energetic spatial basis.These models are generally ill-posed and require regularization to work well [25,33].The selection of an appropriate regularization scheme and parameters depends on the application scenario and model selection, thereby adding an expensive model-tuning process to obtain efficient ROMs.
In addition to the closure problem, ROMs of incompressible fluid flows require unique solution methods for coupled continuity and momentum equations in the incompressible Navier-Stokes equations.Most ROMs for incompressible flows neglect the role of pressure gradient in the momentum equations by assuming the discretely divergence free nature of the velocity data.However, in most situations, this assumption is not satisfied.For example, data from commonly used numerical methods, like finite element methods with Lagrange elements, is also not discretely divergence free.Furthermore, obtaining discretely divergence free velocities may not be possible with common experimental measurement techniques.ROMs of shear layers are sensitive to the approximation of pressure gradient; thereby, neglecting this term can reduce ROM accuracy [34].Accurate dynamics prediction of pressure states is also essential for several applications such as fluid-structure interaction [35].Therefore, ROMs that avoid such assumptions and accurately solve coupled equations of velocity and pressure reduced states are essential.In these coupled equations, the role of the pressure states is to enforce the incompressibility condition.Improper solution methods can result in spurious pressure modes or pressure instabilities and several studies have focused on mitigating them [35][36][37][38].
Based on the discussion above, an ideal ROM for incompressible flows needs to have two key features: 1) an accurate and stable solution method for the dynamic evolution of velocity and pressure states and 2) an adequate closure model to account for interactions between unresolved and resolved states while reducing the costly human-in-loop involvement of parameter tuning, such as parameter tuning while adding regularization or manually tuning parameters for physics-driven models.Inspired by these challenges, the main contributions of this article are two-fold: • The incremental pressure correction scheme to solve coupled equations of velocity and pressure states while providing a second-order accurate temporal evolution.
• New artificial viscosity closure models with unknown parameters determined using two data-driven calibration techniques: least squares minimization of error in energy approximation and error in closure term approximation.
The first contribution is to demonstrate the applicability of incremental pressure correction schemes through numerical experiments.This solution method has been commonly used in FOM [39]; however, to the extent of the authors knowledge, this solution method has not been used previously in the context of ROMs.This solution method gives us velocity and pressure states without requiring the pressure and velocity states to satisfy inf-sup condition, thereby avoiding the need for a more expensive supremizer enrichment method [36].The second contribution is to demonstrate the applicability of several artificial viscosity closure models for resolving the closure problem in ROMs.In addition to the two physics-driven model forms discussed above, which are global [2] and modal [15,16] artificial viscosity model forms, we also propose a tensor artificial viscosity model form.This model form is inspired by the concept of tensor eddy viscosity subgrid models for LES [40,41].The unknown parameters for these three model forms are learned using the least squares minimization technique, typically used for data-driven closure models.In this article, we compare six closure models obtained by combining each of these three physics-driven model forms with two data-driven calibration techniques for determining unknown model parameters.These closure models lie between physics-driven and data-driven models and we refer to them as data-driven artificial viscosity closure models.Unlike the data-driven closure models that approximate the quadratic operator [25,31], these closure models did not require additional regularization.Together with the proposed solution method for incompressible flows, we validate and compare the proposed six closure models for forecasting the flow dynamics over a 2-D cylinder at three Reynolds numbers.The outline of this article is given below.Section 2 discusses the incompressible Navier-Stokes equations.Section 3 discusses POD for dimensionality reduction.Section 4 mentions the construction of projection-based ROMs with a POD basis while highlighting the closure problem and several models proposed in this work to overcome it.This section also discusses the incremental pressure correction scheme for solving coupled equations for resolved velocity and pressure states.Section 5 validates the solution method for ROMs and several closure models for an incompressible fluid flow case: flow over a 2-D cylinder.Section 6 concludes this article by summarizing its main contributions and highlighting possible directions for future research.

Incompressible Navier-Stokes Equations
Incompressible Navier-Stokes equations are commonly solved to obtain discrete approximations, denoted by superscript h, of velocity, u h : [0, T ] × Ω → R and pressure, p h : [0, T ] × Ω → R, in an incompressible flow in a domain Ω ⊂ R d of dimensionality d at time T ∈ R + .The approximate velocity and pressure lie in respective function spaces, u h ∈ V h and p h ∈ Q h , which are appropriately chosen to represent the flow physics.These equations consist of the continuity equation which ensures the conservation of mass, and the momentum equation, which ensures the conservation of flow momentum.From here onwards, we will remove superscript h and use u and p to refer to discrete approximations of velocity and pressure, respectively.The solution to incompressible Navier-Stokes equations exhibits a saddle-point structure, which is accompanied by the role of the pressure to enforce the incompressibility constraint.For this scenario, the equation that governs the dynamic evolution of pressure is unavailable, unlike compressible Navier-Stokes equations, where such an equation can be obtained.The saddle point nature of these equations often restricts selecting appropriate function spaces for velocity and pressure.For such saddle point problems, the Ladyzhenskaya-Babuška-Brezzi (LBB) or inf-sup condition [42,43] is a sufficient condition to ensure the uniqueness of the solution.

Proper Orthogonal Decomposition
An effective ROM relies on constructing the basis and determining reduced states that can effectively represent the flow dynamics.POD is one of the most common methods for dimensionality reduction and obtaining an effective spatial basis for the dynamic system [3,4].Velocity and pressure fields are decomposed using POD as, where n is the number of modes/states, a i : [0, T ] → R and b i : [0, T ] → R are the temporally evolving reduced states of velocity and pressure respectively, ϕ i : Ω → R and ψ i : Ω → R are spatial modes of velocity and pressure respectively and u m and p m are the reference velocity and pressure respectively which are taken to be respective mean values in this article.When finite element methods are used to solve PDEs, the spatial modes are unknown and determined by assembling a large system of equations and solving them.Meanwhile, ROMs use high-fidelity simulation or experimental data as spatial modes to reduce the dimensionality of the problem and study the temporal evolution of these reduced states.POD basis optimally represents the energy of incompressible flows: (u, u)/2.Through numerical experiments, we observed that the coupled approach, where velocity and pressure POD basis are computed together [8], did not result in a monotonic decrease in energy error with increased modes.Therefore, this article follows a decoupled approach where velocity and pressure POD basis are computed separately [35].
For generating velocity and pressure POD basis, the velocity data, X u (x), and pressure data, X p (x), at each spatial coordinate location, x ∈ R d , are represented as, where The selection of the POD velocity modes is based on minimizing the approximation error, such that where dx is the norm introduced by the inner product (•, •) and δ ij is the Kronecker delta.Other weighed norms and inner products have been considered in the past [44,45] for better stability properties.However, these are not considered in this article.The velocity POD modes are obtained using the relation, The components of , are obtained by solving the eigenproblem where the ij th component of the velocity correlation matrix, R u , is given as Similarly, the selection of POD pressure modes is based on minimizing the approximation error, The pressure POD modes are obtained using the relation, where components of , are obtained by solving the eigenproblem, The ij th component of the pressure correlation matrix, R p , is determined using, The orthonormality of velocity and pressure modes ensures that the reduced states at time t j are obtained using, The temporal evolution of reduced states governs the dynamics of the solution.Depending on the availability of computational resources, it may be computationally unaffordable to determine the dynamics of all n reduced states.Therefore, these reduced states are further decomposed as resolved (denoted by subscript c) and unresolved states (denoted by subscript f ).Keeping this in mind, POD for velocity can be rewritten as, where The reduced states a i for i = 1, 2 where The reduced states b i for i = 1, 2, • • •, r are the resolved pressure states and b i for i = r + 1, r + 2, • • •, n are the unresolved pressure states.Even though different number of velocity and pressure states can be used, we use the same number of states in this article.

Reduced order modeling
As the introduction suggests, ROMs are critical for multi-query applications, such as optimization and uncertainty quantification, real-time control, and dynamics forecasting.ROMs have two main components: representation of the high-dimensional solution in terms of reduced states and dynamical evolution of these reduced states.In Section 3, we explored using POD to obtain the set of energy-dominant spatial bases and corresponding reduced states.This section focuses on the evolution of these reduced states for dynamics forecasting.Recently, there has been a growing trend towards fully data-based ROMs involving machine learning to represent the reduced states and evolve them in time [10].Unfortunately, these fully data-based ROMs often have limited applicability for dynamics forecasting problems, where the goal is to predict future states that are not part of the high-fidelity training data used to learn these models.This article restricts our discussion to traditional ROMs involving physical equations to evolve the reduced states.These latter ROMs are better suited for forecasting problems where the model is evaluated for future time instances that do not inform the generation of the energy-dominant basis.

Projection-based reduced order modeling and closure problem
Projection-based ROMs are a model reduction approach where high-fidelity data is used to generate the spatial modes and the temporal evolution of reduced states is governed by physical equations.For incompressible flows, the reduced states can be obtained by substituting POD of velocity (Eq.( 16)) and pressure (Eq.( 18)) in Eq. (1) and Eq. ( 2) to obtain and Using Galerkin projection [2], the continuity equation is projected onto a linear subspace spanned by The resulting continuity equation is where D c k (a) = n i=r+1 a i ψ k , ∇ • ϕ i is the closure term for continuity equation.On the other hand, the momentum equation is projected onto a linear subspace spanned by where and is the closure term for the momentum equation.This approach is often referred to as the variational multiscale (VMS) method for ROMs and D m k can be thought of as the closure term that accounts for coarse and fine-scale interactions defined in VMS methods [14,17].This article neglects the effect of unresolved pressure states in the closure term.With this approximation, the resulting closure modeling becomes The incremental pressure correction scheme can implicitly account for the effect of the unresolved pressure states.For practical near real-time prediction, it is desirable to have r << n, especially for non-linear equations where the cost of the non-linear terms can scale with a high exponent.For example, the cost of the online stage for the incompressible Navier-Stokes equations is O(r 3 ), which is very expensive for large values of r.Unfortunately, this also implies that a cost-effective approach involves a relatively higher number of unresolved and fewer resolved states.As the dynamic evolution of unresolved states is not computed, the influence of these unresolved states on the resolved states are neglected.Therefore, if r << n, the accuracy of the dynamics prediction is affected, especially for strongly advective flows where the decay of Kolmogorov n-width is slow [46].If the influence of unresolved states is ignored by setting k and D m k become zero and the equations, Eq. ( 22) and Eq. ( 23) reduce to We refer to this as the Galerkin projection-based ROMs without closure modeling.Suppose the influence of unresolved states on the dynamics of resolved states is significant.In that case, the accuracy and stability of the predicted resolved states can be severely affected in these ROMs.Therefore, accurate models that approximate D m k without incurring a significant evaluation cost are essential.For incompressible flows, D c k may not severely affect the dynamics of a i and b i and is set to zero in this article.

Closure models for ROMs
As discussed in the introduction, several closure models have been proposed over the years that account for the interaction of the unresolved states with the resolved states.These models can be classified as physics-driven or data-driven.Physics-driven models involve analysis of equations and physical insights to determine a model form and calibrate the model parameters [2,6,7,18].These model forms typically have fewer parameters to ease the process of calibrating them.Most common physics-driven models involve the assumption of a model form as a linear term in Eq. (23), that is where Lki is the unknown closure term.Three model forms can approximate Lki .The first model form, which we refer to as the global artificial viscosity, approximates Lki as where ν t is the unknown global artificial viscosity.The second model form, which we refer to as the modal artificial viscosity, approximates Lki as where ν t k is the unknown modal artificial viscosity.These two model forms are common in the literature [2,15,18].In addition to these model forms, we propose a third model form, which we refer to as the tensor artificial viscosity, that approximates Lki as where ν t ik is the ik th component of the unknown tensor artificial viscosity.Several strategies have been used to determine the unknown parameters in Eq. ( 33) and Eq. ( 34).The mean energy balance was used in [18] to obtain ν t .On the other hand, the mean of modal energy was used in [18,34] to obtain ν t k .Data-assimilation techniques were used in [22,23] to obtain the relevant unknown closure term parameters.Recently, data-driven models that use high-fidelity simulation data and machine learning techniques have been proposed.These models approximate D m k using linear [30] or quadratic operators [25,31] instead of assuming a physical model form like artificial viscosity-based closure models.As projection-based ROMs are data-intensive and rely on data to obtain the energy-dominant spatial modes, substantial data needed for determining unknown parameters in closure models is readily available.
Data-driven models [25,30,31] often used least squares minimization to obtain the unknown parameters in the model form.However, there is limited work on utilizing this calibration technique for determining the unknown parameters in physics-driven closure models [16].Even in [16], only the modal artificial viscosity was determined using least squares minimization, but not used as a closure model in ROMs.Therefore, one of the main contributions of this article is to provide a systematic feasibility study on using least squares minimization for learning unknown parameters in three artificial viscosity closure model forms.The least squares minimization for obtaining unknown parameters can be applied in two ways.The first and more common calibration technique is to approximate the closure term, Dm k in Eq. (29), as accurately as possible.We refer to this technique as the closure term approximation.This technique is typically followed in the data-driven closure model literature.[25,30,31].
The alternate calibration technique accurately captures the energetic interactions between resolved and unresolved states.We refer to this technique as the energy approximation.The influence of closure terms on the energy equation can be determined by examining the energy equation.This equation can be derived from the Navier-Stokes equations by taking the inner product of Eq. ( 2) with velocity to obtain Substituting POD for velocity and pressure in this equation and simplifying it, we get The energetic interactions of the unresolved states and their contribution towards the energy of all resolved states are accounted for in the last term: Tensor artificial viscosity: Eq. ( 35) of this term, a k D m k , accounts for energetic interactions of unresolved states to the energy of the resolved state a k .These interactions comprise dyadic energetic interactions due to linear terms and triadic energetic interactions due to non-linear terms.Similarly, the energy equation for the k th resolved state is The modal energy equation determines the temporal evolution of energy for each resolved state.Based on these two equations, the energy approximation calibration technique for determining unknown parameters can be further classified as global and modal energy approximation techniques.The global energy approximation aims to ensure that the global energy balance (in Eq. ( 37)) is accurate.Therefore, this technique estimates unknown model parameters such that r i=1 a k Dm k is accurately calculated.On the other hand, the modal energy approximation involves ensuring that the modal energy balance is accurate.Therefore, this technique estimates unknown model parameters such that a k Dm k is accurately calculated.Determining parameters using global energy approximation is appealing for a global artificial viscosity approximation.Meanwhile, determining parameters using modal energy approximation is a better-posed problem for modal and tensor artificial viscosity models.
The two alternate pathways for calibrating closure models, accurate closure term and accurate energy contribution, will yield different model parameters, and it is important to study which calibration technique is better suited.Combining three closure model forms, namely global, modal, and tensor eddy artificial viscosity model forms, with these two least squares minimization-based calibration techniques, namely minimization of error in energy approximation and closure term approximation, results in six possible combinations of datadriven artificial viscosity closure models as shown in Table 1.

GV-GE model
GV-GE model assumes the model form in Eq. (33), where the optimal value of the global artificial viscosity (ν t ) ensures that the contribution from the closure term on global energy, r k=1 a k Dm k (a), is accurately estimated.Mathematically, this model is posed as The same data used for obtaining POD modes is used to obtain projected reduced states, a i (t), at time t = t l where l = 1, 2, • • •, n.Using this data, this problem transforms to where the l th component of the vector z is and the l th component of the vector b is The solution of this regression problem yields the following expression for artificial viscosity, For this model, we obtain a single artificial velocity that best approximates the energy at all sampled timesteps.If the flow is not stationary, the optimal global viscosity may not be the best approximation of the energetic interactions at different time instances.Therefore, this closure model is more suited for stationary flows.Despite the model form appearing very similar to the model in [18], this model differs in the calibration technique to determine the value of ν t .In [18], the mean value of global energy is used to determine ν t , whereas raw data of global energy with least squares minimization is used in this article.

MV-ME model
The MV-ME model assumes the model form in Eq. (34), where the optimal value of the modal artificial viscosity (ν t k ) ensures that the contribution of closure term on modal energy, a k Dm k (a), is accurately estimated.Mathematically, this model is posed as Following the procedure for other models, we use the sampled data.This problem transforms to where the l th component of the vector z is and the l th component of the vector b is The solution of this regression problem yields the following expression for artificial viscosity, for the k th mode.This model can better approximate the dynamic interaction between unresolved and resolved states than the GV-GE model, as it is more expressive.Although the model form appears very similar to the model in [18], this model differs in the calibration technique used to determine the ν t k value.In [18], the mean value of modal energy is used to determine ν t k , whereas least squares minimization with raw data of modal energy is directly used in this article.This model is similar to the one mentioned in [16] where the least squares minimization is used to obtain ν t k .However, the applicability of this closure model for ROMs was not demonstrated in [16].

TV-ME model
The TV-ME model assumes the model form in Eq. ( 35), where the optimal value of the tensor artificial viscosity (ν t ki ) is obtained to ensure that the contribution of closure term on modal energy, a k Dm k (a), is accurately estimated.Mathematically, this model is posed as with ν t ki as the i th element of vector ν t k .Following the procedure for other models, we use the sampled data.This problem transforms to where the li th component of the matrix Z is and the l th component of the vector b is given in Eq. (47).The solution of this regression problem yields the following expression for vector-valued artificial viscosity, for the k th mode.This model can better approximate the dynamic interaction between unresolved and resolved states than GV-GE and MV-ME models as it is more expressive.

GV-C model
The GV-C model assumes the model form in Eq. (33), where the optimal value of the artificial viscosity (ν t ) ensures that the closure term, Dm k , is accurately estimated.Mathematically, this model is posed as where data from all r modes is used to learn the artificial viscosity.Following the procedure for other models, we use the sampled data.This problem transforms to Eq. ( 40), where the o th component of the vector z is and the o th component of the vector b is where o is a unique combination of k and l.The solution to this regression problem, shown in Eq. ( 43), provides the value of ν t .For this model, we identify a single artificial velocity that best approximates the closure term for all the modes at every time step.Therefore, if the flow is not stationary, the optimal global viscosity may not best approximate the closure term at different time instances.

MV-C model
The MV-C model assumes the model form in Eq. ( 34) where the optimal value of the modal artificial viscosity (ν t k ) ensures that the closure term, Dm k , is accurately estimated.Mathematically, this model is posed as Following the procedure for other models, we use the sampled data.This problem transforms to Eq. ( 45), where the l th component of the vector z is and the l th component of the vector b is The solution to this regression problem, shown in Eq. ( 48), provides the value of ν t k .This model better approximates the dynamic interaction between unresolved and resolved states than the GV-C model, as it is more expressive.This model is similar to the one mentioned in [16], although the applicability of the closure model for ROM was not demonstrated in that article.

TV-C model
The TV-C model assumes the model form in Eq. (35), where the optimal value of the artificial viscosity (ν t k ) ensures that the closure term, Dm k , is accurately estimated.Mathematically, this model is posed as with ν t ki as the i th element of vector ν t k .Following the procedure for other models, we use the sampled data.This problem transforms to Eq. ( 50), where the li th component of the matrix Z is and the l th component of the vector b is given in Eq. (58).The solution to this regression problem, shown in Eq. ( 52), provides the value of ν t k .This model better approximates the dynamic interaction between unresolved and resolved states than GV-C and MV-C models as it is more expressive.

Differences compared to other closure modeling techniques
As discussed earlier, several other closure models exist for ROMs.Physics-driven models are better posed as the model form is determined using the physical relationship between variables; for example, global and modal artificial viscosity models have been demonstrated to tackle the closure problem adequately [10,17,18].In contrast to the conventional physicsdriven models like Eq. ( 33) and Eq. ( 34), the tensor artificial viscosity model form in Eq. ( 35) is more expressive and can potentially give better accuracy.On the other hand, data-driven closure models are naturally suited for ROMs due to their data-intensive nature.However, common data-driven closure models that rely on least squares regression for closure modeling require regularization to remedy the ill-conditioning in the model learning problem [25].The selection of the ideal regularization strategies and constants varies with the problem definitions and introduces an additional expensive tuning process.For example, the models proposed in [25,31] have more unknown parameters; thereby, they are more expressive but also suffer from ill-conditioning and require regularization.The linear operator data-driven model in [30] has fewer parameters and does not require regularization for flows considered in this article.Numerical tests not shown in this article for brevity indicated that this linear operator model yielded similar results to the TV-C model and better than models with quadratic operator model form [25,31].The closure models discussed in this article consider several physics-driven model forms discussed above and determine unknown parameters using least squares regression, which is more commonly used for data-driven closure models.

Incremental pressure correction schemes for projection-based reduced order modeling
After including the closure model, the final ROM equations to be solved are, where Dm−model k is determined using one of the proposed closure models.The solution of these equations also exhibits a saddle-point form like those exhibited by incompressible Navier-Stokes equations [36].No equation dictates the evolution of b i , and hence, several alternate solution methods have been proposed to solve these problems [39,47] in the context of FOM.Most studies on ROMs for incompressible Navier-Stokes equations assume the discrete divergence-free nature of the velocity and thereby neglect the pressure gradient term in the momentum equation [35].Unfortunately, most of the practical data sources, both computational and experimental, do not satisfy the discrete divergence-free condition of velocity.Furthermore, the pressure gradient term in the momentum equation significantly influences flow behavior for several fluid flows, such as confined flows [34].In such scenarios, appropriate treatment of pressure-velocity coupling is essential.
Our solution method for these equations is inspired by the pressure-projection method used to solve incompressible Navier-Stokes equations.The most commonly used solution method is the classical pressure correction scheme in [48,49], which uncouples pressure and velocity to avoid the solution of the saddle point problem.However, this method is first order in time and our implementation required a small time step for convergence of results.Therefore, we use incremental pressure correction that is second order in time [50].For incompressible Navier-stokes equations, this method involves two sub-steps, and A similar solution method can be used for Galerkin projection-based ROMs by substituting Eq. (3) in Eq. ( 63), Eq. ( 64) and Eq. ( 65).The resulting momentum equation, Eq. ( 63), is projected onto the linear subspace spanned by the ϕ i for i = 1, 2, • • •, r.We first take the divergence of Eq. ( 64) and then project the resulting equation along with the continuity equation Eq. ( 65) onto the linear subspace spanned by the ψ i for i = 1, 2, • • •, r.The resulting equations are, and where As discussed in Section 2, inf-sup condition restricts the choice of function spaces for pressure and velocity.For example, mixed elements such as Taylor-hood elements [51] are often used to overcome this restriction.On the other hand, if equal order interpolation functions are chosen for velocity and pressure, additional stabilization techniques, such as the pressure-stabilizing Petrov-Galerkin (PSPG) method [47], may be needed.In the context of ROMs, the spatial basis for both pressure and velocity is determined from data, and there is limited literature on identifying adequate function spaces for velocity and pressure POD basis.The literature in this area highlights using stabilization techniques [8,35] and supremizers [36,52].In the context of FOMs, such as those using finite element methods, pressure-projection schemes have been shown to overcome the restriction posed by inf-sup condition on selecting velocity and pressure spaces [53].Therefore, these schemes are also expected to work well for ROMs and this article demonstrates this through numerical simulations.As pressure correction scheme [48,49] is shown to be of similar form to PSPG scheme in [47], pressure correction schemes could implicitly act as a model for unresolved pressure states [54] and thereby improve pressure stability.Detailed numerical analysis of the stability and convergence of this method in the context of ROMs will be considered a future study.
Despite good predictions of a i using the incremental pressure correction scheme for ROMs, the accurate temporal evolution of b i was not observed.This observation could be attributed to the role of pressure to enforce the incompressibility condition.Therefore, the evolution of b i is obtained in the post-processing stage by utilizing the a i and solving the pressure-Poisson equation as also done in [34,55].These equations are obtained by taking the pressure-Poisson equation, substituting Eq. ( 3) in it and then projecting it on a linear subspace formed by Note that the values of b i obtained through this post-processing step are solely used to compute the pressure distribution and not used in the evolution of velocity states.The use of pressure-Poisson for post-processing to obtain pressure is also common when a discretely divergence free condition is used to simulate incompressible flows, as pressure is not computed during the solution stage.On the other hand, other common solution methods for resolved pressure states [35] typically involve the solution of the pressure-Poisson equation during the solution step.

Numerical experiments
This section aims to validate the performance of the ROMs with proposed closure models and compare the results to ROMs without any closure model.First, we describe the details of the FOM simulation setup for the validation case: 2-D flow over a cylinder [56].This flow has been frequently considered for validating ROM solution methods and closure models [25,31].Second, we give details for the construction of the operators and time evolution of reduced states.Third, we compare the performance of proposed closure models at three Reynolds numbers: Re = 200, Re = 500 and Re = 1, 000.

Full order model details
FOM simulations use a finite element code developed using the DOLFINx library [57].The finite element code closely follows openly available tutorials in [58,59].The commonly used Taylor-Hood elements (P2-Q1) are chosen as the spatial discretization for the FOM simulation.The problem can be posed as: and the outlet boundary condition is The simulation setup is similar to the well-known benchmark [56].Unlike the benchmark, which uses a sinusoidally varying velocity, the flow considered in this article has a constant inflow velocity.These equations are solved using the Crank-Nicholson and semi-implicit Adams-Bashforth time integration scheme with the above boundary conditions.The mesh for Re = 200 was composed of quadrilateral elements formed using approximately 14, 000 nodes.Meanwhile, the Re = 500 and 1, 000 meshes comprised around 22, 000 nodes.This mesh resolution possibly underresolves the flow at the given Reynolds numbers.However, this mesh works well to demonstrate the applicability of the closure model and ROM solution method for incompressible fluid flows.The FOM simulation is started from a zero velocity and pressure initial condition.The simulation is integrated in time until T f = 10s with a timestep of 1/1, 600.For Re = 500 and 1, 000, a lower timestep was also tested to ensure that the lower Courant-Friedrichs-Lewy (CFL) number remains similar between simulations of different Reynolds numbers.However, the behavior of FOM simulations and ROMs remained insensitive to this selection.

Reduced order modeling details
The offline step of ROMs involves the computation of operators as these are expensive to compute, for example, the non-linear term of Navier-Stokes equations in our case.ROMs can be categorized as discrete ROMs or continuous ROMs following the approach taken to construct operators in Eq. ( 66), Eq. ( 67) and Eq. ( 68).The key differences between these ROMs are discussed in [13,60].In this article, we work with continuous ROMs as it is mathematically more consistent with FOM than discrete ROMs [13,61].Continuous ROMs leverage interpolation and numerical integration routines used in the corresponding FOM solver, favoring the consistency between ROMs and FOMs.The first step for assembling these ROMs involves interpolating the POD basis, ϕ and ψ, to a mesh with a suitable polynomial degree.The polynomial degree for the spatial discretization can be arbitrarily chosen but could introduce consistency issues if not matched with the spatial discretization in the FOM [61].In this article, a more consistent approach is undertaken by selecting function spaces for ϕ and ψ to be the same as those for u and p, that is, V h and Q h respectively.Additionally, we use the same quadrature rule as FOMs for operator integrals.
The online step of ROMs involves the time integration of the dynamical equations of reduced states.This article uses a second-order implicit time stepping scheme as discussed in Section 4.3.As this scheme is not the same as the time discretization used for the FOM, some inconsistency in time discretization may be introduced.This decision was still made to demonstrate the applicability of pressure correction schemes for the temporal evolution of reduced states in ROMs.The equations Eq. (66), Eq. ( 67) and Eq. ( 68) are solved using SNES non-linear solver from PETSc [62].The time step for ROMs was chosen to be the same as the one for FOMs after a systematic timestep convergence study, which is not included in this article for brevity.The initial transient of the FOM simulation, that is, t ∈ [0, 5]s, is ignored.The data generated by FOM is uniformly sampled in a time interval of T pred = t ∈ [5, 10]s and used for testing the ROMs.The POD basis, equation operators and closure models are obtained using uniformly sampled data in t ∈ [5,6]s.Data at every 20 th timestep is selected in this time interval, resulting in 80 data points.Other time intervals and data selection frequencies were also considered and yielded similar conclusions.The goal of these computational tests is to demonstrate the ability of ROM for forecasting problems.For all Reynolds numbers, we are interested in answering the question: Do ROMs obtained using data from t ∈ [5,6]s give good predictions at a future time: t ∈ [5, 10]s (renamed as t ∈ [0, 5]s)?

Cylinder flow at Re = 200
We compare the energy (E), drag coefficient (C D ), and lift coefficient (C L ) for different ROMs with corresponding values of FOM.These are defined as: and where ∂Ω S is the surface of the cylinder and u ts (= u • (n y , −n x )) is the tangential component of the velocity with n x and n y as the components of the normal to the cylinder surface.In particular, we compare error in energy (e E ), error in C D (e C L ) and error in C L (e C L ) which are defined as, Note that errors in C L and C D have a mean bias introduced due to the difference in pressure computation between ROMs and FOM simulations.ROM determines pressure by solving pressure-Poisson equation Eq. (70), whereas no such equations are solved in FOM simulations.
We first demonstrate the performance of the ROM without using a closure model.The evolution of energy, C L and C D for ROMs without a closure model for different numbers of modes is shown in Figure 2.For all selection of modes, we observe an increase in the error with an increase in time.Using eight modes for the ROM gives better energy prediction than using 16 modes.This result implies that increasing the number of modes for representing the solution does not improve the errors in energy approximation in time.On the other hand, results for C D and C L give a more consistent picture as the error for eight modes is much higher than the error for 16 modes.The error in these aerodynamic coefficients reduces significantly with an increase in the number of modes.These numerical results demonstrate the applicability of an incremental pressure correction scheme for treating pressure-velocity coupling in the context of incompressible Navier-Stokes equations.
These results indicate that ROMs would require many modes to achieve accurate predictions without using any closure model.As the online cost of ROMs scales as r 3 , where r is the number of modes, using many modes to achieve more accurate results becomes computationally expensive and is not ideal.These observations and drawbacks of Galerkin ROMs are well-known in the literature.This discussion motivates the need for closure models that ensure higher accuracy without increasing the online cost of ROMs.
The streamwise velocity component and pressure fields at t = 5s are shown in Figure 3 and Figure 4 respectively.The pressure distribution for several ROMs tested in this study is similar to those for FOM.These observations highlight the robustness of pressure when GV-GE and GV-C closure models are used despite the slightly lower error in energy approximation.These results indicate that ROM without a closure model is insufficient for dynamics prediction as the error rapidly increases outside the interval for which the data is used to obtain the energetic spatial basis.However, this drawback of Galerkin ROMs can be rectified by selecting an appropriate closure model.As the error does not grow significantly in time when MV-ME, MV-C, TV-ME and TV-C closure models are used, Galerkin ROM with these closure models appear to be well suited for time dynamics prediction for this Reynolds number.A more complete comparison of the model requires a comparison of these errors for other selections of the number of modes.Some efficient metrics for comparing the model performance of the different number of modes are the relative integrated error in energy (η E ), relative integrated error in C L (η C L ) and relative integrated error in C D (η C D ) which are defined as follows: and The variation of the integrated error in energy, C L and C D with the number of modes for several closure models is shown in Figure 8.A non-monotonic change in the integrated error is observed with increased modes for ROMs with and without closure models.The results indicate a higher error for ROMs with the GV-C closure model for several modes than when ROMs without a closure model are used, indicating that this closure model does not perform well.For a lower number of modes, ROMs with GV-GE, MV-ME and MV-C closure models

Cylinder flow at Re = 500
The temporal evolution of error in energy for four and six mode ROMs with different closure models is shown in Figure 9.As the error in energy for the four mode ROM without a closure model is surprisingly low, ROMs with GV-GE, MV-ME, GV-C and MV-C closure models give comparatively worse errors.This lower energy error for the four mode ROM without a closure model is an anomaly, as we will observe by assessing other mode ROMs and looking at different quantities of interest.For the four mode scenario, the ROM with the TV-ME closure model gives results similar to the ROM without a closure model.In contrast, the ROM with the TV-C closure model gives lower errors in energy than ROMs with other models and without a closure model.For the case of six mode ROMs, the ROM without a closure model exhibits a high error that rapidly rises outside the data generation window.ROMs with GV-GE and GV-C closure models display lower errors than those without a closure model.However, this error increases rapidly outside the data generation window.ROMs with MV-ME and MV-C closure models exhibit very low and similar errors.Lastly,   ROMs with TV-ME and TV-C closure models yield consistently low errors for all modes; therefore, these are the preferred closure models.

Discussion
The following conclusions can be drawn from the results obtained from the flow over a 2-D cylinder at three Reynolds numbers: • The incremental pressure correction scheme provides accurate estimates of the several quantities of interest for different mode selections and gives significantly lower errors when coupled with an appropriate closure model.
• A constant artificial viscosity model form may not be expressive enough to resolve the closure problem and provide significantly lower errors than Galerkin ROMs without a closure model.
• The parameters of modal and tensor artificial viscosity closure models obtained using the two calibration techniques, based on energy and closure term approximation, do not lead to an appreciable difference in the closure model performance.
• Both modal and tensor artificial viscosity model forms are sufficient for a higher number of modes to account for the interaction of unresolved states on resolved states.However, at a lower number of modes, ROMs with tensor artificial viscosity models consistently yield better results than ROMs with modal artificial viscosity closure models.
These observations inform decision-making for selecting the appropriate solution method and closure models for ROMs for incompressible flows.The inapplicability of global artificial viscosity closure models is demonstrated through results indicating that GV-GE and GV-C closure models do not consistently yield lower errors than ROMs without a closure model.This conclusion resonates with the observation in [18], where a modal artificial viscosity was used to overcome this drawback.A comparison of the two calibration techniques indicated that energy transfer between unresolved and resolved states is sufficient to account for the non-linear dynamics, even for the highest Reynolds numbers under consideration.This result echoes the observation in [16] where the mean modal artificial viscosities computed using two calibration techniques are similar.Lastly, the modal artificial viscosity closure form is insufficient to capture the dynamics between the more energetic resolved and unresolved states, which appear to be better handled by a tensor artificial viscosity closure form.These conclusions guide the selection of appropriate model forms for even higher Reynolds number flows where the closure problem becomes even more prominent and ROMs with more modes become less favorable due to higher evaluation costs during the online stage.

Conclusions
ROMs are essential for multi-query applications, real-time control and dynamics forecasting.These models must achieve high accuracy while exhibiting a lower online computation cost.The cost of the online stage scales as r 3 where r is the number of resolved states for incompressible fluid flows.Therefore, it is desirable to use a lower number of resolved states.However, with a decrease in the number of resolved states, the need for accounting for the interaction between unresolved and resolved states becomes more important.The first contribution of this article is to propose closure models for the interaction between the unresolved and resolved states.We consider six different closure models determined using three model forms, based on global, modal and tensor artificial viscosity, where the unknown coefficients are determined by two calibration techniques: least squares minimization of error in energy approximation and closure term approximation.These closure models can be considered hybrid data-physics models as physical arguments inspire the model form, but the model parameters are directly learned through raw data.The proposed models did not require regularization for the cases considered in this study, unlike other common data-driven closure models [25,31].The flow over a 2D cylinder is considered at three Reynolds numbers and errors in energy, C L and C D are compared to validate these closure models.The results indicate that modal and tensor artificial viscosity model forms yield more accurate ROMs than those without any closure model.ROMs with tensor artificial viscosity model form deliver even more consistent results as the results did not deteriorate even for four mode ROMs, indicating that this model form is the most consistent amongst those considered in this article.Lastly, the two calibration techniques used for determining the unknown parameters in the closure model forms, least squares minimization for energy error or closure term error, provide closure models with slight differences in results.This observation indicates that both calibration techniques are equally well suited for determining modal and tensor artificial viscosity closure model parameters.
ROMs for incompressible fluid flows exhibit coupling between the reduced pressure states and reduced velocity states, where the primary role of the reduced pressure states is to enforce the incompressibility constraint.The unique structure of these equations results in a saddlepoint problem that requires special solution techniques to obtain the dynamic evolution of velocity and pressure states.The second contribution of this article is to demonstrate the applicability of a solution technique: incremental pressure correction for projection-based ROMs.This method is adapted from the incremental pressure correction scheme readily used in full order modeling of incompressible fluid flows [50].The ROMs obtained through this solution method give low errors for incompressible flows under consideration, thereby demonstrating the applicability of this method for ROM of incompressible fluid flows.
As more complex and challenging fluid flow applications that require ROMs emerge, the solution method and the closure models proposed in this article will require further validation to ensure the accuracy remains.Furthermore, a detailed numerical analysis of the stability and convergence of the incremental pressure correction scheme in the context of ROMs will be performed.Alternative solution methods for saddle-point problems, such as velocity correction schemes, exist in the full order modeling literature.The applicability of schemes for ROMs must be validated.The closure models considered in this article are linear as they depend linearly on reduced states.These models could not adequately account for the interaction between resolved and unresolved states when less than four modes were used for ROMs.Non-linear model forms have the potential to be more accurate for such a lower number of modes and will be explored in future work.Despite the potential to be more accurate than linear models, these models would involve a higher online stage cost.Therefore, a comprehensive study is required to investigate the trade-off between cost and accuracy.As part of future work, we also plan to explore novel applications of ROMs in material transport regulation of neurite networks [63][64][65][66] and heat exchanger design for additive manufacturing

Figure 1 :
Figure 1: Domain for the flow over a 2-D cylinder.
) subject to boundary conditions u| ∂Ω D = g and p| ∂Ω N = h, (73) where ∂Ω D is the Dirichlet boundary and ∂Ω N is the Neumann boundary.The Dirichlet boundary for the 2-D flow over a cylinder that encompasses of wall boundary (Ω w ), inlet boundary (Ω in ), and outlet boundary (Ω ou ), that is ∂Ω D = ∂Ω w ∂Ω in ∂Ω ou ).The domain and locations of the boundaries are shown in Figure 1.The wall boundary condition is u| ∂Ωw = 0, (74) the inlet boundary condition is u| ∂Ω in = 4U 0 y(0.41 − y) 0.41 2 , 0 , where U 0 = 1.5 m/s,

Figure 2 :Figure 3 :Figure 4 :Figure 5 :Figure 6 :
Figure 2: Error in (a) energy, (b) C L and (c) C D for 2-D cylinder flow at Re = 200 for Galerkin ROM without closure model.The shaded area is the time interval from which data is extracted to determine the POD basis, equation operators and closure models.

Figure 7 :
Figure 7: Error in C D (a) four modes and (b) six modes for 2-D cylinder flow at Re = 200.The shaded area is the time interval from which data is extracted to determine the POD basis, equation operators and closure models.

Figure 8 :
Figure 8: Integrated error in (a) energy, (b) C L and (c) C D for 2-D cylinder flow at Re = 200.

Figure 10 :Figure 11 :
Figure 10: Error in C L for (a) four modes and (b) six modes for 2-D cylinder flow at Re = 500.The shaded area is the time interval from which data is extracted to determine the POD basis, equation operators and closure models.

Figure 12 :Figure 14 :Figure 15 :
Figure 12: Integrated error in (a) energy, (b) C L and (c) C D for 2-D cylinder flow at Re = 500.

Figure 16 :
Figure 16: Integrated error in (a) energy, (b) C L and (c) C D for 2-D cylinder flow at Re = 1, 000.
, • • •, r are the resolved velocity states and a i for i = c (x, t) + p f (x, t),

Table 1 :
List of the six closure models considered in this article.