AN EQUATION- FREE, REDUCED-ORDER MODELING APPROACH TO TROPICAL PACIFIC SIMULATION

The “equation-free” (EF) method is often used in complex, multi-scale problems. In such cases it is necessary to know the closed form of the required evolution equations about macroscopic variables within some applied fields. Conceptually such equations exist, however, they are not available in closed form. The EF method can bypass this difficulty. This method can obtain macroscopic information by implementing models at a microscopic level. Given an initial macroscopic variable, through lifting we can obtain the associated microscopic variable, which may be evolved using Direct Numerical Simulations (DNS) and by restriction, we can obtain the necessary macroscopic information and the projective integration to obtain the desired quantities. In this paper we apply the EF POD-assisted method to the reduced modeling of a large-scale upper ocean circulation in the tropical Pacific domain. Compared with the POD method, the method provided more accurate results and it did not require the availability of any explicit equations or the right-hand-side (RHS) of the evolution equation.


Introduction
The Proper orthogonal decomposition (POD) is a methodology that constitutes a very efficient way to perform reduced modeling by identifying the few most energetic modes in a sequence of snapshots from a time-dependent system, and subsequently providing a means of obtaining a low-dimensional description of the system's dynamics [1].The method of snapshots which is first proposed in [2] for flow system is a very effective and easy to carry out approach for obtaining POD basis sets.The proper orthogonal decomposition (POD) was originally introduced by Karhunen in 1946 (see [3]) and Loève in 1945 (see [4]), and the method has been extensively used in research in recent years and successfully applied to a variety of fields, such as in conjunction with experimental e.g., [5,6,7,8]) and with numerical studies(e.g., [9,10,11,12]) in thermal convection, shear layers, cavity flows and external flows, to mention just a few.In recent years, our understanding of the tropical ocean has increased.There is a vast and growing literature on the design of the ocean model based on partial differential equations (PDE) for physical systems.Such modes are often hard to solve because of the high order system that describe the state.Another obvious application of POD in weather forecasting and operational oceanography is the four-dimensional variational (4DVAR) data assimilation problem.However, a major difficulty in use of 4D-Var for realistic general circulation models is the dimension of the control space, generally equal to the size of the model state variable and typically of order .Current ways to obtain feasible implementations of 4D-VAR consist mainly of the incremental method (see [13]), check-pointing (see [14]) and parallelization.However, each of these three methods has their obvious shortcomings.The dimension of the control space remains very large in realistic applications for the incremental method (see [15,16]).The size of assimilation studies is imposed a limitation for memory storage requirements, even on the largest computers.Check-pointing strategies (see [17]) have been developed to address the explosive growth in both on-line computer memory and remote storage requirements of computing the gradient by the forward/adjoint technique, which characterizes large-scale assimilation studies.POD provides a potential technique that can dramatically reduce computation and memory burdens of 4D-VAR.Cao et al (see [18]) made an initial effort to explore the feasibility of application of POD to 4D-VAR.If someone was to apply the POD method to an ocean model, the feasibility and efficiency of the POD technique in ocean calculations are already demonstrated (e.g., [19]).In the aforementioned works, we can obtain low-dimensional system dynamics systems directly from the Galerkin projection of the governing equations on the POD modes.However, it is well known that the reduced systems resulting from truncated Galerkin projections may result in spurious asymptotic states (e.g., [20]), and it is difficult to obtain the explicit form of the right-hand-side (RHS) which consists of POD coefficients of the evolution equation.Thus, the Equation-free POD-assisted method is first used to resolve the question above in incompressible flows (see [21]).It is well know that the tropic Pacific Ocean model is more complicated than the molecular dynamics model, and it is very difficult to obtain accurate information due to the lack of direct measurements and insufficient knowledge of air-sea exchange processes.Since Ocean forecasting is very important to human activity, the topic we are investigating is more complicated and significant than the two-dimensional flow past a circular cylinder.Similarly, the Equation-free 7 10 10 method also can be used to resolve problems related to weather and land numerical simulation.
In this paper, we apply the equation-free POD-assisted method to the simulation of the upper tropical Pacific Ocean model based on POD model.In this context, the role of the "detailed, microscopic" model is played by the numerical simulation (DNS) simulator, and the "unavailable coarse-grained" equations are the evolution equations about the coefficients of the projection of the full DNS solution on master POD modes.We often obtain these equations by performing a Galerkin projection of the original equations on the POD basis; to truncate this Galerkin projection would give an approximation of these equations.The equation-free method tries to resolve this model without deriving it in closed form.We use full DNS simulator to estimate the right-hand-side of this Galerkin model, and the equation-free acceleration technique we will illustrate is projective integration.In this paper, we make use of an Equation-free POD-assisted method, which use "equation-free" [22,23,24] projective integration [25,26,27].The equation-free procedure is devised for the efficient computational study of complex, multi-scale problems.The basic process is as follows (two level) [21] (see Figure 1): (a) to devise and implement short-time numerical experiments with "the best available " microscopic model, and subsequently (b) to estimate quantities (derivative) required in numerical computation of the (unavailable) macroscopic equations for the coarse-grained system behavior [28,29] by using the numerical results of such microscopic computations.Thus, we can estimate the closures that are required to obtain explicit macroscopic equations on demand; we can perform numerical analysis tasks by running the microscopic simulation directly.This framework has been applied to many types of problems, such as bifurcation analysis of complex systems and homogenization of random media [22, 23, 25, 28, 30, and 31].The paper is organized as follows: In section 2 the upper tropical Pacific Ocean model is described.
The POD technique and its mathematical properties and equation-free POD model are presented in section 3.In section 4, the numerical calculations using equation-free POD in the context of simulating the upper layer thickness and currents in this ocean model, and a comparison with DNS and POD are discussed.In section 5 we discuss the convergence and error of the method.We summarize the results in section 6.The model is a reduced-gravity, linear transport model, consisting of two layers above the thermocline with the same constant density (Figure 2).It is assumed that below the thermocline, the ocean is of a higher density, which is sufficiently deep so that its velocity vanishes and there is not density difference across the base of the surface layer, that is, we regard the surface layer as part of the upper layer.The equations for the depth-averaged currents are  (Stricherz et al.1992).By a linear interpolation, the data are projected onto each time step and into each grid point.In Table 1, the values of the numerical parameters used in the model integration are listed.It takes about 20 years for the model to reach a periodic constant seasonal cycle; at that time, it has successfully captured the main seasonal variability of dynamical fields.The currents and the upper layer thickness of the 21-st year are saved for the process.The model is discretized on the Arakawa C-grid, and all the model boundaries are closed.At these solid boundaries, we apply the no-normal flow and no-slip conditions.The time integration uses a leapfrog scheme, with a forward scheme every 10th time step to eliminate the computational mode.Every integration day a mass-compensation is carried out.

A Simple Introduction to Proper Orthogonal Decomposition
In order to illuminate the idea of the proper orthogonal decomposition method, we will introduce the POD method in continuous case.Because the main idea for both continuous and discrete cases is same, we carry out the numerical experiments in the discrete case.

Let (
) denote the set of observations (also called snapshots) of some physical process taken at position . The average of the ensemble of snapshots is given by ) ( 1 We form new ensemble by focusing on deviations from the mean as follows: We wish to find an optimal compressed description of the sequence of data (3.2).One description of the process is a series expansion in terms of a set of basis functions.Intuitively, the basis functions should in some sense be representative of the members of the ensemble.Such a coordinate system, which possesses several optimality properties (to be discussed in the sequel), is provided by the Karhunen-Loève expansion (see [35]), where the basis functions are, in fact, admixtures of the snapshots and are given by ) are to be determined such that where the coefficients given by (3.3) will most resemble the ensemble { . More specifically, POD seeks a function

L
It follows that (see, [36]) the basis functions are the eigenfunctions of the integral equation where the kernel is given by (3.7) Substituting (3.3) into (3.6)yields the following eigenvalue problem: ) , ( 1 where , is a symmetric and nonnegative matrix.Thus, we see that our problem amounts to solving for the eigenvectors of a is the size of the ensemble of snapshots.A straightforward calculation (see also [36]) shows that the cost functional . By denoting a(t)={ ( (see [21]), we define a restriction operator and a lifting operator such that Where the explicit form of right-hand-side (RHS) terms are unknown.In the normal truncated POD-Galerkin procedure, the RHS terms are derived form the equation (2.1a)-(2.1c)by a Galerkin approach, resulting in a (possibly large) set of coupled ODEs.In this section, in order to perform the numerical experiments, we consider the discrete Karhunen-Loève expansion to find an optimal representation of the ensemble of snapshots (e.g., [1], [10], [19], and [37]).

The Equation-free POD Model
Equation-free method is usually used to resolve different scale problems between the macroscopic level and microscopic level.The whole method is often made of two levels, that is, "inner" simulator and "outer" simulator.The "inner" simulator is micro direct numerical simulation, and the "outer" simulator consists of many types of continuous mathematic methods at macro level, such as finite difference, finite element, finite volume element, optimization.How to link between macro scale and micro scale is the key problem.Here, we often have some steps as described below.
Full DNS is used to estimate the right hand side (RHS) of these Galerkin ODEs on demand in our approach, and we can accelerate their numerical integration, without approximating them in closed form.In this paper, the equation-free implementation of a numerical task is illustrated: numerical integration, which is called "projective integration" in an equation-free context.We will provide a detailed description for every step of the method.Equation-free single step projective integration consists of the following main components, from 1 The large (coarse) step will be usually chosen to be where .
The global time step is .Figure 2 presents a graphical illustration of the notation.
In our case, the "inner" simulator is the fully resolved DNS discretization for tropic Pacific model.The "outer" (coarse) model is the unavailable in closed form Galerkin sets of ODEs that are made up of coefficients of the evolution equations based on the first few low-POD modes.Specially, the process of POD-assisted projective integration is made up of the following steps (see Figure 2): Given a a( ), Micro scale computation: Resolve the equations (2.1a)-(2.1c)by DNS for a short period of time, for .We can compute this by appropriate difference scheme with a small time step .

a(
).The time step here is , where .
5. Return to step 1 until the final integration time is reached.

Remark:
As discussed above, we can obtained the right-hand-side Y( t ;a( t )) of equations (3.12) from the equations (2.1a)-(2.1c)by a Galerkin projection.This procedure may result in rather intricate forms and often suffers long-term dynamics, (see [20,32]).This "equation-free" approach can avoid these difficulties by using "just enough" full DNS simulation due to its not requiring the explicit form of the RHS of (3.12).After having already performed a short DNS run, the procedure is as follows: ), = .In our case, Once the RHS of (3.12) is estimated numerically, one can integrate it through Euler difference scheme.
1, , , a a a ( ) Therefore, after the steps above, we obtain what we want, that is,

Numerical Results
In this section, we present numerical results of the equation-free POD model for the upper tropical Pacific Ocean model.Here we select the three sets of different snapshots for discussing the effect of the number of snapshots and basis functions, that is n=20, n=36, and n=60.For all of them, the first seven POD modes can capture about 99% energy.It can also be clearly seen, that for the upper layer thickness h , the same modes capture the most energy, followed next by and last by .Thus, different POD modes may be used to reconstruct respectively.To quantify the performance of the reduced basis method, we use two metrics, namely the root mean square error (RMSE) and the correlation of the difference between the full order and the reduced order simulation.We can obtain it by first taking twelve-month's full order results and the corresponding twelve-month's reduced order results.We made a comparison between POD, equation-free POD and full DNS simulation by RMSE.The formulation of computing the error is as follows: where M is the number of nodal points, the index denotes the month, is the full order approximation and is the reduced order approximation.The average RMS error is defined as: And the correlation is defined as: where and is the average of full order approximation and reduced order approximation respectively.The RMS error and the correlation for the other model variables and u are computed similarly.Table 2 presents the average RMSE in reduced order approximations using different modes namely n = 20, n = 36, and n = 60 snapshots.Note that from these simulations, with the span of the reduced basis space increasing, the RMSE decreases as long as the same number snapshots are used.Compared with POD, equation-free POD results in a bit of improvement, especially for v by capturing 60% energy.We also find that the RMSE continues to decreases with more energy being captured in equation-free POD than in POD.The correlation for twelve months is displayed in Table 3.Clearly, when increasing the POD mode, the correlation also increases for the same number of snapshots.We find that the results from equation-free POD are almost the same as those of POD.(Figure 3)-(Figure 5) present some comparisons made (among DNS, POD and equation-free POD ) under the different scenarios.

Convergence and Error
In the problem we investigate, we also need to pay attention to the issues of convergence and error.Before we consider the convergence, we first prove the consistency of the EF method.Let equations (2.1a)-(2.1c)be rewritten in vector form.The equation is as follows: ( , ) ( ; ( , )) where is a vector, which consists of ( ), and is also a vector function.Let , denote the large-scale time step and short time step respectively, ρ is the maximum mesh interval, and we apply a numerical scheme that is consistent and stable.Let ( , ) H t x be the numerical solution of , is the numerical solution of the EF method.A selected scheme is as follows, ( , ) Since the scheme is consistent and stable, we have following requirement at a given moment ) In the case of the limit of 0, , and . By virtue of property ( ) Having finished illustrating the consistency from a theoretic angle, the convergence of the EF method may be obtained at a given moment

N T T =
, through (5.3),(5.5) Similarly, = , using ( Where and is the true solution of original problem.The convergence of the method is thus proved.

( , ) t p x
The relative error is used to highlight the convergence and error in numerical experiments.

H H H − , H H
The relative error can be expressed by , where are the EF method solution and full DNS solution, respectively, and ||•|| denotes the Euclidian-norm.The results of analysis of the numerical experiments are following.
We discuss the factors that affect the results of convergence and error, including time step and the number of iterations.In our case, due to the different magnitude, we choose the relative error, and we think that it can reflect the convergence truly.We select the case of n=36, and we first discuss the difference between large-scale time step and short time step results.One factor is number of iterations during the course of the DNS running in a large-scale time step, and we also select several sets of experiments.We give the error fold line figure when we select the number of iterations to be 2, 10, 20, 40, 100, respectively, see figure 6.We find that the results are much improved when we increase the number of iterations, and the error is much smaller.Similarly, we decided to choose the number of iterations 20 for the sake of saving computational time.The other factor is the time step, which in the case of short time step is t Δ =100 seconds, We select the large-scale time step size as =6 hours, 1 day, 5 days and 10 days respectively.We can illustrate that the method is convergent by providing a numerical example, see figure 7.In addition to the RMS error, we compute the relative error between reduced solution and full solution for , and we attained a relative error of below T Δ h 3 10 − of the magnitude of the average height , that is about 150 meters.The results indicate that the smaller the large time step is, the smaller the error for , with a reverse effect for u .Generally speaking we conclude that the result is better when a large-scale time step of =1 day is used, in particular when we consider computation time savings.The relative error is presented to show the convergence of the method, see figure 8. Here, we also consider another issue related to scheme of time derivative estimation.In our case, we use the Forward Euler Projection to discretize the time derivative.The polynomial extrapolation approximation is another way to estimate the time derivative.In our case, we find that the Forward Euler Projection yields the same result as obtained with polynomial extrapolation when we use 1-th order polynomial.However, the result is unbelievably improved when we h , h v T Δ choose the 2-th order polynomial, and this result may be related to our specific problem.

Summary
We applied the equation-free POD method to reduced modeling of a large-scale upper ocean circulation in the tropic Pacific domain.Three sets of snapshots are chosen to analyze the difference between POD and equation-free POD, and we found that some results of the equationfree POD were the same as those of POD, while other were slightly better than those of POD.By numerical experimentation, we analyzed the factors that affected the results, and some improvements were obtained in resolving the POD related problems.Generally speaking, the number of snapshots was found to have little influence on results beyond a certain threshold number.Certainly, increasing the number of snapshots is good for us.Here, we often choose the number of basis functions that capture 99% energy.We found that it was not always the case that the larger the number of basis functions we choose, the better the results obtained.For a given number of snapshots and basis functions, we also discussed the effect induced by a large-scale time step and by the number of iterations.The results indicated that the ratio between large time step and short time step should be confined to a certain range, here, the result is acceptable when the large time is equal to 1 day.We also think it appropriate to take the number of iterations to be equal to 20 for saving computation time, though a larger number of iterations is beneficial for our problem.The convergence and error between approximate solution and full solution were also discussed, and we provided an illustration of convergence and error by calculating relative error and RMS error.
From the results, we can find that the magnitude of the error is below 3 10 − .However the most obvious advantage of using the equation-fee POD is that we can solve problems for which the corresponding equations are unavailable or without closed form.Although the results of using equation-free POD are not obviously better (they are the same as POD in cases) than those of POD, we are still satisfied with them for our applying the new method to the tropic Pacific Ocean simulation due to its enormous decrease in computation work compared with DNS.For instance, the computational cost of the equation-free POD-assisted method is less than 10% that required by DNS.In our case, we think it important that the lifting step-given the accurate initial values of macro variables (here, low-POD coefficients) be implemented.Projective integration is one of the traditional continuum numerical procedures implemented in an equation-free frame.Certainly, we must further investigate the issues that emerge in the process of using equation-free POD method, for instance, the computational cost is not less than that of POD.The same as figure 3, but for the case of 36 snapshots.
The same as figure 3, but for the case of 60snapshots.
a) steps of micro level (short time step) computation (at time step where and ).

( b )
Restriction to macro variables and estimation of the time-derivatives of their evolution; One step of macro level projective integration with step size (d) Lifting from the projected values of the macro variables to consistent micro level initial conditions.

3 . 4 .
Restriction: compute the POD coefficients a(Projective integration: Integrate equation (3.12) to by standard ODE techniques to obtain

Figure 2 Figure 3
Figure 2Sketch of POD-assisted projective integration

Figure 6 Figure 7 Figure 8
Figure 6 Root mean square error of the equation-free POD as a function of the number of iterations.

Table 1
The values of the model parameters used in the model

Table 2
RMSE of , and a comparison between POD and equation-free POD for different t percentages of captured energy, here, snapshots is n=20, 36, 60

Table 3
Correlation as to 20 snapshots, 36 snapshots, and 60 snapshots for upper layer thickness (unit: m), zonal current velocity v (unit: m/s) a comparison between POD and equation-free POD in energy percentage of 99%