Implicit solution of the material transport in Stokes flow simulation: Toward thermal convection simulation surrounded by free surface

We present implicit time integration schemes suitable for modeling free surface Stokes flow dynamics with marker in cell (MIC) based spatial discretization. Our target is for example thermal convection surrounded by deformable surface boundaries to simulate the long term planetary formation process. The numerical system becomes stiff when the dynamical balancing time scale for the increasing/decreasing load by surface deformation is very short compared with the time scale associated with thermal convection. Any explicit time integration scheme will require very small time steps; otherwise, serious numerical oscillation (spurious solutions) will occur. The implicit time integration scheme possesses a wider stability region than the explicit method; therefore, it is suitable for stiff problems. To investigate an efficient solution method for the stiff Stokes flow system, we apply first (backward Euler (BE)) and second order (trapezoidal method (TR) and trapezoidal rule—backward difference formula (TR-BDF2)) accurate implicit methods for the MIC solution scheme. The introduction of implicit time integration schemes results in nonlinear systems of equations. We utilize a Jacobian free Newton Krylov (JFNK) based Newton framework to solve the resulting nonlinear equations. In this work we also investigate two efficient implicit solution strategies to reduce the computational cost when solving stiff nonlinear systems. The two methods differ in how the advective term in the material transport evolution equation is treated. We refer to the method that employs Lagrangian update as ‘‘fully implicit’’ (Imp), whilst the method that employs Eulerian update is referred to as ‘‘semi-implicit’’ (SImp). Using a finite difference (FD)method,we have performed a series of numerical experimentswhich clarify the accuracy of solutions and trade-off between the computational cost associated with the nonlinear solver and time step size. In comparison with the general explicit Euler method, the second order accurate Imp methods reduce total computational cost successfully through the utilization of a large time step without sacrificing accuracy and stability. Moreover, the proposed SImp method is effective in reducing the computational cost associated with evaluating the nonlinear residual while obtaining a solution similar to the Imp method. © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Developing numerical schemes to model the dynamics of the systems described by Stokes flow coupled with material transport and a free surface boundary condition is a significant challenge in computational geodynamics (i.e. [1][2][3][4][5]). Simulating such systems over million-year time scales enables numerical investigation of the interaction between the surface geometry (topography) and interior dynamics of planets, which is of fundamental importance * Corresponding author. for understanding processes such as mountain building, subduction initiation and planetary core formation [2,6,7].
Although numerous free surface implementations have been proposed previously [4,[8][9][10][11], almost all employ explicit time integration schemes. Such schemes are conditionally stable, and thus an inappropriate choice of the computation time step will result in spurious behavior, which typically manifests itself as an oscillation at the free surface interface [3,12,13]. The state of pressure within the Earth is dominated by a large background hydrostatic pressure. Assuming an average rock density to be 3300 kg/m 3  the dynamic pressure associated with the presence of a negatively buoyant mantle plume will be insignificant compared to the hydrostatic pressure. However, as the plume rises, the dynamic pressure associated with the upward traveling plume will become comparable in magnitude to the hydrostatic pressure. Near the surface of the Earth, the negatively buoyant material will generate significant dynamic pressure and result in uplift of the crust. Physically, pressure changes and isostatic relaxation of the surface recover this ''out of balanced'' situation and return the pressure field to a predominately hydrostatic state. However the time scale required to resolve these relaxation processes is much smaller than the time scale associated with the physical processes of interest (e.g. thermal convection). By utilizing an explicit time integrator, one requires a small time step to capture the relaxation of the surface. As a result, the computational cost for stable time integration when free surface dynamics are incorporated is far higher than that when a free non-deformable surface is adopted.
Several approaches for stabilization, or implicit treatment, have been recently proposed to avoid free surface oscillations in the context of finite element and finite difference methods [3,12,13]. In this work, we propose to treat advection as a coordinate nonlinearity coupled to the momentum equation, thereby defining a fully implicit time integration scheme. In the geodynamics community, nonlinear solvers have been applied to processes involving material non linearities (i.e. power law creep or plastic rheologies). However, the application of a nonlinear solver for implicit material advection to define implicit time integrators for stable free surface evolution has not been examined.
In this paper, we apply several implicit time integration schemes based on the ordinary differential equation (ODE) stability theory [14] for solving the sticky-air free surface problem of Stokes flow [2,15]. These implicit methods are implemented within the FD framework combined with the MIC method with nonlinear residuals derived from the formulation of the material transport. Resulting nonlinear equations are solved iteratively by the Newton-based nonlinear solver. In addition, we propose two types of the advection of the material properties for the nonlinear solver. One is a full implicit method that uses the MIC throughout the solution process. The second is a semi-implicit method that uses an Eulerian advection scheme for the nonlinear residual evaluation. We examine the solution quality and efficiency of these methods by performing numerical experiments that simulate (i) viscous relaxation of a sinusoidal topographic high and (ii) thermal evolution in a radial gravity field coupled with a free surface. Based on our numerical experiments, we discuss the different stability characteristics and accuracies and analyze the trade-offs between time step size and computational efficiency in an attempt to determine optimal strategies to the solution of time-dependent viscous flow calculations.

Basic Stokes flow system
We begin with a purely mechanical problem of Stokes equations to explain our solution procedures. The equations for an incompressible variable viscosity Stokes fluid in a domain Ω are given as where ρ and η denote the density and viscosity as function of the material composition φ. The pressure, velocity vector and gravity vector are denoted via p, u and g respectively. Along the boundary of Ω, denoted via ∂Ω, we impose the ''free-slip'' boundary condition given by where n, t denote the outward pointing unit normal vector and unit tangent vector to ∂Ω. The evolution of the material composition φ, expressed in Lagrangian frame of reference is given by Eqs. (4) and (5) represent the fluid element defined at the position x advect phase φ (i.e. Dφ Dt = 0 in the Eulerian frame).

Time integration
In this work, we used several explicit and implicit time integration schemes for an efficient solution of the numerically stiff system following the general ODE theory [14]. Assuming that the velocity u is uniquely obtained from the φ and x, we can regard Eqs. (4) and (5) as a system of ODEs. The complete set of ODEs given by Eqs. (4) and (5) represents a stiff system with free surface deformation. The applied time integration method to improve the stability and accuracy for such problems are described below.

Explicit method (Exp)
It is common practice within geodynamics simulations of the Stokes flow, to perform a splitting between the equations describing the motion of the fluid (Eqs. (1), (2)), and those defining the transport of material phase (Eqs. (4), (5)) Such a splitting gives rise to the explicit Euler time-stepping method (Exp). The procedure to update the position x is given by where (·) n is the value at nth time step. The Exp method is the most standard method used to solve the Stokes flow system because it is easy to implement and computationally inexpensive. However the Euler method is first order accurate in time and the explicit time step may require small t to avoid the spurious oscillations. We note that for the system of equations we consider in this work, there is no formal stability criterion defining a t which will yield a stable time-integration scheme.

One-step implicit methods (BE, TR)
For a stable time step integration that permits a large time step size to be used, an implicit time integration scheme is required [16]. We first introduce one-step methods, by which x n+1 is obtained directory by x n without any intermediate steps. One of the classical and useful methods is the backward Euler method (BE), which can be expressed by This method is first order accurate and L-stable, which is unconditionally stable and suppresses unphysical numerical oscillation against the large time step integration. Another frequently used one-step implicit method is the trapezoidal method (TR) [13,16], which can be described by This averaging scheme is second order accurate; however it is not L-stable. Thus, the stability region without significant spurious oscillation is limited although better than the Exp method.

Two-step implicit method (TR-BDF2)
As a high order accurate and L-stable method, we implement a two-step method based on the trapezoidal and backward differentiation formula method (TR-BDF2) [14] with the two-step procedures outlined below: 2nd step: x n+1 The first stage is the normal TR method applied over t/2 for obtaining the intermediate state at n + 1 2 . Then the backward differentiation method is applied at the second stage of Eq. (10) for the solution at n+1. The TR-BDF2 method is second order accurate and L-stable; however, double calculation cost against the one-step methods is required due to the two-step procedures.

Spatial discretization
Eqs. (1) and (2) are discretized in space using a standard staggered grid finite difference (FD) scheme defined on a rectangular mesh consisting of N × M control volumes. A linear Stokes flow problem with density and viscosity variations discretized by the FD method (e.g. [15,17,18]) can be obtained in matrix form as where X = (u g , p c ) T is the discrete velocity and pressure vector. The notation (·) g and (·) c denote vector components on the faces of the cell, and scalars on the cell center, respectively. K and G are the discretized matrix form for the gradient of deviatoric stress and gradient of pressure.
Following [15,19], we discretize the composition field φ using Lagrangian markers, thereby allowing us to track the evolution of material properties (e.g. η and ρ). We will denote a value defined on the marker via (·) m . Accordingly, Eqs. (4) and (5) become where u m is the velocity at position x m . The velocity u m is obtained by linearly interpolating values u g from the FD grid. This interpolation from the FD grid to the marker positions is denoted via Material properties defined on the markers (e.g. η m , ρ m ) are projected onto the centroid of each control volume using the following weighted-averaging scheme where the notations (·) k c and (·) p m are for the kth cell and pth marker from the complete discrete set (·) c and (·) m , respectively, Ω p is the quadrilateral (2D) or hexahedral (3D) region with size x centered at marker position x m p , Ω k is the kth cell region of the staggered grid and w k p (x m p ) is the fractional area (or volume) of Ω p ∩ Ω k .
The variables associated with the remapping procedure are shown in Fig. 1. Note that ρ g in Eq. (11) is obtained from ρ c by linear interpolation. This type of MIC scheme is commonly used in computational geodynamics (e.g. [15,19,20]), because it allows numerically non diffusive advection without numerical oscillation arising in general high order Eulerian advection scheme. In addition, with this type of MIC scheme, it is easy to implement a physical evolution model coupled to the Stokes flow dynamics. However, the computational cost is expensive compared with alternative full Eulerian methods.
In this work, we are interested in processes which involve a free surface boundary condition, e.g.
where Γ fs (u, t) denotes the location of the time-dependent free surface of the fluid domain (Ω f (u, t)). For a free-surface geometry which is not orthogonal to the coordinate system, the standard staggered grid discretization cannot be used to discrete the fluid domain Ω f (u, t). This short coming is alleviated via approximating the free-surface boundary condition using an approach routinely used in the geodynamics community known as the ''sticky-air'' technique [2]. The sticky-air approach is defined in the following way: 1. Given a fluid domain Ω f (u, t), we define free-surface boundaries Γ fs (u, t) and rest of the boundaries as Γ v (u, t) = ∂Ω f (u, t) \ Γ fs (u, t) (see Fig. 2). 2. Define a computational domain Ω with faces orthogonal to the 2D or 3D coordinate system such that Ω f (u, t) ⊂ Ω, and which possesses a boundary defined by ∂Ω = Γ v (u, t) ∪ Γ w (u, t). 3. The ''sticky-air'' region (Ω sa (u, t)) is defined as the subdomain enclosed by Γ w (u, t) ∪ Γ fs (u, t). The material inside Ω sa (u, t) defines a new composition which is represented using the phase φ of Lagrangian markers. The sticky-air material is defined as a fluid with low viscosity and zero density which is evolved according to the governing equations given by Eqs. (1), (2), (4), (5).
Provided sticky-air viscosity is small relative to the viscosity inside the fluid domain, the viscous shear stress across the free-surface interface is small and the boundary condition in Eq. (16) is closely approximated. The validity and accuracy of this approach has been carefully examined in [2].

Implicit solution for nonlinear problem
An implicit time integration method to solve the Stokes flow system coupled with the material transport (Eqs. (1), (2), (4), (5)) requires the solution of a nonlinear problem.

Definition of the nonlinear residual
Here, we define our nonlinear residuals for the implicit time integration schemes. To evaluate the nonlinear residual, we first update the material properties η c and ρ c for given trial velocity and . The nonlinear Stokes residual for a converged solution with MIC method is then evaluated for each time integration schemes.

Updating material property (Imp and SImp)
The nonlinearity for the implicit material transport comes from the updating material properties with X ′ . We have examined two types of calculation method for η c and ρ c . The difference between them in the residual evaluation gives fully implicit (Imp) and semiimplicit (SImp) method.
The Imp method evaluates the nonlinearity of η m and ρ m directly from the marker coordinates. Therefore, it is a fully general implicit method of the MIC scheme. The trial marker position x ′ m for a given velocity u ′ g is calculated by the each time integration schemes of Section 3 via Eq. (14). Then, we evaluate the material properties η c and ρ c from the markers at x ′ m via Eq. (15). The advection and remapping operations of markers should be performed every time nonlinearity is updated. Note that these operations are computationally expensive. Thus, the costs to evaluate the nonlinear residual with the MIC scheme is likely to dominate the total cost of the nonlinear solver when using the Imp method.
In the SImp method, the marker positions do not change when evaluating the nonlinear residual. Instead, an advection scheme on the Eulerian mesh is used to transport the material implicitly and thus update η c and ρ c . The position of each marker is only updated after obtaining the converged solution of the nonlinear problem. With the Eulerian advection method, we examined classical first and fifth order upwind FD methods (Appendix A). In this work, to avoid the time-step size limits by Courant-Friedrichs-Lewy (CFL) condition, the FD advection scheme is applied with the sub time steps, by which quantity is iteratively advected with the substep size t s to reach the t = N s t s . The number of substeps N s is chosen for t s so as to not exceed 0.8 x/|u max |.
The accuracy and stability obtained by this method can differ from the Imp method, because the profiles transported by the gridbased method are inconsistent with those defined by the markers. Therefore we consider this method as a semi-implicit time integration method. When using the Simp, we can expect reduced computational cost compared with Imp because the Eulerian advection does not require the expensive MIC remapping procedure at each nonlinear residual evaluation.

Stokes residual for implicit time-stepping method
The updated η c and ρ c using X ′ generate the nonlinear gridbased Stokes residuals, which are different for the implicit time integration schemes. For the BE method of Eq. (7), each marker is updated with Thus, the discrete nonlinear residual can be expressed as follows: In the same manner, the TR method of Eq. (8) updates marker position using x ′ The nonlinear residual for the TR method is given by For the two-step TR-BDF2 method of Eqs. (9) and (10) . At the second step, since the markers are updated using The algorithm for the nonlinear residual of each time stepping scheme (i.e. BE, TR and TR-BDF2) with updating the material property by Imp or SImp method is shown in Algorithm 1.
If Imp is used then: Elseif SImp is used then: Line 5.
Update η c and ρ c by upwind Eulerian advection schemes with u ′ g Line 6.
End if:

Nonlinear solver
The nonlinear Stokes problem are solved by the Newton-based solver in a globalized JFNK framework [21]. To define the nonlinear solver, for clarity we will refer to the nonlinear residual as F. Noting that depending on the choice of time integration scheme used, F will be defined via Eq. (17) The Newton correction without forming the Jacobian can be expressed aŝ Eq. (20) for the Newton direction δX ′ to update X ′ . The simple line search approach [22] is applied for the globalization to calculate the step size s which is used in The Newton iteration is deemed to be converged when the nonlinear residual satisfies the following: where ∥F 0 ∥ 2 is the 2-norm of the initial residual (i.e. the problem dependent). The algorithm for the globalized Newton-based scheme of MIC method is shown in Algorithm 2.
While not converged: Line 4.
Compute step size s Line 6. Update: In the JFNK approach, the matrix free product of operatorĴ is given bŷ where the parameter ϵ = √ [23]. The linear solve of Eq. (20) and Line 4 of the Algorithm 2 is performed using FGMRES [24], which is restarted every five iterations. Krylov iterations are terminated when the 2-norm of the initial residual is reduced by a factor of 10 2 . Although the choices of parameters in the matrix free product and Krylov subspace method affect the performance of our nonlinear solver, such side effects are limited and do not change our performance analysis drastically. We use the operator A[η c ] defined in Eq. (11) as the preconditioner for the true Jacobian. Note that by replacing the operatorĴ defined by the JFNK framework with A[η c ], we can convert the Newton algorithm into a Picard nonlinear solver, written in defect correction form [21]. The convergence of a globalized Newton-based scheme is highly dependent on the starting vector used. From numerical experimentation, we found that the robustness of time-dependent simulations can be improved by switching betweenĴ and A[η c ] alternately during the nonlinear iterations. Generally, efficient convergence is observed when Picard/JFNK is applied to nonlinear systems with initially large/small residuals, respectively. However the best choice of switching point is highly problem dependent.
Thus we have switched betweenĴ and A[η c ] every five nonlinear iterations. CPU time was used as heuristic to determine the order to be applied between Newton-Picard-Newton · · · (Newton first) and Picard-Newton-Picard · · · (Picard first). Every 20 time steps, we changed the order to the alternate order, i.e. Newton first or Picard first, and compared CPU time required with that required by the previous time step. Here, we assume that the problems of neighboring time steps have similar numerical difficulty. Then, the better ordered method relative to CPU time is utilized for the next 20 time steps.
The linear problem defined in Eq. (11) (used by Exp) and application of the preconditioner used with our JFNK framework are both solved via the sparse direct solver PARDISO from the Intel MKL library [25].

Coupling with energy equation
As an extension to thermal convection of Boussinesq fluids operating in an infinite Prandtl number regime, we consider the coupling of the following simple energy equation in Lagrangian frame to the mechanical system discussed in Section 2: where, for simplicity, κ is the thermal diffusivity taken as a constant for the entire domain. Variations in the density associated with temperature perturbations (Boussinesq approximation) are introduced via The change in the temperature field of Eq. (24) denoted T c is evaluated on the grid cell with T c given by where L is the discretized matrix of the Laplacian by the FD method. To update the temperature on the markers, we calculate where the linear interpolation is used for T m = Π[ T c ], for the one-step methods discussed in Section 3.2. Although the subgrid diffusion procedure on markers for T m is proposed by [15] in computational geodynamics field, we do not apply that for sim-plicity of error analysis in this work. For the two-step methods of Section 3.3, we calculate These are explicit Runge-Kutta methods which are first and second order accurate for Eq. (27) and Eqs. (28), (29), respectively. Different from the solution of the mechanical equations of Eqs. (1), (2), (4), (5), we can assume that this explicit time-stepping scheme will not cause spurious solutions because the diffusion time scale of our target problem is much larger than the dynamical time scale of the Stokes flow.

Numerical experiments
We conducted numerical experiments with two different buoyancy-driven processes to assess the performance of the different time integration techniques introduced. The first experiment focused on the relaxation of large wavelength surface deformation under gravity (''bump relaxation''). This test was used to evaluate the performance of the proposed techniques for regional scale simulations of subduction and mountain building. The second example we studied considers thermal evolution under the influence of a radial gravity field. Here the fluid domain is time-dependent as we assume a free surface boundary condition on the perimeter of the fluid domain. These experiments investigated the coupling between various wavelength topographic anomalies coupled with an energy equation. This problem scenario is of relevance for planetary core formation (e.g. [7,8,26]).
In both experiments, the domain is two dimensional and defined by Ω = 1 × 1. The FD mesh employed to solve the Stokes problem utilized N × N control volumes in the x 1 and x 2 directions respectively. On the boundary of the FD mesh, we applied free slip boundary conditions. The free surface boundary condition approximated by surrounding low viscosity fluid with zero mass density, so-called ''sticky-air'' technique (see Section 4). Each control volume in the FD mesh was populated with markers. Markers were regularly spaced within each control volume with a displacement x/6, where x = 1/N is the size of each control volume. The time step size t is taken as constant.
The presented quantities are reported in their non-dimensional form. We employed the following scales Computational experiments were performed using 16 cores of a Xeon(R) E5-2650 v2 CPU. Fig. 3(a) shows the initial setting of the bump relaxation test consisting of two layers labeled φ = 1 and φ = 2. In this test, we assumed that α = 0 for each material (Eq. (25)), thereby decoupling the energy transport from the momentum equations. Markers located below the cosine curve given by

Bump relaxation test
where h 0 = 0.5, δh = 0.025 and k 0 = 2π are the initial thickness, perturbed amplitude and wavenumber, represent the bottom dense material (φ = 2). Otherwise markers are assumed to represent the sticky-air layer (φ = 1). In the simulation, we observed gradual relaxation of the bump to a flat shape under the gravity (0, −g). We monitored the marker position (0.5, h p (t)) at   the top of the bump (denoted by the star in Fig. 3(a)) to evaluate the topographic change.
The solution of this bump relaxation problem with small stickyair viscosity approaches the exponential decay of topographic anomaly [13], using τ = With this decay time, the analytic top topography is given by To check the convergence property, we first compared the initial velocity at the top of the bumpḣ p (t = 0) for various grid size N and sticky-air viscosity η 1 . Fig. 4 shows the deviations of the numerical solution from the analytical velocityḣ analy (t = 0) with ρ 1 = 0, ρ 2 = 1 and η 2 = 1. From the result, we used the model with η 1 = 10 −3 and N = 128 to analyze each time integration method with numerically converged solution to the analytic decay in the bump test. The material parameters used for each phase are also provided in Table 1. To validate the accuracy of the bump test, we introduced the error integrated in time defined by where t n = n t and n max is the number of time steps required to reach the target time with t end = 50τ (i.e. t n max −1 < t end ≤ t n max ).
We evaluated the error not only for our 2D calculation result but also for the solution of the 1D initial value problem (IVP) of the exponential decay (Appendix B) to observe the consistency with ODE theory.
We denote the solution schemes using the Imp, SImp and 1D IVP, as Imp_tstep, SImp_tstep and 1D_tstep, respectively, for each time-stepping method tstep = BE, TR or TR-BDF2. The explicit method for 1D IVP is also referred to as the 1D_Exp.
The stopping condition for the nonlinear solver of Eq. (22) is given by

Explicit method (Exp): bump test
The errors by the Exp method for several different t are plotted in Fig. 5. Note that our solution agrees with that of 1D_Exp. Following ODE theory, the solution obtained by the Exp demonstrates the first order accuracy for small step size t/τ < 1, deviates from the first order for 1 ≤ t/τ < 2, and diverges at t/τ ≥ 2. The solution by the Exp with small time step ( t/τ = 0.1) shows that the peak topographic height decays via viscous relaxation and then maintains a constant value for the remainder of the experiment. With t/τ = 2, the peak topography initially collapses far faster than the analytic solution. It then oscillates strongly to achieve balance but ultimately fails to reach the steady state.

Implicit method (Imp): bump test
Next, we examined the solution using the Imp methods. Our implicit solvers based on a nonlinear iterative method have two control parameters, i.e. nonlinear relative residual tolerance (denoted eitr in Eq. (22)) and time step t. The smaller eitr or t are the higher the accuracy of the solution; however, the calculation is more expensive. We first examined the quality of solutions as a function of eitr and different t values. Fig. 5 shows the error using the Imp_BE with different eiter values. For all examined t, eiter = 10 −3 was sufficiently small to  agree with the 1D_BE solution and converged to the solution with eiter = 10 −4 . Therefore, we use eiter = 10 −3 as the reference tolerance of the nonlinear solver for the implicit methods. Compared with the Exp method with t/τ > 1, the BE formulation allows the use of larger time steps without compromising accuracy due to the oscillatory behavior. On the other hand, the accuracy of the Imp_BE is the first order as same as the Exp method. For a higher order convergence method, we examined second order TR and TR-BDF2 schemes. The errors obtained by the second order Imp methods are plotted in Fig. 7. We can confirm numerically that the error curves for both integration methods are second order accurate and fitted to the 1D IVP solutions. The second order methods have a clear advantage in that they can achieve a certain solution accuracy at a larger time step size than that required by the first order Exp or BE methods.
The Imp_TR-BDF2 shows better convergence than the Imp_TR at the same step size. However note that the two step Imp_TR-BDF2 requires almost twice the calculation process than the Imp_TR. Thus, the advantage of using Imp_TR-BDF2 instead of Imp_TR is the oscillatory behavior of the solution at a large time step rather than the better accuracy for t/τ < 2. With large t, the solution obtained using the TR scheme deviates significantly from the analytic solution with numerical oscillation. On the other hand, the TR-BDF2 is L-stable, same as BE, thus such deviation is saturated for larger time step size over t/τ ≥ 4 (note that the BE works much better at these large time step). The difference between Imp_TR and Imp_TR-BDF2 is also evident in the marker position shown in Fig. 6. The solution obtained using Imp_TR-BDF2 shows that the oscillation equilibrates rapidly to steady state.
From these observations, we selected the Imp_TR-BDF2 as the standard implicit time integration scheme to obtain efficient second order convergence and avoid drastic degradation of solution accuracy at a large time step. In the reminder of this paper, we focus on the performance of the implicit solution methods with the TR-BDF2 time-stepping scheme.

Semi-implicit method (SImp): bump test
We examine the SImp method solved by the TR-BDF2 scheme (SImp_TR-BDF2) with nonlinear tolerance eiter = 10 −3 . In the bump test, we show the result for the first order upwind FD method because we could not find the significant difference between first and fifth order FD methods. The observed errors using SImp_TR-BDF2 are shown in Fig. 7 (error using SImp_BE and SImp_TR are also plotted for reference). The results obtained using SImp_TR-BDF2 agree well with the Imp_TR-BDF2 solutions, indicating that FD advection captures the nonlinearity by the marker update successfully.

Computational performance comparison: bump test
Here, we compare the computational efficiency of the Exp and implicit TR-BDF2 methods. The computational cost is a function of the number of time step, and the CPU time required to perform each time step. CPU time per time step can be further decomposed as follows: (number of iterations of nonlinear solver) × (cost per nonlinear iteration).
In Fig. 8, we compare the CPU time required by different solution methods to reach t end = 50τ . Using time step t/τ = 0.1, the ranking of CPU time is as follows: time(Exp) < time(SImp_TR-BDF2) < time(Imp_TR-BDF2). The high computational cost for the nonlinear residual evaluation of the Imp method results in the greatest CPU time. The SImp_TR-BDF2 is the second fastest owing to a less expensive nonlinear residual evaluation. Per time step, the Exp method is the least expensive but use of a large time step with the Exp method is not possible due to the bounded stability region. On the other hand, the implicit solver is more expensive than the explicit method per time step. However, this additional cost comes with an advantage, i.e. the implicit method's time integrator can restrict the selection of the time step only on the basis of an acceptable temporal error. As for the error, the second order accuracy of the TR-BDF2 method achieves the same solution quality with larger time step size than that using the first order Exp method. In the bump test, the solutions at t/τ = 0.1 by the Exp method and that at t/τ = 1 by the implicit TR-BDF2 methods are similar in terms of δh p ∼ 10 −4 . Although total elapsed CPU time depends on the cost of the nonlinear iteration, the use of Imp_TR-BDF2 with t/τ = 1 results in a reduction of approximately 70% CPU time compared with the Exp method at t/τ = 0.1, as shown in Fig. 8. Further reduction of the cost for the implicit method is obtained by the SImp_TR-BDF2 using a cheaper nonlinear solver than that associated with the Imp, as the nonlinear residual is evaluated using an Eulerian FD method, as opposed to the more computationally expensive marker-mesh remapping procedure by Eq. (15).
For t/τ > 2, the Imp_TR-BDF2 fails to reduce CPU time with increasing step size and the difference between the cost of Imp_TR-BDF2 and SImp_TR-BDF2 increases. This observation is derived from the fact that the convergence of the nonlinear problem becomes increasingly more difficult as t increases; therefore, the number of nonlinear iterations which indicates the cost difference for Imp_TR-BDF2 and SImp_TR-BDF2, increases. This additional trade-off also indicates that the use of a large t does not always guarantee reduction in CPU time, even though time integration is stable for large t.

Free surface thermal evolution test
Here we discuss the numerical experiments of a thermally driven convecting fluid, surrounded by a low viscosity fluid. The initial material geometry with three layers is illustrated in Fig. 3(b).
We assume that a radial gravity acceleration is given by g i = −gr cen wherer cen is the unit vector of r cen = (x 1 −0.5, x 2 −0.5). Although a self-gravitating force is commonly employed for g i in the planetary problems [7,8,26], we simply use constant acceleration to focus on solution quality and solver performance.
The inner most layer (φ = 3) consists of a high density and low viscosity fluid. The temperature of φ = 3 is fixed at T = 1. This region mimics the planet's core, which allows us to approximate a stress free and hot thermal boundary condition of the middle layer.
The middle layer (φ = 2) represents a simplified mantle layer. We are primarily interested in studying the dynamics of the middle layer. The initial temperature of the middle layer is given by where the second term on the right hand side is the initial perturbation with cos(θ ) = x/r cen and wavenumber k 0 = 6π [28]. The outer most layer (φ = 1) is the sticky-air layer, which has zero density, zero temperature T = 0 and a very low viscosity in order to numerically approximate the free surface condition of the middle layer.
To evaluate the deviation from the hydrostatic balance, the stopping condition for the nonlinear solver of Eq. (22) at the nth time step is defined with Once we began the simulation, upwelling plumes rose from the inner layer to the outer layer as shown in Fig. 9(a). At the early stage of calculation, the growth of the initial perturbation dominated the upwelling plumes, but various wavelength instabilities evolved to create a complicated convection mode under the employed Re with the current of times. Thus the symmetry of the temperature profile was found to be slightly broken at t = 4 × 10 5 as shown in Fig. 9(b-d).
To analyze each time-stepping method, we evaluated the error in the following form: Here we use τ discussed in Section 7.1 as the characteristic decay time of surface deformation to represent the time step size t. In the thermal test, we employed the TR-BDF2 method as the implicit time integration scheme. Accordingly, we denote Imp_TR-BDF2 and SImp_TR-BDF2 as Imp and SImp, respectively, in the following subsections.

Explicit method (Exp): thermal test
First, we solve the problem using the explicit method. Fig. 10 shows behavior is smaller than that for the bump test t/τ ≤ 1.0 because this problem deals with shorter wavelength than the bump test.

Implicit method (Imp): thermal test
First we determined an appropriate implicit solution parameter from δu ave at t = 2.0 × 10 4 calculated by the Imp for several different values of eitr and t as shown in Fig. 10.
The deviation of the solutions converges with small eiter. The  implicit solution with smaller t is found to require smaller tolerance for convergence. This means that the suitable tolerance eitr for efficient computational performance should be distinct for different t. The observed relationship suggests that to avoid the use of eitr that is too small at large t, the proper solution parameters in this test are (eitr = 10 −2 for t/τ = 8), (eitr = 10 −3 for t/τ = 4), (eitr = 10 −3 for t/τ = 2), (eitr = 10 −4 for t/τ = 1) and (eitr = 10 −5 for t/τ = 0.4 and 0.2). In the following thermal tests, these tolerances were used for the nonlinear solver of the implicit solution methods. The converged solutions by the Imp with these eiter values are replotted in Fig. 11, which shows close to the second order accuracy. At a similar level of accuracy, the Imp results in a larger time step than the Exp method. In addition, unlike the Exp method, the Imp scheme does not largely degrade the quality of solution for To see the impact of the error contribution of δu ave in global scale phenomena, the temperature field obtained using the Imp with t/τ = 4 at t = 4.0 × 10 5 is illustrated in Fig. 9(c). In Fig. 11, the δu ave obtained by the Imp with t/τ = 4 is greater than that obtained by Exp with t/τ = 0.2 at t = 2.0 × 10 4 .
However, their thermal fields are visually consistent (see Fig. 9(b) for reference) because the transient surface relaxation process of small scale bumps does not play an important role for global scale phenomena. This suggests that the acceptable temporal error to capture thermal evolution surrounded by a free surface can be greater than the upper limit handled by the Exp method with t/τ = 0.2.

Semi-implicit method (SImp): thermal test
In this test, two types of Euler advection methods were examined to evaluate the nonlinear residual (see Section 5.1.1 and Appendix A). Here, we denote the SImp with the first and fifth order upwind methods as SImp_1st and SImp_5th, respectively. Fig. 11 shows the numerical convergence errors at t = 2.0 × 10 4 obtained by the Simp methods. The SImp_1st solution is found to deviate from that obtained by the Imp because thermal instabilities growing at the high wavenumber range are difficult to resolve using the first order FD advection scheme. The FD advection schemes inherently show dispersion and dissipation errors to transport Fourier components at a high wavenumber range (e.g. [29,30]), although, essentially such errors do not appear with the marker advection of the Imp. The deviation between the Imp and SImp_1st is not found in the bump test because the bump test only exhibits the growth of initial low wavelength mode. Since higher order FD methods are less diffusive and thus can capture a wider wavenumber range, the fifth order upwind method is expected to reduce such dispersion errors. In Fig. 11, the solution of the SImp_5th agrees well with that of the Imp at t = 2.0 × 10 4 . The improvement achieved using the fifth order FD method, compared with the first order method, is clearly evident. From these results, resolving higher wave modes by a lower diffusive advection scheme is found to improve the Simp solution quality. However, when time is increased to t = 5.0×10 4 , the SImp_5th solution begins to deviate from the Imp solution as shown in Fig. 12, although SImp_5th shows better agreement with the Imp than SImp_1st. This is due to the growth of instability at much higher wavenumbers, which cannot well-resolved by the fifth order method. We can expect further improvement by using a more sophisticated lower diffusive advection method (e.g. [30][31][32]). Fig. 9(d) shows the temperature profile at t = 4 × 10 5 for the SImp_5th with t/τ = 4. This structure and the structure using the Imp as shown in Fig. 9(c), are approximately the same. The fine scale structures differ slightly due to the numerical diffusivity in the temporal Eulerian advection of the SImp_5th. However the difference for the whole convective structures is negligible, and the numerical convergence against t and the higher order advection scheme are observed in the numerical error analysis. These results justify the use of the SImp_5th to obtain the similar solution qualities of the Imp.

Computational performance comparison: thermal test
We compared the CPU times required to reach time t = 4 ×10 5 for different time integration schemes, i.e. Exp, Imp and SImp_5th in Fig. 13. The Exp shows the best cost performance without numerical solution oscillations at time step t/τ = 0.2. With this time step size, the Exp is the fastest among the examined time integration methods.
The quality of solutions for the implicit methods with t/τ = 1 were similar to that using the Exp with t/τ = 0.1 in Fig. 12. When we compare the CPU time at t/τ = 1, the SImp_5th shows the best performance due to reduction in CPU time associated with the calculation cost of the nonlinear solver. The SImp_5th spends approximately half of the CPU time required for the Exp at t/τ = 0.1, although the Imp requires the approximately the same CPU time due to the cost increase incurred by the nonlinear solver.
With these solution parameters, the SImp_5th is found to reduce approximately 65% CPU time compared with the Exp. On the other hand, it is still difficult for the Imp to outperform the Exp method. Fig. 14 shows the detailed cost balance of each simulation for the Exp method with t/τ = 0.2, the Imp with t/τ = 4 and the SImp with t/τ = 4. Since the Exp method requires many more time steps, overall most CPU time is spent in the linear solver. In contrast, the Imp shows smaller cost for the linear solver; however, the nonlinear residual evaluation employing the expensive remapping procedure dominates the execution time.
We argue that the trade-off between implicit and explicit solver will become more apparent as the mesh spacing decreases. The SImp shows the reduced cost not only from using large t, but also from the inexpensive nonlinear residual evaluation.

Conclusions
We have examined several types of implicit schemes of the MIC method, which were applied to Stokes flow problems employing time-dependent material properties and an approximate free surface. From numerical experiments, we observed the following.
1. It is difficult to solve stiff problems with the sticky-air free surface deformation using the explicit time integration scheme because the available time step is bound by the decay time of the transient free surface relaxation. Our implicit material transport techniques with wide stability region can handle large t over such short decay time without numerical oscillation of the solution. 2. In the numerical experiments, the observed accuracy and stability of the implicit time integration methods are consistent with those obtained by ODE theory (i.e. first or second order, with or without L-stability). The TR-BDF2 method is attractive in terms of second order accuracy and is L-stable for solving the stiff free surface problems. For a target solution accuracy, the second order implicit methods can use larger t than the first order explicit method. 3. There is a complicated, problem dependent trade-off between the increasing cost of the nonlinear solver and the reduced number of time steps required by using large t values. In both applications examined, the implicit time integration succeeds in reducing total calculation cost using large t to obtain similar solutions of fully explicit methods with small t. 4. The Imp method which uses markers throughout the nonlinear solver gives stable and oscillation-less behavior for large t using L-stable time integration scheme. The cost of evaluating the nonlinear residual dominates total CPU time. 5. The SImp method advects quantities in the Eulerian frame to evaluate the nonlinearity associated with material transport. The SImp method produced solutions similar to those obtained by the Imp method in our experiments. Although the Euler advection method may generate a numerically diffusive error at a high wavenumber mode, the use of a higher order FD scheme could improve such degradation of solution accuracy. Compared with the Imp, the SImp can reduce the cost of the nonlinear solver significantly because the residual evaluation is less expensive. In summary, the implicit material transport implemented by the nonlinear iterative solver can reduce the CPU time required to solve buoyancy driven processes that possess a free surface. An enlarged stability domain, second-order accuracy, and the resulting larger time steps permitted were found to be beneficial, even though the implicit algorithms require the solution of a nonlinear problem. The proposed semi-implicit strategy reduced computational cost of the fully implicit method significantly.
In the future, we will apply the developed implicit advection strategy to three-dimensional planetary scale simulations. We are especially interested in simulating convective flow patterns to study the thermal anomaly associated with sinking iron diapirs, which is a process that plays a critical role during planetary core formation [7].
The advection equation of quantity q in the Eulerian frame is given as follows: In the treatment of SImp, we numerically solve Eq. (A.1) using the Euler method. In this work, we examined first and fifth order upwind FD discretization schemes for the spatial derivatives of q, which are given as follows: where I is the grid point in the ith direction.

Appendix B. One dimensional ODE solutions
To validate our Stokes flow solutions for different time-stepping methods in the bump test (Section 7.1), we calculate the numerical solution of the ODE of the viscous bump relaxation given by where ξ = t/τ . The initial value is h p (0) = h 0 + δh.