Evaluation of Implicit‐Explicit Additive Runge‐Kutta Integrators for the HOMME‐NH Dynamical Core

The nonhydrostatic High‐Order Method Modeling Environment (HOMME‐NH) atmospheric dynamical core supports acoustic waves that propagate significantly faster than the advective wind speed, thus greatly limiting the time step size that can be used with standard explicit time integration methods. Resolving acoustic waves is unnecessary for accurate climate and weather prediction. This numerical stiffness is addressed herein by considering implicit‐explicit additive Runge‐Kutta (ARK IMEX) methods that can treat the acoustic waves in a stable manner without requiring implicit treatment of nonstiff modes. Various ARK IMEX methods are evaluated for their efficiency in producing accurate solutions, ability to take large time step sizes, and sensitivity to grid cell length ratio. Both the gravity wave test and baroclinic instability test from the 2012 Dynamical Core Model Intercomparison Project are used to recommend 5 of the 27 ARK IMEX methods tested for use in HOMME‐NH.


Introduction
In recent years, the global atmospheric modeling community has directed substantial effort toward the development of nonhydrostatic global atmospheric modeling systems (Ullrich et al., 2017). Although global weather prediction systems, such as the Integrated Forecast System (Wedi et al., 2015), have been run at nonhydrostatic resolutions for years, modern supercomputers are reaching the point where globally uniform nonhydrostatic simulations at sub-10-km horizontal grid resolution are now possible on longer timescales for seasonal to decadal prediction (Reale et al., 2017). Even on climatological time scales, investments in the development of regionally refined models, also known as variable-resolution models, have permitted limited areas of the domain to be simulated with horizontal resolution that is more typical of regional climate models (Harris et al., 2016;Huang et al., 2016;Rauscher & Ringler, 2014;Rhoades et al., 2018;Zarzycki et al., 2014).
A major hurdle in transitioning from hydrostatic to nonhydrostatic models is the numerical stiffness associated with vertical propagation of sound waves. Such waves are largely irrelevant to weather or climate modeling, but the fine vertical grid spacing (on the order of tens of meters) necessitated by the highly stratified nature of the atmosphere relative to the horizontal grid spacing (on the order of 1 km or more) provides a major source of numerical stiffness. The standard strategy of using fully implicit methods for integrating numerically stiff equations can be a computational burden in operational models (Evans et al., 2017;Lott et al., 2015), and consequently, several alternatives have been developed. One alternative to using fully implicit methods is to use modified equation sets that do not support vertically propagating waves (Arakawa & Konor, 2009;Durran, 1989;Ogura & Phillips, 1962). However, these equations either cannot be employed on all scales or require global communication at each time step (i.e., Davies et al., 2003;Klein et al., 2010).
waves. The ARK IMEX methods treat the "fast" terms implicitly while retaining an explicit treatment of the "slow" terms, with coefficients and ordering devised in such a manner as to ensure stability and accuracy.
A vast library of ARK IMEX methods are now available throughout the published literature, but only a few studies have assessed these methods in the context of nonhydrostatic global atmospheric models Giraldo et al., 2013). It is thus the goal of this paper to outline a number of metrics that can be used in performing this assessment and apply these metrics in the context of a new nonhydrostatic dynamical core.
The dynamical core being assessed in this work is the spectral element nonhydrostatic High-Order Method Modeling Environment: HOMME-NH . The hydrostatic version of HOMME-NH is presently used by the Energy Exascale Earth System Model (E3SM) atmospheric component model (E3SM Project, 2018) and the Community Atmosphere Model Spectral Element dynamical core (Dennis et al., 2012;Rasch et al., 2018). The spectral element method used in HOMME-NH has many desirable properties for atmospheric modeling including parallel scalability, flexibility, and accuracy (Marras et al., 2016).
Of particular importance for this study, the eigenvalues of the spatial SE operator are purely imaginary when explicit diffusion is disabled . This characteristic suggests that a desirable property of ARK IMEX methods is that the stability region of the explicit method encompasses as large a region of the imaginary axis as possible, that is, Kinnmark and Gray (1984). Although the recommendations for ARK IMEX methods reached at the end of this paper is based on only a single model, the authors believe that these conclusions are likely applicable to any problem where the spatial operators for the explicitly treated problem components have purely imaginary eigenvalues.
The dynamical core studied in Gardner et al. (2018) is similar to HOMME-NH in that both use the spectral element method and, therefore, have a spatial operator with purely imaginary eigenvalues. Furthermore, both dynamical cores partition the computational grid such that every vertical column is placed on only one computational node. This allows for the implicit treatment of vertical terms without internode communication. The two dynamical cores do, however, use different model formulations for the atmosphere. Unlike the formulation studied in Gardner et al. (2018), there are only two forcing terms responsible for vertical acoustic wave propagation in the HOMME-NH formulation. As such, this work considers a single splitting that treats only those two terms implicitly. No additional investigation into treating horizontal terms implicitly is conducted because Gardner et al. (2018) found such splitting incurs too much communication cost to be beneficial. Instead, the focus herein is on investigating 18 new ARK IMEX methods specifically developed for systems with purely imaginary eigenvalues, in addition to 9 methods from Gardner et al. (2018) that performed well on the formulation considered there. This work also extends the acceptable solution accuracy criterion in Gardner et al. (2018) to a criterion based on surface pressure.
The remainder of this paper begins with a brief review of the nonhydrostatic model used by HOMME-NH in section 2. Section 3 then discusses the spatial discretization and hyperviscosity treatment, which has implications on time integrator performance, as well as the 27 time integration schemes that will be evaluated. Section 4 presents the various evaluation criteria and results including verification, energy conservation, and time-to-solution tests. Recommendations on time integration schemes for both HOMME-NH and similar conservative systems, with purely imaginary spatial eigenvalues, are provided in section 5. That section also recommends some improvements outside of the time integration scheme, which will likely affect time integration performance, before the paper concludes in section 6.

Journal of Advances in Modeling Earth Systems
10.1029/2019MS001700

The HOMME Nonhydrostatic Atmosphere Model
The formulation of the nonhydrostatic primitive equations used in HOMME-NH is a modified version of that developed by Laprise (1992): where g is the acceleration due to gravity, the density, p the pressure, w the velocity component in the direction of the spherical radius (vertical velocity), u the remaining velocity components (horizontal velocity), the geopotential, the potential virtual temperature, and the hydrostatic pressure gradient. After noting that the hydrostatic pressure is the result of vertically integrating g , the hybrid mass-based vertical coordinate is defined implicitly: = A( )p 0 + B( )p s , where p 0 is the top of atmosphere pressure, p s is the surface pressure, and (A, B) are linear functions of such that (A, B) = (1, 0) when represents the top of the atmosphere and (A, B) = (0, 1) when represents the surface. The horizontal gradient, divergence, and curl operators in this pressure-based coordinate system are denoted ∇ , ∇ ·, and ∇ ×, respectively. Further details regarding the definition, derivation, and boundary conditions of the HOMME-NH primitive equations can be found in Taylor et al. (2019).

Numerical Methods
To solve (1), the unknown fields u, w, , Θ, are discretely represented on a cubed-sphere computational grid. In the horizontal direction, the unknown fields and associated horizontal derivative operators (i.e., ∇ ·, ∇ , and ∇ ×) in (1) are represented using a fourth-order, mimetic, spectral element method (Taylor & Fournier, 2010). These discretized operators are mimetic in that ∇ · and ∇ are adjoints in a discrete sense. In the vertical direction, the unknown fields are staggered with some quantities expressed at grid cell midpoints and others at grid cell interfaces. The midpoint quantities are u, , and . The interface quantities are w and . For vertical-derivative operators (i.e., ∕ ), the second-order accurate SB81 (Simmons & Burridge, 1981) approach is used. The SB81 discretization is also mimetic and supports a discrete product rule for the discrete spatial derivative operator ∕ . Note that these mimetic properties result in a discrete system that both conserves energy and, when linearized about a steady-state solution, has eigenvalues along the imaginary axis. It should also be noted that the computational grid can optionally be remapped in the vertical direction to improve numerical stability; however, the current use of the parabolic spline method (Zerroukat et al., 2006) does not conserve energy in a discrete sense. Additional spatial discretization details can be found in Taylor et al. (2019).

Integration in Time
After the discretization of the spatial-differential operators, (1) is now a system of ordinary differential equations: dq∕dt = f(q), where q(t) is the vector of all unknown grid quantities. To address the numerical stiffness caused by the presence of acoustic waves, a typical approach is to separate out the components of f(q) responsible for vertical acoustic wave propagation. For some nonhydrostatic models, this is not a trivial task . For (1), however, the terms to be separated are those that couple the vertical velocity w, the geopotential , and the nondimensional quantity . Note that if the system is hydrostatic, with no vertically propagating acoustic waves, then will be identically 1 because p = . This observation

10.1029/2019MS001700
motivates an additive splitting, f(q) = f E (q) + f I (q), given by where i enumerates values at the center of the grid cell, vertically, and j enumerates values at the vertical edges.
ARK IMEX methods are designed for additively split systems such as (2), where an explicit RK method is used on f E while a diagonally implicit RK method is used on f I . With q n denoting the approximate solution of q(t) at time t n , these ARK IMEX methods approximate q(t n + Δt), denoted q n+1 , via , where Various conditions on a E i, , a I i, , b E i , and b I i exists to ensure a certain order of accuracy (Araujo et al., 1997). The remaining degrees of freedom can used to enforce stability properties and improve other aspects of the methods, leading to an abundance of potential IMEX RK methods in the literature. Considering the purely imaginary eigenvalues associated with the linearization of (1) and the results of Gardner et al. (2018), 26 existing ARK IMEX methods and 1 explicit RK method are selected for evaluation. In addition, a method developed by one of the authors specifically for nonhydrostatic atmosphere models is also evaluated. All 28 methods are listed in Table 1, along with their theoretical order of accuracy, number of stages requiring a nonlinear solve (f I ), number of stages requiring an evaluation of f E (z ) (f E ), and references. Table B1 contains a more exhaustive list of properties for each ARK IMEX method and plots of the stability regions for each method are available in a Zenodo archive (Vogl et al., 2019a).

ARK IMEX Implementation Using ARKode
Each stage z i of an ARK IMEX method requires solving a nonlinear system F(z i ) = 0. The formation and solving of this system is done using the ARKode package of SUNDIALS (Hindmarsh et al., 2005;Woodward et al., 2018). While ARKode can adaptively adjust time step sizes by default, this feature is disabled so the time step size is fixed. The nonlinear system is solved using a Newton approach, where approximations z (m) i to stages z i for forming q n+1 are iterated from an initial guess z (0) i , chosen to be q n : Because f I only involves vertical derivatives in (2), the linear system for (m+1) i is block diagonal, with one block per vertical column of the computational grid. The resulting linear solve requires no parallel communication because each block is local to single processor. Furthermore, each block is tridiagonal , involving only j and w j , and is thus solved using LAPACK's tridiagonal solver routines DGTTRF and DGTTRS. The stopping criterion for ARKode is that where R 0 i = 1 and = 10 −1 . For a solution vector q with N components, the weighted root mean squared norm || · || WRMS is defined as where r = 10 −6 , a,k = 10 r when q k is either u k or w k , a,k = 10 5 r when q k is k , a,k = 10 6 r when q k is k , and a,k = r when q k is k . Here, v k corresponds to the kth component of the previous time step solution. SUNDIALS 3.1.2 was the current version available when this work began and, as such, is the version used herein. The ARKode framework in SUNDIALS 3.1.2 assumes a global nonlinear system and applies the aforementioned Newton method and stopping criterion to all grid cells together. While this approach does not take full advantage of the column structure of the HOMME-NH nonlinear system, the global Newton approach used herein typically required two to three iterations to reach the stopping criterion using the values of r and a,k mentioned above. The authors note that more recent versions of SUNDIALS allow for a user-defined nonlinear solver object, where the Newton method and stopping criterion could be applied to each grid column individually. Given the low numbers of iteratinos needed, we expect the nonlinear solver approach used here is sufficiently efficient to allow for accurate integrator assessments.

Hyperviscosity
Although the spectral element method used in HOMME-NH conserves energy in a discrete sense, it also produces persistent nonphysical waves . To address this in HOMME-NH, hyperviscosity is applied by adding fourth-order, horizontal derivatives to the right-hand side of the ODE system: where is the hyperviscosity magnitude. This approach stabilizes the nonphysical waves at the cost of adding a tunable amount of energy dissipation to the system. The hyperviscosity operator is applied, in a substepped fashion if necessary, after (2) is advanced in time. More specifically, the ARK IMEX update is performed first and followed by the hyperviscosity update: This approach does limit formal convergence to first order in time both because of the first-order Euler time integration and the first-order operator splitting; however, this limitation is usually only seen at time step sizes significantly below operational Δt values. Furthermore, this approach only requires K computations of Δ 2 u versus computing Δ 2 z u at each stage of the ARK IMEX method.

Evaluation of ARK Methods
The ARK IMEX methods described in section 3 are evaluated on test problems defined at the 2012 Dynamical Core Model Intercomparison Project, summarized by . The tests are all conducted on the Quartz machine at Lawrence Livermore National Laboratory. Each Quartz node has a 2.1-GHz Intel Xeon E5-2695 v4 CPU with 36 cores and 128 GB of memory connected to an Omni-Path interconnect. To build the HOMME standalone code, the Intel 16.0.3 compiler is used with MPICH 3.2, NetCDF-C 4.5.0, NetCDF-Fortran 4.4.4, HDF5 1.10.3, and SUNDIALS 3.1.2. A test problem simulating nonorographic gravity waves on a small planet is first used to verify the implementation of the methods. A second test problem simulating a baroclinic instability on a full planet is used to assess energy conservation of the methods and then to evaluate the methods on accurate solution efficiency and the ability to take large time step sizes. Both test problems are run using 216 MPI ranks across six computational nodes. The HOMME-NH code and namelist files are available in a Zenodo archive (Vogl et al., 2019b).

Nonorographic Gravity Wave Test
For the nonorographic gravity wave test problem, a planet of radius 1∕125 that of Earth with no rotation or surface topography is used. With a prescribed zonal wind speed, pressure, and temperature fields can be chosen so that the system initially is in a steady state. The initial condition for this test uses these fields with a perturbation added to the potential temperature to generate a gravity wave. Full details of the setup are available from .
A convergence study is performed to verify the implementation of the ARK IMEX methods, where observed convergence rates of the temperature field are compared to their theoretical predictions. The spatial resolution is held fixed at 4,374 fourth-order horizontal elements (≈1-km spacing) with 20 vertical levels of uniform height for each element. For this test, no vertical remap or hyperviscosity substepping is used (K = 0). The time step size (Δt) is varied for each of the ARK IMEX methods. In lieu of an analytic solution, a reference solution from applying the explicit KGU35 method (Guerra & Ullrich, 2016) with Δt = 3.90625E−4 is used. The maximum relative temperature error across all grid cells at t = 5 hr is shown for each Δt in Figure 1.
Note that in the absence of hyperviscosity, the errors of the various ARK IMEX methods decrease at a certain order as Δt is decreased until they reach around 7 * 10 −11 . The magnitude of this floor coincides with accumulated numerical round-off error. Using the Intel epsilon function, the accumulated round-off error is approximated by multiplying the obtained machine epsilon value (2.220446049250313 * 10 −16 ) by the number of time steps taken in the reference solution (768,000). The resulting value is shown in Figure 1 as a horizontal dashed line. A curve of the form Δt is fit through the two lowest error values that reside above the dashed line for each method. Table 2 shows good agreement between the measured order and the order predicted by numerical analysis theory. Similar convergence results for the IMKG methods can be found in Steyer et al. (2019).  Table 2 Results with hyperviscosity applied are also shown in Figure 1. The particular value for the hyperviscosity coefficient, = 5 * 10 8 m 4 /s, is chosen by tuning the value suggested by Ullrich et al. (2018) to the gravity wave test. As before, the errors of all the methods decrease as Δt is decreased; however, they are now all limited to first order. This is due to the time-split application of hyperviscosity, as discussed in section 3. Thus, the benefit of higher-order ARK IMEX methods at small Δt is more-or-less lost, although higher-order methods do show an advantage at larger Δt. The values of Δt where higher-order ARK IMEX methods are  advantageous will likely be problem dependent; therefore, a test problem that incorporates more features of a full Earth system run is considered next.

Baroclinic Instability Test
Like the nonorographic gravity wave test, the baroclinic instability test prescribes a zonal wind field that is balanced by pressure and temperature fields. A perturbation to the initial conditions is also added, only now it is to the zonal wind near the surface. Unlike the linear evolution of the gravity wave test, the velocity field in this test becomes unsteady, resulting in a nonlinear evolution. Full details of the setup are available from . The spatial resolution used for all baroclinic instability tests is 5,400 fourth-order horizontal elements (≈110-km spacing) with 30 vertical levels of varying heights chosen for more resolution near the surface. For this test, hyperviscosity substepping is used with K = 3.

Global Energy Conservation
One metric for determining the ideal ARK IMEX method is how well the time integration scheme conserves global energy. Recall from section 3 that while the mimetic discrete spatial operators conserve energy, the addition of hyperviscosity and vertical remapping procedure are not conservative and introduce energy conservation error. Figure 1 shows that the hyperviscosity error can dominate the overall error, making the time integration energy conservation error difficult to measure. Furthermore, simulation of the baroclinic instability test for more than a few hours requires vertical remapping of the Lagrangian vertical coordinate. Thus, short simulations are first conducted to compare the energy conservation error from each ARK IMEX method to the errors from the hyperviscosity and vertical remap approaches. While one might use the initial condition of the baroclinic instability test itself for this, the evolution of those conditions is fairly linear for small time and not indicative of the nonlinear environment expected in HOMME-NH. Thus, the result of simulating the baroclinic instability test for 7.5 days with ARS232 at a time step of 10 s, with hyperviscosity ( = 10 15 m 4 /s) and vertical remap applied at each time step, is used as an initial condition.
Results are obtained for the second-order ARS232 and third-order ARS343 methods at time steps of 10 s and 20 s. These methods are run with four different setups: neither hyperviscosity nor vertical remap, with hyperviscosity ( = 10 15 m 4 /s) but without vertical remap, without hyperviscosity but with vertical remap at each time step, and with both hyperviscosity ( = 10 15 m 4 /s) and vertical remap at each time step. Each setup is run for 24 hr or until the simulation goes unstable, usually indicated by a negative diagnosed density value. Figure 2 shows the magnitude of relative error in the global energy E(t) over time: [E(t) − E(0)]∕E(0). Without hyperviscosity or vertical remap, the simulation goes unstable for both ARS232 and ARS343 in less than 14 hr. The global energy conservation error for both methods at that point is at least an order of magnitude less than when hyperviscosity only is added or when vertical remap only is added. It is worth noting that if either hyperviscosity or vertical remap are present, then neither using a higher-order method (ARS343 vs. ARS232) or a smaller time step (10 s vs. 20 s) results in an improvement in energy conservation error. This lack of improvement indicates that the energy conservation errors from those sources is more dependent on the spatial resolution than on time step size. Another observation is that while it is expected that global energy is decreased by the addition of hyperviscosity, the addition of vertical remap without hyperviscosity increases global energy. The implications of this behavior are discussed in section 5.

Largest Accurate Time Step Size
Another metric for determining an ideal ARK IMEX method is how fast an accurate solution can be obtained. This is, of course, dependent on the computational hardware, code compilation, definition of accuracy, and criterion for acceptability. In addition to wall-clock run time, the largest accurate time step size is analyzed in various ways to address the variability from computational hardware and code compilation. The first measure is the largest accurate time step size itself, which is independent of CPU clock speed or code compilation. Another measure is the largest accurate time step size normalized by the number of stages requiring a nonlinear solve, which addresses performance when solving the nonlinear system dominates the computational expense. The final measure obtained is the largest accurate time step normalized by the number of stages requiring an evaluation of f E (z ), which addresses performance when internode communication dominates the computational expense. Both hyperviscosity, with the parameter from section 3.3 set to 10 15 m 4 /s, and vertical remap are necessary to obtain a 15-day solution and are applied at every time step.
As in Jablonowski and Williamson (2006) and Taylor et al. (2007), the surface pressure field p s (s, t) is used to determine whether a solution is sufficiently accurate. A reference surface pressure field is obtained at 24-hr intervals for 15 days by solving the hydrostatic version of (1), where is fixed to 1, for the baroclinic instability test using KGU35 with a time step of 10 s. A second hydrostatic solution is obtained using KGU35 with the production time step of 300 s. The difference between the reference surface pressure field after 15 days and the initial condition used is shown in Figure 3. Also shown is the difference between the reference and Δt = 300 s solutions after 15 days and the root-mean-square (RMS) difference in surface pressure. This RMS difference, denoted (t), is used as a tolerance for the accuracy of nonhydrostatic model (1) solutions, which is valid because the baroclinic instability test is well within the hydrostatic regime when 5,400 horizontal elements are used with 30 levels. Thus, the largest accurate time step for each ARK IMEX method can now be defined as the maximum time step of {10, 20, 50, 100, 120, 135, 150, 160, 180, 192, 200, 216, 240, 270, 300, 320} that results in a solution where the RMS difference of the surface pressure field from reference, denoted (t), is less than the tolerance (t) depicted in Figure 3 for all 15 days.
The largest accurate time step size and corresponding wall-clock run times are shown in Table 3. For the largest accurate time step overall, the ARK548 method produces an accurate solution with the hydrostatic time step Δt = 300 s. For the largest accurate time step normalized by number of stages requiring a nonlinear solve (Δt∕f I ), IMKG242b (two implicit stages) out performs the ARK548 method and produces an accurate solution in the least amount of time overall. This is consistent with the environment in which these

Ability to Use Hydrostatic Time Step Size
A final metric for determining an ideal ARK IMEX method is whether the current hydrostatic dynamical core time step of Δt = 300 s can be used. Certain components of an Earth system model might require HOMME-NH to advance the solution by the hydrostatic time step before the solution is coupled back to those components. As such, the ability to take the hydrostatic time step is assessed using the setup of the previous section but without an accuracy criterion. Here, "exceedance" is defined as how much the RMS deviation of surface pressure from the reference solution ( (t)) exceeds the RMS tolerance ( (t)). Table 4 shows the largest time step sizes that yield a 15-day solution, corresponding wall-clock run times, and relative maximum exceedance values, the latter of which is defined as ) . The six ARK IMEX methods that can produce a solution using Δt = 300 s are ARK548, DBM453, IMKG252b, IMKG253b, IMKG254a, and IMKG254b methods. Of those six, IMKG252b has the largest time step normalized by number of stages requiring a nonlinear solve and obtains the solution in the least amount of time. As before, the method with the largest time step normalized in this fashion is expected to excel in the environment where the nonlinear solution dominates computation cost. Looking at the largest time step normalized by number of stages requiring an evaluation of f E (z ), one might expect equal performance across the DBM453, IMKG252b, IMKG253b, IMKG254a, and IMKG254b methods. DBM453 shows significantly higher accuracy than the IMKG methods using Δt = 300 s while requiring similar run time as IMKG254a or IMKG254b and about 25% more run time than IMKG252b.
Recall from section 3 that terms related to vertically propagating acoustic waves are treated with the implicit Butcher Tableau, while the remaining terms, including those related to horizontally propagating acoustic waves, are treated with the explicit Butcher Tableau. If the horizontal grid resolution continues to improve while the vertical grid resolution remains constant, the stability properties of the explicit Butcher Tableau will become more important. Thus, each ARK IMEX method is now used to obtain solutions on planets of decreasing radii to investigate the effect of shrinking horizontal grid cell lengths. Two planets are considered: one with a radius 1∕10 that of Earth and a rotation rate 10 times that of Earth, where the hydrostatic time step size is 30 s, and one with a radius 1∕100 that of Earth and a rotation rate 100 times that of Earth, where the hydrostatic time step size is 3 s. Following , these two planets are denoted "X10" and "X100," respectively. With the number of grid cells held constant, these setups result in smaller relative horizontal grid cell lengths. Note that the hyperviscosity is adjusted to reflect the different grid cell size ( X10 = 10 12 m 4 /s and X100 = 10 9 m 4 /s). If the stability of the ARK IMEX method is not sensitive to this change, the largest time step size should scale with the radius. The results in Table 5 show varying levels of stability sensitivity. Of the methods that can produce a 15-day solution with the hydrostatic time step (Δt = 300 s), ARK548 shows strong sensitivity, requiring a time step size of 19.2 s on the X10 planet and of 1.6 s on the X100 planet. DBM453 shows more muted sensitivity, requiring a time step size of 27 s on the X10 planet and a time step size of 2.16 on the X100 planet. That said, the DBM453 time step sizes are within 10% of the 30-s hydrostatic time step size that IMKG254a, IMKG254b, and IMKG252b can take on the X10 planet, within 2% of the 2.2-s time step size that IMKG252b and IMKG254a can take on the X100 planet, and within 10% of the 2.4-s time step size that IMKG254b can take on the X100 planet.

Recommendations
In section 4, the 27 ARK IMEX methods listed in section 3 were evaluated on the metrics of largest time step size that yields an acceptable accurate solution, largest time step size that yields a 15-day solution Note. Largest accurate time step size (accuracy), efficiency in nonlinear solution cost dominated environments (solver cost), efficiency in communication cost dominated environments (communication cost), ability to take hydrostatic time step (stability), and ability to take hydrostatic time step on X10 planet (X10 stability) (✓: method outperformed most other methods, ✓*: method performed best of all methods).
of any accuracy, and sensitivity to horizontal grid resolution. The error contributions of the hyperviscosity and vertical remap approaches of HOMME-NH were also investigated. Noting that the HOMME-NH dynamical core will be solving fully nonlinear, stiff problems from Earth system models, the baroclinic instability test from 2012 is primarily chosen to evaluate the ARK IMEX methods, as it is both stiff and exhibits nonlinear phenomena. While the performance of any given ARK IMEX method will depend on the particular problem being solved, as well as the computational hardware, the authors are confident in making general recommendations for future development of both HOMME-NH and other dynamical cores.

ARK IMEX Methods for HOMME-NH
The performance of the various ARK IMEX methods on the baroclinic instability test for each of the metrics is summarized below by metric, with brackets grouping ARK IMEX methods of equivalent performance within each metric: While no one ARK IMEX method outperforms the rest in all metrics, there are a handful of methods that consistently perform better than the rest. Thus, the authors recommend that the methods in Table 6, listed in alphabetical order, be implemented and considered for operation in HOMME-NH, followed by accuracy assessments through more in-depth code verification tests. ARK548 does not appear in Table 6 because while the method was able to produce an accurate solution at the hydrostatic time step, the large number of both explicit and implicit stages make the cost of each time step significantly higher than that of the methods in Table 6.

Hyperviscosity and Vertical Remap
It is a major finding of this paper that the benefits of these time integration methods, namely, high order-of-accuracy and energy conservation, are essentially negated at very small time step sizes. While this finding likely does not effect results using current production time step sizes, it needs to be considered as spatial resolution is increased (effectively reducing production time step size). Furthermore, the energy conservation error from the vertical remap approach is positive, indicating that the remap process adds energy to the system. The authors therefore recommend investigating possible improvements to these approaches in HOMME-NH. One way to address the hyperviscosity error is to include the application of hyperviscocity in the explicit portion of the ARK IMEX scheme instead of applying it in a split fashion. Meanwhile, the vertical remap approach might be swapped out for one that discretely conserves energy or, if that is not feasible, for one that dissipates a small amount of energy. This might allow for a smaller hyperviscosity coefficient , because the results in Figure 2 suggest the coefficient is currently tuned to counteract the energy added by the vertical remap.

Conclusions
The nonhydrostatic atmosphere model, presented in section 2, provides the interesting challenge of handling acoustic wave propagation. This work seeks to find the time integration scheme to be implemented in the HOMME-NH dynamical core that most efficiently addresses this challenge. A class of time integration schemes known as ARK IMEX methods are selected for evaluation due to their ability to treat troublesome model terms implicitly in time while treating others in an explicit manner, all without losing access to higher-order accuracy. Twenty-seven different ARK IMEX methods, listed in section 3, are evaluated on efficiency in producing an accurate solution, ability to take the corresponding hydrostatic time step, and sensitivity to horizontal grid resolution. Instead of a single ideal method, the authors recommend five ARK IMEX methods be implemented in HOMME-NH and suggest that improvements to the hyperviscosity and vertical remap approaches will significantly improve time integration performance.
In section 4, the Dynamical Core Model Intercomparison Project 2012 nonorographic gravity wave and baroclinic instability tests are used to evaluate the ARK IMEX methods. A convergence test first verifies that the ARK IMEX methods, implemented with ARKode, attain the expected convergence orders in the absence of hyperviscosity. When hyperviscosity is applied in split fashion, however, the test reveals that the error from the hyperviscosity approach in HOMME-NH can dominate the overall error for small enough time step sizes. This finding was verified with global energy conservation in the baroclinic instability test, where the errors from the hyperviscosity and vertical remap approaches were compared to the error from the ARK IMEX method itself. It was found that for small time step sizes, both the hyperviscosity and vertical remap errors are orders of magnitude larger than the ARK IMEX method, with the vertical remap additionally causing an increase in the global energy of the system. These small time step sizes are currently below the production time step size, but increases in spatial resolution will continue to decrease that production time step size. For this reason, the authors recommend possibly including the hyperviscosity application in the explicit portion of the ARK IMEX method, to reduce the hyperviscosity error and modifying the vertical remap approach to either discretely conserve energy or be slightly dissipative.
For performance at the current production time step size, the 27 ARK IMEX methods are evaluated on how efficiently an accurate 15-day solution can be produced in the baroclinic instability test, which requires both hyperviscosity and vertical remap. An accuracy criterion is defined using the surface pressure, where the error tolerance is defined as the difference between the solution using a production time step size and a reference solution using a time step size that is 30 times smaller. In addition to producing a solution of acceptable accuracy in an efficient manner, the ARK IMEX methods are evaluated on the ability to produce a solution, regardless of accuracy, using the hydrostatic time step of Δt = 300 s. Finally, the sensitivity of the methods to shrinking horizontal grid cell lengths was investigated using planets of reduced radii. The methods that excelled in each metric are summarized in section 5, with the authors recommending that the ARK324, DBM453, IMKG242b, IMKG243a, and IMKG252b methods be implemented in HOMME-NH. The authors encourage future work using the metrics described herein to test newly developed integration techniques, ARK IMEX or otherwise, on more complex test problems (full planet nonhydrostatic dynamics, orography, etc.) using both HOMME-NH and other dynamical cores.

Appendix A: Custom IMEX ARK Method
The a. five stages, with four implicit solves per step, with the same coefficient used in each implicit solve, b. third-order accuracy for both explicit and implicit methods, with third-order coupling, c. the explicit method has provably maximal stability along the imaginary axis for a five-stage method (see Kinnmark & Gray, 1984), and d. the implicit method is both A and L stable.
Of these, the authors felt that property (c) was of fundamental importance for this application, due to the fact that the eigenvalues of the explicit portion of the model, f E (q) lie on the imaginary axis, and frequently serve to limit the maximum stable step size for IMEX methods. Table B1 provides a variety of theoretical properties of each of the ARK methods used in this paper. The categories match those from Gardner et al. (2018 , Table A1):

Appendix B: IMEX ARK Method Properties
• number of implicit solves per step (f I column); • number of explicit stages per step (f E column);