Fornax: a Flexible Code for Multiphysics Astrophysical Simulations

This paper describes the design and implementation of our new multi-group, multi-dimensional radiation hydrodynamics (RHD) code Fornax and provides a suite of code tests to validate its application in a wide range of physical regimes. Instead of focusing exclusively on tests of neutrino radiation hydrodynamics relevant to the core-collapse supernova problem for which Fornax is primarily intended, we present here classical and rigorous demonstrations of code performance relevant to a broad range of multi-dimensional hydrodynamic and multi-group radiation hydrodynamic problems. Our code solves the comoving-frame radiation moment equations using the M1 closure, utilizes conservative high-order reconstruction, employs semi-explicit matter and radiation transport via a high-order time stepping scheme, and is suitable for application to a wide range of astrophysical problems. To this end, we first describe the philosophy, algorithms, and methodologies of Fornax and then perform numerous stringent code tests, that collectively and vigorously exercise the code, demonstrate the excellent numerical fidelity with which it captures the many physical effects of radiation hydrodynamics, and show excellent strong scaling well above 100k MPI tasks.

Most of Fornax is written in C, with only a few Fortran 95 routines for reading in microphysical data tables, and we use an MPI/OpenMP hybrid parallelism model now optimized for Cray and KNL architectures.Fornax employs general orthogonal coordinates in one, two, and three spatial dimensions, solves the comoving-frame, multi-group, two-moment, velocity-dependent transport equations to O(v/c), and uses the M 1 tensor closure for the second and third moments of the radiation fields (Dubroca & Feugeas 1999;Vaytet et al. 2011).Fornax can accommodate any orthogonal geometry, though for the core-collapse supernova (CCSN) problem we generally employ a spherical grid.In this paper, we describe in detail the computational philosophy and methods of Fornax ( §2), the formulation of the equations ( §3), the discretization approach ( §4), specifics concerning reconstruction, solvers, and coupling ( §5), the algorithmic steps ( §6), the implementation details of the dendritic grid in 3D spherical coordinates ( §7), the results of numerous standard hydrodynamic ( §8) and radiation ( §9) test problems, and conclude with some general observations in §10.

GENERAL DESCRIPTION OF FORNAX
The hydrodynamics in Fornax is based on a directionally-unsplit, Godunov-type finite-volume method.Fluxes at cell faces are computed with the fast and accurate HLLC approximate Riemann solver based on left and right states reconstructed from the underlying volume-averages.
The reconstruction is accomplished via a novel algorithm we developed specifically for Fornax that uses moments of the coordinates within each cell and the volume-averaged variables to reconstruct TVD-limited, high-order profiles.The profiles always respect the cells' volume averages and, in smooth parts of the solution away from extrema, yield third-order accurate states on the faces.
The code is written in a generalized covariant/coordinate-independent fashion, and so can employ any coordinate mapping (see Appendix §A).This allows the use of an arbitrary orthogonal coordinate system and facilitates the artful distribution of zones in a given geometry.To circumvent Courant limits due to converging angular zones when using spherical coordinates, the code can deresolve in both angles (θ and φ) independently near the origin and polar axes as needed, conserving hydrodynamic and radiative fluxes in a manner similar to the method employed in SMR (static-mesh-refinement) codes at refinement boundaries.The use of such a "dendritic grid" (discussed further in §7) allows us to avoid angular Courant limits near coordinate singularities, while maintaining accuracy and enabling one to employ the useful spherical coordinate system natural for the supernova problem.All components of the velocity vector are included in the hydrodynamics, enabling, e.g., the evolution of a rotating flow in 2D axisymmetry.
Importantly, the overheads for the calculations of various geometric quantities such as Christoffel symbols are minimal, since the code uses static mesh refinement, and hence the terms need to be calculated only once (in the beginning).Therefore, the overhead associated with the covariant formulation is almost nonexistent.In the context of a multi-species, multi-group, radiation hydrodynamics calculation, the additional memory overhead is small (note that the radiation typicially requires hundreds of variables to be stored per zone).The additional costs associated with occasionally transforming between contravariant and covariant quantities and in the evaluation of the geometric source terms are negligible, especially in the context of a radiation hydrodynamics calculation.
The various radiation species (photons or neutrinos) are followed using an explicit Godunov method applied to the radiation transport operators, but an implicit method is used for the radiation source terms.In this way, the radiative transport is handled locally, without the need for a global solution on the entire mesh.This is also the recent approach taken by Just, Obergaulinger, & Janka (2015), Roberts et al. (2016), andO'Connor &Couch (2018), though with some important differences.By addressing the transport operator with an explicit method, we significantly reduce the computational complexity and communication overhead of traditional multidimensional radiative transfer schemes by circumventing the need for global iterative solvers that have typically shown to be slow and/or problematic beyond ∼10,000 cores.We have demonstrated excellent strong scaling in three dimensions well beyond 100,000 MPI tasks using Fornax on KNL and Cray architectures (Appendix §E).
The light-crossing time of a zone generally sets the timestep, but since the speed of light and the speed of sound in the inner core are not far apart in the corecollapse problem after bounce, this numerical stability constraint on the timestep is similar to the CFL constraint of the explicit hydrodynamics.For cases in which this near correspondence does not obtain, we have implemented a reduced-speed-of-light approximation.Radiation quantities are reconstructed with linear profiles and an HLLE-like solver is used to determine numerical fluxes at zone boundaries.These HLLE fluxes are corrected to obtain the correct asymptotic behavior in the diffusion limit (Berthon, Charrier, & Dubroca, 2007).The momentum and energy transfer between the radiation and the gas are operator-split and addressed implicitly.
When using spherical coordinates in 2D and 3D, gravity is handled with a multipole solver (Müller & Steinmetz 1995), where we generally set the maximum spher-ical harmonic order necessary equal to twelve.The monopole gravitational term is altered to approximate general-relativistic (GR) gravity (Marek et al. 2006), and we employ the metric terms, g rr and g tt , derived from this potential in the neutrino transport equations to incorporate general relativistic redshift effects (in the manner of Rampp & Janka 2002; see also Skinner et al. 2016).For the CCSN problem, we have used for our recent 2D and 3D simulations the K = 220 MeV equation of state (EOS) of Lattimer & Swesty (1991), the SFHo EOS (Steiner et al. 2013), and the DD2 EOS (Typel et al. 2010;Hempel & Schaffner-Bielich 2010).
Without gravity, the coupled set of radiation/hydrodynamic equations conserves energy and momentum to within the tolerance of the implicit solver.With gravity, energy conservation is excellent before and after core bounce ( §8.9).However, as with all other supernova codes, at bounce the total energy as defined in integral form experiences a glitch by ≥ 10 49 ergs,5 due to the fact that the gravitational terms are traditionally handled in the momentum and energy equations as source terms, hence the equations are not solved in conservation form.
Though constructed originally for the CCSN problem, Fornax is a flexible general radiation/hydrodynamics code that functions well in a variety of geometries and coordinate systems.As already stated, in this paper we focus on this generic character as we demonstrate its strengths and capabilities and do not here emphasize its implementation in its original CCSN context.This we leave to a future paper geared specifically to the use of Fornax in studying CCSNe.Again, the main advantages of the Fornax code are its efficiency due to its use of explicit transport, its excellent strong scaling to hundreds of thousands of cores (Appendix §E), its multidimensional transport capabilities, and its interior static mesh derefinement near spherical coordinate singularities ( §7).

FORMULATION OF THE EQUATIONS
In Fornax, we use a general covariant formalism based on an arbitrary orthogonal metric to write our equations in a way that makes no reference to any particular geometry or set of coordinates.Perhaps the biggest advantage of this approach is flexibility; as we discuss below, switching geometries and coordinates in Fornax is straightforward.

Hydrodynamics
As is standard practice, we denote contravariant components of a vector with raised indices, covariant components with lowered indices, covariant differentiation with a semicolon, partial differentiation with a comma, and make use of Einstein notation for summation over repeated indices.Here and throughout this work, we adopt a coordinate basis.In this notation, the equations of Newtonian hydrodynamics can be written (ρe) ,t + ρv i e + P ρ where e is the specific total energy of the gas, X is an arbitrary scalar quantity that may represent, e.g., composition, and S j , S E , and S X are source terms that account for additional physics.The contravariant components of the velocity are v i = dx i /dt, i.e., they are coordinate velocities.In a coordinate basis, the covariant derivatives can be expanded to yield where g is the determinant of the metric, Γ l jk are the Christoffel symbols, defined in terms of derivatives of the metric, and T k l ≡ ρv i v j + P δ i j is the fluid stress tensor.Note that we have chosen to express the momentum equation as a conservation law for the covariant compenents of the momentum.There is good reason to do so.Written this way, the geometric source terms, Γ l jk T k l , vanish identically for components associated with ignorable coordinates in the metric.A good example is in spherical (r, θ, φ) coordinates where, since φ does not explicitly enter into the metric, the geometric source terms vanish for the ρv φ equation.Physically, ρv φ is the angular momentum, so we are left with an explicit expression of angular momentum conservation that the numerics will satisfy to machine precision, rather than to the level of truncation error.In general, the covariant expression of the momentum equation respects the geometry of the problem without special consideration or coordinate-specific modifications of the code.

Radiation
Fornax evolves the zeroth and first moments of the frequency-dependent comoving-frame radiation transport equation.Keeping all terms to O(v/c) and dropping terms proportional to the fluid acceleration, the monochromatic radiation moment equations can be written In equations ( 3), E ν and F νj denote the monochromatic energy density and flux of the radiation field at frequency ν in the comoving frame, P j νi is the radiation pressure tensor (2 nd moment), Q k νji is the heat-flux tensor (3 rd moment), and R νE and R νj are source terms that account for interactions between the radiation and matter.These interaction terms are written where j ν is the emissivity, κ ν is the absorption coefficient, and σ ν is the scattering coefficient.Correspondingly, there are energy and momentum source terms in the fluid equations: As in the fluid sector, we rewrite these covariant derivatives as partial derivatives, introducing geometric source terms in the radiation momentum equation.Fornax can treat either photon or neutrino radiation fields.For neutrinos, Equations (3) are solved separately for each species, and Equations ( 5) are summed over species.Additionlly, the electron fraction is evolved according to Equation 2d, with X = Y e and where s refers to the neutrino species and where N A is Avogadro's number.

NUMERICAL DISCRETIZATION
To maintain conservation while admitting discontinuous shock solutions, we adopt a finite-volume discretization of the equations presented in Section 3. Independent of geometry or coordinates (x), the volume element can always be expressed dV = √ g d 3 x.Similarly, for arbitrary orthogonal coordinates the area element is dA = √ g d 2 x.Averaging the equations over control volumes (cells) leads to a set of exact equations describing the evolution of cell volume-averaged quantities.
Applying the divergence theorem, we express the divergence terms in each equation as a net flux through cell faces.For example, the volume-averaged density in cell (i, j, k) evolves according to Two things are of note.First, by integrating over cell volumes, we have transformed our set of partial differential equations into a large set of coupled ordinary differential equations.Second, Equation ( 8) is exact, provided the fluxes on each face are taken to represent cell face area-averaged quantities.
Source terms appear in nearly all the evolution equations and, for consistency, must also be volume-averaged.If these source terms were linear in the conserved variables, this volume averaging would be trivial.Unfortunately, all the source terms are nonlinear, which requires that we make choices for how the volume averaging is carried out.For example, the geometric source terms in Equation 2b are typically treated as where angle brackets indicate a volume average over a given cell and T k l (U) is stress tensor computed from the vector of volume-averaged conserved variables.It is straightforward to show that the error in this approximation is O(∆x 2 ), but this is not a unique second-order accurate expression.Adopting alternative expressions for some geometric source terms can have desirable properties, such as exact conservation of angular momentum (see Appendix §A).
With space fully discretized, we now turn to the issue of temporal evolution.Each of our equations can be written in the form where Q is some volume-averaged quantity evolved on the mesh, F i Q is the area-averaged flux of that quantity, and the volume-averaged source terms have been grouped together according to whether or not they are stiff.The only terms currently treated by Fornax that are stiff are the interaction terms that couple radiation and matter and the terms that handle reactions (nuclear or chemical).These stiff terms require an implicit treatment for numerical stability since the characteristic time scales for these interactions may be extremely short compared to the explicit Courant time step for the matter or radiation.All other terms are treated explicitly and are evolved together without operator splitting.The explicit time integration is currently carried out using Shu & Osher's (1988) optimal second-order Runge-Kutta scheme.
The frequency dependence of the radiation moments is represented in discrete groups linearly or logarithmically spaced between a user-specified minimum and maximum frequency. 6The number of groups per species N sg is also set by the user and can be different for each species.Unless otherwise indicated, the radiation moments and interaction coefficients are integrated over each frequency group, appearing with a subscript g.For example, is the energy density of species s in group g which has units of energy density.In integrating over frequency groups, the monochromatic radiation moment Equations (3) become the group-integrated moment equations, given for each group g (and for each species s) by 5. RECONSTRUCTION, SOLVERS, AND INTERACTION TERMS 5.1.Reconstruction The hydrodynamics in Fornax is evolved using a directionally-unsplit, high-order Godunov-type scheme.Having already described the underlying spatial discretization of the variables as well as the time-stepping scheme, the essential remaining element is a method for computing fluxes on cell faces.Since Fornax employs Runge-Kutta time stepping, there is no need for the characteristic tracing step or transverse flux gradient corrections required in single-step unsplit schemes.This approach has the great advantage of relative simplicity, especially in the multi-physics context.Fornax follows the standard approach used by similar codes, first reconstructing the state profile at cell faces and then solving the resulting Riemann problem to obtain fluxes.
There are many potential approaches to reconstructing face data based on cell values.One particularly popular approach is to reconstruct the primitive variables (ρ, P , and v i ) using the method described by Woodward & Colella (1984), the so-called piecewise-parabolic method (PPM).This method is based on the idea of reconstructing profiles of the volume-averaged data, with curvilinear coordinates incorporated by reconstructing the profiles in the volume coordinate rather than in the original coordinates themselves.This approach can be problematic in the vicinity of coordinate singularities.Blondin & Lufkin (1993) suggest a generalized approach to PPM in cylindrical and spherical coordinates.Unfortunately, it can be shown that their approach cannot be used generically and, in fact, fails even for standard spherical coordinates.Here, we present an alternative approach that works in arbitrary geometries and coordinates, which contains PPM as a special case.
We begin by writing down an expression for a general n th -order polynomial: where c j is the coefficient of the j th -order monomial.As in PPM, we wish to construct a polynomial p(x) consistent with p i , the volume average of quantity p in cell i.
In our formulation, the volume average of p(x) must then satisfy a set of constraint equations of the form where x i±1/2 ≡ x i ±∆x/2 and the quantities x j i can be precomputed for each cell either directly or via Romberg integration then stored in one-dimensional arrays.For the case n = 0, i.e., for piecewise-constant reconstruction, this yields the trivial result that p(x) = p i .For n > 0, we must further constrain the reconstruction.To proceed, we first distinguish between cases where n is even and n is odd.For odd n, we form a symmetric stencil around each edge containing data in n + 1 cells and require that p i = p i in each cell i.This necessarily yields continuous left and right states at cell interfaces, but that continuity may be broken by the subsequent application of a slope limiter.For even n, we similarly constrain the polynomial within a symmetric stencil of n + 1 cells, this time centered on a given cell, yielding potentially discontinuous data at cell faces even before the application of a slope limiter.
In Fornax, since we most often employ third-order piecewise-parabolic reconstruction (n = 2), we now describe it in more detail.The reconstruction of p(x) in cell i depends on the data values p i−1 , p i , and p i+1 .Note that this requires the same number of ghost cells as linear reconstruction, is of similar cost, and being of higher order, produces superior results for many problems, as we show below and in §8.
To begin, if p i is a local extremum of the data, then the reconstruction in cell i is taken to be piecewise-constant.Otherwise, three constraints in the form of Equation ( 14) are used to form the linear system   which is then solved for the unique interpolation coefficients c 0 , c 1 , and c 2 .The resulting polynomial is then used to determine p R,i−1/2 ≡ p(x i−1/2 ), the right state at the lower face of cell i, and p L,i+1/2 ≡ p(x i+1/2 ), the left state at the upper face of cell i.If p R,i−1/2 is sufficiently close to either p i−1 or p i such that p(x) has an extremum for x ∈ (x i−1 , x i ), then p R,i−1/2 is reset to ensure that p(x) is monotone on this interval and has zero slope at x i−1 .The value of p L,i+1/2 is reset in an analogous manner as needed.Next, p(x) is reconstructed once more according to the solution of the linear system   If neither state has been reset by prior monotonicity constraints, then the solutions of Equations ( 15) and ( 16) are identical.Finally, if p(x) has an extremum in cell i, then either p R,i−1/2 or p L,i+1/2 -whichever state is nearest to the extremum-is reset to ensure that p(x) is monotone in cell i with zero slope at that face.
For smooth data away from extrema, this process yields a third-order-accurate piecewise-parabolic reconstruction at the lower and upper faces of each cell.Once the reconstruction step is completed in all cells and for all variables, the resulting left and right states at each face, which are derived from distinct interpolation parabolae, are then passed to a Riemann solver to obtain the numerical flux at that face as described in the next section.

Riemann Solvers
Fornax currently implements three choices for computing fluxes as a function of the reconstructed left and right states: the local Lax-Friedrichs, HLLE, and HLLC solvers.The HLLC solver, incorporating the most complete information on the modal structure of the equations, is the least diffusive option and is the default choice.Unfortunately, so-called three-wave solvers are susceptible to the carbuncle or odd-even instability.Many schemes have been proposed to inhibit the development of this numerical instability.In Fornax, we tag interfaces determined to be within shocks (by the pressure jump across the interface), and switch to the HLLE solver in orthogonal directions for cells adjacent to the tagged interface.In all our tests, this very simple approach has been successful at preventing the instability, incurs essentially no cost, and is typically used so sparingly that the additional diffusion introduced into the solution is likely negligible.
To O(∆x 2 ), we compute face-averaged fluxes between cells using approximate Riemann solvers applied to reconstructed states at the area-centroid of each face.For the hydrodynamic fluxes, we use the HLLC solver of Batten et al. (1997) and Toro, Spruce, & Speares (1994).For the fluxes of the radiation moments (specifically the F i and c 2 P i j terms), we use the HLL solver as in Harten, Lax, & van Leer (1983).As has been pointed out by multiple authors, the HLL solver applied to our radiation subsystem yields fluxes that fail to preserve the asymptotic diffusion limit.To recover this limit, we have found it sufficient to introduce a correction to the energy fluxes of the form In Equation ( 18), τ cell is an approximation to the optical depth of a cell in the direction normal to the face, computed using a simple arithmetic average of the total opacity (absorption plus scattering) of the cells on either side of the face.In Equation ( 17), S L and S R are wavespeed estimates for the fastest left-and rightgoing waves, F L and F R are the normal components of the reconstructed radiation fluxes on the left and right side of the face, and E L and E R are the reconstructed radiation energy densities on the left and right side of the face.Note that when τ cell ≤ 1, the corrected numerical fluxes are identical to the standard HLL fluxes.
Note further that from here on, we drop indices referring to radiation species (photon or neutrino) and frequency group for clarity, though each of these radiation terms are computed per species and per group.
Since we evolve the comoving-frame radiation moments, the intercell fluxes include advective terms that depend on the fluid velocity component normal to the faces.For consistency and accuracy, we make use of the contact (i.e., particle) velocity (v * ) computed during the construction of the hydrodynamic fluxes in the HLLC Riemann solver.At a given cell face, these advective fluxes are then set to where the subscripts L and R indicate quantities reconstructed to the left or right side of the cell face, respectively.In other words, we use the reconstructed quantities in the upwind direction, defined with respect to v * , the particle velocity at a given cell face computed by the Riemann solver for the hydrodynamics.
Several terms in the radiation moment equations depend upon the gradient of the velocity field.Ultimately, volume averages of each of these velocity-gradient dependent terms are needed.These are approximated, as in the calculation of the geometric source terms, by combining separately volume-averaged components as Like before with the advective fluxes, we again make use of the particle velocities computed during the Riemann solve for the hydrodynamic fluxes.In addition to the normal components, we also need the transverse components; these are available in the construction of the hydrodynamic fluxes as well.Thus, given the velocity components on every face of a cell, we assume a linear profile (in our generic coordinates x i ) for the intracell velocities and compute the volume-averaged partial derivatives directly via finite differences across the cell.Finally, to construct the volume-averaged velocities v k , we compute the volume-average of the linear profile in each direction k.
For the terms involving frequency-space derivatives, we use an approach similar to Vaytet et al. (2011), considering our multi-group treatment as a generalized finitevolume discretization of frequency space.The evolution equations for these terms can be written as Vaytet et al. (2011), we adopt a simple upwind formulation of these terms, defining the intergroup flux at a given group boundary ν g−1/2 as where for a given frequency group g, P j νi g = P j gi /∆ν g is the group-averaged spectral density, ∆ν g is the group width, and the subscripts L and R indicate the reconstruction of the spectral densities to the left or right side of the group boundary, respectively.

Interaction Terms
The interaction terms that couple the radiation and matter can introduce significant stiffness into the system, requiring an implicit treatment to provide stability to their numerical evolution on hydrodynamic time scales.In Fornax, we use operator splitting to advance the system after the explicit transport update described above is applied.It is essential to treat these terms after the explicit update to capture stiff equilibria properly, as obtain in optically-thick environments.Equations ( 23b), (23c), and (23d) treat the terms describing the emission and absorption of radiation and their coupling to the material energy and composition.The essential task is to solve the system where u denotes the material internal energy density and the − superscript denotes the state just after the explicit update described above.Although Equations ( 24) are spatially decoupled, i.e., their solution depends entirely on local data, if solved directly, they would represent a large system of stiff nonlinear equations, especially in the multi-species, multi-group context.Fortunately, the solution of this system can be made simpler by separating the updates into "inner" and "outer" parts.In the inner update, the frequency groups are decoupled from one another and can be updated using a direct implicit scheme.Then, in the outer update, all that remains is to find the roots of just a few equations, independent of the number of groups or species.This can be accomplished using an iterative scheme.In the photon radiation context, there is just a single equation, and in the core-collapse supernova context, there are two.Without loss of generality, we continue our description for the core-collapse case, where X represents the electron fraction, Y e , and the opacities and emissivities depend on both T and Y e .
In the inner update, each individual frequency group is updated using a fully-implicit Backward-Euler scheme of the form where the opacities and emissivities are computed using the values of T k and Y k e at outer iteration k.Equation ( 25) has the important property that for c ∆t κ k sg 1, i.e., for large optical depths, the updated energy, E k sg , approaches the thermal equilibrium solution, j k sg /(c κ k sg ).In the outer update, the hydrodynamic quantities u and ρY e are updated implicitly according to the sum over groups (and of species) of the emission and absorption terms.The integrated changes in energy and composition are given by Finally, we compute the residuals where, like the opacities and emissivities, the internal energy density u k depends on both T k and Y k e .The solution to our implicit system is given by the values of T and Y e for which r k E = r k X = 0. 7 To find these roots, we use a Newton-Raphson iteration, using finite differences to form the Jacobian and employing a line search algorithm to improve robustness.At each iteration, a linear system is solved for δT k+1 ≡ T k+1 − T k and δY e ≡ Y k+1 e − Y k e , the changes in temperature and electron fraction, respectively, between iterations k and k + 1.It is in solving this linear system that our reduction to two equations has obvious virtues.Convergence is checked by requiring that the relative change of the temperature and composition between iterations is below a user-specfied tolerance, typically 10 −6 .In practice, this procedure yields converged solutions within a few iterations under a wide range of conditions.
Finally, Equations ( 23a), (23b), and (23e) treat the momentum coupling represented by the absorption and scattering of radiative flux.For each frequency group, we first update the jth component of the flux via the implicit Backward Euler scheme which is crucial for maintaining the correct asymptotic 7 More accurately, the numerical solution to our implicit system of two equations consists of the vector [T, Ye] T that minimizes the norm of the relative residual vector behavior in the diffusion regime.To see this, consider the relevant case of a near-equilibrium, static atmosphere with large cell optical depths, as is approximately the case inside the proto-neutron star in the core-collapse problem.Under these conditions, F /(cE) 1, so that the pressure tensor under the M 1 closure approaches the Eddington closure, P i j = (E/3)δ i j .The flux after the explicit transport update described above is then given by where the last approximate equality holds in a nearequilibrium, optically-thick atmosphere in which F ∼ cE/τ c 2 E ,i .In this case, our implicit update yields the updated flux which is a temporally first-order accurate expression of Fick's law of diffusion.In other words, our operator-split implicit update of the flux reduces to Fick's law in the asymptotic limit of high optical depth.The corresponding explicit material momentum and energy updates then follow from conservation, namely This completes the description of our treatment of the interaction terms.

ALGORITHMIC STEPS
The following describes an Euler push sequence (i.e., first-order in time) for the multi-species equations in the core-collapse context.The generalization to a secondorder Runge-Kutta scheme is straightforward.

6.1.
Step 1 Advance the explicit transport subsystem (including advective flux terms and geometrical and gravitational source terms) by ∆t: (ρe) ,t + ρv i e + P ρ (ρY e ) ,t + (ρY e v i ) ;i = 0 , (33d) Step 2 Advance the frequency advection subsystem by ∆t: Step 3 Advance the radiation-matter energy interaction subsystem by ∆t: Step 4 Advance the radiation-matter momentum interaction subsystem by ∆t:

IMPLEMENTATION OF THE DENDRITIC GRID IN 3D
7.1.Motivation The purpose of the spherical dendritic grid is to avoid the narrowing of zones in polar and azimuthal angular extent approaching the origin and the narrowing of zones in the azimuthal extent approaching the polar axes that is characteristic of the spherical polar coordinate mapping of a logically-Cartesian grid.For example, a unigrid in ordinary spherical polar coordinates would have zones of size ∆r × r ∆θ × r sin θ ∆φ.Even at moderate angular resolution, this narrowing of zones can place a severe constraint on the CFL time step.One approach to circumvent this constraint is to coarsen the angular resolution as needed in order to maintain a roughly constant zone aspect ratio.Thus, we coarsen the polar angle in order to keep ∆θ ∼ ∆r/r as r → 0, and similarly, we coarsen the azimuthal angle in order to keep ∆φ ∼ ∆r/r sin θ as θ → 0 or θ → π.As a result, all zones on the grid have roughly equal aspect ratios, and near the origin where ∆r is approximately constant, all zones have roughly the same size.This implies that the CFL time step is essentially constrained only by the minimum radial resolution, ∆r min , i.e., the timestep is not constrained by the resolution in either angular direction.A sample arrangement of the dendritic grid in 3D is given in Figure 1.

Static Coarsening
In Fornax, angular resolution is always coarsened by a factor of 2. Thus, wherever the zone aspect ratio would otherwise deviate too far from 1, either ∆θ or ∆φ is Figure 1.An example of the sort of dendritic grid one might employ for 3D simulations.Note that as both the interior and the vertical axis are approached, the angular sizes increase independently.This allows us to maintain respectable resolution, while not being burdened by a severe Courant limit.coarsened as needed,8 and the aspect ratio of the neighboring zone changes discontinuously by a factor of ∼2.It is optimal to coarsen in angle whenever a zone's aspect ratio would fall outside the range [1/ √ 2, √ 2], ensuring that any zone is at most ∼41% longer in any direction than in the others.It follows that the CFL timestep constraint may be given by ∆t ≤ where the max is taken over all zones i.Note that the notion of coarsening does not here imply that the overall zone size is increasing.On the contrary, in the portion of the grid where ∆r is constant, the zones are all roughly of equal size.Therein, the resolution can be controlled simply by the value of ∆r min .
To continue, let θ and φ denote the refinement levels in the θ-and φ-directions, respectively, where level d = 0 denotes the coarsest-resolution level in the d-direction.We require a minimum of 2 zones in the θ-direction at the origin and a minimum of 4 zones in the φ-direction at the polar axes, hence, the finest levels in each of these directions are given by θ = log 2 N θ − 1 and φ = log 2 N φ − 2, where N θ and N φ are the number of zones on the finest level in the θ-and φ-directions, respectively.For example, with a resolution of N θ × N φ = 256 × 512 zones, we would have maximum levels θ = φ = 7.
The indexing scheme can be rather subtle.A given zone with global coordinates (i, j, k) at levels θ (i) and φ (i, j) has a neighbor (or neighbors) in the r-direction at coordinates (i , j , k ) at levels θ (i ) and φ (i , j ).The global coordinates are related by j = n θ j and k = n φ k , where n θ (i, i ) ≡ 2 θ (i )− θ (i) and n φ (i, i , j, j ) ≡ 2 φ (i ,j )− φ (i,j) , and • denotes the integer floor operation.

Three-Dimensional Reconstruction
The general approach of Fornax towards reconstruction is described in §5.However, with our dendritic grid there are numerous special considerations.When performing reconstruction on the spherical dendritic grid in the r-direction in 2D − and in both the r-and θdirections in 3D − we must take special care at refinement boundaries to use a pencil of zones of constant coordinate volume.For a given zone at refinement level , if its neighbor is at refinement level d > d in a given angular direction d, then we perform a fine-to-coarse restriction of the neighbor data to level d using an appropriate volume-weighted average.On the contrary, if its neighbor is at level d < d , then we perform a coarse-to-fine prolongation of the data to level d using an interpolation of the neighbor data based on MC-limited linear slopes in the transverse direction(s).
The restriction from level d down to level d in the r-direction is then given by The analogous prolongation from level d up to level d in the r-direction can be messy, but straightforward.Prolongation and restriction are done in order to obtain a pencil of constant coordinate volume, which is required to perform a volume-coordinate-based reconstruction.
7.4.Domain Decomposition Fornax is parallelized with non-blocking point-topoint communication and global all-to-all reduction using an implementation of the MPI-3.0standard.Nonblocking communication allows us to hide most − if not all − of the communication latency behind a significant amount of local computation.Due to the structure of the dendritic grid, domain decomposition becomes a nontrivial issue.Unlike the case for a unigrid or for patchbased AMR/SMR, there is no known optimal domain decomposition vis-à-vis load balancing.Therefore, we rely on a type of greedy algorithm to map the grid onto processors.In our decomposition algorithm, we attempt to drive the maximum processor load as close to the average load as possible, within reason.At the same time, we attempt to minimize the total communication overhead required to exchange ghost zone data among neighboring grids.Since our decomposition algorithm scales as a power law in the number of processors, N proc , it can take minutes to hours to compute a single decomposition for a 3D grid with a given resolution and spatial extent.
However, we need only to compute this once; the resulting decomposition can be saved to a file and read in at startup.
We typically require the maximum processor load be no more than 50% larger than the average load.This is not possible for every conceivable N proc , but for a given target, there is typically some nearby N proc for which our algorithm yields sufficient load-balancing efficiency.We need only search within some range of N proc around the target − a process that can be performed in parallel − and select the value of N proc giving the most evenly balanced decomposition.Although our algorithm is likely suboptimal, we still achieve near-perfect strong scaling efficiency for N proc few × 10 5 (Appendix §E).This is due to the fact that the majority of zones belong to a single logically-Cartesian block that can be optimally decomposed in a trivial manner.All that remains is to decompose the coarsened dendritic portion of the grid in a way that optimizes the overall load and communication overhead.

Boundary Conditions
In spherical coordinates, certain unit vectors change sign discontinuously across coordinate singularities, i.e., the r and φ unit vectors change sign across the origin in the radial direction and the θ and φ unit vectors change sign across the axis in the polar direction.Following Baumgarte et al. (2013), we enforce parity conditions on the signs of the corresponding vector components in these ghost zones after copying the data from active zones.In the azimuthal direction, the grid at φ = 2π is mapped periodically to the grid at φ = 0, so that the data in active zone (i, j, N φ −1) are mapped to ghost zone (i, j, −1) and, similarly, the data in active zone (i, j, 0) are mapped to ghost zone (i, j, N φ ).In the polar direction at the axes, the data are periodically shifted in the azimuthal direction by N φ /2 zones as they are copied from the active zones into the ghost zones.Thus, the active zone (i, 0, k) is mapped to the ghost zone (i, −1, k ), where k = (k + N φ /2) mod N φ .Finally, in the radial direction at the origin, the active-zone data are reflected in the polar direction and then periodically shifted as they are copied into the ghost zones.Thus, the active zone (0, j, k) is mapped to the ghost zone (−1, j , k ), where j = N θ − 1 − j and k is as before.

Mignone Reconstruction Tests
We have taken special care in treating the reconstruction of quantities on the mesh and have formulated our reconstruction methods to be agnostic to choices of coordinates and mesh equidistance.As in Mignone (2014), we do this by forming our reconstruction expressions in terms of volume-averaged quantities and moments of the coordinates in each cell.The moments are calculated at start up for the given mesh and coordinates.In the radial direction, we have a uniform mesh in coordinate x 1 , where the normal spherical radius r = r t sinh(x 1 /r t ), with r t a constant parameter.We reconstruct quantities in x 1 , i.e., we locally compute f (x 1 ) to specify f on the faces of each cell.The mesh in x 1 is entirely uniform.We have no need of, and do not construct, f (r), where then the samples of f would be non-uniformly distributed in We plot the density, ρ, for the on-axis (left) and off-axis (right) blast waves along a slice at y = 0 using a linear gray color scale from ρ = 0 (white) to ρ = 3 (black).The analytic shock radius location, r sh , is overplotted (red dashed line).Small-scale axis artifacts are apparent in the off-axis version, but the solution remains fairly spherical even though the blast is nowhere aligned to the grid.
r.The same is true in the angular direction.Thus, we have no "non-equidistant" reconstruction and, therefore, our reconstructions yield the same results as Mignone (2014).For any choice of coordinates in any geometry, the method works the same and forms a consistent parabolic reconstruction from volume-averaged inputs.
To demonstrate the scaling of Fornax under this reconstruction regime, we have performed at various resolutions the L 1 error radial/spherical wind test discussed in Mignone (2014).This hydrodynamic problem has an analytic solution (Mignone, his equation 89) and we have conducted the test for the Gaussian density profile provided in his eq.( 73), with his parameters (a = 10; b = 0) and (a = 16; b = 0.5). Figure 2 depicts the associated convergence scaling for the two parameter sets, compared with quadratic ( 1 N 2 ) behavior, and should be compared with Mignone's Figure 10.Despite the fact this study was performed for a non-uniform radial grid, the expected scaling with resolution still emerges.

Centered and Off-Centered Sedov Blast Wave
A standard test is the classical self-similar blast wave of Sedov (1959) describing, e.g., the initial energy- . Same as Figure 3 but for the internal energy density, u, which is plotted using a linear gray color scale from u = 0 (white) to u = 0.25 (black).Small-scale axis artifacts are less apparent, since pressure gradients tend to get smoothed out behind the shock front.
conserving phase of a supernova remnant.The problem consists of a stationary background medium with uniform density ρ 1 and near-zero pressure P 1 into which a prescribed blast energy E 1 is deposited.Since P 1 ≈ 0, the solution depends on only the two parameters ρ 1 and E 1 , it follows that the radius, r, and the time since explosion, t, are related by a dimensionless similarity variable, ξ.Thus, it can be shown that the position of the shock radius is given by where ξ 0 ≈ 1.033 for an adiabatic equation of state with γ = 1.4.Following Kamm & Timmes (2007), we set the dimensionless inputs to γ = 1.4,ρ 1 = 1, P 1 = 4 × 10 −13 , and E 1 = 0.851072 such that the outer blast radius is at r sh = 1 at time t = 1.We use a three-dimensional spherical dendritic grid with radial extent out to r max = 1.2 and a resolution of 64 × 64 × 128 zones.The radial grid spacing is constant in the interior with ∆r ≈ 0.01 out to r ≈ 0.3, then smoothly transitions to logarithmic spacing out to r max .For our first test, we deposit all of E 1 in the zones surrounding the origin.As expected, the subsequent blast is aligned with the grid and maintains perfect spherical symmetry.Next, we deposit the same energy in the zones closest to the x-axis at r = 0.12 so that the blast is offaxis.This time the blast is not aligned with the grid and must cross the polar axis as it expands.Figures 3 and 4 compare the density and internal energy of the on-and off-axis blast waves, respectively.There are small-scale axis artifacts in the density in the off-axis version of the blast wave, but the blast wave remains fairly spherical despite being nowhere aligned with the spherical grid.
The scale of the artifacts is smaller in the internal energy, since pressure variations tend to be smoothed out behind the shock where the pressure is large.
Overall, this test demonstrates that the internal geometric boundary conditions at the origin and polar axes are correctly applied.The artifacts at the axes are likely due to small errors in reconstruction in the azimuthal direction with our dendritic grid.The direction of the polar and azimuthal coordinate vectors is divergent at the polar axis, hence angular variations in the data there can only be resolved by a correspondingly convergent grid.In the limit of a locally "planar" flow across the poles, the rapid azimuthal variation comes entirely from the coordinates themselves, not from the data, although these variations must average to zero.Our dendritic grid deresolves this rapid angular variation, introducing firstorder errors due to monotonicity contraints.However, these artifacts are confined to a narrow polar axis region and do not adversely contaminate the solution elsewhere.

Sod Test
To demonstrate the shock-capturing capabilities of Fornax, we consider the classical shock-tube test of Sod (1978).The initial data consists of two states: The initial velocity is set to zero.For this test we adopt an adiabatic equation of state with index γ = 1.4.This test is not particularly challenging for modern HRSC codes, but it is nevertheless interesting because it exhibits all fundamental hydrodynamical waves.The exact solution consists of a left propagating rarefaction wave, a right propagating contact discontinuity, and a right propagating shock.
Fig. 5 shows the analytic solution and the results from Fornax using a coarse mesh of 64 points.The code correctly captured all of the waves.The shock wave is sharply captured within ∼2 grid points, while the contact wave is smeared over ∼4 grid points.

Double Mach Reflection
To test the sensitivity of the Fornax code to ithe numerical diffusion of contact waves, we perform the classic double Mach reflection test of Woodward & Collela (1984).This test consists of a Mach-10 oblique shock through air (γ = 1.4), inclined with respect to a reflecting boundary in which the incident and reflected shocks then interact to produce a triple point.In the space between the reflected shock and the contact discontinuity, an upward-directed jet should form along the slip surface, but if the numerical dissipation of the contact wave is too high, the formation of this jet will be suppressed.We use a two-dimensional grid of 520×160 zones over the domain (x, y) ∈ [0, 3.25] × [0, 1] and position an oblique shock at x 0 = 1/6 inclined at an angle α = π/3 with respect to the x-axis.The pre-shock state is given by (ρ, P, v x , v y ) R = (1.4, 1, 0, 0) and the post-shock state by (ρ, P, v x , v y ) L = (8, 116.5, 8.25 sin α, −8.25 cos α).The left x-boundary is held fixed at the post-shock state and outflow conditions are imposed at the right x-boundary.At the lower yboundary, the post-shock state is fixed for x ≤ x 0 and reflecting boundary conditions are imposed for x > x 0 .Meanwhile, at the upper y-boundary, the time-dependent .Amplitude of the interface perturbation for the single mode 2D Rayleigh-Taylor test.We show results from different resolutions and the expected growth rate from analytic theory.We find perfect agreement with linear theory up to t 1.75, when secondary instabilities start to appear in the flow and the dynamics becomes fully nonlinear.
shock position, x s = x 0 + y/ tan α + 10t/ sin α, is used to set either the post-shock state (x ≤ x s ) or the pre-shock state (x > x s ).
Figure 6 shows the evolved system at t final = 0.2 using 30 linearly-spaced contours of the density.The features at the various shock fronts remain sharp, and in the region below the triple point a small upward-directed jet is indeed produced along the slip surface.This indicates that numerical dissipation of the contact wave is wellcontrolled in Fornax.
The gravitational acceleration g = 1/2 is parallel to the z axis, and ρ h = 2, ρ l = 1, so that the Atwood number A = (ρ h − ρ l )/(ρ h + ρ l ) is 1/3.The boundary conditions are reflective for z = ±1 and periodic for x = ±1/2.The initial pressure is chosen to ensure hydrostatic equilibrium: with p(−1) = 10/7 + 1/4.For this test we adopt an adiabatic equation of state with index γ = 1.4.At time t = 0 the interface between the two fluid components is perturbed to be with h 0 = 0.01 and κ = 4π.The sharp interface is then smoothed into an hyperbolic tangent profile with characteristic length 0.005.
According to analytic theory h should evolve according to the following equation: We perform simulations with resolutions ranging from 64 × 128 to 1024 × 2048.We track the mixing of the two fluid components by means of a passive scalar tracer.This is initialized to be ρ − ρ l , so that the value 1 corresponds to the heavy fluid, while the value 0 corresponds to the light fluid.We track the growth of the initial perturbation by locating the 0.99 isocontour of the tracer.
Figure 7 shows the perturbation amplitude, normalized by its initial value, and the prediction from analytic theory (Eq.44).At low resolution, the perturbations grow exponentially from t = 0, but the growth rate is somewhat smaller than that predicted by analytic theory.The slower growth rate at low resolution is not unexpected and is due to the numerical viscosity, which modifies the Rayleigh-Taylor dispersion relation (Dimonte et al. 2004;Murphy & Burrows 2008a).We speculate that the reason why the initial perturbations grow exponentially at low resolution is that they are not well resolved.At higher resolution the correct cosh time dependence is recovered and the agreement with linear theory is excellent up to time t 1.75, when the Rayleigh-Taylor plumes start to break and rolls develop.
During the non-linear phase of the evolution, secondary Rayleigh-Taylor and Kelvin-Helmholtz instabilities appear (Fig. 8).These are seeded by the numerical noise and their detailed morphology, in the absence of explicit dissipation, is known to be dependent on the details of the numerical scheme (Liska & Wendroff 2003a).Despite the presence of these features, Fornax is able to preserve the sharp discontinuity in the fluid tracer.Artificial mixing between the two fluid components is only present in the Rayleigh-Taylor rolls, where the flow develops features on scales comparable to that of the grid.

Three-Dimensional Rayleigh-Taylor Test
Neutrino-driven convection plays a central role in the explosion mechanism of CCSNe (Burrows 2013;Radice et al. 2017;Burrows et al. 2018;Vartanyan et al. 2018).In this section, we benchmark Fornax for the modeling of convective flows by studying the nonlinear development of the Rayleigh-Taylor instability in 3D.We consider the setup introduced by Dimonte et al. (2004), for which the dynamics is dominated by mode couplings.
The computational domain is a box with −L/2 < x, y < L/2, −L < z < L, where L = 10 in arbitrary units.We consider a constant vertical gravitational acceleration g := −g z = 2, and we prepare initial conditions with density stratification where γ = 5/3, and p 0 = 2π(ρ h + ρ l )gL.We set ρ h = 3 and ρ l = 1, so that the Atwood number A = (ρ h − ρ l )/(ρ h + ρ l ) is 1/2.The initial pressure is set to which ensures that the initial conditions are in hydrostatic equilibrium.We track the concentration of the "heavy" fluid by evolving a passive scalar field f h initialized to be one for positive z and zero otherwise.We assume periodicity in the x and y direction and hydrostatic boundary conditions at z = ±L.We use uniform grids and label our simulations by the number of grid points in the x direction so that, e.g., the N128 run has a resolution of 128 × 128 × 256 points.
At time t = 0 the interface between the two fluid components is perturbed to be where X = 2πx L and Y = 2πy L .The coefficients a k , b k , c k , and d k are sampled from a uniform distribution taking values between −1 and 1, boundary excluded.H is a normalization coefficient that we adjust after having sampled a k , b k , c k , and d k so that h rms = 3 × 10 −4 L. We remark that we use the same initial conditions for all of our simulations, while Dimonte et al. (2004) used a different number of modes in their initial perturbation depending on the resolution.We also use the same realization of the coefficients in Equation ( 48) for all the simulations presented in this section.In this way, the convergence of our numerical results can be better assessed.
When constructing the perturbed initial data, we compute the cell-averaged density, internal energy, and heavy-fluid concentration, i.e., the quantities evolved by the finite-volume scheme in Fornax, by numerically integrating the profiles discussed above on an auxiliary mesh with 50 times higher resolution than the base grid in each direction.
The initial perturbations generate small buoyant bubbles that interact and grow into larger bubbles as the system evolves, as shown in Fig. 9.We find that there is significant mixing between the two fluids.This is evidenced by the intermediate values taken by f h (see Eq. 49 below), especially in the buoyant bubbles which entrain a significant amount of the heavy fluid.At late times, the topology of the interface between the two fluids becomes non-trivial and we observe the detachment of some of the bubbles.
The entrainment of the heavy fluid in the buoyant bubbles and of the light fluid in the sinking plumes is not strongly dependent on the resolution, but they appear to grow slightly with resolution, as shown in Fig. 10.This is presumably due to the higher effective Reynolds numbers achieved in the best resolved simulations.Note that we do not include an explicit viscosity in these simulations, consequently the dissipation scale is a multiple of the grid scale.At high resolutions secondary instabilities with smaller wave number and faster growth rates are allowed to develop.Their presence might explain the increase in the entrainment with resolution.Despite these quantitative differences we find a remarkable qualitative agreement between the different resolutions with the gross flow feature captured even at the lowest resolution of 64 × 64 × 128.
Following Dimonte et al. (2004) we compute the horizontally-averaged concentration of the heavy fluid and we define as the bubble penetration depth h b the z value at which f h reaches 99%.The profile of f h is observed to be self-similar during the non-linear development of the Rayleigh-Taylor instability.We show the profile of f h obtained with Fornax at two representative times in Fig. 11.Also shown are the experimental results obtained by Dimonte et al. (2004), as reported in that paper.This figure should be contrasted with Fig. 7 of Dimonte et al. (2004).We find good agreement between the Fornax results and the experimental data.
The agreement improves as the resolution increases, with the exception of a spike appearing at late times in the N512 data around z/h b −1.This is caused by the interaction of some of the down-falling plumes with the boundary at the bottom of the simulation box.
Following Andrews & Spalding (1990) and Dimonte et al. (2004), we quantify the mixing degree between the heavy and the light fluid with the metric Our results are shown in Fig. 12.The growth rate of the instability in the nonlinear phase is proportional to Agt 2 /L, as predicted by theoretical models and confirmed by experimental results (Dimonte et al. 2004).
The progression of the mixing between the two fluids as computed by Fornax agrees well with the experimental results reported in Dimonte et al. (2004).We find that Fornax compares favorably to the other Eulerian codes presented in Dimonte et al. (2004), such as FLASH.
Fornax yields results of comparable in quality to those of Arbitrary Lagrangian-Eulerian (ALE) codes with in-terface reconstruction methods (cf.Fig. 8 of Dimonte et al. 2004).That said we want to remark that, as documented in Dimonte et al. ( 2004) and as we have confirmed in preliminary calculations, the exact rate of mixing between the two fluids depends to some extent on the spectrum and the character of the initial perturbations.
We have not fine-tuned the initial conditions of our simulations to match the experimental data.9Nevertheless, we cannot exclude the possibility that the better agreement with the experimental data of Fornax compared to the results from the other Eulerian codes presented in Dimonte et al. (2004) 2004), the solid lines are Fornax calculations at different resolutions.We find good agreement between the numerical and the experimental data.

choice of initial conditions.
We analyze the character of the fluctuations in f h on the original interface at z = 0 at an advanced time Agt 2 /L 14 (Fig. 13).We find good qualitative agreement between the two highest resolution simulations.However, there are quantitative differences.Particularly noticeable is the appearance of smaller scale fluid features as the resolution is increased.This is expected since we do not include an explicit viscosity in our simulations.Instead, the dissipation scale is related to the grid resolution, so that higher resolution simulations have a larger effective Reynolds number.
We quantify this observation by computing the power spectrum of f h on the original interface at z = 0 at Agt 2 /L 14.We define the 2D spectrum of f h as and the 1D spectrum as Because of the periodic boundary conditions, the 2D spectrum is nontrivial only for k = (k x , k y ) with k x , k y integers.When computing the 1D spectrum, we convert the integral into a weighted summation following Eswaran & Pope (1988), i.e., we set where N k is the number of modes with κ − 1/2 < k < κ + 1/2.Our results are shown in Fig. 14.We find that the spectrum appears converged for k ∼ 2, as could have been anticipated by looking at the dipo-  2004).The other lines are results from Fornax calculations.We find that Fornax agrees well with the experimental data predicting the right growth rate of the instability in the non-linear regime.
lar component of the oscillations in Fig. 13.However, we find that fluctuations with scales k ∼ 10 are progressively suppressed as the resolution increases.The lack of convergence at these intermediate scales can be attributed to nonlinear mode coupling becoming stronger as the effective Reynolds number of the simulations increases.At smaller scales, the flow becomes self-similar and develops an inertial range.We find that the power spectrum follows the scaling expected from Kolmogorov's theory of turbulence for a passively advected scalar field, i.e., | fh | ∝ k −5/3 (Pope 2000).We also find that, as the resolution increases, a progressively larger part of the inertial range is uncovered.At scales of less than ∼5 grid points the power spectrum drops due to the direct effect of numerical viscosity.

Kelvin-Helmholtz Instability
The Kelvin-Helmholtz (KH) instability is a twodimensional instability arising from a velocity shear within a fluid.The instability results in vorticity, which can cascade into turbulence.As such, it is of particular interest in the study of core-collapse supernova, where cascading turbulence can contribute to shock revival.
We consider a fluid over the rectangular domain within x∈ [0,4] and z∈ [0,2], with a setup similar to that found in Lecoanet et al. (2016) and with a density jump characterized by the initial conditions below.We conduct tests at low, medium, high resolutions, defined as 512×256, 1024×512, and 2048×1024, respectively.
with a = 0.05, σ = 0.2, z 1 = 0.5, and z 2 = 1.5.We define v as the velocity in the x-direction and w as the velocity in the z-direction.The initial amplitude, A, of the perturbation in the  In the context of the Kelvin-Helmholtz study, the growth rate of the maximum of w, the velocity in the z-direction as a function of time for three different resolutions: low (blue), medium (red), and high (green).In thick dashed black we plot the best fitting exponential to the growth rate in the linear regime, which breaks down at t ∼ 2.6.vertical (z-direction) velocity (w) is set to 0.01, and the initial density perturbation, δρ/ρ 0 , is set to 1.0.The initial pressure P 0 is 10, the initial lateral velocity constant, u 0 , is set to 1.0, and the initial density ρ is set to 1.0.We assume an adiabatic gas with adiabatic index γ = 5/3.The resulting Mach number (M ) is ∼0.25, consistent with quasi-incompressible flow.As is often done in KH instability tests, we ignore gravity.
In Fig. 15, we plot as a function of time the growth rate of the maximum of w, the velocity in the z-direction.The linear growth phase of the Kelvin-Helmholtz instability lasts until t∼2.6 (in our dimensionless units) for all models, which evolve virtually identically, independent of resolution, until the nonlinear regime.
In Fig. 16, we plot the 2D density evolution at three times (t=0, 3, 6 in dimensionless units) over our grid.Initial density perturbations evolve to form characteristic vortices at later times.

Liska-Wendroff Implosion Test
The 2-D implosion test of Liska & Wendroff (2003b) consists of Sod-like initial conditions, but rotated by 45 • in a 2-D reflecting box.Specifically, the domain extends from 0 to 0.3 along both the x-and y-directions, with (ρ, P ) set to (1, 1) for x + y > 0.15 and (0.125, 0.14) otherwise.The gas is intially at rest and obeys an ideal gas equation of state with γ = 1.4.The solution is evolved until t = 2.5 on a uniform 400 × 400 mesh.This setup results in a shock, contact, and rarefaction moving from the initial discontinuity that reflect off the boundaries and interact.As highlighted by Stone et al. (2008), the correct solution to this problem includes a low-density jet that travels along the symmetry axis y = x.This problem is exquisitely sensitive to the preservation of this reflection symmetry, with the jet failing to form or wandering off y = x if symmetry is violated even at the level of round off in the discretized update.The directionally-unsplit update in Fornax is able to maintain this symmetry, as shown in Fig. 17.Additionally, the distance traveled by the jet can be a useful measure of the numerical diffusion of contacts.This is illustrated in Fig. 17 in the comparison between the results using linear and parabolic reconstructions.In fact, at much higher resolutions, the jet propagates into and interacts with the top, right corner of the domain, as shown in Fig. 18.Our results are comparable, though perhaps characteristic of slightly higher numerical diffusion, to the results presented in Stone et al. (2008).

Pressureless Dust Collapse
Here we investigate the nearly pressureless and homologous collapse of a uniformly dense cloud of dust under its own self-gravity.This problem has a solution described in Colgate & White (1966) and was used by Mönchmeyer & Müller (1989) to argue for the reconstruction of volume-averaged grid variables over the corresponding volume centroids, rather than the zonecentered coordinates.At a time t, the outer radius of the cloud, r cl (t), is given by the solution to the equation 8πG 3 where ρ 0 and r 0 are the cloud's initial density and radius, respectively.Once r cl is obtained, the cloud density ρ cl (t) is given by and the radial velocity profile inside the cloud, v cl (r, t), is given by We use ρ 0 = 10 9 g cm −3 , r 0 = 6500 km, an adiabatic equation of state with γ = 5/3, and evolve to t final = 0.065 s.The pressure is set to a negligibly small, constant value, P 0 , such that P 0 (4πG/γ)ρ 2 0 r 2 0 , hence at t final , the outer edge of the cloud is collapsing at a Mach number of ∼ 80. Finally, we use a threedimensional spherical dendritic grid with radial extent out to r max = 7000 km and a resolution of 200 × 64 × 128 zones.The radial grid spacing is constant in the interior with ∆r ≈ 0.5 km out to approximately 15 km, then smoothly transitions to logarithmic spacing out to r max .
Figure 19 shows the density, ρ, and radial velocity, v, at time t final .We scale these by the cloud density and radius, ρ cl and v cl , respectively, as obtained from equations ( 56) and ( 57), which in turn depend on the numerical solution for r cl obtained from equation (55) at time t final .As evident from the figure, the cloud density remains constant throughout the cloud, and velocity remains linear.There is some smearing at the outer edge of the cloud due to finite pressure gradients, but otherwise, the results are comparable to those obtained by other authors (Monchmeyer & Müller 1989;Mignone 2014).This test demonstrates the code's ability to obtain accurate fluxes from reconstruction in the volume coordinate.
8.9.One-dimensional Core-Collapse Mechanical Energy Conservation Test We perform several spherical 1D core-collapse hydrodynamical simulations without transport or generalrelativistic corrections and with Newtonian gravity to study energy conservation in Fornax.For this test, we use the SFHo equation of state (Steiner et al. 2013).The total energy is defined as the sum of the kinetic, internal, and gravitational potential energies on the grid, extending out to 20,000 km.The latter is calculated as: where Φ is the gravitational potential.We take the jump in total energy around bounce as an apt measure, since gravitation is not implemented in automatically conservative form (the other terms are) and bounce is the most problematic phase of core-collapse simulations vis à vis total energy conservation.Moreover, we continue the calculations to ∼100 milliseconds after bounce for three resolutions: Lo (304 radial cells, green), Default (608 radial cells, blue), and Hi (1216 radial cells, red).We find that energy is conserved from just before bounce to just after bounce to approximately 10 50 ergs, 2×10 49 ergs, and ∼3 × 10 48 ergs, respectively, for the three resolu-   The change in total energy (in 10 49 ergs), defined as the sum of gravitational potential, internal, and kinetic energies, as a function of time after bounce (in seconds) for three differentresolutions: Lo (green, 304 radial cells), Default (blue, 608 radial cells), and Hi (red, 1216 radial cells).Energy is conserved to better than 2 × 10 49 ergs at bounce for the default resolution, and ∼3 × 10 48 ergs at the higher resolution.See text for a discussion.

tions.
Earlier generations of codes, including AGILE-BOLTZTRAN (Liebendörfer et al. 2004) and BETHEhydro (Murphy & Burrows 2008b), saw total energy shifts from just before bounce to over 100 milliseconds after bounce of on the order of 10 51 ergs.For comparison, CHIMERA (Bruenn et al. 2010;Bruenn et al. 2016) conserves energy to 0.5 Bethe (1B = 10 51 ergs) over on the order of one second post-bounce.Our default resolution from before bounce to ∼1 second after bounce conserves total energy to 0.05 B, an order of magnitude better.Importantly, we find that the total energy is conserved to ∼10 47 ergs from 10 ms post-bounce to 100 ms post-bounce.By comparison, using CoCoNUT for a Newtonian core-collapse simulation, and subtracting out neutrino losses, Müller et al. (2010) report energy conservation of ∼2×10 49 ergs from just before bounce to just after bounce, and subsequent non-conservation for the following tens of milliseconds at the ∼×10 48 level.

Doppler Shift Test
To test the advection of radiation energy in frequency space caused by strong spatial variations in the gas velocity, we perform the Doppler shift test of Vaytet et al. (2011).We use a one-dimensional domain with 50 equally-spaced zones between x = 0 cm and x = 10 cm with constant background density of ρ = 1 g cm −3 and zero opacity.The velocity varies smoothly between v = 0 cm s −1 at the left edge of the domain and v = A in the center, where A = 5 × 10 7 cm s −1 .At the left boundary, we introduce a radiation field having a Planck spectrum at a temperature T = 1000 K, which we resolve using 20 equally-spaced frequency groups from ν = 0 Hz to ν = 2×10 14 Hz, with two additional groups to account for radiation in the frequency range 2 × 10 14 Hz → ∞.We evolve the system for 10 light-crossing times in order to reach a steady state.0.0 0.5 1.0 1.5 2.0

Computed Analytic
Figure 21.Difference between the shifted and unshifted energy spectra for the Doppler shift test.The energy difference in the computed solution (blue exes) and the analytic solution (black dashed line) are plotted.The relative error in the solution is less than 6.5% except near the spectral peak at ∼ 5 × 10 13 Hz where it jumps up to 23% due to slope limiting of the frequency-advected spectrum, which results in a local first-order error.
In Figure 21, we plot the difference between the Doppler-shifted spectral energy distribution at the center of the domain, E ν (v = A), and the unshifted distribution at the left boundary, E ν (v = 0).We also plot the analytic solution, obtained from the difference between the relativistic Doppler shift of the incident spectrum into the frame comoving at velocity v = A and the unshifted spectrum.To do this, we first multiply the lab-frame frequencies by the relativistic Doppler factor, where β = A/c, to obtain the corresponding comovingframe frequencies.Then, we set E ν = (4π/c)B ν , with frequencies evaluated in the comoving frame, and finally divide by the relativistic Doppler factor to transform the resulting spectrum back into the laboratory frame.
The computed solution agrees well with the analytic solution at almost all frequencies.The relative error is slightly larger near ν ∼ 5 × 10 13 Hz, the peak of the Planck spectrum at T = 1000 K, where TVD slope limiting in our frequency advection step results in a local first-order error.With 20 groups, the relative error is less than ∼ 6.5% away from the spectral peak, where it jumps up to ∼ 23%.Running the same test with 40 groups reduces the maximum relative error to 0.85% away from the spectral peak, where it jumps to 5.6%.

Continuity of Lab-Frame Flux at Transparent Shocks
To further demonstrate that Fornax accounts properly for Doppler shifts and radiation advection, we show a comparison of the radial profile of the total radiation energy density and flux calculated in the steady state for a constant point luminosity source at the spherical grid center.A non-trivial radial velocity field is imposed that includes a strong standing shock wave and the opacity of matter is set to zero.We calculate the radiation quantities in the comoving frame and it is natural to ask whether the comoving-frame radiation field translates smoothly into the simple 1 r 2 field it should.Figure 22 depicts two panels of the radial profiles, the left side in the comoving frame and the right side in the laboratory frame.Fornax calculates in the comoving frame and, hence, the corresponding energy density and flux profiles have discontinuities at the shock.However, since the radiation can't in fact see the matter, performing a simple Lorentz transformation to the lab frame should eliminate the shock in the plot.The right-hand-side panel shows that it has.This simple test is one way of demonstrating that Fornax is properly handling the velocity terms in radiation equations (3a) and (3b).

Frequency-Dependent Opacity Test
To test the effect of a strong velocity gradient on a frequency-dependent opacity function, we perform a modified version of a test by Vaytet et al. (2011).We use a one-dimensional domain with x ∈ [0, 1] cm resolved over 100 zones and set a fixed velocity profile v = Dx, where D is either 0 s −1 or 10 7 s −1 .We use a density profile of ρ = 1/(Cx) with C = 0.2 cm 2 g −1 and set the gas temperature to T 0 = 3 K everywhere.We use a frequency-dependent opacity with κ ν = 100 cm 2 g −1 for ν < 2 × 10 13 Hz, transitioning smoothly to κ ν = 1 cm 2 g −1 over a region of width ∆ν = 4.5 × 10 9 Hz as shown in Figure 23.At the left boundary, we inject a Gaussian radiation spectrum that is normalized to have total energy a R T 4 1 with T 1 = 1000 K, is peaked at ν = 2 × 10 13 Hz, has FWHM equal to two-thirds the width of the transition region, and is resolved over 20 equally-spaced frequency groups (also shown in Figure 23).For each case, we evolve only the radiation system for one light crossing time, keeping the hydrodynamics frozen.
Figure 24 shows selected group temperatures, T g ≡ (E g /a R ) 1/4 , for the cases with D = 0 s −1 (top) and D = 10 7 s −1 (bottom).In the zero-velocity case, the group energies are unshifted, and only those groups with low enough optical depth are able to stream freely across the domain.By contrast, in the case with a strong velocity gradient, the comoving-frame frequencies are Doppler-shifted to the right where the opacities are lower.Thus, the energy in groups 9-12 in the transition region of the opacity function are able to stream more freely once optically-thin conditions are reached.

Multi-Group Radiation Pulse Advection
To study the behavior of Fornax in the strong diffusion regime, where radiation is transported both by diffusion through the gas and advection along with it, we perform a variation of the pulse advection test of Krunholz et al. (2007) (see also Zhang et al. 2013), modified for our explicit transport solver.We initialize the gas temperature and density profiles as where T 0 and ρ 0 are the background temperature and density, respectively, T 1 = 2T 0 is the peak of a Gaussian pulse of thermal energy of width w, and µ = (m e + m p )/2 is the mean particle mass for ionized hydrogen.
Following Zhang et al. (2013), we use a temperature- The opacity changes from κν = 100 cm 2 g −1 for ν < 2 × 10 13 Hz to κν = 1 cm 2 g −1 , transitioning smoothly over a frequency range of width ∆ν = 4.5 × 10 9 Hz.Overplotted is a scaled representation of the incident energy spectrum (dashed orange line), which peaks at ν = 2 × 10 13 Hz and has FWHM equal to 2/3∆ν.The numbered frequency groups are indicated by vertical dotted lines.Groups 8-13 comprise the opacity transition region.and frequency-dependent absorption opacity of the form where κ 0 is some normalization constant and we take ν 1 = 2.821k B T 1 /h to approximate the spectral peak of a Planck distribution at temperature T 1 .The problem dimensions are set by the choice of an arbitrary length scale, w, and by the values of the non-dimensional parameters β ≡ v/c, M ≡ v/a 0 , P ≡ a R T 4 0 /(ρ 0 a 2 0 ), and τ ≡ κ P (T 1 )w, where κ P = 3.457 κ 0 (T /T 1 ) −3.5 is the Planck-mean opacity such that τ represents the optical depth of the pulse.For our version of this problem, we choose w = 1 cm, β = 0.01, M = 0.1, P = 0.1, and κ 0 = 0.2892 τ cm −1 , where τ is chosen to control which diffusion regime characterizes the flow.
The product βτ is equal to the ratio of the radiation 10 0 10 1 Figure 24.Selected group temperatures, Tg ≡ (Eg/a R ) 1/4 for the frequency-dependent opacity test.The velocity profile is given by v = Dx, where either D = s −1 for the zero-gradient case (top) or D = 10 7 s −1 for the strong velocity gradient case (bottom).
For the zero-velocity case, only radiation in groups with very low opacities can stream freely across the domain, but for the case with a strong velocity gradient, energy in groups 9-12 inside the opacity transition region are shifted to higher frequencies where the opacities are smaller.This allows these groups to stream more freely once optically thin conditions are reached.(bottom) for the multigroup pulse advection test with βτ = 1 in the static diffusion regime.Profiles at the initial time (dotted blue line) and at time t = 2w/βc for the unadvected (solid orange line) and advected (dashed green line) runs are shown.In both runs, the radiation pulse diffuses through the gas, and as pressure support is lost, the density in the pulse increases.Although the radiative work and advection terms are very different between the unadvected and advected runs, the relative error between them is so small that they are visually indistinguishable (see Figure 26 for a plot of the errors).
diffusion and advection time scales.For βτ 1, radiation diffusion dominates and the system is in the so-called "static diffusion" regime, but for βτ 1, advection dominates and the system is in the "dynamic diffusion" regime.In either case, although the velocity-dependent radiation work and advection terms will be very different depending on the frame in which the equations are solved, the results should be frame-independent.Therefore, for both the static and dynamic diffusion regimes, we compare runs in which the radiation pulse is initially at rest to runs in which the radiation pulse is advected over twice its initial width and shifted back to lie on top of the unadvected results.In either case, with τ 1, the radiation is in thermal equilibrium with the gas, and the total pressure is initially constant.
We use a computational grid of length L = 20w resolved over 512 zones, with 8 radiation groups logarithmically-spaced over the frequency range [4 × 10 18 , 4 × 10 22 ] Hz.In Figure 25, we show the resulting temperature, velocity, and density profiles for both the advected and unadvected runs with τ = 100 such that βτ = 1, putting the system is in the static diffusion regime.Over the course of the run, the radiation has diffused through the gas, decreasing the temperature and thermal pressure support in the pulse, hence the density increases correspondingly.Since the advected and unadvected runs are visually indistinguishable, we plot the relative errors of the density and temperature in Figure 26, which are bounded by 2.3 × 10 −5 and 1.4 × 10 −5 , respectively, in absolute value.Similarly, in Figure 27, we plot the results for the case with τ = 10 4 such that βτ = 100, which puts the system in the dynamic diffusion regime.This time, there is hardly any diffusion as the pulse is advected, since the radiation is effectively trapped within the gas.In Figure 28, we plot the relative errors of the density and temperature, which are bounded by 1.4 × 10 −3 and 1.5 × 10 −4 , respectively, in absolute value.
Importantly, these results demonstrate the ability of Fornax to reproduce the correct behavior to a high degree of accuracy in both the static and dynamic diffusion regimes and for both advected and unadvected flows.This represents a very sensitive test of the code's ability to model radiative trapping in the CCSN.In the core-collapse supernova context, trapping of ν e s by infalling matter and their subsequent compression produces a spectrum of degenerate ν e s whose energies can reach the ∼ 100−300 MeV range while preserving a relatively high lepton fraction, which is a critically important aspect that allows the core to reach nuclear densities at bounce.

Multi-group Gray-Opacity Non-Equilibrium
Radiation Shock To test the Fornax code's ability to couple the multigroup radiation subsystem to the hydrodynamics in a regime where the fluid and radiation are out of equilibrium, we present here the results of the classical gray Here, the radiation is trapped within the pulse and there is hardly any diffusion as it is advected across the grid.Again, the profiles for the unadvected (solid orange line) and advected (dashed green line) runs are visually indistinguishable (see Figure 28 for their relative errors).non-equilibrium radiation shock described by Zel 'Dovich & Raizer (1969).Following that work, we adopt the Eddington approximation and set the radiation pressure tensor to (P ) i j = (E /3)δ i j , where δ i j is the Kronecker symbol.The shock structure has a semi-analytic solution described by Lowrie & Edwards (2008).Using their parameters with an upstream Mach number of M = 3, we obtain a subcritical radiation shock whose temperature jumps discontinuously at the shock interface.We set the constant, gray absorption opacity to κ = 577 cm −1 , the mean particle mass to µ = m H , and use an adiabatic EOS with γ = 5/3.We use a one-dimensional Cartesian domain with x ∈ [−0.0132, 0.00255] resolved over N x = 512 zones.The problem is set in the rest frame of the shock, which we initialize at x = 0.In the upstream state (x < 0), we set the gas temperature T 0 = 2.18 × 10 6 K, density ρ 0 = 5.69 g cm −3 , and velocity v 0 = 5.19 × 10 7 cm s −1 , and using the Rankine-Hugoniot jump conditions to determine the downstream state (x < 0), we set T 1 = 7.98 × 10 6 K, density ρ 1 = 17.1 g cm −3 , and velocity v 1 = 1.73 × 10 7 cm s −1 .Similar to Vaytet et al. (2011), we use 8 radiation groups logarithmically spaced over the frequency range ν ∈ [10 15 Hz, 10 19 Hz], initialize the group radiation energy densities using a Planck spectrum at the local gas temperature with zero radiation flux, and evolve for 3 shock-crossing times to t final = 9.08 × 10 −10 s.Finally, since the structure of the steady-state shock solution is independent of the radiation propagation speed, ĉ, and since we must solve the semi-explicit radiation subsystem on a hydrodynamic time scale, we adopt a reduced speed of light ĉ = 10(v 0 + a 0 ), where a 0 = 1.73 × 10 7 cm s −1 is the upstream sound speed.
Figure 29 shows the gas temperature structure at time t final with the semi-analytic solution from Lowrie & Edwards (2008) overplotted and an inset showing the detail of the Zel'Dovich spike in the upstream temperature near the shock interface.There is very good agreement between the two solutions, including in the nonequilibrium spike region and in the radiatively-heated shock precursor in the downstream state.The rela-tive error between the solutions is bounded by 0.8% everywhere; the agreement in the remaining variables is similarly good.Since this test simultaneously exercises the multi-group, velocity-dependent, Doppler-shift, and matter-radiation coupling features of the Fornax code, it also simultaneously demonstrates the code's ability to compute the most problematic aspects of full radiationhydrodynamics accurately and consistently.

CONCLUSION
In this paper, we have described the methods and implementation of the multi-group, multi-dimensional, radiation/hydrodynamic code Fornax and numerically exercised it with a variety of standard and non-standard simulation tests.These included tests of on-and offcenter Sedov blast waves, the Sod shock tube, double Mach reflection, the 2D and 3D Rayleigh-Taylor and 2D Kelvin-Helmholtz instabilities, the Liska-Wendroff implosion, pressureless dust collapse, energy conservation with gravity and a complicated EOS, radiation advection and Doppler (v/c) shifts in the context of strong velocity gradients, the handling of velocity-and frequency-dependent opacity, and non-equilibirum radiative shocks with the Zel'Dovich spike.We demonstrated that Fornax performs well and accurately for all these tests, robustly handling non-trivial multi-dimensional hydrodynamic and radiation/hydrodynamic problems.To date, Fornax has been employed to study corecollapse supernovae (Wallace et al. 2016;Radice et al. 2017;Burrows et al. 2018;Vartanyan et al. 2018;Seadrow et al. 2018), for which extensions that include approximate general relativity and inelastic scattering processes are of relevance.The implementation of these effects is not address here, but can be found in Marek et al. (2009) (for the former) and Thompson, Burrows, & Pinto (2003) and Burrows & Thompson (2004) (for the latter).In constructing Fornax, we endeavored to incorporate best numerical and solution practices, with the result that the code is fast, scales well on most modern HPC platforms, and has useful geometrical and grid flexibilities.Given this, we anticipate its continued use to explore cutting-edge radiation/hydrodynamical challenges of astrophysical import and plan to extend its reach to include magnetic fields and neutrino oscillations in the years to come.To benchmark the parallel performance of Fornax under production-run conditions, we ran a full radiation/hydridynamic core-collapse 3D simulations with 20 energy groups, 12th-order multipole gravity with GR corrections to the monopole component, and a resolution of 608×256×512 (radial, poloidal, toroidal) for 10 cycles with an increasing number of pure MPI tasks out to 110k tasks.The results shown in Figure 30 demonstrate that the code runs extremely fast on NERSC/Cori II, with excellent strong scaling efficiency over 90%.In fact, because of the way we use a greedy algorithm to load-balance the decomposition of our dendritic mesh, the efficiency peaks at larger task count.
Figure 31 demonstrates strong scaling results for Fornax on the Cray/XE6 on Blue Waters for full-star radiation/hydro simulations using 20 energy groups per neutrino species and 1000×256×512 spatial gridding in spherical coordinates.We see excellent scaling results to ∼130,000 cores.The lowest efficiency achieved was for ∼130,000 cores and was ∼75%.A comparison was made to the corresponding results for our previous supernova code CASTRO, that requires an iterative transport solution and Krylov subspace methods.After ∼30,000 cores, Figure 31.Top: Wallclock time per timestep (in seconds) versus core count for 3D radiation/hydrodynamic runs.This is a strong scaling test on Blue Waters.Red-dashed is the ideal expectation, and blue is the realization using Fornax.Bottom: The corresponding parallel efficiency versus core count for Fornax (blue) and CASTRO (green).Note that beyond ∼10 4 cores, the efficiency of CASTRO plummets, while that of Fornax maintains high values.At ∼160,000 cores the efficiency of Fornax is still ∼75%.Our efficiencies have improved slightly since this test on Blue Waters was performed (see Figure 30).Perfect scaling would be a flat curve.
Fornax is five times more efficient than CASTRO, and after ∼100,000 cores Fornax is approximately ten times as efficient.Moreover, the wallclock and CPU-hour per timestep comparisons have revealed that Fornax is also ten times more favorable by these metrics than CAS-TRO.This comparision is not meant as a criticism of CASTRO, which has many state-of-the-art features and capabilities (Zhang et al. 2011(Zhang et al. ,2013)).It merely highlights the differences between an implicit and an explicit treatment of radiative transport in the context of current computer architectures and parallelism modalities.

Figure 2 .Figure 3 .
Figure 2. Plotted are the L 1 error measurements for the spherical wind problem found in Mignone (2014), as a function of number of zones (Nr).These resolution tests were performed with two sets of parameters for the initial Gaussian density profiles used in Mignone, with his parameter pairs (a = 10; b = 0) (left panel) and (a = 16; b = 0.5) (right panel) (see Mignone 2014, his eqs.73 and 89).The solutions to this hydrodynamic wind problem are analytic, facilitating the demonstration of the rate of convergence to the correct solution.See text in §8.1 for a discussion.

Figure 5 .Figure 6 .
Figure5.Density, pressure, and velocity (left, middle, and right panels respectively) for the Sod test.The solid line is the analytic solution, while the dots show the Fornax results at a resolution of 64 grid points.All of the key features of the solution are correctly captured, but the contact wave is smeared over ∼4 grid points.
Figure7.Amplitude of the interface perturbation for the single mode 2D Rayleigh-Taylor test.We show results from different resolutions and the expected growth rate from analytic theory.We find perfect agreement with linear theory up to t 1.75, when secondary instabilities start to appear in the flow and the dynamics becomes fully nonlinear.

8. 5 .
Rayleigh-Taylor Instability 8.5.1.Two-Dimensional Rayleigh-Taylor Test Here, we study the linear and non-linear development of the Rayleigh-Taylor instability in 2D.The initial data describes an unstably stratified fluid with

Figure 8 .
Figure 8. Passive tracer concentration for the single mode 2D Rayleigh-Taylor test at three representative times.The resolution is 1024 × 2048.During the linear phase of the instability, up to t 1.75, the contact discontinuity is well preserved and there is no spurious mixing of the two fluid phases.Secondary instabilities, seeded by the numerical noise, appear at later times, during the non-linear phase of the evolution.

Figure 9 .
Figure 9. Passive tracer concentration for the multi-mode 3D Rayleigh-Taylor test at two representative times.Red values indicate higher concentration for the "heavy" fluid.The red surface is the 99% isocontour of the concentration.The resolution is 512 × 512 × 1024.

Figure 10 .Figure 11 .
Figure 10.Passive tracer concentration in the xz-plane for the multi-mode 3D Rayleigh-Taylor test at time Agt 2 /L 14 for different resolutions.The dominant features of the flow are in qualitative agreement between the different resolutions, despite the highly non-linear nature of the flow at this time.

Figure 12 .
Figure 12.Mixing degree as a function of the dimensionless time Agt 2 /L.The blue points are the experimental data extracted from Dimonte et al. (2004).The other lines are results from Fornax calculations.We find that Fornax agrees well with the experimental data predicting the right growth rate of the instability in the non-linear regime.

Figure 13 .
Figure 13.Passive tracer concentration in the xy-plane for the multi-mode 3D Rayleigh-Taylor test at time Agt 2 /L 14 for different resolutions.Note the appearance of small-scale flow structures as the resolution is increased.

Figure 14 .
Figure14.Power-spectra of the heavy fluid concentration on the initial interface z = 0 at Agt 2 /L 14.The spectra are normalized to have unit norm and then compensated by k 5/3 so that regions with Kolmogorov scaling appear flat.As the resolution is increased, a progressively larger extent of the inertial range is recovered.

Figure 16 .
Figure 16.Two-dimensional density evolution of the Kelvin-Helmholtz instability over the grid at 3 different times: t=0, our initial condition (top), t=3 (middle), and t=6 (bottom).The vertical velocity perturbations translate to density perturbations, forming vortices at later times.

Figure 17 .
Figure 17.Results of the 2-D implosion problem of Liska & Wendroff (2003b) at t = 2.5 on a 400 × 400 mesh using linear (left) and parabolic (right) reconstruction methods.Contours are shown for ρ = [0.35,1.1] at 31 equidistant values and are exactly symmetric about y = x.The longer jet in the results at right reflect the lower numerical diffusion characteristic of parabolic relative to linear reconstruction methods.

Figure 18 .
Figure 18.Results of the 2-D implosion problem of Liska & Wendroff (2003) at t = 2.5 on a 1600 × 1600 mesh using parabolic reconstruction.At this very high resolution, the vortices at the tip of the jet have propagated into the upper right corner and reflected, running back along the top and right sides.The overall structure of the solution, however, remains largely unchanged in comparison to the lower resolution results shown in Fig.17.
Figure 18.Results of the 2-D implosion problem of Liska & Wendroff (2003) at t = 2.5 on a 1600 × 1600 mesh using parabolic reconstruction.At this very high resolution, the vortices at the tip of the jet have propagated into the upper right corner and reflected, running back along the top and right sides.The overall structure of the solution, however, remains largely unchanged in comparison to the lower resolution results shown in Fig.17.

Figure 19 .
Figure19.Results of the three-dimensional pressureless dust collapse test at time t final = 0.065 s.The computed density, ρ (blue exes), and velocity, v (red exes) are scaled by the cloud density and velocity, ρ cl and v cl , respectively, obtained from the semi-analytic solution (see text for details).The semi-analytic solutions for the radial profiles of density (dash-dotted) and velocity (dashed) are also shown.At a given radius, the data in all angular zones are identical, indicating perfect spherical symmetry is maintained on infall.Results from one-and two-dimensional tests are also identical.

Figure 22 .Figure 23 .
Figure22.Left: The profiles of the radiation energy density and scaled flux in the comoving frame versus radius for a steady-state constant point luminosity source.A non-trivial radial velocity field has been imposed, that includes a strong shock wave.Right: The corresponding radiation energy density and scaled flux profiles in the laboratory frame.We note that the matter is assumed to be transparent.We include on the plots a 1 r 2 line that captures the analytic answer in the laboratory frame.See text for a discussion.

Figure 25 .
Figure25.Temperature (top), velocity (middle), and density (bottom) for the multigroup pulse advection test with βτ = 1 in the static diffusion regime.Profiles at the initial time (dotted blue line) and at time t = 2w/βc for the unadvected (solid orange line) and advected (dashed green line) runs are shown.In both runs, the radiation pulse diffuses through the gas, and as pressure support is lost, the density in the pulse increases.Although the radiative work and advection terms are very different between the unadvected and advected runs, the relative error between them is so small that they are visually indistinguishable (see Figure26for a plot of the errors).

Figure 26 .
Figure26.Relative between the unadvected and advected runs for the density (solid blue line) and temperature (dashed orange line) in the multigroup pulse advection test in the static diffusion regime with βτ = 1 (see Figure25for their profiles).The relative errors are bounded in absolute value by 2.3 × 10 −5 and 1.4 × 10 −5 for the density and temperature, respectively.This indicates very good agreement between these runs, despite the fact that the radiative work and advection terms are very different from the vantage of the different frames.

Figure 27 .
Figure27.Same as Figure25but for βτ = 100 in the dynamic diffusion regime.Here, the radiation is trapped within the pulse and there is hardly any diffusion as it is advected across the grid.Again, the profiles for the unadvected (solid orange line) and advected (dashed green line) runs are visually indistinguishable (see Figure28for their relative errors).

Figure 28 .Figure 29 .
Figure28.Same as Figure26but in the dynamic diffusion regime with βτ = 100.The relative errors are bounded in absolute value by 1.4 × 10 −3 and 1.5 × 10 −4 for the density and temperature, respectively.Again, this demonstrates good agreement between the unadvected and advected runs, although the relative sizes of the terms is very different in each frame.
Figure 30.This figure demonstrates our code's raw per-cycle timing (top) and excellent strong scaling efficiency (bottom) out to 110k MPI tasks.Due to the greedy algorithm used to load-balance the decomposition of our dendritic mesh, efficiency can improve with increasing task count as demonstrated here.