General Relativistic Implicit Monte Carlo Radiation-Hydrodynamics

We report on a new capability added to our general relativistic radiation-magnetohydrodynamics code, Cosmos++: an implicit Monte Carlo (IMC) treatment for radiation transport. The method is based on a Fleck-type implicit discretization of the radiation-hydrodynamics equations, but generalized for both Newtonian and relativistic regimes. A multiple reference frame approach is used to geodesically transport photon packets (and solve the hydrodynamics equations) in the coordinate frame, while radiation-matter interactions are handled either in the fluid or electron frames then communicated via Lorentz boosts and orthonormal tetrad bases attached to the fluid. We describe a method for constructing estimators of radiation moments using path-weighting that generalizes to arbitrary coordinate systems in flat or curved spacetime. Absorption, emission, scattering, and relativistic Comptonization are among the matter interactions considered in this report. We discuss our formulations and numerical methods, and validate our models against a suite of radiation and coupled radiation-hydrodynamics test problems in both flat and curved spacetimes.


INTRODUCTION
Advances made in the modern era of multi-messenger astronomy, together with ever faster computing platforms, have inspired the development of increasingly sophisticated computational tools capable of modeling astrophysical systems as complex as kilonovae, tidal disruption events, gamma ray bursts, and accretion disks. These events are highly energetic interactions generally associated with compact objects (white dwarfs, neutron stars, black holes) where relativistic effects become important and radiative emissions are critical for interpreting observational data in the form of x-ray spectra and luminosity light curves.
Decades of progress has seen the field of computational physics advance from relatively simple N-body and hydrodynamics simulations to the point where relativistic magneto-hydrodynamics (MHD) together with radiation treatments beyond the diffusion approximation have become common place (a far from incomplete list includes Farris et al. 2008;Müller et al. 2010;Shibata et al. 2011;Zanotti et al. 2011;Sadowski et al. 2013;McKinney et al. 2014;Tominaga et al. 2015;Kuroda et al. 2016;Ryan & Dolence 2020). A historical perspective of this progress is exemplified by our own contributions with the Cosmos++ code, which grew in sophistication from a modest start in Newtonian hydrodynamics and flux-limited (grey) diffusion (Anninos et al. 2003), to eventually incorporate general relativistic magneto-hydrodynamics on unstructured, adaptively refined grids (Anninos et al. 2005). Many subsequent upgrades followed, including among others grey radiation transport with M1 closure (Fragile et al. 2014), adaptive high order discontinuous Galerkin finite element MHD (Anninos et al. 2017), covariant relativistic molecular viscosity (Fragile et al. 2018), nuclear alpha-chain reactive networks (Anninos et al. 2018), and most recently multi-group radiation transport (Anninos & Fragile 2020). This latest multi-group transport capability was developed using the two-moment (M1) approximation, so although it represents an important improvement over zero moment diffusion approximations, it does not compare in fidelity to an honest Boltzmann treatment, especially in the treatment of scattering (Comptonization) events.
In this paper we describe further developments of Cosmos++ to include an Implicit Monte Carlo (IMC) treatment of radiation fully coupled with general relativistic MHD and spacetime curvature. Unlike moment closure methodologies, Monte Carlo transport solves the full radiative transfer equation without artificial assumptions about closure relations for either the radiative intensity or flux. This enhanced accuracy comes at the cost of random noise in the solution. Although noise can be reduced by increasing the number of Monte Carlo packets being tracked, it can be exceedingly expensive in terms of computing resources. Nevertheless, Monte Carlo remains a promising avenue of exploration for solving astrophysical problems where radiation plays an important role, especially because it is also useful for generating synthetic spectra of radiation escaping from systems of interest which can be compared to astronomical observations, including those of the vicinity of black holes where spacetime curvature is important (Dolence et al. 2009;Schnittman & Krolik 2013;Zhang et al. 2019).
Previous couplings of Monte Carlo radiative (or neutrino) transfer to hydrodynamics include Nayakshin et al. (2009);Abdikamalov et al. (2012); Haworth & Harries (2012); Noebauer et al. (2012); Cleveland & Gentile (2015); Ryan et al. (2015); Roth & Kasen (2015); Tsang & Milosavljević (2015); Richers et al. (2017); Vandenbroucke & Wood (2018); Smith et al. (2020). Some of these are IMC methods, which modify the Monte Carlo radiative transfer coupling to hydrodynamics to allow for hydro time steps that are larger than the radiative cooling timescale of the fluid. The IMC methods adopted in this paper are closely related to those described in Roth & Kasen (2015). A key difference however is that here we consider the fully general relativistic problem, whereas Roth & Kasen (2015) only considered Minkowski spacetime and hydrodynamics equations that were only accurate to order (v/c) 2 . In the process, we describe a relativistic generalization of the so-called "Fleck factor" (in reference to the seminal paper on IMC, Fleck & Cummings 1971). Our goal of generalizing the Fleck factor was inspired by Gentile & Morel (2011), although our formula differs from the one described in those proceedings. We additionally point out that our methods differ from previous general relativistic MC rad-hydro treatments (e.g. Ryan et al. 2015) by the implicit discretization method developed here, the introduction of the stabilizing relativistic Fleck factor, and the step-length weighting procedure for Monte Carlo estimators. We also describe our treatment of Compton scattering, which remains valid in the high temperature regime where electrons can move relativistically in the comoving frame of the fluid. Another important aspect of this work is a discussion of the procedure for initializing Monte Carlo packets for a flow field traveling across an Eulerian grid so as to maintain zero flux in the comoving frame (section 3.5).
The remainder of this paper is organized as follows: In sections 2 and 3 we write out the equations and discuss our numerical methods, emphasizing aspects relating to the coupling of IMC to general relativistic hydrodynamics. In section 4 we present a series of tests validating our implementation of IMC and Compton scattering in Cosmos++. Section 5 compares compute efficiencies between IMC and multi-group M1. Finally, we conclude in section 6.
Most of the equations in this paper are written in units where G = c = 1, although in a few places we leave in factors of c for clarity. We adopt the usual convention whereby Greek (Latin) indices refer to spacetime (spatial) coordinates and impose a (−, +, +, +) metric signature.

BASIC EQUATIONS
We consider two formulations of the general relativistic radiation-hydrodynamics equations, both of which are supported by Cosmos++: One based on an internal energy (IE) approach used with nonconservative artificial viscosity methods for capturing shocks, and a second conservative total energy (TE) approach required for high resolution Godunov methods. For both formulations we write the (non-radiative) stress energy tensor in the following form: where magnetic fields and artificial plus molecular viscosity have been absorbed into a single symmetric tensor representation Q αβ = Q αβ S + Q B (g αβ + u α u β ) + b α b β . Here ρ is the fluid mass density, h = 1 + + P/(ρ) is the specific enthalpy, = e/ρ is the specific internal energy, P is the fluid pressure, u α is the contravariant fluid velocity, b α is the magnetic field, P b = g αβ b α b β /2 is the magnetic pressure, Q B is the bulk viscosity, Q αβ S is the symmetric shear tensor viscosity, and g αβ is the curvature metric. The fluid density, pressure, energy density, and specific energy are all defined in the comoving fluid frame.
Our implementation of radiation allows for both magnetic fields and molecular viscosity, however for the purposes of this paper we henceforth drop Q αβ from all subsequent discussions, but maintain occasional use of Q B representing artificial viscosity.

IE Hydrodynamics
In this formulation the mass, energy, and momentum equations are derived from ∇ µ (ρu µ ) = S D , u ν ∇ µ T µν = u ν G ν , and h αν ∇ µ T µν = h αν G ν , respectively, resulting in the following set of equations: where D = W ρ, E = W ρ , W = √ −gu 0 is the relativistic boost factor, v i = u i /u 0 is the fluid transport velocity, h αν = g αν + u α u ν is the projection operator, S D is an arbitrary source or sink for mass production that may come from chemical or nuclear activation, Γ µνj are the curvature Christoffel symbols, G ν are energy and momentum source terms from radiation coupling (or any other cooling function or acceleration force), and S j = √ −g(ρh − Q B )u 0 u j is the momentum which explicitly includes bulk (or artificial) viscosity in its definition, a choice we have made to simplify the evolution equations.
The radiation equations are derived from the 4-divergence of the radiation stress tensor, written simply as ∇ β R β α = −G α = − dνG α(ν) when integrated over frequency. In a coordinate frame the integrated 4-force with thermal emission and Compton scattering takes the form where a R is the radiation constant, T and T R are the gas and radiation temperatures, κ S is the scattering opacity, and κ F , κ A , and κ P are the flux, absorption, and Planck mean opacities, respectively. Local thermodynamic equilibrium (LTE) has been assumed to relate the fluid's radiation emissivity to its absorptive opacity to radiation.
More will be said about the transport equations and coupling expressions in a later section, but it is worth pointing out that the energy and momentum source terms in equations (3) and (4) reduce to the familiar Newtonian forms when associating R αβ u α u β with the radiation energy density, and R αβ u α h βj with the flux.

TE Hydrodynamics
The conservative scheme derives the hydrodynamics equations directly from the four-divergence of the mixed index stress tensor where as before Γ σ µν are the Christoffel symbols, and G ν is the radiation 4-force coupling. Defining total energy as E T = − √ −gT 0 0 and momentum S j = √ −gT 0 j , the corresponding conservation equations become Mass conservation (2) and the 4-force expression (5) complete this system of equations.

Primitive Inversion
In addition to the conserved (evolved) fields, one must also define the thermodynamic or equation of state variables (pressure, temperature, sound speed), and compute so-called primitive fields (mass density, internal energy, velocity) used in calculating the thermodynamic state, the radiation 4-force, and the stress tensor fueling curvature flow. Cosmos++ offers numerous options for equations of state ranging from simple ideal gas to complex (tabular) degenerate electron and nucleon models. For Newtonian systems the extraction of primitive fields from conserved quantities is straightforward enough, but the nonlinear coupling characteristic of relativistic hydrodynamics requires special iterative treatment. We have implemented several fully implicit Newton-Raphson schemes for solving the inversion problem, including reduced one, two, and five dimensional schemes for MHD, and a 5+4N dimensional method for radiation-MHD with N frequency groups. All of these options have been described in our earlier publications (Fragile et al. 2012(Fragile et al. , 2014Anninos et al. 2017;Anninos & Fragile 2020), so we do not go into detail here, other than to summarize the list of conserved fields (computed directly through the evolution equations), , and the primitive fields (extracted from evolved fields via Newton iteration) Here u k = u k − u 0 g 0k /g 00 is the velocity projected to normal observers, and R α , R 0 α , E R , u i R are stand-ins for the radiation fields required for our M1 treatment but are not necessary for our IMC implementation, an added benefit considering the computational cost this imposes on multi-group calculations. The cost trade-off between IMC and M1 is discussed further in section 5.

Radiation Stress Tensor
The most common approaches for treating relativistic transport have been based on one or two moment formalisms defined by a radiation stress tensor evaluated in inertial (11), comoving (12), or radiation (13) frames. Here n α = α −1 [1, − β i ] (n α = [−α, 0, 0, 0]) is the timelike vector orthogonal to the spacelike hypersurface, and β i and α = 1/ −g 00 are the shift vector and lapse function in the 3+1 decomposition of spacetime. The quantities E, E, and E R represent zero moment radiation energy densities in different frames, F α and F α are first moment momentum densities, and P αβ and P αβ are the anisotropic stress components generally determined by closure relations. Monte Carlo schemes instead evaluate the full radiation tensor (high order moments included) by directly sampling the invariant photon distribution function (Dolence et al. 2009;Ryan et al. 2015) such that for packets with 4-momentum k α and weights w k representing the number of photons in each packet. The form of the last expression in equation (15) restricts the summation be performed over a single grid cell with coordinate volume ∆ 3 x. In section 3.3, we will offer an improved estimator of the components of R αβ based on step-length weighting of packets.

Orthonormal Bases
Two frames of reference are utilized in solving the coupled radiation-hydrodynamics equations: the coordinate and comoving (fluid) frames, the latter defining a locally Minkowski metric η (α)(β) attached to the timeline tangent of the fluid 4-velocity. (We adopt the standard convention of using Greek letters to represent coordinate frames, and parenthetically enscripted Greek letters to represent tetrad bases where indices are raised and lowered with the Minkowski metric.) The hydrodynamics equations and photon trajectories are evolved in the coordinate frame. However, radiation-matter interactions are more easily evaluated in the comoving frame. Transformations between the two are performed by defining an orthonormal tetrad attached to the fluid and using a Gramm-Schmidt orthogonalization method to complete the tetrad. In particular we define the first 4-vector as e µ (0) = u µ . The remaining three vectors are selected based on either coordinate or physical (e.g, curvature, magnetic or flow field) alignments. We point out that aligning the first tetrad element to the fluid velocity is the equivalent, up to a spatial rotation, of a Lorentz boost in special relativity. Transformations from coordinate to tetrad and tetrad to coordinate bases are carried out by matrix products k (β) = e (β) α k α and the inverse k α = e α (β) k (β) . Higher rank tensors are similarly transformed g µν = e µ (α) e ν (β) η (α)(β) , using the metric tensor as an example. In the tetrad basis, photon 4-momenta are specified by the frequency and spatial direction, which in spherical coordinates takes the form where h is Planck's constant, and ν is the frequency in the comoving fluid frame, hν = −k α u α . When transformed to the coordinate frame (k α = e α (β) k (β) ) it takes on a more generic form k α = hν[u α + α ] whose components satisfy the orthonormality condition u α α = 0. The convenience of this dual frame approach is thus evident in the specification of photon 4-momenta when packets are created, which is generally performed in the fluid frame where distribution samplings are well-defined and optimally computed. But perhaps the advantage is best exemplified by considering the 4-force density derived directly from the radiation stress tensor where νχ ν , (η ν /ν 2 ), and I ν /ν 3 are the invariant extinction coefficient, emission coefficient, and specific intensity respectively. In the fluid frame this simplifies to a simple integral over frequency and solid angle where n (α) = (1, (i) ). These contributions to the comoving four-vector can then be transformed to the coordinate frame with G α = e α (β) G (β) . Yet another advantage to this fluid frame approach for constructing G α is that it allows a straightforward generalization of the semi-implicit treatment of radiation-fluid energy exchange via the Fleck factor (Fleck & Cummings 1971) to relativistic fluids (see section 3.4). We elaborate on the calculation of G (α) in sections 3.2, 3.3, and 3.6.

Geodesic Transport
Between interaction events photon packets are propagated along geodesic trajectories by solving the covariant transport equations in the following computationally convenient form for the spatial coordinates x i and momenta k α of each packet, together with the constraint k α k α = −m 2 c 4 (zero for null geodesics) from which k 0 is derived. These equations are generic and thus applicable to static or dynamical spacetimes, to Newtonian or general relativistic flows, and to flat space foliated with curvilinear coordinates. When the spacetime metric is known analytically the source terms are evaluated by direct calculation (we maintain analytic expressions for numerous metric forms and their derivatives). For dynamical spacetimes (e.g., when solving the Einstein equations or evolving pseudopotentials) the acceleration terms are projected from the mesh to each particle position with first or second order interpolation options.
Considering the significant cost of solving these equations numerically we have additionally incorporated a reduced option for problems where linearized gravity is a reasonable approximation. In this special case equations (19) reduce to a single source gradient where Φ 1 is the gravitational potential.
Some further optimization could be done even for problems involving strong gravity . To this end we have implemented a number of different solver options including 1st, 2nd and 4th order Runge-Kutta methods, and a velocity Verlet algorithm which minimizes the number of metric gradient calculations. Also, it is reasonable to expect when intervals between interaction events (scattering, absorption, census) are much shorter than local curvature gradients that one can approximate photon trajectories by tangential paths calculated from the last curvature update. For example, a measure of distance to local curvature 'scatter', such as min ijk [g ij /|∂ k g ij |], might be used to trigger (or deactivate) a full geodesic solve when measured against other interaction scales.

NUMERICAL METHODS
The full set of radiation-MHD equations are solved by operator splitting source terms into spacetime advection, curvature flow (Eulerian treatment for the field equations, geodesic transport for photon packets), constrained transport (or staggered vector potential) with magnetic fields, radiation-matter coupling, and finally primitive field inversion. Conserved fields are updated in time by the (user-specified) order that physics packages are inserted into a driver program. Primitive fields, from which interaction terms are derived, are updated simultaneously at the end of each time cycle from the advanced conserved fields after all physics packages have been evolved. Our numerical methods for the non-radiation parts have been described in previous publications (Anninos et al. 2005;Fragile et al. 2014;Anninos et al. 2017;Anninos & Fragile 2020) so we do not go into great detail here, except to note that we use high resolution shock capturing techniques for the hyperbolic elements (including HLL and LF Riemann solvers, together with PPM flux reconstruction), and explicit evaluation of curvature source terms. All field equations are solved on unstructured (adaptively refined) meshes using second order finite volume methods and up to fifth order time discretization. Primitive inversion is solved with a fully implicit Newton-Raphson procedure after physics updates, including radiation.

Propagating photon packets
Photon packets are created in the fluid frame with source-specific distribution sampling in angle and frequency. They are transported in the coordinate frame by advancing according to the geodesic equations between scattering events. When scattering events are triggered, packets are transformed to either the fluid frame, or to electron rest frame via an additional Lorentz boost, to perform the scatter. They are then transformed back to the coordinate frame to continue along geodesic trajectories. This process repeats until the packets are destroyed by absorption or exit the grid.
We developed two strategies for updating the energy of packets due to absorption. The first decrements packet weights after each step interval in accordance with the rate of energy absorption experienced along that path. Following this strategy, packets continue to be transported until their weight falls below a designated threshold. The second strategy avoids changing packet weights during each step. Instead, according to this strategy, we randomly compute a probability for removing the packet entirely during each interaction in proportion to the ratio of absorption opacity to total opacity, which includes the scattering contribution. This second option is consistent with interpreting a packet as a 'super-photon', behaving as we would expect individual photons to behave, but transporting energy far greater than would be transported by individual photons. In addition to aiding intuition, this super-photon approach has the tendency to keep the energy of all active packets nearly equal in many problems. In contrast, the first approach (which involves decrementing packet weights) leads over time to a wide range of packet weights, as packets that were created earlier in the calculation will have their weights substantially reduced when compared to newly created packets. Generally, the efficiency of a Monte Carlo method improves as packet weights become increasingly similar (Kalos & Whitlock 2008). For these reasons, we use the super-photon approach for all results described in this paper, but maintain the first option for comparison purposes.
According to the super-photon strategy, each time a packet is moved, we sample an interaction distance from an exponential distribution that accounts for both absorption and scattering processes, with average lengths equal to the frequency-dependent mean free path associated with the combined comoving frame absorption and scattering processes. We then transform the resulting interaction distance to a coordinate-frame distance and compare the coordinate time required for a packet to traverse that distance to the coordinate time it would take for each photon to reach a zone boundary, and to the coordinate time remaining until the end of the time step. The shortest of these timescales is used to move the photon to its next location, repeating the process as necessary until the end of the cycle.
In order to simplify the notation in the following more formal discussions we forego strict adherence to some of the index conventions established in the earlier theoretical sections. In particular, because much of the following centers around the fluid frame, we drop the parenthetical labeling of tetrad bases and, except in a few places, use hatted variables (not indices) to reference the fluid frame.

Radiation-matter coupling
As described in section 2.5, the exchange of energy and momentum between radiation and matter is completely described by the four-vector G µ , whose components are most straightforwardly evaluated in the comoving frame, G µ . We will now describe how we can simplify the equations for G µ under a set of widely applicable physical assumptions.
First, we note that thermal emission processes typically have direction vectors that average to zero in the comoving frame, allowing these emission terms to be omitted from the radiative momentum source term if statistical conservation of momentum (rather than exact conservation) is acceptable. Second, under the assumption of local thermodynamic equilibrium, the fluid source can be represented by the Planck function B ν . Third, when scattering processes are coherent, they do not contribute to energy exchanges in the comoving frame. Finally, similar to thermal emission, coherent scattering processes typically have outgoing direction vectors that average to zero in the comoving frame. Given these simplifications, then returning to equation (18), we find In the above, we use χ a ν to refer to extinction from absorption, while the undecorated χ ν refers to a total extinction that includes both absorption and coherent scattering. The extinctions χ a E , χ a p , and χ i F are the energy, Planck, and flux mean absorption coefficients, respectively, defined as Compton scattering, which is noncoherent in the comoving frame, is handled separately, as we describe in section 3.6.

Comoving-frame moment estimators
We require a way to compute the terms that appear on the right-hand sides of equations (21) and (22) based on our Monte Carlo treatment of the radiation field. We begin by considering how to construct Monte Carlo estimators of moments of the radiation field in the coordinate frame, where the packets are transported. In this section, we restore factors of c which have been set to 1 by choice of units in other sections. One approach would be to use equation (15). For example, the radiation energy density E can be found by taking the 00 component, where an arrow indicates that the Monte Carlo estimator converges to the indicated radiation moment as the number of packets approaches infinity. However, we would like to more smoothly handle situations where packets may enter and leave zones over the course of a time step that lasts a coordinate time interval ∆t. Additionally, when packets change direction within a zone, we would like to incorporate this into our flux estimators in a way that uses all available packet propagation information, not just their final directions at the end of the time step. So, following Lucy (1999) and Roth & Kasen (2015), we sum over the steps that packets take in each zone, weighted by the coordinate time dt associated with each packet step, and at the end we divide by the total time interval ∆t: where ∆ 4 V = √ −g∆ 3 x(c ∆t). This approaches an invariant four-volume element with increasing space and time resolution, which lets us write ∆ 4 V = ∆ 3 x(c ∆ t), because − g = 1 in the orthonormal comoving frame. Now consider the estimator for E, the radiation energy density in the orthonormal comoving frame. This can be written in terms of step lengths = c d t: Next we need a way to express . In general, the quantity νχ ν is invariant. We also know that optical depth along a step, τ = χ ν , is invariant, since it is equal to the negative of the log of the fraction of photons that remain in the packet without interacting, and photon number is an invariant quantity. This allows us to write the following set of relations: which can be used to evaluate the estimator as a function of coordinate path length We can compare equation (27) with (30). The estimator for E is constructed almost exactly the same way as for E, except for two differences: two factors of k 0 n /k 0 n are included for each packet step contribution, and step coordinate time intervals are given by /c, where is still a coordinate-frame length. For the special case of transforming between inertial frames, the two factors of k 0 n /k 0 n have a simple physical interpretation: One factor accounts for the Doppler shift of the photon energy, while the other accounts for length contraction of the step interval.
The factor k 0 n /k 0 n can be computed once per zone per hydro time step, in advance of the photon propagation, based on the tetrad transformation. Although we have not done it here, this estimator could in principle be made accurate to higher order in spacetime resolution by updating the factor of k 0 n /k 0 n for each packet step as k 0 n is updated during geodesic propagation, and by interpolating the fluid four-velocity within the zone to redefine the tetrad transformation at each intermediate step location. Now we can also construct the estimator for χ E E to use in equation (21), Similarly, an estimator for F i can be constructed as where i is described in section 2.5. Note that the inclusion of i in equation (32) requires computing the comoving direction vector using the tetrad transformation each time the packet undergoes a scattering event. Finally, the estimator corresponding to equation (22) is

Relativistic Fleck factor
An important advantage to handling energy and momentum exchange between radiation and gas in the comoving frame of the fluid is that it allows straightforward generalization of the semi-implicit treatment of radiative energy exchange, adapting the Fleck factor (Fleck & Cummings 1971 to the case of relativistic fluid flow. Based on the principle of relativistic invariance of inertial frames, we can write equation (21) as where dτ is the interval of proper time as measured by the comoving fluid element. This equation reduces to the analogous equation for the material energy update in FC1971 when source terms aside from thermal radiation are ignored. By writing the material energy update due to radiation this way, we are applying a form of operator splitting by first computing the change in material energy solely due to absorption and emission of radiation, and then later adding this contribution as an energy source when Cosmos++ performs its flux-conservative update of the material energy. It is reasonable, therefore, to consider ρ as a constant in this material energy equation. Following FC1971, we introduce the quantities u r and β, where we have assumed the material is a perfect gas, such that = c V T for a specific heat c V that is independent of temperature. The quantity u r should be considered as a representation of the material temperature T , not equal to the radiation energy density unless the radiation and the material have reached thermal equilibrium. We then have We now wish to discretize equation (34) in time. Let u n r refer to the value of u r at the beginning of time step n. We define a time-centeredū r = αu n+1 r + (1 − α)u n r for a parameter α such that 0 < α < 1. The time-discretized version of equation (34) We can solve equation (39) for u n+1 r in terms of u n r , and plug this into the definition ofū r to obtain where the common multiplier is associated with the Fleck factor Notice that this formula for f IMC utilizes the proper (not coordinate) time interval ∆τ .
In terms of f IMC , the discretized radiative heating and cooling equation can be written as Simplifying, and referring to equation (38), Comparing to equation (34), we see that to implement our semi-implicit thermal balance, all we must do is replace the true absorption extinction χ a ν with an effectively reduced absorption extinction f IMC χ a ν . The Fleck factor is therefore included when we construct the absorption estimator given by equation (31), and it is also included when setting the energy of packets generated due to thermal emission. Moreover, the Fleck factor is included when determining the probability that a packet is absorbed during an interaction and therefore removed from the calculation. The interactions not included by the Fleck factor can be considered as a coherent effective scattering process, with extinction (1 − f IMC )χ a ν . Together the Fleck factor-weighted absorption extinction and the effective scattering extinctions sum to χ ν , and they both contribute to the radiation force estimator defined by equation (22).
All that remains is to use the relationship between proper time of the fluid element and coordinate time c dt = u 0 dτ , where u 0 is calculated from the normalization constraint u α u α = −c 2 . Note that for flat spacetime this simplifies to the familiar relation u 0 = cγ where γ = 1/ 1 − v i v i /c 2 is the Lorentz factor. The final form of the Fleck factor can be written We point out that this result differs from Gentile & Morel (2011), who used γ∆t rather than ∆t/γ in the denominator of the formula for f IMC in the case of flat spacetime. Both expressions lead to a stable energy update, although the Gentile & Morel (2011) form effectively increases the coefficient α compared to ours, leading to less rapid cooling than can be achieved stably with our version. Our formula also generalizes to curved spacetimes, considering u 0 encodes metric information.
In practice, to ensure stability for problems where a nontrivial Fleck factor is required, the parameter α is set to a value between 0.5 and 1.0. Higher values of α lead to smoother solutions and improved stability. However, higher values of α can also lead to artificially long cooling times for zones undergoing rapid cooling. This is evident in the radiative equilibrium test described in Noebauer et al. (2012) and Roth & Kasen (2015).

Initializing packets in relativistic flows
At times it is desirable to begin a radiative transfer calculation when there is already a non-negligible field present. To that end we work out an initialization procedure for radiation packets being transported along with a moving fluid across a Eulerian grid representing an inertial frame. For pedagogical purposes, we restrict our attention to onedimensional flows with constant velocity and uniform density and temperature in flat spacetime. Through appropriate coordinate transformations, the same procedure can be applied in more general settings.
To help validate properties of our implementation described later in this section, we refer to the following exact relations between inertial frame moments. These relations can be obtained from the Lorentz covariance of the radiation stress-energy tensor, and in 1D takes the form (equations 91.13-91.15 of Mihalas & Mihalas 1984): where we have simplified notation to drop vector and tensor indices. Here, v is the (potentially relativistic) velocity as measured in the inertial frame, and β and γ have their usual special relativistic meanings. For the special case of uniform flow, symmetry requires F = 0 and P = E/3. Substituting these conditions into equations (45)-(46) we obtain 3.5.1. Enforcing zero flux in the fluid frame The option to sample packet directions isotropically in the comoving frame is generally possible, even for relativistically moving fluids, so long as we choose packet energies appropriately to ensure F = 0. In this section, we will consider packet energies n = w n (c k 0 n ). One procedure we can follow to enforce this condition is to introduce a reference packet energy ref , whose value we will determine shortly, and set lab-frame packet energies as Here we have identified k 0 n /k 0 n with ( ν/ν) n , the ratio of frequencies measured in the comoving frame and the lab frame, respectively. Then, starting from equation (32), and taking the limit ∆t → 0 so that /(c∆t) → 1, we have where the last sum approaches zero because of our choice to initialize packet directions isotropically in the comoving frame. We next relate ref to the lab-frame moment E in a zone, To determine how this sum converges, we again exploit the fact that we have chosen the packet directions isotropically in the comoving frame. If we originally chose, say, the x-axis to be parallel to the fluid flow in the lab frame, then after transforming to the comoving frame we have a new axis x , along with the unchanged y and z axes. Let µ represent the cosine of the angle between the packet direction and the x axis. Then with the inverse Doppler relation the angle probability distribution dP/d µ = 1/2, and letting N denote the number of packets in the zone at the initialization time, we can write We recognize the final expression as part of equation (48), namely E/ E. So we now have a formula for ref in terms of the comoving radiation moment, which can be evaluated if E is viewed as part of the initial conditions of the problem (if for example radiation is thermalized, E = a r T 4 ). Referring to equation (51), we find We have arrived at what first glance appears to be a counter-intuitive result. Although we have given our packets an isotropic direction distribution in the comoving frame, these packets must be given unequal energies in order to ensure zero flux in the comoving frame. Referring to the Doppler relation in equation (54), we see that the energy is adjusted based on the packet direction with respect to the material motion. Or, viewed another way, we could force the comoving packets to have equal energy, but then we would have to sample their direction anisotropically.
The necessity of such a procedure was also identified in Ryan et al. (2015), through consideration of the invariant four-volume element.
We know from symmetry that there is no preferred direction in the comoving frame. In fact, we have relied on this to derive our results. So why are packets not treated symmetrically when we initialize them in the comoving frame?
The seeming contradiction is resolved when we remember that we are constructing an estimator for the comoving flux assigned to a zone whose boundaries are specified in the lab frame. In the comoving frame, these boundaries are moving at the speed of the fluid, thereby breaking the symmetry. The packets are indeed directed isotropically in the comoving frame, but the flux of packets through the zone boundaries over a proper time interval is biased in the direction anti-aligned with the apparent motion of the zone boundaries. Moreover, the initialization of packets relies on a notion of simultaneity according to the lab-frame clock. In the comoving frame, this simultaneity is broken: according to the comoving frame clock, the zone boundaries will not simultaneously reach their spacetime positions corresponding to the lab frame t = 0 positions. The above procedure accounts for these effects and leads to proper initialization to ensure zero comoving flux in the zone. We use this procedure in the radiating shock tube tests described in section 4.1, and we confirm that it eliminates spurious transient behavior at early times in the regions of uniform flow when compared to a procedure that initializes packets with equal energies as measured in the comoving frame and isotropically sampled directions in the comoving frame.

Relativistic Compton scattering
Our treatment derives from Canfield et al. (1987) and Dolence et al. (2009), both of which were substantially influenced by Pozdnyakov et al. (1983). We briefly summarize the method here.
For each scattering event, after having transformed into the fluid frame, we begin by sampling the speed v e of thermal electrons from the relativistic Maxwell-Juttner distribution, given the fluid temperature. Next we sample a direction for the electron in the form of a unit vector n i e . This direction is characterized by an angle θ between the fluid-frame photon propagation vector and the electron velocity vector, and a uniformly distributed azimuthal angle φ that is independent of θ. The distribution for θ is weighted such that electrons with momenta anti-aligned with the photon momentum are more common than those that are aligned, as described by the distribution (1 − β cos θ)/2, where β = |v e |/c. For discussions of this angular dependence of the cross section, see for example Gould (1971).
Once we have obtained an electron velocity vector following these steps, we perform a Lorentz boost into the electron rest frame, and there we compute the angle-integrated scattering cross section σ using the Klein-Nishina formula, which depends on the frequency of the photon as measured in the electron rest frame. We now apply a rejection sampling method by generating a random unit variate and comparing it to the ratio σ/σ T where σ T is the Thomson scattering cross-section σ T and which is the maximum possible value for σ. If the variate is less than this ratio we proceed, but if it is greater we reject the electron velocity vector and begin again by sampling a new v e and n i e . Once we accept an electron velocity, we sample the outgoing photon direction from the Klein-Nishina differential scattering cross-section, accounting for the change in photon energy due to electron recoil. We then inverse Lorentz transform the photon four-momentum back into the fluid frame, and perform one final transformation from the tetrad basis to the coordinate frame (or inverse Lorentz boost for Minkowski spacetimes).
During each scattering event, we record the difference of the incoming and outgoing photon energy and momentum components as measured in the comoving frame, and sum these contributions over all scattering events in each zone over the hydrodynamic time step. The resulting energy and momentum sums per cell-volume per time interval are added to our baseline estimators for G α described in sections 3.2 and 3.3 that accounted for thermal absorption and emission, and coherent scattering.
During the photon propagation step we must also determine the fluid-frame mean free path for Compton scattering off the thermal distribution of electrons, which will be a function of the local electron temperature T e and the fluidframe photon frequency ν. This requires taking the average cross-section as measured in the fluid frame that results from the above sampling procedure over the distribution of electron velocity vectors. The resulting average crosssection can be expressed as a double integral over all possible values for v e and θ. For computational expediency, we pre-compute these average cross-sections and store them into a two-dimensional look-up table binned logarithmically by electron temperature (ranging from 10 7 to 10 12 Kelvin) and photon energy (ranging from h ν = 1 eV to 10 8 eV).

Methods overview
Many steps go into a single cycle update, so we take the opportunity here to summarize a broad view of the algorithm. First we point out that (1) none of the new material discussed in this report affect the other physics packages, except for the total energy and momentum modified by the radiation 4-force, and (2) all of the source terms used in updating the radiation fields are derived explicitly from current state primitive fields (only at the very end of the cycle are primitive fields updated). Because primitive fields are not altered intermittently between physics solves, the radiation transport solver is not affected by and proceeds (within a single cycle) essentially independently of other packages, and vice versa.
Each cycle begins with the interaction-sampling propagation sequence described in section 3.1: Between events packets are transported by the geodesic equations; at event sites packet energies and velocities are updated in accordance with the associated event (e.g., absorption, physical scattering, Fleck scattering). An interpolated weighting procedure calculates comoving moment estimators (section 3.3) and accumulates updates to the radiation 4-force, equations (21) and (22), integrating over energies and solid angles, accounting for both absorption and scattering. At the end of each propagation cycle, when particles reach census, the integrated 4-force is used to compute the fluid internal energy semi-implicitly with the Fleck procedure described in section 3.4. For the total energy formulation (section 2.2), the internal energy computed at this stage does not replace the actual primitive field, but is instead used to update the total energy and momentum conserved fields. After all physics packages have been advanced, primitive fields are obtained from the conserved (evolved) fields by Newton iteration before exiting the cycle.

Radiating shocks
This series of radiation wave and shock tube tests was first performed by Farris et al. (2008) in the Eddington closure approximation, P ij = ( E/3) δ ij . Versions of these tests have since been repeated with several other codes and treatments of the radiation, including our own earlier work with single-group Eddington (Fragile et al. 2012) and multi-group M1 closure (Anninos & Fragile 2020). We perform these tests again with implicit Monte Carlo, comparing results to our M1 treatment. Similar tests were carried out by Ryan et al. (2015) in their explicit treatment of Monte Carlo transport, which requires smaller time steps to resolve the radiation cooling time scale and to maintain stability.
The boundary conditions for this test make use of the fact that far from the shock transition region, F i → 0. In the lab frame, however, absent any injection of packets from the boundaries, there is a non-zero flux through the boundaries due to the advection of radiation. In this situation, Gentile & Morel (2011) show that the proper boundary condition requires injecting packets from the boundary with the following energy flux measured in the coordinate frame: where and where n i is the normal vector to the boundary plane pointing into the computational domain. In this 1D problem, the boundary planes coincide with planes of constant x-coordinate, and we have assumed the material has no motion in the y or z directions, although Gentile & Morel (2011) also discuss how to accommodate such motion. By injecting packets at a rate consistent with equation (58), the net radiative flux near the boundary as measured in the comoving frame will tend toward zero. Moreover, we also require the distribution of direction cosines µ with respect to the boundary normal direction for those packets injected from the boundary, measured in the coordinate frame. This will allow the packet direction distribution near the boundary to match the distribution in the interior zones. Again quoting from Gentile & Morel (2011), and assuming no material motion in the y or z directions, this distribution is We consider five tests: a nonrelativistic strong shock (case 1), a relativistic strong shock (case 2), a relativistic wave (case 3), a radiation pressure dominated relativistic wave (case 4), and an optically thick variant of case 4. The parameters and initial states for all cases are summarized in Table 1. All calculations were performed with 128 zones covering the entire domain for both the IMC and M1 calculations, and run until transients exited the grid and the solution achieved steady-state. Figure 1 compares the M1 and IMC solutions for density (ρ), velocity (v x ), gas temperature, and radiation temperature as a function of position. We find the agreement between IMC and M1 solutions is excellent in cases 1, 2 and 5. Disagreements are most notable in cases 3 and 4, with case 4 being the most obvious. The extent of the disagreement has a strong dependence on the optical depth across the domain of the problem. For example, case 4 has the most disagreement and also the lowest optical depth (τ = 5.8 across the downstream half of the domain). By comparison, case 5, which is the same as case 4 except for a higher opacity (τ = 51 across the downstream half of the domain), shows much better agreement between the two methods. This is expected behavior, since by design the M1 closure will faithfully solve the radiative transfer equation in the optically thick limit, but makes assumptions which are only approximately correct in the marginally optically thick regime. Figure 2 shows how the Fleck factor f IMC varies across the domain once the steady-state solution has been reached in all five cases. The Fleck α parameter was set to 1.0 in all five cases. The lowest values of f IMC occur in case 5, where they become less than 0.02 on the downstream side of the shock, indicating that we are taking time steps at least 10 times longer than we would be able to if we did not use a Fleck factor. We have also experimented with changing the Fleck α parameter for case 5, using α = 0.5. Despite some early transitory differences at the shock position, the solutions very quickly converge within a few cycles, so the value of alpha has no noticeable impact on the steady-state values of the quantities plotted in figure 1.

Compton equilibration off relativistic electrons
In this next test we initialize photons at a single frequency ν 0 in a uniform region of gas with zero velocity in the lab frame. We consider gas that is fully ionized, with free electron temperature T e , electron number density n e , and free electrons make up a fraction f e of the total number density of particles, which is constant in our problem. The photons have number density n γ , and they Compton scatter off the free electrons. In the process, energy is exchanged between the gas and radiation as the volume is kept constant, and over time they equilibrate at a temperature T f . By construction, photons are neither created nor destroyed in this test, and their final distribution is described by a Wien distribution with temperature T f and chemical potential µ f . Whenever photon energies are specified in this section, they are measured in the comoving frame of the fluid, which is also the frame in which the electron temperature T e is defined.
This test is modeled closely on one performed in Ryan et al. (2015), and a related test in Roth & Kasen (2018). However, in this case we perform the test at higher T e , such that the relativistic effects mentioned in section 3.6 become relevant when sampling electron velocities and calculating mean-free-paths. We also provide an analytic formula for the equilibrium solution T f as follows.
The mean intensity for the equilibrium Wien distribution is given by Keeping in mind that the specific radiation energy density is E ν = 4πJ ν /c, the above distribution can be integrated over frequency to find the radiation energy density E f which corresponds to T f and µ f , yielding Fleck factors fIMC for all five radiating shock tests. For cases 3 through 5, fIMC is small enough that these problems must be solved implicitly (with α = 1) to maintain stability at hydrodynamic timescales.
To solve for the two unknowns T f and µ f , we consider two constraints: conservation of photon number and conservation of total energy in the combined gas-plus-radiation system. Since the volume is kept constant, the two conditions can be expressed as conservation of number density and energy density. We exploit the fact that the two systems equilibrate to T f and the photons equilibrate to the Wien distribution in the manner previously described. This means that conservation of photon number density is already expressed in equation (63). The total energy conservation constraint can be written as where we have assumed an ideal gas equation of state for the gas, with adiabatic index γ ad , and that electrons and ions have equilibrated at temperature T f . Solving the two equations we obtain Combining equations (62) and (66) we find that theradiation energy density in the equilibrium distribution is This implies that the mean photon energy in the equilibrium distribution is 3 kT f . Figure 3 shows results from this test with parameters T i = 10 9 Kelvin, n e = 2.5 × 10 17 cm −3 , n γ = 2.38 × 10 18 cm −3 , ν 0 = 6 × 10 17 Hz, f e = 0.5, and γ = 5/3. Equation (65) then gives T f = 1.04 × 10 8 Kelvin, and equation (66)   of time, both of which settle toward the predicted value of T f . The right panel shows how the spectral distribution E ν of the radiation evolves over time as the photons repeatedly Compton scatter off hot electrons. Eventually E ν converges to the Wien spectrum corresponding to T f and µ f given by equations (65) and (66). The spectra in panel (b) of Figure 3 used 2 × 10 5 packets. In Figure 4, we show how the L1 error norm for the steady-state IMC spectrum compares to the analytic Wien spectrum as a function packet number N . As expected we find the error scales as N −1/2 .

Compton scattering angle distribution test
For these tests, we again initialize photon packets at a single energy hν and allow them to Compton scatter off a population of thermal electrons, but this time we keep track of the distribution of photon scattering angles. We are careful to consider only those scattering events where the incident photon energy is at the value we initialize -that is, we are not describing here the effects of multiple scattering that were studied in section 4.2.
The first case we consider corresponds to the simultaneous limits hν/(m e c 2 ) → 0 and k B T e /(m e c 2 ) → 0, which is the special case of Thomson scattering, with differential scattering cross section given by where σ T is the Thomson total scattering cross section and θ s is the photon scattering angle in the rest frame of the target electron prior to scattering. We will work in terms of the probability distribution for the photon scattering angle θ s , dP/dΩ = (1/σ T )dσ/dΩ. Introducing µ s ≡ cos (θ s ), and performing the integral over the azimuthal angle, we can then write dP dµ s = 3 8 1 + µ 2 s .  Figure 5. Histograms of the cosine of the photon scattering angle from our Monte Carlo sampling procedure. The top and bottom rows correspond to relatively low and high temperatures, respectively, for the target population of electrons. The first and second columns correspond to relatively low and high incident photon energies, respectively. When available, we have plotted the analytic distributions from Sazonov & Sunyaev (2000) as red envelopes. In all cases where the approximate distribution is available, we find satisfactory agreement with our Monte Carlo results. The small discrepancy near µs = −1 in panel (b) is expected because the analytic distribution is only first order in hν/mec 2 .
In Figure 5 we show histograms with randomly generated µ s for four separate cases. Panel (a) of this figure corresponds to the Thomson scattering limit we have just described. Here we can compare our distribution of µ s to equation (69), and we find good agreement.
Panel (b) of Figure 5 shows the distribution of µ s for the case of higher-energy incident photons (hν = 0.1, m e c 2 ≈ 50 keV) scattering off electrons at the same temperature as before, T e = 10 4 Kelvin. These higher-energy photons have a higher probability of forward-scattering (µ s > 0) than was the case in the Thomson limit. Sazonov & Sunyaev (2000) provide an approximate analytic formula for the scattering distribution in this regime, which is plotted in red in the figure, although we have adjusted the normalization constant for the distribution to ensure the probability integrates to unity. The slight discrepancy between the analytic formula and the Monte Carlo results at µ s near -1 is expected (and also appears in figure 1 of Sazonov & Sunyaev 2000) because the formula is only first order accurate in hν/(m e c 2 ).
Panel (c) of Figure 5 shows the distribution of µ s for the case of low-energy photons (hν/k B = 3 × 10 4 K), but much hotter electrons (kT e = 5m e c 2 , or equivalently T e ≈ 3 × 10 10 K). This leads to a strong weighting toward back-scattering, with a distribution in agreement with the corresponding formula from Sazonov & Sunyaev (2000).
Finally, in panel (d) of Figure 5, we consider the higher energy photons, hν/(m e c 2 ) = 0.1, scattering off the hotter population of electrons, kT e /(m e c 2 ) = 5. We are not aware of an analytic approximation for the scattering angle distribution in this case. Even though these photons were high enough energy that they had a tendency to forwardscatter off the colder population of electrons, when the electrons are this hot, the distribution is still weighted toward back-scattering.

Gravitational and Doppler redshifts
The accreting and radiating sphere problem considered by Müller et al. (2010) (see also O'Connor (2015)) provides a good test of frequency redshifts experienced by photons traveling through gravitational fields and moving fluids. We adapt this test to the Cartesian Kerr-Schild spacetime metric, modeling the gravitational field of a single non-rotating 10 3 solar mass black hole on a N × 40 × 40 grid, where we vary the number of zones N along the x-axis. The sides of the grid have total lengths L x = 70 and L y = L z = L x /10 in mass units. We do not simulate the emission of thermal photons from a stellar surface as previous applications of this test (Anninos & Fragile 2020), but instead launch a steady stream of photon packets from x/R g = 2.4 (just outside the event horizon at R BH = 2R g ), directed along the x-axis, with identical 4-momenta. Absorption and scattering opacities are both set to zero in this problem, suppressing all radiation-matter interactions aside from the gravitational potential and Doppler interactions with the steady accretion flow introduced at a finite distance from the black hole in the following manner: Figure 6 plots photon energy measured in the comoving fluid frame, h ν = −k α u α , as a function of radius from the black hole. Results are shown at time t = 80 in units with M BH = G = c = 1. As expected, we observe photon energies to decrease with radius due to general relativistic time dilation effects, and then increase as they propagate through the accreting fluid due to relativistic Doppler effects.
To test the accuracy of our computation of gravitational redshift and relativistic Doppler effects, we derive an exact, analytic solution to use for comparison. In the limit that the angular momentum of a black hole approaches zero, we define a radial coordinate from the Cartesian Kerr-Schild coordinates, and for purely radial trajectories the line element can be written This implies the existence of a Killing vector field with components K 0 = 1 and K r = 0, and a conserved quantity which we denote as E ∞ obeying the relation Combining the above equation with the null condition for k α we find If we set the value of k 0 at our launch radius, then we can use equation (74) to calculate E ∞ , and then again to find k 0 as a function of radius. Finally, if u α is the fluid 4-velocity, we compute u α k α by using the metric corresponding to equation (71) to lower the index of k α . This analytic solution is also plotted in figure 6 as a dotted line. Even for the coarsest resolution used in that figure, the numerical solution remains within 1% of the analytic solution across the entire domain. In figure 7 we study the numerical convergence of the solution as a function of spatial resolution along the x-axis. In panel a we show how the L1 error norm decreases with increasing spatial resolution, focusing on the interval 2.4 < r/R g < 16. In this region, the fluid velocity is zero, so u α k α entirely reflects the gravitational redshift, which is obtained from the geodesic integration of the photon 4-momentum. The numerical solution converges at second order accuracy, as expected from the Verlet integrator used for the geodesics. In panel b, we plot the L1 error norm for the region 16 < r/R g < 32. Here the fluid velocity takes on non-zero values as given by equation (70). When selecting the fluid velocity components that go into u α k α , no interpolation of the fluid velocity is performed -the velocity value at the center of the zone is used. This causes the numerical solution to converge at first order accuracy with respect to spatial resolution. Of course, if we consider the full range of radii including both non-zero and zero fluid velocity, the first-order error dominates.

Geodesic light beams
For strongly gravitating systems (e.g., black holes), it is critically important that we solve the geodesic equations (19) accurately through multi-dimensional curved spacetimes in general curvilinear coordinates. To validate our geodesic solvers we consider a light beam test proposed by Sadowski et al. (2013) which launches packets at the photon orbit radius, r = 3 in distance units of GM/c 2 for a non-rotating 3 M Schwarzschild black hole, so that photons in the center of the beam should in principle be able to orbit the black hole indefinitely. As in the previous redshift tests, we neglect coupling between gas and radiation (by setting absorption and scattering opacities to zero), focusing on evaluating the accuracy of geodesic path integrals but keeping to the full IMC solver framework for computing the energy and flux. The calculations presented here are run on a two-dimensional grid (in r-φ coordinates), with 40 × 40 cells covering the domain 0 ≤ φ ≤ π/2 and r in ≤ r ≤ r out , where r in = 2.5 and r out = 3.5 are the inner and outer radial boundaries, respectively. The light beam is centered at r = 3 and given a width of r = 0.1.   3.4 × 10 −14 1.9 × 10 −12 Figure 8 plots the trajectory of the light beam at the final simulation time. As expected the center of the beam remains on the photon orbit radius, and widens slightly at the edges (falling towards the black hole along the inner edge, and pulling away at the outer).
We have also performed a number of single photon orbital trajectory tests through static and rotating black hole spacetimes, evaluating errors in energy, angular momentum, Carter, and null condition constants of motion. These tests were carried out on three-dimensional grids using the Boyer-Lindquist metric with a black hole mass of 10 3 M and spins up to a = 0.9M BH , considering both prograde and retrograde equatorial orbits, as well as polar trajectories (Teo 2003). We find, as expected, that the constants of motion are maintained to the convergence accuracy of the solver methods: first order convergence for first order Runge-Kutta (RK1), second order for second order Runge-Kutta and Verlet methods, and fourth order for fourth order Runge-Kutta (RK4). A characteristic sample of parameters and results are shown in Table 2, where we demonstrate convergence in the energy and Carter constants over several orbits between the RK1, Verlet, and RK4 methods. All calculations were run at a constant time step interval of about 0.01 in mass units (M = G = c = 1). We do not include the angular momentum or null (k α k α = 0) constants in the table since they evaluate to roughly machine precision (less then 10 −14 in most of the tests).
To measure the relative performance of the various geodesic solvers, we repeated the 2D light-beam test using RK1, RK4, and Verlet methods, in addition to the RK2 method that was used to generate figure 8. The timing information for the calculations is reported in table 3, normalized to the RK1 compute time.

Radiating Bondi accretion
The radiating Bondi accretion problem considered in our earlier work (Fragile et al. 2012(Fragile et al. , 2014 provides an excellent test of coupled radiation-matter interactions in a hot and strongly gravitating black hole environment. We consider a variant of that test here, comparing our current IMC results to our previous radiation treatment with M1 closure.
A key difference however is that here we work in three-dimensions on Cartesian grids using the Cartesian Kerr-Schild black hole solution for the spacetime metric, rather than its two-dimensional spherical equivalent. Other than this difference, the basic problem definition and parameter choices are the same, so we refer the reader to those earlier papers for details and merely report the salient features here.
We fix the mass of the black hole to 3M , the adiabatic gas index to Γ = 5/3, and the gas temperature, freefall velocity, and density to where the mass accretion rate is defined in terms of the Eddington rateṀ = x MṀEdd withṀ Edd = 4πGM σ T /(cm p ), and for this test we set x M = 0.1. The radiation is initialized with a temperature much smaller than the gas temperature, but eventually increases in time as accretion heats up and thermal emissions contribute to the luminosity.
Interactions between matter and radiation are modeled with Thomson scattering (κ s ) and thermal bremsstrahlung (κ a ) opacities assuming fully ionized hydrogen gas The luminosity, defined as where dA r is the surface area element normal to the radial direction, agrees very well between the two methods (IMC versus M1). Figure 9 shows a comparison of the luminosity escaping through a spherical surface of radius 2000 R g centered on the black hole. After a brief initial transient, between t = 400 and t = 1200, the two methods agree to within 10%. Between t = 1200 and t = 2200 there is larger disagreement as the radiation wave from accretion near the black hole reaches the flux measurement surface. This is then followed by a period of even better agreement: between t = 2200 and t = 5000, the root-mean-squared fractional difference between the luminosities is 3.9%.

PERFORMANCE
A subset of the code tests presented in this article, particularly those that were validated against equivalent M1 calculations, provide an opportunity to compare relative CPU resources required by the two different treatments within the same code framework. A few 1D radiating shock tube tests and the 3D radiating Bondi accretion problem are selected to showcase this comparison. We note our current implementation does not redistribute particles for load-balancing purposes. Domain exchanges are performed only when particles propagate into domain-shared cells while transiting to their neighboring partition. The following performance comparison therefore does not account for communication costs (or benefits) of either mesh re-partitioning or particle load balancing beyond field and particle cross-over exchanges. Table 4 summarizes our results across four tests: the first three are 1D radiation shock tube problems (cases 3, 4 and 5 as described in section 4.1), and the last is the 3D Bondi accretion test from section 4.6. The first column identifies specific tests, the second column indicates how many processors were used, the third column shows how long in program time units the calculations were run, the fourth column records the number of packets active across the grid (averaged over time), and the fifth column is the ratio of the wall clock time required to complete the calculation for IMC relative to M1. All of the M1 calculations for the 1D relativistic shock tests were performed with a single frequency group. For the 3D Bondi test, we included multiple timing comparisons performed with different numbers of M1 frequency groups. The opacity used in the Bondi test is grey (frequency-independent), so the integrated luminosity is independent of the number of bins. By varying the bin number, we are strictly performing a timing comparison when the multi-frequency capabilities of the code are exercised.
An important consideration for the IMC treatment is that the number of packets employed in any given problem and at any given time can be dialed up or down to achieve desired levels of noise. Additionally, the number of active packets changes over the course of the calculation. The average number of active packets that is reported in table 4 is calculated based on the number of active packets at the end of each hydro time step over the course of the calculation. This number generally depends on the problem, and on the time at which we terminate the calculation. For the 1D tests, which used 128 zones, the average number of packets per zone works out to about 41000, 130000, and 77000 for cases 3, 4, and 5, respectively, in direct proportion to the performance metric in table 4. For the 3D Bondi test, which used 72 3 zones, the average number of packets per zone is approximately 380.
We find IMC to be about 1000 times more expensive than the 1D single-group M1 tests, given our packet counts (which we note can be reduced to make IMC more competitive while still maintaining reasonable noise levels). In 3D, the relative performance of the IMC improves significantly due to its intrinsic 3D nature. Depending on the number of M1 frequency groups employed, IMC can actually out-perform M1 on this problem, while maintaining an average of approximately 380 packets per zone. A crucial point here is that, for a fixed level of frequency-integrated sampling noise, the expense of the IMC calculation is effectively insensitive to the number of frequency groups used. If, instead, one wishes to maintain a fixed level of sampling noise in each frequency bin, then the cost of IMC increases linearly with the number of frequency bins, absent further optimizations.
By contrast, the expense scaling of M1 with the number of frequency bins can be super-linear. Figure 10 shows how the time to complete the radiating Bondi calculation depends on the number of frequency groups. At low group-count the scaling is roughly linear. However, as the number of groups increases, the scaling becomes dominated by the expense of solving a nonlinear 5 + 4N dimensional matrix system (combined with the primitive inversion elements discussed in section 2.3, and in more detail in Anninos & Fragile (2020)), where N is the number of frequency groups and the factor of four accounts for energy and flux vector coupling. At 40 groups we observe N 2.5 scaling. With even more groups, we expect this scaling to approach N 3 , the theoretical operational limit of the underlying LU decomposition algorithm used to solve these systems of equations. Wall time versus number of M1 frequency groups for radiating Bondi problem Figure 10. The time required to complete the radiating Bondi problem with M1 transport as a function of frequency bins. All calculations used a spatial resolution of 72 3 zones, and were parallelized across 64 processors with a 4 × 4 × 4 domain decomposition pattern. The 1, 5, 10, and 20 group calculations were run to t = 5000 mass units. The 40 group calculation was only run to t = 1000, and the time to run to t = 5000 was estimated based on its average performance to that point. IMC radiating shock test performance versus optical depth Figure 11. The relative computational expense for the relativistic radiating shock tests as a function of optical depth. These runs are generalizations of cases 4 and 5 from section 4.1, where opacity is the only parameter that is changed between the runs. The optical depth reported on the horizontal axis corresponds to the downstream half of the domain. When the optical depth is high enough such that the average number of packet interactions per time step significantly exceeds unity, the expense scales linearly with optical depth, as indicated by the dashed line. Figure 11 shows how the computational expense of IMC scales with optical depth. The tests used to make this figure are generalizations of the radiation shock tube cases 4 and 5 as described in section 4.1, where only the opacity is changed between calculations. For all the data points shown, the timing measurement is performed once the system has reached steady-state, and with approximately the same number of active packets in all cases. At low enough optical depth, the cost is nearly independent of optical depth because the average number of packet-matter interactions per time step is less than unity, and the packet interactions are the most computationally expensive part of the algorithm. When the optical depth is increased to where packets interact many times per time step, the computational expense approaches a linear scaling as indicated by the dashed line. The fact that computational expense increases for problems with higher optical depths is a well-known feature of a straightforward IMC solution, although this effect may be mitigated with specialized techniques as we discuss in the conclusion.

CONCLUSIONS
We have expanded the modeling capabilities of our radiation-magnetohydrodynamics code Cosmos++ to include implicit Monte Carlo radiation transport that is valid to all orders of (v/c), and equally applicable to Newtonian, special relativistic, and general relativistic problems. This latest capability represents a major improvement in fidelity over our previous multi-group transport implementation based on an Eulerian 2-moment (M1) closure method. The two approaches (M1 versus IMC) offer uniquely distinct advantages and disadvantages, so having the ability to choose one over the other is an important feature of Cosmos++. Monte Carlo methods, for example, can be prohibitively expensive at low noise levels, else suffer the consequences of random noise fluctuations. But they generally treat radiation-matter interactions (e.g., Comptonization) with greater accuracy and spectral fidelity. Zero or first moment multi-group methods, on the other hand, can be significantly faster without the worry of noise fluctuations when a moderate number of frequency bins will do. But they suffer in the treatment and accuracy of radiation-matter coupling and anisotropic stress energy closures. For problems where both spectral fidelity and computational cost are valued, one can imagine performing a self-consistent RMHD calculation with M1 closure to get reasonable energy balances, then running the Monte Carlo as a post-processor to resolve the spectrum, an option made available in our implementation.
It is well-known that Monte Carlo techniques are not well-suited for problems in the strong diffusion regime, becoming inefficient when the algorithms struggle to resolve an increasing number of scattering events at very short spatiotemporal scales. For explosive systems, this can be dealt with by using the diffusion approximation until the system relaxes to reasonable optical depths. Supernova calculations, for example, activate transport a day or two following the explosion when the optical depth drops below ∼ 10 4 (Lucy 2005;Kasen et al. 2006). In other cases, such as black hole accretion disks, the optical depth is a fixture in the equilibrium structure of the disk, in which case the diffusion limit has to be dealt with in a more rigorous manner. We propose to do that in future work by incorporating random walk algorithms along the lines of (Fleck & Canfield 1984) and its relativistic extension (Richers et al. 2017), Implicit Monte Carlo Diffusion (Gentile 2001), or Discrete Diffusion Monte Carlo (Densmore et al. 2007).