Learning Closure Relations using Differentiable Programming: An Example in Radiation Transport

The continuous flow or ‘transport’ of a macroscopic system of particles is a high dimensional problem and therefore often solved using reduced order models. This necessarily introduces unknown closure relations into these models. In this work, we present a machine learning approach to finding accurate closure relations utilising differentiable programming. As a case study, we consider the transport of photons and use a literature radiation transport test problem as a training dataset. We present novel ML closures for a number of reduced order models which out-perform their literature counterparts in both trained and unseen problems.


Introduction
Reduced order models provide a method to model physical systems efficiently but approximately when more accurate models may prove prohibitively expensive.These models often have 'closure relations', where in order to close the system of equations, one needs to encode higher order behaviour in a function of the variables which are being solved for.The primary objective of this work is to utilise differentiable programming to provide a proof of principle machine learning (ML) method to the task of deriving equation closures.By using a physics-driven model enhanced with ML we can obtain better performance than both pure data-driven and traditional numerical physics techniques.This methodology has cross-disciplinary application as the closure problem is ubiquitous in the physical sciences.In our example, we will show that state-of-the-art reduced order models from the literature can be improved in accuracy by an order of magnitude using ML closures.
Machine learning is apt in the case of finding closures as these act as unknowns in otherwise physically motivated reduced models.We will investigate ML closures in reduced order solutions to a particular analytic result in radiation transport from Su and Olson [1].Radiation transport is an important phenomenon in many fields of physics including astrophysics, solar physics, plasma physics and nuclear fusion.The Su & Olson problem considers a homogeneous medium which isotropically absorbs and emits radiation.Embedded within this medium is a transient and localised radiation source, Q.The solution describes how the radiation, W , and material, V , energy densities evolve over space, x, and time, τ .Further details can be found in the appendix.The data set considered is small within a ML context and therefore we need strong inductive bias in the models we train.These constraints will come from radiation transport theory and will improve the ML closures' ability to model unseen problems.

Learning closure relations using differentiable programming
The problem of learning closure relations can be specified as an optimisation problem.We have data we wish to match e.g. an analytical solution or experimental data.We also have a model with a number of trainable parameters, i.e. our reduced order system of equations, and a parametrised closure relation.Finally, we must define a loss between the reduced order model and analytic solution.
With these three components, one can utilise differentiable programming to compute gradients of the 'loss' with respect to our closure parameters.This involves back-propagating through the discretised form of the PDE model solution [2], which amounts to solving the adjoint equation of the PDE.As a result, we obtain an end-to-end differentiable architecture that can be used to learn closure relations, thanks to automatic gradient calculations.In this work, we utilise the ML framework JAX [3] and libraries optax [4] and diffrax [2].These libraries perform the optimisation task with modern ML techniques, within a very small number of lines of user code.The main challenges lie in creating a numerically-stable spatially-discretised version of the PDEs and specifying an effective parametrisation of the closure which enforces the physical constraints.

Variable Eddington factor closure, p(f )
The first reduced order model we will consider is variable Eddington factor (VEF): These equations appear in the radiation transport literature with various proposed closure relations p(f ), e.g. the Kershaw [5] and P1 [6] closures.However, the literature closures cannot reproduce the Su Olson problem, as shown in Fig. 1 (a).We will use our ML approach to find an optimal closure and improve the VEF equations' prediction.
For optimisation purposes, we define the best solution as one which minimises the squared errors between the model solution and the Su Olson radiation energy density, W , giving the following loss function: We also restrict to the time points for which the external source, Q, is on, τ ≤ 10.For these times, transport behaviour beyond diffusion is most significant, growing to a peak around τ = 10.This also allows a smaller domain size and therefore fewer numerical time steps.The correct late time diffusion behaviour, F → −(1/3)(∂W/∂x), will be enforced by constraints on the closures.Shortening the training time further to exclude the τ = 10 data caused a precipitous drop in accuracy.Secondly, we must define a suitable parameterisation of the closure which respects the radiation transport limiting behaviour.This is done through a constrained Padé approximant: where the a n and b m are the trainable parameters we will optimise.The constrained Padé approximant form was chosen as it required a low number of trainable parameters, allowed easy implementation of physical constraints and is a more flexible function approximator than a truncated power series.
Finally, our end-to-end differentiable PDE model produces AD gradients which can be utilised by mainstream optimisation methods.The small amount of data also means that no batching is required, thus every space and time point is evaluated for each new updated parameter set.We solve the transport equations explicitly using literature, finite differencing methods; for details see the code in the supplementary material or code repository.The explicit time stepping means the speed of light restricts the numerical CFL time step.For a spatially resolution dx, N t = 10 Cdx time steps are needed to reach τ = 10, where C is the Courant number.For our training we chose C = 0.1 and dx = 0.015, thus N t ∼ 7000 time steps of our reduced models must be taken in order to evaluate the loss function once.This highlights the need for differentiable programming as having efficient access to accurate gradient information will greatly reduced the number of loss function evaluations needed.For optimisation, we use the Adam optimiser with a decaying, piecewise-constant learning rate.The results from the optimal closure found using this ML method described above are shown in Fig. 1 (a).
These results point to an issue with VEF methods in this case.Indeed, in the Su Olson problem it is known that p is less than 1/3 at the origin [6] (where f = 0 by symmetry).Improved agreement could be found by allowing p(f = 0) to deviate away from 1/3.This however removes the models ability to model the diffusive limit, occurring when f → 0. These deficiencies point to the fact we are exiting the models range of validity and the accuracy of the model is no longer limited by the closure.

Third angular moment closure, g(p)
The third angular moment equations extend the VEF equations with the following additional equation: The third angular moment equations are a much less common approach than the VEF equations and no mainstream literature closures exist.Therefore, the presented ML methodology is vital in deriving a viable third order moment model.From its definition, we can enforce the following restrictions on the closure, g(0) = 0, g(1) = 1 and dg/dp(p = 1) = 1, using a constrained Padé approximant.Then we use the same optimisation method as for the VEF results to find the optimal values of a n and b m for the novel g(p) closure to reproduce the Su Olson solution.The results are summarised in Fig. 1 (b).As shown, the agreement between the model and analytic solution is improved for this higher angular moment scheme and ML closure.

Consequences for flux limited diffusion
Given the results from the higher order schemes, it is obligatory to investigate the consequences for arguably the most popular closure model -Flux Limited Diffusion [6] (FLD).The FLD model resembles the VEF model (Eq.( 1)) but with the flux, F , approximated using the following equation: where λ is known as the flux limiter; when λ → 1/3 pure diffusion is obtained.Various literature closures for λ exist, such as Larsen [7], but as with the VEF equations no single FLD closure will be able to capture the required transport behaviour in the Su & Olson problem.Instead, we propose a novel higher order flux limited diffusion (HFLD) model using the third order moment equations, but with the radiative flux PDE replaced with Eq. ( 5) and use the following flux limiter: This HFLD model is enabled by the closed form of the third order moment model we were able to obtain using ML methods.We have also introduced a new flux limiter closure term, h(p).From the literature, Levermore [8] proposes h(p) = p but then uses other literature VEF closures to relate p to R rather than explicitly solving Eq. ( 4) for p.For our ML model, we again choose h(p) to have a constrained Padè form, with h(0) = 0, h(1) = 1 and h(1/3) = 1/3 to ensure the correct limiting behaviours.We also enforce h ′ (0) > 0 to (non-rigorously) prevent negative values which would allow complex λ values.Training of the HFLD h(p) closure using the Su Olson results proceeds in a similar fashion to both the VEF and third order moment models, albeit with a more restrictive CFL condition on the numerical time step.

Application of the ML-closures in a modified problem
As the ML-closures discussed above are specifically trained on the Su-Olson problem, one might question their generic applicability in other problems.Here, we consider a modification to the Su-Olson test problem in which there is an additional reflective symmetry about x = 2.This introduces another region of zero flux but from symmetric converging flow (p > 1/3) rather than symmetric diverging flow (p < 1/3) -this acts to test a closure's ability to capture anisotropy in two 'fluxless' configurations, one of which is absent from the training data set.We also numerically solved the full transport problem using discrete ordinates [6], S N , with ordinate order N = 32.We compare the reduced order models and S 32 results in Fig. 2 (a) & (b), and the reduced model losses in Fig. 2 (c).
We see that the g-ML model generalises well as the moment equations allow the radiation pressure to respond to flux gradients.The VEF-ML model struggles in both 'fluxless' (f = 0) regions as the requirement needed for capturing diffusive behaviour (p(f → 0) → 1/3) creates large inaccuracy, as discussed in Section 2.1.For HFLD, the novel ML h(p) closure captures the source regions correctly in both problems.It performs similarly to the HFLD Levermore closure (h(p) = p) in the converging flow region (about x = 2) of the modified problem.This is a strong indication that the ML closure has not been over-fit to the original Su Olson training data.It was found that the diffusive limit, h(1/3) = 1/3, constraint is particularly important in preventing over-fitting.

Discussion
In this work, we describe a methodology for using accurate data to learn closure relations in reduced order models using differentiable programming.We use the example of the Su Olson radiation transport problem [1] to find novel closures to the variable Eddington factor, third order angular moment and flux limited diffusion models.All machine learnt models exhibit a lower mean-squared distance to the Su Olson analytic results than their respective literature counterparts, see Fig. 2 (c).The performance of trained ML models did not suffer when a modification was made to the Su Olson problem -still outperforming literature counterparts on an unseen problem.The third order angular moment performed best, proving the most consistent across problem types.The higher order flux-limited diffusion models still proved an improvement.This shows that the ML methods presented can produce more accurate closures as part of reduced models which respect physical laws and generalise to unseen problems.In the case of VEF after the ML closure is trained, we find the accuracy is not limited by the closure but instead by the assumptions of the reduced model itself.
Closure problems are ubiquitous in the physical sciences and machine learning methods have been employed to address the closure problem in the literature.Trained neural networks have proven successful closure models, as long as physical constraints are addressed.Huang et al. directly trained the network to match the higher order solution's closure term [9,10] rather than training to match the dynamics at a reduced order.Han et al. [11] took a similar approach but found that an end-to-end learning approach was beneficial.In contrast, we learnt the closures in an end-to-end approach using only the dynamics of the lowest order terms in the loss, i.e. the energy density, W .We chose not to use neural networks to specify the closure for two reasons.Firstly, enforcing hard inductive biases on the closure is important for ensuring accurate physical behaviour.Including these biases in a loss term would not strictly enforce these.While neural network architecture choices could be used to include these biases, we opted for using a simpler parametrisation of the closures.Secondly, a low number of trainable parameters was preferred due to the sparsity of training data, therefore favouring our Padé approximant approach.With more training data, a neural network could be used within the differentiable programming framework described here.
In this work, we trained closures with PDE solutions using selected numerical methods.A different choice of methods might alter the numerical solutions obtained from the same underlying PDE, affecting the optimal parameter values.However, with sufficiently accurate methods the dynamics of the numerical solution are the same.Therefore, the optimal closure relation parameters found using the differentiable programming methodology are applicable if a different numerical method is utilised, up to fine-tuning.Finally, it should be noted that the closure relation problem is not unique to radiation transport and therefore a similar approach could be considered in other transport problems throughout the physical sciences.

Figure 1 :
Figure 1: A comparison of energy densities (W ) calculated with various reduced models and Su & Olson's analytic results.These are shown as function of position (x) at various times (τ = 0.1, 0.316, 1.0, 3.16, 10.0), note that W monotonically increases in time.

Figure 2 :
Figure 2: (a) & (b) The energy density in the modified Su Olson problem.The numerical results from the reduced order equations and discrete ordinates are shown at various times (τ = 1.5, 4.0, 10.0).(c) Plot of the loss normalised to the pure diffusion result for the various models considered in the original (SO) and modified (MSO) Su Olson problem.The loss for the modified Su Olson problem used the S 32 results as the ground truth values.The models introduced in this work have their labels highlighted in red.