TRINITY: a three-dimensional time-dependent radiative transfer code for in-vivo near-infrared imaging

We develop a new three-dimensional time-dependent radiative transfer code, TRINITY (Time-dependent Radiative transfer In Near-Infrared TomographY), for in-vivo diffuse optical tomography (DOT). The simulation code is based on the design of long radiation rays connecting boundaries of a computational domain, which allows us to calculate light propagation with little numerical diffusion. We parallelize the code with Message Passing Interface (MPI) using the domain decomposition technique and confirm the high parallelization efficiency, so that simulations with a spatial resolution of $\sim 1$ millimeter can be performed in practical time. As a first application, we study the light propagation for a pulse collimated within $\theta \sim 15^\circ$ in a phantom, which is a uniform medium made of polyurethane mimicking biological tissue. We show that the pulse spreads in all forward directions over $\sim$ a few millimeters due to the multiple scattering process of photons. Our simulations successfully reproduce the time-resolved signals measured with eight detectors for the phantom. We also introduce the effects of reflection and refraction at the boundary of medium with a different refractive index and demonstrate the faster propagation of photons in an air hole that is an analogue for the respiratory tract.

so that simulations with a spatial resolution of ∼ 1 millimeter can be performed in practical time. As a first application, we study the light propagation for a pulse collimated within θ ∼ 15 • in a phantom, which is a uniform medium made of polyurethane mimicking biological tissue. We show that the pulse spreads in all forward directions over ∼ a few millimeters due to the multiple scattering process of photons. Our simulations successfully reproduce the time-resolved signals measured with eight detectors for the phantom. We also introduce the effects of reflection and refraction at the boundary of medium with a different refractive index and demonstrate the faster propagation of photons in an air

Introduction
Diffuse Optical Tomography (DOT) has been proposed as a powerful tool to diagnose biological tissue, which utilizes near-infrared light at the wavelengths of ∼ 700−1000 nanometer (nm) that can permeate biological tissues with the depth of ∼ 3 − 4 centimeter (cm) [1,2,3]. In the DOT, the propagation of photons in biological tissues is regulated by hemoglobin (Hb), water, and mitochondria. At a given wavelength in the near-infrared range, the different tissues have different optical properties like absorption and scattering coefficients, refractive index and g-factor. Therefore, by measuring the photon signals escaped from surfaces, we can derive spatial information of absorption and scattering coefficients. The information allows us to construct the spatial map for Hb concentrations, tissue oxygen saturation (StO 2 ), and blood flow. The states of them provide the diagnosis of various diseases and health conditions. For example, higher Hb concentrations and lower StO 2 can be tracers of malignancy [4].
Besides, the recent development of the facility for the DOT has allowed us to detect the signal with the high time resolution, ∼ picoseconds [5]. Multiple scattering processes of photons lead to long travels in biological tissues.
Photons passing through deeper inside take longer traveling time. Therefore, the time-dependent radiation transfer calculations are required to reveal the three-dimensional state in biological tissues. Compared to other diagnostics methods, e.g., X-ray CT (Computed Tomography) and MRI (Magnetic Resonance Imaging), the DOT has various advantages: (1) no radiative exposure, (2) non-invasive, and (3) bed-side monitoring. However, the DOT has faced difficulties in the reconstruction of structure, because photons propagate with complex trajectories due to multiple scattering. To reconstruct the structure, numerous forward models should be provided by radiative transfer calculations [6].
However, the radiative transfer in three-dimensional space is a six-dimensional problem, and also the radiative transfer equation (RTE) with scattering is an integro-differential equation that requires iteration to converge the solutions. Therefore, numerical simulations take high computational costs. Hitherto, the effective methodology for time-dependent radiative transfer simulations has not been developed to circumvent high computational costs.
To solve RTE, we calculate intensity that is a distribution function in multidimensional phase space depending on place, direction, and time. Therefore, the accurate calculations of RTE in three-dimensional space are challenging even with current computational facilities. Monte Carlo technique has been introduced as a simple approach to evaluate the radiation field stochastically [7,8,9,10,11]. However, due to the random samplings of angle and propagation distance in this method, photon signals are contaminated with shot noises. This leads to inaccuracy in the reconstruction of biomedical states from the signals.
In order to reduce the calculation amount and memory space, the diffusion equation (DE) has been often solved instead of RTE [12]. The DE determines the propagation of radiation solely by the gradient of radiation energy density.
This approximation is valid only if the radiation fields are close to isotropic distributions due to numerous scatterings. However, calculations based on the DE cannot keep high accuracy near surfaces or in abnormal tissues, because the radiation fields are likely to be anisotropic there. Moreover, it is difficult for the DE to take into account the refraction and reflection at a boundary between media with different refractive indices. To consider the anisotropy of radiation, the approximations based on spherical harmonics with higher orders have been applied for tissue optics [13,14]. Klose et al. (2006) [15] demonstrated that a simple spherical harmonics method could follow a light propagation accurately with less expensive calculation. However, as the anisotropy of the radiation field becomes higher, these methods need higher orders of the spherical harmonics, resulting in expensive calculations. In addition, the hybrid method combining the DE and the RTE has been considered [16,17]. Tarvainen et al. (2005) [16] solved the RTE only for the domain where the DE is not valid and used the DE elsewhere. Using this method, they successfully reduced calculation amounts by a factor of ∼ 2 with keeping the accuracy. However, note that it is difficult to know the anisotropy of the radiation fields without solving the RTE in cases of highly inhomogeneous media. In such a case, the domain where the DE is invalid or a required order for the spherical harmonics is unclear. Therefore, to realize precise reconstruction in general cases, solving RTE is required [18].
Recently, Fujii et al. (2018) [19] performed three-dimensional radiative transfer simulations with open multi-processing (OpenMP) parallelization of a central processing unit (CPU). However, the spatial and the angular resolutions are limited due to the expensive calculations of RTE with a single CPU. To manage the larger computational amounts and memory space, utilizing multiple CPUs via Message Passing Interface (MPI) parallelization is a promising way. In astrophysics, three-dimensional radiative transfer simulations have been performed with multiple CPUs [20,21,22,23,24]. However, in most astrophysical simulations, RTE is solved without the time dependency, since the light crossing time is much shorter than the dynamical time of objects. Moreover, unlike astrophysics, we have to take into account the refraction, reflection, and highly anisotropic scattering in the DOT, which require high angular resolution. Therefore, performing time-dependent radiative transfer simulations with high resolutions for the DOT is a challenging issue. In this paper, we develop a novel radiative transfer code, TRINITY (Time-dependent Radiative transfer In Near-Infrared TomographY) that is accelerated by combining MPI and OpenMP and traces light propagation with little numerical diffusion.

Numerical procedure of solving RTE
We calculate specific intensity I(t, r, Ω) in time domain which is a distribution function in six-dimensional phase-space with time t, space r(x, y, z) and direction Ω(θ, φ). Along a light ray, the intensity is determined by the following RTE [25] 1 c(r) where c(r) is the light speed at position r, µ a (r) and µ s (r) are the absorption and scattering coefficients, p(r, Ω , Ω) is a phase function from incoming direction Ω to outward direction Ω in a scattering process and it is normalized to be unity to the angle integral. The light speed changes with the refractive index n as c = c 0 /n where c 0 is the light speed in vacuum. Most previous studies solving the RTE have discretized the above equation and followed the light propagation by integrating it [26].
In this study, we consider a different form of the RTE. First, we introduce source function defined by If the time derivative term is larger than the space derivative one in the left-hand side of Equation (1), the equation can be approximated to be dI(t, r, Ω) dτ t = −I(t, r, Ω) + S(t, r, Ω), where dτ t ≡ [µ a (r) + µ s (r)] c(r)dt. The formal solution of this equation is On the other hand, if the space derivative term dominates the time derivative one is, Equation (1) can be approximated to be where we introduce the optical depth (τ ) defined by dτ = [µ a (r) + µ s (r)] dr.

The formal solution of this equation is
To incorporate the effect of time derivative into this formal solution, we employ a retarded form as We solve this equation at each time step with TRINITY, which corresponds to the integration of Equation (1). We evaluate the source function in the integration by linear interpolation between two points as where S 0 = S(0, 0, Ω) and S 1 = S(t, τ, Ω). Then, we discretize equation (7) as where I 0 ≡ I(0, 0, Ω).
In TRINITY, we assign light rays independently of spatial grids as shown in Figure 1. This basic idea was proposed in our previous works to calculate radiation transfer in galactic or metagalactic space [21,22,23]. We evaluate I(t, r, Ω) at points where a ray crosses spatial grids by using the information of an up-wind grid as where ∆r and ∆τ are distance and optical depth between the two radiation grids. The absorption, scattering coefficients, and refractive index on a crossing point are interpolated from those on nearest spatial grids. In most previous works solving RTE, a light ray is set at a spatial grid and not connected with a ray of an upwind grid, which is called the short-characteristic method [27] or the finite-element method [25]. This method is subject to considerable numerical diffusion as a result of multiple interpolations as shown in Section 2.2. On the other hand, TRINITY is based on a long ray connecting boundaries of a calculation box. In this scheme, the intensity at a specific spatial grid is evaluated by only one interpolation from nearest rays. Thus, numerical diffusion is substantially reduced.
Short TRINITY Figure 1: Schematic view of TRINITY and short-characteristic method. Spatial grids distribute at the cross points of the lattice. Radiative rays are designed independently to the spatial grids in TRINITY. The intensity at a specific spatial grid is interpolated from a nearest radiative ray.
The phase function for the scattering processes in biological tissue is generally modeled by the Henyey-Greenstein function: where g is an anisotropy factor. For example, g = 0 represents isotropic scattering and g = 1 does complete forward scattering. In cases of biological tissues, g is usually larger than ∼ 0.8 [28,29]. Therefore, the photons propagate with highly forward scattering processes.
For the discretization of angle, we apply HEALPix [30] that was developed for analyzing the cosmic microwave background. HEALPix divides a spherical surface with square pieces of equal area and can change the resolution hierarchically. As fiducial simulations, we settle 3072 angle bins for radiative transfer (RT) simulations. To estimate the source function, we sum up intensities for the discretized angles as where N Ω is the number of angle bins. The calculation amount of evaluating S(t, r, Ω) to all directions increases in proportion to N 2 Ω . Thanks to HEALPix's algorithm, we simply change the level of the hierarchical structure. In this work, we set N Ω = 3072/16 = 192 by coarse-graining of 3072 finer angular levels. So, the mean of I(t, r, Ω) at 16 finer angles is used for the calculation of S.
The evaluation of S(t, r, Ω) requires the information of I(t, r, Ω) at the current time step and vise versa. To obtain both I(t, r, Ω) and S(t, r, Ω) consistently, we iterate RT simulations until both are converged at each time step. In the iterations, we first set the source function at a previous time step S(t − ∆t, r, Ω) as an initial dummy.
At the boundary of media with different refractive indices, we consider refraction and reflection. The refraction angle is estimated based on the Snell's law as where n 1 and n 2 are refractive indices. By using the refraction angle, the reflection rate is calculated by the Fresnel's law, where θ c is the critical angle.

Test calculations with TRINITY
As a first test, we calculate the propagation of a pulse in one-dimensional space without absorption and scattering, i.e., light propagation in a vacuum. Figure 2 shows the spatial distributions of photon energy densities. Here, we inject a pulse with duration of 1.3 × 10 −2 nanosecond (ns) from x = 0 cm.
The calculation box size is 4 cm and the number of spatial grids is set to be 100. The time resolution should be higher than the light crossing time over two grids. We choose the time resolution as 1 10 × the light-crossing time, i.e., ∆t = 1 10 × ∆x c = 1.3 × 10 −4 ns, where ∆x is the length between grids. This is corresponding to a courant number 0.1. In comparison, we also solve Equation (1) by discretizing with the upwind scheme (1st order) and Lax-Wendroff (2nd order ) schemes [31]. As shown in Figure 2, the upwind scheme results in artificial extension of energy distributions due to numerical diffusion according as the pulse propagates. The peak energy decreases as the propagation distance increases and becomes ∼ 50% at x ∼ 3.0 cm. In the case of the Lax-Wendroff, the numerical diffusion is suppressed, but the rectangular shape of the beam is distorted and also oscillations emerge. We find that TRINITY preserves the shape of the pulse. Both numerical diffusion and oscillations are significantly suppressed. Note that, in the case of TRINITY, the data sizes of I and S are larger than those in previous works because of retaining the information of 10 time bins.
Next, we compare TRINITY with the short-characteristic method in the case of propagation of collimated beam in two-dimensional space. Figure  In the case of the short-characteristic method, the beam spreads due to the repeated interpolations from up-wind grids according as it propagates . In contrast, TRINITY preserves the collimated shape. Right panels of Figure 3 represent the intensity distributions along x-axis at specific y positions. In the case of the short-characteristic method, the peak values decrease as the y position increases. At the opposite boundary, the peak value is reduced to 8.5 × 10 −2 from unity as the original value. On the other hand, TRINITY keeps the peak value of ∼ 1 even far from the injected position.
In addition, we demonstrate the effect of the numerical diffusion for a case with an absorbing area. In addition to the above situation, we place a grid with a high absorption coefficient on the way of the beam at x = 1.7 and y = 2.0 cm. Therefore, we suggest that the numerical diffusion can induce fake signal at a detector on surface if there is an abnormal area smaller than the width of the numerical diffusion.
In cases with scattering processes, the difference between the schemes is likely to be smaller. Here, we also investigate the beam propagation in a scattering medium. Figure 5 shows two-dimensional beam tests with the scattering. x = 2.7 cm. In the case of the short-characteristic method, the beam spreads out more, resulting in lower photon absorption. Also, the area behind the absorption is recovered more efficiently because of the numerical diffusion. Thus, the energy density at the shadow area in the short-characteristic method is higher than that in the case of TRINITY.

Parallelization
Three-dimensional numerical simulations of RTE require huge calculation amounts and memory. To overcome the difficulty, we combine MPI and openMP parallelization techniques. To implement MPI, we decompose the calculation domain as shown in Figure 6. Each CPU rank calculates RTE in its own domain.   cores of each CPU. Then, before proceeding to the next time step, each domain communicates with adjacent domains and gets intensities at boundary grids.
As a result of such hybrid parallelization, we can utilize more than 100 CPU cores simultaneously. This allows us to handle direct simulations of RTE in three-dimensional space with more than 100 grids on one side.
We confirm that calculations are accelerated efficiently by increasing the number of CPU core. However, for the cases with a small number of spatial grids (e.g., 40 3 ), the parallelization efficiency decreases even if a lot of CPUs are utilized. This is because the data communication of intensities at boundary grids between the domains becomes a bottleneck. In our simulations, we use 100 3 grids to achieve the resolution of 1 millimeter to parts of a human body with the size of 10 cm.

Calculation steps
We summarize the calculation steps in TRINITY.
Step 1: Assigning light rays connecting from a boundary to another side boundary. The total number of rays is ∼ N ang × N 2 grid , where N ang is the number of angle bins and N grid is the number of spatial grid on one side.
Step 2: Decomposing a calculation box.
Step 3: Solving RTE on each radiative ray.
Step 4: Data communication of intensities at boundaries between adjacent domains.
Step 5: Calculating the source function.
Step 6: Go to step 3) until the source function becomes consistent with that derived from the intensities at the current time step.
Step 7: Go to the next time step if the intensity and source function are consistent.

Application to a phantom
As applications of TRINITY, we model the light propagation in a phantom made of polyurethane which mimics biological tissues. The properties related to RT are summarized in Table 1. In the left panel of Figure 7, the schematic picture of the phantom is shown. The simulation box has the size of (4.0 cm angle of the detectors is also the same. The time resolution is set to be 1 5 × the light-crossing time between grids, i.e., ∆t = 1 5 × ∆x (c/n) = 4.0×10 −4 ns. We pursue the light propagation for 3.5 ns after the pulse is injected. The resultant total number of time steps is 8680. In this simulation, we assume that all photons are absorbed once they reach the boundary, i.e., there is no reflection. In practice, the phantom is covered by black carbon as the photo-absorption media in the experiments compared with our simulations.
The right panel of Figure 7 shows the snapshot of the three-dimensional energy distributions at t = 0.8 ns. Because of the multiple scattering processes, the radiation fields spread over the box and the energy tends to be distributed spherically. Figure 8

Phantom test with an absorber
If there are abnormal parts like cancer or brain ischemia for instance, the density of hemoglobin or the redox state of cytochrome c oxidase in mitochondria can become different from that in normal tissue. Therefore, the absorption coefficient changes locally, resulting in the different signals at the surface. Here, we test the cases with embedded abnormal media of cylinder shape over z-   Table 1: Properties of phantom made by polyurethane: µa and µs are absorption and scattering coefficients, g is an anisotropy factor, and n is the refractive index.  The time-domain signals at the detector D5 are presented in Figure 11. Compared to the signals in the case without an absorber, the signals are weakened as the absorption coefficient becomes larger. However, once the optical depth of the absorber exceeds unity significantly, the reduction rate of the signal becomes almost constant. The signals of the case (c) and (d) are lower than the case without the absorber by a factor of ∼ 2. Note that, the arrival times of the peak signals are the same (t ∼ 1.7 ns) irrespective of the absorption coefficients.
The signal could also change with the position of the absorber. Therefore, that allows us to investigate the properties of an absorber in biological tissue by confronting signals in simulations with those in experiments.
In the above test simulations, we have studied the light propagations in the cases with the cylinder shape absorbers. Here, we additionally investigate the

Phantom test for a medium with a different refractive index
The refractive index of biological tissues is higher than that of the vacuum or air by a factor ∼ 1.5. Therefore, if there is a part including air like a trachea, photons can be scattered at the boundary, based on Equation (14). In addition, the light speed changes depending on the local refractive index. Fujii et al.
(2017) [26] took into account these effects in two-dimensional simulations and showed that organs with different optical properties change the light signals significantly. To perform three-dimensional simulations including these effects, we perform simulations with TRINITY. First, we regard nearest grids to a region with different refractive index as the boundary grids. In calculations of S(t, r, Ω) at the boundary grids, we use p(r, Ω , Ω) taking Equation (13) and (14) into account.
Actually, we calculate the photon propagation in a phantom with a cylinder hole filled with air. The cylinder hole is located at x = 1.6 cm and y = 2.0 cm and the radius is 0.5 cm. Figure 13 presents the energy distributions in x − y plane with z = 2.0 cm at t = 0.5, 0.8, 1.4 and 2.2 ns. At t ∼ 1.4 ns, the wavefront of the energy peak arrives at the hole. Owing to the higher speed of light, the wavefront propagates faster in the hole than other places. After the pulse injection is turned off, the photons are trapped for a while due to the scattering. However, because of no scattering in the hole, the photon energy density decreases rapidly.
We present the time-domain signals at the detector D5 in Figure 14. The arrival time of the energy peak is shifted to ∼ 0.2 ns earlier because of the faster speed of light in the hole. In addition, the energy peak is higher than that without the hole by a factor of ∼ 3 because of no absorption there. Thus, the light signal significantly changes in the medium with the different refractive index. For example, if we consider a neck, the above processes should be taken into account for an accurate diagnosis. Note that, it is difficult to handle the refraction and reflection in DE simulations. Therefore, RTE simulations are required to properly model the light propagation in complicated parts.

Discussion & Summary
Diffuse optical tomography (DOT) is a potential tool to diagnose biological tissues and has several advantages as no exposure, no invasive, and no side effect for instance. In order to reconstruct images of biological tissues, theoretical modeling based on time-dependent RTE is demanded. However, the radiative transfer calculations in three-dimensional space is expensive even with the current computational facilities. Hence, the methodology based on RTE has not been established hitherto. In this study, to realize three-dimensional radiative transfer calculations with high-spatial resolution, we have developed a novel three-dimensional time-dependent radiative transfer code, TRINITY In this study, we have focused on the methodology of time-dependent radiation transfer code and the applications to idealized cases. By combining the radiative transfer simulations and the inverse problem analysis with the machine-learning [32], the structure related to the optical properties in a body can be evaluated. We will investigate the light propagation in a more complicated structure as a human brain or neck that include trachea, blood vessel, spine and spinal cord in future work.