Use of Bayesian Optimization to Understand the Structure of Nuclei

Monte Carlo simulations are widely used in nuclear physics to model experimental systems. In cases where there are significant unknown quantities, such as energies of states, an iterative process of simulating and fitting is often required to describe experimental data. We describe a Bayesian approach to fitting experimental data, designed for data from a $^{12}$Be(d,p) reaction measurement, using simulations made with GEANT4. Q-values from the $^{12}$C(d,p) reaction to well-known states in $^{13}$C are compared with simulations using BayesOpt. The energies of the states were not included in the simulation to reproduce the situation for $^{13}$Be where the states are poorly known. Both cases had low statistics and significant resolution broadening owing to large proton energy losses in the solid deuterium target. Excitation energies of the lowest three excited states in $^{13}$C were extracted to better than 90 keV, paving a way for extracting information on $^{13}$Be.


Introduction
One of the main goals in nuclear physics is to expand the limits of observation of nuclear structure through reactions involving exotic nuclei. Direct reactions, such as the (d,p) one-neutron transfer reaction, have been used extensively to study the structure of nuclei. With the limited beam time available at radioactive ion beam facilities, and the associated cost of running these experiments, it is incumbent on experimentalists to extract the maximum information from the data obtained. There is also a need to account for sources of background and assess uncertainties in experimental data. Simulations can be used to understand the experimental resolution, which is commonly a combination of interactions of the beam and final-state particles with both the target and other materials such as detectors. They can also account for the effects of incomplete acceptances, which can complicate data analyses.
The GEANT4 toolkit is commonly used to simulate the passage of particles through matter by taking into account both electromagnetic and nuclear processes using Monte Carlo (MC) methods [1]. GEANT4 pro-vides a general framework for MC simulations of particles with support for detector construction, particle transportation, source generation, and particle detection. These simulations allow the user to understand the detector response for their specific conditions and reactions for a given experiment. This paper reports on the use of GEANT4 in combination with Bayesian Optimization for understanding reaction data and thereby extracting nuclear structure information. A study using the 12 C(d,p) 13 C transfer reaction was performed immediately before the experiment to study the structure of the unbound nucleus 13 Be via the 12 Be(d, p) reaction and was used to check the detectors and provide calibration points. As 12 Be has a half life of 21 ms [2], the experiment necessarily was run in inverse kinematics, with a beam of 12 Be and a solid deuterium target, as discussed in detail below in Section 2. The 12 C(d,p) reaction was run in inverse kinematics to mimic the reaction with the radioactive ion beam, and both had much lower statistics than usually obtained with stable ion beams. The 12 C beam had to be reduced in intensity to avoid damaging the forward angle silicon detectors. The nucleus 13 C has a very well-known structure and provides an excellent benchmark for testing these methods. The first four states in 13 C have been observed several times through 12 C(d, p) 13 C transfer reactions with excitation energies of: 0, 3.089, 3.685, and 3.854 MeV [2]. Here we describe the benchmarking of a GEANT4 simulation of the experimental setup with Bayesian Optimization (discussed in Section 3) using experimental data to extract the underlying nuclear structure.

Experiment and Simulation Description
The 12 C(d, p) reaction was performed using the IRIS facility at the ISAC II experimental area of TRIUMF. IRIS was designed for the study of nucleon transfer reactions and inelastic scattering of exotic nuclei in inverse kinematics [3]. The central component of the IRIS facility is a thin, solid hydrogen, or deuterium, target, which avoids the significant amounts of carbon present in polyethylene foils (CH 2 ) n and (CD 2 ) n , which are commonly used in direct reactions measurements. The carbon in CD 2 targets can create fusion evaporation background and is a large source of energy loss and thereby uncertainty in energy. To create the solid deuterium target, a thin Ag foil (4.64 µm) is cooled via a helium compressor to a temperature around ∼4K. The target gas (hydrogen or deuterium) is then sprayed onto the foil where the gas solidifies. The thickness of the target is controlled by adjusting the amount of gas sprayed on the foil. The IRIS facility incorporates a suite of detectors upstream and downstream from the cryogenic target. The specific configuration used in this experiment included an ionization chamber (IC), upstream of the target, filled with 19.4 Torr of isobutane (C 4 H 10 ) to measure the energy losses of beam particles, providing beam particle identification. The entrance and exit windows of the IC were 30 µm and 50 µm thick silicon nitride (SiN 3 ), respectively. Annular silicon detectors (MICRON Semiconductor YY1 type) faced the target from the upstream and downstream sides and measured the angles and energies of light reaction particles. The YY1 detectors have 8 azimuthal detector sectors where each detector is segmented into 16 rings. The upstream YY1 detector, with a thickness of 500 µm, was placed 80.8 mm from the target, while the downstream YY1 detector with the 100 µm thickness was placed 86 mm from the target. The downstream YY1 detector was backed by a 12 mm-thick cesium-iodide (CsI) scintillator forming a telescope for reaction particles emitted in the forward direction. The telescope provides dE-E particle identification for particles that do not stop in the silicon detector. Located 600 mm and 690 mm downstream from the target were two smaller annular silicon detectors (MI-CRON S3 type) used for dE-E particle identification of the heavy recoil. The transmission S3 detector (dE) had a thickness of 61 µm, the S3 stopping (E) detector, further downstream, had a thickness of 500 µm. The IRIS facility is described in more detail in Ref [3].
A 111.4 ± 2.2 MeV beam of 12 C at a rate of 1.5 x 10 3 pps impinged on the solid deuterium target, which had an average thickness of 56 µm.
The experimental setup was reproduced in GEANT4 as shown in Figure 1. The beam energy reproduction was tested in the simulation by comparison with data without the solid deuterium target. A short run with the 12 C beam and no target was performed, where only the silver foil was in the target location. The total energy of the 12 C ions was measured in the S3 telescope. The comparison of the data from the S3 detectors and the simulation is shown in Figure 2. The simulation reproduces the peak energy seen in the S3 telescope. The low-energy tail below the 102-MeV peak was due to incomplete charge collection in the silicon detector as also observed in alpha source calibrations.
Only the upstream YY1 detector was considered for the measurement of the (d,p) reaction. As the light reaction products from (d,d) and (d,t) reactions in inverse kinematics are constrained to the forward hemisphere in the laboratory frame, these reactions cannot be measured in the upstream YY1 detector. The only direct reaction products that can be detected at backward angles in this experiment are protons from the (d,p) reaction. Based on alpha calibrations, these detectors had an intrinsic FWHM resolution of 35 keV, which was implemented in the GEANT4 simulation. The energy and angle of the proton, as measured in the YY1 detector, were used to reconstruct the Q value of the reaction. Internal GEANT4 energy loss tables for the proton and the 12 C beam in the solid deuterium target were extracted and used to calculate energy losses. The Q value was reconstructed from the detected energies, assuming the reaction occurred at the center of the target. For consistency in the comparison of the data and simulation, the analysis of the experimental data also used the internal GEANT4 energy-loss tables for reconstructing the reaction Q-value.
The experimental resolution, which is dominated by the intrinsic detector resolutions, the energy straggling of the beam, and the energy straggling of the emergent light ion, is sometimes assumed to be Gaussian, and is included in fits as such. This approach does not work in cases with low beam energy and energy-dependent resolutions, as demonstrated in the example discussed below and shown in Figure 3. The 12 C(d,p) reaction was simulated for a hypothetical state at an excitation energy of the Q-value spectrum from a similar simulation with a beam energy of 80 MeV, where the average proton energy of the most backward ring in the YY1 detector is 1.25 MeV. At lower proton energies, the resolution of the detected proton worsens owing to the larger energy loss, as shown in Figure 3 as the blue histogram is much wider than the red histogram. The second effect occurs as the total energy and proton energy decrease, the detector resolution when extracting the Q value changes from a Gaussian distribution to a non-Gaussian distribution. The Q value for these lower energies becomes skewed towards higher Q values, that is high proton energies at the center of the target. This effect comes from protons generated at the furthest downstream position in the target having such a low energy that they either do not make it out of the target, or do not have enough energy when they hit the detector to make it above the noise threshold. The proton energy at the center of the target is reconstructed by adding on the energy losses in the target and the dead layer of the detector. The Q value in the case of a (d,p) reaction is subsequently found using equation 1.
(1) where M b , M p , and M r are the masses of the beam, emergent proton, and heavy recoil respectively, E b is the beam energy at the center of the target, E p is the proton energy at the center of the target, and θ is the angle of the light recoil particle. The analysis of the simulated data is performed in the same way as for the experimental data to allow direct comparisons.

Bayesian Optimization
Bayesian Optimization (BayesOpt) is a sequential optimization strategy for finding the global maximum of black-box functions, known as objective functions [4,5,6]. A benefit of using BayesOpt is that it is not necessary to know the derivative of the function that is being maximized. BayesOpt adopts a sequential approach where all previous knowledge about the function f (x) is used for selecting data points creating a convergence towards the global maximum. First, a surrogate model of the function to be maximized is built, which is updated as new data points are evaluated, while also suggesting the next evaluation point. The most popular surrogate model used in BayesOpt is the Gaussian process (GP).
GP uses the multivariate Gaussian distribution over the previous evaluated data points, described by a mean function µ(x) = E[ f (x)] and a covariance function, or kernel, The kernel used in this study was the Matérn kernel, whose form and details can be found in Ref. [7]. The second main component of BayesOpt is the acquisition function. This is calculated using the GP and implements trade-offs between exploration of the parameter space and exploitation in BayesOpt. The acquisition function optimizes the search for the maximum, while exploring regions where the GP is more uncertain. Two com-mon traditional types of acquisition functions are expected improvement (EI) and upper confidence-bound (UCB). The EI acquisition function measures the expectation of improvement in the objective function based on the predicted distribution of the GP. When exploring the parameter in EI, points associated with high uncertainty are more likely to be chosen, while during exploitation, parameters with high values of the mean are selected. The next point to evaluate is chosen by x n+1 = max x∈X (EI(x)) using x + , the best parameter found so far. EI(x) is defined as: where f (x + ) has the highest value and is thereby the best observed value of the objective function, µ(x) and σ(x) are the mean and standard deviation of the GP and φ(Z) and Φ(Z) are the probability and cumulative normal distributions where The UCB acquisition function is defined as: where ξ ≥ 0 controls the balance between exploration of the parameter space (ξ ∼ 1) and exploitation (ξ ∼ 0). This acquisition function can be described as the maximum value across all solutions of the weighted sum of the mean of the GP and the standard deviation of the GP. The next point to evaluate is chosen by the maximum of the acquisition function, x n+1 = max x∈X (UCB(x)), in a similar fashion to the EI acquisition function. BayesOpt has the benefit of reducing the amount of parameter space searched to find the maximum compared to other methods, such as a basic grid search, or a random search. Since past evaluations are taken into account, BayesOpt tries to focus on the parameter space area where the maximum of the black-box function occurs while not wasting much time in those parameter space areas that are not solutions. This greatly reduces the time required to fit complex functions, such as the GEANT4 simulations in this study. An example that demonstrates this iterative process with BayesOpt for the function f (x) = (1/(x 2 + 1) + e −(x−4) 2 /2 ) sin(x) is shown in shaded regions. An acquisition function using the upper confidence bound is shown as the solid red curve while the next point selected for evaluation is displayed as a yellow star on the acquisition function curve. As the number of evaluations increases, the GP quickly improves its description of the function and hones in on the global maximum. A basic BayesOpt pseudo-code is found in Algorithm 1.
Bayesian Optimization has been used extensively in recent years for hyper-parameter tuning in machine learning models [6,8,9], and also in a wide range of fields for scientific studies such as choosing experimental techniques in drug discovery [10], improving quantum annealing [11] and tuning free-electron lasers [12]. An example from nuclear structure theory is given in a work by Ekström et al [13] where it has been used to constrain the coupling constants in chiral effective field theory descriptions of the strong interaction. In a recent work [14], a Bayesian Analysis was used to fit the angular distributions from a transfer reaction and extract spectroscopic factors.
In the current work, Bayesian methods incorporating GEANT4 simulations and experimental data are used to extract energies and intensities of states populated in a transfer reaction. For this work, we have chosen to use BayesOpt and the Python package BayesianOptimization for the BayesOpt algorithm [15].

Results
The 12 C(d,p) Q-value spectrum was produced both from the experimental data and in the GEANT4 simulation, assuming the ground state and three excited states were populated. The GEANT4 energy loss tables were used to calculate the energy of the proton and the 12 C at the center of the target for both the experimental and the simulated data. The experimental data were used in the Bayesian Optimization for comparison and to provide the best fit. A recent paper [14] shows how Bayesian optimization can be used with GEANT4 simulations to fit angular distributions from low-energy direct reactions. We performed BayesOpt for two different acquisition functions: EI and UCB, shown in Equations 2 and 4, respectively. The inverse of χ 2 between the experimental Q-value and that from the simulation was maximized in order to find the best (lowest) χ 2 fit to the data. The lowest value of χ 2 , χ 2 min , was stored and used for the extraction of error bars. For each state, the energy was changed by a step, dE, and a Bayesian optimization was performed allowing the other states to vary until a best fit was found. The dE that produced χ 2 = χ 2 min + 1 was used to define 1σ errors.
To smooth out fluctuations from the simulation, 200,000 events were simulated for each iteration step of the BayesOpt process. Since only the upstream YY1 detector was used to generate the Q-value spectrum, the simulation was set to only populate light ions uniformly in the laboratory angular range of 150 • -175 • . This allows for a large increase in efficiency of the simulation as ∼35% of the simulated events are measured in the YY1 detector. Before the χ 2 was calculated, the simulated data was scaled to the intensity in the experimental Q-value plot using a χ 2 fit. The results using the EI acquisition function are shown in Figure 5(a) and using the UCB acquisition function in Figure 5(b). For the BayesOpt process for both acquisition functions, the number of random iterations taken was set to 300. After these random iterations, the exploitation phase of BayesOpt was initiated for a total of 300 more iterations. The excitation energies of four states as a function of iteration number are plotted in Figure 5 along with the known values, plotted as dashed vertical lines. The value of χ 2 /N is denoted by color going from red (low χ 2 /N for a good fit) to blue (high χ 2 /N for a bad fit). During the exploration phase for both acquisition functions, the χ 2 /N remains mostly above 10. When the GP switches over to the exploitation function, the χ 2 /N remains below 15 for both acquisition functions. The line between the random exploration phase and the exploitation phase is clearly visible by an almost dis- crete change in color from blue to red. The UCB method was more successful at converging quickly than the EI method, as illustrated by the number of low (red) χ 2 /N points starting at 300 iterations.
The Q-value spectrum from the best fit is shown in Figure 6. The excitation energies from the Bayesian optimization are 3.126, 3.713, and 3.894 MeV. As the excited states of 13 C are well known, it is possible to make meaningful comparisons, as shown in Table 1. There is good agreement between the fitted and known energies within error bars that are all less than 90 keV. This best fit from BayesOpt has a χ 2 /N of 2.4.

Conclusions
The GEANT4 toolkit with Bayesian Optimization study has been completed for the test run of the 12 C(d, p) reaction. The results obtained from BayesOpt agree with the known states in 13 C, within error bars, and all within 90 keV. The structure of 13    region below 5 MeV in excitation, providing a suitable benchmark for the 12 Be(d,p) experiment that followed the run with 12 C and populated poorly-known states in 13 Be. Other techniques, such as Markov Chain Monte Carlo (MCMC), provide alternate Bayesian methods to analyze the spectrum using GEANT4. The MCMC method would require significant numbers of walkers and steps per walker to fit to the data, resulting in 1,000's to 10,000's of individual simulations. In this work only 600 simulations were required. BayesOpt has the benefit of reducing computation time extensively for these measurements when compared to other methods such as Markov Chain Monte Carlo and can be used to extract spectroscopic information from experimental data relevant to nuclear structure studies.