Finding signatures of the nuclear symmetry energy in heavy-ion collisions with deep learning

A deep convolutional neural network (CNN) is developed to study symmetry energy $E_{\rm sym}(\rho)$ effects by learning the mapping between the symmetry energy and the two-dimensional (transverse momentum and rapidity) distributions of protons and neutrons in heavy-ion collisions. Supervised training is performed with a labelled data set from ultrarelativistic quantum molecular dynamics (UrQMD) model simulations. It is found that, with proton spectra on an event-by-event basis as input, the accuracy for classifying the soft and stiff $E_{\rm sym}(\rho)$ is about 60% owing to large event-by-event fluctuations, while with event-summed proton spectra as input, the classification accuracy increases to 98%. The accuracy for the 5-label (5 different $E_{\rm sym}(\rho)$) classification task is about 58% and 72% using proton and neutron spectra, respectively. For the regression task, the mean absolute error (MAE), which measures the average magnitude of the absolute differences between the predicted and actual $L$ (the slope parameter of $E_{\rm sym}(\rho)$), is about 20.4 and 14.8 MeV using proton and neutron spectra, respectively. Fingerprints of the density-dependent nuclear symmetry energy on the transverse momentum and rapidity distributions of protons and neutrons can be identified by a convolutional neural network algorithm.


I. INTRODUCTION
The density-dependent nuclear symmetry energy $E_{\rm sym}(\rho)$ has attracted considerable attention from both the nuclear physics and nuclear astrophysics communities over the last two decades, since knowledge of it is crucial for understanding diverse phenomena observed in rare isotopes, nuclear reactions with exotic nuclei, as well as neutron stars and their mergers [1][2][3][4][5][6][7]. Constraints on $E_{\rm sym}(\rho)$ from various observations in nuclear structure, nuclear reactions, and neutron star properties have been studied. However, the overall picture of the nuclear symmetry energy as a function of density is still unclear, especially above the saturation density ($\rho_0$). Transport model simulations combined with observables in heavy-ion collisions are one of the important ways to constrain the high-density behavior of $E_{\rm sym}(\rho)$. Several sensitive observables have been proposed, but it is still difficult to obtain a tight and consistent constraint on $E_{\rm sym}(\rho)$ at high densities [8][9][10][11][12][13][14].
Deep learning has proved useful for analyzing patterns in complex data in many branches of science, including physics [15][16][17][18][19][20][21][22][23][24][25][26][27]. In heavy-ion physics, neural networks have been used to determine the impact parameter in heavy-ion collisions (HICs) since the 1990s [28][29][30][31][32][33][34]. The convolutional neural network (CNN), which is successful in computer vision, has shown promise in studying quantum chromodynamics (QCD) properties from heavy-ion observables [35][36][37][38][39]. The marriage of heavy-ion physics and deep learning has brought a new and effective paradigm for studying various details of the underlying physics. In this work we attempt to find fingerprints of the density-dependent nuclear symmetry energy in heavy-ion collisions by using a deep learning algorithm. This is a challenging task because the symmetry energy is a sub-leading ingredient of transport models for heavy-ion collisions at intermediate energies. In addition, symmetry energy effects may be further washed out by the stochastic nucleon-nucleon interactions. Deep learning algorithms are known to rely heavily on data for pattern recognition. For our purpose, the data can come either from experiment or from theoretical simulations. As data from theoretical simulations can be well controlled for supervised learning, in this work we apply the ultrarelativistic quantum molecular dynamics (UrQMD) model to generate training data. As a many-body microscopic transport model, UrQMD has been widely employed for investigating HICs from the Fermi energy (40 MeV per nucleon) up to the CERN Large Hadron Collider energies (TeV). With further improvements to several ingredients of the UrQMD simulation, such as the nuclear mean-field potential, the collision term, and the cluster recognition term, many experimental data within a wide energy range can be reproduced [11,12][40][41][42][43].
In the UrQMD model used here, the symmetry potential is derived from the Skyrme potential energy density functional in the same manner as in the improved quantum molecular dynamics (ImQMD) model, see, e.g., Refs. [44,45]. The isoscalar components of the mean-field potential inherit the widely used soft and momentum-dependent version of the potential in QMD-like models [46,47]. As the high-density behavior of the nuclear symmetry energy is still not well constrained, five different Skyrme interactions which yield very different $E_{\rm sym}(\rho)$ are considered in the present work, as plotted in Fig. 1 and listed in Tab. I. The slope of the density-dependent nuclear symmetry energy, defined as $L = 3\rho_0 \left.\frac{\partial E_{\rm sym}(\rho)}{\partial \rho}\right|_{\rho=\rho_0}$, spans the range from 5 MeV up to 160 MeV and thus covers a wide range of the existing constraints on $L$. As the overall contribution of the isovector part of the nuclear potential in HICs is relatively small compared to that of the isoscalar part, its subtle influence on various observables is hard to reveal. Usually, the difference or ratio of observables between isospin partners may provide some hints about the isovector part of the nuclear potential.
For data generation, 800 000 Au+Au collision events with impact parameter $b=5$ fm and beam energy $E_{\rm lab}=0.4$ GeV/nucleon are simulated for each symmetry energy, and the transverse momentum $p_t$ and rapidity $y_0$ of all protons and neutrons are recorded. Owing to initial fluctuations and the random nucleon-nucleon collisions, fluctuations in the rapidity and transverse momentum distributions are very large; consequently, the effects of the symmetry energy on the distributions are hidden. This is illustrated in Fig. 2, where the proton rapidity distributions in $0.10 \le p_t \le 0.15$ GeV/c and $0.20 \le p_t \le 0.25$ GeV/c obtained from calculations with very soft (Skz4) and stiff (SkI1) symmetry energies are compared.
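The definition of the slope parameter $L$ can be checked numerically. The sketch below assumes a purely illustrative power-law form $E_{\rm sym}(\rho) = S_0 (\rho/\rho_0)^{\gamma}$, for which $L = 3 S_0 \gamma$ analytically. This parameterization and the values of `S0`, `GAMMA`, and `RHO0` are assumptions introduced here for illustration; they are not the Skyrme interactions of Tab. I.

```python
RHO0 = 0.16   # saturation density in fm^-3 (assumed illustrative value)
S0 = 32.0     # symmetry energy at rho0 in MeV (assumed illustrative value)
GAMMA = 0.5   # stiffness exponent (assumed illustrative value)


def esym_power_law(rho):
    """Illustrative symmetry energy in MeV (hypothetical form, not a Skyrme functional)."""
    return S0 * (rho / RHO0) ** GAMMA


def slope_parameter(esym, rho0=RHO0, rel_step=1e-4):
    """L = 3 * rho0 * dE_sym/drho at rho0, using a central finite difference."""
    h = rel_step * rho0
    derivative = (esym(rho0 + h) - esym(rho0 - h)) / (2.0 * h)
    return 3.0 * rho0 * derivative


L_numeric = slope_parameter(esym_power_law)
L_analytic = 3.0 * S0 * GAMMA  # exact result for the power-law form
print(f"L (numeric) = {L_numeric:.4f} MeV, L (analytic) = {L_analytic:.4f} MeV")
```

In this toy form a larger `GAMMA` gives a stiffer symmetry energy and a larger $L$, which is the sense in which soft and stiff are used in the text.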
Fluctuations in the distributions can be reduced by combining results from different events, as displayed in the lower panels of Fig. 2. Nevertheless, the differences in the proton distributions obtained from these two symmetry energies are still too faint to be distinguished by conventional analysis. Accordingly, we prepare each input sample by combining proton spectra from 100 UrQMD events, which gives 8 000 samples for each symmetry energy for supervised learning.
FIG. 2. In the upper panels, results from five random events are displayed for each interaction. In the lower panels, results from five random samples (each obtained by combining 100 collision events) are displayed for each interaction.

II. MODEL AND DATA DESCRIPTION
We note here that the training accuracy will increase if we combine more events into an input sample, because the fluctuations are reduced. However, for a fixed number of events, combining more events into each input reduces the number of training samples, which may degrade the performance of the CNN training. Combining 100 events into a sample is a compromise between fluctuations and the number of training samples.
The CNN architecture used in this work is inspired by successful applications in Refs. [32,35,36] and has two convolution layers followed by one fully-connected layer. The input to the CNN is the two-dimensional (transverse momentum and rapidity) distribution of protons, which is a 20×40 matrix: the transverse momentum $p_t$ spans from 0 to 1 GeV/c in 20 bins and the rapidity $y_0$ spans from -2 to 2 in 40 bins. Each convolution layer consists of 128 filters of size 5×5. Batch normalization (only after the first layer), LeakyReLU activation with a slope of 0.1, dropout with a rate of 0.2, and average pooling with a kernel size of 2×2 and a stride of 2 pixels are added between the two layers. We have checked that the accuracy is hardly influenced when varying the above-mentioned parameters or adding more layers; these variations only affect the runtime and the stability of the training process.
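The construction of an event-summed input sample described above can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the function names are hypothetical, each event is assumed to be a list of `(pt, y0)` pairs for one particle species, and the toy events are uniform random numbers standing in for UrQMD output, not physical spectra.

```python
import random

PT_MIN, PT_MAX, N_PT = 0.0, 1.0, 20   # transverse momentum range in GeV/c, 20 bins
Y_MIN, Y_MAX, N_Y = -2.0, 2.0, 40     # rapidity range, 40 bins
EVENTS_PER_SAMPLE = 100               # events combined into one input sample


def empty_spectrum():
    """Return a 20x40 matrix of zero counts (rows: pt bins, columns: y0 bins)."""
    return [[0] * N_Y for _ in range(N_PT)]


def fill_spectrum(spectrum, particles):
    """Add (pt, y0) pairs to the count matrix; out-of-range particles are dropped."""
    for pt, y0 in particles:
        if not (PT_MIN <= pt < PT_MAX and Y_MIN <= y0 < Y_MAX):
            continue
        i = min(int((pt - PT_MIN) / (PT_MAX - PT_MIN) * N_PT), N_PT - 1)
        j = min(int((y0 - Y_MIN) / (Y_MAX - Y_MIN) * N_Y), N_Y - 1)
        spectrum[i][j] += 1
    return spectrum


def event_summed_samples(events, events_per_sample=EVENTS_PER_SAMPLE):
    """Sum the spectra of consecutive groups of events; leftover events are discarded."""
    samples = []
    for start in range(0, len(events) - events_per_sample + 1, events_per_sample):
        spectrum = empty_spectrum()
        for particles in events[start:start + events_per_sample]:
            fill_spectrum(spectrum, particles)
        samples.append(spectrum)
    return samples


# Toy stand-in for simulated events: 300 events with 50 random particles each.
rng = random.Random(0)
toy_events = [
    [(rng.uniform(0.0, 1.0), rng.uniform(-2.0, 2.0)) for _ in range(50)]
    for _ in range(300)
]
samples = event_summed_samples(toy_events)
print(len(samples), len(samples[0]), len(samples[0][0]))
```

With 800 000 events per symmetry energy and 100 events per sample, the same grouping yields the 8 000 samples quoted in the text.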

III. RESULTS AND DISCUSSION
A. Result of two-class classification task
The simulated events for each symmetry energy are randomly divided into three parts: training, validation, and testing sets with the ratio of 60:15:25. The training set is used to adjust the parameters of the CNN. The validation set is used to monitor and avoid overfitting during training by ensuring that the performance on the training and validation sets does not deviate much; otherwise the CNN is overfitting and the training should stop. The testing set is used to evaluate the actual predictive power of the CNN on unseen events. As the CNN has a deep structure and many parameters, in principle the training accuracy can keep increasing to 100%, while the validation and testing accuracy may not always increase unless the CNN learns the underlying relevant rules.
FIG. 4. The accuracy of the two-class classification task. The number in each cell denotes the accuracy for classifying the vertical and horizontal labelled symmetry energies. The statistical error of the accuracy was estimated to be smaller than 2% by comparing parallel testing data, and is therefore negligible.
As displayed in the left panel of Fig. 3, after 50 epochs the training accuracy still increases while the validation and testing accuracy do not, indicating that overfitting is occurring and that the training should cease before epoch 50. The testing accuracy for distinguishing Skz4 and SkI1 is about 0.6 using event-by-event proton spectra. For SV-sym34 and SkI2, the testing accuracy is about 0.5, meaning that the proton spectra obtained from these two symmetry energies are indistinguishable by the machine. The results using event-summed samples (100 event-summed proton spectra) are also displayed in Fig. 3; both testing and validation accuracy are enhanced. The accuracy is about 98% and 63% for classifying Skz4 vs SkI1, and SV-sym34 vs SkI2, respectively.
As one expects, the former has the larger accuracy because of the larger difference in $E_{\rm sym}(\rho)$, as shown in Fig. 1. By combining 100 events, the fluctuations are largely reduced and the tiny differences between samples obtained from different symmetry energies can be partly identified by the machine. In HICs at intermediate energies, there are basically two sources of fluctuations: the initial fluctuation and the dynamical fluctuation (i.e., stochastic particle collisions). We have tried to reduce the initial fluctuation by artificially using the same initial nuclei in every collision event. In that case, the accuracy for classifying Skz4 and SkI1 on the event-by-event basis reaches 70%-90%, depending strongly on the random number generator seed. Fig. 4 displays the accuracy for classifying two different symmetry energies using event-summed proton spectra. The larger the difference in $E_{\rm sym}(\rho)$, the higher the accuracy of the classification. The accuracy for classifying SV-sym34 vs SkI2 is the lowest of all, since their difference in $L$ is the smallest. Generally, the accuracy increases with increasing difference in the slope $L$ between the two symmetry energies, showing that the CNN is capable of revealing fingerprints of the density-dependent nuclear symmetry energy in proton spectra.
FIG. 6. The confusion matrix for the five-class classification task. The number in each off-diagonal cell represents the probability that a sample of the symmetry energy given by the vertical label is misclassified as the symmetry energy given by the horizontal label. The diagonal entries show the fraction of correctly classified testing data. Thus the sum of the numbers in each row is unity. Upper and lower panels denote the results using event-summed proton and neutron spectra, respectively.
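The 60:15:25 partition and the overfitting check used for the two-class task can be sketched as follows. This is a minimal illustration in plain Python, not the code used in this work: the function names, the fixed seed, and the patience-based stopping rule are assumptions introduced here, since the stopping criterion is stated only qualitatively.

```python
import random


def split_indices(n_samples, fractions=(0.60, 0.15, 0.25), seed=0):
    """Shuffle sample indices and cut them into training, validation, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the testing set
    return train, val, test


def should_stop(val_accuracy_history, patience=5):
    """Hypothetical rule: stop once validation accuracy has not improved for `patience` epochs."""
    if len(val_accuracy_history) <= patience:
        return False
    best_before = max(val_accuracy_history[:-patience])
    return max(val_accuracy_history[-patience:]) <= best_before


train, val, test = split_indices(8000)
print(len(train), len(val), len(test))
```

For the 8 000 event-summed samples per symmetry energy, this gives 4 800 training, 1 200 validation, and 2 000 testing samples.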
The accuracy of the five-class classification task is plotted in Fig. 5, where the results using either event-summed proton or neutron spectra are displayed. Using proton spectra, the accuracy is about 58%, which is almost three times that of random guessing (20% for five classes), while the accuracy increases to 72% if the neutron spectra are used. It is understandable that the accuracy with neutron spectra is higher, because neutron-related observables are usually more sensitive to $E_{\rm sym}(\rho)$ than proton-related observables, see, e.g., Refs. [48,49].
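For reference, the overall accuracy of a multi-class classifier and a row-normalized confusion matrix (each row summing to unity) can be computed as in the sketch below. The function names and the toy labels are hypothetical and serve only to illustrate the bookkeeping; they are not results of this work.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Row-normalized confusion matrix: entry [t][p] is the fraction of class t predicted as p."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy example with three classes (made-up labels, for illustration only).
true_toy = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
pred_toy = [0, 0, 0, 1, 1, 1, 0, 2, 2, 2]
matrix = confusion_matrix(true_toy, pred_toy, n_classes=3)
accuracy = overall_accuracy(true_toy, pred_toy)
for row in matrix:
    print(row)
print("accuracy:", accuracy)
```

With equal numbers of samples per class, random guessing among five classes gives an accuracy of 1/5, which is the baseline against which the 58% and 72% values are compared.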
The confusion matrix is a good way to display the performance of a multi-class classification model in making predictions. Fig. 6 shows the confusion matrix for the five-class classification task. The diagonal numbers denote the probability that a symmetry energy is correctly classified; they are the largest in each row, although the probability is very high for some symmetry energies (e.g., SkI1 and Skz4) and very low for others (e.g., SV-sym34 and SLy230a). For example, using event-summed proton spectra, 78% of the SkI1 samples are correctly identified, while 20% of them are misidentified as SkI2 and the remaining 2% are misidentified as SV-sym34. This result is reasonable, as the symmetry energies obtained with SkI1 and SkI2 are close to each other and the probability of confusing them is therefore high. Indeed, the numbers near the diagonal are larger than the others, meaning that the predicted labels are close to the ground truth even when the classifier does not give the correct answer, which indicates that the CNN can indeed capture symmetry energy signals in the spectra.
The slope $L$ is one of the important quantities that characterize the behavior of the density-dependent symmetry energy. The CNN architecture is therefore also adapted to predict the slope $L$ (regression task) instead of classifying symmetry energies. The distributions of the predicted slope parameter $L$ with event-summed proton and neutron spectra are plotted in Fig. 7. The mean absolute error (MAE), which is the average absolute difference between the true and the predicted values, is about 20.4 and 14.8 MeV using event-summed proton and neutron spectra, respectively.
As can be seen, the predicted $L$ distributions for Skz4 and SkI1 are well separated from each other, while the distributions for SV-sym34 and SkI2 largely overlap, because the difference in $L$ is about 153 MeV in the former case but only about 25 MeV in the latter. Using a Gaussian fit, one obtains the mean value of the predicted $L$ and its standard deviation, which are listed in Tab. II. The mean values of the predicted $L$ are close to the true values used in the corresponding events, indicating again the strong capability of the CNN in revealing fingerprints of $E_{\rm sym}(\rho)$ in the transverse momentum and rapidity distributions of protons and neutrons.
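The regression metrics discussed above can be illustrated with a short sketch. The MAE follows the definition given in the text; the sample mean and sample standard deviation are used here as a simple stand-in for the Gaussian fit, which is an assumption of this sketch. The function names and the toy numbers are hypothetical and are not values from Tab. II.

```python
from statistics import mean, stdev


def mean_absolute_error(true_values, predicted_values):
    """Average of |true - predicted| over all samples."""
    return mean([abs(t - p) for t, p in zip(true_values, predicted_values)])


def summarize_predictions(predicted_values):
    """Sample mean and sample standard deviation (stand-in for a Gaussian fit)."""
    return mean(predicted_values), stdev(predicted_values)


# Toy example: one interaction with a made-up true slope of 50 MeV.
true_L = [50.0, 50.0, 50.0, 50.0]
predicted_L = [40.0, 55.0, 60.0, 45.0]

mae = mean_absolute_error(true_L, predicted_L)
mean_L, sigma_L = summarize_predictions(predicted_L)
print(f"MAE = {mae:.2f} MeV, mean = {mean_L:.2f} MeV, sigma = {sigma_L:.2f} MeV")
```

Two predicted distributions are well separated when the difference of their means is large compared with their standard deviations, which is the comparison made for Skz4 vs SkI1 and SV-sym34 vs SkI2.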

IV. SUMMARY
To summarize, we have presented the first attempt to find fingerprints of the nuclear symmetry energy in heavy-ion collisions with deep learning. The two-dimensional (transverse momentum and rapidity) distributions of protons and neutrons simulated with the UrQMD model are fed into a CNN, and the output of the CNN is either the label which denotes the stiffness of $E_{\rm sym}(\rho)$ or the slope parameter $L$. It is found that, when using proton distributions on an event-by-event basis, the accuracy for classifying the soft and stiff $E_{\rm sym}(\rho)$ is about 60% owing to large event-by-event fluctuations, while with event-summed proton spectra as the input sample the accuracy increases to 98%. For classifying five different $E_{\rm sym}(\rho)$, the accuracy is about 58% and 72% when proton and neutron spectra are used, respectively. For the $L$ regression, the mean absolute error between the CNN-predicted and true $L$ is about 20.4 and 14.8 MeV using proton and neutron spectra, respectively. The present results suggest that fingerprints of $E_{\rm sym}(\rho)$ in the transverse momentum and rapidity distributions can be identified by a deep learning algorithm.