Nuclear reaction network unveils novel reaction patterns based on stellar energies

Despite the advances in discovering new nuclei, modeling microscopic nuclear structure, nuclear reactors, and stellar nucleosynthesis, we still lack a systemic tool, such as a network approach, to understand the structure and dynamics of over 70 thousands reactions compiled in JINA REACLIB. To this end, we develop an analysis framework, under which it is simple to know which reactions generally are possible and which are not, by counting neutrons and protons incoming to and outgoing from any target nucleus. Specifically, we assemble here a nuclear reaction network in which a node represents a nuclide, and a link represents a direct reaction between nuclides. Interestingly, the degree distribution of nuclear network exhibits a bimodal distribution that significantly deviates from the common power-law distribution of scale-free networks and Poisson distribution of random networks. Based on the dynamics from the cross section parameterizations in REACLIB, we surprisingly find that the distribution is universal for reactions with a rate below the threshold, λ<e−Tγ , where T is the temperature and γ ≈ 1.05. Moreover, we discover three rules that govern the structure pattern of nuclear reaction network: (i) reaction-type is determined by linking choices, (ii) network distances between the reacting nuclides on 2D grid of Z vs N of nuclides are short, and (iii) each node in- and out-degrees are close to each other. By incorporating these three rules, our model universally unveils the underlying nuclear reaction patterns hidden in a large and dense nuclear reaction network regardless of nuclide chart expansions. It enables us to predict missing links that represent possible new nuclear reactions not yet discovered.


Introduction
Since the birth of the Universe in the Big Bang about 15 billion years ago, its evolution was accompanied and often driven by nuclear reactions. The birth, life and death of stars, Earth and their occupants, either animate or inanimate are all the products of these processes, evolved from the nucleosynthesis of the lightest elements (hydrogen, helium and lithium) [1,2]. Furthermore, the discoveries in nuclear science have also led to modern innovation and technology advances that have improved our lives by enabling, for example, the sustainable clean energy [3,4], the introduction and use of radioisotopes for medical diagnosis and therapeutic purposes [5][6][7][8], and advances in material science [9,10] by use of the nuclear reaction microanalysis.
Nuclear reaction network is important to describe the nuclear energy generation or the evolution of entire composition of nuclei in different astrophysical scenarios [11][12][13][14][15][16][17][18][19][20]. An important tool to understand the nucleosynthesis mechanism in various astrophysical environments, e.g. supernovae [13], core-collapse supernovae [16,17], novae [11,12,19], or x-ray bursts [14,15,18,20]. Despite some pioneering conceptual work in both nuclear physics [21][22][23][24][25][26][27] and network science [28], it is still unclear we still lack a systematic network framework for understanding the complex reaction patterns and nuclei stability. The theoretical gap is that there is no computational model for nuclear reaction networks, similar as Barábsi-Albert (BA) model with the preferential attachment for scale-free networks [28] applied in Internet, biology, and social science. Here, for the first time, we apply network tools into nuclear physics and develop a framework, which enables modeling the global structure of the nuclear reaction network without considering their dynamical equations. This approach opens novel ways to understand the structure of nuclear reaction network and nuclei stability.
We find some intriguing regularities, such as binomial degree distribution, the symmetry of in and out degrees, which are not yet explained by the current theory of nucleus dynamics and stability. We also show that our nuclei network framework guided by machine learning techniques and network science enable us to model the properties of isotopes as dependent on their positions in the network. Our main contribution is the discovery of hidden patterns in a large and dense nuclear reaction network by applying the network framework approach. After a close examination of these factors on their impacts on the network's topological structure, we confirmed that the nuclei exhibit communities across different scales and some physical implication related properties, prompting us to propose three fundamental rules to reconstruct the reaction network. The rules are proved to be capable of depicting the expansion of the nuclide chart accurately, enabling us to search for efficient and low-cost routes to yield rare isotopes. Our network-based framework allows us to discover some interesting patterns in nuclear reactions, the degree models on spatial information in embedding space, and the fundamental rules to reconstruct the nuclear reaction network. We hope, in future work, to address the open questions that arise from our current framework, like what physical properties of nuclei can cause the existence of the discovered patterns.

Spatially embedded nuclear reaction network
Many complex systems, such as transportation networks, power grids, internet, and brain networks, are organized as spatial networks with nodes and edges embedded in some kind of geometric space [29]. In these networks, there is a transport distance or wiring cost associated with the length of edges that, in turn, dramatically affects the topology of these networks [30,31]. Therefore, we embed the nuclear reaction network into a spatial space, and study how the spatial information impacts the topological structure of the reaction network. As shown in figure 1(b), the reaction network defines an affinity relation from one nuclide to another, such that a pair of nuclides are connected if there is a direct nuclear reaction (see figure 1(a)). The reaction network, in essence, is analogous to a spatially embedded grid network in a two-dimensional Euclidean space. As demonstrated in figures 1(c) and (d), each nuclide is characterized by its neutron and proton numbers, that can be represented as a lattice grid point in the nuclear chart, where the x coordinate value is defined by the number of neurons N, and the y coordinate value is defined by the number of protons Z. All isotopes of an element are placed on the same row in the lattice. The spatial network brings stability into our consideration, and provides another benefit to distinguish the nearest neighbors according to the statistics N and Z. Figure 1(a) illustrates a reaction chain from 63 29 Cu to 60 28 Ni. It shows two primary direct nuclear reactions: (i) n+ 63 29 Cu→ 4 2 He+ 60 27 Co, which is an α decay, and (ii) p+ 60 27 Co → n+ 60 28 Ni, which is a charge exchange reaction (p, n). These three nuclides ( 63 29 Cu, 60 27 Co and 60 28 Ni) and two reactions form a network shown in figure 1(d), which is part of a large network (figure 1(b)) with 8042 nuclei and 77 293 direct nuclear reactions from JINA REACLIB [32] (see section S1 (https://stacks.iop. org/NJP/23/083035/mmedia) B for detailed description). Figure 1(c) illustrates a sub-network of 235 92 U, showing all the possible direct reactions from and to 235 92 U. In network science, the number of inbound links to a node of interest is defined as the in-degree of the node, the number of outbound links from the node is defined as its out-degree. In the nuclear reaction network, the number of different types of nuclear reactions that produce a nuclide of interest is the nuclide's in-degree. As shown in figure 1(e), the in-degree of 235 92 U is 5, including two charge exchange reactions of the type (n, p) and (p, n), two photon-disintegration reactions of the type (γ, n) and (γ, α), and one neutron capture (n, γ) reaction. Also, the reaction network exhibits strong in-out degree correlations (figure 1(f)), which provides an effective measure of the network symmetry in studying the evolution and controllability of directed networks [33][34][35]. As expected, many nuclear reactions have a reverse reaction. The pattern underlines rule (ii) for reconstructing the nuclear reaction network (see nuclear network model).
The node degree distribution is an essential characteristics of theoretical model networks and real-world networks [36,37], determining their robustness [38], controllability [39] and resilience [25,40,41]. Many The full reaction network contains 8042 nodes and 77 293 directed edges. The nodes can be roughly separated into two categories, one ((d), edges in gray) has much more connections than another (c). (d) A mapping from a set of reactions shown in (a) to a tiny nuclear reaction network of size 3. Each node represents a nuclide, and each edge is a nuclear reaction associated with two nuclides (excluding the subatomic particles and γ rays). If the nuclear reaction between two nuclei is reversible, the corresponding edge becomes bidirectional. (e) Bimodal degree distribution of the reaction network with the largest degree being 16. (f) Positive correlation between the in-and out-degrees. For most nuclides the numbers of in-coming and out-going edges are close to each other. complex networks, including communication networks [42,43], transportation networks [44,45], internet [46], social networks [36] and biological networks [47][48][49] are characterized by a power-law degree distribution. They are scale-free [46,[50][51][52][53], greatly vary in size and structural complexity. Random networks, e.g. Erdös-Rényi (ER) networks [54,55], on the other hand, follow a bell-shaped Poisson degree distribution, in which most of the nodes have approximately similar number of links. In striking contrast to the common power-law or Poisson distribution, as demonstrated in figure 1(e), the nuclear reaction network exhibits an interesting bimodal degree distribution. Because of the nuclear force and other factors, there is an implicit limit on N and Z numbers that a nuclide can hold, forming the nuclear chart's boundaries, where the nuclei tend to have fewer direct nuclear reactions. The boundaries of the nuclide chart are also known as the drip lines [56]. Take the valley of stability as the reference, which consists of 253 stable nuclides [57], the nuclide chart is divided into the neutron-rich and the proton-rich regions. The neutron-rich region consists of all nuclides between the valley of stability and the neutron drip line, similarly for the proton-rich area. Such spatial distribution of nuclei strongly affect the degree distribution, deserve close attention. For the nuclides containing the same number of protons (i.e. associated to the same element) in one row of the nuclide chart, we notice that the degree generally reaches a peak between the stable nuclides and the ones on the boundaries (i.e. nuclides with minimum or maximum number of protons), as shown in figure S1(d). For different elements, the degree also presents a significant difference between the heavy elements with Z > 82 denoted as upper region and the light ones with Z 82 denoted as lower region. Despite the vast differences in degrees exhibited in different regions of the spatial space, we still observe a strong correlation between the in-and out-degree of the individual nuclides as shown in figure 1(f). For the nuclides with degrees between 7 and 10, such correlation is not as strong as for other nuclides, as indicated by the circle's size. The observed pattern emerges due to the physical properties of nuclides and nuclear reactions, and the technical challenges in measurements. Most of the nuclides in this region are short-lived, and therefore difficult to measure. The situation is even worse to measure nuclear reactions that have these reactants involved, which should be of sufficient quantities. Nuclear reactions often occur at high temperatures and densities, because a large amount of energy is required to overcome the Coulomb repulsion between positively charged nuclei [20]. But, the conditions are very difficult to achieve. To understand how these conditions affect the topological properties of the nuclear reaction network, it is necessary to study the impacts of temperatures and reaction rates.

Impacts of temperatures and reaction rates
The nuclear reaction network in astrophysics is a stiff system of ordinary differential equations [20] over a set of nuclides, where each variable represents the abundance of one nuclide species. To integrate the network, the rates of all reactions in the network, representing the interactions between nuclides are required [15]. We need to measure the reaction rates involving both stable and unstable nuclides across the nuclide chart. It is experimentally accessible for the stable nuclides but not for unstable nuclides. Therefore, a theoretical model is necessary to compute the required reaction rates. The Hauser-Feshbach model has been extensively used to compute reaction rates for a wide of nuclides and temperatures [20,32,58]. Based on the statistical model, JINA REACLIB stores reaction rates λs as a seven-parameter function of temperature T. Specifically, we have [32] where {a 0 , a 1 , . . . , a 6 } is the parameter set. Note that we correct the inverse reaction rates for the astrophysical context using partition functions (see section S6). Thus, equation (1) enables us to construct a weighted nuclear reaction network at a given temperature. As shown in figure S6, the reaction rate distributions vary with temperature, promoting us to explore the topology of the nuclear reaction network response to the temperature.
To investigate the nuclear reaction network's connectivity, we analyze the scaled-down networks by tuning temperatures and reaction rates. Adopt a similar approach used in reference [59], at a given temperature and a reaction rate threshold λ θ , we remove all edges whose reaction rates are below λ θ from the original reaction network. When λ θ = 0, we always observe an entire nuclear reaction network without a single reaction removed at any given temperature. However, as we increase the threshold, more and more links with low rates disappear from the system. Both the temperature and the rate threshold, λ θ , determine the network's connectivity, see figure 2(a). As the temperature increases from left to right horizontally, and λ θ stays the same, the nuclear reaction network increases its connectivity. Because, at higher T, the probability of penetrating the Coulomb barrier increases. In contrast, for a fixed temperature, as λ θ increases (from bottom to top vertically), reactions disappear, making the nuclear reaction network sparser.
In complex networks, a giant connected component (GCC) is a group of connected nodes that accounts for a significant portion of the entire network [60,61]. The GCC provides crucial information about the entire system, and its associated properties (e.g. the sizes or the degree distribution) which are relevant to the robustness of networks [62,63]. For the reaction network, we discover that the size of the GCC increases with temperature and decreases with λ θ . At high temperatures and low thresholds, the system develops the richest community structures, and we illustrate these communities with different colors both in the scaled-down reaction networks and the nuclide chart (see the inset in figures 2 and S7). From the right and bottom of figure 2(a), we note that two disjoint communities emerge, one is located on the neutron-rich side, another one is in the proton-rich region. The stable nuclides in the valley of stability (shown in black in the nuclide chart), thanks to their stability, are more abundant and play the role of stepping stones between groups of nuclei. They provide the only way to connect one community to another. The communities, once connected via a few edges (i.e. nuclear reactions), become disconnected when these edges are removed. In many complex networks, some edges may play a crucial role in preserving the connectivity of networks [64,65]. Similarly, some reactions in the nuclear reaction network also are necessary for the global connectivity, and may further affect the relative abundance of nuclear species as well, which in turn influences the reaction rates. The edge removal approach, i.e. the edge percolation [64] is often used to measure the importance of edges in the network. From this, we can observe how the structure of the network evolves. As shown in figure S10, many types reactions experiences a phase transition. Because of the almost uniform distribution of the reaction modes at different levels of λ θ , no single specific reaction mode abruptly fails from the connected network. For example, even after around 50% of the reactions are removed in an ascending order (i.e. from reactions with low reaction rates weights to those with high reaction rates), we can still observe all 15 reaction modes. To some extent, the reaction network is very robust in the diversity of reaction modes. Furthermore, we observe the modality of the degree distribution of the subnetworks (figures 2(b)-(d)) based on Gaussian kernel density estimation (KDE) and peak detection (see methods). The degree distribution exhibits a modality shift from a bimodal degree distribution (left subfigure in figure 2(b)) as observed in the full reaction network to a unimodal degree distribution as the reaction rate exceeds the critical point λ c (right two subfigures in figure 2(b)). In figures 2(c) and (d), we demonstrate that such modality shifts with the detected most frequent degrees. The red and blue bars in figure 2(b) divide the degrees into two distinct clusters, each of which has a characteristic degree. Finally, we find the critical reaction rate threshold λ c of the modality shift and it shows a universal law as a function of the temperature (see figure 2(e)) Empirically, we get γ ∼ = 1.05 for reaction networks at temperatures 0.1 T 1.5 (in Gigakelvins).

Nuclear network model
In spatial network models, such as random geometric graphs, two nodes are connected with high probability if their Euclidean distance is short, usually modeled as undirected networks with bidirectional links. However, the bidirectional topology does not hold (figure S1) in the nuclear network because not all reactions are reversible. Instead, it prompts us to consider a directed spatially embedded model [66]. On the two-dimensional spatial grid described in the section on spatially embedded nuclear reaction network, we find that the spatial distribution strongly associates with the nuclides' position on the grid and the reaction modes. The drip lines of the nuclide chart limit the possible nuclear reactions. The nuclides lying close to the drip lines are short-lived radioactive nuclides having small degrees in the reaction network. The radioactive nuclei tend to decay to stable nuclei, while the radioactivity of reaction products always decreases. However, fewer reactions occur on short-lived radioactive nuclei than the stable or long-lived ones, which involve an abundant number of nuclear reactions. There is a clear separation line between the high and low degrees nuclides for nuclides of different elements (i.e. containing other numbers of protons). It also delineates the limit of the valley of stability. Up to lead isotopes (Z = 82), each element has at least one stable isotope; no stable isotopes exist for the elements above the lead. These heavy elements, owning to strong nuclear force in the isotopes' nuclei, have few edges compared to elements with stable or long-lived radionuclides.
Motivated by the fact that the degree of a nuclide depends on its spatial location in the nuclide chart, we incorporate both the stability valley and the boundaries, that is the distance d s to the nearest stable nuclide and the distance d b to the nearest chart boundary, and propose a simple degree model K(d b , d s ; Θ), where Θ is a vector of parameters. The nuclide degree for the given element increases for both large d s and d b . Note that there are various forms of K(d b , d s ; Θ), and we consider two specific implementations in our analysis, see methods and section S2 for their explicit forms and the approach to learn the parameters.
Our model predicts the degree of each nuclide, i.e. the number of possible reaction modes, as shown in the supplementary movie. With these predictions, we reconstruct the nuclear reaction network. Instead of randomly connecting a pair of stubs as presented in the configuration network model [67][68][69][70], we reconstruct the network based on three rules: (i) reaction type is determined by linking choices (table S2), (ii) distances between the reacting nuclides are short and (iii) each node in-and out-degrees are close to each other. For details of these assumptions, and the detailed reconstruction procedure, see methods.
We compare the reconstructed networks with the real network from different perspectives, including the global degree distribution and the local degree differences between elements and different regions. This comparison demonstrates that our network model can capture the essential characteristics of nuclear reactions, especially the bimodal degree distribution. Moreover, it is theoretically proved that the bimodality can be captured with the proposed network model on an approximated triangular nuclear chart (see section S4 for detailed derivation). We choose a random network model as a baseline, which considers the constraint of reaction-type being determined by linking choices and each edge is assigned to one of the 15 reaction modes.
Let min and prod be two implementations of our proposed parametric degree model, rand be the random baseline model (see methods for detailed description). As shown in figure 3, both min and prod models reproduce the degree distribution of the real nuclear network accurately, much beyond preserving only the overall bimodality. Most importantly, the locally relative ordering of degrees is preserved. Also, the error bars in figure 3 indicate the robust performance of our model. Overall, the min implementation achieves better agreement with the real degree distribution. In marked contrast, the baseline random model produces a Poisson degree distribution, deviating significantly from the real bimodal degree distribution, and show high variance from the real network. These indicate the importance of restricting reacting nuclides distances and strong correlation of in-and out-degrees for generating the bimodal distribution characteristic of nuclear reaction network degree distribution. As shown in figures 1(e) and (f) with this distribution, only a tiny fraction of nuclides have degree k out = {0, 7, 8, 10, 11, 16}, and therefore the predicted degrees have large fluctuations due to limited data points. Meanwhile, a remarkably close agreement between the reconstructed degrees and the ground truth can be found for k out = 12, which contains nearly 25% of the nuclides, and for k out = 15 with 17% of the nuclides, further validating the outstanding performance of our model.
Next, we employ four metrics to evaluate the performance of our model-precision, recall, and Jaccard index for both in-degree and out-degree of each nuclide, and the clustering coefficient (see methods for specific definitions). As shown in figure 4, we consider the directions of edges and obtain the average performance on the element level (figures 4(a)-(g)) and network level ( figure 4(i)). For the degree distribution, one can see a valley between the two peaks, which is eroded by nuclides with 8 to 11 reactions. Their geographic positions relative to the boundaries and their number of protons are shown in figure  S1(b). As demonstrated in figure 4(h), the reconstructed models identify the patterns in d b with remarkable 80% similarity to the original network, but less than 60% similarity w.r.t. the distribution of the number of protons. Yet, the rand implementation is very far from the other two, except for recalls (see figures 4(b) and (e)) in the upper region Z > 82. The reason is that nuclides in this region have only 3 to 6 edges and the rand implementation gives predictions of at least 8, which increases the recall values. In summary, our network model, described by the three mechanisms, retains the essential real properties of the nuclear reactions, and captures the spatial attributes, offering a better and new understanding of the nuclear reaction network.

Nuclide chart expansions
Scientists still continue to discover new elements and nuclides [56], and thus the boundaries of the nuclide chart are expanding as well. To examine the predictive power of our model, we ask the fundamental question: can we predict new reactions across the entire network extracted from the JINA REACLIB database using the experimentally validated subnetwork? Until the end of 2017, scientists have discovered 3196 nuclides and 32 184 reactions in nature or in the laboratories [57,71], forming a validated subnetwork (the red region in figure 5(a)). Based on this subnetwork, we use our parametric degree model to predict the probability of occurrence of directed reactions between 4846 new nuclides. Our approach is significantly different from the current state-of-the-art models for nuclear reaction discovery. These models primarily probe the microscopic physical or chemical properties of isotopes, e.g. energetic possibilities [71]. There exists evidence [71][72][73] that the newly discovered nuclides are always close to the existing nuclides in the nuclide chart.
Our network approach can precisely predict novel reactions that have not yet been discovered in nature or in the laboratories as shown in figure 5. As the size of the nuclide chart increases, the overall quality of agreement with the ground truth (i.e. the real reaction network) does not fluctuate too much except for the last expansion, which shows a decline on each measure relative to other steps (figure 5). The nuclides discovered during previous expansions are densely connected, but the ones included in the fifth expansion, especially the heavy radioactive nuclides that lately join the network, have much fewer connections. Based on the 15 most common reaction modes, we build all possible missing links between pairs of nuclides. A missing directed link is feasible when it meets the criterion that the predicted in-and out-degrees of both involved nuclide nodes of the edge are greater than the corresponding existing ones. Once the criterion is violated, the link will be discarded. For example, considering the red node u as the reaction target, the red links are all discovered reactions associated with u. Node 1 has four existing links, and its predicted degree is four, so it violates the criterion. Therefore, u → 1 presented as a dashed line is discarded. All those feasible links in black are the candidates for new reactions in the matching procedure. (b) A realization of the predicted reactions with the nuclides in red as the reaction targets.
Interestingly, analogous heterogeneity in the links for different regions has been observed in city development [74]. As formulated in the core-periphery model [75] in the emergence of an urban system, the constructions of the core and the periphery regions are very unbalanced. In our case, the previously discovered nuclides are in this core, and those found later belong to the periphery region, and nuclides in the core region have more connections than those in the periphery region.

Reaction prediction
Our degree model predicts the number of potential reactions for each nuclide, and many nuclides may have less connections than that predicted. The difference implies missing nuclear reactions. To predict the missing nuclear reactions, we propose a simple matching algorithm. It is straightforward, and does not have any spatial favorite over the 15 different types of nuclear reactions. Therefore, we are able to study reactions beyond specific nuclides, reaction modes [76], or regions in the nuclide chart [77,78] and make new discoveries.
We search all eligible out-going and in-coming edges for each nuclide by excluding all existing nuclear reactions. An eligible reaction first should belong to one of the 15 most common modes of nuclear reactions, then the nuclide must have missing connections. Also, an eligible connection u → v must be eligible for both u and v. We iteratively match pairs of nodes with eligible edges, remove the illegible ones and drop nodes without missing connections based on the predictions, until no eligible connection can be found. As illustrated in figure 6, we predict all potential new reactions with nuclide u as the target. There are eight feasible reaction modes from u, and four of them (red edges) have been discovered. According to our degree model, nuclide u is predicted to have 6 connections, i.e. k u = 6. Therefore, it still has 2 missing links from the possible choices {u → 1, u → 2, u → 3, u → 4}. The prediction will be made based on the states of the possible production nuclides {1, 2, 3, 4}. According to the degree prediction, node 1 should have k 1 = 4 in-coming connections, excluding 4 already discovered reactions, there is no space for extra reactions. Therefore, the edges u → 1 can be discarded (dashed). Node 2 is checked similarly. It has k 2 = 6 stubs, but only five feasible reaction modes, and 4 of them have been discovered. Therefore, node 2 must choose u, and u → 2 must be kept. But it may be not the only choice for node u, which can have other choices. Node 3 has to choose from the remaining undetermined edges to build one or at most two connections. We examine node 4, which still misses two possible reaction modes. Suppose the degree model is accurate, and no nuclide has more connections than the predicted ones, the reaction prediction becomes the general b-matching problem [79].

Conclusions
Nuclear reactions are crucial in studying the nucleon-nucleon interaction and the formation of stellar structures, which can be modeled as a stiff system of coupled ordinary differential equations. Evolving the network will provide an abundance of information on these nuclei species in the network. Many studies concentrate on integrating the network via numerical approaches. Still, very few works are dedicated to making network analysis of reaction networks for an alternative perspective. Because of the interplay between the topological structure and the dynamics of nuclear reactions, the exhibited topology may, in turn, affect other properties of nuclear reactions. To better understand nuclear reaction networks' topological properties, we offer an alternative network-based framework for overall nuclear reactions collected in JINA REACLIB.
With a wealth of techniques established in network science, we analyze nuclear reaction networks and develop a predictive degree model K(d b , d s ; Θ) that captures the impact of spatial properties of nuclides on the number of the reactions in which a nuclide participates. Also, we develop a growth model to explore the evolution of nuclear reactions and understand how the characteristic bimodal degree distribution emerges.
All the reactions associated with one nuclide represent different reaction modes, identified with the spatial information (i.e. N and Z) encoded in all involved nuclide nodes of the direct nuclear reactions. The spatial information is implicitly encoded in the edges of the reaction network. From an unweighted reaction network, we discover the characteristic bimodal degree distribution and three recurring patterns. As we explicitly weigh the connections using the reaction rates, we make an additional contribution in studying the modality transition in the degree distribution of the weighted nuclear reaction network. We find that the network topology changes with the temperature of the reactions, and the bimodal distribution is universal for links with a rate λ below the threshold λ c = e −T γ , where T is the temperature in Gigakelvin and γ ≈ 1.05.

Limitations
In this work, we for the first time use network science tools to model and understand the nuclear reaction network. Our framework in this article has several limitations: • Dataset and nuclear reactions: the nuclear reactions used to create the nuclear reaction network are compiled by JINA REACLIB. Many weak reactions and reverse reactions are not available at all. Also, the spontaneous fission reactions are not included. Therefore, the reaction network used in our analysis is incomplete. But, it is not our intention to build a full network that exhaustively includes everything. Our interest is the most common 15 reaction modes, which are sufficiently large to study the nuclear reaction network's special topological property. • Degree model: the implemented degree models are built on the spatial information of nuclei. The degree of a nuclide of interest relies on its distance to the nuclide chart's boundaries and its distance to the nearest stable nuclide (if available) or a long-lived radionuclide. The distance is defined along the proton axis Z and measures the difference in the number of neutrons between a pair of nuclides of interest, describing isotopes of the same element. The model itself may have too many parameters, but the idea is generic. There still has room to simplify the model for fewer parameters.

Future work
This pioneering work may lead to many further research directions, including (i) considering more nuclear reactions beyond the ones covered by JINA REACLIB, for example, the complicated fission reactions and weak reactions, (ii) improving the overall reconstruction performance by incorporating the correction among different reaction modes (figure S2) to our current reconstruction procedure, (iii) developing a dynamical model over the proposed spatially embedded network to predict nuclide abundances, and (iv) identifying the most economical path to produce medically useful isotopes [80,81].

Model of nuclear reaction network
Degree model. To demonstrate the ability of our degree model to capture the spatial degree distribution, we introduce two specific implementations: where α, β, c 1 , c 2 ∈ R + , a, b ∈ R. Both have the exponent parameters α and β that represent the strength of d b and d s . The bias terms {a, b} and the coefficients {c, c 1 , c 2 } are included to improve the fit of the predicted degree with the degrees of known nuclides in the nuclide chart. These two implementations yield the accurate degree distribution, although they are structurally different. For example, K prob is a continues function of d b and d s , but K min is a discontinues function. For K min , even if two nodes are predicted with the same degree, the sources of this prediction may be different, some come from the impact of stable nodes, c 2 d β s + a, others from the chart boundary, c 1 d α b + b. Also, there exists obvious difference in the distribution of the degree between nodes at the two sides of the stable nodes. Hence, in order to reflect this difference, we introduce two different bias terms b 1 and b 2 to replace the term b for nodes located at the left and the right side of the stable nodes, respectively.
Both K min and K prod are parametric models, and each has several parameters to calibrate. To have a good prediction of the node degrees, and to fit the degree distribution of the entire network, we adopt a mixed optimization approach to search for optimal parameters. We apply the gradient-based optimization method BFGS [82] in the initial searching stage, then make further calibration with the derivative-free Bayesian optimization method [83], which is widely used in machine learning. The optimal parameters and associated prediction accuracy are summarized in table S3.
The degree implementations yield continuous predictions, however the true degrees are integers. To be realistic, in the estimation of the prediction accuracy, all predictions are rounded to their nearest integers in our analysis.
Network reconstruction model. In network science, the configuration model and the hidden parameter model [67][68][69][70] are two commonly used approaches to generate networks with arbitrary degree distribution, but neither is a good choice to reconstruct the nuclear reaction network. The configuration model assigns each node a predefined degree as stubs, then randomly chooses a pair of stubs in two nearby nuclides and connects them. But this procedure is likely to produce multi-links, which are not allowed in many real networks, including the nuclear reaction network. The hidden parameter model assigns a hidden parameter to each node, and randomly selects two nodes with probability proportional to their hidden parameters, respectively. Both nodes are discarded if they are already connected, otherwise a new link is created. This method can avoid parallel-links, but it still suffers the same problem as the configuration model, i.e. producing many long-distance links, which do not exist in reactions (see reaction modes reported in table S2).
To reproduce the nuclear reaction network, we need to resolve the aforementioned issues encountered by both network models. To this end, we put additional constraints on the network generation procedure: Most of the nodes have in-and out-degrees close to each other. Therefore, a simplification is made in our generation procedure, by assuming that the number of out-going edges from a node is the same as the number of edges in-coming to it.
Another important aspect is the maximum number of eligible edges, which is predicted with the degree model. Accompanied by the aforementioned three assumptions, we propose the following procedure to reconstruct the nuclear reaction network with m involved reactions from isolated nodes, expecting to achieve moderate accuracy.
Step 1. Search the nearest stable nuclides and the nuclide boundaries for each element, and set them as the references.
Step 2. Compute the distance d s to the nearest stable nodes, and the distance d b to the nearest boundary for each node.
Step 3. Predict the degree of each node with the degree model, and the predicted degree is the capacity of eligible edges that can be attached to the node.
Step 4. Select an eligible outbound edge according to the associated spatial priority, then update the vacancy of the related nodes with the predicted degree and the eligible edges.
Step 5. If a node has no vacancy to fill, it will be removed in the next round.
Step 6. The procedure is terminated once all nodes have their vacancies filled, and a new network is constructed.
It is possible that the procedure fails to generate a network with enough edges, hence a network with the same number of edges as the original nuclear reaction network is not guaranteed. To produce such a network, an additional aggregation step is carried out after the initial reconstruction. It works as follows.
We repeat the reconstruction for multiple times, say 100, and produce 100 link-deficient networks. Then, we count the times that an edge appears in these networks, and wire the nuclides with the m most popular edges for a link-sufficient network.
The quality of predictive ability of the degree model directly affects the reconstruction performance, since too high or too low degree predictions both cause the deviations from the ground truth degrees. To ensure a reconstruction model of high-quality, we must first design accurate degree models. Here, two variants of degree models with high prediction accuracy have been developed, and tested in our experiments. Actually, in future work we plan to try and design better degree models that assume other forms. To improve the reconstruction accuracy, the degree correlation can provide additional useful information, as shown in figure S2.

Nuclide chart expansion
The evolution of the nuclide chart from recently discovered nuclides is represented as the spatial expansion of the boundary of the nuclear chart.
Step 1. We start from the seed subnetwork that consists of 3196 nuclides and the incident 32 184 reactions (see figure 5(a)). Then, we expand the nuclide chart by adding extra 1000 nuclides in each of the subsequent steps. These additional nuclides come from the boundaries of the current nuclide chart. We train a degree model from the current subnetwork and recalculate the degree of all the nuclides in the newly formed nuclide chart. Based on the network model proposed in methods section, we reproduce the nuclear reaction network, and evaluate the performance of our network model in terms of several commonly used metrics (see section on reconstruction performance metrics for detailed definitions).
Step 2. Additional 1000 nuclides from the boundaries of the current nuclide chart are randomly chosen and included. Following the same procedure as in step 1, the pre-trained degree model is applied to calculate the degrees for new members.
Step 3. Repeat step 2 until all nuclides in the JINA REACLIB have been added. Our evolution process gives five expansions. During the first four expansions, 1000 nuclides have been added each time, and the fifth expansion includes other 846 new members and finally defines a chart of 8042 nuclides. Hence, we have five nuclear reaction networks to record the entire expansion history, as illustrated in figure 5.

Reconstruction performance metrics
Let G = (V, E) be the input nuclear reaction network, G = (V, E ) be the reconstructed network. Both are directed networks on the same set of nodes with the same number of edges, i.e. |E | = |E|. In our case, |V| = 8, 042 and |E| = 77, 293. To measure the reconstruction performance, several common metrics are used.
Precision and recall. Borrowing an idea from information retrieval [84], the reconstruction procedure is considered as the retrieval of the edges. Let E v,out be all out-going edges from node v ∈ V, and E v,out be the reconstructed edges outgoing from v. Precision on v is defined as the percentage of correct edges among the reconstructed ones, i.e. POS v = |E v,out ∩ E v,out |/|E v,out |. Recall on v is the percentage of correct ones in all real outgoing edges, i.e. ROS v = |E v,out ∩ E v,out |/|E v,out |. Similarly, we define precision and recall on the incoming edges as: Suppose node v has three incoming edges {e 1 , e 2 , e 3 }. If the reconstruction model gives two incoming edges {e 1 , e 4 } for v, then, PIS v = 1/2 and RIS v = 1/3. Jaccard index. Jaccard index is introduced to measure the similarity between two networks in terms of neighborhood: Clustering coefficient. Clustering coefficient [70,85] is a measure of the degree to which nodes in a graph tend to cluster together. Node v's clustering coefficient is defined as the fraction of its neighbors that are also connected to each other (triplets of nodes) where N v is the neighborhood of v, and k v = |N v | is the number of nearest neighbors, irrespective of the direction of the edges.
All these metrics are applied at node level. To estimate the overall reconstruction performance, we stack all nodes' performance scores w.r.t. a metric M into vectors x M and x M for G and G respectively, then compute the cosine similarity between x M and x M . The overall performance is positive and less than 1, and the higher is better.

KDE for peak detection
Degree distribution is a crucial characteristic of networks' topology. However, the common histogram plot, if not given sufficient data points, is often unable to reveal the number of peaks in modality analysis. To address this difficulty, we apply the KDE [86] to smooth the histograms to ease identification of peaks.
KDE is proposed to estimate the shape of some unknown probability density function f, from n independent and identically distributed samples {x 1 , x 2 , . . . , x n } that are supposed to be drawn from f. A kernel density estimatorf is used for this purpose. Here, K is the kernel function, and the bandwidth h > 0 is a smoothing parameter, controlling the tradeoff between bias and variance in the result. There are many different choices of K. The most common one is the Gaussian kernel function K(x) = 1 √ 2π exp(−x 2 /2), which is also used in our analysis. Bandwidth selection is crucial in KDE and greatly affects the result. It can be selected in many ways, either a 'rule of thumb', cross-validation, or 'plug-in methods'. We apply cross-validation to select the optimal bandwidth and use scikit-learn [87] to run KDE (see section S7 for the sensitivity analysis of bandwidth).
A peak or local maximum is defined as a point whose two direct neighbors have a smaller amplitude. A simple detection approach for all peaks (or local maxima) is to compare all neighboring density values (see figure S8). However, the procedure is very vulnerable to noise. Therefore, some sort of smoothing is necessary, and KDE can meet such need. In our analysis, the SciPy package (http://scipy.org/) is used to detect peaks.