Extracting Local Symmetry of Mono-Atomic Systems from Extended X-ray Absorption Fine Structure Using Deep Neural Networks

In recent years, neural networks have become a new method for the analysis of extended X-ray absorption fine structure data. Due to its sensitivity to local structure, X-ray absorption spectroscopy is often used to study disordered systems and one of its more interesting property is the sensitivity not only to pair distribution function, but also to three-body distribution, which contains information on the local symmetry. In this study, by considering the case of Ni, we show that by using neural networks, it is possible to obtain not only the radial distribution function, but also the bond angle distribution between the first nearest-neighbors. Additionally, by adding appropriate configurations in the dataset used for training, we show that the neural network is able to analyze also data from disordered phases (liquid and undercooled state), detecting small changes in the local ordering compatible with results obtained through other methods.


Introduction
Machine Learning (ML) methods are becoming widely used to tackle different scientific problems [1,2]. This popularity comes not only thanks to the advancement in the field and the success of the results, but also due to the availability of various open-source packages, which make them accessible even to non-expert users. In recent years, various studies have been conducted to the use of those methods connected to X-ray absorption spectroscopy (XAS) [3][4][5][6][7][8][9][10][11][12][13][14][15]. XAS is an experimental technique where, by measuring the absorbance of a sample as a function of energy around core-electrons edges of elements present in the sample, we can obtain information about the local structure [16,17]. The spectra is commonly divided into two regions, due to both experimental and theoretical reasons, depending on the energy range we consider: X-ray Absorption Near Edge Structure (XANES) for the near-edge region (up to about 50 eV after the edge) and Extended X-ray Absorption Fine Structure (EXAFS) for the region above 50 eV and beyond. One the main advantages of XAS is that it does not require sample crystallinity, meaning it can be used to study a vast range of materials, such as molecules [18][19][20], amorphous systems [21][22][23][24] and nano-materials [25][26][27][28].
In this paper, we focus our attention on EXAFS. The EXAFS signal can be described in terms of the n-body distribution functions g n [29,30]: dr 4πr 2 ρ g 2 (r) γ (2) (r, k) + dr 1 dr 2 dφ 8π 2 r 2 1 r 2 2 sin(φ)ρ 2 g 3 (r 1 , r 2 , φ)γ (3) (r 1 , r 2 , φ, k) + O(g 4 ), (1) where k is the photoelectron wave-vector k = 2m(E − E e )/h defined from the edge's energy and γ (n) are the irreducible n-body contributions to EXAFS, containing the MS terms associated with a given geometry. The n-body distributions are statistical quantities proportional to the probability of finding atoms in a given configuration and are used to describe the structure of the system [31]. Due to mean-free-path effects of the excited photoelectron, two-body and three-body terms are sufficient to describe experimental EXAFS signals and contributions from higher order distributions can be ignored. Direct extraction of the structural functions from Equation (1) is a known ill-posed problems and various strategies have been used over the years to solve this problem. The standard method of analysis, so called 'peak-fitting' approach, consists in describing the distribution functions as a sum of peaks and then refine the peak parameters by a least-square fitting with the experimental data. This method has the drawback of having to know a priori the peak distributions and additional constraints are required to treat disordered systems [32,33].
Other methods of analysis based on regularization techniques [34,35] have also been used, although limited to single scattering terms. Reverse Monte Carlo (RMC) [36] is a structural modeling method that can also be used to analyze EXAFS data [37][38][39], but being a simulation method, it may be costly in terms of time and also the density of the system required. On the other hand, it permits us to calculate, from the obtained coordinates, radial distributions functions, and bond-angle distributions and look for specific geometries [40,41]. As mentioned before, ML, more precisely Neural Networks (NN), has recently came out as another method to analyze EXAFS data [13,42]. A NN consists of many "neurons" stacked into layers, with the output of one layer acting as the input for the next. A neuron performs a linear transformation of the inputs with given weights and pass it to a stepwise activation function, which gives the output. The weights used are adjusted during the training of the NN, performed on a dataset for the problem that we want to study. For EXAFS, computer simulations, like Molecular Dynamics (MD) [43] are used to create the dataset, since it is not feasible using only experimental data. Previous works mainly focused on crystalline configurations, we want to investigate whether the same method could be used to also investigate disordered phases and whether we could obtain, other than the radial distribution function, also the bond-angle distributions, which contain information on the local geometries. In this work, we consider the case of metallic Ni, for which also the liquid phase has been previously measured [44].

MD Simulations
To create the dataset for the training of the NN, we run several MD simulations using the LAMMPS code [45,46]. Four different crystal configurations were taken into account: face-centered cubic (fcc), hexagonal close packed (hcp), body-centered cubic (bcc) and facecentered diamond-cubic (dia). Simulations were done in an NVT ensemble using different modified embedded-atom potentials ( [47] for fcc, hcp and bcc, [48] for dia) depending on the starting crystal configuration to avoid transition to a different structure. Since MD configurations are used only to establish the relation between the structure and the EXAFS signal, it is not necessary in this case that they reproduce real nickel configurations. For all structures, the lattice parameter was chosen to have approximately the first nearestneighbors at the distance of room temperature nickel ∼2.49 Å [49]. Simulations for each structure were run from 60 to 1500 K with a step ∆T = 20 K, for each temperature the simulation run for 50 ps to ensure equilibration. After that, the atomic configuration was saved every 1 ps for a total of 21 times. In order to save computational time, these configurations were rescaled to change the first nearest-neighbors distance with a step ∆r 0.015 Å from −10 to +10 times, effectively obtaining 21 different volume configurations for a given temperature.
In addition to crystal configurations, disordered liquid configurations were created. From an initial fcc configuration with a lattice parameter chosen to reproduce the liquid density [50], NVT simulations were performed at 2000 K for 500 ps, ensuring that the final configuration consisted in a melted system with no memory of the initial crystal structure. This final configuration was then used as a starting point for several simulation at different temperatures from 2000 to 1500 K with a step ∆T = 20 K that were run 100 ps to ensure equilibration. Similar to the crystal case, volume variations were done by rescaling different configurations for each temperature.
Number of atoms were chosen to have sufficient statistics, but limited to reduce computational time. In the crystal cases, we used 864 atoms for fcc and hcp (6 × 6 × 6 unit cells), 1024 atoms for bcc (8 × 8 × 8 unit cells) and 1000 atoms for dia (5 × 5 × 5 unit cells). For the liquid configurations, 4000 atoms were used instead: due to the disorder, more atoms were required to have sufficient statistics. In total, we obtained more than 6000 configurations, from which we calculated the radial distribution function (RDF) n(r) = 4πr 2 ρg(r) (cfr. Equtation (1)) using a binning of 0.05 Å , the bond angle distribution (BAD), using the minimum after the first peak as a cut-off (for bcc, the first two peaks were included, since they merge at high temperature), and the EXAFS signal.
While for crystals the nearest neighbor peak is clearly defined and so is the bond angle distribution, the definition for the liquid case is not unique [51]. The most common method is the use of a distance cut-off positioned around the first minimum, but small variations of this value can slightly change the distribution, due to more or less atoms being included in the calculations, which changes the height of the distribution. For comparison then, the normalized distribution (the integral is normalized to a constant) is used instead.

EXAFS Calculation
The χ(k) EXAFS signal for a configuration is calculated as the average of the signal coming from all atoms (ensemble average) using the suite of programs GNXAS [52,53]: where γ (2) and γ (3) are the irreducible 2-body and 3-body signals associated with the photoabsorbing atom 0 and its surrounding atoms up to a path cut-off. These signals are calculated by continued fraction expansion [54], containing all the significant multiple-scattering terms associated with that geometry. The use of a path cut-off is justified by the mean-free-path effects of the excited photo-electron, which makes the signal associated to distant coordination shells negligible. The cut-off used for crystal configuration was 8 Å and decreased to 6 Å for liquid configurations, due to the increased disorder. In both cases, it was confirmed that the use of a higher cut-off did not change the calculated EXAFS signal. The energy shift for the theoretical calculations was chosen randomly between −5 to +10 eV for each configuration. It has been shown [8] that using this strategy makes the NN trained on the dataset insensitive to this parameter.

NN Architecture and Training
For the NN, a short Python3 [55] program was written, making use of the Tensor-Flow [56] and Keras [57] packages. The architecture consisted, other than the input layer (k 2 -weighted EXAFS signal) and output layers (n(r) and BAD), in 4 hidden dense layers of 1000 neurons each with a rectified linear activation function (ReLU). Training was performed using the Adam optimizer [58] with a learning rate of 10 −4 and the metrics used were the mean squared errors for both quantities, scaled to have roughly the same weight. Early stopping with a patience of 50 steps was also used. All these parameters were obtained by grid search and through trial-and-error. For all optimization procedures, the dataset was divided in three chunks chosen randomly, following common practices of machine learning: a train set, a validation set and a test set, with a ratio of roughly 3:1:1, respectively. The train set was used to for the training of the weights in the NN, the validation set to check the training procedure and enforce the early stopping, and finally the test set to evaluate the performance of the NN. Parameters that obtained lowest loss on the test set were chosen. The use of Dropout [59] was also tested, but was found to not increase performance in our case.
For the training of the NNs used for data analysis, again 20% of the dataset was used for the test split, and the rest was divided into 5 equal parts. In a procedure similar to cross validation [60], one of the parts was used as validation set, while the others were used for training in rotation, for a total of 5 trained NNs. Results from each NN are then averaged and the standard deviation used to estimate errors.

Test Dataset
We first checked the performance on the test dataset, which has not been used for training. In Figure 1, we show in the top left the mean square error (MSE), ordered in descending order, for the worst 200 cases, and some selected configurations, one for each of the different type of structure we included in the dataset. For each we compare the RDF and BAD predicted by the NN from the calculated EXAFS signal with the ones from the model. Even in the cases where the MSE is high, like (a) and (b), the overall features of the distributions are correctly predicted and additional peaks, if present, have standard deviations that exceed their value. All the different type of structure are correctly identified.

Experimental Data
We then tested the NN on experimental data for Ni. XAFS spectra from a Ni thin-film was collected in transmission geometry at the beamline BL11 [61] of the Saga Light Source. The EXAFS signal (Figure 2) was extracted using the program jesf from the GNXAS package. With respect to the simulated data used for training, experimental data contain noise produced by the ionization chamber detectors, but we have found that the noise does not much affect the results. We tested this by manually increasing the noise present in the data: while the standard deviation increased as the noise increased, the results remained consistent until the signal-to-noise ratio approached 1.
Ni foil The data was fed to the NN, and the result obtained for the RDF is shown in Figure 3, compared with 4000 atoms NVT MD simulation at 300 K using the same potential used for constructing the dataset [47]. Peak position from X-ray diffraction (XRD) [49] and RDF obtained by RMC [42] are shown for comparison. We can see that the MD simulation in this case underestimates the thermal vibrations, resulting in slightly narrower and higher peaks with respect to the RDF obtained by NN, which instead agrees very well with previously published data [42]. Other than RDF, we obtained the BAD shown in Figure 4.

n(r) r [Å]
MD NN XRD Reference [49] RMC Reference [42]   . RDF for Ni at 300K obtained by MD simulation and by analysis of experimental data using NN. For comparison, peak position from X-ray diffraction [49] and RDF from RMC [42] are shown.
As with the RDF, also here we can see that the thermal disorder for the MD is lower respect to the prediction of the NN, but the angle's positions are correctly estimated. Doing the integration of each peak gives the number of atoms with that specific geometry: for the NN result, these values are shown in Figure 4 together with estimated error bars from the standard deviation. The correct coordinations for a fcc structure (shown also as gray bars) are obtained. We then used the NN to analyze data of undercooled and liquid Ni previously published [44]. In Figure 5, we report the obtained RDF obtained by NN, compared with NVT MD simulations at experimental density [50] and RMC analysis of the same data. Some differences are present: most noticeable the right side of the first peak is broader for RMC. Since it is a method based on random movements of the atoms, it is known to give more disordered configurations that could result, like in this case, to broader peaks. Secondly, the first peak from the NN is slightly shifted to lower distance in both cases, more visible in the liquid data, respect to MD and RMC results. This difference could be explained by the correlation between the average distance and the energy shift parameter ∆E. As explained in Section 2.2, this parameter was randomly shifted in the dataset, so that the NN became insensitive to it: while this strategy works well for the crystal phase thanks to various multiple-scattering contribution to the EXAFS signal, for the liquid phase, where most terms become zero, it may not be sufficient. The shift is, in any case, of the order of 0.02 Å. On the right side of Figure 5, the first peak obtained by the NN for the two cases are compared: clear changes are visible that predict the correct temperature behavior, even if the two experimental signals difference is quite small [44].
On Figure 6, we show the normalized BAD distribution from NN, again compared with MD and RMC. As anticipated in Section 2.1, in the liquid case, the nearest neighbors are not unequivocally defined and the use of a cut-off distance around the first minimum is usually used. Small variations of the cut-off, which increases/decreases atoms considered, result in magnitude variation of the BAD. Therefore, for comparing purposes, the use of the normalized distribution is better suited. We can see that there is a good agreement between NN and RMC results for the peak around 60 • , while some differences are present in the 90-120 • region. This is somewhat expected, since contribution to the EXAFS signal as a function of angle has minimum around this angle (see Appendix A, Figure A1) and so the prediction given by the NN is biased by the dataset used: we can see that in this region indeed the distributions obtained from the NN resemble the ones from MD simulations. Nevertheless, the agreement is overall good and the NN is able to predict the small differences between the two signals, which is an indication of changes in the local geometry of the two phases.

Discussion and Conclusions
We have shown that the NN method can be used to obtain not just radial distribution function, but also bond angle distribution, which gives information on the local symmetry of nearest neighboring atoms. Additionally, the method can be also used to study noncrystalline configurations, like liquids and amorphous materials. While this method is certainly powerful and permits quick and detailed analysis also by non-experts, results should be compared to additional methods or experimental data for validation and exclude possible biases from the dataset. To improve upon this point, datasets could be expanded, including additional types of structures, and shared, since their creation is the biggest bottleneck in terms of time. Extension to multi-component systems is technically possible, but further investigations in this direction are required for disordered systems.

Data Availability Statement:
The data presented in this study are available from the corresponding author upon reasonable request.

Acknowledgments:
We thank Andrea Di Cicco (University of Camerino) for permission to use data from Reference [44]. Part of the computation was carried out using the computer resources offered by the Research Institute for Information Technology, Kyushu University.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

ML
Machine Learning XAS X-ray absorption spectroscopy XANES X-ray absorption near edge structure EXAFS Extended x-ray absorption fine structure NN Neural Network MD Molecular Dynamics RDF Radial distribution function BAD Bond-angle distribution

Appendix A
Amplitude of the γ (3) signals as a function of the angle between two Ni atoms at distance 2.487 Å from the photo-absorbing atom. The increase at 180 • is known as the focusing effect given by the co-linear configuration [29,62].