Machine learning for neutron reflectometry data analysis of two-layer thin films

Neutron reflectometry (NR) is a powerful tool for probing thin films at length scales down to nanometers. We investigated the use of a neural network to predict a two-layer thin film structure to model a given measured reflectivity curve. Application of this neural network to predict a thin film structure revealed that it was accurate and could provide an excellent starting point for traditional fitting methods. Employing prediction-guided fitting has considerable potential for more rapidly producing a result compared to the labor-intensive but commonly used approach of trial-and-error searches prior to refinement. A deeper look at the stability of the predictive power of the neural network against statistical fluctuations of measured reflectivity profiles showed that the predictions are stable. We conclude that the approach presented here can provide valuable assistance to users of NR and should be further extended for use in studies of more complex n-layer thin film systems. This result also opens up the possibility of developing adaptive measurement systems in the future.


Introduction
Neutron reflectometry (NR) is an ideal technique for studying thin film structures. The properties of neutrons, specifically their sensitivity to light elements and different isotopes, allow NR to provide complementary information to that obtained using x-ray reflectometry, another powerful tool for studying these structures. While the coherent scattering lengths of x-rays vary linearly as a function of the elemental atomic number, neutron scattering lengths are all of comparable magnitude but can be very different for isotopes of an element, as well as for neighboring elements of the periodic table [1]. This characteristic makes it possible for neutrons and x-rays to highlight different features within a given sample. The magnetic moment of the neutron and its relatively low absorption cross-section for many materials that strongly absorb x-rays also make the neutron a unique probe for the study of materials. Specular reflectivity, where the incident and reflected scattering angles are identical, is a measurement of the probability that an incoming beam will be reflected off a thin film surface at a given angle [2]. It is measured as a function of the wave vector transfer Q = 4πsin(θ)/λ, where θ is the angle of reflection and λ is the wavelength of the neutron. It can also be measured with other probes, including x-rays. Specular reflectivity is normally presented as the reflectivity profile R(Q), which can be written in the first Born approximation as a function of the wave vector transfer Q [2]:
$$R(Q) = \frac{16\pi^2}{Q^4} \left| \int_{-\infty}^{\infty} \frac{\mathrm{d}\beta(z)}{\mathrm{d}z}\, e^{-iQz}\, \mathrm{d}z \right|^2 \qquad (1)$$
where z is the depth coordinate perpendicular to the surface of the film and β(z) is the scattering length density (SLD) profile. The SLD of a given compound is related to the coherent scattering lengths of the atoms in the material:
$$\beta = \frac{\rho N_A}{m} \sum_i n_i b_i \qquad (2)$$
where ρ is the mass density of the compound, m is the molecular weight of the compound, $N_A$ is the Avogadro number, $n_i$ is the number of atoms of type i in the compound, and $b_i$ is the coherent scattering length of that atom. The sum is taken over all types of atoms in the compound. The structure of the materials in a thin film gives rise to the SLD profile β(z).
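As a worked illustration of the relation between the SLD and the coherent scattering lengths, the short sketch below evaluates it for crystalline silicon. The function name and the tabulated inputs (density 2.329 g cm⁻³, molar mass 28.0855 g mol⁻¹, coherent scattering length 4.1491 fm) are assumptions introduced for this example and are not taken from the article.

```python
# Worked example: SLD = (rho * N_A / m) * sum_i n_i * b_i
AVOGADRO = 6.02214076e23  # mol^-1


def scattering_length_density(density_g_cm3, molar_mass_g_mol, atoms):
    """Return the SLD in inverse square angstroms.

    atoms is a list of (n_i, b_i) pairs, where n_i is the number of atoms of
    type i per formula unit and b_i is its coherent scattering length in fm.
    """
    # Formula units per cubic angstrom (1 cm^3 = 1e24 cubic angstroms).
    number_density = density_g_cm3 * AVOGADRO / molar_mass_g_mol * 1e-24
    # Total scattering length per formula unit, converted from fm to angstroms.
    b_total = sum(n * b for n, b in atoms) * 1e-5
    return number_density * b_total


# Assumed tabulated values for silicon (not quoted in the article).
sld_si = scattering_length_density(2.329, 28.0855, [(1, 4.1491)])
print(f"SLD of silicon: {sld_si:.3e} A^-2")
```

With these inputs the result agrees to three significant figures with the silicon SLD of 2.07 × 10⁻⁶ Å⁻² quoted in the performance section.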
As can be seen in equation (1), R(Q) is given by the squared norm of a complex amplitude, which causes a loss of phase information. Consequently, there are often multiple structures, some of which are not physically realistic, that can lead to very similar reflectivity profiles. As a result, data analysis is often difficult and labor-intensive. The key challenge is finding a good starting point, which is typically established by characterizing the thin film sample before making NR measurements. For example, one may use the complementary information from an x-ray reflectivity measurement of the sample and analyze both x-ray and neutron data together. Unfortunately, it is not always possible to measure the same sample in the same conditions using multiple techniques. Studies in which thin film samples are subjected to changes in operando make up a large fraction of NR experiments, and complementary structural information is generally not available while these experiments are taking place. In such cases, a starting model is chosen using a combination of the initial measurement of the as-prepared sample, the knowledge of the system under study, the expected behavior of the sample under sample environment changes during the measurement, and the expertise of the scattering scientist. The development of techniques to automate the process of finding a reasonable starting model is therefore important.
Machine learning offers new opportunities for accelerating the measurement and the analysis of scientific data. Ranging from developing models to assist the interpretation of data to the development of automated instruments that use machine learning predictions for autonomous control of experiments, this field is receiving growing attention from scientific user facilities [3]. In that respect, establishing the feasibility of using neural networks to extract structural information from reflectivity measurements of thin films would also have implications beyond helping reflectometry users fit their data.
There are several recent examples of using machine learning techniques to analyze x-ray and neutron scattering data [4][5][6][7][8][9]. Chang et al [4] and Archibald et al [7] applied machine learning methods to small-angle scattering data, while Samarakoon et al [6] used machine learning methods for neutron spectroscopy. The work of Greco et al [5] deals with reflectivity, specifically of x-rays. The authors developed an artificial neural network to determine x-ray reflectivity models from measured data. However, their neural network was only developed for data from thin films composed of a single layer, which is the simplest reflectivity data analysis problem. Recent work by Mironov et al [8] has focused on systems with up to three layers, and work by Carmona Loaiza et al [9] has focused on multi-layer systems.
In this article, we report on the application of machine learning for predicting an initial thin film structure from a measured reflectivity profile. The neural network developed is applicable to neutron reflectivity data from two-layer thin film systems. Although our neural network was trained for neutron reflectivity data, the results and conclusions obtained in this work are equally applicable to x-ray reflectometry by developing a suitable set of x-ray reflectivity profiles for training. The results presented here demonstrate that a neural network is suitable for fast analysis of reflectivity data. Such an approach could be used to obtain thin film structure predictions to either be used as a starting point for final model refinement, or for building an automated data analysis pipeline that supports autonomous measurements. Importantly, we demonstrate that the predictions are stable when presented with statistical fluctuations of the measurements, which is critical for the analysis of measured reflectivity data.
In section 2, we detail how the training set was obtained, how the neural network was designed and trained, and how the experimental measurements were made using the Magnetism Reflectometer (BL-4A) [10] at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL). In section 3, we describe the performance of the neural network, demonstrate the stability of the predictions under statistical fluctuations, and use the neural network to obtain thin film structure predictions for experimental measurements. We present our conclusions in section 4.

Thin film structure
The thin film structure used in this study is a two-layer system on a silicon substrate. The beam is incident onto the surface of the thin film layer furthest from the silicon substrate through air. Such a system can be modeled by defining an SLD profile parametrized by a stack of thin film layers, each with a thickness, an SLD, and a roughness parameter that describes the interface between two consecutive layers. Describing the SLD profile of a two-layer thin film therefore requires seven parameters (thickness, SLD, and roughness for each layer, and a substrate roughness).
Several reflectometry packages are available for creating a set of data for training a neural network [11][12][13][14]. Here, the reflectivity model calculations were done using the Refl1D package [11]. Reflectivity calculations with Refl1D use the thin film layer parameters outlined above to establish a multi-layer structure from which R(Q) is calculated. R(Q) is computed using the method of Abelès [15], where the roughness is accounted for using the approach of Névot and Croce [16]. A Q resolution of ∆Q/Q = 2.5% was used to generate the training data, a value that is consistent with the Magnetism Reflectometer [10].
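To make the seven-parameter model concrete, the following sketch computes R(Q) for an air / top layer / bottom layer / silicon stack. It is not the Refl1D implementation: it uses the Parratt recursion, which is equivalent to the Abelès matrix method for specular reflectivity, with a Névot-Croce factor at each interface, and it omits the ∆Q/Q = 2.5% resolution smearing. The function name, argument order, and default substrate SLD are assumptions made for this example.

```python
import cmath
import math


def two_layer_reflectivity(q, l_top, sld_top, sigma_top,
                           l_bot, sld_bot, sigma_bot,
                           sigma_sub, sld_sub=2.07e-6):
    """Specular reflectivity at one Q value (inverse angstroms).

    Thicknesses and roughnesses are in angstroms, SLDs in inverse square
    angstroms. sigma_top is the air/top interface, sigma_bot the top/bottom
    interface, and sigma_sub the bottom/substrate interface.
    """
    slds = [0.0, sld_top, sld_bot, sld_sub]   # air, top, bottom, substrate
    thicknesses = [0.0, l_top, l_bot, 0.0]
    sigmas = [sigma_top, sigma_bot, sigma_sub]  # interface j: medium j to j+1

    # Perpendicular wave vector in each medium (complex below a critical edge).
    k0 = q / 2.0
    kz = [cmath.sqrt(k0 * k0 - 4.0 * math.pi * sld) for sld in slds]

    def fresnel(j):
        # Fresnel coefficient of interface j damped by the Nevot-Croce factor.
        r = (kz[j] - kz[j + 1]) / (kz[j] + kz[j + 1])
        return r * cmath.exp(-2.0 * kz[j] * kz[j + 1] * sigmas[j] ** 2)

    # Parratt recursion, starting at the substrate and moving up to the surface.
    amplitude = fresnel(2)
    for j in (1, 0):
        phase = cmath.exp(2j * kz[j + 1] * thicknesses[j + 1])
        r_j = fresnel(j)
        amplitude = (r_j + amplitude * phase) / (1.0 + r_j * amplitude * phase)
    return abs(amplitude) ** 2


# Illustrative parameter values only; they are not taken from the article.
example = two_layer_reflectivity(0.05, 200.0, 4.0e-6, 5.0,
                                 100.0, 6.0e-6, 5.0, 3.0)
print(f"R(Q = 0.05) = {example:.3e}")
```

Setting both layer SLDs equal to the substrate SLD with zero roughness reduces the stack to a bare substrate, which gives total reflection below the critical edge and approaches the Q⁻⁴ behavior of equation (1) at high Q.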

Experimental data
The neural network developed was applied to data sets measured at the Magnetism Reflectometer [10]. The sample was a thin film of Haynes 230 nickel alloy deposited on a silicon substrate and annealed at 600 °C, which produced a well-defined two-layer structure, with a thick oxide layer on top of a nickel-rich alloy layer. Nickel is a ferromagnetic material that exhibits magnetic properties when present in a high enough density. Discussion of the properties of this material [17][18][19] is outside the scope of this paper. This data set was selected because it provides a demonstration of the utility of the neural network and because it was possible to model it with only two thin film layers using standard methods.
The sample was measured using polarized neutron reflectometry (PNR), performed in a 1 T magnetic field in the plane of the film. PNR follows the same formalism outlined above, but leverages the neutron's magnetic moment to extract information about the magnetic structure of a thin film [20]. In this case, the SLD can be expressed as the sum of a nuclear and a magnetic component, $\beta(z) = \beta_N(z) \pm \beta_M(z)$, where $\beta_N(z)$ corresponds to the nuclear component and $\beta_M(z)$ corresponds to the magnetic component. The magnetic contribution to the SLD depends on the spin state of the neutron, and the + and − signs indicate spin states parallel and anti-parallel to the in-plane magnetization, respectively. The reflectivity profile was obtained for each of the two spin states and our neural network was applied to each data set independently. Using a PNR measurement to test our approach provides a use case where two distinct data sets come from the same nuclear structure, so that we expect two closely related predictions.

Training set
A neural network was used to map a theoretical reflectivity curve R(Q) to its corresponding SLD profile. The intention was to train a neural network to predict the seven thin film parameters corresponding to an input reflectivity curve. Given the complexity of the inverse problem of finding the underlying structure that best represents a measured scattering curve, establishing a relationship between a given theoretical R(Q) curve and a physically plausible SLD profile is particularly valuable. Focusing on mapping the theoretical R(Q) to the underlying SLD profile also has the advantage of only necessitating simulated data. The approach allows for a diverse range of model parameters, and makes it possible to employ a larger training set than would be possible if real, measured data were to be used. This approach also intentionally separates the challenge of mapping a given theoretical reflectivity curve to its underlying SLD profile from understanding the impact of statistical fluctuations in measured data on the predictions.
For each thin film model, all thin film layer parameters were generated using a uniform random distribution from within a predefined range (see supplemental table S1 (available online at stacks.iop.org/MLST/2/035001/mmedia)). To ensure good predictions, the range of each parameter should be chosen to cover all measurable values. The range of each parameter in the model was selected to produce a reasonable distribution of model structures that could be studied with a neutron reflectometer instrument. The neural network, described below, was constructed such that each neuron of the input layer corresponded to a reflectivity value for a given point in Q. Therefore, a common array of Q values was used in evaluating R(Q), and the experimental data were measured at the same Q values to avoid potential errors due to interpolation. In the present study, the Q array had 92 points in the range 0.008 < Q < 0.088 Å$^{-1}$. The training set consisted of reflectivity curves calculated from one million randomly generated thin film structures, of which the last 10 000 were used as a validation set for the training.
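The sampling procedure described above can be sketched as follows. The parameter ranges are hypothetical placeholders (the actual ranges are those of supplemental table S1), the linear spacing of the Q grid is an assumption since the text only states the number of points and the range, and the set is scaled down from one million structures while keeping the same 1% validation fraction.

```python
import random

# Hypothetical ranges for illustration only; see supplemental table S1 for the
# real values. Thicknesses and roughnesses in angstroms, SLDs in 1e-6 A^-2.
PARAMETER_RANGES = {
    "L_top": (50.0, 500.0),
    "SLD_top": (0.0, 8.0),
    "sigma_top": (1.0, 30.0),
    "L_bot": (20.0, 300.0),
    "SLD_bot": (0.0, 8.0),
    "sigma_bot": (1.0, 30.0),
    "sigma_sub": (1.0, 30.0),
}

# Common Q grid with 92 points; linear spacing is an assumption.
N_Q = 92
Q_MIN, Q_MAX = 0.008, 0.088
q_values = [Q_MIN + i * (Q_MAX - Q_MIN) / (N_Q - 1) for i in range(N_Q)]


def sample_structure(rng):
    """Draw each of the seven parameters uniformly within its range."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in PARAMETER_RANGES.items()}


rng = random.Random(42)
n_structures = 1000            # the article uses one million
n_validation = n_structures // 100  # the last 1% is held out for validation
structures = [sample_structure(rng) for _ in range(n_structures)]
training = structures[:-n_validation]
validation = structures[-n_validation:]
print(len(training), len(validation))
```

Each sampled structure would then be passed to a reflectivity calculation evaluated on `q_values` to produce one training curve.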

Neural network model
A neural network with six hidden layers was used to model our reflectometry data using TensorFlow [21]. The neural network is shown schematically in figure 1. The input layer consisted of a number of neurons equal to the number of Q values at which the reflectivity profile was measured. The performance of neural networks improves when the input parameters vary within a limited range of values [22]. Reflectivity profiles, on the other hand, can span up to 6 or 7 orders of magnitude. For this reason, the input R(Q) profiles were preprocessed before being used as the input layer to our neural network. For a given R(Q), a preprocessing function (equation (3)) was applied to obtain R′(Q), which is passed to the input layer of the neural network. This function includes $Q_0$, the first Q value. The inclusion of $Q_0$ has no effect on the neural network and was only added for convenience to keep a reasonable scale. The same preprocessing was applied to real data when predicting thin film parameters with the neural network. It should be noted from equation (1) that choosing a multiplicative factor of $Q^4$ may further minimize the amplitude range of R(Q). On the other hand, the uncertainty on a real measurement tends to increase with Q as the counting statistics get poorer. A reasonable approximation of the dependence of the uncertainty on Q in a real reflectivity measurement is to assume simple counting statistics, in which the number of counts scales with R(Q) and therefore falls off as $Q^{-4}$ at high Q, which leads to
$$\frac{\Delta R}{R} \propto Q^2$$
Knowing that the ultimate goal is to test our neural network with real measurements, a good preprocessing scheme for R(Q) should serve both the purpose of enabling efficient training and stabilizing predictions. Our choice for the preprocessing of the R(Q) values passed to the input layer was also made with the cost function used for training in mind. The cost function used was the mean squared error (MSE), defined as
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( R'_i - R'^{\,\mathrm{pred}}_i \right)^2 \qquad (4)$$
where the sum runs over all N points in Q, $R'_i$ is the preprocessed reflectivity value at the ith point, and $R'^{\,\mathrm{pred}}_i$ is the predicted value of R′(Q) at that point. Our choice of R′(Q) therefore uses a factor proportional to the uncertainty on each point to give more weight to points at lower Q where the relative uncertainty is smallest.
The number of hidden layers and the number of neurons per layer were chosen empirically. We started with a single hidden layer and incrementally added layers until we could obtain a qualitatively reasonable correlation between generated and predicted parameters. Since the quality of the predictions also depends on the size of the training set and the number of epochs chosen to train the neural network, these were selected using the same approach. After exploration, we determined that six hidden layers worked well for the problem. The number of neurons was 300 for the first hidden layer, 400 for the second, third and fourth hidden layers and 100 and 50 for the fifth and sixth hidden layers, respectively. The output layer of the neural network consisted of the seven structural parameters needed to describe the two-layer structure on silicon. The thin film structure parameters themselves were normalized to [−1, 1] within their respective value range. The predictions were then transformed to the original ranges that the neural network was trained with.
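The normalization of the output parameters to [−1, 1] and the inverse transformation applied to the predictions can be illustrated with a minimal sketch. A linear mapping is assumed here, and the function names and the example range are hypothetical.

```python
def normalize(value, low, high):
    """Map a parameter from its range [low, high] onto [-1, 1]."""
    return 2.0 * (value - low) / (high - low) - 1.0


def denormalize(scaled, low, high):
    """Map a network output in [-1, 1] back onto the original range."""
    return low + (scaled + 1.0) * (high - low) / 2.0


# Hypothetical thickness range in angstroms, for illustration only.
low, high = 50.0, 500.0
scaled = normalize(275.0, low, high)    # midpoint of the range
restored = denormalize(scaled, low, high)
print(scaled, restored)
```

Under this mapping a prediction near ±1 sits at the edge of the training range, which is where the performance section reports distorted predictions.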
The Adam optimizer [23] was used to train the neural network, with a learning rate parameter of $5 \times 10^{-4}$. The scaled exponential linear unit activation function was used [24] along with the LeCun kernel initializer [25]. Figure 2 shows the loss function value as a function of epoch, as well as the loss function value for the validation set. The neural network model was trained for 20 epochs. The number of epochs was chosen such that the loss function value (see equation (4)) of the validation set remained similar to the loss function value of the training set. Above 20 epochs, the loss for the training set was more likely to dip below the loss for the validation set, indicating the possibility of over-fitting. The code that was used to generate the training set and train the neural network is available on GitHub [26], along with the trained network.
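The layer sizes, activation function, and initialization described above can be illustrated without TensorFlow by a plain-Python forward pass through an untrained network. This is only a structural sketch under stated assumptions: weights are drawn from a Gaussian of variance 1/fan_in (the article does not state whether the normal or uniform LeCun variant was used), biases start at zero, the output layer is taken to be linear, and training with Adam is not shown.

```python
import math
import random

# 92 inputs, six hidden layers, 7 outputs, as described in the text.
LAYER_SIZES = [92, 300, 400, 400, 400, 100, 50, 7]
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit."""
    return SELU_SCALE * (x if x > 0.0 else SELU_ALPHA * (math.exp(x) - 1.0))


def init_layer(n_in, n_out, rng):
    """LeCun-style initialization: zero-mean Gaussian with variance 1/n_in."""
    std = math.sqrt(1.0 / n_in)
    weights = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, inputs):
    """Propagate one input vector; SELU on hidden layers, linear output."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = z if is_output else [selu(v) for v in z]
    return activations


rng = random.Random(0)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])]
n_weights = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
outputs = forward(layers, [rng.uniform(-1.0, 1.0) for _ in range(92)])
print(n_weights, len(outputs))
```

The count of trainable parameters for these layer sizes, about half a million, is of the same order as the one million curves in the training set.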

Performance
To evaluate how well the trained neural network performed, we generated 100 000 new thin film models using the same procedure used to develop the training set. For each R(Q) in this set, the seven parameter predictions were generated by querying the neural network. Figure 3 shows how well the neural network predictions matched the generated parameters. Each sub-figure in figure 3 was obtained by histogramming the 100 000 pairs of predicted and generated parameters. Supplemental figure S1 shows the deviation of each predicted parameter, $\Delta P = P_\mathrm{pred} - P_\mathrm{true}$, for the same data. All thin film model parameters are very well modeled. The predictions for the SLD of the bottom layer (SLD$_\mathrm{bot}$) close to the SLD of silicon ($2.07 \times 10^{-6}$ Å$^{-2}$) are not as well reproduced. This outcome is to be expected, since in this case the bottom layer blends in with the substrate. Very low roughness values are slightly less well reproduced. This is to be expected, as low roughness values will have a smaller effect on the reflectivity profile compared to larger roughness values. It should be pointed out that predictions can also be distorted near the edges of the chosen parameter ranges. This can be seen in several sub-figures of figure 3, but especially for the roughness parameters. Although inevitable in the case of a physical limit, this highlights the need to select a training set having parameter ranges with wide enough margins to minimize distortions when the neural network is used for a particular application. Supplemental figure S2 shows an example of how the choice of the range of $\sigma_\mathrm{bot}$ affects the predictions.

Effect of statistical fluctuations
Our neural network was trained on reflectivity profiles without statistical variations on each R(Q) point, yet real experimental data always have them. The loss of phase information intrinsic to reflectometry measurements makes reflectivity modeling a challenging endeavor and is compounded by the statistical variations in measured data. Therefore, it is vital that a neural network be able to make stable predictions from input data with statistical fluctuations representative of those encountered in real measurements. A usable neural network needs to map the solution space well enough to offer stable predictions. This requirement is particularly important because of the 'black box' aspect of neural networks that parametrize a solution space in a way that does not lend itself to a simple interpretation. In this section, we investigate the effect of counting statistics on the neural network predictions. To simulate measurement statistics, we generated 100 000 new reflectivity profiles using the approach described in section 2.3. For each profile, statistical fluctuations were added to each $R(Q_i)$ point by adding a random value $\delta R(Q_i)$ selected from a Gaussian distribution with a standard deviation $\sigma_{R_i}$. In order to simulate realistic fluctuations, the standard deviation for each point i was set to
$$\sigma_{R_i} = R(Q_i) \left( \frac{\Delta R_i}{R_i} \right)_\mathrm{meas} \qquad (5)$$
where $(\Delta R_i / R_i)_\mathrm{meas}$ corresponds to the relative uncertainty on the ith point in Q of one of the two measurements performed at the Magnetism Reflectometer and described in section 2.2. Both measurements have similar uncertainties and would be equally suitable for this purpose; we chose the spin-down measurement. Following the generation of this new data set, our trained neural network was used to predict thin film structure parameters for each of the 100 000 simulated measurements. Figure 4 shows the relationship between the predicted and true values for each of the seven parameters.
Supplemental figure S3 shows the distribution of the deviation $\Delta P = P_\mathrm{pred} - P_\mathrm{true}$ for each parameter. Similar to the case without statistical fluctuations, most parameters are well modeled by the neural network. The roughness parameters $\sigma_\mathrm{sub}$ between the substrate and the bottom layer, and $\sigma_\mathrm{bot}$ between the bottom and top layers, are not as well modeled and show a tail toward smaller values. This result is not surprising. These two roughness parameters mainly affect the high-Q tail of the reflectivity curve, where the measurement uncertainties are larger. Although statistical fluctuations of the measurements will translate into a wider range of predictions for roughness parameters, they do not prevent the neural network from providing valuable structure predictions. The $L_\mathrm{bot}$ values below 50 Å are not well reproduced. In particular, predictions for simulated data with true $L_\mathrm{bot}$ values below 50 Å can be very large, as can be seen in figure 4. Investigating those simulations, we found that they mostly corresponded to thin film structures with large values of SLD$_\mathrm{top}$ > $6 \times 10^{-6}$ Å$^{-2}$ where the SLD profile of the film could be interpreted as a single layer. For these simulations, the neural network still interprets the data as a two-layer system, with a larger $L_\mathrm{bot}$ than its true value.
A clear difference can be seen in the quality of the predictions for SLD$_\mathrm{bot}$ values below the SLD of silicon compared to higher SLD$_\mathrm{bot}$ values. Statistical fluctuations have a smaller effect for SLD$_\mathrm{bot}$ < $2 \times 10^{-6}$ Å$^{-2}$ where a critical edge is more likely to be visible. Our analysis demonstrates that a neural network can be used to map a reflectivity curve R(Q) to an estimate of its underlying SLD profile within reasonable uncertainty.
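The noise model used in this section, in which each simulated point is shifted by a Gaussian random value whose standard deviation is the reflectivity multiplied by the measured relative uncertainty, can be sketched as follows. The toy reflectivity curve and the relative uncertainties are invented for illustration; in the study they come from a calculated profile and from the spin-down measurement, respectively.

```python
import random


def add_counting_noise(reflectivity, relative_uncertainty, rng):
    """Return R_i + delta_R_i, with delta_R_i drawn from a Gaussian of
    standard deviation sigma_i = R_i * (dR_i / R_i)_meas."""
    return [r + rng.gauss(0.0, r * rel)
            for r, rel in zip(reflectivity, relative_uncertainty)]


rng = random.Random(1)
q_values = [0.008 + i * 0.080 / 91 for i in range(92)]
# Toy curve: plateau followed by a Q^-4 fall-off (illustration only).
clean = [min(1.0, (0.012 / q) ** 4) for q in q_values]
# Hypothetical relative uncertainties that grow with Q.
relative = [0.02 + 0.30 * (q / 0.088) ** 2 for q in q_values]
noisy = add_counting_noise(clean, relative, rng)
print(f"first point: {clean[0]:.3f} -> {noisy[0]:.3f}")
```

Repeating the call with the same clean curve yields independent simulated measurements, each of which can be passed through the same preprocessing and the trained network.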

Applying the trained network to measured reflectivity
The trained neural network model was applied to PNR data collected from the Haynes 230 sample described in section 2.2, which was obtained with the Magnetism Reflectometer [10] at ORNL. The measured reflectivity R(Q) was binned to the Q binning chosen to train our neural network, and each measurement was preprocessed using equation (3) to obtain R ′ (Q).
The left panels of figures 5 and 6 show the measured reflectivity profiles and the neural network predictions for the two measured neutron spin states. The predictions are compared with fit results obtained with the Refl1D software, using the DREAM algorithm [27] for the minimization process. Since the nuclear structure is the same for both measured spin states, these were fit simultaneously using the same thin film structure. All parameters were constrained to be the same for both data sets, except for the SLD of the magnetic bottom layer, which was allowed to vary with each data set. Setting up the constrained fit was done using the Webi reflectometry application developed at ORNL [28]. The fit parameters are shown in table 1 and compared with the prediction for each spin state. The right panels of figures 5 and 6 show the comparison between the fitted model and the neural network predictions. Both the predicted SLD profiles and the corresponding reflectivity curves agree well with the fit results.
To expand our understanding of the stability of the neural network predictions to statistical fluctuations, the technique used in the previous section was applied to the measured data. Each point of the measured R(Q) data set was randomly varied according to a Gaussian distribution with a standard deviation equal to the measurement uncertainty of each point. This process was used to generate a new set of 20 000 simulated measurements and the neural network was then used to predict thin film parameters from these 20 000 simulated R(Q) profiles. The distribution of the predicted values for each parameter is shown in supplemental figures S4 and S5. The standard deviation of the distribution of predictions for each parameter was used to estimate the stability of the prediction for that parameter, and is quoted as the uncertainty of the neural network predictions in table 1. The predicted parameters agree well with the fit results for both measured data sets, which demonstrates the feasibility of using a neural network to obtain a thin film structure prediction for a reflectivity measurement. Such predictions could readily be used in subsequent automated processes, or could be used as an initial model for traditional data fitting.
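The resampling procedure used to estimate the stability of the predictions can be sketched as below. The `predict` argument stands for the trained network together with its preprocessing; here it is replaced by a hypothetical stand-in function so that the example runs on its own, and the measured values and uncertainties are invented.

```python
import random
import statistics


def prediction_spread(reflectivity, uncertainty, predict, n_samples, rng):
    """Standard deviation of each predicted parameter over resampled data.

    Each resampled curve shifts every point by a Gaussian random value whose
    standard deviation is the measurement uncertainty of that point.
    """
    predictions = []
    for _ in range(n_samples):
        resampled = [r + rng.gauss(0.0, dr)
                     for r, dr in zip(reflectivity, uncertainty)]
        predictions.append(predict(resampled))
    # Transpose to obtain one list of predicted values per parameter.
    return [statistics.stdev(values) for values in zip(*predictions)]


def stand_in_predict(curve):
    """Hypothetical placeholder for the trained neural network."""
    return [sum(curve), curve[0]]


rng = random.Random(7)
measured = [1.0, 0.5, 0.1, 0.02]          # invented values
errors = [0.02, 0.02, 0.01, 0.005]        # invented uncertainties
spread = prediction_spread(measured, errors, stand_in_predict, 5000, rng)
print([f"{s:.4f}" for s in spread])
```

Because the stand-in predictor is linear, the expected spreads are known analytically (the quadrature sum of the uncertainties for the first output, the first uncertainty for the second), which gives a simple check of the procedure before substituting a real network.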
The standard deviation values quoted in table 1 with the network predictions have a different meaning than the parameter uncertainties produced by the DREAM algorithm [27] implemented in Refl1D [11]. The Refl1D package uses a Bayesian approach to determine the uncertainty on model parameters. The results are meant to be interpreted as a probability density function for each parameter given the measured data. In our case, once our neural network is trained, each input reflectivity curve has a single well-defined neural network output. For a particular underlying (true) thin film structure, the probability of observing a given reflectivity profile given statistical fluctuations is also the probability of obtaining the corresponding neural network prediction. The standard deviation on each parameter prediction in table 1 is an estimate of the expected variations in the prediction of each thin film parameter one would obtain if this particular measurement were repeated a larger number of times. The fact that the standard deviation of each parameter prediction is larger than the uncertainty of the corresponding fit parameter demonstrates the importance of follow-on refinement fitting for final data analysis.
Table 1 (partial; only the final two rows are available). Columns alternate between neural network prediction and fit for each of the two spin states.
(parameter label missing) | 13.6 ± 2.4 | 9.9 ± 0.5 | 13.3 ± 3.9 | 9.9 ± 0.5
χ² | 5.0 | 1.2 | 3.8 | 1.0

Conclusion
Establishing an initial thin film model for reflectometry fitting can be a time-consuming effort, especially for novice users of reflectometry. The problem is compounded at scientific user facilities, where a large number of novice users perform a large number of measurements in a short period of time. The most limited quantity at such facilities is the time that the instrument scientists have for assisting users with data analysis. We investigated the possibility of using a neural network to suggest an initial model structure of a two-layer thin film to use as a starting model for data analysis, with the long-term goal of eventually applying the approach to an n-layer thin film system. We were able to train a neural network to predict two-layer thin film model parameters with sufficiently good accuracy to be useful as a starting structure in traditional data fitting. Importantly, we found that the predictions obtained with our neural network were stable when statistical fluctuations of the measurement were considered. When applied to measurements performed at the Magnetism Reflectometer at SNS, the predicted parameters were close to their respective fitted values obtained through traditional data analysis. In most cases, the neural network predictions agreed with the data analysis results within the standard deviation of the prediction. These results are encouraging and indicate the potential for using neural networks as an integral step in automated data analysis systems, either for automatically producing estimated structures for users to consider, or to inform future autonomous measurement procedures.
We conclude that the use of a neural network is a fast way to obtain a very good starting point for analyzing neutron reflectivity data. The present results show that the development of neural networks for multi-layer thin film model predictions has the potential to become a powerful tool for reflectivity data analysis that would significantly impact the operation of instruments at large-scale neutron and x-ray user facilities.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.5281/zenodo.4318079.