Data analysis of neutron Compton scattering experiments using MANTID

The VESUVIO spectrometer at the ISIS pulsed neutron and muon source is an epithermal and thermal neutron station requiring a software suite able to tackle the analysis of experimental data from several neutron techniques. We present an update on the tools available within the MANTID environment for the treatment of neutron Compton scattering data, and we discuss new strategies for the analysis of neutron Compton profiles and momentum distributions that provide robustness to the scientific studies based on this technique.


Introduction
Neutron Compton Scattering [1,2], also referred to as deep inelastic neutron scattering, is a spectroscopic technique that measures Nuclear Quantum Effects (NQEs) in condensed-matter systems. The main observable available in a NCS experiment is the width of the Nuclear Momentum Distribution (NMD), σ M , of each element of mass M in the investigated sample. The analysis of the shape of NMDs yields direct information on the nuclear dynamics, underlying, for example, the degree of anharmonicity and anisotropy of the local potential, or the importance of NQEs in different phases of a condensed-matter sample, or at different temperatures. At present, the flagship instrument routinely performing NCS experiments is the VESUVIO spectrometer [3,4] at the ISIS pulsed neutron and muon source [5]. Several measurements can be performed during the same experiment, including NCS, neutron transmission, neutron diffraction, neutron resonance capture analysis, and prompt-gamma activation analysis [6] The analysis of NCS data has been recently included within the MANTID software [7,8], and first results have already been published [9,10], also in conjunction with the FABADA tool for data analysis and robust model selection, based on Bayesian inference [11]. A preliminary discussion on the implementation of NCS analysis in MANTID was presented in Ref. [12], yet in recent years a considerable effort has been directed towards the improvement of the functionalities and capabilities of the software, as well as the development of new algorithms [13][14][15].
We report here on the updated capabilities of the MANTID software in visualising, reducing, and analysing data collected on the VESUVIO spectrometer, with particular emphasis on the analysis of NCS spectra in the so-called West-scaling domain.

Materials and methods
NCS is based on the use of neutron energy transfers in excess of some eV, and momentum transfers in excess of tens ofÅ −1 , in a scattering regime referred to as the Impulse Approximation (IA) [16]. VESUVIO is an inverted-geometry spectrometer [4,17], where the energy of scattered neutrons is selected by cycling gold foils. Gold has a very intense absorption cross section at about E 1 = 4.9 eV related to a (n,γ) nuclear resonance. When a neutron within a narrow energy range around E 1 passes through the foil, there is a high probability for it to be absorbed and a cascade of prompt γ-rays is emitted. NCS spectra on VESUVIO are obtained by taking the difference of raw spectra obtained with different configurations of the gold foils. In particular, two configurations are available for the photon-sensitive front-scattering detectors [18,19], and three configurations for neutron-sensitive backscattering detectors [20,21]. Additional details on the modes of detection on VESUVIO can be found in Refs. [1,2,22]. As a consequence of such procedures, every data-acquisition run is composed of 6 periods corresponding to different positions of the energy-selecting gold foils. The differences of raw data from different periods allow the subtraction of the environmental contributions, with the exception of a residual sampledependent γ-Background (GB) due to the change in the position of the gold foils [23,24].
The VESUVIO experiments discussed in this paper were performed on a polyethylene foil with thickness 0.25 mm in the direction of the neutron beam, and cross-sectional area 8×8 cm 2 , larger than the beam cross section [4]. Figure 1 shows a schematic representation of the spectrometer, including the numbering of spectra from different detectors. The foil was attached to a square aluminium frame, not directly hit by the incident neutron beam to avoid unnecessary correction for sample-container backgrounds during the data analysis, and measured at room temperature. In what follows, we shall refer to two measurements, primarily characterised by the accumulated proton charge in the ISIS synchrotron: sample 1 was measured for a total of 6.0 mAh in the closed-cycle refrigerator (CCR); and sample 2 was measured for a total of 1.8 mAh in the vacuum tank, after removing the CCR. Sample 1 was measured in helium atmosphere at 60 mBar, and sample 2 in vacuum at 10 −6 mBar.
In its most general form, the NCP can be expressed as a series of Gauss-Hermite polynomials, in a series often referred to as a Gram-Charlier expansion, of the form: where n = 0, 1, 2, . . . , and the two conditions c 0 = 1 and c 2 = 0 are required so as to interpret the NCP as a probability function. Under these conditions, σ M corresponds to the standard deviation of the leading Gaussian term in Eq. 4. This equation provides the most used model to fit experimental NCPs in NCS experiments. The NCP is expected to be a simple Gaussian profile when the nucleus experiences a harmonic and isotropic potential. Terms related to n > 3 are seldom expected or required in the fit, and it is common practice to assume the NCP of heavy elements as simple Gaussian functions during the data reduction stages. One should notice that Eq. 4 only holds within the IA, i.e., when both E and Q are very large compared to the characteristic binding energies and inverse inter-atomic distances in the system, respectively. In an experiment, corrections for the finite value of Q are generally required, referred to as Final-State Effects (FSEs). These are commonly assumed to be of the form [26]: The proportionality of the FSE correction term upon σ M in Eq. 5 is a consequence of the harmonic approximation, a condition that can be relaxed in the actual fit by including an additional factor as a free parameter [1]. According to the definition of TOF for a spectrometer with incident and final flight paths L 0 and L 1 , respectively, and the condition that, at the peak centre, the energy transfer equals the nuclear recoil energy, E = E r , one has: As E r =h 2 Q 2 /2M is inversely proportional to the mass of the scattering nucleus, light-weight elements contribute to the scattering intensity at low TOF values, as opposed to heavy-weight elements that, approaching the limit of infinite mass, have an upper bound to their NCP peak centre at t ∞ = m/2E 1 (L 0 + L 1 ) 390 µs on VESUVIO. As a consequence of this dependence upon the recoil energy, NCS is a MAss-selective Neutron SpEctroscopy (MANSE) [27]. Figure 2 (top-left) shows the forward-scattering intensity as a function of TOF and spectrum number (see Figure 1 for details on the numbering of spectra) for polyethylene sample 1. One can easily notice the change in peak position for the hydrogen NCP, with the peak at lower TOF values for detectors at high scattering angles (higher values of Q). Finally, one should appreciate the fact that the dynamic variables in Eq. 4 are all expressed as explicit functions of the time channel t in the TOF spectrum from each detector at scattering angle θ: E 0 (t), Q(t, θ), and y M (t, θ). The loading of TOF spectra into MANTID using the "LoadVesuvio" algorithm is described in Appendix A.
3.1. Sequential pre-fit of raw TOF spectra The corrections to be performed on the raw NCS data are based on the defining parameters of the NCPs, i.e., the width of the peak σ M , the main observable of the measurement, and the intensity of the peak I M . Therefore, a preliminary fit of the raw spectra is needed so as to obtain estimates of σ M and I M , and the procedure can be iterated after the corrections have been performed. The positions of the peaks are fixed, at this stage, by the a priori knowledge of the mass of the constituents of the investigated sample. Details on the way the pre-fit is implemented in MANTID using the "VesuvioTOFFit" algorithm can be found in Appendix B. One can apply Eq. 1 to the case of polyethylene by fitting two profiles, one for hydrogen and one for carbon. As the stoichiometry of the sample is well known, with two hydrogen atoms per carbon atom, a constraint can be applied to improve the quality of the fit, especially for noisy data. The relative intensities of the two profiles, as measured by each detector, are in a fixed relation defined, in Eq. 1, by the ratio of their stoichiometric weights and bound scattering cross sections. Assuming an expression for the intensity of a given mass to be I M = N M b 2 M , one has The result of the pre fit on the raw spectra is shown in Figure 2 (top-right).

Correction for the sample-dependent γ-background
The fitted values of σ M and I M , together with other calibration parameters related to the detector positions, can be used to correct for the sample-dependent GB, as discussed in Ref. [28]. The correction has an oscillatory character around each NCP, and it is shown in the bottomright panel of Figure 2. Additional details on how to perform the GB correction in MANTID using the "VesuvioCalculateGammaBackground" algorithm are discussed in Appendix C.

Correction for multiple scattering
The multiple-scattering correction is originated by the finite thickness of the sample, and the consequent finite probability of a neutron to scatter more than once within the sample, before being detected [29]. The scattering probability can be expressed, for epithermal neutrons, as where T (E 0 > 1 eV) is the transmission for epithermal neutrons, ρ the number density of the sample, d is the thickness of the sample, and σ f the free scattering cross section of the constituent elements of the sample [25,30], CH 2 in the case of polyethylene, with stoichiometric coefficients N M , and defined as: The product ρσ f d is often referred to as the epithermal scattering power. A wise experimenter should chose d so as to minimise the contribution from multiple scattering aiming for a transmission of epithermal neutrons around 0.9, therefore making the terms with n > 1 in Eq. 8 negligible. The "VesuvioCanonicalThickness" algorithm can be used to define the best dimensions of a sample to be measured, and the "VesuvioTransmission' algorithm can be  [31,32]. The MS correction is evaluated using a Monte Carlo code [33] and the information about the fitted intensities {I M }, and the experimental value of T (E 0 > 1 eV). Additional details on how to perform the MS correction within MANTID using the "VesuvioCalculateMS" are discussed in Appendix D.

Sequential fit of corrected TOF spectra
An iterative procedure can be adopted, whereby the NCPs in the TOF domain are fitted after the corrections for GB and MS. The updated parameters for σ M and I M can be used to recalculate the corrections until convergence is reached. The results of the sequential fit over all detectors for the width of the NCP, {σ M }, represents a first, rough, estimate of this observable. The results for the polyethylene samples obtained from such statistical ensemble are reported in Table 1, and referred to as Mode 2, were obtained using the Simplex minimiser. The fit of TOF NCS data using a Levenberg-Marquardt minimiser was found to fail dramatically in the case of several experimental investigations, and its use at this stage is not recommended. The fit is also used to identify the NCPs from masses other than the mass of interest. These profiles should be subtracted from the experimental spectra before the analysis is continued in the domain of the West variable for the mass of interest. Additional details on the subtraction of NCPs from heavier masses are given in Appendix E. In the case of polyethylene, the fitted profile of the carbon NCP is subtracted from the data previously corrected for GB and MS, therefore eventually containing the hydrogen NCP alone. The fit over a bank of detectors corresponding to spectra 151 -158, together with the GB correction is shown in Figure 2 (bottom-left).

Global fit of hydrogen NCP in the West-scaling domain
Corrected TOF spectra can be transformed to the hydrogen West-scaling variable, y H , according to Eq. 2. One can appreciate the advantage of analysing the NCS spectra in the West-scaling domain from Figure 3  of strategies for the analysis of the hydrogen NCP are available at this juncture. All spectra are assumed to show the same physical information, i.e., a probability function with standard deviation σ M . However, two effects make the spectra dependent upon the scattering angle: i) the presence of FSE corrections, proportional to the inverse of the momentum transfer Q, slightly shifting the apparent centre of the NCP towards negative values of y H ; ii) the resolution function being broader for small scattering angles [34], as one can appreciate from Figure 3 (right).
At this stage, spectra are generally unit-area normalised, so as to satisfy the zero-order sum rule for probability distributions. Additional details on the data handling in the West-scaling domain are given in Appendix F

Global fit over all individual detectors
A number of scientific investigations in recent years [35][36][37][38][39] have been based on fits over all individual front-scattering spectra in the West-scaling domain, with the parameters defining the hydrogen NCP, namely σ H and c 2n , being included in the fit as global parameters. Other variables, such as the amplitude of each experimental NCP and its peak centre, y 0 ∼ 0, were considered spectrum-dependent. The number of parameters that were fitted in this way easily reached 100, and a robust and reliable framework needs to be available to undertake such a task. Traditionally, the analysis has been performed within the Minuit environment [40], used here as a benchmark to validate the results obtained within MANTID.

Fit of the spectral sum
Assuming that FSEs and the width of the resolution functions vary only slightly with the scattering angles considered, one can replace angle-dependent values with average ones. Consequently, by averaging all experimental spectra, resolution functions, and values of Q for every channel y H , one can perform a single-spectrum fit using the function where F , as opposed to J, traditionally denotes the experimental NCP, therefore including convolution with the resolution function and FSE corrections, andF (y H ,Q) is the average of all experimental spectra. The result of this procedure is reported in Table 1 and referred to as Mode 3, and the fitting curve is shown in Figure 4. The fitting procedure was tested using three minimisers: Simplex, Levenberg-Marquardt, and FABADA, yielding consistent results. Additional details on this fitting procedure are discussed in Appendix G.

Global fit over banks of detectors
A global fit over "banks" of detectors was implemented within the MANTID environment. The term "bank" here refers to a collection of detectors, not necessarily placed at similar scattering angles. In this procedure, the fit is performed over an ensemble of spectra with σ M defined as a global parameter. The results from all banks are then considered as a statistical ensemble for which the average and standard deviations are evaluated. The results of this analysis using the Simplex minimiser are shown in Table 1, and referred to as modes 4 -7, depending on the way each bank was defined. Additional details on the way the global fit over banks of detectors is implemented in MANTID can be found in Appendix H.

Discussion
As a number of procedures have been attempted to analyse the hydrogen NCP in polyethylene, we can now compare the obtained results. We comment below on each analysis mode:  (1) The global fit of all individual spectra in the West-scaling domain. This is a common approach to the analysis of NCS data, that has been demonstrated to show reliability and reproducibility. At present, it is performed using FORTRAN routines based on the Minuit environment. Here, we take these results as a benchmark. The drawbacks of this approach lie in the time-consuming evaluation of the figure of merit, and the difficult implementation in environments where the available minimisers are not powerful enough. (2) The sequential fit of TOF spectra. This is a necessary step in the data reduction. Due to the poor signal-to-noise ratio for individual spectra, the spread in values of {σ M } is generally large, which makes the results unusable for precise work. Yet, the average of the fitted values is relatively near the benchmark result, and can be used reliably for the GB and MS, and even FSE corrections.
(3) The fit on the sum of spectra in the West-scaling domain. This is a simple and quick approach allowing for a simple visualisation and interpretation of the results. The value of σ H can be accessed with a high precision, however the result could lack accuracy whenever some spectra from faulty detectors are included in the sum and contribute with spurious intensity to the average NCP. This is the case, for example, of the results from the noisier Sample 2, reported in italics in Table 1. The values obtained from mode 3, in this case, provide a clear underestimate of the value of σ H with respect to the other results in the table, an outcome that can be prevented by careful inspection, and possibly masking, of each individual spectrum prior to their sum. We notice that the task of recognising faulty spectra is generally more complex for noisier data. As a number of strategies are available within the Mantid environment to analyse the NCS spectra, one can treat the many results as members of a statistical ensemble associated with a mean value and a standard deviation. We refer to this approach as mode 8 in Table 1. It is interesting to notice how the values of σ H obtained in this way, as well as the magnitude of the associated uncertainty, are in remarkable agreement with the benchmark results. This suggests that more-robust minimisers could yield smaller uncertainties in the case of the global fits for modes 4 -7.

Outlook and conclusions
We have presented an update on the implementation of the data reduction and analysis of NCS spectra within the MANTID environment. To this end, we have used a polyethylene standard sample to discuss several strategies to obtain values of the hydrogen NCP width, σ H . As a result of this analysis, we have provided a detailed Python script that can be readily used to analyse NCS data collected on VESUVIO. The present work suggests that the analysis of NCS data in MANTID can benefit from the inclusion of other, more robust, minimisers.
Appendix A. Initialisation procedure The first step in order to analyse the NCS data in MANTID is to open a Python-script window and define the initialisation parameters. These include: • the experimental run numbers to be analysed, according to the VESUVIO data-acquisition journal; • the spectra corresponding to the detectors to be considered, according to the schematics in Figure 1; • the differencing mode to combine the periods of data acquisition, in particular "SingleDifference" in the case of front-scattering detectors, and "DoubleDifference" in the case of backscattering detectors; • the expected isotopes, and in particular their masses, in the sample or in the sample-pluscontainer system -noticing that the number and values of specified masses defines the number of NCPs to be fitted and the value of their peak centre; • any expected constraint between intensities of pairs of NCPs, whenever an empirical formula is known, so as to facilitate the fitting procedure.
Additionally, one can declare faulty detectors or additional optional parameters, as in the listed code below. At this stage it is possible to subtract experimental backgrounds to the data, merge nonsequential acquisition runs, or perform any needed operation on the TOF data prior to the correction and analysis directly related to NCS.

Appendix B. Sequential fit in TOF domain
A pre-fit of the NCS data is needed as both the γ-background (GB) and multiple-scattering (MS) corrections depend upon the NCP intensities, I M and widths, σ M . In a first step, one defines the number of expected profiles, together with their main properties. It is recommended to iterate over this procedure, so as to obtain precise values of the peak intensities after MS and GB corrections. In this script, the iteration procedure is iterated by the following piece of code: Prior to the fitting procedure, one needs to define the workspaces where the total fit and each fitted profile will be guested, and the profile functions to be fitted: The sequential fit the TOF domain can be performed using the VesuvioTOFFit algorithm: The relevant fitted parameters from the fit can be stored in a parameter workspace, including the spectrum number, and the values of the mass, width, and intensity for each fitted profile: In a second step, the relative intensities of each NCP are obtained by averaging the values saved in the parameter workspace, output of the previous fit: The total-scattering spectra obtained from the simulation can be scaled to the intensity of the experimental ones by multiplying for the ratio of integrated areas, and the same factor is used to scale the multiple scattering contribution, subsequently subtracted from the experimental spectra: