Neutron spectroscopy as a method for classical force-field parameterization: Past methods, present successes and future challenges

Classical molecular dynamics (MD) plays a central role in understanding structural and dynamical phenomena across all disciplines of physical chemistry. These models can be used to interpret experimental data, or as a method of study in their own right. Their legitimacy however rests solely on the accuracy of the underlying force-field, and so the parameterisation of these force-fields is the most crucial aspect of any study. The typical methods of parameterisation are structural or thermodynamic in nature, however this perspective article will examine a little used metric of parametersation; that of neutron spectroscopy, and in particular quasi-elastic neutron spectroscopy (QENS). QENS data contains self-correlation information for the hydrogen atoms of a system, over a wide range of distances and time-scales. These scales are relevant for local and global diffusion and rotation, thus pairing very well to the scales of molecular dynamics for organic systems. This article focuses in particular on the parameterisation of models of porous and surface catalysts. This area is a particularly rich field for the application of QENS, however there is a distinct lack of accurate classical force-fields currently.


Introduction
Classical MD has played a pivotal role in understanding chemical phenomena at the atomic scale ever since its widespread inception. Beginning with the first models of argon in the 60s [1], the scrutiny of force-fields against experimental results has been a major focus of study. These first studies compared pair correlation functions between experiment and theory in order to determine the structural accuracy of the new models, as well as comparing self diffusion coefficients as a measure of the accuracy of the dynamics. At this early stage the structural and dynamic agreement was very positive, reporting only a 15% deviation in the self-diffusion coefficient. Even in this very first study the focus was on a direct comparison with neutron scattering observables, with the iconic opening paragraph in the 1964 seminal paper by A. Rahman;'If neutron scattering data of unlimited accuracy and completeness was available, then the kind of work presented here would serve the useful though unexciting purpose of confirming the results already obtained with neutrons'. Classical force-fields by design omit certain degrees of freedom of the system. At any level of approximation, the removal of degrees of freedom from a model, whether it be the removal of electronic degrees of freedom, or the coarse-graining of molecular groups, introduces a change to the excess entropy of the system. This change in excess entropy has a well documented effect on the resultant dynamics of the system [2][3][4]. It is clear therefore that in order to reproduce the self-diffusion of a system accurately, a parameterisation with respect to experimental results for the self-correlation function is a suitable avenue of enquiry.
As a starting point it is worth examining, arguably the most well studied force-field of all; liquid water. Over the years water has accumulated a large set of different models, each with its own strengths and weaknesses [5]. In fact water acts as an ideal pedagogical case for the strengths and limitations of the force-fields for small molecules, as its large dipole moment, combined with its interesting steric geometry provide a complex range of Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. molecular coordinations, as well as some curious non-monatonic phase behaviours [6]. Many popular water  models are parameterised against thermodynamic properties, and in particular the experimental value for the  vaporisation enthalpy is a key target property, with TIP3P, TIP4P, TIP5P, TIP4P/2005 and SPC/E all  reproducing this (note: TIP4P/2005 and SPC/E require an ad hoc polarisation correction) [7,8]. These models also perform well against other thermodynamic and structural properties such as the room temperature liquid density and heat capacity at constant pressure. These widely used water models however have not been parameterised to reproduce the self-diffusion coefficient (D self ) explicitly. Nonetheless, the likes of SPC/E and TIP4P/2005 perform surprisingly well at room temperature, with values of 1.54 × 10 −9 m 2 s −1 and 1.27 × 10 −9 m 2 s −1 respectively, compared to an experimental value of 1.31 × 10 −9 m 2 s −1 . However, the likes of the TIP3P model, despite its many similarities gives a value of 3.72 × 10 −9 m 2 s −1 deviating experimentally by almost 300%. Of course it is not only desirable to reproduce D self at one given temperature, but instead the aim should be to reproduce it across a wide range of temperatures in the liquid state (i.e. reproduce the activation energy). To this end, these models do not perform quite as well, with SPC/E and TIP4P/2005 giving values of 15.4 kJmol −1 and 16.2 kJmol −1 respectively, compared to an experimental value of 18.4 kJmol −1 . Furthermore it is important to point out that the self-diffusion coefficient only measures the macroscopic diffusion in the limit of long time-scales, thus even if this coefficient is correct, the dynamics on shorter time-scales for a particular model may not be accurate. Looking beyond these simple models, there are for the most part very few models that have been directly parameterised for the interaction of small molecules with chemically relevant surfaces, or porous materials. The limited examples of surface studies, typically employ comparisons of MD or MC to that of neutron reflectometry [9][10][11]. Although MD models have been routinely compared to Neutron spectroscopy techniques for surface and confined/pore diffusion [12,13] there are a clear lack of examples of accurate parametrization to produce reliable models of these phenomena. Returning to the accuracy of the self diffusion coefficient for water models, it is worth pointing out that it isn't completely surprising that models which correctly predict structural and thermodynamic properties also provide a good proxy for the selfdiffusion coefficient. The study of the link between structural an dynamic properties is a long dated field, with key contributions from Rosenfeld [3] and Dzugatov [4]. In particular Rosenfeld showed that for simple liquids there exist sets of scaling relations, linking the dimensionless dynamical coefficients (X) to the excess entropy (s exc ) of the system; where, k B is the Boltzmann constant, and α and A are parameters dependent on the inter-atomic potential, but not on the thermodynamic state. Going one step further, the excess entropy of a system can be related to an integral relation involving the pair-correlations for an atomic system, or with the addition of an orientational correlation term for rigid molecular systems [2]. As a result, knowledge of the statistical average of the configuration of the system can in principle allow the prediction of the self-diffusion coefficient. In fact more generally there exist approximate equations relating the self diffusion coefficient to other collective dynamical properties, such as the Nernst-Einstein equation for electrical conductivity and the Stokes-Einstein equation for sheer viscosity [14]. The reproduction of the self-diffusion coefficient can therefore be loosely thought of as a proxy for dynamical properties of the system in general, and thus seems like a good choice as a figure of merit for fitting classical force-fields. Notwithstanding the modest successes of MD models in reproducing dynamical properties for small molecules, many of the cases of scientific interest in the modern era are based on much more complex multi component systems, within porous materials [13,15], or interacting with a substrate [16]. Neutron spectroscopy has been a very successful probe of the interaction of small organic molecules in porous materials or on surfaces. There has been particular success in the field of catalysis in zeolites [17,18] and on metallic surfaces [19], elucidating both the reaction mechanisms at the atomic scale, as well as the mechanism of diffusion to and from the catalytic site. The studies which extract the maximum amount of information content from a neutron experiment typically employ MD, thus accurate models for these systems are essential. Ab initio MD is an attractive approach, however the size of these more complex systems is typically in excess of 1000 atoms, and the required time-scale for comparison with QENS is that of nanoseconds. Although these approaches have been successfully applied for inelastic neutron spectroscopy [20,21], for QENS they are for the most part computationally unviable. There is therefore a clear gap to be filled in the form of accurate classical force-fields, capable of reproducing diffusion effects for these more complex materials.

MD and neutron scattering
Neutron scattering and molecular modelling have enjoyed a harmonious few decades, with the latter elucidating much of the structural atomistic details that would otherwise be lost in the vast information content of experimental data. In particular reverse monte carlo (RMC) schemes [22] employing Markov chain sampling Monte Carlo (MC) simulations have been used to generate 3D configurations which are consistent with the structure factor as measured from Neutron scattering. This method was used with great success for example in studying glassy networks in Germanium Selenium glasses [23], however this method does not provide forcefield parameters for an MD model. These concepts were expanded upon further with the introduction of empirical potential surface refinement (EPSR) [24,25], which differs from RMC, as an explicit set of interatomic potentials are defined and adapted to fit experimental data using MC iterations. The converged potentials accurately predict the experimental structure, but do not ensure reproduction of thermodynamic properties, as the potentials are degenerate by nature. This approach has also been shown to produce potentials that are qualitatively similar to that of existing models, like for example that of the SPC/E model [24]. However, without non-degenerate reliable force-field parameters, these methods can be of no use in trying to predict spectroscopic properties of the system.
With EPSR being such a mainstream success, the inevitable question becomes; could a spectroscopic equivalent be used for atomistic dynamics? Extending these concepts to the temporal domain has the large implication of having spatially and temporally correlated data, adding a significant level of complexity. This therefore warrants the question of whether such a task would be computationally tractable? When we speak of Neutron Spectroscopy, we can be talking about any one of a wide range of techniques, which span many orders of magnitude in energy, and therefore the time-scale of motion that is being measured.
At the very upper end of the energy scale (greater than 1 eV), we have the ballistic regime of deep inelastic neutron spectroscopy (DINS). DINS is a powerful technique which allows the measurement of the mass resolved momentum distributions of each atomic species [26]. This technique in particular is sensitive to zero point quantum energy effects, which can be particularly relevant for light elements such as hydrogen [27]. In fact this technique is well suited for the comparison with computational modelling techniques such as Path Integral Monte Carlo and forms of ab initio MD, but often falls short with comparisons to classical force-fields [28]. If we come down in energy somewhat, to the 1-500 meV range, we enter the regime of broad band inelastic neutron spectroscopy. Here we see a mix of intra-molecular and inter-molecular vibrations, methyl rotations, and phonons. This area of the spectrum elucidates much of the bonding properties of the system [29,30], but gives limited information on the delocalised motion of each atom. For these less localised motions, we must use the technique of Quasi-elastic Neutron Spectroscopy (QENS). QENS operates on the μeV-meV scales, which broadly translates to the pico-second time domain (this varies greatly instrument to instrument). QENS also typically probes length scales of a few angstroms, to a few nanometers. Here we find the most suitable length and time scales for direct comparison to MD, where trajectories of a few nanoseconds can provide the relevant temporal correlations, and box sizes of a few nanometers are easily obtainable. These length and time-scales are particularly relevant to the sizes of nanoscopic porous materials, whose unit cells are on the nanometer scale, and diffusive motion typically occurs on the ps time scale.
Before we embark on investigating the use of QENS as a means of parameterising classical force-fields, let us summarise some of the basic principles of QENS [31] for the reader in the hope of giving some context for how such data may be compared to MD observables. At a basic level QENS measurements provide two key variables associated with the exchange of kinetic energy between the incident neutron, and the sample under study. The first variable is that of the momentum change of the neutron, and this can be related directly to the reciprocal space vector Q giving distance information. The second variable is the energy change E of the neutron, which gives information on the energetic modes present in the sample. In fact, formally what we gain access to is the dynamic structure factor of the material; S(Q, E), with some weighting parameters which are dependent on the interactions of the neutron and each atom in the system. The details of this interaction can be reduced to a simple coefficient called the neutron scattering cross-section, which is dependent on the atomic species in question (note this is also dependent on isotope). Furthermore, the dynamic structure factor may be broken down into a coherent contribution, and an incoherent contribution, which provides information on cross-correlations and self-correlations, respectively. One of the chief benefits of Neutron Spectroscopy is the fact that the incoherent neutron cross-section of a proton is extremely large compared to almost any other atomic scattering crosssection. X-ray scattering techniques on the other hand rely on the interaction of the incident radiation with the electron density of each atom. As a consequence atomic species which are electron deficient, as is the case for most organic molecules, have very weak scattering. The scattering cross-sections for x-ray and Neutron scattering are summarised in figure 1. The consequence of the high proton incoherent cross-section for neutron spectroscopy is that organic molecules, which are typically proton rich, will give signals originating overwhelmingly from the self-correlated motions of the protons (i.e. those associated with self-diffusion). Furthermore as these protons are typically connected in a pseudo-rigid molecule, we effectively have a frame of reference for the self motion of the molecule as a whole. One of the axiomatic assumptions for the analysis of many QENS experiments, is that the signal originates completely from incoherent scattering (incoherent approximation), and thus coherent signals can be totally neglected. This allows the application of many simple analytically derived models for diffusive motion which can interpret experimental data without requiring computational models. In general a typical set of QENS results consists of spectra from a range of detectors which are positioned at different scattering angles. These different angles probe different momentum changes Q, and thus provide information on different length scales. The spectra themselves differ from that of more traditional forms of spectroscopy. Instead of many discrete peaks, representing different motions, QENS spectra are most commonly a single peak positioned at zero energy transfer (note methyl tunneling can present as an extra distinct peak), which has a different peak shape and width, dependent on the motions present. Each spectrometer will have an energy range which it can probe, that is determined by the incident spectrum of neutron kinetic energies, and the geometry of the flight path of the neutron to and from the instrument. The upper energy change is a hard limit based on the fastest neutrons that can be detected, and the lower limit is that of the minimum change in energy of the neutron that can be discerned (this is primarily dictated by our level of ignorance of the neutron flight path). This lower bound presents itself as an approximately gaussian peak shape centred at zero energy transfer (see figure 2, with a width corresponding to the energy resolution of our measurement. Any motion that is slower than the energy resolution of our instrument will fall within this resolution function and thus not be differentiable from any other static atomic species in the sample. Stochastic molecular motions such as diffusion and rotation which fall within the energy range of the instrument will present as Lorentzian signals convoluted with the aforementioned resolution function of the instrument. The width of these Lorenzian peak shapes (usually measured as the full width half maxima; Γ) as a function of Q are the primary set of variables which characterise the motions of the system.
The most basic functional form is typically considered to be that of Fickian diffusion and takes a basic linear form; Γ ∼ Q 2 . More complex models of diffusion include, jump diffusion on a 3D lattice, local diffusion within a restricted volume or diffusion in a single dimension to name but a few. These models are more complex and are often non-monotonic in Γ, and can be applied to some systems of key scientific interest, such as small organic jump diffusion from cage to cage in zeolitic systems. These simple models have had some success over the past  decades, however, increasingly for the more complex systems that are studied in the modern era these models are too simplistic to elucidate the complex forms of motion present. For this reason MD is a very powerful tool that can take us beyond these simple systems.
Although MD and QENS show strong compatibility for direct comparison, there are some minor considerations which need to be applied in order to provide a true like for like comparison [32]. Firstly, as the scattering in a QENS experiment is dictated by the scattering cross-section of the atomic species, we must apply a weighting factor to the MD data for each atom, in order to scale its contribution to the spectra. Additionally, as mentioned previously, motions from QENS spectra are a convolution of the pure Lorenzian signal and the resolution function of the instrument. As a result this convolution must be performed for the MD data before comparison. It is useful however for the reader to note that this convolution may be avoided if instead we use the intermediate scattering function as the function for comparison. In this case the convolution reduces to a computationally much cheaper multiplication [33][34][35][36]. Finally, we must be careful when simulating our system with MD, ensuring that the length and time-scales are such that they capture the same scales detectable by the QENS instrument being used. Simulations should therefore be of long enough duration to capture temporal correlations which are on a par with the minimum energy resolution of the instrument, but equally there is no need to sample the MD trajectory with a frequency which is greater than the upper energy limit of the QENS instrument. From a length scale perspective, we must ensure that the size of the simulation cell is large enough to capture the lowest Q of the QENS instrument. In fact software packages already exist for the comparison of MD to QENS, with the package MDANSE [37] even allowing the scripted direct conversion of data for a chosen instrument. With these considerations it should therefore be possible to generate a like for like comparison for the QENS and MD spectra, which in turn allows us to test the validity of the MD model. The real power of comparison of QENS data to MD is in the ability to break down the signal into individual components of the system and thus attributing different motions to different line shapes. These different components could be for example different molecular species in a mixture or molecules present in different cage architectures of a porous material. QENS therefore seems like a very attractive metric with which to parameterise classical force-fields, and indeed there have been recent attempts to this end. The first of these is the software package and scientific methodology, Molecular Dynamics Monte Carlo (MDMC) [38]. At its heart, MDMC is an MD engine wrapped within a Monte Carlo minimisation routine. An initial set of parameters are chosen for the force-field, and a short MD equilbiration occurs before a figure of merit is calculated (FOM: the deviation of the experimental and MD dynamic structure factor). The Metropolis criterion is then used to either accept or reject each new set of force-field parameters based on the FOM. This process then occurs until a converged set of force-field parameters are obtained. In principle the parameters for the force-field can be any parameter, such as the atomic radii (σ), strength of interaction(ò), bond/angular/dihedral constants (k), partial charges (Q) etc.
The MDMC algorithm's strength is in its generality, being able to in principle simultaneously fit every parameter of the system to the set of QENS data. This generality however comes at a great computational expense. As of yet, MDMC has only been tested for simple systems, such as liquid Ar and water [38]. With these simple systems it has seen great success, however the question of whether the algorithm is computationally tractable for more complex systems is still to be determined. In particular increasing complexity provides multiple sources of computational expense. The first simple issue is that of system size, thus increasing the computational demand in particular for the MD component of the algorithm. Secondly, more complicated models will in general have more complex potentials and bonding (i.e. costly coulombic and bending/dihederal interactions). But more important yet, increasing the amount of parameters increases the phase space dimensionality of the minimisation problem. The example of the liquid Ar is a monotonic system and only has two interaction parameters (σ and ò), thus having a 2D phase space. For a more complex multi-component system we both multiply the amount of self interatomic parameters, but also introduce cross-interaction terms which increase as (N s − 1)!, where N s is the number of different atomic species in the system. In addition we may now be introducing static charges, bonding terms, bending interactions and dihedral bending terms This parameter space very rapidly expands even by introducing limited complexity to the system. Thus when we arrive at a system that consists of multiple small organic species within a zeolitic framework, it is hard to imagine being able to use this general approach to converge to a set of force-field parameters. The higher the dimensionality of the parameter space too, the more likely one is to get stuck in a bottle neck of phase space, or local minima thus increasing the convergence time, or risking the optimisation procedure failing altogether. Instead the strength of this technique more likely lies in the parameterisation of single component liquid systems which although more complex than the previous test cases, should be computationally tractable, and provide a true test of the atomistic dynamics of these systems on the appropriate length scale of MD simulations. This could be particularly useful for hydrogen bonded systems, where the time-scales of the making and breaking of hydrogen bonds falls within the energy domain of the QENS technique.
It is clear then that the computational expense of methods such as MDMC for typical system complexities of interest in catalysis and gas absorption is likely to be too high in the immediate future. The question therefore arises, as to whether there is a more ad hoc interim approach which could be applied to produce reliable models for these important systems in the present. A recent study has been performed to this effect, which although relatively simple in nature, provides a proof of concept which may be applied to more complex systems. This study looked at benzene molecules absorbed on a nickel sponge catalyst, which was represented by an FCC 111 surface [19]. This system represents an important case in the catalysis of the upgrade of small aromatic molecules. This approach is targeted in nature, employing an existing OPLS force-field for the benzene-benzene interactions, but instead focusing on the Ni-Benzene interactions as the free parameters of the optimisation process, which are identified as being of primary importance for the determination of the surface hopping of benzene across the Ni surface. The method used a rigid Ni FCC 111 surface, as the minimal motions of the Ni combined with the large pore size of the Ni sponge made this a valid approximation, and eliminated yet another interaction parameter to make the problem more computationally viable. Unlike the MDMC method, this study employed a simple method of steepest decent approach to evolve the free parameters of the model. Relatively short simulations were performed, before calculating the FOM, which was used to define the magnitude and sign of the change in each parameter. To speed up the convergence, parameters similar to that of an existing organic-metal interactions were chosen as the initial trial values. This method provided fast convergence, obtaining a full convergence within 10 iterations (a final result in a matter of days). As mentioned earlier, a desirable property of any force-field is transferability across a wide temperature range. This model showed a strong transferability to lower temperatures, but began to deviate when approaching the freezing transition, showing the limitations of such an approach in the vicinity of first order phase changes. The resulting force-field from this study was then used to interrogate more intimate details of the structure and dynamics, including out of plane motions, molecular stacking, and hoping dynamics. This study shows a clear yet somewhat ad hoc strategy in attacking computationally complex systems which can be summarised in the flowchart shown in figure 3.
To highlight the steps in this method, let us consider the example of a small organic molecule in a zeolitic framework. The key steps begin firstly by identifying the components of the system with which there are existing well performing force-fields. For this example many small organic molecules already have been well studied and parameterised, and are often even valid in the context of nanoscopic pores. Next it is worth questioning whether certain areas of the system can be simplified through either removing flexibility, or some form of coarse graining. In the current case, the zeolite is a clear candidate for removing the flexibility of the system and thus reducing the computational cost for the MD portions. This may not be the case for more flexible systems such as metal organic frameworks, where the cage breathing may play an important part in determining the flow of molecules from one cage to another. The next step is to determine which interactions are likely to be the most relevant in affecting the dynamics of the system. At this point it is crucial to consider the relative strength of the interaction of each molecular unit. If for example if the organic molecule had a strong hydrogen bonding group, then the interaction with acidic sites in the zeolite could be identified as being of key importance and the dominant interaction which could hinder diffusion. As a result the free parameters of the system could be chosen to be those between these acidic sites and the hydrogen bonding units in the organic molecule. This is the most important step and has to be carefully decided on a case by case basis by applying chemical intuition as to the dominant interactions. These too are typically the interactions for which force-fields are either poorly parameterised or non existent as focus is usually for bulk liquid systems. The final consideration is that of system size. The sampling of multiple MD segments required during the minimisation procedure is costly, and cutting down the computational cost by using a small system could be key in making this a computationally tractable problem. Nonetheless care must be taken that system sizes do not become small enough to introduce artifacts that are common in small coulombic systems, where correlated structure and dynamics can produce artificially enhanced effects. Other considerations could be applied on a case by case basis. For example if the system involves small molecules confined within pores, and thus are not exhibiting macroscopic diffusion, then it may be possible to sample much shorter time-scale MD segments, as the low energy range of the spectra will not be relevant for such systems. The same could be true of a system acting as a static rotor, or bound to an interaction site. Despite the ad hoc nature of this approach, the benefits are substantial, as parameterisation of the aforementioned cross interactions can elucidate more nuanced motions, such as jump motions on a surface or between interaction sites, or pores in porous materials. These motions ultimately can be of key importance in dictating the diffusion pathways, and or blocking of such pathways as well as the time-scales for recycling of a catalytic site.

Conclusions and outlook
Neutron scattering has had a rich and harmonious history with molecular modelling, and is the mainstream choice for much of the structural analysis of small angle scattering data. The area of neutron spectroscopy in recent years has been following this example, but with the significant extra problems that the temporal domain adds, has been slower in the uptake of such methods. MDMC provides a clear and general framework to provide dynamically accurate classical force-fields for liquid systems composed of small molecules, however its current applicability to larger more complex systems such as those studied in catalysis and gas absorption is in question. For these more complex systems, a more ad hoc approach could be the solution, using existing insight into the nature of the chemical interactions in these systems. By assessing which interactions are of most relevance in effecting the local dynamics, as well as neglecting motions which will have little effect, the information content of the QENS spectra can be targeted on the relevant force-field parameters. The added value that can be achieved through the application of such approaches is clear, providing true atomistic insight beyond that achievable with the traditional analytical models that have historically been used to describe motions in porous and surface materials. Future works should focus on parameterisation of cross interaction between active sites and bonding groups in organic molecules, to probe whether the details of these interactions present in QENS data can be coded in simple classical force-fields.