SULISO: The Bath suite of vibrational characterization and isotope effect calculation software

Isotope effects are subtle but powerful probes of chemical reaction mechanisms and environmental conditions, with applications across chemical, biological and earth sciences. Their meaningful interpretation often relies on calculations based upon fundamental theories for their origin. The SULISO suite consists of four programs for the calculation of vibrational frequencies and isotope effects. CAMVIB isabroadvibrationalcharacterizationcodedevelopedforanalysisofcalculatedharmonicfrequenciesand of normal modes in terms of internal coordinates. LIPFR calculates isotopic partition function ratios for pairs of isotopically substituted whole molecules, corresponding to conventional methodology, whereas UJISO is designed to perform similar calculations on subsets of atoms from very large systems. CUTOFF is a utility which truncates a force-constant matrix for a large system to obtain a smaller matrix appropriate for a specified subset of atoms. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


a b s t r a c t
Isotope effects are subtle but powerful probes of chemical reaction mechanisms and environmental conditions, with applications across chemical, biological and earth sciences. Their meaningful interpretation often relies on calculations based upon fundamental theories for their origin. The SULISO suite consists of four programs for the calculation of vibrational frequencies and isotope effects. CAMVIB is a broad vibrational characterization code developed for analysis of calculated harmonic frequencies and of normal modes in terms of internal coordinates. LIPFR calculates isotopic partition function ratios for pairs of isotopically substituted whole molecules, corresponding to conventional methodology, whereas UJISO is designed to perform similar calculations on subsets of atoms from very large systems. CUTOFF is a utility which truncates a force-constant matrix for a large system to obtain a smaller matrix appropriate for a specified subset of atoms.

Introduction
Isotope effects (IEs) on rate or equilibrium constants are exquisitely subtle but yet extraordinarily powerful experimental probes of reaction mechanisms, environmental conditions and * Corresponding author. structural changes throughout chemistry geochemistry and biochemistry [1]. IEs are employed in studies of chemical reactions and equilibria in a wide range of contexts in both gaseous and condensed-phases including enzyme-catalyzed processes. Substitution of one isotope of an element with another isotope (of the same element but with different mass) in general causes small but significant variations in the properties of a molecule or a material which are manifest in ratios of rate constants or equilibrium constants that differ from unity for reactions or equilibria involving the isotopic variants. It is vital to have robust computational protocols in place for their reliable calculation and interpretation. Existing programs for IE calculations were designed for isolated molecules and are inappropriate for subsets of atoms embedded in large supramolecular systems, such as commonly arise in quantummechanical/molecular-mechanical (QM/MM) simulations of condensed matter, including enzyme-catalyzed reactions and chemical reactions in solution [2]. The SULISO suite 1 comprises four programs to facilitate IE calculations based upon Cartesian force-constant (Hessian) matrices computed externally by means of electronic-structure (e.g. Gaussian09 [3]) or molecularsimulation (e.g. fDynamo [4]) codes. CAMVIB is an extensive vibrational calculation program, with numerous features for characterization and correlation of frequencies and normal modes. LIPFR calculates isotopic partition function ratios for pairs of isotopically substituted whole molecules, corresponding to conventional methodology, whereas UJISO is designed to perform similar calculations on subsets of atoms from very large systems. CUTOFF is a utility which truncates a force-constant matrix for a large system to obtain a smaller matrix appropriate for a specified subset of atoms; this is helpful in the determination of the minimal but sufficient size of Hessian for subsets of supramolecular systems. UJISO and LIPFR both act as end-of-line codes, taking input from CAMVIB and/or CUTOFF to calculate isotope effects from the vibrational frequency data.

Problems and background
The traditional theory of IEs based on molecular partition functions and the Bigeleisen equation [1] assume that each molecular system corresponds to a zero-gradient stationary point on a potential energy surface. Reliable optimized geometries and Hessians are readily available from modern quantum-chemical packages employing second-derivative methods, provided that convergence thresholds are set suitably tightly. However, since the turn of the millennium, it has become common to perform simulations for very large molecular systems (either molecules in solution or within enzyme active sites) but to compute explicit Hessians for only a subset of the total number of atoms in the system. For example, relaxation of a specified subset of atoms to a local minimum (or saddle point) may be performed within a frozen environment of the remaining atoms. In the context of QM/MM methods, the mobile subset and the frozen environment may be the same as the QM and MM regions, although different selections may also be made. In these circumstances the N s subset atoms do not in themselves constitute a stationary structure in which vibrational degrees of freedom are separable from translations and rotation. The constraining influence of the environment means that diagonalization of the mass-weighted 3N s × 3N s Hessian in Cartesian coordinates generally yields 3N s non-zero eigenvalues which include six corresponding to libration of the whole subset with respect to its environment: ''translation'' and ''rotation'' of the subset as a whole are not free or separable motions but are coupled with the internal vibrational degrees of freedom.
It appears that other programs for IE calculations (e.g. QUIVER [5] and ISOEFF98 [6]) employ the Bigeleisen equation for 3N s − 6 internal degrees of freedom even for atomic subsets embedded within a supramolecular environment, and this has been achieved in published work either by projecting out the six external degrees of freedom or by simply ignoring the six lowest frequencies.
In principle these procedures are both invalid and unnecessary, and should be replaced by methodology that uses all 3N s frequencies of 1 In the Roman era the city of Bath was known as Aquae Sulis.
the embedded subset, as described elsewhere [7]. Moreover, other programs tend to be organized in order to provide an equilibrium IE or a kinetic IE for a particular reaction involving single structures for each of the initial and final states of the system. This approach is valid for gas-phase processes but arguably is not very suitable for reactions in condensed phases. We have argued elsewhere that it is essential to determine the isotopic sensitivity of a large flexible supramolecular system by averaging over thermally accessible conformations [8]. This may be done by consideration of the (''reduced'') isotopic partition function ratio [1] (IPFR), f, for each conformation of each system. The average IE is obtained as the quotient of average IPFRs for the initial state (IS) and final state (FS).
where each individual value of f involves a product of partition functions for 3N s degrees of freedom (or 3N s − 1 for a first-order saddle point together with a one-dimensional tunneling factor along the direction of the transition vector, within the range of validity of the harmonic approximation with respect to the de Broglie wavelength). Our programs LIPFR and UJISO are designed to compute IPFR values for a simple structure but for as many different isotopic substitutions as are desired; the results for many different conformations are collated and averaged by means of a spreadsheet.

Software architecture
The four programs are written in FORTRAN and output from CAMVIB and CUTOFF is used as input for LIPFR and UJISO, respectively. Each code is split into individual subroutines which carry out individual part of the calculation explicitly, the different functionalities being called by a series of keywords. (See Fig. 1.)

Software functionalities
CAMVIB functions with three main aims. Firstly, normal coordinates and vibrational frequencies can be calculated for any number of atoms up to a limit set by a parameter MAXNAT in the included file CAM_SIZE. Secondly, the code allows for transformation of Cartesian force constants to internal or symmetry force constants, which can all individually be scaled and manipulated. CAMVIB can also compute compliance constants and relaxed force constants. All of these options are requested by a series of keywords within the program.
CUTOFF performs the function of truncating a 3N × 3N Hessian to a smaller matrix representing the atomic positions specified in the input file. This outputs a new coordinate system and Hessian for input to UJISO.
UJISO and LIPFR calculate isotope effects with input originating from either program CUTOFF or CAMVIB, respectively. UJISO is particularly tailored to post-CUTOFF use to calculated IEs from subset Hessians, while LIPFR acts upon full systems. Multiple isotopes can be substituted and IPFRs calculated for each structure and Hessian The calculations can also be carried out at a number of temperatures. In order to confirm the integrity of the IPFR calculations, the Teller-Redlich product rule [1] is employed: the independently calculated ''masses and moments of inertia'' and ''vibrational product'' terms should be exactly equal.

• Implementation details
Each of the programs within the suite are called through standard UNIX implementation: ./camvib.exe < input.vib > output.vibout The convention with our codes is to use the following input extensions: Our programs function with any file extension chosen by the user; these are simply suggestions in order to provide meaningful reference to files in the future.

Illustrative examples
The following examples relate to a methyl cation surrounded by water molecules. A Gaussian09 geometry optimization and frequencies calculation generated a ''punch'' output file containing Cartesian coordinates and Hessian elements, which is edited by  adding keywords, title and specification of internal coordinates, to yield a CAMVIB input file (Fig. 2).
The keywords on line 2 specify the following:

GAUS
Cartesian coordinates and Hessian are supplied in GAUSSIAN punch-file format. REDU A set of internal (valence) coordinates including redundancies is supplied; local symmetry coordinates are to be constructed from specific combinations of valence coordinates for bending.

IFCS
Force constants in internal (valence) coordinates are to be computed and printed. VECT Normal coordinates are to be printed. FULL Verbose output. DISO A fort.12 file is generated, containing information to be used subsequent by the other programs, including elements of a ''cleaned-up'' Hessian from spurious components of translation and rotation have been removed.
All keywords and options are described fully in the CAMVIB manual: https://github.com/pbw20/SULISO_manuals. The CAMVIB output for this structure, with a planar methyl moiety sandwiched midway between two equidistant water molecules positioned on either side on the local 3-fold symmetry axis, and with water molecule making a linear interaction with each of the three CH bonds, reveals it to be a transition state (TS) for S N 2 methyl transfer, possessing an imaginary frequency for displacement in the normal mode corresponding to Walden inversion.
Next, an input file for CUTOFF (Fig. 3) is prepared from the CAMVIB fort.12 output by insertion of an asterisk (*) immediately after the element symbol for each atom to be retained by the cutoff procedure. In this example the first four atoms, comprising the methyl moiety, are kept while all others are removed: their Cartesian coordinates and their corresponding rows and columns of the Hessian are deleted.
The output from CUTOFF is in turn used to prepare the input for UJISO (Fig. 4) by adding some information at the head of the file and a specification of isotopologue(s) at the tail.
The UJISO output (Fig. 5) shows 12 vibrational frequencies. The first frequency (corresponding to out-of-plane bending of the planar methyl moiety) is printed as −253.7355, but is actually imaginary; it is identified as the transition frequency for which a quantum correction to the IPFR is evaluated. The excitational (EX) and zero-point-energy (ZP) contributions to the IPFR, which appears as Q(heavy)/Q(light), is evaluated over all of the 11 real frequencies.
The structure of the input file for LIPFR is essentially the same as for UJISO. (Full details may be found in the manual.) This program is used to compute IPFRs for a whole molecule, not a subset of a larger system: in this example we consider the methyl cation in isolation. Since its translational and rotational degrees of freedom are considered explicitly, along with the internal vibrational degrees of freedom, in general information in one line of the tail of the file differs from that for UJISO. Specifically, the line 1 0  where the initial zero denotes that the structure of the isolated methyl cation is not a TS, the ''6'' refers to the number of external degrees of freedom (corresponding to translational and rotational modes with essentially zero-valued frequencies), followed by the six ordinal numbers specifying which frequencies are to be ignored in the vibrational part of the IPFR calculation. The LIPFR output (Fig. 6) shows 6 essentially zero frequencies, corresponding to translational and rotational modes, which are excluded from the IPFR calculation. The translational and rotational contributions to the IPFR are evaluated from the atomic masses and the principal moments of inertia, and together make up the MMI factor. The vibrational components (EXC and ZPE) of the IPFR (printed as Q2/Q1) are evaluated over the 6 real vibrational frequencies. The two terms in the Teller-Redlich product rule are exactly equal, confirming that the geometry and the Hessian are consistent with each other.
An individual IPFR value (f indiv ≡ Q heavy /Q light or Q n /Q 1 ) and its factors for a single conformation of a system, from either UJISO or LIPFR, may be transferred to a spreadsheet or other program in order to evaluate other properties.
• The IPFR f state for a state of the system may be obtained by averaging over an ensemble of N thermally accessible conformations. If the individual structures were generated as snapshots at regular intervals from an equilibrated molecular dynamics trajectory, they should already represent a Boltzmann distribution and so a simple arithmetic mean provides the appropriate average IPFR.
f state = (1/N)Σf indiv (2) • Either an equilibrium or a kinetic IE may be obtained as a ratio of IPFRs for initial and final structures or states.
Equilibrium IE between individual structures for reactant (R) and product (P) = f R /f P .

Conclusions
The SULISO suite of programs for vibrational characterization and isotope effect calculation has been described. In a time where IEs are increasingly used in both molecular and supramolecular research, it is vital to have robust computational protocols for their calculation, based on reliable methodologies. We have developed these programs and have demonstrated their utility in several applications [9]. We recommend their use by others.