Evaluation of Force-Field Calculations of Lattice Energies on a Large Public Dataset, Assessment of Pharmaceutical Relevance and Comparison to Density Functional Theory.

Crystal lattice energy is a key property affecting the ease of processing pharmaceutical materials during manufacturing, as well as product performance. We present an extensive comparison of 324 force-field protocols for calculating the lattice energies of single component, organic molecular crystals (further restricted to Z' less than or equal to one), corresponding to a wide variety of force-fields (DREIDING, Universal, CVFF, PCFF, COMPASS, COMPASSII), optimization routines and other variations which could be implemented as part of an automated workflow using the industry standard Materials Studio software. All calculations were validated using a large new dataset (SUB-BIG), which we make publicly available. This dataset comprises public domain sublimation data, from which estimated experimental lattice energies were derived, linked to 235 molecular crystals. Analysis of pharmaceutical relevance was performed according to two distinct methods based upon (A) public and (B) proprietary data. These identified overlapping subsets of SUB-BIG comprising (A) 172 and (B) 63 crystals, of putative pharmaceutical relevance, respectively. We recommend a protocol based on the COMPASSII force-field for lattice energy calculations of general organic or pharmaceutically relevant molecular crystals. This protocol was the most highly ranked prior to subsetting and was either the top ranking or amongst the top 15 protocols (top 5%) following subsetting of the dataset according to putative pharmaceutical relevance. Further analysis identified scenarios where the lattice energies calculated using the recommended force-field protocol should either be disregarded (values greater than or equal to zero and/or the messages generated by the automated workflow indicate extraneous atoms were added to the unit cell) or treated cautiously (values less than or equal to -249 kJ/mol), as they are likely to be inaccurate. Application of the recommended force-field protocol, coupled with these heuristic filtering criteria, achieved an RMSE around 17 kJ/mol (MAD around 11 kJ/mol, Spearman's rank correlation coefficient of 0.88) across all 226 SUB-BIG structures retained after removing calculation failures and applying the filtering criteria. Across these 226 structures, the estimated experimental lattice energies ranged from -60 to -269 kJ/mol, with a standard deviation around 29 kJ/mol. The performance of the recommended protocol on pharmaceutically relevant crystals could be somewhat reduced, with an RMSE around 20 kJ/mol (MAD around 13 kJ/mol, Spearman's rank correlation coefficient of 0.76) obtained on 62 structures retained following filtering according to pharmaceutical relevance method B, for which the distribution of experimental values was similar. For a diverse set of 17 SUB-BIG entries, deemed pharmaceutically relevant according to method B, this recommended force-field protocol was compared to dispersion corrected density functional theory (DFT) calculations (PBE+TS). These calculations suggest that the recommended force-field protocol (RMSE around 15 kJ/mol) outperforms PBE+TS (RMSE around 37 kJ/mol), although it may not outperform more sophisticated DFT protocols and future studies should investigate this. Finally, further work is required to compare our recommended protocol to other lattice energy calculation protocols reported in the literature, as comparisons based upon previously reported smaller datasets indicated this protocol was outperformed by a number of other methods. The SUB-BIG dataset provides a basis for these future studies and could support protocol refinement.


Abstract
Crystal lattice energy is a key property affecting the ease of processing pharmaceutical materials during manufacturing, as well as product performance. We present an extensive comparison of 324 force-field protocols for calculating the lattice energies of single component, organic molecular crystals (further restricted to Z' less than or equal to one), corresponding to a wide variety of force-fields (DREIDING, Universal, CVFF, PCFF, COMPASS, COMPASSII), optimization routines and other variations which could be implemented as part of an automated workflow using the industry standard Materials Studio software. All calculations were validated using a large new dataset (SUB-BIG), which we make publicly available. This dataset comprises public domain sublimation data, from which estimated experimental lattice energies were derived, linked to 235 molecular crystals. Analysis of pharmaceutical relevance was performed according to two distinct methods based upon (A) public and (B) proprietary data. These identified overlapping subsets of SUB-BIG comprising (A) 172 and (B) 63 crystals, of putative pharmaceutical relevance, respectively. We recommend a protocol based on the COMPASSII force-field for lattice energy calculations of general organic or pharmaceutically relevant molecular crystals. This protocol was the most highly ranked prior to subsetting and was either the top ranking or amongst the top 15 protocols (top 5%) following subsetting of the dataset according to putative pharmaceutical relevance. Further analysis identified scenarios where the lattice energies calculated using the recommended force-field protocol should either be disregarded (values greater than or equal to zero and/or the messages generated by the automated workflow indicate extraneous atoms were added to the unit cell) or treated cautiously (values less than or equal to -249 kJ/mol), as they are likely to be inaccurate. Application of the recommended forcefield protocol, coupled with these heuristic filtering criteria, achieved an RMSE around 17 kJ/mol Introduction Being able to anticipate the different solid forms of drug candidates and, based upon knowledge of the solid form, determine their associated properties is crucial for assessing their developability. 1 In particular, being able to accurately predict the lattice energy, a measure of solid state cohesion defined more precisely below, serves a number of useful purposes in the context of pharmaceutical development.
The lattice energy may be related to the solid state contribution to the Gibbs free energy of solution and hence, in turn, related to thermodynamic solubility. [2][3][4] Whilst incorporating lattice energies calculated from experimental crystal structures may not necessarily produce more accurate quantitative structure-property relationship (QSPR) models of solubility than those based on molecular descriptors alone, 5,6 QSPR models based upon descriptors which can be explicitly related to the solid state or non-solid state contribution to solubility may provide greater physical insight. 2,3 In turn, this may support rational design of new active pharmaceutical ingredients (APIs) or solid forms with more desirable solubility. Control of pharmaceutical solubility affects the dissolution rate and bioavailability of the final drug product, 4 as well as supporting the design of manufacturing unit operations, such as cooling crystallization. 7 Lattice energy calculations also support crystal structure predictions, especially if they are accurate enough to distinguish between polymorphs, which supports polymorph screening and, hence, assessments of thermodynamic stability. [8][9][10] Lattice energy calculations may also be used as an intermediate step for other analyses. Firstly, these calculations may be decomposed in terms of molecular synthons and support the design of molecular crystals and crystallization processes. 9,[11][12][13][14] Secondly, it is conceivable that crystal structure based lattice energy calculations could support assessment of other factors related to pharmaceutical manufacturing, such as mechanical properties and powder behavior during processing. 15 Using accurate lattice energy calculations as an intermediate step for these other analyses assumes that they correspond to accurate calculations of the constituent contributions, rather than being accurate due to cancellation of errors.
The lattice energy may be defined as the energy change upon bringing together static, infinitely separated molecules (or, equivalently, molecules in an ideal gas), in their lowest energy conformation, to form a static lattice for an ideal crystal. 8,16 The lattice energy is a contribution to the sublimation enthalpy, which can be obtained experimentally. 17 Under certain assumptions, notably assuming the sublimed crystal to be an ideal gas, the relationship between sublimation enthalpy and the lattice energy can be expressed as per equation (1). 17 (In equation (1), denotes the sublimation enthalpy, the lattice energy, the change in vibrational energy going from the solid state to the gas phase, is the molar gas constant, is the temperature in Kelvin, and = 3.5 for a linear molecule and x = 4 for a non-linear molecule.) However, if further assumptions are made regarding the solid state and gas phase vibrational states, equation (1) may be further simplified, as per equation (2). 2 Since this equation implies all solids have the same, temperature independent, heat capacity, this is clearly a crude approximation. (1) In principle, if accurate values for were available, the experimental lattice energy could be derived from the experimental sublimation enthalpy via equation (1). Alternatively, if accurate solid and gas phase heat capacity values as a function of temperature were availablealong with solid phase change enthalpies where applicable, allowing for the sublimation enthalpy to be corrected to zero Kelvin, 18,19 and the zero-point 17,19 was either negligible or known, the experimental lattice energy could also be derived using equation (1). However, in the absence of this information, it is typically necessary to derive a more approximate estimate of experimental lattice energy using equation (2).
Hence, sublimation data may be used to evaluate lattice energy calculations, e.g. via comparing them to experimentally estimated lattice energies. 2,[20][21][22] In general, physics based methods are employed for calculating lattice energies from crystal structures. These methods include those which calculate energies using molecular force-fields, ab initio approaches, such as density functional theory (DFT), and hybrid approaches. 23 Molecular force-fields 24 calculate intramolecular energetic contributions in terms of parameterized terms reflecting the deviation of bond lengths and angles from local energy minima and intermolecular energetic contributions in terms of parameterized terms reflecting different kinds of contributions to intermolecular forces, with functional forms based upon knowledge of the relevant physics. 20,[25][26][27] For example, the contribution to the intermolecular energy arising from the charge distribution of the isolated molecule might be approximated as the pairwise summation of atomic partial charges interacting via Coulomb's law. 20 The parameters assigned are contingent upon the "atom types" involved in the intermolecular or intramolecular interactions, where each "atom type" represents a given (set of) elements in a defined chemical environment. 20 For example, the force-field parameters might include different atomic partial charges, for calculating Coulombic interactions, for different atom types. 20 Sublimation data may be used to help parameterize / optimize force-field approaches for calculating lattice energies from crystal structures. 26,27 However, the parameters used in these force-fields may be determined using a variety of calculated and experimental data and the parameters in existing force-fields may be refined using additional data. 20,28 In general, DFT calculations 24,29,30 (3). 22 In equation (3), denotes the energy per mole of unit cell, the total number of molecules in the unit cell and the energy per mole of isolated gas phase molecule.
These energetic terms may be computed using molecular force-fields or DFT calculations. In all cases, static structures are assumed. If it can further be assumed that the experimental crystal structure represents the lowest energy conformation of the solid state and that the lowest energy gas phase geometry is approximately the same as the crystal structure molecular geometry, 32 a force-field calculation of lattice energy can be obtained via only computing the intermolecular contributions to . These assumptions are made in the HABIT program 33 and its extensions (HABIT95, 34 HABIT98 and Visual HABIT), 14 as well as by other force-field protocols for calculating lattice energies. 26 However, even if changes in molecular structure in the gas phase are ignored, different lattice energy calculation protocols may also differ in the extent to which the input crystal structure geometry is optimized / relaxed to a local energy minimum with respect to the potential energy surface estimated from the calculation. 11,20 Whilst the lattice energy is defined with respect to the energy minimized lattice structure according to the true potential energy surface, it must be remembered that the calculations only provide an approximation of this potential energy surface. It is possible that the true energy minimized structure lies closer to the experimentally observed structure, allowing for some deviation due to thermal expansion and experimental error, i.e. complete relaxation of the crystal structure geometry may not yield the most accurate lattice energy calculation.
Hence, as well as the means of calculating energetic contributions (e.g. one out of many possible DFT methodologies or one out of many molecular force-fields), physics based calculations of lattice energies may also differ with respect to how the crystal and molecular structures are processed. Other variations, such as the choice of atomic partial charges to be assigned for forcefield calculations, are also possible.
The primary goal of the present work was to thoroughly evaluate a range of possible protocols for computing the lattice energies of single component molecular crystals, using computationally inexpensive force-fields, which could be implemented via automated workflows using the industry standard Materials Studio software, so as to develop a recommended workflow for automated lattice energy calculations that was appropriate for general organic and pharmaceutical materials. 35 To ensure a thorough evaluation was carried out, not only were a variety of possible force-fields selected, but a range of options, such as the optimization protocol, were also evaluated. To ensure a robust assessment of the performance of each protocol was performed, these protocols were evaluated using a new, large dataset, comprising molecular crystal structures linked to experimental lattice energy estimates derived from sublimation enthalpy data. We make this dataset publicly available for future evaluations of lattice energy calculation protocols. At the time we submitted this work for publication, this was the largest reported dataset for evaluations of lattice energy calculations from crystal structures and it remains the largest dataset used for evaluation of the methods for lattice energy calculation considered herein. Whilst an even larger dataset was reported by Gavezzotti and Chickos whilst we were revising our work for publication, 36 they evaluated different lattice energy calculation routines to the industry standard methods investigated here.
Since we were interested in the most suitable protocol for calculating the lattice energies of pharmaceutically relevant crystals, analyses were repeated on subsets of the data putatively deemed to be more pharmaceutically relevant, according to different criteria. We further develop heuristic recommendations as to when lattice energies calculated using the best force-field protocol should be treated with caution. Subsequently, a comparison was made between the best force-field protocol and a modestly sized, chemically diverse subset of the new dataset, of putative pharmaceutical relevance, and a dispersion corrected DFT lattice energy protocol which was implemented within the publicly available CASTEP software 37 when these studies were performed. Finally, we evaluated the best force-field protocol identified in our work using small datasets previously proposed in the literature for evaluating calculations of lattice energies / sublimation enthalpies. These evaluations highlight the need for further comparative assessments of the different approaches, which would be supported by the large dataset we have made available.

Methods and Data
To assist the reader, a glossary of technical terms is provided in Supporting Information Table   S1.

SUB-BIG Dataset
A dataset comprising 235 crystals, documented using their Cambridge Structural Database (CSD) refcodes, 38 linked to 434 experimentally derived sublimation enthalpies, along with the corresponding temperature, was curated. Each crystal corresponded to a unique chemical, i.e. none of the crystals was a polymorph or redetermination of any other dataset entry. 38 As is further discussed below, in the context of data quality, only a few sublimation enthalpies were confirmed to correspond to the same polymorph as the linked CSD refcode. At the time this work was submitted for publication, the SUB-BIG dataset was, to the best of our knowledge, larger than all previously reported datasets integrating experimental sublimation enthalpies (or experimentally estimated lattice energies) and crystal structures. 2,9,17,[19][20][21][22][25][26][27][28]31,[43][44][45][46][47][48][49][50][51][52] (Larger datasets were previously reported for molecular QSPR studies of sublimation enthalpies, but those datasets were not integrated with crystal structures. 53,54 ) However, in the course of revising our manuscript for publication, Gavezzotti and Chickos reported an even larger dataset than SUB-BIG. 36 An expanded description of SUB-BIG is provided in Supporting Information section A.0.

Data Quality Assignments for SUB-BIG Dataset
Data quality labels were assigned to the sublimation enthalpies, at 298 Kelvin, for the SUB-BIG  38 Moreover, previous studies indicate this would typically introduce a small degree of uncertainty into the experimentally estimated lattice energies for the corresponding crystal structures. Recent computational analysis suggests that most pairwise differences in lattice energies between polymorphs are typically less than 2 kJ/mol and are greater than 7.2 kJ/mol in only 5% of cases. 56 Earlier analysis of experimental data also suggested sublimation enthalpies of polymorphs typically differ by only a few percent.) 26 The quality labels assigned based upon the curated sublimation enthalpy and linked crystal structure were then combined to obtain an overall quality ranking (1 -8) for the linked sublimation enthalpy at 298 Kelvin, with lower values indicating higher confidence in the assignment of that value to the crystal structure and the crystallographic data.
Regarding the quality labels assigned based upon the curated sublimation enthalpy, data were deemed of higher quality if the experimental error, by which we mean any quantitative uncertainty estimate, and experimental method were both available and curated and if the experimental error did not exceed 4.9 kJ/mol. (It has been suggested that typical errors in experimental sublimation enthalpies are of the order of 4.9 kJ/mol. 22 However, as noted by Červinka et al., 19 different approaches to estimating sublimation enthalpy are associated with different quantitative estimates of uncertainty. Moreover, different studies may estimate these uncertainties in different ways.) 19 If either the experimental method or error were not curated, or comments from the cited references suggested the datapoint might be considered unreliable, the datapoint was judged of uncertain quality, with the absence of a curated experimental error deemed to make the datapoint less reliable. If the experimental error exceeded 4.9 kJ/mol, the datapoint was deemed of lower quality.
Regarding the quality labels assigned to the linked crystal structures, these took account of whether the assignment was uncertain due to known polymorphism, the size of the R-factor and whether the crystal was missing hydrogens or contained disordered atoms, albeit the automated preparation of crystal structures performed in the current work was designed to take fix the latter problems and no crystal structures with other kinds of documented disorder were included in the dataset.
A full description of how these data quality labels and the final data quality ranks were assigned is provided in Supporting Information section A.1.

Derivation of Experimental Lattice Energy Estimates for SUB-BIG Dataset
The workflow described in Supporting Information section A.2 was applied to resolve all sublimation enthalpies values reported at 298 Kelvin, or corrected to 298 Kelvin (see above), associated with a given CSD refcode in SUB-BIG into a single (average) sublimation enthalpy value at 298 Kelvin. In brief, lower quality datapoints (see above) and statistical outliers were excluded prior to averaging. This average sublimation enthalpy was then converted into an approximate, given the necessary crude assumptions, 2,17 experimental estimate of lattice energy, using equation (2). The single data quality rank (see above) of the averaged sublimation enthalpies was assigned to the experimental estimate of lattice energy.

Force-Field Calculations of Lattice Energy
Whilst 324 different protocols were evaluated, corresponding to variations in the details of each step, all protocols corresponded to the generic workflow in Figure 2. A variety of different options were explored: the choice of force-field (DREIDING, 59 Universal, 60 CVFF, 61 PCFF, 62 COMPASS, 20 COMPASSII); 28 the optimization routine for the crystal structure (if any) and extracted gas phase molecule; charges assigned; van der Waals cut-offs, Ewald summation accuracy and convergence criteria ("quality"). 63,64 Full details are given in Supporting Information section A.4. N.B. We note that some of these force-fields, such as COMPASSII, may be iteratively refined between different versions of Materials Studio, although earlier iterations can also be accessed, as explained in the documentation. 35 In the current work, we employed Materials Studio version 17.1.0.48 for all force-field calculations.
Our automated workflow for applying these protocols was only applicable to single component molecular crystals, further restricted to Z' <= 1. 40 We note that this does not reflect intrinsic limitations of Materials Studio, 35 but reflects the manner in which we implemented the calculations. 57 We further note that we only investigated local optimization of the gas phase geometry. Whilst this limits the computational overhead and is a reasonable approximation in many cases, since the crystal conformer is typically energetically similar to the gas phase global minimum, lattice energy is defined with respect to the global minimum 8 and this may sometimes be 25 kJ/mol lower than the crystal conformer. 32 Figure 2. A generic workflow for the 324 force-field protocols evaluated using the SUB-BIG dataset and various subsets thereof. The variations giving rise to 324 protocols are described in Supporting Information section A.4. Importantly, due to the manner in which they were implemented as a script 57 for Materials Studio, 35 these protocols were not applicable to structures with Z' > 1 40 or more than one molecular component, e.g. co-crystals or salts.

Statistical Assessment of Lattice Energy Calculations
The predictive performance of the lattice energy calculations was assessed in terms of the root mean-squared error (RMSE), the coefficient of determination (R 2 ), mean absolute deviation (MAD), Pearson's correlation coefficient (r) and Spearman's rank correlation coefficient ( ). 31,65 In addition, the 90% confidence intervals for the distributions of signed and unsigned errors obtained with the force-field protocols, for different scenarios, are provided in the Supporting Information (see below). For all force-field comparisons on the same sets of data, the protocols were ranked in order of increasing RMSE, which is equivalent to ranking in order of decreasing R 2 . 65 See Supporting Information section A.10 for further details.

Principles for Assessing Pharmaceutical Relevance
Two different approaches (A and B) were investigated for assessing the pharmaceutical relevance of entries in the SUB-BIG dataset. Importantly, for both approaches, pharmaceutical relevance was assessed with respect to whether the lattice energystructure relationships observed within the subset of SUB-BIG classified as pharmaceutically relevant were expected to reflect the relationships observed for active pharmaceutical ingredients (APIs) or API candidates in drug development. (By lattice energystructure relationships, we mean the molecular characteristics contributing to lattice energies and the manner in which they contribute. These could be different for pharmaceuticals as compared to general organic chemicals, e.g. due to the prevalence of particular molecular sub-structures or other changes which affect the kinds of intermolecular synthons formed and the strength of their contributions towards the lattice energy.) Any other considerations, such as reactivity and biological activity, 66 were not deemed relevant, as the aim here was to ensure that the evaluation of lattice energy calculation routines would be relevant for the kinds of chemicals being assessed during drug development.

Pharmaceutical Relevance Approach A
A Support Vector Machine (SVM) binary classification model, 67

Pharmaceutical Relevance Approach B
SUB-BIG entries were deemed to be pharmaceutically irrelevant if their values for molecular weight and the nConf20 flexibility descriptor 75 were outside the typical range of values for a representative subset of the GlaxoSmithKline (GSK) in-house crystal structure database: molecular weight = 212 -586, nConf20 = 0 -37. The analysis which led to the selection of these descriptors and a complete explanation of method B are provided in Supporting Information sections A.6.5. and A.6.7. respectively.

Assessment of the Suitability of Pharmaceutical Relevance Approaches
Analyses, which are fully described in Supporting Information section A.6, provided partial evidence in support of different structurelattice energy relationships for chemicals predicted to be drug-like vs. non-drug-like by method A. They more clearly indicated different structure -lattice energy relationships when the available data were partitioned using the median value for the descriptors employed for method B, albeit different thresholds for these descriptors were employed to filter SUB-BIG entries.

Density Functional Theory Calculations of Lattice Energy and Vibrational Energy
The DFT lattice energy calculations are summarized in  Information section A.6.7., molecular weights, density and whether they contained hydrogen bonds, according to the CSD Python API (entry.crystal.hbonds()). 58 Full details are provided in the Supporting Information ("SI_Electronic_Files\SUB-BIG_Dataset\DFT-Subset").

Results and Discussion
The results of all force-field and DFT lattice energy calculations are provided in the Supporting Information: "SI_Electronic_Files". The key findings obtained from analysis of those predictions are presented below.

Force-Field Protocol
It should be noted that various calculations failed, either because Materials Studio was unable to process the structurefor CSD refcode HEXDEC03or because the lattice energy calculation workflow failed at some point, for some of the force-field protocols. In the latter case, different protocols failed for different structures. (The one Boron containing structure -CSD refcode TPHBOR01could only be parsed when the Universal 60 or Dreiding force-fields 59 were employed. This may reflect the fact that these force-fields were designed to offer broad coverage of chemical space. The fact that the COMPASSII, COMPASS, PCFF and CVFF calculations failed reflects the more specific atom type definitions employed by these force-fields. When trying to run calculations using each of these force-fields for the TPHBOR01 structure, following the bond calculation step in the Materials Studio GUI, Materials Studio reported that a forcefield type could not be assigned. We note that the COMPASSII force-field does, at least, include an atom type for tetravalent Boron. 20,28 However, the Boron atom in TPHBOR01 is trivalent.) Since different protocols failed for different structures, only the subset of 210 SUB-BIG entries for which none of the calculations failed was used to compute performance statistics when comparing all 324 protocols.
The predictive performance statistics, across these 210 structures, for the five top ranking (lowest RMSE values) force-field protocols are presented in Table 1. Regarding the application of the calculation quality setting, we found that, across these 210 structures, the ultra-fine setting consistently increased the average calculation time, albeit typically by less than one second. However, it only improved the RMSE 101/162 times and all performance statistics were only consistently improved 69/162 times. In the case of the top ranking protocol from Table 1, the corresponding protocol employing the medium calculation quality setting was worse in terms of all performance statistics across these 210 structures, albeit with an RMSE increase of only 1.12 kJ/mol and a corresponding increase in average computational time of less than one second. These findings reflect the tighter convergence limits for these calculations (see Table S5), as illustrated for some examples in Supporting Information section C.14. When only the SUB-BIG entries considered pharmaceutically relevant according to method A are considered (see Figure S1), the protocol with the lowest RMSE value obtained without filtering SUB-BIG (top row of Table 1) remains the top ranking protocol. When only the SUB-BIG entries considered pharmaceutically relevant according to method B are considered (see Figure S2), that protocol is ranked 14 th out of 324. The precise ranking, according to RMSE, of this protocol (see Figures S3 -S5) is further changed upon restricting consideration to those data judged of highest quality according to the heuristic data quality rankings assigned to every pair of crystal structure and estimated experimental lattice energy in SUB-BIG. (Since this appreciably reduces the size of the relevant subsetssee Figure S6, the rankings may not be as robust. The different rankings observed when filtering according to data quality may also, in some cases, partially reflect changes in coverage of chemical space as well. Similar molecular weight distributionssee Figure S6 but appreciably different nConf20 75 - Figure S7 -    Table 1.

Recommended Force-Field Protocol
Out of the force-field protocols evaluated in this work, we recommend the top ranking protocol (lowest RMSE) as assessed upon all SUB-BIG entries for which every force-field protocol was able to calculate a lattice energy (  Tables S8-S9). Moreover, the rankings obtained using the larger set of data, i.e. without filtering SUB-BIG, can be expected to be more robust.
This recommended protocol corresponds to the following options: COMPASSII force-field, 28 with force-field assigned charges, partial relaxation of the crystal structure geometry (including the positions and internal co-ordinates of the molecules, but not the unit cell parameters), local optimization of the gas phase geometry and the ultra-fine quality setting for Materials Studio 35 calculations (see Figure 6).
However, it is not necessarily the case that the partial optimization of the crystal structure carried out by this protocol makes this most suitable for studies of intermolecular synthons or crystallization processes. [11][12][13][14] Indeed, Bernardes and Joseph previously found, based upon consideration of different force-field protocols, that the most suitable protocol for accurate lattice energy calculations was not most suitable for evaluating structural characteristics. 25 It is possible that, at least in part, this protocol gave the best empirical results, prior to subsetting the dataset, due to cancellation of errors involved in each step. Nonetheless, we found that allowing the unit cell parameters to relaxi.e. full local optimizationnot only typically resulted in the calculated lattice energies becoming more negative and worse performance in terms of most statistical measures ( Table 1), but could lead to significant distortion of the unit cell in some cases (Supporting Information section C.13). Hence, the constrained relaxation allowed for by the recommended protocol appears to be better for calculating lattice energies and may provide more realistic crystal structures for subsequent analysis.
However, crystal structure prediction 8,80 cannot rely upon fixed, experimental lattice parameters.
Hence, the observation that the optimized structures obtained when the recommended protocol was modified to allow complete relaxation of the crystal structure, including the lattice parameters, could differ significantly to the experimental structures in some cases (Supporting Information section C.13) may mean the force-field settings are not suitable for crystal structure prediction.
However, this question is beyond the scope of the present study and requires further investigation.  Table 2 were computed on a larger set of 226 structures than the 210 structures used to compare different protocols, as per Conversely, if a model consistently produced residuals of a given magnitude, i.e. if RMSE was fixed, R 2 would be reduced if the data were distributed over a narrower range. 65 Nonetheless, in spite of the possibility that some of the variation in these statistics might be affected by artefacts, consideration of all statistics suggests that the performance of the recommended protocol is either comparable or somewhat, but not dramatically, worse when moving to subsets of higher putative pharmaceutical relevance, including the DFT-Subset (a diverse subset of entries deemed pharmaceutically relevant according to method B). All experimentally estimated and calculated lattice energies, using the recommended protocol coupled with the heuristic filtering criteria, for all SUB-BIG entries are shown graphically in Figure 7.
Additional statistics are presented in Supporting Information Table S11, following filtering of these subsets to only retain the highest quality datapoints (data quality ranking = 1), according to the heuristic data quality labels. Curiously, although all performance statistics improve when filtering the applicable SUB-BIG entries and the subset deemed pharmaceutically relevant according to method A, the performance statistics are often worse when filtering the subset deemed pharmaceutically relevant according to method B or the DFT-Subset, a subset of those deemed pharmaceutically relevant according to method B, according to the heuristic quality labels. However, the statistics calculated on these smaller subsets are arguably less robust and possibly reflect, in part, differences in coverage of chemical space as well as data quality (see Figure S7).
Finally, the distributions of the signed and absolute error distributions, when applying the recommended force-field protocol coupled with the heuristic filtering criteria, are summarized in Figures S18 -S25. These indicate that, depending upon the subset of the data considered, the absolute prediction error has at least a 95% chance of not exceeding 36.6 kJ/mol. a. Entries for which the recommended force-field protocol failed or which did not pass the heuristic filtering criteria were removed. The filtering criteria removed any entries for which the force-field protocol calculated lattice energies which were positive, less than or equal to -249 kJ/mol and/or parsing of the messages generated by the Materials Studio Perl script indicated the automated structure preparation workflow had failed. b. All force-field calculations refer to the recommended protocol: COMPASSII force-field, with force-field assigned charges, partial relaxation of the crystal structure geometry (including the positions and internal co-ordinates of the molecules, but not the unit cell parameters), local optimization of the gas phase geometry and the ultra-fine quality setting for Materials Studio. neither. Of the significant outliers (absolute error >50 kJ/mol), labelled according to their CSD refcode, only NOZWII has significant experimental uncertainty (data quality rank = 8).

Comparison of Force-Field and DFT Lattice Energy Calculations
As can be seen from Table 2, the recommended force-field protocol clearly performs much better than the DFT protocol, in terms of all statistical measures. (As can be seen from Table S11, this remains the case when the DFT-Subset is filtered to only retain the highest quality datapoints, according to the heuristic data quality labels.) The comparative performance of both protocols is graphically illustrated in Figure 8. It is apparent that the DFT protocol tends to significantly overestimate the magnitude of the lattice energy, a bias which is reflected in its RMSE value (37.14 kJmol) being substantially larger than the standard deviation of the estimated experimental lattice energies for the DFT-Subset (21.57 kJ/mol), i.e. the model yields worse absolute predictions than a model which merely reproduced the average experimental value. Still, the positive correlation coefficients are better than would be expected for this so-called null model. 2 Finally, we note that the dispersion corrected DFT protocol was also outperformed by a variant of the recommended force-field in which the unit cell parameters were also relaxed (Supporting Information Table S12). This protocol is more directly comparable to the DFT protocol, since both protocols employed full local optimization of the unit cell geometry and gas phase molecule.
Our finding, that the dispersion corrected DFT protocol employed herein (PBE+TS), 77,78 was substantially worse at calculating lattice energies than the recommended force-field protocol, prompted recent enhancements to the CASTEP DFT code employed for this work. Specifically, this work resulted in the development of CASTEP's dispersion functionality to include an efficient many body dispersion functionality, 81 Grimme D3. 82 However, the newly developed CASTEP code was not available to us for the DFT calculations reported in the current work. The identity line is shown for both plots. All entries were deemed pharmaceutically relevant according to method B and the results for those entries which were also deemed pharmaceutically relevant according to method A are denoted using yellow circles. Only with DFT predictions are significant outliers (absolute error >50 kJ/mol) observed. Of these outliers, labelled according to their CSD refcode, only UGIYEP has significant experimental uncertainty (data quality rank = 7).
The molecular structures of these compounds can be seen in Figure 4.

Significance of Vibrational Contributions Calculated Using DFT
As can be seen in Supporting Information Section C.5 (Table S10) 25 and original COMPASS II validation set (41 crystals). 28 We computed performance statistics on subsets of those datasets for which our recommended force-field protocol calculated lattice energies which passed the heuristic filtering criteria. These statistics were based upon experimental estimates of lattice energy reported within or which could be estimated from those earlier studies. Where possible, we also computed performance statistics for these subsets based upon the calculated lattice energies, obtained with a variety of computational methods, which were reported within or could be estimated from those earlier studies. Full details are presented in Supporting Information section C.10, with the performance statistics summarized in Table S13.
These results clearly show that our recommended force-field protocol clearly gives worse performance, in terms of all statistics, than the best dispersion corrected DFT methods reported for the applicable subset of the X23 benchmark. 22 (However, these calculations have a much higher computational overhead than our recommended force-field protocol.) It is also outperformed, in terms of most statistics, by the OPLS-AA force-field calculations for the applicable subset of the dataset of Bernardes et al. 25 However, its performance on the applicable subsets of the original COMPASS II validation set and the SUB-48 dataset is typically competitive, in terms of most statistics, to DMACRYS 2 and the COMPASS II sublimation enthalpy simulations of Sun et al. 28 (converted to estimated lattice energies herein) respectively.
Moreover, these comparisons are based upon much smaller datasets than the SUB-BIG dataset provided in our work and used to select our recommended force-field protocol (compare the numbers in Table 2 and Table S13). In addition, the subsets of these various datasets, for which the methods were compared, correspond to different distributions of key descriptors, which are expected to be related to different lattice energystructure energy relationships, as shown in We also note that a significantly higher R 2 (0.97) than that obtained on the SUB-BIG dataset using our recommended force-field protocol (0.64 on the 226 entries left after applying the heuristic filtering criteria) was reported more than two decades ago by Charlton et al., 51 for comparisons of force-field lattice energy calculations using the HABIT 33 program without any structure optimization and atomic charges from semi-empirical quantum calculations. Whilst we were unable to derive their dataset to perform a direct comparison, we note that their dataset (62 crystal structures) was considerably smaller than the SUB-BIG dataset (235 crystal structures) employed for the current study.
Further work is required to more confidently assess the relative merits of these different lattice energy calculation protocols, via comparison using a larger dataset than in earlier studies. The SUB-BIG dataset reported herein, along with the putative pharmaceutical relevance classifications, could support these further studies, as could the even larger dataset reported, in the course of revising this work for publication, by Gavezzotti and Chickos. 36 Indeed, integration of these datasets, especially if further supplemented with data clearly linked to multiple polymorphs of the same molecules, 25 could yield an enhanced dataset for future investigations.
Finally, we note that our study was concerned with evaluating calculations of lattice energies from crystal structures, focusing on force-field protocols which could be readily implemented using the industry standard Materials Studio software. Hence, models for lattice energy, or the related sublimation enthalpy, based purely on molecular structures were considered out of scope. In principle, calculations based upon crystal structures offer the ability to rank polymorph stability and support crystal structure prediction, 8  However, whether these statistics are directly comparable in light of possible differences in data distributions 2,65 or whether the apparent performance would be comparable in terms of other statistics ( Table 2) is an open question. It would be particularly interesting to see whether some inexpensive calculations based upon molecular structure could offer comparable predictive performance to some crystal structure calculations, with reduced computational overhead.

Computational Overhead
The recommended force-field protocol takes approximately two seconds, on average, per crystal structure (based on the arithmetic mean timings for all DFT-Subset entries returned by the Perl time() function). In contrast, the dispersion corrected DFT lattice energy calculation took more than a day per crystal structure, on average, and the subsequent vibrational energy calculations took more than three days, on average, per crystal structure. (N.B. The different resources used to perform these calculations are documented in the Supporting Information (Section B).) Hence, also taking into account the reasonable correlation with experimental estimates of lattice energies (e.g. see Table 2), our recommended force-field protocol is suitable for screening in-house industrial databases of the order of hundreds (or maybe thousands) of crystal structures, as well as studying lattice energy structure relationships in larger datasets of tens of thousands of crystal structures. Indeed, recent work within industry 85 and ongoing work within the ADDoPT project 86 has investigated lattice energy structure relationships based upon COMPASSII force-field calculations for hundreds and tens of thousands of crystal structures respectively.

Scope for Improving the Recommended Force-Field Protocol
The high Spearman's rank correlation coefficient (0.88) obtained with the recommended protocol, coupled with the heuristic filtering criteria, across all applicable SUB-BIG dataset entries suggests this computationally inexpensive protocol is useful for screening large numbers of crystal structures according to their relative lattice energies. However, the corresponding error estimates (RMSE around 17 kJ/mol) indicate that this method cannot be used to estimate relative stabilities of polymorphs. 56,83 Estimating the phase diagram requires Gibbs free energy calculations and very high-level estimates of the lattice energy, as recently demonstrated for methanol. 83 There are several possible avenues which could be explored in future studies to improve the lattice energies calculated using the recommended force-field protocol: (1)  experimental sublimation may also correspond to the formation of gas phase clusters, rather than isolated molecules. 19 However, it is arguable that this issue requires correction of the sublimation data, 19 rather than the calculation, given the definition of lattice energy with respect to infinitely separated gas phase molecules.) 8 Nonetheless, the poor correlation observed between absolute prediction errors and the conformational flexibility descriptor nConf20 75 (Supporting Information, Figure S10), suggests the failure to globally optimize the gas phase molecule was not a significant limiting factor here in most cases.
Regarding the need for improved description of the intermolecular and/or intramolecular interactions, it should be noted that, whilst the functional form of the underlying COMPASSII force-field is documented along with the parameterization routine, full details of the parameters were not reported in the accompanying publication. 28 Nonetheless, since this force-field employs a simple Lennard−Jones, pairwise additive, description of dispersion, 20,28 we speculate that the incorporation of more sophisticated dispersion contributions, as recommended elsewhere, 31 could improve the calculations. We further speculate that the inclusion of atomic partial charges using quantum chemical methods might also yield better predictions.
Regarding processing of the crystal structures, it should be noted that our implementation of the force-field workflows assessed in this work is not capable of handling Z' > 1 structures. 40 It is also clear that the structure pre-processing (Figure 1, Figure 6) could cause significant errors in some cases (see Supporting Information section C.7.), albeit the heuristic filtering criteria should serve to identify when these errors occurred. Improvements to this automated workflow could avoid chemically incorrect representations of a few structures leading to serious errors in the calculated lattice energies.
Finally, recent work has found that Machine Learning may be used, with some success, to predict the difference between force-field and DFT calculations of lattice energies from knowledge of different crystal structures of the same molecule. 88 Similar approaches might be suitable to learn the error in the force-field calculations of lattice energy presented here. The experimental lattice energy estimates, coupled with the corresponding force-field calculated values, for the SUB-BIG dataset, which we make available in the Supporting Information ("SI_Electronic_Files"), provide a useful starting point for exploring this in future studies.

Conclusions
To the best of our knowledge, we report the most extensive evaluation of different force-field We determined that our recommended protocol outperformed the dispersion corrected DFT protocol investigated herein (PBE+TS), based on a diverse subset of SUB-BIG (17 molecular crystals) of putative pharmaceutical relevance (method B). This work prompted further development of the CASTEP DFT code, employed for this work, to improve its dispersion functionality.
Comparison of the predictive performance of our recommended force-field protocol, coupled with the heuristic filtering criteria, to the performance of other lattice energy calculation protocols reported in the literature for much smaller datasets suggested some other calculation routines could provide more reliable lattice energy estimates. However, some of these calculation routines have a significantly higher computational overhead, hence are less suitable for obtaining quick estimates of lattice energy or screening large databases. Moreover, further work is required to assess whether the relative performance of the methods would remain unchanged when assessed on larger datasets. In addition, there is scope for improving our recommended forcefield protocol, building upon the results presented in our current work. The SUB-BIG dataset we make available here would provide a basis for future studies aimed at more robust comparison to other methods as well as improvements to our recommended force-field protocol.

Supporting Information
Supporting Information Available: 1. Additional details regarding the methods and data used to generate our results (section A), detailed instructions on how to reproduce our results using the code (Perl scripts, Python scripts, Jupyter Notebooks) which is made available on Zenodo 57