Background

Ionic liquids (ILs) form an interesting group of chemical compounds. They can be built of various types of cations (imidazolium, pyridinium, pyrrolidinium, piperidinium, ammonium etc.) and anions (halides, bis[(trifluoromethyl)sulfonyl]imide, tetrafluoroborate, hexafluorophosphate etc.) [13]. The unlimited possibility of modification of particular ion in IL makes it possible to obtain the properties suitable for the specific needs. ILs have got several characteristic properties: they stay in liquid state at wide range of temperature; they have got an insignificant vapor pressure and melting point less than 100 °C; what is more, they are good solvents for various compounds [1, 3, 4]. Those properties as well as the possibility of structure’s changing indicate ILs’ application in many areas; examples are shown at Fig. 1 [414].

Fig. 1
figure 1

Ionic liquids applications

Such a wide spectrum of ILs’ potential applications, particularly as solvents, requires determination of their physicochemical properties, which are important from the technological point of view. Those properties are: viscosity, density, solubility in water, melting point etc. Experimental measurements of physicochemical properties for such a large group of chemicals are expensive and time-consuming. In this case, computational techniques could be applied to fill in the lack of the experimental data. One of the widely used computational methods is Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) approach [1519]. QSPR/QSAR is nowadays extensively used for various purposes, e.g. predicting toxicity against different species [20, 21], identification of potentially hazardous metabolites [22, 23] and determination of the chemicals’ mobility in the environment [24]. Building QSPR models requires calculating molecular descriptors, which are “formal mathematical representations of a molecule” [24]. That step (in case of 3D and 4D descriptors- obtained from three-dimensional molecular structure) involves optimization of the structure’s geometry (finding structure’s global energetic minimum). Geometry optimization may be performed at different levels of the theory like ab initio methods (e.g. Hartree–Fock), density functional theory (DFT) and semi-empirical methods (e.g. PM6, PM7) [25, 26]. First two are based only on the theoretical assumptions. In contrary, semi-empirical methods take into account not only quantum mechanics theory, but they also used approximations (parameters) which are fitted to empirical data, especially molecular energies and geometries. Main advantage of semi-empirical techniques is short time of calculations, however they are consider as less accurate [27].

According to our best knowledge, any recommendations of the choice of the optimization method of ionic liquids have not been published yet. There are already published QSPR/QSAR models for ILs developed on descriptors calculated from molecular geometries optimized from semi-empirical (AM1 [28], PM3 [29]), ab initio (RHF/6-31G**) [30] and DFT methods (B3LYP/6-31G*) [31]. Since calculating the descriptors with different methods may result in additional uncertainty between QSPR models, in our work we focused on examining, what is the influence of the geometry optimization method on the QSPR model. In order to determine the scale of uncertainties originating from the optimization method selection, we conducted a systematic comparison of (I) three different sets of molecular descriptors, obtained by calculations based on differently optimized structures and (II) three QSPR models built on the basis of those descriptors’ sets. Main goal of the presented paper was to find an answer for following questions: (I) Which basis set should be used to optimize structure geometry in quantum-mechanics methods? (II) Is there any significant difference between the models, where the descriptors values have been obtained based on the structures optimized with different methods (PM7, HF and B3LYP)? (III) Is there any optimal approach that should be recommended to obtain reliable QSPR models for predicting ILs’ properties?

Materials and methods

Molecular models of the ions present in the studied ILs were optimized separately by three various methods (PM7, HF/6-311+G* and B3LYP/6-311+G*). Then, we performed a series of the statistical Wilcoxon’s tests in order to examine the influence of the optimization method on the descriptors’ values. Based on the results of the statistical analysis, we were able to determinate, whether there were significant differences in the calculated descriptor values.

Subsequently, we developed three QSPR models for predicting density of ILs utilizing the same descriptors calculated from the molecular geometries optimized by PM7, HF and B3LYP methods. This was performed for investigating the influence of the optimization method on the prediction ability of the models. The models were developed based on the same experimental data set, according to the following scheme: (1) splitting experimental data into training and validation sets, (2) calculating molecular descriptors, (3) selecting the optimal, physically interpretable, combination of the descriptors and then training the QSPR model, (4) external validation of the model, (5) providing the physical interpretation of the model.

Experimental data

Density (ρ) values for 66 ionic liquids were collected from NIST Ionic Liquids Database—IL Thermo [32]. Density was measured by the vibrating tube method at 298.15 K for all ILs. The studied ILs consisted of six various types of cations (imidazolium, ammonium, phosphonium, pyridinium, pyrrolidinium, piperidinium) and 27 anions (for more information please refer to Table A in Supplementary Information). Schematic representation of the cations’ structures is presented on Fig. 2.

Fig. 2
figure 2

Schematic representation of cationic structures: a ammonium, b phosphonium, c imidazolium, d pyridinium, e pyrrolidinium, f piperidinium

Geometry optimization and calculation of descriptors

In the next step, we created molecular models of all cations and anions constituting the studied ILs. This stage was performed with the Gauss View [33] software. Molecular geometries of the cations and anions (separately) were then optimized at three levels of the theory. Geometry optimization at the semi-empirical PM7 level was performed with the MOPAC 2012 software [34]. Hartree–Fock method was chosen in the study, as the basic ab initio algorithm. There are different functionals widely used in various types of DFT calculations. Since the hybrid functional B3LYP (Becke 3 term with Lee, Yang, Parr exchange) is the most commonly used one in QSPR studies [3538], we decided to employ this functional in our calculations. In order to perform and both ab initio and DFT calculations an adequate basis set must be used. As such, we decided to compare different sizes of Pople-style, contracted basis sets, including those of double and triple zeta types augmented with polarization and diffuse functions. We took into account: the energy of the structure (after optimization), the computing time, and the difference in atomic distances and angles between the computationally optimized structures and the experimental (crystallographic) data. The comparison was conducted for the anion most frequently present in the studied set of ILs: bis[(trifluoromethyl)sulfonyl]imide [NTf2], and for the cation of medium structural complexity: 1-ethyl-3-methylimidazolium [EMIM] (for more information please refer to Table B in Supplementary Information). Based on the comparison results, we decided to use the triple zeta basis set augmented with one set of diffuse and polarization functions on heavy atoms (6-311+G*). Both, ab initio (Hartree–Fock) and DFT (B3LYP) calculations were performed with the identical basis set with Gaussian 09 software [39]. Afterwards, we calculated molecular descriptors from the geometries optimized by the three mentioned methods. The descriptors were calculated with Dragon (version 6.0) software [40].

Model development

First, the collected experimental data were sorted according to the increasing values of ILs’ density (the modeled variable). Then, we split the data into the training and validation set using so-called “Z:1 algorithm”, in which every Zth compound is assigned to the validation set, whereas the remaining ones form the training set [41]. In effect of applying the splitting procedure (here Z = 3), we obtained the training set containing 45 ionic liquids (68 % of all) and the validation set containing 21 compounds (32 %). A table summarizing the collected data can be found in Supplementary Information.

The search for the optimal descriptor combination was carried out in two steps: firstly, by selecting 3D descriptors significantly correlated with IL density (r > 0.60); secondly, by applying the genetic algorithm, implemented in the QSARINS software [42, 43]. The following control parameters of the genetic algorithm were applied: population size: 20, and mutation rate: 45 %. The algorithm was used only for the descriptors calculated via the Hartree–Fock optimization method. The ab initio Hartree–Fock method constitutes a middle-ground of sorts—between DFT B3LYP and the semi-empirical PM7 method. We employed multiple linear regression (MLR) as the method of modeling.

Validation process

To ensure reliability of our model we followed the OECD (ang. Organization for Economic Cooperation and Development) recommendations for developing QSPR models proposed in 2004 [44]. According to those principles we properly defined endpoint; ensured transparency in the model algorithm; defined domain of applicability (AD); calculated measures of goodness-of–fit, robustness and predictivity and presented a mechanistic interpretation.

Main purpose of defining applicability domain (AD) is to point out eventual limitations of the developed model. Reliability of the predicted values depends on the structural similarity between the chemicals from the prediction and training sets [45]. There are several methods used to define limits of AD [46, 47]. In our work AD was investigated with the Williams plot (a plot of the standardized residuals vs. the leverage values, hi). The leverage values (hi) present similarity of particular compounds to the training set and can be calculated from the following Eq. (1):

$$h_{{\mathbf{i}}} = {\mathbf{x}}_{{\mathbf{i}}}^{{\text{T}}} ({\mathbf{X}}^{{\text{T}}} {\mathbf{X}})^{{ - 1}} {\mathbf{x}}_{{\mathbf{i}}}$$
(1)

where x i is the vector of descriptors calculated for the considered ith compound and X is the matrix of descriptors calculated for all compounds from the training set.

Applicability domain is then determined by the two critical values: three standard deviation units of the standardized residuals (±3σ) and the threshold leverage value (h*). The value of h* is calculated as h* = 3p′/n, where p’ is the number of model’s variables plus one, and n is the number of compounds in the training set. The predictions for compounds with hi > h* are treated as the results of extrapolation, so they are less reliable [4850].

Furthermore, leverage approach was compared with the standardization method proposed by Roy [51]. In that approach, compounds with features very dissimilar to the rest in the training set are called “X-outliers”. Compounds from validation set which are not similar to any of the training set are considered as points outside the applicability domain. We were able to identify the X-outliers compounds and those that are outside the AD in our dataset by using application named “Applicability domain using standardization approach”.

Goodness-of–fit of our model was measured by the determination coefficient R2 (Table 1) and the root mean squared error of calibration (RMSEC). To verify robustness and predictivity of the model we performed internal (leave-one-out method, LOO) and external validation. Robustness (\({\text{Q}}_{\text{CV}}^{2}\), RMSECV) and prediction (\({\text{R}}_{\text{EXT}}^{2}\), RMSEEXT) parameters were calculated accordingly to the equations given in Table 1 [49, 52]. In addition, we calculated the concordance correlation coefficient (CCC) [53] to measure model’s precision and accuracy and different variants of \({\text{r}}_{\text{m}}^{2}\) to indicate the external predictive capacity of a model [54]. We also estimated the presence of influential points in the training set by performing F-test proposed by Toth et al. [55], where F value is calculated by equation: \({\text{F}} = \left( {1 - {\text{Q}}_{\text{CV}}^{2} } \right)/(1 - {\text{R}}^{2} )\).

Table 1 Quality measures for QSAR models

Results and discussion

Optimization method versus descriptor values

To obtain deeper insight into the optimization results we tested the influence of the three methods (PM7, HF/6-311+G* and B3LYP/6-311+G*) on the values of molecular descriptors. From the entire set of ILs presented in this paper, we selected 29 unique cations and 27 anions (Table C in Supplementary Information). Then, we chose only groups of 3D descriptors (ones that might be affected by the molecule’s 3D structure) from the entire set of the calculated molecular descriptors. There were: Geometrical Descriptors, Radial Density Function descriptors (RDF), 3D-MOlecule Representation of Structures based on Electron diffraction (3D-MoRSE), Weighted Holistic Invariant Molecular descriptors (WHIM), GEometry, Topology and Atom-Weights AssemblY descriptors (GETAWAY), and Randic Molecular Profiles. Next, we divided them into smaller sub-groups according to their weighting scheme or descriptors type (for example Molecular Randic Molecular Profiles or Shape Randic Molecular Profiles—see Figure 1S in Supplementary Information). At the end, we performed a series of the statistical Wilcoxon’s tests (at 5 % level of confidence), comparing the descriptors of each cation and each anion to the descriptors of their analogs from the sub-set optimized with different method. For each ion we performed a number of tests equal to the number of descriptors’ sub-groups.

According to the obtained results (Figure 1S in Supplementary Information), it can be noticed that there are groups of descriptors especially sensitive on the structure optimization method. In cases of both cation and anion, Geometrical Descriptors have significantly different values, dependently on the optimization method applied (panels A–F). RDF descriptors are also sensitive on the optimization method, but it is only the case for cations (panels A–C, sub-groups 2–6). 3D-MoRSE descriptors constitute the class being the least affected by the different optimization methods (with only few exceptions from this trend). Weighted WHIM descriptors exhibited significantly different descriptors values for anions, but only when comparing the structures optimized with HF method to the other two methods. This suggests that similar values of the descriptors are obtained for the structures of anions optimized with PM7 and B3LYP methods. The comparison between the values of weighted WHIM descriptors gave slightly different result for cations. In this case, the values of the descriptors for cations optimized with HF method and PM7 are more similar. The similarity between the WHIM descriptors values for cations optimized with HF and B3LYP methods is much smaller. WHIM total descriptors have different values in case of every optimization method for both cations and anions (panels A–F, sub-group 17). In case of GATEWAY descriptors, there are are significant differences in descriptors values for every optimization method for both cations and anions. In the pool of GETAWAY descriptors, least differences can were noticed for cations optimized with use of HF and B3LYP methods (Panel C, sub-groups 18–22). Additionally, sub-classes of unweighted GATEWAYs and GATEWAYs weighted by ionization potential are rather similar for both cations and anions, independently of compared optimization methods—Panels A–F, sub-groups 18 and 22) Group of autocorrelation GATEWAY’s seems to be more sensitive for the molecule’s optimization method (panels A–F, sub-groups 23–27). Finally, Randic Molecular Profiles came out to be very sensitive for molecules optimization method both in case of cations and anions (panels A–F, sub-groups 28–29).

To sum up, our analysis proved, that the optimization method might significantly affect the descriptor’s values of most classes of 3D descriptors. In order to verify, what is the real influence of the optimization method selection for QSPR modeling, we performed further analyzes.

Optimization method versus predictive ability

As mentioned in Model development section, relationship between the structure of ionic liquids and the density was described by the quantitative model developed with GA-MLR technique. The developed model is a linear combination of two, uncorrelated (r = 0.09) descriptors: the 3D-MoRSE descriptor weighted by mass calculated for anion (Mor03 mA), and the mean information content on the leverage magnitude calculated for cation (HICC). The models’ equations obtained for the ions optimized with the three methods (PM7, HF/6-311+G* and B3LYP/6-311+G*) are presented in Table 2.

Table 2 Equations of developed models

First two models (PM7- and HF-based) are characterized by satisfying goodness-of-fit, robustness and predictive capabilities (the values of R2, \({\text{Q}}_{\text{CV}}^{2}\), \({\text{Q}}_{\text{EXT}}^{2}\) and CCC close to 1 and low values of the errors: RMSEC, RMSECV, RMSEEXT). The last one, developed with using of descriptors calculated based on the structures of ions optimized via B3LYP method has lower quality parameters. The visual correlations between the observed (experimental) and the predicted density values for the three developed models (Fig. 3) were in good accordance with the statistical parameters mentioned above.

Fig. 3
figure 3

Experimental versus predicted density values for each optimization method (PM7, HF/6-311+G* and B3LYP/6-311+G*)

Interestingly, the three developed models do not have identical applicability domain (Fig. 4). In case of the model utilizing descriptors calculated after PM7 optimization, two compounds, namely: 1-(2-methoxyethyl)-1-methylpyrrolidinium tris(pentafluoroethyl)trifluorophosphate (#56) and 1-methyl-3-octylimidazolium tris(pentafluoroethyl)trifluorophosphate (#28), exhibit leverage values higher than the critical one. However, the residual values stay within ±3 standard deviations from the mean value. That kind of point is called “good high leverage point” or “good influence point” and they stabilize the model (predicted data are correctly extrapolated) [56]. In case of the model developed based on descriptors obtained after HF optimization, all molecules are located within the space limited by threshold of ±3σ and critical leverage value. One ILs, namely: 1-methyl-3-propylimidazolium chloride, is placed on the edge of AD. As such, the value of density predicted for that IL has to be taken into consideration with greater caution. The model developed with descriptors calculated after optimization by B3LYP method has two points with leverage values higher than critical one. First IL, having small leverage value and residual the value between ±3σ (1-(2-methoxyethyl)-1-methylpyrrolidinium tris(pentafluoroethyl)trifluorophosphate) is a “good influence point”. Second IL with the high leverage value and the residual lower than −3σ (1-methyl-3-octylimidazolium tris(pentafluoroethyl)trifluorophosphate) is a “bad influence point”. It destabilizes the model [56]. Interestingly, 1-methyl-3-octylimidazolium tris(pentafluoroethyl)trifluorophosphate is considered as a “good influence point” in case of the model utilizing descriptors calculated based on PM7 optimization. That difference results from better prediction capability of the “PM7-based” model (predicted values are more similar to experimental ones).

Fig. 4
figure 4

Applicability domain (Williams plot) for each optimization method (PM7, HF/6-311+G* and B3LYP/6-311+G*)

As it was mentioned in the section: Validation process, applicability domain was also determined by using application “Applicability domain using standardization approach” (Table D in Supplementary Information). For “PM7-based” model, accordingly to standardization method, none of the ILs was consider as outlier, all have similar features. That result is not consistent with leverage approach, where two compounds from training set (#28, #56) were identified as less structurally similar to the others. In case of “HF-based” model obtained results are identical for both approaches, none of the compounds was classify as outlier or point out of AD. In case of the third model, “B3LYP-based”, two mentioned points were recognized as outliers, confirming the results from leverage approach. However, visualization of the outcome of leverage approach (Williams plot) gives us more precise information. Both points are less structurally similar to the rest from the training set, although prediction for one of them (#28) cannot be considered as reliable due to residual value outside of ±3σ limits.

All three presented models are linear combinations of the same two descriptors: Mor03 mA and HICC. The first one belongs to the group of 3D-MoRSE descriptors. That wide group is based on the electron diffraction descriptors and can be calculated with various weights (atomic mass, van der Waals volume, Sanderson electronegativity, polarizability). Notation Mor03 m indicates that the descriptor used here is weighted by mass; number 3 is related to scattering parameter. Generally, the weighted descriptors could be used to identify presence of specific molecular fragments. Weighting by atomic mass increases effect of heavy atoms on the values of 3D-MoRSE descriptors [57].

For anions with similar skeleton of the molecule and equal number of atoms) e.g. cysteinate and serinate anions) the presence of sulfur atom significantly decreases the value of Mor03 m descriptor (Fig. 5). Moreover, an increasing molecular mass (i.e., by adding next substituent, e.g. methyl group in threonate anion) also decreases descriptor’s value.

Fig. 5
figure 5

Relationship between anion structure and Mor03 mA descriptor

In case of the studied set of anions, we found that the heaviest atom in the molecule has a significant impact on the modeled value. When one considers two ILs with the same cation and structurally similar anions, such as hexafluorophosphate and tetrafluoroborate, one can notice that the ionic liquid with \({\text{PF}}_{6}^{ - }\) has higher density value (1.370) than the one with \({\text{BF}}_{4}^{ - }\) anion (1.202). That difference is caused by the higher molar mass of phosphorous atom (30.97 u) in \({\text{PF}}_{6}^{ - }\) anion. Both boron and fluorine atoms present in \({\text{BF}}_{4}^{ - }\) anion have lower molar mass than phosphorus.

Second descriptor (HICC), belongs to the GETAWAY group. It is defined as:

$${\text{HIC}} = \sum\limits_{\text{i}}^{\text{A}} {\frac{{{\text{h}}_{\text{i}} }}{\text{M}} \cdot \log \frac{{{\text{h}}_{\text{i}} }}{\text{M}}}$$
(3)

where A is the number of atoms, h i the leverage of the ith atom and M is a constant equal to 1 for linear, 2 for planar, and 3 for non-planar molecules.

HICC can be used to distinguish between the substituents in a series of cations [58]. For example, the value of HICC for 1-methyl-3-methylimidazolium cation is equal to 3.612. Alkyl chain elongation results in the increase of the descriptor’s value (1-ethyl-3-methylimidazolium = 4.015, 1-propyl-3-methylimidazolium = 4.247). When one considers the group of imidazolium ILs with the same anion ([NTf2]), one would notice that the value of HICC descriptor is inversely proportional to the density. The relationship between the density and the descriptor values for imidazolium cation (geometry after PM7 optimization) was showed in Fig. 6.

Fig. 6
figure 6

Relationship between density values and HICC descriptor for imidazolium cation

We also explored the distribution of the selected unique cations and anions present in the studied ILs (Table E in Supplementary Information) in the space of Mor03 mA and HICC descriptors (Figs. 7, 8). Therefore, we were able to find a relationship between the optimization methods of ionic structure and the values of the descriptors from developed QSPR models. Figure 7 demonstrates the anions’ distribution in the space of Mor03 mA descriptor. Similarly, Fig. 8 shows the distribution of the particular cations in the space of HICC descriptor.

Fig. 7
figure 7

Distribution of particular anion in space of Mor03mA descriptor calculated after optimization by PM7, HF/6-311+G* and B3LYP/6-311+G* methods

Fig. 8
figure 8

Distribution of particular cation in space of HICC descriptor calculated after optimization by PM7, HF/6-311+G* and B3LYP/6-311+G* methods

We noticed that the descriptor’s values obtained with using of the three studied geometry optimization methods are similar for almost all anions (Fig. 7). Though, there are three points (#5, #13, #27), which show difference between the optimization methods. In case of bis[(trifluoromethyl)sulfonyl]imide anion (#5), we noticed that the application of PM7 and B3LYP methods give almost identical descriptor values for that anion, but in case of HF there is a difference in the Mor03 m value. For the two other anions: hexafluorophosphate (#13) tris(pentafluoroethyl)trifluorophosphate (#27) HF and B3LYP methods provided more similar descriptor values than that of PM7 method. For example, the values of Mor03 m for #13 in DFT and HF panels are equal −7.51, while in case of PM7 the value raises up to −6.67. On the contrary, the effect of optimization method on the HICC descriptor is negligible (Fig. 8). Descriptor’s values for all cations were similar in all three studied methods.

Recommendations

The most important observation of our study is that there were significant differences between the quality measurements of the three developed QSPR models. Best-fitted model (highest value of determination coefficient, R2 = 0.951) were developed based on the molecular structures optimized with HF method. When one considers the predictive capabilities of the model, it turns out that the same model has got the highest value of \({\text{Q}}_{\text{EXT}}^{2}\) (the predicted values are similar to the experimental data). That result indicates that HF method can be successfully employed in QSPR studies. However, it should be highlighted that PM7-based model has comparable values of the quality measures as well. The only exception is the model developed with descriptors obtained after B3LYP optimization. The model parameters were significantly different from those for “HF-based” and “PM7-based models”—correlation coefficients R2, \({\text{Q}}_{\text{CV}}^{2}\), \({\text{Q}}_{\text{EXT}}^{2}\) were lower than 0.90, values of the errors RMSEC, RMSECV, RMSEEXT were higher than 0.070.

When deciding on whether HF or PM7 method can be applied for geometry optimization in QSPR studies, two aspects should be taken under consideration. First of all, both methods have satisfactory statistical characteristic of quality (Table 2). Secondly, HF methods require more intensive labor and cost than semi-empirical methods. When one considers the computing time and predictive capability of “HF-based” and “PM7-based” QSPR models it becomes clear that the semi-empirical method is faster and less expensive with simultaneously comparable results. Therefore, it can be successfully employed to geometry optimization in QSPR modeling. That conclusion is consistent with the earlier study published by Puzyn [37] and Rinnan [59]. These researches demonstrated that Hartree–Fock method (HF/6-31G) as well as semi-empirical (PM6 and RM1) could be successfully employed in QSPR studies. However, it should be mentioned that obtained results might be related to the dataset (e.g. number and type of compounds) and modeled value. Studies published by Roy [60] and Kar [61] showed that geometry optimization performed at higher levels of theory (MP2—that includes electron correlation) lead to the considerably higher quality of the developed model. Based on the current results of our study, we can conclude that PM7 method could be satisfactory used as an optimization approach for ionic liquids. One should be remembered that semi-empirical methods produce correct results only for the structures that are sufficiently similar to those which were used to parameterization particular semi-empirical approach [37].

Conclusions

In the presented work, we have compared three methods of geometry optimization: semi-empirical PM7, ab initio Hartree–Fock (6-311+G*) and DFT B3LYP (6-311+G*). We asked a question: How much the choice of the optimization method influences a QSPR model?

We demonstrated that 3D descriptor groups are sensitive on changing the optimization method; thereby the geometry optimization step affects the quality of the QSPR models. The results of statistical Wilcoxon’s test confirmed that particular methods provide various descriptor values.

We also have developed a model that could be used to predict the density of ILs and have examined an impact of the optimization method on the quality measures of the model. The QSPR models utilizing descriptors derived from the structures optimized at semi-empirical and ab initio levels had similar values of the validation characteristics. It means that both models had similar quality. However, when using semi-empirical methods one could calculate the descriptors and develop a QSPR in much shorter period of time. For that reason, we recommend using semi-empirical methods, such as PM7, for geometry optimization in QSPR studies for ionic liquids.