Investigation of the effects of cation/anion structure on the distribution of thiophene between IL-rich and hydrocarbon-rich phases by a QSPR approach

Selecting an appropriate ionic liquid (IL) as the extraction solvent for removing sulfur compounds from fuel is a promising way to reduce SOx emissions and prevent environmental pollution. To support the selection of a proper IL, the structural descriptors of different cations and anions that play an important role in the distribution of thiophene between the IL and hydrocarbon phases of ternary systems have been investigated using the quantitative structure-property relationship (QSPR) approach. Five datasets containing 664 data points were collected from the literature. By investigating various 1D, 2D and 3D molecular descriptors, it was found that an increase in the Spanning Tree Number (STN) descriptor, used as the cation structure descriptor, decreases the mole fraction of thiophene in the IL-rich phase because of increased steric hindrance. The STN descriptor reflects the complexity of the cation structure and carries information about steric hindrance. It was also found that an increase in the E1p value, used as the anion structure descriptor, decreases the mole fraction of thiophene because the interaction between cation and anion in the ionic liquid becomes stronger, which reduces the space available around the cation to interact with thiophene. The E1p descriptor represents the density of atoms projected along the length, or the unfilled space, of the anion; it is therefore informative about the accessibility of these atoms to thiophene.


Introduction
The presence of sulfur and its derivatives in transportation fuels is an environmental hazard because of SOx emissions. Therefore, in the last few decades, environmental concern has increased, and strict standards and regulations have been implemented all over the world to minimize the negative health and environmental effects of automobile exhaust. The maximum sulfur content of transportation fuels has been limited to less than 15 ppm in many countries. Consequently, the deep desulfurization of fuels has attracted increased attention in the research community worldwide.

[Figure: number of published papers versus year]
... for theoretical screening of ionic liquids, which provide useful insights into certain thermodynamic properties of ionic liquids, but they are computationally demanding, even for studying a single ionic liquid molecule [19,20]. This makes them impractical for screening ionic liquids within a large set of potential candidates. On the other hand, methods using the conductor-like screening model for real solvents (COSMO-RS) have been demonstrated to predict the activity coefficients and phase equilibria of systems involving ionic liquids [14,17,18,21-26]. Despite the many uses of this method in the literature, there is still considerable room for improving the quantitative prediction accuracy of the COSMO models and for developing suitable COSMO databases [25]. For instance, Gao et al. [17] investigated the influence of different parameters on the capacity of several ionic liquids for thiophene and dibenzothiophene in order to remove them from fuels. They concluded that the desulfurization efficiency increases strictly with the capacity of the ionic liquid for thiophene and dibenzothiophene, which is consistent with the expectation that a higher capacity brings better desulfurization efficiency. However, comparison with other works shows that some of the ionic liquids reported by Gao et al. do not follow this trend [15,27]. Given the large amount of experimental data available on extractive desulfurization with ionic liquids, a systematic screening method is desirable.
Quantitative structure-property relationship (QSPR) modeling is a robust method for relating macroscopic properties to molecular-level descriptors [28]. Holbrey et al. [16] investigated the extraction of sulfur compounds from dodecane using ionic liquids and used a QSPR analysis of solvent group contributions to the extraction. Gorji et al. [29] employed the QSPR methodology to propose models for predicting the thiophene distribution between the IL- and hydrocarbon-rich phases in ternary systems containing an IL, thiophene, and a hydrocarbon solvent. Moreover, they investigated the effects of different anion and cation structures on the thiophene distribution between the ionic liquid and hydrocarbon phases in ternary systems using the QSPR approach [30,31].
In this work, QSPR has been used to investigate the effect of cation/anion structures on the sulfur-compound distribution coefficient between the IL- and hydrocarbon-rich phases in ternary systems of extractive desulfurization.

Experimental database
To obtain generalized models that are able to provide predictions, the experimental domain must be large and evenly covered. Therefore, five datasets containing 664 data points were collected from the literature, as summarized in Table 1.

Methodology
Before any molecular descriptors are computed, the molecular structures should be brought to their lowest energy level. Therefore, the structures of both the cation and the anion of each ionic liquid were optimized by energy minimization in the ChemBio3D Ultra software [56], using the Molecular Mechanics 2 (MM2) feature of Chem3D. Molecular mechanics is a simulation technique that employs the equations of classical physics to compute bonded contributions such as bond stretching, angle bending, and torsional energy, along with non-bonded contributions [57]. It considers the attractive and repulsive forces that control the relative positions of the nuclei of the atoms constituting a structure. The potential energy of a given molecule can be represented by the simplified equation (1). Here, a mechanical model is hypothesized in which spheres representing atoms are joined by mechanical springs representing covalent bonds. The energy terms in equation (1) are formally defined in Table 2. The interaction and energy functions described by classical physics are also termed "force fields". The steric energy of a molecule is first determined with the force field, and the conformation is then adjusted to minimize the steric energy.
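As a generic illustration of the kind of expression referred to as equation (1), and not necessarily the exact form used by MM2 or by the authors, the steric energy in molecular mechanics is commonly written as a sum of bonded and non-bonded contributions. The harmonic and cosine forms and the symbols $k_r$, $k_\theta$, $V_n$, $r_0$, $\theta_0$ and $\gamma$ below are illustrative assumptions rather than parameters taken from this work:

```latex
\begin{align}
E_{\text{steric}} &= E_{\text{stretch}} + E_{\text{bend}} + E_{\text{torsion}} + E_{\text{non-bonded}} \\
E_{\text{stretch}} &= \sum_{\text{bonds}} \tfrac{1}{2}\, k_r \,(r - r_0)^2 \\
E_{\text{bend}} &= \sum_{\text{angles}} \tfrac{1}{2}\, k_\theta \,(\theta - \theta_0)^2 \\
E_{\text{torsion}} &= \sum_{\text{dihedrals}} \tfrac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
\end{align}
```

In this picture each bond behaves like a spring with rest length $r_0$, and energy minimization moves the atoms until the sum of all terms stops decreasing.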
In the ternary systems, Y2 can be related to X2 as follows [29,59]:

$$ Y_2 = a X_2 + b \qquad (3) $$

where a and b are adjustable parameters that express the slope (the average distribution coefficient) and the intercept for each ternary system, respectively. In this study, the QSPR approach has been used to relate Y2 to X2 through a linear equation obtained by replacing b in equation (3) with the molecular descriptors of the cations. For anions, the collected datasets with the same hydrocarbon do not contain enough varied data for building models, so datasets with different hydrocarbons were considered. Therefore, the effect of each hydrocarbon on the mole fraction of thiophene in the IL-rich phase (Y2) should be investigated. Gorji et al. [29] used the Wiener polarity number (Pol) as the molecular characteristic of the hydrocarbon for this purpose.
Among the pool of molecular descriptors calculated by the Dragon software, the best ones should be selected as variables along with X2 to find the QSPR models for cations and anions. For this purpose, a genetic algorithm (GA) was applied. GA-based multiple linear regression (GA-MLR) models, built with the QSARINS software, were used in the current study [60].
After developing the QSPR models, it is essential to validate them to assess their reliability and ...
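The linear relation between Y2 and X2, and its extension with one structural descriptor, can be sketched with ordinary least squares. This is a minimal illustration under stated assumptions, not the QSARINS GA-MLR workflow: the helper `fit_linear_model` is hypothetical, X2 is assumed to be the thiophene mole fraction in the hydrocarbon-rich phase, and all numbers are synthetic rather than data from the collected datasets.

```python
import numpy as np


def fit_linear_model(columns, y):
    """Ordinary least squares for y = sum_i coef_i * column_i + intercept.

    Returns the coefficients in the order of `columns`, followed by the intercept.
    """
    design = np.column_stack(list(columns) + [np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic, illustrative values only (NOT data from the paper).
x2 = np.array([0.05, 0.10, 0.20, 0.30, 0.40, 0.50])   # assumed: thiophene in hydrocarbon-rich phase
descriptor = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0])  # hypothetical cation descriptor values
y2 = 0.8 * x2 - 0.02 * descriptor + 0.10               # thiophene in IL-rich phase (constructed)

# Model without a structural descriptor: Y2 = a*X2 + b
a, b = fit_linear_model([x2], y2)

# Model with the descriptor added: Y2 = a*X2 + c*descriptor + b
a_d, c_d, b_d = fit_linear_model([x2, descriptor], y2)

print(f"X2 only:       a={a:.3f}, b={b:.3f}")
print(f"X2+descriptor: a={a_d:.3f}, c={c_d:.3f}, b={b_d:.3f}")
```

Because the synthetic Y2 was built with a negative descriptor coefficient, the second fit recovers that negative sign, which mirrors how a descriptor associated with lower thiophene uptake would appear in such a model.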

Results and Discussion
The target property examined in this work was the correlation between Y2 and X2 for each ternary system, involving different cations, anions and hydrocarbons in each dataset, obtained by applying the QSPR approach.
First, the correlation between Y2 and X2 in each dataset was examined without considering the effects of the cation/anion and the hydrocarbon. Then, for the anion datasets, models were developed that account for the influence of the hydrocarbon on Y2 by adding the Pol descriptor. Afterwards, the effects of cation/anion structure were evaluated by adding their molecular descriptors. The models developed for each dataset are shown in Table 4. As can be seen in Tables 4 and 5, for dataset 1 the addition of a cation descriptor (the STN descriptor) to the X2 variable increased the coefficient of determination (R²) and the F-value (F) and decreased the root mean square error (RMSE) and the average absolute deviation (AAD). Accordingly, comparing the statistical parameters of equations (5) and (6) in Table 5 shows that adding a cation descriptor to the X2 variable improved the prediction ability of the QSPR model. Therefore, equation (6) is an appropriate model for predicting Y2 for dataset 1.
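The statistical parameters used to compare the models can be computed as in the sketch below. The exact definitions used by the authors are not given in this section, so the forms here are assumptions: RMSE is taken as the root of the mean squared residual, AAD as the mean absolute residual, and F as the usual regression F-statistic for a least-squares model with an intercept. The function name `regression_statistics` is hypothetical.

```python
import numpy as np


def regression_statistics(y_exp, y_pred, n_predictors):
    """Return R2, RMSE, AAD and F for predictions of a linear model with intercept."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_exp.size
    residuals = y_exp - y_pred
    ss_res = np.sum(residuals ** 2)                   # residual sum of squares
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(ss_res / n)                        # assumed definition
    aad = np.mean(np.abs(residuals))                  # assumed definition (absolute, not relative)
    # Explained variance per predictor over residual variance per degree of freedom.
    f_value = ((ss_tot - ss_res) / n_predictors) / (ss_res / (n - n_predictors - 1))
    return {"R2": r2, "RMSE": rmse, "AAD": aad, "F": f_value}


# Illustrative numbers only (NOT data from the paper).
stats = regression_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9], n_predictors=1)
print(stats)
```

Since F divides the explained variance by the number of predictors, adding a descriptor can raise R² while lowering F when the extra variable explains little additional variance.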
For dataset 2, the added cation descriptor (the Mor04v descriptor) improves the statistical parameters except the F-value, which indicates that the added descriptor is not a suitable one. For dataset 3, the added cation descriptor (the Mor04v descriptor) leads to no significant enhancement of the statistical parameters. Thus, Y2 can be predicted without adding a cation descriptor in this dataset. Fig. 2 shows the Y2 predicted by the QSPR models versus the experimental Y2 for dataset 1. As can be seen in this figure, equation (6) predicts Y2 with acceptable accuracy, while the data predicted by equation (5) are not in desirable agreement with the experimental data. Figs. 3 and 4 show plots of predicted Y2 versus experimental data for datasets 2 and 3. As can be observed in these figures, the data predicted with equations (8) and (10) are in acceptable agreement with the experimental data compared to equations (7) and (9).
Because the F-value decreases and the statistical parameters show no noticeable improvement in equation (10), only equations (6) and (8) are compared. According to Table 5, the coefficient of determination of equation (6) is noticeably higher than that of equation (8).
Consequently, the STN descriptor is a more desirable cation descriptor than the Mor04v descriptor for enhancing the predictive models. The STN descriptor is a structural descriptor from the topological descriptor category that is used as a measure of molecular complexity; it increases with the complexity of the molecular structure [61]. Specific algorithms have been proposed to calculate the number of spanning trees in molecular graphs. The more connectivities and branches a molecular structure has, the higher the STN value and the greater the steric hindrance. Accordingly, as steric hindrance increases, less space is available around the cation to interact with thiophene. As a consequence, the STN descriptor enters the QSPR model for dataset 1 with a negative sign. It should be mentioned that Joule et al. [62] and Gupa et al. ...

[Figure: predicted Y2 from equations (6) and (7) for the training and test sets versus experimental Y2 for dataset 2]

For datasets 4 and 5, as observed in Tables 4 and 5, the addition of the Pol descriptor, which reflects the effect of the variation of the hydrocarbon, improved the ability of the models to predict Y2, as expected; the reason why adding the Pol descriptor boosts the prediction ability of QSPR models is reported in the study of Gorji et al. [29]. Adding the hydrocarbon descriptor in equations (12) and (15) increases the R² value and reduces the F-value for both datasets. Furthermore, the statistical parameters for these two equations, presented in Table 5, demonstrate that the addition of the Pol descriptor improves the accuracy of the models for predicting Y2. In datasets 4 and 5, the third variable in equations (13) and (16) is an anion descriptor that takes into account the effects of the structural features of the anion on the capability of the models to predict Y2; these are the E1p and Mor28v descriptors for equations (13) and (16), respectively, as shown in Figs. 5 and 6.
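To make the spanning tree count behind the STN descriptor concrete, the sketch below counts the spanning trees of a hydrogen-depleted molecular graph with Kirchhoff's matrix-tree theorem (the count equals any cofactor of the graph Laplacian). This is an illustration of the underlying graph quantity, not the Dragon implementation; descriptor software may report a transformed value (for example a logarithm), which this section does not specify. The function name and the example graphs are assumptions chosen for illustration.

```python
import numpy as np


def spanning_tree_count(n_atoms, bonds):
    """Count spanning trees of a connected molecular graph via the matrix-tree theorem.

    n_atoms: number of non-hydrogen atoms, indexed 0..n_atoms-1
    bonds:   iterable of (i, j) index pairs, one per bond (bond order ignored)
    """
    laplacian = np.zeros((n_atoms, n_atoms))
    for i, j in bonds:
        laplacian[i, i] += 1
        laplacian[j, j] += 1
        laplacian[i, j] -= 1
        laplacian[j, i] -= 1
    if n_atoms == 1:
        return 1
    # Delete the first row and column; the determinant of the rest is the count.
    minor = laplacian[1:, 1:]
    return int(round(np.linalg.det(minor)))


# Illustrative skeletons (hypothetical examples, not structures from the datasets).
chain = [(0, 1), (1, 2), (2, 3)]                          # acyclic four-atom chain
five_ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]      # single five-membered ring
fused_rings = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
               (4, 6), (6, 7), (7, 8), (8, 9), (9, 5)]    # two fused six-membered rings

print(spanning_tree_count(4, chain))
print(spanning_tree_count(5, five_ring))
print(spanning_tree_count(10, fused_rings))
```

The bond list is the only structural input, so the same function can be applied to any cation skeleton once its atoms are numbered.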
Comparing equations (13) and (16), it can be concluded that equation (13) ...

It is noteworthy that the potential of an ionic liquid for extracting thiophene from a hydrocarbon depends on the intermolecular interactions between thiophene and the ionic liquid [23]. Accordingly, an increase in the unfilled space of the anion leads to a stronger cation-anion interaction in the ionic liquid, which reduces the availability of the cation to interact with thiophene. Therefore, it is reasonable that the E1p descriptor has a negative sign in equation (13). Briefly, a higher value of the E1p descriptor reflects more unfilled space in the anion and a greater affinity of the anion for the cation, leading to a decrease in Y2. Similar results were reported in previous studies [14,18,23].

[Figure: predicted Y2 from equations (13), (14) and (15) for the training set (panels a to c) and the test set (panels d to f)]

Conclusion
In the present study, the effect of the cation/anion structure of ionic liquids on the distribution coefficient of thiophene in ternary systems was investigated by applying the QSPR approach. Five datasets were collected from the literature: three contain different cations with a fixed anion and hydrocarbon in each dataset, and the other two contain different anions and hydrocarbons with a fixed cation. The results show that the cation structure plays a significant role in the thiophene distribution coefficient. Adding the STN descriptor to X2 as the cation descriptor in the QSPR model improved the predictive capability of the model (from R² = 0.81 to R² = 0.92). An increase in STN decreases the mole fraction of thiophene in the IL-rich phase because of increased steric hindrance. It was also found that the anion structure, along with the structural features of the hydrocarbon, plays an important role in the thiophene distribution coefficient. Accordingly, the E1p descriptor was added as the anion structure descriptor to the model containing the X2 and Pol variables, which improved the predictivity of the QSPR model (R² = 0.93). In summary, the E1p descriptor represents the density of atoms projected along the length, or the unfilled space, of the anion; it is therefore informative about the accessibility of these atoms to thiophene. As a result, an increase in the E1p value decreases Y2 owing to the stronger cation-anion interaction in the ionic liquid. This work could provide a basis for future research on selecting a proper ionic liquid based on cation/anion structure descriptors.

Availability of data and materials
Not applicable