Quantum Mechanical Prediction of Dissociation Constants for Thiazol-2-imine Derivatives

As weak acids or bases, in solution, drug molecules are in either their ionized or nonionized states. A high degree of ionization is essential for good water solubility of a drug molecule and is required for drug–receptor interactions, whereas the nonionized form improves a drug’s lipophilicity, allowing the ligand to cross the cell membrane. The penetration of a drug ligand through cell membranes is mainly governed by the pKa of the drug molecule and the membrane environment. In this study, with the aim of predicting the acetonitrile pKa’s (pKa(MeCN)) of eight drug-like thiazol-2-imine derivatives, we propose a very accurate and computationally affordable protocol by using several quantum mechanical approaches. Benchmark studies were conducted on a set of training molecules, which were selected from the literature with known pKa(water) and pKa(MeCN). Highly well-correlated pKa values were obtained when the calculations were performed with the isodesmic method at the M062X/6-31G** level of theory in conjunction with SMD solvation model for nitrogen-containing heterocycles. Finally, experimentally unknown pKa(MeCN) values of eight thiazol-2-imine structures, which were previously synthesized by some of us, are proposed.


INTRODUCTION
The physicochemical features of a drug molecule, such as solubility, partition coefficient, hydrogen-bonding ability, degree of ionization, and protein binding affinity, are directly related to its therapeutic action. Among these properties, ionization degree (pK a ) plays a significant role in the design of smart drug delivery systems. A high degree of ionization of a drug ligand is a prerequisite for good water solubility and therefore high hydrophilicity, which is required for drug− receptor interactions.
The nonionized form of the ligand improves the lipophilicity of a drug, and therefore, it helps the drug molecule to cross the nonpolar cell membrane. 1−4 Thus, the penetration of a drug ligand through the cell membranes is mainly governed by the pK a of the drug molecule and the membrane environment. On the other hand, since the reactivity and stability of a molecule are highly dependent on its pK a , the strength of acidity or basicity of a molecule in any solvent determines the mechanisms of chemical reactions involving synthesis, catalysis, oxidation−reduction, and decomposition. In eq 1, the general monoprotic acid (HA) dissociation equilibrium reaction and the deprotonation products (H + , A − ) are shown. The pK a of the given equilibrium is the negative logarithm of the equilibrium constant (K a ), as shown in eq 2. The standard free-energy difference between the reactants and products of the dissociation reaction given in eq 1 is related to K a and pK a as shown in eqs 3 and 4  (4) Accurate and fast pK a predictions by computational methods are crucial for more efficient drug design. Quantum chemical approaches have been extensively used for the prediction of pK a 's of the ligands. 5−16 Most of the methods differ in the calculation of the dissociation free energies (ΔG aq ) of a given deprotonation reaction (eq 3). The most straightforward theoretical approach is based on the direct calculation of the Gibbs free energies of the equilibrium reaction of the acid in solution (eq 1). This method, the so-called direct method, faces significant errors arising from the calculation of the free energy of H + . In the thermodynamic cycle approaches, TC1 (Scheme 1) and TC2 (Scheme 2), the free energy of solvation, desolvation, and deprotonation of the acid species HA is considered, where ΔG aq values can be calculated by = + G G G aq gas solv (5) ΔG gas and ΔΔG solv terms are calculated by ab initio or density functional theory (DFT) methods by adapting the expressions to the thermodynamic cycle constructed. According to the TC1 (Scheme 1), the most commonly implemented thermodynamic cycle, these terms are Note that in these reactions the H 2 O is the base and signifies a water cluster, whereas H (aq) + and H 3 O (aq) + represent the hydrated proton product. 18 Even though the former is claimed to be simpler and truer than the latter, both models are widely used for pK a calculations.
The isodesmic reaction scheme for pK a calculations is based on a reaction between an acid (HA) and the conjugate base of a reference molecule (B − ) with an experimentally known pK a as shown in Scheme 3, whose equilibrium constant is described in eq 6. In the isodesmic reactions, the types of bonds that are made to form the products are the same as those that are broken in the reactant. The pK a of acid HA relative to pK a of reference molecule HB is calculated by eq 7. Since in the isodesmic method, only the free energies of the reactants and products in solution are calculated, the errors due to the gasphase calculations are cancelled and as a result more accurate pK a 's are obtained. The protonation/deprotonation process occurs in similar environments in molecules A and B.
With significant biological activities, thiazol-2-imine derivatives are commonly used in organic, synthetic, pharmaceutical, and medicinal chemistry areas due to their anti-inflammatory, analgesic, 19 antibacterial, 20 antifungal, 21 antiviral, 22   Journal of Chemical Information and Modeling pubs.acs.org/jcim Article inhibition activities. 19 These molecules possess chiral centers; thus, the synthesis of thiazol-2-imine derivatives as single enantiomers is of great interest. Moreover, these thiazol-2imine derivatives that are present as single enantiomers can be used in asymmetric chemistry such as catalysts in converting olefins to primary amines, 23 asymmetric hydrogenations, 24 and generating chiral α-aminonitriles. 25 Recently, Tuncel and Dogan have synthesized single enantiomer thiazol-2-imines (5RR, 6RR, 7RR, 8RR) by water elimination reactions from the corresponding 2-iminothiazolidine-4-ones (1RR, 2RR, 3RR, 4RR) ( Figure 1). 26 The synthesized molecules are insoluble in water, and their pK a 's have been predicted in acetonitrile (MeCN). Due to its relatively lower polarity compared to water, MeCN is a widely used solvent in the pharmaceutical industry since it can mimic the membrane environment. 27,28 As pK a values are strongly dependent on the environment, water pK a 's may lead to improper conclusions about the degree of ionization of a molecule within the membrane. As an aprotic solvent with a pK a of 25, MeCN is a poor H-bond acceptor and a weak base; thus, when it is used as a solvent, it increases the pK a values of the solutes, such that acidic compounds act as weaker acids and basic compounds act as stronger bases in MeCN compared to water. Note also that nonaqueous pK a data are highly important in many fields since the available experimental data are less abundant than desirable. 29 Several solvents, including dimethyl sulfoxide, MeCN, and tetrahydrofuran (THF), have found wide application as media for studies of strong bases. MeCN has some advantages over other aprotic solvents: it is a very weakly basic aprotic solvent with a high dielectric constant (36.016) and hence favors the dissociation of ion pairs into free ions. MeCN solvates ions more weakly than water, being a weak electron-pair donor solvates cations better than anions. 4,30 Compounds 1RR-8RR, in addition to being drug-like molecules, are expected to be strong chiral organic bases as well because of the amidine conjugation in their structures ( Figure 2). Since they are anticipated to be stronger bases than their amine precursors, their pK a values are worth determining in different media.
In this study, the acetonitrile and water pK a 's of thiazol-2imine derivatives will be predicted by employing the isodesmic reaction scheme based on the previously reported success of the method. 14,31−35 With this purpose, by using several quantum mechanical approaches, we propose a very accurate protocol. In the first part, the methodology to be employed throughout the study is identified from a benchmark study, where several DFT functionals and solvent models are examined together with TC1 and TC2 schemes, for reproducing the experimentally known water pK a of the reference molecule to be used in the isodesmic reaction. In the second and third parts, the proposed methodology is validated on two sets of test molecules whose water and acetonitrile pK a 's are experimentally known. In the fourth part, pK a(MeCN) of the reference molecule to be used in the isodesmic reaction is calculated by the established methodology, and finally, pK a(MeCN) 's of thiazol-2-imines are calculated by using the isodesmic reaction scheme. The protocol proposed in this study can be used with confidence to predict the water and acetonitrile pK a 's of thiazol-2-imine derivatives.

COMPUTATIONAL METHODOLOGY
A systematic conformational search was conducted for all of the molecules employing the semiempirical PM3 method 36 by using the SPARTAN14 software. 37 Free rotations around single bonds were taken into account, and all of the geometries corresponding to stationary points have been reoptimized by using the Gaussian16 program suite. 37 For the benchmarking study of the DFT method, combinations of a hybrid-GGA exchange−correlation functional (B3LYP), 38 a hybrid meta-GGA functional (M062X), 39 and a hybrid-GGA exchange− correlation functional (ωB97XD) 40 with 6-31+G*, 6-31G**, 6-31+G**, 6-31++G**, and 6-311++G** basis sets were tested. The universal solvent model (SMD) 41 and the conductor-like polarizable continuum model (CPCM) 42 were employed to mimic the aqueous solvent environment. The partial atomic charges were derived from Natural Population Analysis (NPA), 43 Charge Model 5 (CM5), 44 and Hirshfeld Analysis. 45 root-mean-square error (RMSE), mean absolute deviation (MAD), and MD (mean deviation) were calculated to compare the efficiency of our predictions.

RESULTS AND DISCUSSION
Since thiazol-2-imines synthesized by Tuncel and Dogan 26 in Figure 1 are water-insoluble molecules, their pK a 's are predicted in an acetonitrile environment by employing the isodesmic reaction scheme. With this respect, several quantum chemical approaches were utilized. In the first place, in order to propose a methodology that describes the system best, the experimentally known water pK a of the reference molecule to be used in the isodesmic reaction was estimated, employing the TC1 and TC2 schemes with a set of different DFT functionals and solvent models. In the second part of the study, the proposed methodology was validated on a set of 2-(phenylimino)imidazolidine derivatives, and then in the third part the verification of the methodology in an acetonitrile environment was performed for a set of nitrogen-containing heterocyclic compounds. The successful applications of the proposed methodology allowed us to predict the pK a(MeCN) of the reference molecule, 2-imino-thiazolidinone. And finally, with the predicted acetonitrile pK a of the reference molecule, the pK a(MeCN) 's of thiazol-2-imines in Figure 1 were calculated by employing the isodesmic reaction scheme.

Identification of the Methodology.
As the reference molecule to be used in the isodesmic reaction scheme for the prediction of pK a(MeCN) 's of thiazol-2-imines, structurally resembling 2-imino-thiazolidinone ( Figure 3) with the experimental water pK a of 11.70 46 is used. Since the experimental pK a(MeCN) of the reference molecule is not known, a 3-step protocol is followed to predict its pK a(MeCN) . As well as the calculation scheme to be applied, the accuracy of the prediction is directly related to the quality of the QM method and the solvation model employed to describe the system's electronic nature and solute−solvent interactions. Therefore, the computational methodology to be used is identified by employing TC1 and TC2 schemes with a set of different DFT functionals, basis sets, and solvation models for the estimation of the water pK a of the reference molecule. The reference molecule has two conformers: hydrogen on 2-imine N being either up (toward S) or down (toward N3). "Hydrogen up" conformation, which was calculated to be 0.76 kcal/mol more stable than the "down" conformation, is further considered in our calculations.
As there are two protonation sites in the reference molecule (N3 and 2-imino N), NPA, CM5, and Hirschfeld charge analysis were performed for the identification of the dominant protonation/deprotonation site. The computed charges with all methodologies were found to give the same trend, with 2imino N having a higher electron density (q N(NPA) = −0.628e) compared to N3 (q N3(NPA) = −0.521e). Thus, the reported partial atomic charges of neutral and protonated (on 2-imino N) 2-imino-thiazolidinone are based on NPA calculations in the rest of the manuscript. First, the TC1 scheme displayed in Scheme 1 was employed with several levels of theory in order to predict the water pK a of the reference molecule. Two ΔG solv (H + ) values (−265.9 kcal/mol and −270.3 kcal/mol) were used for each trial, and the results presented in Table 1 show that the TC1 scheme predicts much lower pK a 's than that of the experimental value, which indicates the presence of a systematic error. TC1 seems to fail in the prediction of 2imino-thiazolidinone pK a with a very high ΔpK a (the difference between the predicted and experimental pK a ) range (2.56 ≤ |ΔpK a | ≤ 10.43) irrespective of the level of theory used. When ΔG solv (H + ) is taken as −270.3 kcal/mol, the calculated 2imino-thiazolidinone pK a values (calculated pK a 2 ) with respect to different levels of theory are observed to deviate more from the experimental value compared to calculations performed with ΔG solv (H + ) = −265.9 kcal/mol (calculated pK a 1 ). As an attempt to prevent the large uncertainties arising from the ΔG solv (H + ) values in TC1, the proton H + was substituted by the H 2 O/H 3 O + pair in TC2 (Scheme 2), and the calculated water pK a values of 2-imino-thiazolidinone with different levels of theory are presented in Table 2. The differences between the predicted and experimental pK a (ΔpK a ) have been calculated for each level of theory and are represented in Figure 4. In the TC2 scheme, a lower ΔpK a range (0.77 ≤ |ΔpK a | ≤ 6.06) is observed compared to the TC1 scheme. As seen from Figure 4, with the CPCM solvation model, the M062X functional behaves better than the others except for the 6-31G** basis set; on the other hand, with the SMD solvation model, the 6-31G** basis set where polarization function yields the smallest ΔpK a value.
When the M062X/6-31G** level of theory is used in the presence of the SMD solvation model, the calculated pK a of 2imino-thiazolidinone by employing the TC2 scheme is 12.47, for which the |ΔpK a | is 0.77 unit. Therefore, in order to predict the pK a(MeCN) of the reference molecule, further verification of the established methodology is performed on a set of test molecules.

Validation of the Methodology for Water Environment.
For the validation of the methodology established, a set of molecules with experimentally known water pK a were collected from the literature. The selected 10 2-(phenylimino)-imidazolidine derivatives share common scaffolds with our target molecules, thiazol-2-imines, as displayed in Figures 5 and S1, and the pK a 's of these molecules vary between 9.08 and 10.78. 47 The methodology proposed in the previous section is applied to the molecules in Figure 5 in order to predict their pK a 's in water. Initially, NPA charges on the N atoms were calculated for all of the structures and the most electron-rich N atom was selected as the first protonation site. For each molecule, the electron density around imine N was observed to be the highest and that their pK a 's are directly related to the electron density around the N atom. For example, ph6 has the highest pK a value (10.78) and the computed NPA charge on imine N for ph6 is the lowest (q N = −0.781e), whereas ph10 has the lowest pK a (9.08) with the highest NPA charge (q N = −0.362e) among the ph molecules. The presence of electron-withdrawing groups such as carbonyl moieties in the structures of ph8, ph9, and ph10 makes the imine N more electron-deficient compared to ph1−7 and as a result less basic. To estimate the water pK a 's of the 10 molecules displayed in Figure 5, the TC2 scheme with the M062X/6-31G** level of theory in conjunction with the SMD solvation model using water as a solvent was applied by protonating the imine N.
The results presented in Table S1 show that the predicted pK a 's of 2-(phenylimino) imidazolidine derivatives are very close to their experimentally determined values with a strong correlation (R 2 = 0.96; Figure 6). This impacts the success of the proposed methodology with a high accuracy of MAD and ΔpK a values (MAD = 0.12 and 0.05 ≤ |ΔpK a | ≤ 0.21). We additionally assessed the prediction power of our method by comparing the assigned pK a 's on imine N of 2-(phenylimino) imidazolidine derivatives with ChemAxon, 48 which is a commercial pK a prediction tool widely used in drug development processes. ChemAxon uses the empirically calculated partial atomic charges in molecules as parameters to identify the micro pK a 's. The predicted pK a 's by ChemAxon listed in Table S1 are mostly found to be in good agreement with the experimental values except ph8 and ph9, which deviate more than 1 unit. However, the correlation between the predicted and experimental pK a 's are observed to be weaker (R 2 = 0.31; Figure S2) with a higher MAD value (0.57) compared to the methodology developed in this study.

Validation of the Methodology for Acetonitrile
Environment. Following the validation of the proposed methodology in water environment, the applicability of the methodology was tested for a set of molecules for which there are experimentally determined pK a(MeCN) 's. The selected 17 molecules are nitrogen-containing heterocycles as displayed in Figures 7 and S3. Since some of the molecules possess more than one N atom in their structures, calculated NPA atomic charges on the N atoms allowed us to determine the most electron-rich N to identify the first protonation site. For the molecules that possess a -NH 2 functional group (3,5,6,7,8,9,11,12,14), the highest electron density is observed to be located on the N atom of the amino substituent. Then, imine N and N on the -N(CH 3 ) 2 substituent were calculated to have more electron density in molecules 2 and 4, respectively. The lack of amidine conjugation in 10, 13, 15, 16, and 17 makes these molecules less basic compared to that of 1, as predicted by the calculated partial charges on N atoms. TC2 scheme was employed with the M062X/6-31G** level of theory in conjunction with the SMD solvation model using acetonitrile The experimental and predicted pK a 's of N-containing aromatic compounds are presented in Table 3. The maximum deviation of the predicted pK a(MeCN) 's from experimental   pK a(MeCN) 's is found to be 0.77 unit, and a very good agreement between calculated and experimental values was obtained with R 2 = 0.98 ( Figure S4). Moreover, a small MAD value calculated (0.33) demonstrates the applicability of the proposed methodology to the acetonitrile environment. In addition to the pK a(MeCN) calculations, the water pK a 's of the molecules in Figure 7 are calculated with the proposed methodology and the results are presented in Table 3. Among the 17 selected molecules, experimental pK a(water) 's were collected from the literature for 14 of them. A very good correlation is obtained between the calculated and the experimental pK a 's with a maximum deviation of 0.53 units. MeCN is a polar aprotic solvent with a polarized C�N triple bond. Unlike water, it does not donate hydrogens to anions, and therefore, when used as a solvent, the ionization of the molecules in MeCN is more difficult compared to that of water, which in turn results in higher pK a 's and thus stronger basicity's in MeCN environment. The pK a for these molecules in MeCN ranges between 24.77 and 12.56, whereas in water it is between 13.50 and 5.23. 30 Scheme 3. Isodesmic Reaction Scheme Where the Acid HA Donates Its Proton H + to the Base B − to Yield Its Conjugate Base A − and Conjugate Acid HB a a ΔG soln is the free energy of deprotonation in solution.

Prediction of pK a(MeCN) 's of Thiazol-2-imines.
In this study, the goal is to predict the pK a 's of thiazol-2-imines synthesized by Tuncel and Dogan 26 in an acetonitrile environment. An accurate protocol is suggested by a twostep methodology validation. In the final step of the study, pK a(MeCN) 's of thiazol-2-imines were calculated by employing the isodesmic reaction scheme at the M062X/6-31G** level of theory in conjunction with the SMD solvation model using acetonitrile as a solvent. Scheme 4 illustrates an isodesmic reaction where the protonated 1RR is the acidic species HA + and 2-imino-thiazolidinone is the reference species B Ref .
To the best of our knowledge, the pK a(MeCN) of the reference molecule 2-imino-thiazolidinone is not known experimentally; its pK a(MeCN) is first predicted by employing the established methodology. The calculated pK a(MeCN) of 2-imino-thiazolidinone using acetonitrile as a solvent is 19.41 (TC2/M062X/6-31G**/SMD), and this prediction will further be used in the next step to propose pK a(MeCN) 's of thiazol-2-imine derivatives by employing the isodesmic reaction scheme.
The computed charges indicate that the protonation dominantly occurs on the imine N as presented in Table 4 and the predicted pK a 's of thiazol-2-imines are given in Table  5. We observe that 1RR, 2RR, 3RR, and 4RR have relatively lower pK a values compared to 5RR, 6RR, 7RR, and 8RR. Withdrawal of the electrons from the exocyclic imine N by the carbonyl substituents on the thiazole ring in 1RR, 2RR, 3RR, and 4RR results in less electron density around the N atom than the imine N atom of 5RR, 6RR, 7RR, and 8RR (q N = −0.582e for 1RR; q N = −0.745e for 5RR) ( Table 4). The lower electron density on the protonation site makes these molecules less available for proton abstraction. Thus, 1RR, 2RR, 3RR, and 4RR were predicted to be less basic compared to 5RR, 6RR, 7RR, and 8RR. For the molecules of our interest, the pK a range in MeCN is between 17.16−10.59 and 7.42−1.77 in water.
3.5. Applicability of the Established Methodology to Drug Compounds. We additionally tested our methodology on a small data set (due to the lack of experimental data in the literature) of drug precursors/compounds containing nitrogen heterocycles, which have experimentally determined water and MeCN pK a 's. Among heterocyclic pharmacophores, the benzimidazole ring system is quite common. Imidazolecontaining molecules (benzimidazole, thiabendazole, carbendazim, imidazole) have been chosen since imidazole-based medicinal chemistry suggests the potential therapeutic values of imidazole-derived compounds for treating incurable diseases. Literature survey reveals that the various derivatives of benzimidazole have been synthesized for their pharmacological activities such as antimicrobial, 56 antifungal, 57 antiviral, 58 analgesic, 59 antiprotozoal, 60 anticancer, 61 anti-inflammatory, 62 antihistaminic, 63 and antimalarial. 64 These substructures are often called "privileged" due to their wide recurrence in bioactive compounds. 65 Besides the imidazole derivatives, we have also evaluated the pK a values of guanidine derivatives 1,5,7-triazabicyclo Since the Middle Ages in Europe, guanidine has been used to treat diabetes. Due to its long-term hepatotoxicity, further research for blood sugar control was suspended at first after the discovery of insulin. Later development of nontoxic, safe biguanides led to the longused first-line diabetes control medicine metformin. 66−69 We attempted to predict the pK a 's of seven drug compounds by employing the isodesmic scheme at the M062X/6-31G** level of theory with the SMD solvent model in both water and MeCN, and the results presented in Table 6 are found to be in harmony with the experimentally reported pK a 's. The reference molecule is imidazole for benzimidazole, thiabendazole, and carbendazim; benzimidazole for imidazole. TBD is a reference molecule for MTBD and 2-(hpp)C 5 H 4 N and MTBD for TBD.

CONCLUSIONS
The pK a of a drug molecule and the membrane environment are the key factors controlling how well a drug penetrates through the cell membranes. In this study, an elaborate protocol to evaluate the pK a 's of the single enantiomers of thiazol-2-imines is proposed. Different functionals and basis set combinations were tested to evaluate the pK a(water) of a reference molecule resembling the molecules synthesized by Dogan et al., 26 2-imino-thiazolidinone, to be employed in the isodesmic reaction scheme. The M062X/6-31G** level of theory and SMD solvation model are found to be adequate for the prediction of pK a(water) of the reference molecule. In order to ensure that the constructed methodology describes the nitrogen-containing heterocyclic compounds well, a two-step validation procedure is followed. First, pK a(water) of 2-(phenylimino)imidazolidine derivatives were calculated with the proposed methodology and the predicted values were found to be very close to the experimental pK a(water) 's. Being interested in pK a calculations in a less polar medium, it was necessary to test the methodology for compounds similar to the species of interest whose pK a values in acetonitrile are experimentally known. Thus, the applicability of the method was verified on a set of nitrogen-containing heterocyclic compounds with experimentally known pK a(MeCN) . With the justified methodology, the experimentally unknown pK a(MeCN) of the reference molecule, 2-imino-thiazolidinone, was calculated, and finally, isodesmic reactions between the reference molecule and thiazol-2-imine derivatives synthesized by Dogan et al. were proposed to evaluate the pK a(MeCN) 's of the latter. Moreover, we showed that the protocol established in this study is very satisfying and promising in terms of its applicability to more diverse drug data sets for future validations. The predicted pK a values of these drug-like Table 5. Three-dimensional (3D) Representations of Thiazol-2-imine Derivatives and Their Predicted pK a 's in MeCN a and Water b Journal of Chemical Information and Modeling pubs.acs.org/jcim Article molecules will enable the evaluation of their lipophilicity/ hydrophilicity at a given pH.

* sı Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c01468. 3D representations of 2-(phenylimino)imidazolidine derivatives; calculated and experimental water pK a 's of 2-(phenylimino)imidazolidine derivatives; linear regression of experimental vs ChemAxon calculated pK a values of 2-(phenylimino)-imidazolidine derivatives; 3D representations of nitrogen-containing small aromatic compounds; and linear regression of experimental vs calculated pK a values of nitrogen-containing small aromatic compounds (PDF)