First-principles prediction of critical micellar concentrations for ionic and nonionic surfactants

Abstract
The concentration of surfactant in solution at which micelles start to form, also known as the critical micelle concentration, is a key property in formulation design. The critical micelle concentration can be determined experimentally with a tensiometer by measuring the surface tension of a concentration series. In analogy with experiments, in-silico predictions can be achieved through interfacial tension calculations.
We present a newly developed method, which employs first-principles-based interfacial tension calculations rooted in COSMO-RS theory, for the prediction of the critical micelle concentration of a set of nonionic, cationic, anionic, and zwitterionic surfactants in water. Our approach consists of a combination of two prediction strategies for modelling two different phenomena involving the removal of the surfactant hydrophobic tail from contact with water. The two strategies are based on regular micelle formation and thermodynamic phase separation of the surfactant from water, and both are required to take into account a wide range of polarity in the hydrophilic headgroup.
Our method yields accurate predictions for the critical micellar concentration, within one log unit of experiments, for a wide range of surfactant types and introduces possibilities for first-principles-based prediction of formulation properties for more complex compositions. © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Introduction
Surfactants are commonplace in society and are the key components in formulated products, such as detergents, cosmetics, fibers, oilfield chemicals, pharmaceuticals and food [1][2][3][4][5]. One property specific to each surfactant is the critical micellar concentration (CMC), which is the concentration of surfactant in water above which micelles spontaneously start to form. Any nominal increase in surfactant concentration beyond the CMC only leads to the formation of more micelles, while the real surfactant concentration in water remains at the CMC. Although the CMC itself is not causally related to full formulation properties, any prediction of complex systems requires at least the CMC to be correctly predicted by the method. For modelers, besides the qualitative use for formulations, the CMC thus serves as a benchmark for predictions.
Attempts to predict the CMC have been published using different modelling techniques, such as quantitative structure-activity relationship (QSAR) models [6][7][8][9][10], group-contribution (GC) methods [11,12], dissipative particle dynamics (DPD) [13][14][15][16], and COSMO-RS-based models [17,18]. QSAR model parametrization relies substantially on regression of previously measured experimental CMC data. This feature makes the models suitable for predicting the CMC of surfactants that have chemical compositions similar to the ones used in the regression. As an example, Huibers et al. needed to use different parameters and equations to develop distinct QSAR models that could reliably predict the CMC of nonionic surfactants belonging to the alkyl ethoxylate (CxEy) family [7] and of anionic surfactants of the sulfate class [6]. GC methods, similar to QSAR, rely heavily on the parametrization of the interaction groups. While GC methods were employed to predict the CMC of nonionic surfactants [12,19], to the best of our knowledge no GC method has been employed to predict the CMC of ionic surfactants. This is probably due to the lack of an extensive collection of parametrized headgroups for describing the hydrophilic part of ionic surfactants. DPD is a coarse-grained molecular model that simulates the dynamics of atoms grouped into "beads". As with any other molecular dynamics model, beads interact with each other by means of a force field. In this regard, an adequate parametrization of the head group is crucial for the prediction of the CMC. DPD was used for the prediction of the CMC of nonionic surfactants [16,20]. Anionic and cationic surfactants were also studied with DPD [13,14], although the parameterization of the bead describing the hydrophilic head group and the associated counterion prevents straightforward application of this method to a wider range of surfactants. Usually, GC, QSAR and DPD models yield very accurate quantitative predictions, characterized by a root-mean-square error (RMSE) between 0.2 and 0.5 on the logarithmic scale. On the other hand, because of their highly empirical nature, they cannot be applied to different surfactant types (i.e., nonionic, anionic, cationic and zwitterionic) using the same parametrization. In other words, the dataset used for the predictions needs to include molecules with chemical structures similar to those in the dataset used for the parametrization.
Within COSMO-RS methods, COSMOplex [18] has recently emerged as a robust method for the prediction of the CMC of nonionic surfactants. The method that combines molecular dynamics (MD) simulation with COSMOmic (MD/COSMOmic) [17] was also employed for the prediction of the CMC of nonionic surfactants belonging to the CxE6 series. So far, the usage of both methods is still limited to nonionic surfactants. COSMOplex and MD/COSMOmic yielded an accuracy in predicting the CMC that is generally lower than that of GC, QSAR and DPD models (RMSE between 0.5 and 0.8 on the log scale), but in line with COSMO-RS models for the prediction of other thermodynamic properties, such as the thermodynamic partition coefficient [21][22][23][24][25][26]. Despite the lower prediction accuracy compared to GC, QSAR and DPD, models based on COSMO-RS have the key advantage of using a universal parametrization in every prediction, thus making them potentially applicable to any surfactant class and type.
Taking advantage of the high flexibility of COSMO-RS-based models, in this paper we present a prediction approach that employs interfacial tension calculations based on quantum chemical density functional theory (DFT) and the COSMO-RS [18] implicit solvent model. This approach involves a combination of two prediction strategies for modelling two different phenomena involving the removal of the surfactant hydrophobic tail from contact with water. The first strategy models the formation of ordered micelles via preferential hydrophobic tail solvation for strongly polar hydrophilic head groups, such as those of ionic surfactants. The second strategy models the critical micellar concentration as the surfactant concentration in a two-phase surfactant/water system in thermodynamic equilibrium, which proves to work well for nonionic surfactants without particularly polar headgroups. Examples in this study include the well-known alkyl ethoxylates, CxEy, a few other nonionic surfactants, and several ionic surfactants that are representative of the anionic, cationic, and zwitterionic classes (see Table 1 for the surfactants used in this study). Key features of our predictive approach are its robustness, computational speed, and general applicability to all the aforementioned surfactant classes. This flexibility, which to the best of our knowledge is not offered by any other CMC prediction model, combined with its computational efficiency, makes the model applicable to a wide range of surfactant systems. The overall predicted CMC is the lower of the two CMC values computed with Strategy 1 and Strategy 2.

Quantum chemical calculations
We performed quantum chemical DFT calculations with geometry optimization in order to compute the molecular screening charge surface, also known as the σ-surface or COSMO surface, within COSMO-RS [27] theory for all the surfactants in the predictions. Water and n-decane were taken from the COSMOtherm database. The COSMOconf package [28] was used to generate stable conformers for each surfactant. The whole set of generated conformers was used in the COSMO-RS calculations, and for alkyl ethoxylates, the conformers included those with and without internal hydrogen bonding of the terminal OH group to the ether groups.
All the DFT calculations were performed with the Turbomole software, version 7.4.0 [29]. Initially, DFT calculations with geometry optimization were carried out using the Becke-Perdew [30,31] (BP) functional, the TZVP basis set [32], and the resolution of identity (RI) approximation, together with the COSMO [33] implicit solvent model and an infinite dielectric constant (needed for subsequent COSMO-RS calculations [27]). Additional single-point energy calculations were performed on the optimized geometries with the TZVPD basis set, in order to be able to use the "BP_TZVPD_FINE_19" parameterization in the COSMOtherm software.

COSMO-RS calculation within COSMOtherm software
In order to compute the interfacial tension (IFT), we used our previously developed COSMO-RS-based method [34], which builds upon the flatsurf module of COSMOtherm [35]. In brief, flatsurf calculates the free energy of transferring a molecule from a bulk phase to a featureless interface. The liquid-liquid IFT method adds a third (virtual) surface phase, which is placed between the two bulk phases. The liquid-liquid interface is thus described by two interfaces, both of which contribute to the total IFT, which allows for partial implicit solvation in the two adjacent phases. In particular, the addition of the surface phase, in the spirit of Guggenheim, allows for treating surface enrichment of surface-active species, like surfactants. All the details about the liquid-liquid IFT method can be found in the original paper [34]. Of special interest is that the method requires only neutral species to be present, which means that all ionic surfactants in the study include the counterion specifically bound to the ionic surfactant [36]. We employed the script by L.V. Nikolajsen for the calculations [37].
Using the liquid-liquid IFT method, two different strategies were utilized for the prediction of the CMC of anionic, cationic, zwitterionic and nonionic surfactants at T = 298 K, the temperature relevant to the majority of the experimental data collected. Fig. 1 presents a schematic description of the two strategies.
Strategy 1: The first strategy considers ternary systems composed of the surfactant, n-decane (i.e., the oil component), and water, separated into two phases. Phase 1 consists of a mixture of water and surfactant, while Phase 2 consists of pure n-decane. The strategy is intended for surfactants with strongly polar hydrophilic headgroups, where regular micelles are expected to form, in which surfactant tails are preferentially solvated by other surfactant hydrophobic tails with little interaction between tail and headgroup (Fig. 1). It should be noted that the oil chosen in the calculations should match the surfactant tail chemistry. The tails of all the surfactants with a strongly polar headgroup in our study are dominated by alkane chains, so any alkane with a low solubility in water may be used. Strategy 1 employs an iterative approach involving a series of IFT calculations, whereby the molar fraction of the surfactant in the aqueous phase is progressively increased until the value of the IFT reaches 0 mN/m. At this point (red square in Fig. 2), the molar fraction of the surfactant in the water-rich phase is assumed to yield the CMC. Fig. 2 describes how the iterative approach works, with the surfactant SDS taken as a reference, while profiles for other surfactants are given in Supplementary Figures S1-S5. From Fig. 2, we see how, at low molar fractions, the IFT profile is characterized by positive values around 50 mN/m, which is consistent with a water/alkane interface without surfactant. In the positive range of the IFT profile, where the interfacial tension between the oil and the water-rich phase is higher than 0, the micellization process is energetically unfavorable; therefore, micelle formation does not occur. At the CMC point, the IFT becomes zero, and new surface area between the phases can be created without any free energy penalty. Thus, micellization may begin.
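The iterative search of Strategy 1 can be illustrated with a minimal Python sketch. It assumes that the IFT decreases monotonically with the surfactant molar fraction and that each IFT evaluation is available as a callable. The function names, the geometric stepping, the bisection refinement and, in particular, the toy IFT curve are illustrative assumptions and do not reproduce the script used in this work [37]; in practice the callable would wrap a COSMO-RS liquid-liquid IFT calculation.

```python
import math
from typing import Callable


def find_zero_ift_mole_fraction(
    ift: Callable[[float], float],
    x_start: float = 1e-8,
    x_max: float = 0.5,
    step_factor: float = 2.0,
    log_tol: float = 1e-3,
) -> float:
    """Return the surfactant mole fraction at which ift(x) crosses zero."""
    if ift(x_start) <= 0.0:
        raise ValueError("IFT is already non-positive at x_start")

    # Step 1: increase the mole fraction geometrically until the IFT changes sign.
    x_lo, x_hi = x_start, x_start * step_factor
    while ift(x_hi) > 0.0:
        x_lo, x_hi = x_hi, x_hi * step_factor
        if x_hi > x_max:
            raise ValueError("IFT stayed positive up to x_max")

    # Step 2: refine the bracket [x_lo, x_hi] by bisection on log10(x).
    lo, hi = math.log10(x_lo), math.log10(x_hi)
    while hi - lo > log_tol:
        mid = 0.5 * (lo + hi)
        if ift(10.0 ** mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 10.0 ** (0.5 * (lo + hi))


def toy_ift(x: float) -> float:
    """Toy stand-in (NOT COSMO-RS): about 50 mN/m when dilute, zero at x = 9.9e-5."""
    return 50.0 - 25.0 * math.log10(1.0 + x / 1e-6)


x_cmc = find_zero_ift_mole_fraction(toy_ift)
print(f"IFT crosses zero at x = {x_cmc:.3e}")
```

Working on log10(x) keeps the number of IFT evaluations small even though the CMC values in this study span several orders of magnitude.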
In the calculations beyond the CMC point (red square), the IFT reaches negative values, which seems unphysical. However, these values result from an equilibrium calculation that does not take dynamics into account. They should be interpreted as follows: in a situation with negative IFT, the system would gain free energy by creating more surface area between the water and the oil (surfactant tails). This can only occur through the creation of more micelles. In the process of creating micelles, and in order to take advantage of the negative IFT, the mass balance in the system must be upheld, and any micelles created will deplete surfactants from the aqueous phase. This process will thus continue to generate micelles, reducing the aqueous surfactant concentration until it reaches the CMC, at which point no new surface area is created or destroyed.
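The mass-balance argument above can be written down in a few lines. The sketch below assumes the idealized picture described in the text, in which the free (aqueous) surfactant concentration is capped at the CMC and any excess is found in micelles; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
from typing import Tuple


def partition_surfactant(total_conc: float, cmc: float) -> Tuple[float, float]:
    """Split a nominal concentration into (free, micellized) parts.

    Idealized picture: below the CMC all surfactant is free; above it,
    the free concentration stays at the CMC and the rest forms micelles.
    Both arguments must use the same concentration unit.
    """
    if total_conc < 0.0 or cmc <= 0.0:
        raise ValueError("concentrations must be non-negative and cmc positive")
    free = min(total_conc, cmc)
    micellized = total_conc - free
    return free, micellized


# Hypothetical example values in mol/L (not taken from this study).
free, micellized = partition_surfactant(total_conc=0.020, cmc=0.008)
print(f"free = {free:.3f} mol/L, in micelles = {micellized:.3f} mol/L")
```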

Strategy 2: The second strategy considers binary systems composed of two pure phases, Phase 1 constituted of pure surfactant and Phase 2 constituted of pure water. In this approach, the thermodynamic two-phase equilibrium is calculated using COSMO-RS, and the IFT between the two phases is also calculated as in Fig. 1. The molar fraction of the surfactant in the water-rich phase is assumed to yield the CMC. As a consequence of this approach, the value of the IFT may not always be zero, and the implications are discussed later.
As will be made clear in the results and discussion, we recommend that, in order to predict the CMC for a general surfactant, both strategies be employed and the lower of the two predicted values be chosen as the predicted CMC.
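The selection rule can be expressed as a short Python sketch. The helper name and the data structure are illustrative assumptions; the three pairs of log(CMC) values are those quoted for SLES-EO3, CAPB with counterions and C9C2NO in the Results and discussion section.

```python
from typing import NamedTuple


class CmcChoice(NamedTuple):
    log_cmc: float
    strategy: str


def select_log_cmc(log_cmc_s1: float, log_cmc_s2: float) -> CmcChoice:
    """Return the lower of the two predicted log(CMC) values and its origin."""
    if log_cmc_s1 <= log_cmc_s2:
        return CmcChoice(log_cmc_s1, "Strategy 1")
    return CmcChoice(log_cmc_s2, "Strategy 2")


# (Strategy 1, Strategy 2) predictions of log(CMC) quoted in the text.
predictions = {
    "SLES-EO3": (-3.42, -2.75),
    "CAPB with counterions": (-1.70, -1.82),
    "C9C2NO": (-0.30, 0.68),
}

for name, (s1, s2) in predictions.items():
    choice = select_log_cmc(s1, s2)
    print(f"{name}: log(CMC) = {choice.log_cmc:.2f} from {choice.strategy}")
```

Because the logarithm is monotonic, taking the lower log(CMC) is equivalent to taking the lower concentration.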

Results and discussion
The two prediction strategies were tested for a set of anionic, cationic, zwitterionic, and nonionic surfactants. The predicted values are reported together with experimental data in Table 2.
From Table 2 it is evident that Strategy 1 performed best in predicting the CMC for ionic surfactants, whereas Strategy 2 generally yielded more accurate predictions for nonionic surfactants. We ascribe this marked difference in the prediction accuracy of the two strategies, depending on the type of surfactant, to the polarity of the head groups. In particular, the higher polarity of the head groups of ionic surfactants makes the distinction between the hydrophobic carbon chain and the hydrophilic head more pronounced, and it disfavors a "normal" phase separation into two bulk phases because of the very unfavorable interactions between the hydrophilic and hydrophobic parts of the surfactant. Consequently, the n-decane oil phase better represents the hydrophobic environment of the ionic surfactant carbon chains, and the phenomenon that best describes ordered micelle formation by ionic surfactants is their preferential tail interaction via the creation of the virtual oil phase in Strategy 1. In the case of nonionic surfactants, the lower polarity of the head groups leads to more balanced interactions between the hydrophilic and hydrophobic parts of the surfactants, and, consequently, the micellization process is better described by a phase separation in the simple binary surfactant/water system.
As the polarity of the hydrophilic headgroup of the surfactant in principle dictates which strategy should be utilized to predict the CMC, we investigated the polarity of different surfactants by looking at the screening charge density (σ) profiles from COSMO-RS calculations. Fig. 3 shows the σ-profiles of some representative surfactants selected as references.
As shown in Fig. 3a, all the ionic surfactants exhibit a primary peak around zero, which originates from the nonpolar carbon tail. A secondary peak at a σ value slightly lower than 0.02 e/Å² describes the negatively charged Cl⁻ counterion of CTAC, whereas a weakly pronounced peak at −0.01 e/Å² is due to the positively charged headgroup. For SDS, a secondary peak related to the negatively charged head group is located around 0.015 e/Å², and another secondary peak due to the sodium counterion is found around −0.025 e/Å². Comparing the CAPB profiles, we note that the σ-profile of CAPB without explicit counterions extends beyond the value of 0.02 e/Å² (area marked with a red arrow), whereas the σ-profile of CAPB with counterions shows a peak at around 0.02 e/Å² followed by a sudden drop beyond it. CAPB with counterions, similar to SDS, exhibits another secondary peak in the vicinity of −0.025 e/Å², which comes from the added sodium counterion.
Similarly to the ionic surfactants, all the nonionic surfactants in Fig. 3b exhibit a primary peak at around 0.00 e/Å² because of the hydrophobic carbon tail. For the C8E3 surfactant, we observe a secondary peak due to the polarity of the hydrophilic head group. C8E3 is taken to be representative of all the other nonionic surfactants, which indeed show similar σ-profiles, except C9C2NO. With regard to C9C2NO, we observe that, like that of CAPB without counterions, its σ-profile extends beyond the value of 0.02 e/Å² (area marked with a red arrow).
As we will discuss in the next sections, we found the optimal range of the σ-profile for reliable CMC prediction accuracy to be between the most negative value of the σ-profile of the sodium cation (red line) and the most positive value of the σ-profile of the chloride anion (green line). Consequently, for surfactants characterized by σ-profiles that extend beyond this range (i.e., CAPB without counterions and C9C2NO), the prediction accuracy can be expected to be lower.
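A simple way to screen a surfactant against this criterion is to measure how much of its σ-profile lies outside the window. The Python sketch below is only an illustration: the function name is hypothetical, the σ-profile is a synthetic toy profile rather than COSMO-RS output, the upper bound of 0.02 e/Å² is taken from the discussion of Fig. 3, and the lower bound of −0.03 e/Å² is a placeholder assumption, since the exact limits are read from the Na⁺ and Cl⁻ profiles in Fig. 3.

```python
from typing import Sequence


def fraction_outside_window(
    sigma: Sequence[float],
    p_sigma: Sequence[float],
    sigma_min: float,
    sigma_max: float,
) -> float:
    """Fraction of the sigma-profile area lying outside [sigma_min, sigma_max]."""
    if len(sigma) != len(p_sigma):
        raise ValueError("sigma and p_sigma must have the same length")
    total = sum(p_sigma)
    if total <= 0.0:
        raise ValueError("sigma-profile must have positive total area")
    outside = sum(
        p for s, p in zip(sigma, p_sigma) if s < sigma_min or s > sigma_max
    )
    return outside / total


# Synthetic toy profile (sigma in e/A^2, p(sigma) in arbitrary units).
sigma_grid = [-0.03, -0.02, -0.01, 0.00, 0.01, 0.02, 0.03]
toy_profile = [0.0, 0.0, 2.0, 20.0, 3.0, 1.0, 0.5]

# sigma_min is a placeholder; sigma_max follows the 0.02 e/A^2 quoted in the text.
frac = fraction_outside_window(sigma_grid, toy_profile, sigma_min=-0.03, sigma_max=0.02)
print(f"fraction of profile outside window: {frac:.3f}")
```

A non-zero fraction would flag a surfactant, such as CAPB without counterions or C9C2NO, for which lower prediction accuracy can be expected.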

Ionic surfactants
For ionic surfactants, we comment exclusively on the values predicted by Strategy 1, as the predictions from Strategy 2 were consistently unreliable indicators of the CMC.
Anionic surfactants: With regard to anionic surfactants (i.e., SDS, SLES-EO1, SLES-EO3, PS and SOS), the CMC predicted for SDS agrees with experiment. Concerning the SLES-EO1 and SLES-EO3 surfactants, Aoudia et al. [39] reported values for log(CMC) very close to each other, −3.10 and −3.09. In contrast, Vleugels et al. [46] observed a marked decrease in CMC with an increase in the number of EO groups for SLES surfactants having the same carbon chain length. In particular, log(CMC) values for SLES-EO2 and SLES-EO4 were −3.00 and −3.33, respectively. Our predictions yielded values of −2.40 for SLES-EO1 and −3.42 for SLES-EO3, which suggest a decrease in the CMC with an increasing number of EO units and are therefore in agreement with the trend observed by Vleugels et al. [46]. Furthermore, in the case of SLES-EO3, Strategy 2 yields a predicted value of −2.75, thus achieving a prediction accuracy comparable to Strategy 1. For SLES-EO3, the value predicted by Strategy 1 is 0.33 lower than the experimental value, and the value predicted by Strategy 2 is 0.34 higher. This could indicate that the CMC of SLES surfactants with even larger numbers of EO units could potentially be better predicted with Strategy 2.
With regard to the PS surfactant, the predicted value of −3.35 is in excellent agreement with the experimental value of −3.30. The hydrophobic tail of PS consists of 18 carbons (C18) instead of the C12 structures of SDS and SLES. This demonstrates that our method is able to capture the hydrophobic effect of different carbon chain lengths. Notably, the experimental measurement of the CMC for PS was conducted at 60 ℃, and consequently we performed the IFT calculations at the same temperature. The excellent prediction accuracy for PS at 60 ℃ shows the robustness of the model in performing accurate predictions at temperatures that are considerably higher than ambient conditions. Finally, the SOS surfactant was tested due to its high experimental CMC value (−0.81 on the log scale). Again, Strategy 1 proved to be the most accurate, yielding a predicted value of 0.12, which is 0.94 higher than the experimental value. The difference from the experimental value is larger than for the rest of the anionic surfactants, but still within 1 unit on the log scale.

Table 2
Comparison between the predicted values for the log(CMC) and IFT for the two approaches and the experimental value of log(CMC). The colors are used to guide the reader. Green implies good prediction accuracy (<1 log unit deviation), orange indicates inaccurate predictions (>1 log unit deviation), and blue refers to IFT values for Strategy 2 that are larger than 3 mN/m. The bold values of the CMC predictions in the table are the best outcomes. a) Ref. [38], b) Ref. [39], c) Ref. [40], d) Ref. [41], e) Ref. [42], f) Ref. [43], g) Ref. [44], h) Ref. [45].
Cationic surfactants: In the case of cationic surfactants, the influence of the counterion on the CMC value was investigated considering the CTAC, CTAB, CPC and OBDMC surfactants. Our predictions yielded values of −3.40 and −3.38 for CTAB and CTAC, respectively, which are in good agreement with the experimental data of Wei et al. [40] for CTAB (−3.04) and Li et al. [41] for CTAC (−2.96). Although the difference between the log(CMC) of CTAB and CTAC is negligible for both experimental and predicted values, the trend in the predictions is consistent with experiments, yielding lower values of log(CMC) for CTAB as compared to CTAC. Concerning CPC, the predicted value of −3.45 is in good agreement with the experimental value of −3.00. Notably, the CPC hydrophilic head is constituted by the heterocyclic pyridinium group, which is substantially different from the trimethylammonium group of CTAB and CTAC. This further validates the consistency of the method in predicting the CMC of cationic surfactants with different hydrophilic headgroups. Finally, the OBDMC surfactant was tested due to its low experimental CMC value (−5.15 on the log scale). The predicted log(CMC) of −5.22 agreed very well with experiments, which further confirmed the reliability of the model in predicting quite low values of the CMC as well.
Zwitterionic surfactants: The surfactant CAPB was chosen as a reference example for zwitterionic surfactants. Staszak et al. [42] measured the CMC for CAPB at different NaCl concentrations, reporting values for log(CMC) ranging from −2.15 (2 M NaCl water solution) to −3.55 (pure water). We performed two different predictions: one considering the pure CAPB molecule and a second one adding two counterions in proximity of the charged groups (i.e., a Cl⁻ anion next to the amine cation and a Na⁺ cation next to the carboxylate anion). The addition of the counterions enabled us to simulate the presence of NaCl in solution as well as to be consistent with the predictions for the other ionic surfactants (which include counterions bound to the charged sites of the surfactant). Predictions of CAPB with counterions yielded a value of log(CMC) equal to −1.7, which is in good agreement with the experimental value of −2.15 (CAPB in 2 M NaCl water solution). However, this is considerably far from the experimental value of −3.55 in pure water. Predictions performed with pure CAPB molecules yielded a value of log(CMC) of −0.88, which is drastically higher than the experimental values. Notably, for the zwitterionic surfactant with counterions, the prediction of Strategy 2 was slightly more accurate than that of Strategy 1, yielding a value of log(CMC) equal to −1.82. Overall, for zwitterionic surfactants we found that the addition of counterions was crucial to perform accurate predictions and that the performance of Strategy 1 and Strategy 2 was comparable. In particular, the addition of the sodium counterion decreases the screening charge density (σ) of the negatively charged carboxylate to a value lower than 0.02 e/Å², which, as anticipated above, impacts prediction accuracy.
Finally, the effect of the parametrization was briefly investigated by comparing the predictions of log(CMC) for CTAB, SDS, and all the nonionic surfactants using the BP_TZVP_19 parameterization with those using our chosen BP_TZVPD_FINE_19 parameterization. The TZVP parametrization yielded lower prediction accuracy than TZVPD_FINE, yet was still within 1.5 log units. Predicted values for the TZVP parametrization are reported in Table S1 of the Supplementary material.

Nonionic surfactants
Linear and branched polyoxyethylenated (POE) surfactants: […] (8.08 mN/m), the IFT is significantly above zero. Rosen et al. [45] observed that when the hydrophobic tail of the surfactant is too short there is little tendency for micellization to occur because the hydrophobic group does not manage to efficiently distort the solvent structure. This is indeed the case for the C6E3 and IC6E6 surfactants, which have the same experimental CMC value and for which analogous considerations can be made. In particular, Telgmann et al. [47] observed that for C6E3, oligomeric structures rather than proper micelles occur at the CMC. The presence of unstructured oligomers instead of micelles can explain why our predicted IFT was significantly higher than zero. In principle, our method predicts that no C6E3 micelles would form at the "CMC", which is consistent with experiments. For C8E1, the single EO group of the hydrophilic group might be too small to interact efficiently with the water phase, thus causing phase separation (instead of micellization) between water and the C8E1 molecules.
Glucoside hydrophilic groups: For C8BG1 and C12BG1, the predicted values of the CMC are in good agreement with the experimental values, again within one log unit. The values of the IFT are negative for both surfactants. This substantial reduction of the IFT can be explained by water molecules also hydrating part of the methylene units. This was indeed reported for C8BG1 micelles by Paula et al. [48], who observed that water hydrates between two and three methylene groups of the surfactant hydrophobic tail.
The predicted (−3.66) and experimental (−3.88) values for log(CMC) are in close agreement. The high value of the IFT (9.49 mN/m) is probably due to the presence of branches and the aromatic ring in the hydrophobic tail. As reported by Rosen et al. [45], the branching of the hydrophobic group increases the solubility of the surfactant in water, and both the branching and the presence of an aromatic ring cause looser packing of the surfactant tails, thus hampering the micellization process.
C8BGLYE: The predicted (−2.11) and experimental (−2.24) values for log(CMC) are in close agreement. The positive value for the IFT (3.97 mN/m) may be due to the branching of the hydrophilic head group.
C9C2NO: In contrast with all the other nonionic surfactants, Strategy 1 performs better than Strategy 2 in predicting the log(CMC) of C9C2NO. The predicted value for Strategy 2 (0.68) is markedly higher than the experimental one (−1.27), whereas the value predicted by Strategy 1 (−0.30) is within one logarithmic unit of it. As anticipated previously, the surfactant C9C2NO is quite polar, and it is characterized by a σ value higher than 0.02 e/Å² in its hydrophilic head group. The prediction accuracy is most probably impacted by this. The fact that Strategy 1 performs better than Strategy 2 can once again be ascribed to the high polarity of the head group. Looking at the σ-profiles of Fig. 3a, we notice that ionic surfactants generally have head groups with higher polarity compared with nonionic surfactants (Fig. 3b). The only exception among the nonionic surfactants is indeed the head group of C9C2NO. This explains why Strategy 1, which performs best for highly polar head groups (i.e., ionic surfactants), gives a more accurate prediction for C9C2NO.
C11CONEO: The predicted CMC value of this surfactant is the least accurate of the dataset and the only one differing by more than 1 log unit from the experimental value. Our hypothesis, considering that COSMO-RS theory is known to reliably describe amide groups, is that the model tends to overestimate the contribution due to the double bond (C=C) in the hydrophobic carbon chain. Rosen et al. [45] reported that when a carbon-carbon double bond is present in the hydrophobic chain, the CMC is generally higher than that of the corresponding saturated compound. Our model predicts a CMC that is 1.2 log units higher than the experimental value. Extended studies on more similar surfactants would be necessary to confirm or refute our hypothesis, but they are outside the scope of this paper.
The focus of this paper is the prediction of the CMC, which is substantiated by experimental studies. The qualitative interpretation of the calculations for CMC predictions with Strategy 2, where the IFT is significantly greater than zero, would be that the overall surface area between the surfactant and water phases needs to be smaller, so as not to pay too large an energy penalty for forming the contact area. This would mean that small micelles would form only in surfactant/water systems where the IFT between the two phases is below or near zero. For systems where the IFT is significantly larger than zero, much larger micelles would be formed or other phenomena would be observed, such as oligomerization [47] or the formation of nanoparticles consisting of multiple micelles to reduce the surface area [36]. We are not aware of experimental measurements of drop or micelle sizes for all these systems, but present the implications of our predictions here nonetheless.

Overall prediction accuracy
The best predictions for all surfactants (ionic and nonionic) are compared to experimental data in Fig. 4.
All CMC predictions are rather accurate, and the errors are less than one logarithmic unit away from the experimental value, with the exception of the C 11 CONEO surfactant.The overall root-meansquare-error (RMSE) is 0.62, which is in the prediction range of alternate prediction methods based on COSMO-RS theory (which only included nonionic surfactants) [49].Looking at Table 2, we can infer how Strategy 1 is generally better for ionic surfactants and Strategy 2 is generally better for nonionic surfactants.This is in line with the underlying physical phenomena that are the foundations of the two strategies, and they consider the influence of the polarity of the hydrophilic headgroup.Because Strategy 2 is very fast (typically, a few seconds), and significantly faster than Strategy 1 (in the order of few minutes), we suggest that it be always calculated, regardless of surfactant.Furthermore, the most accurate value for CMC prediction, when considering both strategies, is always the lowest value of the two.This could be interpreted as a competition between mechanisms for micellization, such that whichever one occurs for the lowest surfactant concentration first will determine the CMC.If Strategy 1 provides the lowest CMC, ordered micelles are predicted to form, while if Strategy 2 provides the lowest CMC, less ordered micelles are predicted to form.Prag-matically, it would therefore be advantageous to consistently use both strategies and choose whichever provides the lowest CMC concentration as the best prediction.This can certainly be relevant for borderline cases (i.e., CAPB and C 9 C 2 NO), where it is not obvious beforehand which strategy may yield the most accurate prediction.We would like to point out that unlike the more heavily parameterized computational methods, the underlying COSMO-RS based method we employ in this paper does not rely on any experimental data on interfacial tension or CMC -only on other thermodynamic properties.
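For readers who wish to reproduce this type of error metric, the following Python sketch computes the RMSE on the log scale. The function name is an illustrative assumption, and the data are only the subset of (best prediction, experiment) pairs quoted explicitly in the text above, so the result is not expected to reproduce the RMSE of the full dataset in Table 2.

```python
import math
from typing import Sequence


def rmse_log(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Root-mean-square error between predicted and experimental log(CMC)."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Subset of (best predicted, experimental) log(CMC) pairs quoted in the text.
pairs = {
    "SLES-EO3": (-3.42, -3.09),
    "PS": (-3.35, -3.30),
    "CTAB": (-3.40, -3.04),
    "CTAC": (-3.38, -2.96),
    "CPC": (-3.45, -3.00),
    "OBDMC": (-5.22, -5.15),
    "C8BGLYE": (-2.11, -2.24),
    "C9C2NO": (-0.30, -1.27),
}

predicted = [p for p, _ in pairs.values()]
experimental = [e for _, e in pairs.values()]
subset_rmse = rmse_log(predicted, experimental)
print(f"RMSE over this subset: {subset_rmse:.2f} log units")

# A deviation of d log units corresponds to a factor of 10**|d| in concentration.
for name, (p, e) in pairs.items():
    print(f"{name}: deviation {p - e:+.2f} log units, factor {10 ** abs(p - e):.1f}")
```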
We also showed that our method can predict the CMC at other temperatures. This was expected, because the underlying COSMO-RS method [27] and IFT prediction method [34] also gave good results at temperatures that deviate substantially from 298 K.
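The selection rule and the accuracy measures discussed above can be illustrated with a minimal sketch. It assumes that log(CMC) values from both strategies are already available; the function names and the numerical values in the example are hypothetical placeholders introduced for illustration only, not results from this work.

```python
import math


def select_cmc(log_cmc_s1, log_cmc_s2):
    """Return the lower of the two predicted log(CMC) values and its origin.

    Illustrates the rule that the lower prediction is taken as the best one.
    """
    if log_cmc_s1 <= log_cmc_s2:
        return log_cmc_s1, "Strategy 1"
    return log_cmc_s2, "Strategy 2"


def rmse(predicted, experimental):
    """Root-mean-square error between two equally long lists of log(CMC)."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def within_one_log_unit(predicted, experimental):
    """True if the prediction deviates by less than one log unit."""
    return abs(predicted - experimental) < 1.0


if __name__ == "__main__":
    # Hypothetical placeholder data: (Strategy 1, Strategy 2, experiment).
    # These numbers are NOT taken from the paper.
    data = {
        "surfactant_A": (-2.1, -1.5, -2.0),
        "surfactant_B": (-3.0, -4.2, -4.5),
    }
    best, reference = [], []
    for name, (s1, s2, exp) in data.items():
        value, origin = select_cmc(s1, s2)
        best.append(value)
        reference.append(exp)
        print(name, value, origin, within_one_log_unit(value, exp))
    print("RMSE:", rmse(best, reference))
```

Because Strategy 2 takes only seconds, evaluating both strategies and taking the minimum adds little cost compared with running Strategy 1 alone.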

Conclusion
We have presented a robust and fast method for predicting the critical micellar concentration (CMC) of surfactants in water based on density functional theory and the COSMO-RS implicit solvent model. The CMCs of ionic, zwitterionic, and nonionic surfactants are all predicted within one log unit of the experimental values, with the sole exception of C11CONEO, for which the predicted CMC is 1.2 log units higher than the experimental value.
In comparison to previously developed in-silico prediction methods, our approach is the only one that can yield quantitative prediction accuracy for the full spectrum of surfactant species: nonionic, anionic, cationic and zwitterionic.
For nonionic surfactants, the prediction accuracy of our method, based on RMSE values, is slightly lower than, although comparable to, that of QSAR [7], GC [12,19], and DPD models [16,20], whereas it is in line with the performance of COSMO-RS based models [17,18]. The highest deviation from the experimental value occurred for the C11CONEO surfactant. For ionic surfactants, our method yields predicted values that are in line with previous QSAR [6] and DPD studies for SDS [13,14] and CTAB [14].
Apart from the largest deviation of the study, C11CONEO, our model yielded less accurate predictions for the C6E3 and IC6E6 surfactants, which are characterized by a high experimental value of the CMC (−1.00 on the logarithmic scale). For these two surfactants, however, which have a rather short hydrophobic tail, the accuracy of the experimental data could also be questioned. This is suggested by the low tendency for micellization to occur when the short hydrophobic group does not manage to efficiently distort the solvent structure, as reported by Rosen et al. [45], and by the observation of oligomeric structures, rather than proper micelles, at the CMC for C6E3 reported by Telgmann et al. [47].
The major drawback of previous prediction methods, except for COSMOplex, is their lack of immediate transferability to surfactant systems for which they have not been specifically parametrized. In particular, QSAR models need distinct regression to experimental values for surfactants with dissimilar chemical composition. GC models require parametrization of specific groups for treating diverse surfactants and are still limited to nonionic surfactants. DPD needs a non-trivial parametrization of the force field for bead interactions whenever a surfactant with a new head group is considered. The MD/COSMOmic approach, similar to DPD, relies on

Fig. 2. Interfacial tension (IFT) profile for the water/SDS/n-decane system as a function of the aqueous concentration of SDS surfactant.

Fig. 3. σ-profiles of a) ionic and b) nonionic surfactants selected as reference.

Fig. 4. Optimal set of predicted log(CMC) values (Strategy 1 for ionic surfactants and Strategy 2 for nonionic surfactants) compared to experimental log(CMC) values.

Table 1
List of the investigated surfactants, together with their chemical formulas, surfactant type, and abbreviations used throughout the paper.
Fig. 1. Schematic representation of Strategies 1 and 2 for the prediction of CMC.

M. Turchi, A.P. Karcz and M.P. Andersson, Journal of Colloid and Interface Science 606 (2022) 618-627

C10E3, C12E6, C16E7 and IC6E6 (the branched POE) are polyoxyethylenated surfactants with different methylene/ethylene oxide (CH2/EO) ratios and, therefore, different hydrophobic/hydrophilic balances. For all the POE surfactants, Strategy 2 predicts the CMC with good accuracy, always within one unit on the logarithmic scale. Notably, the values of log(CMC) range from −1 for C6E3 to −5.76 for C16E7. Our model was therefore tested across the whole CMC spectrum of the most common surfactants. Concerning the IFT predictions, all the values except those for C6E3, IC6E6 and C8E1 are fairly close to zero. For C6E3 (10.65 mN/m), IC6E6 (5.43 mN/m) and for C8E1