Experimental Quantum Chemistry: A Hammett‐inspired Fingerprinting of Substituent Effects

Abstract The quantum mechanically calculable Q descriptor is shown to be a potent quantifier of chemical reactivity in complex molecules – it shows a strong correlation to experimentally derived field effects in non‐aromatic substrates and Hammett σm and σp parameters. Models for predicting substituent effects from Q are presented and applied, including on the elusive pentazolyl substituent. The presented approach enables fast computational estimation of substituent effects, and, in extension, medium‐throughput screening of molecules and compound design. An experimental dataset is suggested as a candidate benchmark for aiding the general development and comparison of electronic structure analyses. It is here used to evaluate the experimental quantum chemistry (EQC) framework for chemical bonding analysis in larger molecules.


Introduction
In this work, we explore a quantum chemically derived descriptor for quantifying the electronic effects of functional groups. Knowledge of electronic and steric effects of functional groups, condensed in the form of descriptors, has been important ever since the advent of modern chemistry. [1][2][3][4] Ideally, well-chosen chemical descriptors allow us to both rationalize and predict the chemical properties of compounds and the outcome of reactions, thus aiding the development of functional molecules and synthetic procedures. Successful applications of descriptors abound, for example, in the design of pharmaceuticals and materials, [5] for predicting trends in reaction rates [6][7][8] and for gaining insight into reaction mechanisms. [9] Our approach, which focuses on the analysis of substituents on aromatic and non-aromatic substrates, is both inspired by and here first validated against the well-known Hammett σ scale. [10] The Hammett σ constant is an empirical chemical descriptor defined by the linear free energy relationship known as the Hammett equation [Eq. (1)]: [10]

$$\log K = \log K_0 + \rho\sigma \tag{1}$$

where K and K_0 are reaction equilibrium constants for a substituted and an unsubstituted benzoic acid derivative, respectively, σ is a constant that depends on the nature of the substituent, and ρ is a constant that depends on the reaction mechanism and environment. Equation (2) expresses the corresponding Hammett relationship for rate constants, k and k_0:

$$\log k = \log k_0 + \rho\sigma \tag{2}$$
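As a minimal numerical sketch of the Hammett equation, the snippet below converts a pair of pKa values into a σ constant. It assumes the reference reaction (ionization of benzoic acids in water at 25 °C), for which the reaction constant equals 1 by definition, and uses log K = −pKa, so that σ = (pKa of the unsubstituted acid − pKa of the substituted acid) divided by the reaction constant. The function name `hammett_sigma` and both pKa inputs are hypothetical placeholders chosen to illustrate the arithmetic; they are not data from this work.

```python
def hammett_sigma(pka_ref: float, pka_sub: float, rho: float = 1.0) -> float:
    """Return sigma from the Hammett equation, using log K = -pKa.

    pka_ref: pKa of the unsubstituted (reference) acid
    pka_sub: pKa of the substituted acid
    rho:     reaction constant (1.0 for the reference reaction)
    """
    if rho == 0:
        raise ValueError("rho must be non-zero")
    return (pka_ref - pka_sub) / rho


# Hypothetical placeholder inputs, for illustration only.
PKA_REFERENCE = 4.20
PKA_SUBSTITUTED = 3.50

sigma = hammett_sigma(PKA_REFERENCE, PKA_SUBSTITUTED)
# A positive sigma corresponds to a stronger acid than the reference,
# i.e. an electron-withdrawing substituent.
print(f"sigma = {sigma:.2f}")
```

A substituent that lowers the pKa relative to the reference thus receives a positive σ, and one that raises it receives a negative σ.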
Since its conception more than 80 years ago, the Hammett equation has provided chemical insight into the connection between electronic structure and reactivity. The presumption behind the Hammett equation is that the contribution of a given substituent, quantified by σ, shows little dependence on the reaction conditions and mechanism, as long as the nature of the substituent remains unchanged by such conditions. For example, the σ value of a substituent does not change between substituted benzoic acids, esters, and amides. [10] Using this simple empirical approximation, Hammett was able to provide reliable quantitative estimates of electronic substituent effects. [3,10] Today, the Hammett scale constitutes an important experimental basis for useful chemical concepts such as charge transfer, which cannot always be uniquely defined in quantum mechanics. [11] As such, the Hammett scale is also a valuable experimental testing ground for theoretical descriptors meant to quantify different aspects of electronic structure.
Hammett parameters have been predicted using theory and can also be used as experimental benchmarks to validate descriptors produced by theory. One approach for evaluating σ from first principles is to calculate the ΔG of benzoic acid dissociation to obtain pKa values. [12][13][14] However, because of the intricacies of modeling solvation effects accurately, computational prediction of pKa is both challenging and costly. [12] It is often faster, and sometimes more informative, to instead predict σ parameters by connecting them to other properties of the electronic structure. An early example is a work by Jaffé, [15] where the σ values of substituents were shown to correlate with the electron density at the meta and para carbon atoms in the phenyl ring. Several studies have estimated σ from atomic charges obtained from quantum mechanical calculations on different benzene derivatives. [16] DiLabio et al. [17] demonstrated a relationship between the molecular ionization potential of disubstituted benzenes and the σ values of their substituents. Politzer and co-workers have shown that the local average ionization energy can correlate with Hammett σ constants. [8,18] Gadre et al. [19] and Galabov et al. [20] followed a somewhat similar approach, where local electrostatic potentials were compared to σ. Takahata et al. [14,21] have shown how σ is related to the shift of carbon core levels due to the presence of a substituent. Liu et al. have proposed a generalized equation to connect the Hammett equation with Pauling's electronegativity. [22] Popelier et al. have combined topological analysis of the electron density with principal component analysis to build a partial least squares regression model able to estimate σ. [23] Fernández and Frenking have pointed out a correlation between π-conjugation energy and σ in substituted benzyl anions. [6] Krygowski and co-workers have worked to develop a physical interpretation of substituent effects through the use of Energy Decomposition Analysis (EDA) and by comparing σ with various quantities obtainable by quantum mechanical calculations. [24] Before introducing the Q descriptor, which we study in this work, and its relationship to reactivity, we will briefly review its underlying theoretical framework.

Experimental Quantum Chemistry and a Different Take on the Chemical Bond
The descriptor that we will study is defined within an EDA scheme based on the following equation: [25,26]

$$\Delta E = n\Delta\bar{\chi} + \Delta(V_{NN} - E_{ee}) \tag{3}$$

where ΔE is the change in total energy over a chemical or physical transformation, n is the number of electrons, and Δχ̄ is the change of the average electron energy. Previous work has shown that the χ̄ term of Eq. (3) can be attributed to the central chemical concept of electronegativity [27] and Δχ̄ to electronegativity equalization. [28] The Δχ̄ term can interchangeably, and approximately, be interpreted as the average change in the occupied molecular orbital energy over a transformation. The ΔV_NN and ΔE_ee terms in Eq. (3) quantify changes to the electrostatic repulsion between equally charged particles, nuclei and electrons, respectively. The latter two terms will both increase in magnitude when atoms are brought together. However, because Eq. (3) contains the negative of the ΔE_ee term, the value of the combined electrostatic terms, Δ(V_NN − E_ee), mostly cancels out during bond formation or cleavage. The non-zero value of the Δ(V_NN − E_ee) term can often be attributed to changes in the electronic distribution occurring because of chemical bonding. The reason why the multielectron term ΔE_ee shows up with a negative sign in Eq. (3) is related to the double-counting of this energy in the Δχ̄ term. [25] The terms of Eq. (3) can be calculated at any quantum chemical level of theory, but they can also, in principle, be estimated, directly or indirectly, from a combination of thermochemistry, photoelectron and vibrational spectroscopies, and diffraction experiments. [25,26] Whereas Eq. (3) is exact within the Born-Oppenheimer approximation, the ability to estimate its component terms from experimental data merits the label of "Experimental Quantum Chemistry", or EQC. A more detailed description of EQC is provided in Ref. [25] and Ref. [26].
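The bookkeeping implied by Eq. (3) can be sketched in a few lines of Python. The sketch assumes that the total energy change, the number of electrons and the change in average electron energy are already available from a calculation, and it obtains the combined electrostatic term, Δ(V_NN − E_ee), as the remainder. The function name `eqc_partition`, the dictionary keys and the numerical inputs are hypothetical placeholders, not values from this work.

```python
def eqc_partition(delta_e: float, n_electrons: int, delta_chi_bar: float) -> dict:
    """Split a total energy change into the two contributions of Eq. (3).

    delta_e = n_electrons * delta_chi_bar + delta(V_NN - E_ee)

    All energies must be given in the same unit (for example eV).
    """
    if n_electrons <= 0:
        raise ValueError("n_electrons must be positive")
    orbital_term = n_electrons * delta_chi_bar
    # The electrostatic term is whatever remains of delta_e.
    electrostatic_term = delta_e - orbital_term
    return {
        "delta_e": delta_e,
        "orbital": orbital_term,
        "electrostatic": electrostatic_term,
    }


# Hypothetical placeholder inputs, for illustration only.
terms = eqc_partition(delta_e=-4.0, n_electrons=10, delta_chi_bar=-1.5)
for name, value in terms.items():
    print(f"{name:>13s}: {value:8.2f}")
```

Obtaining the electrostatic term as a remainder guarantees that the two contributions sum to ΔE, which makes the partition easy to check; it could equally be evaluated directly from the nuclear and electron repulsion energies.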
In the EQC framework, a chemical bond can be defined by its homolytic bond dissociation reaction. In such a definition, the energy associated with a chemical bond directly relates to the corresponding energy change over the transformation, ΔE. This definition of a chemical bond and its energy is not necessarily the most elegant or practical. [29] For example, it is not always possible to consider a homolytic bond formation or dissociation of an intramolecular bond. One could also imagine defining a chemical bond based on its heterolytic dissociation. However, such a choice makes apparent the issue of reference states, namely how to choose the "right" cationic and anionic fragments that are formed. There exist many other EDA methods (see, e.g., Refs. [25,30-32] and references therein), which are capable of defining interaction energies between parts of a molecule and of partitioning these energies into interpretable contributions such as electrostatic, charge transfer and dispersion interactions. [30,33] Defining a chemical bond by its homolytic formation/dissociation, as we do, has one advantage: such a process is, at least in principle, possible to observe experimentally. In what follows, we will rely solely on Density Functional Theory (DFT) calculations for our chemical bonding analysis.

The Bond Descriptor Q
One way to summarize information from the EQC-EDA shown as Eq. (3) is through a descriptor we denote as Q. [26] Q is designed to weigh the orbital nΔχ̄ and electrostatic Δ(V_NN − E_ee) contributions to the total energy and can be considered an index capable of distinguishing between different chemical transformations. In particular, Q has shown an ability to differentiate between covalent, polar, and ionic bonds in diatomic molecules, which points to a connection with charge transfer. [26] The magnitude of Q for homolytic bond formation/dissociation has also been shown to correlate strongly with the degree of correlation energy in diatomic bonds. We point the reader to Ref. [26] for a more detailed discussion on the interpretation of Q for diatomic bond formation.
In this work, we take the next step in the application of EQC to large molecules and use Q to probe the character of chemical bonds inside of substituted benzoic acids and substituted bicyclo[2.2.2]octane carboxylic acids. In order to see how Q relates to electron-donating and withdrawing effects of substituents, we compare with experimental data in the form of Hammett σ constants, as well as field (F) and resonance (R) effects of functional groups. [3]

Results and Discussion
In Figure 1a, we show the most common reaction used for determining σ values, the acid-base equilibrium of benzoic acids in water. In what follows, we will investigate substituted benzoic acids in a related manner by comparing calculated Q values for bonds in these molecules with experimental σ constants. One critical question we ask is which bond in a substituted benzoic acid should be analyzed to obtain the most relevant information on a molecule's reactivity. Panels b, c, and d of Figure 1 illustrate the different kinds of bonds in benzoic acids investigated in this work, and these bonds will be separately described in the following three sections. Towards the end of this work, we additionally address non-conjugated systems and discuss how Q relates to two main components of substituent effects, field and resonance. The Q descriptor does not appear well suited to describe the creation of charged species, and for this reason we do not consider any heterolytic dissociations that would be more akin to the acid-base equilibrium shown in Figure 1a.

The Nature of the Carboxylic O-H Bond
The carboxylic O-H bond defines the most acidic site of the molecule (it is the bond that is heterolytically broken upon proton transfer), and we therefore expect the nature of this bond to relate to acidity. We probe the nature of this bond by calculating Q for its homolytic dissociation, as shown in Figure 1b. We know from previous work that the value of Q is sensitive to charge transfer in the homolytic formation of diatomics, [26] and so our first results, shown in Figure 2, are surprising: there is no correlation between Q calculated for the O-H bond and σ! Of course, we should keep in mind that Q and σ are far from the same quantity. The Hammett σ constant is an empirical estimate of the effect of a substituent on the acidity of benzoic acids at 298 K in aqueous media. In contrast, Q here refers to a bond formation/dissociation process, occurring in vacuum as T → 0 K, without consideration of thermal or solvent effects. Nevertheless, even though Q is a process-related descriptor, we do expect it to provide information on bond properties, such as polarity. [26] So, what is going on?
The O-H bond differs from the bonds shown in Figures 1c and 1d in several respects: the O-H bond is broken during deprotonation, and it is also not part of the π-conjugated system. A substituent affects acidity primarily by changing the balance in stability between the benzoic acid and its conjugate base. Information on the reactivity of benzoic acids might therefore be better obtained by studying the dissociation of intramolecular bonds that are common to both the acid and its conjugate base. Examples of such bonds are those between the aromatic moiety and the reactive carboxyl group or the substituent functional group, shown in Figures 1c and 1d, respectively.

Connecting the Carboxyl Bond to Benzoic Acid Reactivity
We next look at Q calculated for the bond to the carboxyl group, in other words, for homolytic dissociations of the general type shown in Figure 1c. Figure 3 highlights a strong (r² = 0.90) correlation between σ and Q, when the latter is calculated for 35 benzoic acids with common substituents in the meta position. [3] We attribute the one distinct outlier of Figure 3, the C(CF3)3 group, to the group's bulkiness, as it slightly distorts the geometry of the phenyl ring by bending the ortho-hydrogens. The absence of significant steric effects is a necessary condition for the Hammett equation to be applicable. A corresponding plot for para-substitutions shows qualitatively the same trend, albeit with a slightly lower coefficient of determination (r² = 0.83, Figure S1). Should we be surprised by the striking correlation of Figure 3? Yes and no. On the one hand, we know that the value of Q is related to charge transfer. [26] On the other hand, and as mentioned, Q and σ are fundamentally different quantities. Even considering that σ is assumed independent of temperature and solvent, [10] the degree to which Q and σ correlate in Figure 3 is noteworthy.
In Figure 3, colored circles represent families of substituents with a similar chemical function. Both alkyl and ether groups (green and red, respectively) produce tight clusters of data points. For substituent groups that feature amino or carbonyl/carboxyl functions (blue and cyan, respectively), the results are slightly more spread on the Q axis. The clustering of Q values for similar compounds correctly predicts a small difference in chemical reactivity within each family of substituents. Nevertheless, the spread in the Q scale for similar groups is larger than what is seen in the σ scale. One reason for the scattering is arguably subtle differences within a family of substituents, for example, the length of alkyl chains. We will return to discuss other reasons why groups with similar σ can feature different Q values.
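A correlation of the kind discussed for Figure 3 can be quantified with a linear fit and its coefficient of determination. The dependency-free sketch below assumes an ordinary least-squares fit of σ against Q; this is an assumption, since the fitting procedure is not specified beyond "linear regression". The function name `fit_line` and the (Q, σ) pairs are made-up placeholders that only exercise the code and do not reproduce the data of Figure 3.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient of
    determination of the fit.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Made-up placeholder data, for illustration only.
q_values = [0.10, 0.20, 0.30, 0.40]
sigma_values = [-0.15, 0.05, 0.30, 0.45]

slope, intercept, r2 = fit_line(q_values, sigma_values)
print(f"sigma = {slope:.3f} * Q + {intercept:.3f}   (r2 = {r2:.3f})")
```

For a single explanatory variable, r² equals the square of the Pearson correlation coefficient, which is the form used in the function above.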

The Sensitivity of Q to Local Effects
Next, we consider the dissociation of the substituent group itself, i.e., bonds exemplified by Figure 1d. Figure 4 shows that, in this case, the correlation between σ and Q is significantly decreased relative to the comparison in Figure 3.
The main outliers in Figure 4 include carbonyls, nitriles, and nitro groups, all of which are known to generate significant electric fields due to their intrinsic dipoles. [1,35] Such fields are examples of local effects that affect the immediate surroundings, such as the substituent bond, more than regions further away, such as the carboxyl group. Other outliers in Figure 4 are the isophthalic esters (shown in cyan), which are known to exhibit a different kind of local effect. The length of the ester alkyl chain is known to have no influence on the reactivity of an isophthalic ester, measured in terms of its σ constant. [2,3] However, granted that differences in ester hydrolysis rates can be partially attributed to steric effects, the trend in rates suggests a local effect of the alkyl chain length on the ester group. [36] We have noted a sensitivity of Q to effects local to the investigated bond (viz. Figure 1d) and seen how, in the absence of local effects (viz. Figure 1c), Q can correlate with known reactivity trends. What other limits might exist for Q as a reactivity descriptor? Can the equivalent analysis also work in non-conjugated systems? To find out, we next investigate whether Q can distinguish between field and resonance substituent effects.

Q as a Reactivity Descriptor in Non-Conjugated Systems
To investigate whether Q can be a reactivity descriptor also in non-conjugated systems, we have calculated this metric for the carboxyl bond in 4-substituted bicyclo[2.2.2]octane carboxylic acids (Figure 5b).
The compounds in this class are conceptually non-conjugated analogs of para-substituted benzoic acids because the 4-substituent and the carboxylic group are separated by a nearly identical distance and by the same number of bonds in both types of compounds. [37] These two families of compounds have been compared in the past to experimentally delineate the so-called field (F) and resonance (R) effects of functional groups. [3] Field effects typically refer to a combination of different through-space effects. One kind of field effect is the electric field generated by the substituent, while another is through-bond effects occurring due to polarization of σ bonds. [1,2] In contrast, resonance effects are defined as those occurring only due to electron delocalization in conjugated π systems. [1] The way experiments have been used to separate F from R is by first considering the overall effect of a substituent as a combination of F and R effects. [38] This combination of effects is what is measured by σ in the traditional Hammett-type experiments with, for example, benzoic acids (Figure 1a). By next considering non-conjugated analogs of aromatic compounds, one can estimate the field effect of a chemical substituent separately from resonance effects. Figure 5a shows the acid-base equilibrium for 4-substituted bicyclo[2.2.2]octane carboxylic acids that has been used to derive tabulated field effects from measured changes in pKa relative to a reference compound. [3,37] Figure 5b shows our related approach for calculating Q for the carboxylic acid bond in these compounds. Figure 6 compares the calculated Q values with experimental F and R parameters tabulated by Hansch et al. [3] A linear regression shows a clear correlation between Q and F (r² = 0.83), a relationship that is largely mirrored when comparing Q and σ in conjugated systems (see Figures 3 and S1). In contrast, we find no clear correlation between Q and R (r² = 0.24).
A comparison of Q with F and R also for meta-substituted benzoic acids can be found in the Supporting Information (Figure S2). Q does correlate with both F (r² = 0.78) and R (r² = 0.49) in meta-substituted benzoic acids, although to a lower degree than with the σ constant (Figure 3). Having established, in our set of 35 substituents, a linear relationship between Q and F, and even more so between Q and σ, the question becomes: Can these relationships be used predictively?

Predicting Electronic Group Effects using Q
To test the predictive utility of Q, we have performed our analysis on an additional test set of 10 substituents. The models that we used to estimate σ and F as a function of Q are the linear regressions shown in Figure 3 and Figure 6a, where Eqs. (5) and (6) are used when the Q analysis is performed on benzoic acids and bicyclo[2.2.2]octane carboxylic acids, respectively. Because the test set is small, it has not been chosen randomly, but has instead been designed to be clearly chemically distinct from, and outside of, the initial set used to derive Eqs. (5) and (6). Experimental and predicted values of σ and F are shown in Table 1 for each substituent in the test set. The root mean square error (RMSE) associated with the predicted data is ±0.12 and ±0.09 for σ and F, respectively. We note that the training set of 35 substituents provides a similar RMSE (±0.09 for both σ and F). Despite the inherent errors expected from the model, the Q analysis predicts the correct reactivity trends. This agreement opens the door to qualitative predictions for substituent groups beyond the 45 calculated here.
There exists a large number of substituents that are either too unstable or too reactive to practically allow for experimental measurement of, for instance, σ or F values. One example is the highly energetic pentazolyl (-N5) group, which can be handled at low temperatures when present on suitably substituted arylpentazoles. [39] The making of the kinetically more persistent pentazolate anion, cyclo-N5⁻, first in the gas phase, [40] then in condensed phases, [41] has opened the door to N5-based reaction chemistry, and, possibly, to new kinds of pentazolyl-substituted compounds. Our predicted σ and F values for the pentazolyl substituent indicate a strong electron-withdrawing ability (Table 1). These predictions are in line with the known increased stability of aryl pentazoles in the presence of electron-donating substituents [39] and indicate that the electronic effects of the N5 group resemble those of SO2CF3.
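The evaluation step described above, applying a linear model to Q and comparing the predictions with tabulated values, can be sketched as follows. `SLOPE` and `INTERCEPT` are hypothetical stand-ins for the regression coefficients, which are not reproduced here, and the Q and reference values are invented for illustration only; the function names `predict` and `rmse` are likewise our own.

```python
import math

# Hypothetical regression coefficients, for illustration only.
SLOPE = 2.0
INTERCEPT = -0.5


def predict(q, slope=SLOPE, intercept=INTERCEPT):
    """Estimate a substituent parameter (sigma or F) from Q with a linear model."""
    return slope * q + intercept


def rmse(predicted, reference):
    """Root mean square error between predicted and reference values."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Invented placeholder test set: Q values and reference parameters.
q_test = [0.20, 0.35, 0.50]
reference = [-0.05, 0.15, 0.55]

predictions = [predict(q) for q in q_test]
print("predicted:", [round(p, 2) for p in predictions])
print(f"RMSE = {rmse(predictions, reference):.3f}")
```

Keeping the test substituents out of the fit, as done in the text, means that the RMSE computed this way reflects predictive rather than descriptive accuracy.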

Conclusions
In this work, we have evaluated the ability of a quantum chemically derived bonding descriptor, Q, to capture substituent effects in large molecules. Q, which is straightforward and fast to calculate using standard DFT methods, is found to correlate strongly with experimentally derived field effect parameters and Hammett σ constants in a test set of 45 substituted benzoic acids and their non-conjugated analogs, 4-substituted bicyclo[2.2.2]octane carboxylic acids. The correlation between Q and σ and F in aromatic and non-aromatic compounds, respectively, opens up the possibility of predicting substituent effects in a new way. In particular, Q can serve as a fast computational substitute for σ and F parameters in cases where the experimental determination of the latter is challenging. As an example, we have applied the developed Q-based protocol to predict the unknown σ and F values for the highly energetic pentazolyl, cyclo-N5, substituent (Table 1).
Our work constitutes the first validation against experimental data of a different kind of chemical bonding analysis, the EQC-EDA scheme, when applied to larger molecules. The use of experimental data and reactivity scales in benchmarking of electronic structure analysis tools is desirable but relatively seldom practiced in a systematic fashion. [31,42] The functional groups investigated in this work are commonly used and span a wide range of properties and structures. We would like to encourage the inclusion of the same set of compounds and associated σ, F, and R parameters in the evaluation of other electronic structure analysis methods. The benefit of such cross-comparison, and our aim, is the gradual build-up of comprehensive benchmarks that can facilitate both the comparison and development of electronic structure analysis methods, making them even better suited to serve synthetic chemists and materials scientists.

Computational Methodology
All experimental σ, F, and R parameters are from Ref. [3]. Calculations have been performed at the M06-2X [43] /aug-cc-pVTZ [44] level of theory using Gaussian 16, revision B.01. [45] All structures were confirmed as true minima on the potential energy surface through vibrational analyses at the same level of theory. Our results are robust with respect to smaller basis sets. A comparison with the aug-cc-pVDZ basis set can be found for a selection of molecules in Figure S3. Detailed lists of data for all considered substituents are provided in Tables S1, S2, S3, and S4. A full EQC-EDA analysis on the training set of substituents is provided in Tables S5 and S6. The Q-analysis was performed using a Python script provided at www.rahmlab.com. The script relies on the cclib parsing library [46] and can interpret output from several common quantum chemistry programs.
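As a rough illustration of one ingredient of such an analysis, and not a reproduction of the script referred to above, the sketch below computes the average energy of the occupied molecular orbitals, which approximates the average electron energy discussed in the EQC section. The inputs are assumed to follow the layout of the `moenergies` attribute (one sequence of orbital energies per spin channel) and the `homos` attribute (index of the highest occupied orbital per spin channel) of data parsed with cclib. The function name `mean_occupied_energy` and the example numbers are hypothetical, and only closed-shell restricted or unrestricted calculations are assumed.

```python
def mean_occupied_energy(moenergies, homos):
    """Electron-weighted mean energy of the occupied orbitals.

    moenergies: one sequence of orbital energies per spin channel
                (a single sequence for a closed-shell restricted calculation).
    homos:      index of the highest occupied orbital in each channel.
    """
    if len(moenergies) != len(homos) or len(moenergies) not in (1, 2):
        raise ValueError("expected one (restricted) or two (unrestricted) spin channels")
    # Two electrons per orbital if there is a single spin channel, else one.
    electrons_per_orbital = 2 if len(moenergies) == 1 else 1
    total_energy = 0.0
    total_electrons = 0
    for energies, homo in zip(moenergies, homos):
        occupied = list(energies[: homo + 1])
        total_energy += electrons_per_orbital * sum(occupied)
        total_electrons += electrons_per_orbital * len(occupied)
    if total_electrons == 0:
        raise ValueError("no occupied orbitals found")
    return total_energy / total_electrons


# Hypothetical orbital energies (arbitrary units), for illustration only.
closed_shell = mean_occupied_energy([[-10.0, -5.0, 1.0]], [1])
radical = mean_occupied_energy([[-10.0, -4.0, 2.0], [-9.0, 3.0, 4.0]], [1, 0])
print(f"closed shell: {closed_shell:.3f}   radical: {radical:.3f}")
```

Handling two spin channels matters here because the fragments formed in a homolytic dissociation are radicals, for which the spin-up and spin-down orbitals are occupied differently.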