Intermolecular Non-Covalent Carbon-Bonding Interactions with Methyl Groups: A CSD, PDB and DFT Study

A systematic evaluation of the CSD and the PDB in conjunction with DFT calculations reveal that non-covalent Carbon-bonding interactions with X–CH3 can be weakly directional in the solid state (P ≤ 1.5) when X = N or O. This is comparable to very weak CH hydrogen bonding interactions and is in line with the weak interaction energies calculated (≤ –1.5 kcal·mol−1) of typical charge neutral adducts such as [Me3N-CH3···OH2] (2a). The interaction energy is enhanced to ≤–5 kcal·mol−1 when X is more electron withdrawing such as in [O2N-CH3··O=Cdme] (20b) and to ≤18 kcal·mol−1 in cationic species like [Me3O+-CH3···OH2]+ (8a).

The impact of a novel type of weak interaction on molecular recognition phenomena naturally leads one to speculate that other non-canonical interactions may play a similar role. One interesting candidate are σ-hole interactions involving sp 3 -hybridized C-atoms. Such interactions have been studied since about 2013 [7,46] and are particularly interesting because sp 3 -C is abundant in living systems. More specifically, the methyl group (X-CH 3 , where X = any atom or group) is frequently encountered in natural and synthetic compounds and 'non-covalent Carbon bonding' involving methyl groups has thus been studied by various researchers [47][48][49][50][51][52][53][54][55][56][57][58][59]. Most of these contributions are computational inquiries, while a small amount of these articles also deals with an analysis of non-covalent Carbon bonding interactions in protein structures present in the Protein Data Bank (PDB) [47,50,56]. Interestingly, none of the studies so far have systematically evaluated the crystal structure data present in the Cambridge Structure Database (CSD) [60,61]. What is more, evaluations of the PDB were largely anecdotal or only considered structures that comply to the (rather strict) criteria of a Carbon bonding geometry. Some also included intramolecular contact distances (which are notoriously difficult to evaluate).
In this contribution a combined CSD and PDB evaluation is presented aimed at elucidating whether electron rich entities have a preferential orientation around a methyl group within a rather large envelope, i.e., whether intermolecular non-covalent Carbon bonding interactions with methyl groups are directional. For evaluative purposes, several Density Functional Theory (DFT) computations were conducted as well. This combined database/DFT study reaffirms that noncovalent Carbon-bonding interactions with X-CH3 can be significant, although the interaction is hardly directional, in particular when the methyl group is poorly polarized such as most C-CH3 structures.

General Information on Database Analyses
The CSD [60,61] version 5.40 including two updates (until May 2019) was inspected using ConQuest [62] version 2.0.2 (build 246353, 2019). X-ray powder structures were omitted from the searches, which were further limited to structures containing 3D coordinates and those with an Rfactor ≤ 0.1. The PDB was queried using Relibase [63] 3.2.3 and restricted to protein and DNA crystal structures where the packing environment was also searched. No other restrictions were imposed on the PDB search. Datasets were obtained using the general query shown in Figure 1a. The methyl groups were split in those connected to a C, N, O, P, or S atom (X in the figure, in the PDB search specified as part of a ligand). The interacting 'electron rich' partners (ElR in the figure) considered were a water, amide or carboxy-O atom, a sulphur atom or the centroid of an aryl ring (in the PDB search always specified as part of the protein). The geometric constraints imposed on the searches were that the intermolecular distance d between the methyl C-atom and ElR was ≤5 Å and that the X-CH3•••ElR angle (α) was 90°-180°. All the data were thus confined within a hemisphere with a basal radius of 5 Å, centered on the methyl C-atoms as is shown in Figure 1b. (a) general query to obtain data with d ≤ 5Å, α = 90°-180°, X = C, N, O, P or S and ElR (electron rich entity) is as indicated. (b) Illustration of the method used to assess directionality (see text for details).

Methodology to Generate P(α) Plots
The datasets obtained as described above (2.1) were analysed to assess whether the distribution of ElR within the methyl-centered hemisphere reflects any directionality. This method has been successfully applied to assess the directional behaviors (in the solid state) of various other weak noncovalent interactions such as anion/lone-pair-π, [29,64] CH-π, [11,65] halogen-π [66,67] and nitro πhole interactions [42,68]. The method works by first computing the freely accessible volumes at each α-value (α free ) by subtracting the volume of a model methyl group from a spherical cone with 5 Å height and a cone angle of 180-α. This can be achieved by using the 3D-drawing program Autodesk ® Inventor ® Pro [29]. This is illustrated in Figure 1b, where the spherical cones are shown at 10° intervals. The model methyl group was generated by using standard aliphatic C-H bond distances (1.06 Å) [69] and the van der Waals radius of C (1.70 Å) and H (1.09 Å) [70]. The interfering volume

Methodology to Generate P(α) Plots
The datasets obtained as described above (2.1) were analysed to assess whether the distribution of ElR within the methyl-centered hemisphere reflects any directionality. This method has been successfully applied to assess the directional behaviors (in the solid state) of various other weak non-covalent interactions such as anion/lone-pair-π, [29,64] CH-π, [11,65] halogen-π [66,67] and nitro π-hole interactions [42,68]. The method works by first computing the freely accessible volumes at each α-value (α free ) by subtracting the volume of a model methyl group from a spherical cone with 5 Å height and a cone angle of 180-α. This can be achieved by using the 3D-drawing program Autodesk ® Inventor ® Pro [29]. This is illustrated in Figure 1b, where the spherical cones are shown at 10 • intervals. The model methyl group was generated by using standard aliphatic C-H bond distances (1.06 Å) [69] and the van der Waals radius of C (1.70 Å) and H (1.09 Å) [70]. The interfering volume between each spherical cone and the model methyl group can be obtained using the 'inspect interference' option in Autodesk ® Inventor ® Pro; the red part in Figure 1b is the interfering volume involving a spherical cone with a cone angle of 60 • (i.e., at α = 120 • ). The volume differences between such 'free' volumes with increasing values of α thus give the absolute volume distribution of freely accessible volume around a methyl model within the hemisphere, as a function of α: ∆α free (α). Dividing each volume (∆α free ) in this distribution by the total freely accessible volume (i.e., the volume of a hemisphere minus the interfering volume of the model methyl group in that hemisphere) thus gives the relative volume distribution as a function of α: ∆ rel α free (α). This distribution is the random (or volume) distribution. The data retrieved form the CSD and the PDB can be binned as a function of α. Relating this binned data to all the data in a dataset thus gives the observed relative distribution as a function of α: ∆ rel α data (α). The quotient of this relative data distribution over the random distribution is a measure for the actual probability (P) of finding data at a certain value of α. That is, P(α) is unity for a random distribution of data, while P-values larger than unity reveal a relative concentration of data, which is indicative of attractive interactions.

Methodology to Generate N(d') Plots
A second analysis involved plotting the hit fraction (N) for a subset with α = 160 • -180 • as a function of the van der Waals corrected H 3 C···ElR distance d' (i.e., d -vdW(C) -vdW(ElR)): [70] N(d'). Such distributions show how much of the data is involved in van der Waals overlap with the methyl C-atom along the vector of the X-CH 3 bond and how such data is distributed. For attractive interacting pairs this distribution is expected to exhibit a peak-like feature, or an S-like curvature when the cumulative hit fraction is used.

Computational Methods
DFT geometry optimization calculations were performed with Spartan 2016 at the B3LYP [71,72]-D3 [73]/def2-TZVP [74,75] level of theory, which is known to give accurate results at reasonable computational cost and a very low basis set superposition error (BSSE) [74,75]. The typical starting geometry for possible Carbon bonding adducts was set to d' = −0.1 Å and α = 180 • , and in the case of dimethylacetamide the C···O=C angle was also set to 180 • . The geometry optimizations were performed without any constraints. For other geometries (e.g., a H-bonded geometry), the molecular fragments were manually oriented in a suitable constellation before starting an unconstrained geometry optimization. The Amsterdam Density Functional (ADF) [76] modelling suite at the B3LYP [71,72]-D3 [73]/TZ2P [74,75] level of theory (no frozen cores) was used for energy decomposition and 'atoms in molecules' [77] analyses. Details of the Morokuma-Ziegler inspired energy decomposition scheme used in the ADF-suite have been reported elsewhere [76,78] and the scheme has proven useful to evaluate hydrogen bonding interactions [79].

P(α) Plots
A numerical overview of the amount of crystallographic information files (CIFs) and protein data bank files (PDBs) for each search query is given in Table S1, together with the amount of hits found in each dataset (a .cif or .pdb file can contain multiple hits). Shown in Figure 2 are the P(α) plots for the CSD (left) and PDB (right) data plotted at 5 • intervals involving X = C, N, O and ElR = water-O (top) or amide-O (bottom). These datasets were chosen because they all contained a large number of hits (>7,500) and thus allow for the most reliable comparison. A complete set of P(α) plots is provided in Figure S1.  Table S1 for numerical overview), the plots are given at a 5° resolution for α. A full set of P(α) plots (i.e., for all the X vs ElR pairs in Figure 1) is given in Figure S1 The data plotted in Figure 1 largely trace the line at P = 1 (highlighted in green), which is indicative of a random distribution of data. For X = O and N, these values are somewhat above unity around α = 160°-180° for water O-atoms in both databases and for amide O-atoms in the CSD. The maximum P-values are very small at about 1.5, which indicates a very small amount of directionality. Indeed, maximum P-values for several weak inter-molecular interactions are: ~2.5 for CH-π; [11,65] ~3 for π interactions with nitro compounds; [42,68] ~2.5-5 for anion-π and lone-pair-π; [29,64] ~2.5-10 for halogen-π [67] and also about 2.5-10 for halogen bonding with aryl-halogens. [66] Interestingly, the P-values did not peak near α = 90°-120°, an angle congruent with hydrogen bonding. These data thus suggest that the Carbon binding geometry is more directional than a hydrogen bonding geometry, although this directionality is very weak. In all cases for X = C, the Pvalues around α = 160°-180° are below unity, suggesting that the Carbon bonding geometry is least favored in these instances. The data for ElR = RCO2 and RyCS are very similarly distributed and the data for aryl rings is skewed towards α = 160°-180° only for the CSD data (for X = C, N, O, P and S, see Figure S1). For all other cases where X = P, very few hits were obtained (most numerous was RCO2 in the CSD with N = 1,454). While some datasets with X = S were of a reasonable size, too many were well below N = 7,500 and these data will thus not be discussed in the main text (there is a small discussion in the caption of Figure S1).

N(d') plots
The data characterized by α = 160°-180° was inspected further by means of N(d') plots as described in the methods and materials section. A numerical overview of the amount of data in each dataset as well as the (relative) amount of van der Waals overlap found in each dataset is given in Table S2. Shown in Figure 3 are plots of the hit fractions (in %) as a function of d' (the van der Waals corrected d) for X = C, N, O and ElR = water or amide. The same data plotted as cumulative hit fractions is shown in Figure S2 and the N(d') plots for all datasets containing >500 hits are shown in Figure S3. . P (α) directionality plots for the data retrieved from the CSD (left) and the PDB (right) using the general query shown in the top-right inset figure for X-CH 3 ···ElR pairs. X can be C, N or O and 'ElR' can be a water or an amide O-atom. The insert figure in the top left is intended as a guide to the eye to interpret the spatial location of data with a certain value of α. Due to the amount of data per dataset (see Table S1 for numerical overview), the plots are given at a 5 • resolution for α. A full set of P(α) plots (i.e., for all the X vs ElR pairs in Figure 1) is given in Figure S1. The P value of 1 is highlighted in green and indicates a random distribution of data. The data plotted in Figure 1 largely trace the line at P = 1 (highlighted in green), which is indicative of a random distribution of data. For X = O and N, these values are somewhat above unity around α = 160 • -180 • for water O-atoms in both databases and for amide O-atoms in the CSD. The maximum P-values are very small at about 1.5, which indicates a very small amount of directionality. Indeed, maximum P-values for several weak inter-molecular interactions are:~2.5 for CH-π; [11,65]~3 for π interactions with nitro compounds; [42,68]~2.5-5 for anion-π and lone-pair-π; [29,64]~2.5-10 for halogen-π [67] and also about 2.5-10 for halogen bonding with aryl-halogens. [66] Interestingly, the P-values did not peak near α = 90 • -120 • , an angle congruent with hydrogen bonding. These data thus suggest that the Carbon binding geometry is more directional than a hydrogen bonding geometry, although this directionality is very weak. In all cases for X = C, the P-values around α = 160 • -180 • are below unity, suggesting that the Carbon bonding geometry is least favored in these instances. The data for ElR = RCO 2 and R y CS are very similarly distributed and the data for aryl rings is skewed towards α = 160 • -180 • only for the CSD data (for X = C, N, O, P and S, see Figure S1). For all other cases where X = P, very few hits were obtained (most numerous was RCO 2 in the CSD with N = 1,454). While some datasets with X = S were of a reasonable size, too many were well below N = 7,500 and these data will thus not be discussed in the main text (there is a small discussion in the caption of Figure S1).

N(d') plots
The data characterized by α = 160 • -180 • was inspected further by means of N(d') plots as described in the methods and materials section. A numerical overview of the amount of data in each dataset as well as the (relative) amount of van der Waals overlap found in each dataset is given in Table S2. Shown in Figure 3 are plots of the hit fractions (in %) as a function of d' (the van der Waals corrected d) for X = C, N, O and ElR = water or amide. The same data plotted as cumulative hit fractions is shown in Figure S2 and the N(d') plots for all datasets containing >500 hits are shown in Figure S3. The data presented in Figure 3 thus imply a somewhat directional nature of X-CH3•••O amides/waters interactions for X = N and O, but not at all for X = C. These findings are in line with the lack of directionality observed in the P(α) plots for X = C and the somewhat directional behavior for X = N or O (see Figure 2). A likely explanation for this is the larger (Pauling) electronegativity of N (3.04) and O (3.44) compared to C (2.55), resulting in a larger degree of polarization of the X-CH3 bond for X = N or O. Another conceivable manner to make a methyl group pore electropositive is to bind it to a cationic fragment such as in protonated or quaternary R3N + -CH3 fragments. Thus, an additional dataset was retrieved from the CSD involving R3N + -CH3•••ElR pairs that fulfilled the d ≤ 5 Å and α = 160°-180° criteria (see bottom entries in Table S1 and Table S2). Shown in Figure 4 are the N(d') plots of the most numerous datasets involving R3N + -CH3 (hexagonals, N + ) together with similar datasets involving all possible N-CH3 fragments (diamonds, N). In all three cases (ElR = water, carboxy or aryl), the distribution is shifted to lower d' distances for cationic R3N + -CH3, which means that relatively more van der Waals overlap is present in these datasets. As can be seen especially from the cumulative N(d') plots (left), the grouping is tightest with carboxy O-atoms (green), followed by water O-atoms (red) and aryl rings centroids (grey) are not grouped at all (nearly linear). The data presented in Figure 3 thus imply a somewhat directional nature of X-CH 3 ···O amides/waters interactions for X = N and O, but not at all for X = C. These findings are in line with the lack of directionality observed in the P(α) plots for X = C and the somewhat directional behavior for X = N or O (see Figure 2). A likely explanation for this is the larger (Pauling) electronegativity of N (3.04) and O (3.44) compared to C (2.55), resulting in a larger degree of polarization of the X-CH 3 bond for X = N or O. Another conceivable manner to make a methyl group pore electropositive is to bind it to a cationic fragment such as in protonated or quaternary R 3 N + -CH 3 fragments. Thus, an additional dataset was retrieved from the CSD involving R 3 N + -CH 3 ···ElR pairs that fulfilled the d ≤ 5 Å and α = 160 • -180 • criteria (see bottom entries in Tables S1 and S2). Shown in Figure 4 are the N(d') plots of the most numerous datasets involving R 3 N + -CH 3 (hexagonals, N + ) together with similar datasets involving all possible N-CH 3 fragments (diamonds, N). In all three cases (ElR = water, carboxy or aryl), the distribution is shifted to lower d' distances for cationic R 3 N + -CH 3 , which means that relatively more van der Waals overlap is present in these datasets. As can be seen especially from the cumulative N(d') plots (left), the grouping is tightest with carboxy O-atoms (green), followed by water O-atoms (red) and aryl rings centroids (grey) are not grouped at all (nearly linear).

Computations
In order to gain insight into the nature and energetics of possible non-covalent Carbon bonding interactions involving various methyl groups and water or amide O-atoms, DFT calculations were performed of X-CH3 adducts with water and with dimethylacetamide (dma, see methods section for details). An overview of these adducts is given in Table 1, together with α of the optimized structures, the total interaction energy of the adducts in kcal•mol −1 and the percentages of electrostatic (E), orbital (O), and dispersion (D) interactions that contribute to this total energy [79]. Perspective views and atoms-in-molecules analyses of all converged structures are shown in Figure S4 and several representative examples are shown in Figure 5. Table 1. Numerical overview of adducts computed with DFT between an indicated X-CH3 methyls and water (adducts 'a') or dimethylacetamide (dma, adducts 'b'). Using ADF at the B3LYP-D3/def2-TZVP level of theory, interaction energies (in kcal•mol −1 ) were computed and an energy decomposition analyses is shown as a percentage of the total amount of interaction energies split up as electrostatic (E), orbital (O) and dispersion (D) interactions. Entries in grey are not consistent with a Carbon bonding geometry. See Figure S4 for perspective views and atoms in molecules analyses.

Computations
In order to gain insight into the nature and energetics of possible non-covalent Carbon bonding interactions involving various methyl groups and water or amide O-atoms, DFT calculations were performed of X-CH 3 adducts with water and with dimethylacetamide (dma, see methods section for details). An overview of these adducts is given in Table 1, together with α of the optimized structures, the total interaction energy of the adducts in kcal·mol −1 and the percentages of electrostatic (E), orbital (O), and dispersion (D) interactions that contribute to this total energy [79]. Perspective views and atoms-in-molecules analyses of all converged structures are shown in Figure S4 and several representative examples are shown in Figure 5. a One of the water H-atoms and one of the RCH3 C-atoms are closest to each other; b Carbon bonding interaction geometry; c Interaction with the cationic C; d Hydrogen bonding interaction(s); e Interaction with cationic N; f CH-π interaction; g The interaction energy with benzene was also computed, starting from a geometry with X-CH3•••benzene centroid = 180°. All adducts converged at a geometry where this angle was about 90° degrees. Interaction energies are about 4-5 kcal•mol −1 and dominated by electrostatics (35-40%) and dispersion (40-50%, see Figure S5 for details).
For comparison purposes, adducts with ethane were computed as shown in entries 1 of Table 1 Table 1 that were optimized by DFT (B3LYP-D3/def2-TZVP). The thin lines are bond paths (bp's) and the small red spheres are bond critical points (bcp's) obtained from an 'atoms-in-molecules' analysis. The bond density (ρ) is in arbitrary units ·10 2 and bcp's indicative of non-covalent Carbon bonding have been highlighted in yellow. Table 1. Numerical overview of adducts computed with DFT between an indicated X-CH 3 methyls and water (adducts 'a') or dimethylacetamide (dma, adducts 'b'). Using ADF at the B3LYP-D3/def2-TZVP level of theory, interaction energies (in kcal·mol −1 ) were computed and an energy decomposition analyses is shown as a percentage of the total amount of interaction energies split up as electrostatic (E), orbital (O) and dispersion (D) interactions. Entries in grey are not consistent with a Carbon bonding geometry. See Figure S4 for perspective views and atoms in molecules analyses.
Interacting X-CH 3 : a One of the water H-atoms and one of the RCH 3 C-atoms are closest to each other; b Carbon bonding interaction geometry; c Interaction with the cationic C; d Hydrogen bonding interaction(s); e Interaction with cationic N; f CH-π interaction; g The interaction energy with benzene was also computed, starting from a geometry with X-CH 3 ···benzene centroid = 180 • . All adducts converged at a geometry where this angle was about 90 • degrees.
For comparison purposes, adducts with ethane were computed as shown in entries 1 of Table 1. Both structures converged to a hydrogen bonding interaction. ∆E = −1.2 kcal·mol −1 in 1a and a methyl acts as electron donating site; i.e., an O-H···C hydrogen bonding interaction. This can be understood due to the polarization in ethane, where both C's are most electronegative and are polarized by the H-atoms. Adduct 1b is about twice as stable with ∆E = −2.8 kcal·mol −1 and also features a hydrogen bonding interaction, but now between a methyl CH and a π-bond in dma. Both 1a and 1b are stabilized mainly by dispersion (46-58%), then electrostatics (32-26%) and least by orbital interactions (22-16%). The neutral water adducts where X = permethylated N, O, P, or S are energetically nearly identical to the ethane adduct (2a-5a). Like with ethane, 4a converged in a O-H···C hydrogen bonding interaction, which can be rationalized by the lower (Pauling) electronegativity of P (2.19) compared to C (2.55). 2a, 3a and 5a converged at a geometry consistent with a Carbon bonding interaction. This is illustrated for structure 3a in Figure 5, where a single bond critical point (bcp) is located between C and O with a bond density of 0.60 · 10 2 a.u.. This can be rationalized by the higher (Pauling) electronegativity of N (3.05), O (3.44) and S (2.58) compared to C (2.55). Electrostatics or dispersion are the main energetic stabilizing factor in adducts 2a-5a which is typical for weak and non-directional interactions like in adduct 1a. A similar series with dma was computed as adducts 2b-5b. Only 2b converged in a geometry consistent with a Carbon bonding interaction (with dispersion as main driver) while the others are C-H···O hydrogen bonding interactions. Carbon bonding interactions with regular permethylated main group elements are thus comparable to very weak C-H hydrogen bonding interactions and less than about −1. The cationic adducts 6-11a were computed as well and the most stable of these involved the Me 3 C + carbocation in 6 (adducts with pentamethylated Carbon are unstable). The bonding interaction in 6b is largely covalent, as evidenced by the interaction energy of −82.4 kcal·mol −1 , the large orbital contribution (55%), a dense bond critical point (18.4 · 10 2 a.u.) and a clear pyramidalization of the central C-atom (see Figure S4). Of the other adducts, all except 6a (Me 3 C + ···O interaction) and 9 (with the least electronegative P) converged into an X-CH 3 ···O Carbon bonding geometry. This is illustrated for 8a and 11a in Figure 4, where a clear bcp can be seen in between methyl C and O water with a bond density of 1.15 · 10 2 and 0.98 · 10 2 a.u. for 8a and 11a respectively. The bonding energies in 7-11a are mainly electrostatic in origin (~70%) and about −8 kcal·mol −1 for water and −15 kcal·mol −1 for dma. The most stable adducts in both series involved the most electronegative O (3.44) in Me 3 O + (8). Two alternative configurations with N-methylpyridinium were also computed (12 and 13). In 12, the O points in between two CH hydrogens as is illustrated for 12a in Figure 4. In adducts 13 the O atom is located directly above the cationic N + . Both 12 and 13 are more stable than the Carbon bonding geometry found in 11, suggesting that hydrogen bonding interactions are most preferred. The interaction energies of Carbon bonding interaction with cationic species is similar to previous data of the adducts: [H 3 N + -CH 3 ···OCH 2 ] (−9.7 kcal·mol −1 ); [47] [Me 3 N + -CH 3 ···OC(H)NH 2 ] (−13 kcal·mol −1 ); [49] [Me 2 S + -CH 3 ···OH 2 /NH 3 /OCH 2 ] (about −8-9 kcal·mol −1 ); [47,49] and [R 2 S + -CH 3 ···various lone-pairs] (about −9.0 kcal·mol −1 ) [50].
As the calculations with cationic species imply that electron withdrawing substituents amplify the Carbon bonding interaction, it was decided to compute adducts with small molecules that have an electron withdrawing group: Iodomethane (14), 1,1,1-trifluoroethane (16), acetonitrile (18) and nitromethane (20). All these adducts converged as a Carbon bonding geometry and are energetically favorable by 1.6-2.8 kcal·mol −1 for water and 3.0-4.9 kcal·mol −1 for dma. The adducts involving nitromethane (20) were most stable and are shown in Figure 5, together with an aim analysis revealing a single C···O bcp (ρ = 0.79 · 10 2 and 0.83 · 10 2 a.u. for 20a and 20a respectively). For the water adducts, H-bonding geometries were also optimized: 15a, 17a, 19a and 21a. All these adducts were about twice as stable and the Carbon bonding geometry. This can be ascribed to the fact that besides a C-H···O hydrogen bonding interaction, another weak hydrogen bonding interaction is present as well (i.e., C-H···I in 15a, C-H···F in 17a and C-H···π in 19a, see also see also Figure S4). For example, a dimer of HCF 3 is estimated at −2.6 kcal·mol −1 and exhibits two weak C-H···F hydrogen bonding interactions (not shown). The interaction energies with neutral yet polarized methyl groups (14)(15)(16)(17)(18)(19)(20)

General Discussion
From all the calculations collected in Table 1 it is evident that the interaction energies of Carbon bonding geometries with the sp 2 -O in dma (adducts 'b') is consistently about twice as strong as the interaction with sp 3 -O in water (adducts 'a'). This is in line with the larger amount of van der Waals overlap observed in the N(d') plots (Figure 2,~30% for amides vs~15% for water). The interaction energies of adducts with a Carbon bonding geometry range from very weak (below −1.5 kcal·mol −1 in 2, 3, 5), to moderately weak (between −1.5 and −5 kcal·mol −1 in 14, 16, 18 and 20) to fairly strong in the cationic adducts (between −7 and −18 kcal·mol −1 in 6-11). ∆E becomes smaller (more stable) in the order 2 < 14 < 16 < 18 < 20 < 7 < 8. Within this series, the orbital contribution remains constant at about 15-20%, while the electrostatic component increases from 30-35% in 2 to about 65% in 8. This implies that stronger Carbon bonding interactions are mainly driven by electrostatic interactions and that weaker such adducts are driven by dispersion. These computational results are consistent with recent literature reports [47][48][49][50][51][52][53][54][55][56][57][58][59] and the database analyses presented here; neutral adducts are very weak and thus hardly (or non) directional but can be made stronger (and thus presumably more directional) when X in X-CH 3 is strongly polarized (see especially Figure 4). The relevance of Carbon bonding interactions with methyl groups is thus likely limited to highly polarized and/or cationic species. While this limits the scope considerably, it is worth pointing out that ligands with methyl groups related to those in adducts 2-10 and 14-21 are abundant within proteins structures and that cationic methyl groups also occur. For example, methylated methionine residues in methyl transferases [80] and nicotinamide derivatives such as nicotinamide adenine dinucleotide [81,82].

Summary and Conclusions
The CSD and the PDB were systematically evaluated for potential directional behavior of intermolecular non-covalent Carbon bonding interactions involving X-CH 3 and electron rich entities such as O/S atoms or an aryl ring (ElR) within a hemisphere of 5 Å basal radius (centered on C). It was found that X-CH 3 ···ElR interactions can be as directional as very weak hydrogen bonding interaction involving C-H (P max ≤ 1.50) but not directional at all when X = C. Grouping of data with significant amounts of van der Waals overlap (up to~30%) was observed in various sub-datasets in the region where the X-CH 3 ···ElR angle α is 160 • -180 • . These distributions were significantly shifted to shorter distances (i.e., more van der Waals overlap) in the case of cationic R 3 N + -CH 3 ···O water/amide compared to charge-neutral R 2 N-CH 3 ···O water/amide interactions.
Model DFT calculations revealed that charge neutral X-CH 3 ···O adducts with water and dimethylacetamide are very weak (≤ -1.5 kcal·mol −1 in 2, 3a, 5a) and are often not the energy minima of the adducts (1, 3b, 4, 5b). The interaction energies can be increased by deploying a more electron withdrawing X (-1.5 to -5 kcal·mol −1 in 14, 16, 18 and 20). Rendering X cationic leads to even more stable adducts (-7.0 to -18 kcal·mol −1 ) in 7, 8, 10 and 11). Carbon-bonding adducts with dimethylacetamide are consistently twice as stable as those with water. Energy decomposition analyses showed that increased stability is driven by electrostatics and atom-in-molecule analyses regularly gave a clear bond critical point involving the methyl C-atom.
It is thus concluded that this combined database / DFT study reaffirms that intermolecular non-covalent Carbon interactions with X-CH 3 is electrostatically driven and can be significant. The interaction can even by mildly directional in the solid state (comparable to weak CH hydrogen bonding interactions), provided X is sufficiently electron withdrawing.
Supplementary Materials: The following are available online, Table S1. Numerical overview of datasets retrieved from the CSD and the PDB. See Figure S1 and Figure 2 for P (α) directionality plots. For data of refined datasets see Table S2 (α ≥ 160 • and van der Waals overlap). 'n.a.' stands for 'not assessed'. The data for quaternary (cationic) N-atoms is not used in the paper for P(α) plot but they were used in N(d') plots. Table S2. Numerical overview of the amount of data in each dataset from Table S1 characterized by α ≥ 160 • (Nα ≥ 160 • ), together with the amount of data within that dataset where van der Waals radii overlap (NΣvdW). 'n.a.' stands for 'not assessed'. Datasets with Nα ≥ 160 • larger than 100 were inspected further by means of the N(d') plots shown in Figures  S2-S3. As a guide to the eye, datasets with less than 100 hits are in grey, those between 100 and 500 hits are in blue and those above 500 are in regular black. Figure S1. P (α) directionality plots for the data retrieved from the CSD (left) and the PDB (right) using the general query shown in the top-left inset figure for X-CH3···ElR pairs. X can be C, N, O, P, or S and 'ElR' can be a water, amide or carboxyl O-atom, an RyCS S-atom (y = 2 or 3, R = any non-metal) or the centroid of an aryl ring, as is indicated in the right-hand side of the figure. The insert figure in the top right is intended as a guide to the eye to interpret the spatial location of data with a certain value of α. Due to the amount of data per dataset (see Table S1 for numerical overview), the top four P (α) plots are given at a 5 • resolution for α and the bottom six at a 10 • resolution. NB: Interestingly, in the P(α) plots for X = S and ElR = water, P is above unity at α = 160 • -180 • for the CSD data, while P < 1 for the PDB data in this same region. In both databases, the P-values are ≥ 1 around α = 105 • . This indicates that the H-bonding geometry (α ≈ 105 • ) is somewhat directional on both databases but that the carbon bonding geometry is more directional only in the CSD. However, the dataset retrieved from the CSD is much smaller (1,546 hits) than the dataset from the PDB (30,725 hits) and the observed feature α = 160 • -180 • in the CSD might well be an artefact. Similarly, the P(α) plots with X = S and an amide O-atom reveal that P ≥ 1 at α = 90 • -105 • , again congruent with a hydrogen bonding geometry. P ≥ 1 also at α = 170 • -180, but only for the PDB data. As this dataset is more voluminous (N = 11,215 vs 2,175 in the CSD), this implies that a carbon bonding geometry is more directional (at least in protein structures). A possible reason for this discrepancy might be that many cysteine and methionine residues are involved in metal coordination or directly methylated thus polarizing the S-C bond. For example, iron-sulphur clusters are held in place by cysteine-Fe coordination bonds [1] and methylated methionine residues are a (crystallographically known) intermediate in methyl transferases. [2]. Figure S2. Cumulative hit fraction (in %) as a function of the van der Waals corrected XCH3···ElR distance d' (in Å) for several datasets from Table S2 (α ≥ 160 • , as illustrated by the inset figure). The interacting pairs involve water-O (red, empty) or amide-O (blue, half-filled) with X = C (squares), N (diamonds), or O (circles). See Figure 2 for the same data plotted as a regular hit fraction. Figure S3. Cumulative (top) and regular (bottom) hit fraction (in %) as a function of the van der Waals corrected XCH3···ElR distance d' (in Å) for all the datasets from Table S2 (α ≥ 160 • , illustrated with the inset figure) that contain ≥ 500 data points. The inset legends show the nature of X (horizontally) and ElR (vertically). NB: In addition to the trends observed and discussed in the main text according to Figure 2, It is interesting to note that in all cases, for X = C the plot is shifted most to longer d' and has the least van der Waals overlap. Moreover, carboxy O-atoms are distributed about the same as amide O-atoms and thio S-atoms about the same as water. Aryls are always randomly distributed with the least amount of van der Waals overlap. Figure S4. Ball and stick representations of perspective views of all molecular adducts listed in Table 1 that were optimized by DFT (B3LYP-D3/def2-TZVP). The thin lines are bond paths (bp's) and the small red spheres are bond critical points (bcp's) obtained from an 'atoms-in-molecules' analysis. The bond density (ρ) is in arbitrary units 102 and bcp's indicative of non-covalent carbon bonding have been highlighted in yellow. Figure S5. Ball and stick representations of molecular adducts selected from Table 1 that were optimized by DFT (B3LYP-D3/def2-TZVP). The thin lines are bond paths (bp's) and the small red spheres are bond critical points (bcp's) obtained from an 'atoms-in-molecules' analysis. The bond density (ρ) is in arbitrary units ·102 and bcp's indicative of non-covalent carbon bonding have been highlighted in yellow.
Funding: This research was funded by the Netherlands Organization for Scientific Research (NWO) grant number 723.015.006.

Conflicts of Interest:
The author declares no conflict of interest.