Introduction

Methylcellulose (MC) is one of the most important classes of cellulose derivatives, in terms of produced amounts and versatility of uses. Several books and reference works (e.g., Grover 1993; Nasatto et al. 2015) can provide readers with a deeper understanding of the respective chemistry, properties, and applications. MC has practical applications in various industries, such as food, cosmetics, paints, and pharmaceuticals, where it is used as a thickener, stabilizer, and emulsifier (Thirumala et al. 2013; Morozova 2020; Forghani and Devireddy 2018). MC exhibits a rich phase behavior, including isotropic, nematic, cholesteric, and smectic phases, which can be tuned by varying the degree of methylation, the concentration in solution, and the temperature (Forghani and Devireddy 2018; Hynninen and Patrakka 2021). Studies have also investigated the mechanical properties of MC gels and films, as well as the interactions of MC with other molecules such as surfactants and proteins. The insights gained from these studies have contributed to our understanding of the self-assembly and dynamics of complex fluids and biological systems, especially with regard to medical applications (Thirumala et al. 2013; Ahlfeld et al. 2020; Bonetti et al. 2021; Biswas 2016).

The complex supramolecular and hierarchical structure of cellulose continues to present challenges to the understanding of its properties and behavior. Often, model compounds of low molecular weight are used to simplify the chemical or physical problems under study, in particular with regard to analytical issues. These model compounds are most frequently monomers and dimers, sometimes oligomers, that mimic building blocks or sections of the backbone of cellulose or cellulose derivatives. The substituents or modifications on the model compounds are meant to reflect the substituents or modifications on the polymeric cellulose chain. Our earlier work has also made extensive use of model compounds, particularly in questions of chromophore formation in oxidatively damaged celluloses (Rosenau et al. 2004, 2017a; Henniges et al. 2013; Korntner et al. 2015) or from hexeneuronic acids (Rosenau et al. 2017b), the identification of residual chromophores in different cellulosic matrices and their destruction in bleaching (Rosenau et al. 2007, 2011), as model systems for cellulose derivatization (Hettegger et al. 2015; Odabas et al. 2016), or in questions of quantification of oxidized groups on the cellulosic polymer (Tot et al. 2008; Röhrling et al. 2002).

Computational studies of molecular shape are dependent on model compounds. For example, the shapes of the polymer can be inferred from the lowest energy shapes of the disaccharide. Besides the obvious role of the direct use of experimental structures in the computation, determination of the effects of substituent groups on the properties of the monosaccharides can inform the necessary assumptions in studies of the polysaccharide where it is still not reasonable to explicitly account for all possible variations of structure. Consider that the cellobiose molecule has 10 exocyclic substituents. With three orientations for any one exocyclic group, there are 3^10 = 59,049 possible geometries for the structure with a single ring shape and fixed geometry of the linkage between the two glucose rings. For a more complete study of the disaccharide, that number would have to be multiplied by 324 combinations of the linkage torsion angles, which gives a total that is too large to be reasonable for a quantum mechanics study. Preliminary assumptions can therefore be helpful in reducing the actual variables.
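The size of this conformational space can be checked with a few lines of Python. This is only an illustrative sketch of the arithmetic in the paragraph above (three orientations for each of ten exocyclic groups, optionally multiplied by 324 linkage combinations); the function and variable names are hypothetical and not part of the original work.

```python
# Illustrative count of starting geometries for cellobiose.
# The numbers 3, 10 and 324 are taken from the text; the names are hypothetical.
ORIENTATIONS_PER_GROUP = 3    # staggered orientations assumed for each exocyclic group
EXOCYCLIC_GROUPS = 10         # exocyclic substituents on cellobiose
LINKAGE_COMBINATIONS = 324    # combinations of the linkage torsion angles


def conformer_count(groups: int, orientations: int, linkage_combinations: int = 1) -> int:
    """Number of geometries for independent groups with a fixed number of orientations."""
    return orientations ** groups * linkage_combinations


fixed_linkage = conformer_count(EXOCYCLIC_GROUPS, ORIENTATIONS_PER_GROUP)
full_scan = conformer_count(EXOCYCLIC_GROUPS, ORIENTATIONS_PER_GROUP, LINKAGE_COMBINATIONS)

print(fixed_linkage)  # 59049
print(full_scan)      # 19131876
```

Even with a single ring shape, the full scan amounts to roughly 19 million starting geometries, which illustrates why assumptions derived from model compounds are needed.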

Based on methyl 4-O-methyl-β-d-glucopyranoside (1) as the reference compound, model compounds of monomeric MC units, with all eight different patterns of methyl substitution (mono-, di- and tri-substituted at O-2, O-3 and O-6) were synthesized (compounds 2–8, Table 1). The multi-step syntheses made heavy use of protecting group chemistry (Yoneda et al. 2016). Each compound was fully analytically characterized, involving full resonance assignment in the 1H and 13C NMR domains and mass-spectrometric data as well as purity confirmation by elemental analysis. The model compounds represent monomeric β-d-glucopyranoside units in methyl celluloses, with the methyl substituents at OH-1 and OH-4 simulating the truncated side chains of the polymeric counterpart. Previous work has demonstrated that the terminal 4-OMe group was crucial to induce crystallization and H-bond patterns resembling the cellulose allomorphs, and that methyl 4-O-methyl-β-d-glucopyranoside (Röhrling et al. 2002; Tot et al. 2008) as well as oligomeric β-methyl glucosides with a terminal 4-OMe group (Mackie et al. 2002; Ruiz Ruiz et al. 2006; Yoneda et al. 2008) are valid model compounds for cellulose in terms of solution and solid-state structural data. This explains why all model compounds presented in this study are derived from the same fundamental structure with both 1-OMe and 4-OMe substituents (Table 1).

Table 1 Synthesized derivatives of methyl 4-O-methyl-β-d-glucopyranoside with different methyl substitution patterns (1–8)

In previous work, compounds 1–8 have been used to study the effects of methyl group substituents at different positions on the hydrolytic stability of glycosidic bonds and methyl substituents in MCs in an aqueous solution (Yoneda et al. 2008; Hosoya et al. 2014) and on physicochemical properties as well as NMR shifts (Karrasch et al. 2009a, b). In this work, we report the solid-state structure of the model compounds, with a focus on the 13C NMR and crystal structure analysis data, as a basis for a later, more in-depth analysis.

Materials and methods

The syntheses of the methylcellulose model compounds 1–8 have been reported previously (Yoneda et al. 2016).

Solid-state NMR

All solid-state NMR experiments were performed on a Bruker Avance III HD 400 spectrometer (resonance frequency 400.13 MHz for 1H and 100.61 MHz for 13C), equipped with a 4 mm dual broadband CP-MAS probe. 13C spectra were acquired by using the TOSS (total sideband suppression) sequence at ambient temperature with a spinning rate of 5 kHz, a cross-polarization (CP) contact time of 2 ms, a recycle delay of 2 s, and SPINAL-64 1H decoupling. 2 k data points were sampled with an acquisition time of 43 ms resulting in a total sweep width of 240 ppm. Chemical shifts were referenced externally against the carbonyl signal of glycine with δ = 176.03 ppm. The acquired FIDs were apodized with an exponential function (lb = 11 Hz) prior to Fourier transformation.
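The acquisition parameters quoted above can be cross-checked against each other. The sketch below assumes that "2 k" means 2048 points counted in the Bruker manner (real plus imaginary points together), so that the acquisition time is the number of points divided by twice the sweep width in Hz. That convention and all names in the code are assumptions made for illustration; they are not stated in the text.

```python
# Consistency check of the 13C acquisition parameters (illustrative sketch).
# Assumption: the point count is a Bruker-style TD (real + imaginary points),
# so that AQ = TD / (2 * SW_Hz). Names are hypothetical.

def acquisition_time_s(td_points: int, sweep_width_ppm: float, obs_freq_mhz: float) -> float:
    """Acquisition time in seconds for a given point count and sweep width."""
    sweep_width_hz = sweep_width_ppm * obs_freq_mhz  # ppm * MHz gives Hz
    return td_points / (2.0 * sweep_width_hz)


aq = acquisition_time_s(td_points=2048, sweep_width_ppm=240.0, obs_freq_mhz=100.61)
print(f"sweep width: {240.0 * 100.61:.0f} Hz, acquisition time: {aq * 1e3:.1f} ms")
```

Under these assumptions the result is roughly 42 ms, which agrees with the reported 43 ms to within rounding of the sweep width.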

Solution NMR

All solution NMR spectra were recorded on a Bruker Avance II 400 spectrometer (resonance frequency 400.13 MHz for 1H and 100.61 MHz for 13C) equipped with a 5 mm observe broadband probe head (BBFO) with z-gradients at room temperature with standard Bruker pulse programs. The samples were dissolved in 0.6 mL of CDCl3 (99.8% D) or methanol-D4 (99.8% D). Chemical shifts are given in ppm, referenced to the respective residual solvent signals. 1H NMR data were collected with 32 k complex data points and apodized with a Gaussian window function (lb = −0.3 Hz and gb = 0.3 Hz) prior to Fourier transformation. 13C-jmod spectra with WALTZ16 1H decoupling were acquired using 64 k data points. Signal-to-noise enhancement was achieved by multiplication of the FID with an exponential window function (lb = 1 Hz). All two-dimensional experiments were performed with 1 k × 256 data points, while the number of transients and the sweep widths were optimized individually.

X-ray crystallography

Single crystal X-ray data were collected on a Bruker Kappa APEX-2 CCD diffractometer with a nitrogen gas cryostream cooler (Oxford Cryosystems) and a Bruker AXS Smart APEX CCD diffractometer using graphite-monochromatized Mo-Kα radiation (λ = 0.71073 Å) and 0.5° ϕ- and ω-scan frames usually covering complete Ewald spheres with θmax = 30°, except for compound 6, which was measured on a STOE Stadivari instrument (Eulerian 4-circle diffractometer, frame width 0.36°, 6892 frames, detector distance = 40 mm). Absorption corrections were applied with the program SADABS, and the structures were solved with direct methods and refined on F² (Bruker AXS, 2001: programs SMART, version 5.626; SAINT, version 6.36A; SADABS, version 2.05; XPREP, version 6.12; SHELXTL, version 6.10. Bruker AXS Inc., Madison, WI, USA). Non-hydrogen atoms were refined anisotropically. C-bonded H atoms were placed in calculated positions and thereafter refined as riding (CH3 groups refined in orientation using AFIX 137). O-bonded hydrogen atoms were located in Fourier syntheses and were then refined with a restraint that kept the O–H bond distance and the C–O–H angle fixed at 0.84 Å and 109.5°, respectively, but permitted rotation about the C–O(H) bond axis (AFIX 147 of program SHELXL) with Uiso(H) = 1.5 Ueq(O). The absolute structures could not be determined for any of the compounds from the very weak anomalous dispersion effects and were assigned from the known absolute configuration of the glucose residue.
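The nominal resolution limit implied by these data collection settings follows from Bragg's law, d = λ / (2 sin θ). The short sketch below evaluates it for the Mo-Kα wavelength and θmax quoted above. The resulting value is derived here for illustration and is not reported in the text, and the function name is hypothetical.

```python
import math

# Nominal resolution limit from Bragg's law, d = lambda / (2 sin(theta)).
# Wavelength and theta_max are taken from the text; the function name is hypothetical.

def d_min_angstrom(wavelength_angstrom: float, theta_max_deg: float) -> float:
    """Smallest resolvable lattice spacing for a given maximum Bragg angle."""
    return wavelength_angstrom / (2.0 * math.sin(math.radians(theta_max_deg)))


d_min = d_min_angstrom(0.71073, 30.0)
print(f"d_min = {d_min:.3f} Å")
```

Because sin 30° = 0.5, the limit equals the wavelength itself, about 0.71 Å.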

Compounds 1–8 formed colorless or white crystals (for form, appearance and crystallization conditions see Table 3). A suitable crystal was mounted on a glass fiber in each case and examined by X-ray single crystal diffraction at RT. The deposited Cambridge Crystallographic Data Center (CCDC) files (see Table 3) contain the supplementary crystallographic data for compounds 1–8. These data can be obtained free of charge via www.ccdc.cam.ac.uk/data_request/cif, by emailing data_request@ccdc.cam.ac.uk, or by contacting The Cambridge Crystallographic Data Centre, 12, Union Road, Cambridge CB2 1EZ, UK; fax: +44 1223 336033.

Results and discussion

The solid-state NMR spectra and data of compounds 1–8 (Fig. 1 and Table 2) showed some interesting deviations from the solution NMR counterparts. While in the solution NMR every C atom gave an unambiguous signal, in solid-state NMR two compounds showed more resonances than would have been expected from the structural formulae. Thus, based on the number and intensities of the signals, two magnetically inequivalent entities had to be assumed for compound 7 (the 1,3,4,6-methylated derivative), and even three for compound 6 (the 1,2,4,6-methylated derivative). Generally, there was a quite pronounced effect of the solid-state packing on the solid-state chemical shifts. This effect is evidently canceled out in solution, where the solid-state environment is lost and the molecules undergo Brownian motion, which renders the sample isotropic. The differences in chemical shifts for structurally analogous C atoms are therefore much larger in the solid-state NMR spectra than in the solution spectra. For example, in the solid state, the chemical shift values of C-1 ranged between 102.5 and 105.7 ppm (Δδ = 3.2 ppm) and those of C-4 between 77.7 and 82.2 ppm (Δδ = 4.5 ppm), compared to 105.2–105.4 ppm (Δδ = 0.2 ppm) and 78.3–78.6 ppm (Δδ = 0.3 ppm), respectively, in solution. The methoxy group resonances in the solid state covered a rather wide shift range of ~5 ppm, while in solution the shift differences between the compounds were smaller than 0.3 ppm. Methyl substitution caused a significant down-field shift of the 13C resonances, approximately between 9 and 12 ppm for C-2, 8–11 ppm for C-3 and 11 ppm for C-6 (Table 2). A methoxy-substituted C-6 is found at > 70 ppm and thus shifted close to the region typical for C-2 to C-5 in non-substituted hexopyranoses. A list with all 13C resonances and their full assignments, including those of the methyl substituents, can be found in the Supplementary Information.

Fig. 1 Solid-state 13C CPMAS NMR spectra of methyl β-d-glucopyranoside (top spectrum) and the differently methylated methyl 4-O-methyl β-d-glucopyranoside derivatives (1–8)

Table 2 13C CPMAS NMR chemical shifts (ppm) of the differently methylated methyl 4-O-methyl β-d-glucopyranoside derivatives (1–8). For the methyl substituents' shifts and the comparison to solution NMR, see Supplementary Information

Over time, we succeeded in crystallizing all eight compounds to a quality sufficient for structure determination by X-ray diffraction. The resulting geometries are summarized in Table 3, along with selected compound data. Crystal data and structure refinement details, listings of bond lengths and angles as well as packing diagrams for all eight compounds are compiled in the Supplementary Information. As already suspected from the solid-state NMR data, compound 6 contained three independent molecules per unit cell, and compound 7 two independent molecules. This explains and confirms the occurrence of multiple resonances per C atom in the solid-state 13C NMR spectra of these two compounds. Compound 4 crystallized as the hydrate.

Table 3 Crystal structures (ORTEP images, 40% thermal ellipsoids) and selected compound data (sum formula, molecular weight, methylation pattern, optical appearance and solvent used for crystallization) of the differently methylated methyl 4-O-methyl β-d-glucopyranoside derivatives (1–8)

One point of interest in detailed studies of monosaccharide structure is the ring shape. This is conveniently described in the language of ring puckering. In the Cremer-Pople (1975) puckering space (see Fig. 2), the Θ parameter determines the type of ring shape, with 0° and 180° being the 4C1 and 1C4 chairs and 90° being the boats and skew conformations. The half chairs and envelopes are at about 52° and 128° in Θ. The Φ parameter indicates the extent of the pseudorotation. All compounds were clearly in the 4C1 domain, with the rings being somewhat distorted. Because of the glucopyranose ring oxygen atom (O5), a perfect 4C1 chair would have a Θ value of about 7°. Coincidentally, all these structures have Φ values close to 0°, where O5 would be the atom out of plane if the other five atoms of the structure were in a plane. A complete map of the different ring forms with MM3 energy contours is given in Dowd et al. (1994). Note that there are only two unique chair forms because 4C1 is the same as OC3 and 2C5, and 1C4 is the same as 3CO and 5C2. The structure that deviated most from the perfect chair was the one substituted at positions 1, 2, 3, and 4. It is shown in tube representation in Fig. 2b, with O5 being higher above a mean plane than the other atoms. Still, Θ is only 15.5°; a Θ of 26° (half of 52°) would mean a conformation resembling OE more closely than 4C1. The values of the puckering amplitude, Q, are all very similar, with the Q values for the most and least substituted rings being 0.575 and 0.572 Å, respectively. Other than the concentration of the structures close to Φ = 0°, there is nothing particularly unusual about the puckering in compounds 1–8. There is no evidence for any particular influence of methyl substitution on the ring shape.
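For readers who wish to reproduce such an analysis, the following is a minimal sketch of the Cremer-Pople calculation for a six-membered ring using NumPy. It is not the program used in this work. The function name, the idealized test ring and its dimensions are hypothetical, and the atoms are assumed to be supplied in ring order (for pyranoses conventionally O5, C1, C2, C3, C4, C5). Which pole (Θ = 0° or 180°) a given chair falls on depends on that ordering.

```python
import numpy as np


def cremer_pople_6(coords):
    """Return (Q, theta_deg, phi_deg) for six ring atoms given in ring order.

    coords: array-like of shape (6, 3), Cartesian coordinates in angstroms.
    """
    r = np.asarray(coords, dtype=float)
    if r.shape != (6, 3):
        raise ValueError("expected six atoms with three coordinates each")
    r = r - r.mean(axis=0)                      # origin at the geometric center
    j = np.arange(6)
    ang = 2.0 * np.pi * j / 6.0
    r1 = (r * np.sin(ang)[:, None]).sum(axis=0)
    r2 = (r * np.cos(ang)[:, None]).sum(axis=0)
    n = np.cross(r1, r2)
    n /= np.linalg.norm(n)                      # unit normal of the mean plane
    z = r @ n                                   # out-of-plane displacements
    a = np.sqrt(2.0 / 6.0) * np.sum(z * np.cos(2.0 * ang))    # q2 * cos(phi)
    b = -np.sqrt(2.0 / 6.0) * np.sum(z * np.sin(2.0 * ang))   # q2 * sin(phi)
    q3 = np.sum(z * (-1.0) ** j) / np.sqrt(6.0)
    q2 = np.hypot(a, b)
    total_q = np.hypot(q2, q3)                  # total puckering amplitude Q
    theta = np.degrees(np.arctan2(q2, q3))      # 0..180 degrees
    phi = np.degrees(np.arctan2(b, a)) % 360.0  # undefined in practice when q2 is ~0
    return total_q, theta, phi


# Hypothetical idealized chair: atoms on a circle, alternately above and below the plane.
h = 0.25
angles = 2.0 * np.pi * np.arange(6) / 6.0
chair = np.column_stack([1.4 * np.cos(angles), 1.4 * np.sin(angles), h * (-1.0) ** np.arange(6)])
Q, theta, phi = cremer_pople_6(chair)
print(f"Q = {Q:.3f} Å, theta = {theta:.1f}°")
```

For an ideal chair all puckering resides in q3, so Θ sits at one of the poles and Q equals the square root of the summed squared displacements. Real structures, such as those discussed above, deviate from the pole by the Θ values given in the text.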

Fig. 2 Computational ring puckering analysis of the derivatives 1–8. a A portion of Cremer-Pople (C–P) space. Black diamonds indicate different structures, bold single digits give the number of substituents, the comma-delimited numbers are the positions of the methyl groups (including O-1 and O-4), and the decimal fractions show the amplitude of the puckering. b Tube representation of compound 5, the derivative that deviates most from the perfect chair conformation

The anomeric effect is the unexpectedly high concentration of α-glucose in solution equilibrium with the β-glucose anomer. In the present studies, mutarotation is blocked by the formation of the methyl glucoside, but there are other manifestations of the stereoelectronic arrangement for the atoms in the sequence involving the ring atoms C5, O5, C1, O1, and the methyl carbon attached to the ring by O1. Table 4 shows that the C1-O1 bond, at about 1.39 Å, is significantly shorter than the other C–O bonds, whereas the other bonds are all very similar. Various attempts to find a correlation between these bond lengths and the presence or absence of methyl substituents failed.

Table 4 Bond lengths at the anomeric and exo-anomeric centers for the different methylated glucose structures

The O3-H…O5’ hydrogen bond is a frequent finding in molecules related to cellulose, starting with cellobiose. The present work (Table 5) shows, however, that O5 is not a strong acceptor of hydrogen bonds in the environments provided by these methylated sugars. Only two of the 11 O5 atoms (one per independent molecule: three for compound 6, two for compound 7, and one for each of the other six compounds) act as acceptors of methyl group hydrogen atoms, and no other donors were observed. Similarly, seven of the O1 atoms do not accept hydrogen bonds; the remaining four include one that accepts from the water of hydration and three that accept C–H…O hydrogen bonds. On the other hand, nine of the O6 atoms accept hydrogen bonds, as do eight each of the O3 and O4 atoms. O2 atoms accept seven hydrogen donations. The difference between O1 and O4 in these monosaccharides may be attributed in part to the rather short C1-O1 bond and other electronic properties. In the cellulose disaccharides, the short C1-O(linkage) bond is maintained, as is the longer O(linkage)-C4 bond (Yoneda et al. 2008).

Table 5 Hydrogen bonding activity at the various oxygen atoms

The orientation of the primary alcohol group is always of interest, with three minima in the calculated energy for complete rotation of the O6 group around the C5-C6 bond. Extensive surveys of related molecules in the crystallographic database show that the preferred O6 orientation is gauche to both O5 and C4 (the gg orientation). In contrast, nine of the 11 O6 groups here were gauche to O5 and trans to C4 (the gt conformation). Although O6 in native cellulose, nature’s most prevalent carbohydrate compound, adopts the remaining conformation, trans to O5 and gauche to C4 (tg), this conformation is found in only a small minority of carbohydrates. When the original, biosynthesized cellulose structure is disrupted by dissolution or even swelling by NaOH or amines, the O6 re-crystallizes in the gt form.
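The gg/gt/tg nomenclature can be expressed as a simple rule on the two torsion angles O5-C5-C6-O6 and C4-C5-C6-O6. The sketch below is an illustration of that rule only. The 120° cut-off between gauche and trans and the function names are assumptions introduced here, not criteria taken from this work.

```python
# Classify the O6 orientation from two torsion angles (illustrative sketch).
# First letter: relation of O6 to O5; second letter: relation of O6 to C4.
# The 120 degree boundary between gauche and trans is an assumed cut-off.

def _gauche_or_trans(torsion_deg: float) -> str:
    wrapped = (torsion_deg + 180.0) % 360.0 - 180.0   # map into [-180, 180)
    return "g" if abs(wrapped) < 120.0 else "t"


def classify_o6(o5_c5_c6_o6_deg: float, c4_c5_c6_o6_deg: float) -> str:
    """Return 'gg', 'gt' or 'tg'; 'tt' cannot occur for a tetrahedral C5 and is flagged."""
    label = _gauche_or_trans(o5_c5_c6_o6_deg) + _gauche_or_trans(c4_c5_c6_o6_deg)
    return label if label != "tt" else "unclassified"


# Idealized staggered values, used here only as examples
print(classify_o6(-60.0, 60.0))    # gg
print(classify_o6(60.0, 180.0))    # gt
print(classify_o6(180.0, -60.0))   # tg
```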

Neither of the two gg O6 groups is derivatized. The gg O6 of the 1,2,4 structure participates in an O3-H…O6-H…O4 sequence of hydrogen bonds, and the O6 of the 1,3,4 structure is in a similar sequence involving O2-H…O6-H…O4. All of the O1 atoms are substituted with methyl groups, and they all take a position close to O5 and trans to C2 (mean C7-O1-C1-O5 torsion angle of 49(6)°, where the value in parentheses is the standard deviation).
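Torsion angles such as C7-O1-C1-O5 can be computed from four sets of Cartesian coordinates. The following is a generic sketch using NumPy, with the sign following the usual convention (positive for a clockwise rotation when viewed along the central bond). The function name and the test coordinates are hypothetical and unrelated to the crystal structures reported here.

```python
import numpy as np


def torsion_deg(p0, p1, p2, p3) -> float:
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1                      # from the second atom back to the first
    b1 = p2 - p1                      # central bond
    b2 = p3 - p2                      # from the third atom to the fourth
    b1 /= np.linalg.norm(b1)
    v = b0 - np.dot(b0, b1) * b1      # component of b0 perpendicular to the central bond
    w = b2 - np.dot(b2, b1) * b1      # component of b2 perpendicular to the central bond
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


# Hypothetical test geometry with a right-angle torsion
angle = torsion_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
print(f"{angle:.1f}")  # 90.0
```

Applied to the atoms C7, O1, C1 and O5 of each independent molecule, such a function would give the individual values from which a mean and standard deviation like those quoted above are obtained.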

Returning to the question of secondary alcohol substituent group orientations raised in the introduction, some patterns can be found that add to our knowledge of carbohydrates generally, and cellulose derivatives specifically. Unlike the primary alcohol group (O6 atom) and the methyl group on the anomeric carbon, which take more or less staggered orientations relative to their tetrahedral adjacent atoms, the methyl substituents on O2, O3, and O4 are oriented with the methyl carbon close to eclipsing their respective methine ring hydrogen atoms. Similar findings for galactose pentaacetate were reported earlier, supported by density functional theory calculations (Thibodeaux et al. 2002). Figure 3 shows the distribution of orientations of the secondary methoxy groups as torsions to their associated methine hydrogen atom.

Fig. 3 Distributions of torsion angles for H–C–O–C of the methyl substituents on ring carbons 2, 3, and 4

Visual examination of the crystal structures of 1–8 showed that the methyl groups were arranged in such a way that they mostly had one of their three hydrogen atoms in a plane that included the methine protons on the ring atoms. This is shown for the first member of the series in Fig. 4, with all structures shown in the Supplementary Information. Participation of the methyl group hydrogen atoms in these planes requires particular values of the torsion angle for the H–C–O–C sequence from the ring to the methyl group and the torsion angle for rotation about the O-Me bond.

Fig. 4 The 1,4-dimethyl structure, shown with planes drawn through the methine hydrogen atoms (H2, H4 and H6b on the top side of the ring, and H1, H3, and H5 on the bottom side). These mean planes also include a hydrogen atom from the methyl groups on the 1 (lower plane) and 4 (upper plane) oxygen atoms. The intersections of the planes with these hydrogen atoms are shown by the approximately half white and half red colorings, with the white portions being above the rose-colored planes and the red half spheres being below them

Conclusions

We have provided the solid-state 13C NMR data with complete resonance assignment and the crystal structure data of the methylcellulose model compounds 1–8, along with an analysis of their solid-state molecular geometry, the packing data and the hydrogen bond systems.