‘Naked’ and Hydrated Conformers of the Conserved Core Pentasaccharide of N-linked Glycoproteins and Its Building Blocks

N-glycosylation of eukaryotic proteins is widespread and vital to survival. The pentasaccharide unit −Man3GlcNAc2– lies at the protein-junction core of all oligosaccharides attached to asparagine side chains during this process. Although its absolute conservation implies an indispensable role, associated perhaps with its structure, its unbiased conformation and the potential modulating role of solvation are unknown; both have now been explored through a combination of synthesis, laser spectroscopy, and computation. The proximal −GlcNAc-GlcNAc– unit acts as a rigid rod, while the central, and unusual, −Man-β-1,4-GlcNAc– linkage is more flexible and is modulated by the distal Man-α-1,3– and Man-α-1,6– branching units. Solvation stiffens the ‘rod’ but leaves the distal residues flexible, through a β-Man pivot, ensuring anchored projection from the protein shell while allowing flexible interaction of the distal portion of N-glycosylation with bulk water and biomolecular assemblies.


■ INTRODUCTION
N-glycosylation of proteins is a highly conserved process in all eukaryotes whereby complex oligosaccharides (glycans) are cotranslationally appended to the side-chain asparagine γ-nitrogen of the consensus motif NXS/T (Figure 1). 1,2 N-Glycans display a rich structural diversity 3 and are vital both for the correct folding of nascent glycoproteins, where interactions with the chaperone lectins calnexin and calreticulin are key, 4 and for the correct functioning of the mature glycoproteins, where glycans play crucial roles in such processes as protein targeting, intercellular signaling, and host−pathogen recognition. 5,6 Common to these diverse roles and the many structural possibilities for N-glycans is the conserved presence of an invariant core pentasaccharide (Man 3 GlcNAc 2 −) (Figure 1a). This presence in all mammalian N-glycoproteins (regardless of protein location or function) and the highly conserved nature of N-glycosylation among eukaryotes suggest an indispensable (and perhaps general) role for the core pentasaccharide in facilitating the proper functioning of N-glycans. 4,7,8 Despite its ubiquity, however, the underlying physical reasons for its conservation are not yet fully understood. 9 NMR measurements of model glycopeptides in aqueous solution, coupled with molecular dynamics investigations, 10,11 have exposed the influence of the proximal chitobiose (−GlcNAc-GlcNAc−) stem on the local peptide conformation and suggested an explanation of nature's choice of N-acetyl-D-glucosamine (GlcNAc) over the more plentiful D-glucose. Does the chosen chitobiose stem play other structural roles? Does the notably rare β-mannoside (Man) central linkage between this stem and the trimannoside (Man 3 ) unit impart any unique structural or functional benefits, and do these units (and other building blocks) function independently or in concert?
Similar NMR and molecular dynamics investigations of the naturally occurring Man 9 GlcNAc 2 N-glycan in aqueous solution have suggested enduring structural features in the core pentasaccharide, incorporating locally bound water molecules. 12,13 Does the core pentasaccharide present any favored water pockets, and are its shape and stiffness influenced by solvent-mediated interactions?
One approach to these various questions is to explore the structural mechanics of the individual building blocks, in isolation and then linked together, using a combination of synthetic oligosaccharide assembly, mass- and conformer-selected infrared laser spectroscopy under molecular beam conditions, molecular mechanics, and quantum chemical calculations. This allows determination of their preferred (inherent) conformational structures when isolated in the gas phase, to provide basic structural benchmarks; then interrogation of the locations and structural consequences of their microhydration; and finally, their response to a bulk, aqueous environment.
Early stages of this approach have previously been applied to the NXS sequon 14 and the trimannoside unit. 15 We now report the culmination of this strategy, applied here to dissect the unbiased structure of the entire core pentasaccharide and then to understand the effects of hydration upon it. Target oligosaccharide building blocks (1−4, Figure 1b) were designed in a form that provided a chromophore required for detection through mass-selected ultraviolet (UV) photoionization. 16 Syntheses of the chitobiose (GlcNAc-β-1,4-GlcNAc) stem 1, the connecting disaccharide (Man-β-1,4-GlcNAc) 2, the extended trisaccharide (Man-β-1,4-GlcNAc-β-1,4-GlcNAc) 3, and finally the complete core pentasaccharide (Man 3 GlcNAc 2 ) 4, all as their phenyl (Ph) glycosides, were achieved on the gram scales required for their gas-phase interrogation. Their conformational preferences have been determined in the gas phase using infrared ion depletion (IRID) laser spectroscopy 16 coupled with molecular mechanics, density functional, and ab initio theoretical calculations. Those of their discretely hydrated (and 'blocked') complexes were determined through theoretical calculation. Comparisons with their preferred conformations in aqueous solution, reported for chitobiose 17 and the trimannoside (Man 3 ) 18−20 and computed here for the core pentasaccharide through molecular mechanics calculations, reveal a unique insight into the inner workings of the core pentasaccharide and the molecular scaffolding provided by it and its components.
■ METHODS
Experimental Section. Full details of all synthetic procedures and characterization are provided in the Supporting Information (SI). The carbohydrates were vaporized by laser desorption into an expanding supersonic jet of argon before passing through a 2 mm skimmer to create a collimated molecular beam, which then intersected tunable UV and IR laser beams in the extraction region of a linear time-of-flight mass spectrometer (Jordan). Conformer-specific, mass-selected spectra were recorded in the UV and IR using UV−UV and IR−UV ion dip (IRID) double resonance spectroscopy. 14

Computational. The calculations began with completely unrestricted and exhaustive surveys of the conformational landscapes of each of the carbohydrate 'building blocks' and their singly hydrated complexes, using a molecular mechanics method (MMFFs force field). 21 Structures with relative energies <15 kJ mol −1 , together with additional representative structures that might have a significant population in the cooled adiabatic expansion (see SI), were reoptimized through density functional theory calculations (B3LYP/6-311+G*) using the Gaussian 09 program package 22 (and two supercomputers, i2BASQUE and SGIker, employing up to 96 processors per calculation) to provide a more accurate energy ranking of the lowest-energy structures and their associated harmonic vibrational spectra. Zero-point corrected relative energies were computed through subsequent single-point ab initio calculations (MP2/6-311++G**), and final optimizations were based upon comparisons with the experimental spectra themselves, to provide feedback and guide the 'fine-tuning' of the predicted structures.
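The energy-window step of this workflow can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: the `Conformer` class, the `select_for_reoptimization` helper, and every energy value are hypothetical, and only the 15 kJ mol −1 cutoff is taken from the text. The additional hand-picked representative structures mentioned above are not modeled.

```python
from dataclasses import dataclass

CUTOFF_KJ_MOL = 15.0  # relative-energy window quoted in the text


@dataclass
class Conformer:
    name: str              # hypothetical label
    energy_kj_mol: float   # hypothetical force-field energy


def select_for_reoptimization(conformers, cutoff=CUTOFF_KJ_MOL):
    """Return (name, relative energy) pairs below the cutoff, lowest first."""
    if not conformers:
        return []
    e_min = min(c.energy_kj_mol for c in conformers)
    selected = [
        (c.name, c.energy_kj_mol - e_min)
        for c in conformers
        if c.energy_kj_mol - e_min < cutoff
    ]
    return sorted(selected, key=lambda pair: pair[1])


# Made-up survey output, for illustration only.
survey = [
    Conformer("conf_a", -512.4),
    Conformer("conf_b", -509.1),
    Conformer("conf_c", -495.0),
    Conformer("conf_d", -500.2),
]
shortlist = select_for_reoptimization(survey)
for name, rel_e in shortlist:
    print(f"{name}: {rel_e:.1f} kJ/mol above the minimum")
```

In this toy survey `conf_c` lies more than 15 kJ mol −1 above the minimum and is dropped; the remaining structures would be passed on to the higher-level calculations.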
The core pentasaccharide conformational structure was also explored exhaustively, using the molecular mechanics OPLS2005 23 and GLYCAM06/AMBER 24 force fields. (OPLS2005 possessed the fewest low-quality stretch, bend, and torsional parameters when benchmarked for the carbohydrate segments against all other native force fields found in the Macromodel software; GLYCAM06/AMBER (not included in Macromodel) is referenced specifically for carbohydrate structures.) All stereogenic centers were preserved in the computational search, and the ring-opening method of Still 25 was used to explore additional ring conformations (full details of all calculations are given as SI).

Figure 1. (a) [...] is attached to asparagine residues of glycoproteins (N, S, and T denote asparagine, serine, and threonine; X denotes any amino acid but not proline). Proteins are cotranslationally modified with a tetradecasaccharide which is tailored by glycosyl-hydrolase and transferase enzymes to create diverse glycans with varying antennae but all based upon the conserved core pentasaccharide (gray box). (b) The structures and symbol representations of the core pentasaccharide 4 and the building blocks 1−3 from key regions of 4 used in this study. The naming convention A−E is used in this manuscript to identify the individual glycosyl residues. The site of the benign chromophore used in this study mimics the location of the protein scaffold or truncated glycan (shown by a red star).

■ RESULTS AND DISCUSSION
Design of the Glycan Targets. The target core pentasaccharide 4 and its building blocks 1−3 (Figure 1b) were synthesized as phenyl β-glycosides, to provide a UV chromophore, 16 installed at the structurally benign reducing terminus. The target sugars were all required in gram-scale quantities to provide sufficient material for spectroscopic investigation in the gas phase. While such scales have been accomplished for certain smaller mono- and disaccharides, large-scale synthesis is rare for larger glycans, such as the core pentasaccharide, and required the development of efficient synthetic routes. This dictated a global approach to all targets that would allow flexible reuse of key building blocks and intermediates in more than one target (Figure 2).
An additional key synthetic challenge was the formation of the central β-D-mannoside linkage in 2−4, notoriously difficult in carbohydrate chemistry. 26−29 Several elegant methods have been reported for the direct stereoselective synthesis of β-D-mannosides; 30−38 however, selectivities can sometimes be low, giving unwanted α-mannoside side products 32 that are difficult to remove through purification. Notably, <90% β-selectivity is the best reported to date for direct β-mannosylation of the OH-4 position of a GlcNAc precursor that would correspond to the synthesis of the core pentasaccharide. 34,39,40 Since this would place severe limitations on the effective construction of grams of the chromophore-equipped targets, an alternative strategy, with potential for greater selectivity for the β-mannoside linkage, was chosen. This took advantage of neighboring group participation of a levulinate ester stereodirecting group (Lev) at the C-2 position of an unnatural glucoside residue to allow the formation of a corresponding β-glucoside linkage. The levulinate ester may be selectively deprotected. Subsequent stereospecific inversion of the configuration at C-2 (I in Figure 2) would then allow conversion of this unnatural residue from the β-D-gluco to the natural β-D-manno configuration. We chose to accomplish this critical inversion through S N 2 displacement at a late stage of our syntheses, envisaging that, despite the strategic risk, this might afford more rigid intermediates that, by virtue of their reduced conformational flexibility, would be less prone to unwanted decomposition (e.g., elimination) pathways.
Synthesis of the Proximal (GlcNAc 2 1), Central (Man-GlcNAc 2), and Extended Stem (Man-GlcNAc-GlcNAc 3) Building Blocks. A suitably chromophoric 'stem' unit of the core pentasaccharide was prepared as the phenyl β-chitobioside 1 (Figures 2 and S1) from the parent monosaccharide D-glucosamine 5. A common thioglycoside divergent precursor 6 41 was prepared with participatory phthalimide N-2 protection in six steps from D-glucosamine. 6 was divergently elaborated to both donor 7 42 and acceptor 8 43 (through OH-4-regioselective reductive benzylidene ring opening); use of a trichloroacetimidate donor group in 7 allowed its activation by TMSOTf without disruption (or aglycon exchange) of the S-ethyl (−SEt) group in acceptor 8 (Figures 2 and S1) to assemble 9. The phenyl chromophore was then introduced using phenol through activation of the −SEt group in 9 with N-iodosuccinimide (NIS) and TMSOTf; global deprotection of the resulting phenyl β-glycoside 10 afforded the free, chromophore-tagged stem (GlcNAc) 2 disaccharide unit 1.
Figure 2. Schematic of synthetic strategy toward the core pentasaccharide and its building blocks. Synthesis of the core pentasaccharide 4 and its building blocks 1−3 shown using symbol representations. The chromophore used in this study, phenyl, is shown by a red star. For full synthetic details and structures see Figure S1 for the steps describing the conversion of 5 to 1−3 and intermediate 9; Figure S2 for conversion of 18, 19, and 9 onward to 4. All glycosylations were accomplished with >98% stereoselectivity for α-(αG) or β-(βG) glycosidic linkages through the use of participatory C-2 ester substituents (Ac or Lev).

The central disaccharide Man-β-1,4-GlcNAc unit 2 contains the unusual and challenging β-D-mannoside linkage. An acceptor 12 already containing the phenyl chromophore was prepared from 6 (using phenol and NIS/TMSOTf-mediated activation followed by regioselective reductive benzylidene ring opening to reveal OH-4). 12 was glycosylated using the thio-D-glucoside donor 11 containing a C-2 Lev stereodirecting group (Figures 2 and S1); activation (NIS/TMSOTf) gave disaccharide 13 with excellent (>98%) β-D-stereoselectivity. The crucial C-2 epimerization (vide supra) from β-D-gluco- to β-D-manno-configuration (13 → 14, Figure S1) was accomplished in 80% yield over three steps (selective Lev cleavage, triflate formation, and ultrasound-assisted S N 2 displacement with acetate). 44−47 The necessity for ultrasound treatment to promote triflate displacement was consistent with the expected stability afforded by the conformational restriction of the glucoside 13 by its 4,6-O-benzylidene acetal; this stability also ensured minimal side reaction through competing elimination. Global deprotection of 14 yielded chromophore-tagged 2 in 70% yield over four steps.
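To put the multistep yields quoted above into perspective, the overall yield can be converted into an equivalent average yield per step (the geometric mean). The helper below is a hypothetical illustration; the individual step yields are not stated in the text, so the result says nothing about how the loss was actually distributed across the steps.

```python
def mean_step_yield(overall_yield, steps):
    """Geometric-mean yield per step for a given overall fractional yield."""
    if not 0.0 < overall_yield <= 1.0:
        raise ValueError("overall_yield must be a fraction in (0, 1]")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return overall_yield ** (1.0 / steps)


# Overall yields taken from the text; the per-step values are derived averages.
epimerization = mean_step_yield(0.80, 3)   # 13 -> 14, three steps
deprotection = mean_step_yield(0.70, 4)    # 14 -> 2, four steps

print(f"epimerization: about {epimerization:.1%} per step on average")
print(f"deprotection:  about {deprotection:.1%} per step on average")
```

Both sequences therefore correspond to average step yields above 90%, which is the level of efficiency needed for the gram-scale material requirement to be practical.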
Synthesis of the Chromophore-Tagged Core Pentasaccharide 4. Consistent with our strategy, the most challenging target, the core pentasaccharide 4, was created (Figures 2 and S2) using the previously generated intermediate (GlcNAc) 2 S-ethyl disaccharide 9. This (GlcNAc) 2 disaccharide was elaborated to a suitable acceptor 20, again through regioselective protecting group manipulation to reveal OH-4′, and coupled in a convergent manner with disaccharide donor 22. 22 itself was assembled from D-manno trichloroacetimidate 18 48 and the thioglucoside diol 19. 49 Although 19 contains two potential sites for reaction (OH-2 and OH-3), TMSOTf-activated glycosylation proceeded with both excellent regio- (OH-3 over OH-2) and stereo- (>98% α-manno) selectivities to give 21 50 in 76% yield. Advantageously, this left free the OH-2; direct installation of the stereodirecting Lev group onto OH-2 gave 22.
Unusually, our intended disaccharide 20 + disaccharide 22 [2 + 2] glycosylation utilized a donor and an acceptor both containing the same −SEt anomeric substituent; ordinarily this would give rise to unwanted coincident activation during glycosylation. However, by using an iterative glycosylation strategy, 51 thioglycoside 22 was successfully preactivated first (with triflic anhydride and diphenylsulfoxide in the presence of stabilizing base DTBMP, allowing its successful use as a donor). Subsequent addition of 20 then led to its reaction only as an acceptor nucleophile (without −SEt activation) and gave tetrasaccharide 23, in good yield and with excellent stereoselectivity for the critical central β-linkage (>98%). This therefore directly provided 23 bearing a subsequently useful −SEt group, thereby avoiding functional group manipulations at this reducing terminus.
Before installation of the final α(1,6)-D-mannosyl unit of the core pentasaccharide, we performed the critical inversion of the central unit from β-D-gluco- to β-D-manno-configuration (23 → 24) (Figures 2 and S2) as for blocks 2 and 3 (chemoselective Lev deprotection, triflate formation, and ultrasound-mediated S N 2 displacement with acetate), again in excellent yields. Next, benzylidene removal revealed OH-4 and OH-6 in 25. Regioselective OH-6 glycosylation with mannosyl donor 18 (used for 22) gave pentasaccharide 26; here the use of a compatible trichloroacetimidate donor again led to retention of the useful −SEt group at the reducing terminus. Acetylation and subsequent NIS/TMSOTf-mediated activation of the −SEt with phenol installed the chromophore into protected, phenyl β-pentasaccharide 27. Three high-yielding deprotection reactions gave the final chromophore-tagged, core pentasaccharide 4 in gram-scale quantities.
The Gas-Phase IRID Spectra and Structures of Building Blocks 1−3. The R2PI spectrum of 1, the chitobiose (GlcNAc B -β1,4-GlcNAc A ) stem that links the core pentasaccharide to protein, presented two overlapping components (shown later in Figure 5 and in Figure S3). Their associated IRID spectra were in best accord with its two lowest-energy conformers, termed 1-trans and 1-cis (Figure 3a), in which the two coplanar pyranose rings are anti-or syn-periplanar, respectively (see also Figure S5).
The glucopyranoside rings in the 1-trans conformer are supported by two inter-ring hydrogen bonds, OH-3 A → OH-6 B (O-5 B ), while a third, stronger OH-3 B → OC(NH) B bond pulls the B-ring acetamido group into the plane of the pyranose ring. The A-ring acetamido group remains 'free' with its amide plane presenting a near perpendicular orientation. The 1-cis conformer (Figure 3a), which has a similar relative energy at 0 K but a higher free energy at 298 K, is supported by a strong OH-6 B → OH-6 A hydrogen bond; a further inter-ring interaction, NH B → OH-3 A , contributes to a cooperative chain, OH-4 B → OH-3 B → OC(NH) B → OH-3 A → OC(NH) A , to provide additional stabilization. Since both acetamido groups are now able to interact with their neighboring OH-3 A /OH-3 B groups, both are rotated toward the plane of their respective pyranose rings.
The disaccharide unit −Man C -β1,4-GlcNAc B − 2 straddles the unusual β-mannoside linkage. Its IRID spectrum is in good accord with a superposition of the vibrational spectra associated with its two lowest-energy conformers (2-cis), but is quite unlike that of the lowest-lying 2-trans conformer (9.9 kJ mol −1 higher, Figure 3b). The 2-cis conformation is supported by two inter-ring hydrogen bonds, OH-2 C → O-3 B and OH-6 C → O-6 B , and the B-ring acetamido group is again hydrogen bonded to its nearest neighbor, OH-3 B → OC(NH) B .
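An energy gap of this size can be translated into a rough population ratio with a Boltzmann factor. The sketch below is illustrative only: it treats the quoted 9.9 kJ mol −1 relative energy as if it were a free-energy difference between two states with equal degeneracy, and the temperature of 298.15 K is an assumption. Populations in a jet-cooled expansion are not necessarily thermal, so the number should be read as an order-of-magnitude guide, not as a prediction for the experiment.

```python
import math

R_KJ_MOL_K = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def boltzmann_ratio(delta_e_kj_mol, temperature_k):
    """Population of the upper state relative to the lower state."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(-delta_e_kj_mol / (R_KJ_MOL_K * temperature_k))


def two_state_populations(delta_e_kj_mol, temperature_k):
    """Fractional populations (lower, upper) for a two-state model."""
    w = boltzmann_ratio(delta_e_kj_mol, temperature_k)
    return 1.0 / (1.0 + w), w / (1.0 + w)


# 9.9 kJ/mol is the 2-trans vs 2-cis gap from the text; 298.15 K is assumed.
ratio = boltzmann_ratio(9.9, 298.15)
p_cis, p_trans = two_state_populations(9.9, 298.15)
print(f"trans/cis ratio: {ratio:.3f}")
print(f"cis: {p_cis:.1%}, trans: {p_trans:.1%}")
```

Under these assumptions the trans form would account for only about 2% of the population, in line with its absence from the observed spectrum.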
The IRID spectrum of the −Man C -β1,4-GlcNAc B -β1,4-GlcNAc A − trisaccharide 3 (Figure 3c), which extends from ∼3200 to ∼3600 cm −1 , displays three broad maxima below 3500 cm −1 and congested weaker features at higher wavenumbers, indicating contributions from strongly and weakly hydrogen-bonded OH groups. Although poorly resolved, its contour is in qualitative correspondence with the IR spectrum associated with its minimum-energy structure. This presents cis-oriented Man C -GlcNAc B and GlcNAc B -GlcNAc A segments, linked together by two very strong hydrogen bonds that connect the terminal Man C and the central GlcNAc B units, and a third that links GlcNAc B with GlcNAc A . The next conformation lies 12.2 kJ mol −1 higher in energy; it presents only a single inter-residue hydrogen bond, and its calculated vibrational spectrum does not reproduce the experimental band at low wavenumber (Figure S11).
Hydrated and 'Blocked' Conformations of 1−3. The structural information obtained from IRID spectroscopy was developed further through computation, to explore the effect of explicit hydration and 'blocking' (to cap hydroxyls that would be absent in extended oligosaccharides). The lowest-energy computed structures of the monohydrates 1·H 2 O and 2·H 2 O (Figures 4a,b, S6, and S8) now both present trans conformations, supported by the bound water molecule, which bridges across the two pyranose rings. In 1·H 2 O this creates an extended cooperative chain, OH-4 B → OH-3 B → OC(NH) B → H 2 O → OH-6 A → O-1 A , which greatly strengthens the inter-ring binding by linking NH B to O-6 A and, as a consequence, also strengthens the supporting inter-ring bond, OH-3 A → OH-6 B (O-5 B ) (r[OH3 A ···O6 B ] 2.16 → 2.07 Å). Its relative energy now lies 4.5 kJ mol −1 below that of the lowest-lying cis hydrate, cf. 0.6 kJ mol −1 in the 'bare' unsolvated disaccharide 1 (Figure S6). This contrasts markedly with hydrated cellobiose (Glc-β1,4-Glc), where a cis conformation is retained. 52 Explicit hydration therefore locks the trans conformation of 1, enhancing the rigidity of the 'stem'. NMR studies of chitobiose in an aqueous environment at 298 K also identify a very similar average conformation in solution. 17 The two acetamido groups continue to adopt a trans relative disposition, consistent with the maintenance of a rigid conformation about the inter-ring glycosidic bond, 17 reinforced perhaps by transiently bound, bridging water molecules. 53,54

The central glycosyl residue in the core pentasaccharide, Man C , is itself glycosylated on hydroxyl groups OH-6 and OH-3, thereby precluding donor hydrogen bonds from these positions. 15,55 The dramatic structural consequences of this are revealed in the finding that the lowest-energy conformer of the 'blocked' Man (6-OMe) C -β1,4-GlcNAc B − disaccharide 2-B (and also its hydrate, 2-B·H 2 O) adopts a trans orientation, in contrast to the strongly preferred cis conformation of 2 (compare Figures 3b and 4c); the lowest-energy 2-B-cis conformer now lies ∼5 kJ mol −1 above the global minimum. Unlike 2-cis (Figure 3b), the trans conformer of 2-B is supported by only a single, weak inter-ring hydrogen bond, OH-3 B → O-5 C . The trans conformation is also retained in the hydrate (Figure 4d), where the water molecule localizes on the GlcNAc B ring, inserted between the acetamido group and O-3 rather than forming an inter-ring bridge, again in contrast to hydrated chitobiose. 52 The reduced inter-ring bonding increases the flexibility about the Man C -GlcNAc B glycosidic linkage, to provide a fluxional 'pivot'.
Similarly, in 3-B the blocked Man C OH-6 group lacks the hydrogen bond found in 3 that linked the Man C and GlcNAc B units (Figure 4e). The Man C -GlcNAc B segment again switches from cis to trans to create a more flexible local structure about the β-Man C 'pivot', very similar to that of 2-B, and the trisaccharide adopts a more extended structure, with r[O-4 C ··· O-1 A ] increasing from 11.2 to 14.5 Å.
Gas-Phase Spectra and Structures of the Core Pentasaccharide 4. The R2PI spectrum of the core pentasaccharide 4 (Figure 5a) is centered at the same wavenumber as the trans conformer of 1 and 3 (also trans about GlcNAc B -GlcNAc A ), providing circumstantial evidence for a possible preference for a trans GlcNAc B -GlcNAc A conformation in 4 also. Since the pentasaccharide contains 14 hydroxy and 2 acetamido groups, structures that optimize global OH···O and NH···O hydrogen bonding are to be expected in the gas phase: not surprisingly, its IRID spectrum ( Figure 5b) presents a broad red-shifted quasi-continuum, ranging from ∼3100 to ∼3700 cm −1 . This suggests a highly congested set of overlapping bands associated with a large number of both strong, and weak, hydrogen bonded OH and acetamido groups.
Molecular mechanics simulations, undertaken partly in response to the limitations of the experiment (see also Supporting Information Methods and Results and Figures S13 and S14), also predicted a preference for a large number of hydrogen bonds (defined by r[OH···O] < 2.5 Å, θ[OH···O] > 120°) in the isolated pentasaccharide (Figure 6a(i)) and in its triply hydrated complex (Figure 6a(ii)), with 6−8 being the optimal number for the isolated molecule (using the OPLS2005 force field, Figure S13). The simulations in bulk water (Figure 6a(iii)) predicted disruption of the intramolecular networks, leaving only 2−4 hydrogen bonds in the low-energy ensemble (Figure S13).
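The geometric hydrogen-bond criterion quoted above can be applied to a set of Cartesian coordinates as sketched below. This is an assumption-laden illustration: r is interpreted as the H···O distance and θ as the O−H···O angle measured at the hydrogen, the function names are hypothetical, and the coordinates are invented test points, not data from the simulations.

```python
import math

R_MAX_ANGSTROM = 2.5   # distance threshold from the text
THETA_MIN_DEG = 120.0  # angle threshold from the text


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at point b."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def is_hydrogen_bond(donor_o, hydrogen, acceptor_o):
    """Apply the distance and angle thresholds to one O-H...O contact."""
    r = math.dist(hydrogen, acceptor_o)
    theta = angle_deg(donor_o, hydrogen, acceptor_o)
    return r < R_MAX_ANGSTROM and theta > THETA_MIN_DEG


def count_hydrogen_bonds(donors, acceptors):
    """donors: list of (O, H) coordinate pairs; acceptors: list of O coordinates."""
    return sum(
        is_hydrogen_bond(o, h, acc)
        for o, h in donors
        for acc in acceptors
        if acc != o
    )


# Invented geometry: one O-H donor and three candidate acceptor oxygens.
donors = [((0.0, 0.0, 0.0), (0.96, 0.0, 0.0))]
acceptors = [
    (2.90, 0.0, 0.0),   # linear and close: counted
    (0.96, 1.90, 0.0),  # close but bent to 90 degrees: rejected
    (4.50, 0.0, 0.0),   # linear but too far: rejected
]
n_bonds = count_hydrogen_bonds(donors, acceptors)
print(n_bonds)
```

Counting contacts in this way for each low-energy conformer gives the kind of per-structure hydrogen-bond tally that the text compares between the isolated, hydrated, and bulk-solvated cases.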
As a consequence, the unsolvated pentasaccharide 4 preferentially adopted compact structures with the distal Man 3 head unit folded back, supported by hydrogen-bonded interactions with the proximal GlcNAc 2 unit, which presented a cis conformation about the −GlcNAc B -GlcNAc A − linkage (although trans conformations, generally associated with more extended structures, were also present at higher relative energies). The addition of water molecules in the gas phase disrupted the intramolecular hydrogen-bonding pattern. The lowest-energy conformers of the triply hydrated complex, Figure 6a(ii), presented more extended structures, now with a trans configuration about the −GlcNAc B -GlcNAc A − linkage (Figure 4), supported by water bridging across the inter-ring bond and extended cooperative hydrogen-bonded networks (which were conserved in both low- and high-energy structures).
In bulk water, the intramolecular hydrogen bonds were largely 'washed out'; the (near identical) lowest-energy structures predicted by OPLS2005 and GLYCAM06/AMBER, Figure 6a(iii), again presented a trans −GlcNAc B -GlcNAc A − stem, but now as part of a fully extended structure in which the distal Man 3 unit was completely unfurled. Strikingly, apart from the difference in the dihedral angle ω(H1-C1-O6-C6) between Man C and Man E , the preferred structure of the solvated core pentasaccharide was very similar to that of the core unit in the high-mannose glycan, Man 9 GlcNAc 2 , determined through coupled NMR measurements and molecular dynamics simulation using the GLYCAM93/AMBER force field, Figure 6a(iv,v). 12,13,18−20 There were also remarkable similarities between the conformations of the distal Man 3 − head unit in bulk water, found here, and the isolated (hydrated) Man 3 unit, determined in solution 18−20 and predicted through investigations in the gas phase, 15 Figure 6a(iv), and also between the conformation (and flexibility) of [...]

The analysis of intramolecular distances (using the OPLS2005 simulations), shown in Figures 6b and S14, provided a way of estimating molecular size 56 and hence the favorability of extended versus compact structures. The longest end-to-end distribution for the unsolvated, isolated core pentasaccharide reflected flexibility, but the most favored distance (∼16.7 Å) indicated a compact structure corresponding to a geometric cross-section (σ ∼ 230 Å 2 ) similar to the gas kinetic cross-section (σ ∼ 260 Å 2 ) reported for the doubly sodiated core pentasaccharide ion, 3 which is also likely to be compact. Addition of three water molecules resulted in a broad distribution with peaks at ∼21.5 and 23.5 Å, indicating a more extended but still flexible structure. In bulk water, however, the structures were exclusively extended and relatively inflexible, with a narrow spread of end-to-end distances peaked at ∼23.4 Å.
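The longest-intramolecular-distance metric used in this analysis can be computed directly from atomic coordinates. The sketch below is a hypothetical illustration: the function names and the toy coordinates are invented, and the 30 kJ mol −1 energy window is the one stated in the Figure 6b caption. It does not reproduce the cross-section estimate, whose derivation is not given in the text.

```python
import math
from itertools import combinations

ENERGY_WINDOW_KJ_MOL = 30.0  # window stated in the Figure 6b caption


def longest_intramolecular_distance(coords):
    """Largest pairwise distance (in the units of coords) within one structure."""
    if len(coords) < 2:
        raise ValueError("need at least two atoms")
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


def distances_within_window(conformers, window=ENERGY_WINDOW_KJ_MOL):
    """conformers: list of (relative energy in kJ/mol, coords) tuples."""
    return [
        longest_intramolecular_distance(coords)
        for rel_energy, coords in conformers
        if rel_energy < window
    ]


# Invented three-conformer ensemble, for illustration only.
ensemble = [
    (0.0, [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]),
    (12.0, [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0), (0.0, 5.0, 0.0)]),
    (45.0, [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]),  # outside the window
]
lengths = distances_within_window(ensemble)
print([round(d, 2) for d in lengths])
```

A histogram of the resulting values over a full ensemble would give the kind of distribution plotted in Figure 6b; the pairwise search scales quadratically with atom count, which is acceptable for molecules of this size.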
The experiments and simulations have revealed a folded 'naked' core pentasaccharide that uncurled as it became hydrated, with conserved structural motifs taking shape, in particular the trans conformation of the proximal chitobiose unit, to provide an extended, well-defined structure in aqueous solution.

■ DISCUSSION AND CONCLUSIONS
A recent bioinformatics analysis 57 of thousands of glycoproteins listed in the PDB 58 found that many N-glycans showed significantly similar substructures close to the protein, suggesting their use as fragments in glycan modeling (and supporting the approach employed in the present, experimentally based study). The N-glycans also displayed a rigid protein-proximal GlcNAc B -GlcNAc A stem.
The acetamido groups of the stem appear to play a key role; in the gas phase the preferred trans conformation of 1 is strengthened by microhydration through an inter-ring bridging water molecule linking the GlcNAc B acetamide to O-6 A . In bulk water, molecular dynamics simulations predict the absence of a direct inter-ring bond between these groups, which suggests the trans chitobiose stem is held in a rigid extended conformation through 'filling' of a conserved water pocket at the local bridging site. The other GlcNAc A acetamide is predicted to be unbonded, leaving it free to act as a hydrogen-bond donor/acceptor and 'anchor' the glycan to the peptide backbone of the glycoprotein. 10,11,59 Hexoses (such as the more abundant D-glucose, Glc) that lack this acetamide could not form such strong hydrogen bonds (if they were to play the roles of GlcNAc A and GlcNAc B ) and would not match the rigidity of the GlcNAc 2 stem nor the potential for anchoring with the protein backbone. 10,11,59

The next structural feature, provided by the central β-mannoside Man C -GlcNAc B , also contains a glycosidic linkage formed through OH6 C that precludes formation of the OH6 C → O6 B inter-ring hydrogen bond found in the isolated unit 2. Instead of the compact cis conformation adopted by 2, supported by two inter-ring hydrogen bonds, the 'deletion' of one of them has the key effect of greatly increasing the flexibility of the β-mannoside linkage, which adopts a trans conformation that is unaffected by discrete hydration. Its inherent flexibility allows the β-mannosyl residue to act effectively as a pivot between the rigid chitobiose stem and the outer trimannosyl head.

Figure 6. The spectra and structures of the core pentasaccharide 4. (a) The lowest-energy structures of the core pentasaccharide calculated on the OPLS2005 and GLYCAM06/AMBER potential energy surfaces: (i) the isolated molecule; (ii) the triply hydrated complex (the water molecules were initially located at binding sites based upon the preferences of singly hydrated 1, 2, and the trimannosyl Man E (Man D )Man C − head unit 15 ); and (iii) in bulk water (hydrogen bonds shown in red). (iv) An overlay of the "open" conformer of the trimannosyl Man E (Man D )Man C − head unit and the core pentasaccharide in (v); (v) the preferred aqueous structure of the high-mannose glycan, Man 9 GlcNAc 2 , determined through NMR measurements and molecular dynamics simulations. 12,13 Red dots represent transiently bound water molecules. (b) Distributions of the longest intramolecular distances (for conformers with energies <30 kJ mol −1 ) in the core pentasaccharide, predicted by molecular mechanics (OPLS2005) simulations: isolated, unsolvated (red), explicitly hydrated (green), and in bulk water (black).
Unsolvated structures of 4 are compacted by many intramolecular hydrogen bonds (Figure 6a(i)). However, discrete hydration sees the structure begin to unfurl with a water molecule stabilizing the rigid chitobiose stem (vide supra), while the trimannosyl head unit folds over to encapsulate water in another pocket. In a fully aqueous environment, a picture emerges (Figure 7) of an N-linked glycan accommodating an extended core pentasaccharide structural unit that incorporates a rigid proximal chitobiose stem, anchored at one end to the adjoining protein 10,11,59 and at the other connected through a more flexible β-mannoside pivot to the branched mannosyl D and E arms.
This combines both structural integrity and freedom of movement to allow interaction with binding partners. The extended display of the N-glycan would leave the distal branch-tip sugars free to function as potential ligands for receptors and also provide the potential for 'levered' interactions promoting a conformational 'response' in the underlying protein fold. The latter could explain the apparent ability of the calnexin/calreticulin chaperone system to bind the tips of the sugars found on N-glycoproteins 60 while still being able to stimulate proper folding of the underlying N-glycoprotein. The selection of chitobiose, with its two acetamido groups, rather than cellobiose for the stem appears to be critical, facilitating hydrogen-bonded interactions with neighboring amino acid residues 10,11 and suggesting a mechanism by which the core pentasaccharide influences the structural stability of the adjacent peptide chain and its folding kinetics. 7 The central Man E (Man D )Man C − unit, which unfurls upon hydration, presents the information-rich outer extremities of N-glycans to the environment (for solubility, recognition, or transport).
■ ASSOCIATED CONTENT
Supporting Information. Synthetic methods; NMR data; spectroscopic and computational methods; computed relative energies, vibrational frequencies, and structural data. This material is available free of charge via the Internet at http://pubs.acs.org.
Supercomputing Centre and SGIker at UPV-EHU. We are especially grateful for the advice and assistance provided by Professor Jesus Jiménez-Barbero and Professor David Clary and thank Dr. Mark Wormald for helpful discussions. Experiments performed in Orsay were supported by Triangle de la Physique, contract 2010-079T.