Isolation of Human Milk Difucosyl Nona- and Decasaccharides by Ultrahigh-Temperature Preparative PGC-HPLC and Identification of Novel Difucosylated Heptaose and Octaose Backbones by Negative-Ion ESI-MSn

Despite their many important physiological functions, past work on the diverse sequences of human milk oligosaccharides (HMOs) has been focused mainly on the highly abundant HMOs with a relatively low degree of polymerization (DP) due to the lack of efficient methods for separation/purification and high-sensitivity sequencing of large-sized HMOs with DP ≥ 10. Here we established an ultrahigh-temperature preparative HPLC based on a porous graphitized carbon column at up to 145 °C to overcome the anomeric α/β splitting problem and developed further the negative-ion ESI-CID-MS/MS into multistage MSn using a combined product-ion scanning of singly charged molecular ion and doubly charged fragment ion of the branching Gal and adjacent GlcNAc residues. The separation and sequencing method allows efficient separation of a neutral fraction with DP ≥ 10 into 70 components, among which 17 isomeric difucosylated nona- and decasaccharides were further purified and sequenced. As a result, novel branched difucosyl heptaose and octaose backbones were unambiguously identified in addition to the conventional linear and branched octaose backbones. The novel structures of difucosylated DF-novo-heptaose, DF-novo-LNO I, and DF-novo-LNnO I were corroborated by NMR. The various fucose-containing Lewis epitopes identified on different backbones were confirmed by oligosaccharide microarray analysis.
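As a rough illustration of the charge-state arithmetic behind negative-ion ESI-MS of these glycans, the sketch below computes monoisotopic m/z values for deprotonated ions from a monosaccharide composition. The composition used (five hexoses, three N-acetylhexosamines and two fucoses for a difucosylated decasaccharide on an octaose backbone) is an assumption made for the example, and the function names are hypothetical; the code is not the authors' data-processing workflow.

```python
# Monoisotopic residue masses (Da) of common monosaccharide building blocks.
RESIDUE_MASS = {
    "Hex": 162.052824,     # Gal or Glc residue, C6H10O5
    "HexNAc": 203.079373,  # GlcNAc residue, C8H13NO5
    "Fuc": 146.057909,     # deoxyhexose residue, C6H10O4
}
WATER = 18.010565   # added once for the free reducing-end oligosaccharide
PROTON = 1.007276


def neutral_mass(composition):
    """Monoisotopic mass of a free oligosaccharide with the given residue counts."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items()) + WATER


def deprotonated_mz(composition, charge=1):
    """m/z of the [M - nH]n- ion, where n is the number of charges."""
    return (neutral_mass(composition) - charge * PROTON) / charge


# Assumed composition for illustration: octaose backbone plus two fucoses.
DF_DECASACCHARIDE = {"Hex": 5, "HexNAc": 3, "Fuc": 2}

mz_singly = deprotonated_mz(DF_DECASACCHARIDE, charge=1)
mz_doubly = deprotonated_mz(DF_DECASACCHARIDE, charge=2)
print(f"[M-H]-   m/z {mz_singly:.4f}")
print(f"[M-2H]2- m/z {mz_doubly:.4f}")
```

The same function can be applied to a fragment composition, provided the mass change associated with the particular cleavage type is accounted for separately; that correction is not modelled here.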


Supplementary Methods
Table S1. Main composition information of 70 HPLC fractions from PGC-HPLC at 105 °C.
Table S2. Identification of difucosylated HMOs with heptaose and octaose backbones by ESI-MSn.

Complete structural assignment of novel difucosylated HMOs on branched octaose backbone by NMR
NMR spectra for the decasaccharides DF-novo-LNO I (fraction #44c) and DF-novo-LNnO I (fraction #42b) were recorded at 950 MHz in D2O, and assigned using 2D heteronuclear 1H/13C NMR spectra. The anomeric region of the HSQC spectrum of DF-novo-LNO I is shown in Figure 5a. Heteronuclear spectroscopy can discriminate between signals that are overcrowded in the 1H dimension, such as the β-anomeric region between 4.35 and 4.75 ppm, though some pairs of signals still overlap, for example the anomeric cross-peaks from GlcNAc residues III and VII, and residues IV and II (Figure 5a).
HSQC-TOCSY and H2BC spectra were used to assign the remaining signals from GlcNAc and Glc residues, and Gal H2, H3 and H4 signals. The small coupling constant value between H4 and H5 of Gal (and Fuc) residues means that the HSQC-TOCSY H4/C4-H5/C5 cross-peaks are faint or missing.
Sequence determination and completion of fucose residue assignment rely on long-range HMBC cross-peaks. Figure 5b shows an expansion of the β-anomeric region of the HSQC (blue) and HSQC-TOCSY (red) spectra of DF-novo-LNO I, overlaid with the HMBC spectrum (green). The HMBC peaks illustrated are inter-residue cross-peaks between H1 of one residue and the carbon immediately across the glycosidic linkage, defining both sequence and linkage positions. 1H and 13C chemical shifts for DF-novo-LNO I are summarized in Table S3a (GlcNAc, Glc and Gal) and Table S3b (Fuc).
To establish whether the 6-branch is attached to the backbone at residue Gal II or residue Gal IV, further reasoning and experimental work were necessary. As is common for Gal residues, the HSQC-TOCSY spectrum can be traced securely only between C1/H1 and C4/H4 for both Gal II and Gal IV. The two spectra are very similar; both residues are substituted at the 3-position and one of them is also substituted at the 6-position.
Further evidence was sought from 2D ROESY spectroscopy. The spectrum obtained (Figure 5c) showed clear ROESY cross-peaks from H4 of II to H6,6' at 3.78 and 3.74 ppm, and from H4 of IV to H6 at 3.96 and 3.82 ppm, confirming the position of the 6-branch at residue IV.
Two fucose spin systems are traceable through HSQC-TOCSY, H2BC and HMBC spectra, identified by the characteristic H6/C6 methyl signals. Differential assignment of the two spin systems to residues IX and X is made difficult by the absence of inter-residue HMBC peaks, so assignments to IX and X are based on close similarity between current results and literature values, as shown in Table S3b.
NMR assignments for DF-novo-LNnO I were obtained in the same manner, as illustrated in Figure S12. For this compound, heavily overlapped signals (particularly in the 1H dimension) from Gal residues VIII, VI, IV and II seriously hindered unambiguous determination of the position of the 6-branch.
1H and 13C chemical shifts for DF-novo-LNnO I are summarised in Table S3c (GlcNAc, Glc and Gal) and Table S3d (Fuc).

Sequence and linkage validation by NMR of the difucosylated nonasaccharide on a novel heptaose backbone
The relatively small amount of fraction #18a available was insufficient for most heteronuclear NMR spectroscopy except for HSQC, but 1H homonuclear two-dimensional TOCSY and ROESY spectra gave partial assignments and clear inter-residue ROESY connectivity (Figure S13), corroborating the structures indicated by mass spectrometric methods.
Assignments for each monosaccharide residue, as summarized in Table S4, are based on a combination of experimental data (TOCSY and ROESY cross-peaks) and comparison with NMR data for similar monosaccharide residues in DF-novo-LNO I and DF-novo-LNnO I as described above, as well as literature values for the fucose residues as indicated in Table S4. The ROESY connectivity listed above is completely compatible with the structure proposed.
Other ROESY cross-peaks could tentatively be assigned to H1-H5 connections for Gal residues or could originate in interactions between closely packed residues in the branched structure. H6 and H6' of the 6-substituted Gal II were assigned by ROE cross-peaks from H4 and confirmed by the presence in the HSQC spectrum of cross-peaks to C6 at 71.2 ppm, shifted well downfield of the other H6-C6 cross-peaks. The Galβ1-3Gal motif is strongly supported by the two well-resolved inter-residue ROE cross-peaks from Gal VI H1 to Gal II H3 and H4.
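The downfield C6 argument above can be sketched as a simple comparison of C6 chemical shifts across Gal residues. In the minimal example below, only the 71.2 ppm value for Gal II is taken from the text; the residue labels "Gal A" to "Gal C", their shifts and the 5 ppm threshold are hypothetical placeholders chosen for illustration, not measured values or a criterion used by the authors.

```python
from statistics import median


def flag_substituted_c6(c6_shifts_ppm, min_offset_ppm=5.0):
    """Return residues whose C6 shift lies well downfield of the typical value.

    The median is used as the reference for an unsubstituted C6, which is
    only sensible when most residues in the set are not 6-substituted.
    """
    baseline = median(c6_shifts_ppm.values())
    return sorted(
        residue
        for residue, shift in c6_shifts_ppm.items()
        if shift - baseline >= min_offset_ppm
    )


# Gal II value from the text; all other entries are hypothetical placeholders.
c6_shifts = {
    "Gal II": 71.2,
    "Gal A": 61.7,
    "Gal B": 61.8,
    "Gal C": 61.9,
}

substituted = flag_substituted_c6(c6_shifts)
print(substituted)
```

Such a check only flags candidates; as in the text, the assignment itself rests on the ROE cross-peaks from H4 to H6 and H6'.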

SUPPLEMENTARY METHODS
Preparation of fucosylated decasaccharide fraction from human milk. The stored frozen milk sample (~20 L) from different donors was thawed at 4 °C and filtered sequentially through two hollow fiber membranes (750 and 50 kDa, Shandong Bona Biology, Shandong, China) to remove the high molecular mass proteins and lipid aggregates. The collected filtrate was mixed with an equal volume of ethanol to prepare the final HMO solution at 100 mg/mL.
HMO microarrays. All the carbohydrate-binding proteins were purchased from commercial sources as listed below. Aleuria aurantia lectin (AAL), Erythrina cristagalli lectin (ECL) and Ulex europaeus agglutinin I (UEA I) were purchased from Vector Laboratories (San Francisco, CA). Antibodies including mouse anti-blood group A (SAB4700674), mouse anti-blood group B, mouse anti-Lea, mouse anti-Leb and biotinylated goat anti-mouse IgG, as well as BSA, were obtained from Sigma-Aldrich (Saint Louis, MO), and mouse anti-Lex (P12) and anti-Ley were from Santa Cruz Biotechnology (Santa Cruz, TX). Goat anti-mouse IgM was from Abmart (Shanghai, China), and streptavidin AF647 (S21374) was from Thermo Fisher Scientific (Rockford, IL).

Figure S2. Optimization of PGC-HPLC conditions using a standard mixture of LNFP I, II and III.

Figure S3. Optimization of temperature for PGC-HPLC separation of DF-novo-LNnO II and DF-novo-LNO I.

Figure S4. Stabilities of sialylated and neutral HMOs under different ultrahigh temperature conditions.


Figure S7. Workflow for ESI-MSn analysis of difucosylated HMO isomers with octaose backbones.
Figure 1b) containing difucosylated nona- and decasaccharides were further purified on either the same column at 105 °C or on a Hypercarb PGC HT column (3.0 × 100 mm, 3 μm, Thermo Fisher Scientific) at 145 °C with an optimized CH3CN/H2O gradient (with or without 0.1% formic acid).
1 chain; T2, type 2 chain; in, internal Lewis (Le) epitope; fragment ions in blue are diagnostic ions. Oligosaccharides highlighted in red are novel structures.

Figure S1. Ultra-high temperature HPLC system. (a) Modular components and layout of the ultra-high temperature HPLC system; (b) Inside and side view of the internal elements of the ultra-high temperature oven; (c) Post-column cooling device.

Figure S5. Group separation of HMOs based on two-dimensional hydrophilic chromatography. (a) Initial fractionation on a Click-TE GSH column. (b) Fraction F5 was further fractionated by HPLC on an amide column.
Figure S12. NMR spectra of #42b (DF-novo-LNnO I). (a) The anomeric region of the HSQC spectrum of DF-novo-LNnO I, showing cross-peaks characteristic of α-anomeric residues for Fuc residues IX and X, and the reducing end Glc α-anomer. The remaining β-anomeric cross-peaks are assigned to Gal, GlcNAc and reducing end β-Glc residues. Some signals overlap, e.g. the anomeric cross-peaks from GlcNAc residues III and VII, and Gal residues VIII, VI, IV and II. (b) Expansion of the β-anomeric region of the HSQC (blue) and HSQC-TOCSY (red) spectra of DF-novo-LNnO I, overlaid with the HMBC spectrum (green). The HMBC peaks illustrated are inter-residue cross-peaks between H1 of one residue and the carbon immediately across the glycosidic linkage, defining both sequence and linkage positions.

Table S1. Main composition information of 70 HPLC fractions from PGC-HPLC at 105 °C.