Unravelling Glucan Recognition Systems by Glycome Microarrays Using the Designer Approach and Mass Spectrometry

Glucans are polymers of D-glucose with differing linkages in linear or branched sequences. They are constituents of microbial and plant cell-walls and are involved in important bio-recognition processes, including immunomodulation, anticancer activities, pathogen virulence, and plant cell-wall biodegradation. Translational possibilities for these activities in medicine and biotechnology are considerable. High-throughput micro-methods are needed to screen proteins for recognition of specific glucan sequences as a lead to structure–function studies and their exploitation. We describe construction of a “glucome” microarray, the first sequence-defined glycome-scale microarray, using a “designer” approach from targeted ligand-bearing glucans in conjunction with a novel high-sensitivity mass spectrometric sequencing method, as a screening tool to assign glucan recognition motifs. The glucome microarray comprises 153 oligosaccharide probes with high purity, representing major sequences in glucans. Negative-ion electrospray tandem mass spectrometry with collision-induced dissociation was used for complete linkage analysis of gluco-oligosaccharides in linear “homo” and “hetero” and branched sequences. The system is validated using antibodies and carbohydrate-binding modules known to target α- or β-glucans in different biological contexts, extending knowledge on their specificities, and applied to reveal new information on glucan recognition by two signaling molecules of the immune system against pathogens: Dectin-1 and DC-SIGN. The sequencing of the glucan oligosaccharides by the MS method and their interrogation on the microarrays provide detailed information on linkage, sequence and chain length requirements of glucan-recognizing proteins, and are a sensitive means of revealing unsuspected sequences in the polysaccharides.


Suppl Discussion
Observations on Dectin-1 interactions with synthetic branched oligosaccharides by other methods using oligosaccharides in solution

Fig. S1
Microarray analyses using microarrays of 12 soluble glucan polysaccharides to reveal their expression of ligands and antigens for the 14 proteins investigated.

Fig. S7
Carbohydrate microarray analyses of His-tagged murine and human Dectin-1

Table S1
Polysaccharides examined and selected as sources of gluco-oligosaccharides, after their partial depolymerisation

Table S2
Glucan-recognizing proteins investigated and their reported oligosaccharide recognition

Table S3
Gluco-oligosaccharides used for development of ESI-CID-MS/MS method.

Table S5
1H-NMR to corroborate the presence of some α1,4 on poria heptasaccharide/poria polysaccharide

Table S6A
MALDI-MS analysis of NGLs derived from gluco-oligosaccharide fractions with homo-linkages

Table S6B
MALDI-MS analysis of NGLs derived from gluco-oligosaccharide fractions with hetero-linkages

Table S6C
MALDI-MS analysis of NGLs prepared from additional commercial and chemically synthesized gluco-oligosaccharides

Table S7A
Oligosaccharide NGL probes included in the glucan microarrays, sorted by linkage type and degree of polymerization

Table S7B
Fluorescence binding intensities obtained with all the proteins investigated

in addition to the major C-ions at m/z 989, 827, 665, 503, 341, consistent with dominant 1,3-linkage. These observations were in accord with our initial microarray binding data with NGLs of Poria cocos BioGel P4 oligosaccharide fractions with >DP-2 using TmCBM41, which is known to recognize Glc1,4Glc-linked sequences (supplemental Fig. S6A). 1H- and 13C-NMR corroborated the presence of α1,3-linked in addition to α1,4-linked sequences (supplemental Table S5).
Although the 1H-NMR spectrum at 700 MHz was crowded (not shown), the relative peak heights of resolved resonances, such as those from H5 of the 3-linked glucose and H3 of the 4-linked glucose, indicated that the fraction contained at least 20% of an α1,4-linked component. Therefore, the Poria fractions DP3 to DP13 were purified by preparative HPTLC to remove the 1,4-linked contaminants (HPTLC fractions). HPTLC analysis of Poria-7, for example, showed that the contaminant was largely removed (supplemental Fig. 6B), and MALDI-MS analyses confirmed the heptasaccharide as the major component (supplemental Fig. 6C). The purified Poria-7 was analysed by ESI-MS/MS, and the fragment ions from the 4-linked contaminant (m/z 1091/1073/1031) were largely absent (Fig. 4B). A series of NGL probes of Poria-DP3 to DP13 was generated (supplemental Table 6A), and microarray analysis corroborated their purity: the binding signals given by the 1,4-linkage-specific TmCBM41 largely disappeared, whereas strong binding signals were detected with the α1,3-linkage-specific MOPC104E (supplemental Fig. 6A).
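The C-ion series quoted above (m/z 341 to 989) is spaced by one glucose residue per step, which can be checked with simple mass arithmetic. The following is a minimal sketch, assuming singly charged, deprotonated C-type fragments of underivatized hexose chains and standard monoisotopic masses; the function names are illustrative and are not part of the method described here.

```python
# Monoisotopic masses in Da (standard values).
ANHYDROHEXOSE = 162.0528  # C6H10O5, one glycosidically linked glucose residue
WATER = 18.0106           # H2O
PROTON = 1.00728          # mass lost on deprotonation


def c_ion_mz(n_residues):
    """m/z of a singly charged, deprotonated C-type fragment of n hexose residues.

    Assumption: the C-ion retains the glycosidic oxygen, so its mass equals
    that of the free n-residue oligosaccharide minus one proton.
    """
    return n_residues * ANHYDROHEXOSE + WATER - PROTON


def c_ion_ladder(max_residues):
    """Nominal (rounded) m/z values for C-ions of 2 up to max_residues residues."""
    return [round(c_ion_mz(n)) for n in range(2, max_residues + 1)]


print(c_ion_ladder(6))  # [341, 503, 665, 827, 989]
```

The nominal values reproduce the series reported for the heptasaccharide; the spacing alone identifies hexose residues and does not by itself assign the linkage type.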

Saturation Transfer Difference (STD)-NMR analyses of CBMs
STD-NMR has been used to investigate the geometry and kinetics of protein-glycan complex formation (2). This technique involves NMR spectroscopy of a mixture of a protein with a ligand in solution. NMR spectra of the mixture are obtained with and without radiofrequency irradiation of the protein, and one spectrum is subtracted from the other. The difference spectrum contains only signals from the parts of the ligand that interact with the protein, to which saturation has been transferred. It is therefore possible to use this method to determine binding epitopes on glycan ligands (3).
To complement the observations from microarray analyses, the interaction of four β1,3-glucan-binding CBMs with the 1,3-linked trisaccharide Lam-3 was analyzed in solution by STD-NMR. These were TmCBM4-2, BhCBM6, CmCBM32-2 and CmCBM6-2. STD signals were detected with each of the four CBMs (supplemental Fig. 10A), indicating their ability to interact with Lam-3 under the solution-phase NMR conditions. By comparing the STDs as percentages of the corresponding unperturbed signals, we were able to distinguish, at the atomic level, the modes of recognition by these CBMs in solution.
With TmCBM4-2 (supplemental Fig. 10B), STDs were detected arising from all three glucose residues. The interactions with the H2 protons of both the reducing and the internal residue were particularly strong. These observations are in line with crystal structure evidence (4), in which the binding site is a groove that is long enough to accommodate a hexasaccharide. This can explain the lack of binding signals with Lam-2 and Lam-3 given by this CBM in the microarray analysis (Fig. 5B). In solution, the trisaccharide tested here, Lam-3, may be completely included in the interior of the groove, whereas neither Lam-2 nor Lam-3 gains adequate access to the binding groove when immobilised at the reducing end on the array surface.
With CmCBM32-2, BhCBM6 and CmCBM6-2 (supplemental Fig. 10C-E), STDs were observed mainly from the non-reducing end residue of Lam-3, which implies that the non-reducing terminal makes the closest contacts with the three proteins, although with different contact protons. Literature data for CmCBM32-2 are limited, but the data on BhCBM6 are in agreement with the crystal structure, in which the binding site is described as a small, blocked-off groove (5). With CmCBM6-2, STDs were observed strongly at the non-reducing residue and weakly at the internal and reducing residues. This CBM has a broad specificity and two glycan binding sites, one of which (cleft A) interacts with the non-reducing residue, and the second of which (cleft B) accommodates at least a trimer (6). The STDs cannot distinguish between the two sites, but the results might be interpreted in terms of binding to both sites A and B, or possibly to site B alone. By way of validating the glucose linkage specificity in the STD experiments, we analyzed maltotriose (Glcα1,4Glcα1,4Glc) in the presence of TmCBM4-2. In good agreement with the known specificity of the CBM and our observations in microarray analysis, no STD signals were observed (data not shown).
In sum, for the CBMs that bind more strongly to the non-reducing terminal of the trisaccharide ligand (CmCBM32-2, BhCBM6 and CmCBM6-2), binding could be detected to oligosaccharides as short as DP-2; for TmCBM4-2, which interacts with all the residues of the trisaccharide but most strongly with the reducing and internal residues, a chain length of DP-4 or longer was required on the array to detect binding.
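The comparison of STDs as percentages of the unperturbed signals can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the processing actually used: the proton labels and intensities are hypothetical, and the strong/medium/weak bins follow those given in the STD figure legend (>80%, 40-80% and <40% of the highest STD signal).

```python
def std_percent(off_resonance, on_resonance):
    """STD for each proton as a percentage of its unperturbed signal.

    off_resonance: intensities without saturation of the protein (I0)
    on_resonance:  intensities with saturation of the protein (Isat)
    """
    return {
        proton: 100.0 * (off_resonance[proton] - on_resonance[proton]) / off_resonance[proton]
        for proton in off_resonance
    }


def relative_std(std):
    """Normalize so that the highest STD signal equals 100%."""
    top = max(std.values())
    return {proton: 100.0 * value / top for proton, value in std.items()}


def classify(relative_value):
    """Bin a relative STD effect as strong, medium or weak."""
    if relative_value > 80:
        return "strong"
    if relative_value >= 40:
        return "medium"
    return "weak"


# Hypothetical intensities for three ligand protons (not measured data).
off = {"H2 reducing": 100.0, "H2 internal": 100.0, "H1 non-reducing": 100.0}
on = {"H2 reducing": 95.0, "H2 internal": 97.0, "H1 non-reducing": 99.0}

epitope = {p: classify(v) for p, v in relative_std(std_percent(off, on)).items()}
print(epitope)
```

Normalizing to the strongest signal makes epitope maps comparable between proteins even when the absolute STD intensities differ.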

Observations on Dectin-1 interactions with synthetic branched oligosaccharides by other methods using oligosaccharides in solution
Surface plasmon resonance (SPR) has also been used in Dectin-1 studies, with chemically synthesized oligosaccharides in solution as inhibitors of Dectin-1 binding to a glucan-phosphate-coated biosensor (7). We summarize the results in supplemental Table S8. Among the three branched analogs HE-8 B5, HE-9 B6 and HE-10 B7 and the linear HE-8, HE-9 and HE-10 (DP-8, DP-9 and DP-10, respectively) analyzed in the SPR inhibition system, the decasaccharide HE-10 B6, with a DP-9 backbone and a single β1,6 branch on the third residue at the non-reducing end, was observed to be the most potent inhibitor (IC50 0.029 mM). The branched HE-8 B5, with a DP-7 backbone (IC50 0.13 mM), was recorded to be ten times more potent than the longer analog, HE-9 B6, with a DP-8 backbone (IC50 1.3 mM).
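The relative potencies quoted above follow directly from the IC50 ratios. A minimal sketch of that arithmetic, using only the three IC50 values given in the text (the dictionary keys are descriptive labels chosen for illustration):

```python
# IC50 values (mM) for inhibition of Dectin-1 binding, as quoted in the text.
ic50_mM = {
    "branched decasaccharide, DP-9 backbone": 0.029,
    "HE-8 B5, DP-7 backbone": 0.13,
    "HE-9 B6, DP-8 backbone": 1.3,
}


def fold_more_potent(stronger, weaker):
    """How many times more potent `stronger` is than `weaker` (ratio of IC50 values)."""
    return ic50_mM[weaker] / ic50_mM[stronger]


# Rank from most to least potent (lowest IC50 first).
ranking = sorted(ic50_mM, key=ic50_mM.get)

print(ranking)
print(round(fold_more_potent("HE-8 B5, DP-7 backbone", "HE-9 B6, DP-8 backbone"), 1))
print(round(fold_more_potent("branched decasaccharide, DP-9 backbone", "HE-8 B5, DP-7 backbone"), 1))
```

A lower IC50 means that less inhibitor is needed for the same effect, so the fold difference is the weaker compound's IC50 divided by that of the stronger one.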
In another Dectin-1 study using chemically synthesized linear and branched gluco-oligosaccharides, ELISA inhibition assays were performed (8). Here the heptadecasaccharide (Structure 1 below), with a DP-16 backbone and a single β1,6 branch on the third residue at the non-reducing end, as in HE-10 B6, and a linear DP-16 (Structure 2) were observed to be equally active as inhibitors of Dectin-1 binding to immobilized glucan polysaccharide.
It is possible that the SPR inhibition system reveals effects of the β1,6 branch on oligosaccharides in solution with a particular backbone chain length. This phenomenon requires investigation. HE-8 B5 and HE-9 B6, or a synthetic analog with a backbone chain longer than the nonasaccharide that is required to detect binding in the microarrays, are not currently available for conversion into NGLs for microarray analyses, but will be the subject of future investigations.

7 and (C) (Barley-3a and -b, Barley-4a, -b and -c, respectively), Barley penta- and hexasaccharides (Barley-5a and Barley-6a) and pullulan glucotetraose with the sequence of Glc1,6Glc1,4Glc1,4Glc and heptaose with the sequence of Glc1,6Glc1,4Glc1,4Glc1,6Glc1,4Glc1,4Glc (Pullu-4 and -7, respectively). Linear 1,3-linked glucan octa-, nona- and decasaccharides HE-8, HE-9 and HE-10, and branched glucan nona-, deca- and undecasaccharides, HE-9 B7, -10 B2, -10 B3, -10 B5, -10 B7 and -11 B2,6, were synthesized chemically, as described (17). The chemically synthesized hexasaccharide (18), JG-6 B1, was a generous gift from Jianxin Gu (Fudan University, Shanghai).

Table S1. (A) murine monoclonal antibodies to α- or β-glucans; (B) microbial CBMs; (C) receptors of the innate immune system: murine Dectin-1 (His-tagged) and DC-SIGN (human Fc chimera). The binding scores are depicted as fluorescence intensities elicited with 30 and 150 pg polysaccharide per spot (blue and purple bar, respectively). The polysaccharides cyclic-β-glucan (ID 3) and laminarin (ID 5) are probably not retained on the nitrocellulose surface efficiently enough to elicit binding signals, owing to their low molecular weights. The α1,3-glucose-containing Poria cocos polysaccharide (ID 13) is not water-soluble and was therefore not included in the microarrays. This explains the lack of binding signals with the α1,3-glucan-specific MOPC-104E-IgM.

S8.
'On-array' inhibition of Dectin-1 and CmCBM32-2 binding to immobilized NGLs with β1,3-linked oligosaccharides. (A) Inhibition of the binding of murine Dectin-1 Fc chimera to immobilized 1,3-linked Curd-13 NGL by 1,3-linked oligosaccharides Curd-11 and Curd-13 but not Lam-7; Curd-13 NGL was arrayed at 5 fmol/spot. The 'on-array' inhibition results thus corroborate the chain length requirement for Dectin-1 recognition observed in the microarray analysis, as there was a higher inhibitory activity of the curdlan oligosaccharide fraction with DP-13 than of that with DP-11, and a lack of inhibition by the oligosaccharide with DP-7. (B) Inhibition of the binding of CmCBM32-2 to immobilized 1,3-linked Lam-5 NGL by 1,3-linked oligosaccharides Lam-3, Lam-5 and Lam-7; Lam-5 was arrayed at 5 fmol/spot. In contrast with Dectin-1, for CmCBM32-2 the short β1,3 oligosaccharides showed high inhibitory activities, thus corroborating the results of the microarray analysis.

The results highlight different chain length-dependencies and the abilities of 2G8-IgG and 1E12-IgM antibodies, TmCBM4-2 and CmCBM32-2, but not Dectin-1, to bind branched β1,3/β1,6 oligosaccharides. The glucose linkages are symbolically indicated by diagrams in the top panel; for probe 157 the internal 1,3 linkage has an α-configuration. For probes 6, 8-13 glucose chain lengths of the major components are depicted.

Binding epitopes implied by the STD results are shown on the right for each CBM. All the experiments were performed at 30 °C except for CmCBM32-2, which was measured at 45 °C. The relative STD effects (highest STD signal normalized to 100%) are illustrated by dark, medium and light gray circles indicating strong (>80%), medium (40-80%), and weak (<40%) STD effects, respectively. Overlapped STD signals that could not be clearly assigned are not shown on the structures. Asterisks indicate signals for which the STD is an approximate estimate due to partial overlap.
Curdlan polysaccharide was solubilized in an alkaline aqueous solution (50 mM NaOH) prior to printing.

Table S2. Glucan-recognizing proteins investigated and their reported oligosaccharide recognition.

1H8-IgG
Vaccine-induced with C. albicans β-glucan polysaccharide (Potential protective) Specificity investigated in the present report