Predominant Structural Features of the Cell Wall Arabinogalactan of Mycobacterium tuberculosis as Revealed through Characterization of Oligoglycosyl Alditol Fragments by Gas Chromatography/Mass Spectrometry and by 1H and 13C NMR Analyses*

The peptidoglycan-bound arabinogalactan of a virulent strain of Mycobacterium tuberculosis was per-O-methylated, partially hydrolyzed with acid, and the resulting oligosaccharides reduced and O-pentadeuterioethylated. The per-O-alkylated oligoglycosyl alditol fragments were separated by high pressure liquid chromatography and the structures of 43 of these constituents determined by 1H NMR and gas chromatography/mass spectrometry. The arabinogalactan was shown to consist of a galactan containing alternating 5-linked β-D-galactofuranosyl (Galf) and 6-linked β-D-Galf residues. The arabinan chains are attached to C-5 of some of the 6-linked Galf residues. The arabinan is comprised of at least three major structural domains. One is composed of linear 5-linked α-D-arabinofuranosyl (Araf) residues; a second consists of branched 3,5-linked α-D-Araf units substituted with 5-linked

The cell walls of members of the Mycobacterium genus and related genera contain a chemotype IV peptidoglycan (1, 2) to which an arabinogalactan is covalently attached. The arabinogalactan is further modified by esterification with mycolic acid residues (3, 4). Small amounts of tightly associated, highly immunogenic proteins are also present (5). The arabinogalactan component of the cell wall of mycobacteria has been implicated in a range of biological responses associated with human and experimental mycobacterioses, such as the high titer IgG antibodies in tuberculous and leprosy sera (6, 7) and the state of T-cell-mediated immunological anergy evident in the multibacillary form of disease (8).
It has previously been demonstrated that the polymer is composed mostly of D-arabinosyl and D-galactosyl residues (9-12), that the arabinosyl residues were all furanoid (6, 13), and that the arabinosyl portions of the molecule were the dominant B-cell antigens (6). The linkages of many of the arabinosyl and galactosyl residues were determined (6, 14) and the disaccharide, β-Galf-(1→6)-Gal, isolated (15). More recently, additional linkages involving the galactosyl units have been identified (16, 17) and all of these galactosyl residues have been shown to be furanoid (17). However, the manner in which the variously linked glycosyl units are combined has not been established because of the complexity of the heteropolysaccharide, which is not composed of a repeating unit. In this present, comprehensive study, 43 different oligosaccharide fragments of the polysaccharide were characterized, which allowed the recognition of several structural motifs that were representative of the majority of the molecule. The proposed structure of the arabinogalactan should lead to a greater understanding of the role of the mycolylarabinogalactan-peptidoglycan complex in the immunogenicity, pathogenesis, and physiology of mycobacteria (18).

RESULTS
The strategy for the generation and analysis of per-O-alkylated oligoglycosyl alditols in order to establish the structures of complex polysaccharides has been described in detail (19, 20). Purified cell walls of M. tuberculosis, rather than arabinogalactan released from cell walls by degradative means (6, 11), were used in an attempt to obtain fragments contain- [...] groups on C-1 and C-4 of the arabinitol ("ald") end (Fig. 1D) arose because the arabinosyl residues in the arabinogalactan are furanoid and, when hydrolyzed and reduced, the hydroxyl groups at C-1 and C-4 are exposed. The presence of O-C2[2H]5 at positions other than at C-1 or C-4 of the pentitol indicates where other glycosyl residues were originally attached. Thus, the O-C2[2H]5 group at C-5 of the arabinitol (Fig. 1D) established that this residue was substituted at C-5 in the arabinogalactan. The two arabinosyl residues ("a" and "b") in the sequence illustrated in Fig. 1D contain no O-C2[2H]5 groups. However, other oligoglycosyl alditols arising from the cleavage/modification of the methylated arabinogalactan did contain O-C2[2H]5 groups on residues "a" and "b" and provided considerable sequence information.

¹ The abbreviations used are: GC/MS, gas chromatography-mass spectrometry; EI/MS, electron impact-mass spectrometry; HPLC, high performance liquid chromatography; Ara, arabinosyl; Rha, rhamnosyl; d, [2H], deuterium; Me, methyl; f, furanosyl; p, pyranosyl; SDS, sodium dodecyl sulfate; TMS, trimethylsilyl; BCG, Bacille Calmette-Guérin; RT, retention time; TMC, Trudeau mycobacterial collection; COSY, two-dimensional chemical shift correlated spectroscopy.

FIG. 1. An illustration of the sequence of reactions used to produce compound 11, one of the 43 partially O-methylated, partially O-pentadeuterioethylated oligoglycosyl alditols that are the subject of this report. The hydrolytic conditions resulted in random cleavage of the per-O-methylated polymer (indicated with bold arrows), resulting in the production of all 43 products (plus others) at the same time. R and R' are other glycosyl residues.
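The labeling logic described above can be sketched in a few lines. This is a minimal illustration rather than the authors' procedure: the function name and the input representation (a list of carbon numbers that carry O-C2[2H]5 groups on the reduced pentitol end) are assumptions made for the example.

```python
# Hypothetical helper illustrating the labeling logic in the text.
# For a furanoid pentose that has been hydrolyzed and reduced, the hydroxyls
# at C-1 and C-4 are exposed and therefore always carry O-C2[2H]5 groups.
RING_DERIVED_POSITIONS = frozenset({1, 4})


def former_linkage_positions(ethyl_d5_positions):
    """Return the pentitol carbons that carried other glycosyl residues.

    ethyl_d5_positions: iterable of carbon numbers (1-5) bearing O-C2[2H]5.
    Labeled positions other than C-1 and C-4 mark former attachment sites.
    """
    labeled = set(ethyl_d5_positions)
    if not labeled <= {1, 2, 3, 4, 5}:
        raise ValueError("a pentitol has only carbons 1-5")
    if not RING_DERIVED_POSITIONS <= labeled:
        raise ValueError("C-1 and C-4 must be labeled for a furanoid pentitol end")
    return sorted(labeled - RING_DERIVED_POSITIONS)


# The arabinitol end in Fig. 1D carries labels at C-1, C-4, and C-5.
print(former_linkage_positions([1, 4, 5]))  # [5]
```

A residue labeled only at C-1 and C-4 returns an empty list, which corresponds to an alditol end that carried no further substituent in the polymer.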
Due to the complexity of the mixture of per-O-alkylated oligoglycosyl alditols, it was not possible to structurally characterize the components directly by GC/MS. Therefore, the per-O-alkylated oligoglycosyl alditols were separated by reverse-phase HPLC and the partially fractionated derivatives analyzed by GC/MS, 1H NMR, and, when appropriate, by glycosyl linkage composition analysis. Details of this methodology and the resulting data are presented in the Miniprint.
The structures of 43 of the oligoglycosyl alditol fragments were established and are presented in Fig. 4.
Evidence for the Presence of Several Major Structural Motifs in Arabinogalactan-Examination of the structures 1 through 43 (Fig. 4) revealed that the majority of the oligoglycosyl alditols contain only Ara or Gal (rather than a mixture of both). The arabinogalactan appears to consist of arabinan and galactan regions, which is consistent with the conclusions arrived at from the analysis of enzymatic digests of the arabinogalactan (6). It was also possible to recognize oligosaccharide families and thereby to arrive at five major structural domains (Fig. 5).

FIG. 5. [...] 4. The value of n was not rigorously established but is approximately 4. The value of r depends on how frequently motif E is inserted into arabinogalactan (see Fig. 9 and the text). All ring forms are furanoid and all glycosyl residues are D.
Structural motif A (Fig. 5) was deduced, in part, from the structure of oligoglycosyl alditol 19 (Fig. 4B), which demonstrated that some of the 3,5-linked Araf residues are substituted at both C-5 and C-3 with 2-linked Araf. This conclusion is supported by the structures of 5, 6, 17, and 18 (Fig. 4, A and B). The glycosyl residue on C-2 of the 2-linked arabinosyl units was shown to be t-Araf by elucidation of the structures of 1, 10, 11, and 21. Accordingly, it was clearly established that the disaccharide, t-Araf-(1→2)-Araf, is attached at both the 3 and the 5 position of some of the 3,5-linked Araf branched residues. In addition, the characterization of 17 and 18 demonstrated that these 3,5-linked Araf residues are linked to C-5 of an Araf.
The structures of 4, 12, and 24 (Fig. 4) established the presence of linear 5-Araf units (structural motif C). The presence of this motif is consistent with earlier observations (6). The length of the linear 5-Araf regions is unknown; obviously, some are as long as 4 residues (Fig. 4).
The linkage data in Table I indicate that approximately 2 of the 11 6-linked galactosyl residues are branched at position 5. The structures of 39 and 40 clearly indicate that 5-linked Araf units are attached to C-5 of the 6-linked Galf residues and correspond to structural motif E (Fig. 5).
The Presence of Rhamnose and Mannose in the Arabinogalactan and Evidence for Other Minor Structures-There have been reports of the presence of rhamnose in the cell wall of mycobacteria (17, 21); however, its occurrence in mycobacterial arabinogalactan had not been demonstrated. The isolation and characterization of oligosaccharides 41-43 (Fig. 4E) established that some of the 5-linked galactosyl residues are linked to C-4 of a rhamnosyl residue. Thus, the reducing terminus of the arabinogalactan molecule is occupied by a rhamnosyl residue.
The mannose-containing oligosaccharides are minor components, as mannosyl residues account for ~2% of the arabinogalactan (Table I). It is unlikely that these mannosyl residues arise from contaminating arabinomannan (22, 23), and they probably represent minor variations on the major structure. In addition, oligosaccharides 2 and 3 (Fig. 5A), which are present in small amounts, do not fit any of the dominant structural motifs and may represent incomplete versions of structural motif A.
Confirmation of the Absolute Configuration of the Ara and Gal Residues-GC/MS analysis of the (CH3)3Si derivatives of both R(-) and S(+) 2-butyl glycosides and appropriate standards (24) showed the Ara to be >98% D and the Gal to be >97% D and confirmed that both Ara and Gal in the arabinogalactan of M. tuberculosis are D (11). This also established that quantitatively minor amounts of L-sugars are not present and therefore do not account for immunological activity.

Assignment of Anomeric Configurations by 1H NMR Analysis of the Per-O-Alkylated Glycosyl Alditol Fragments-All of the major glycosyl linkages present in the arabinogalactan were represented in the per-O-alkylated oligoglycosyl alditols analyzed by 1H NMR (Table II). The large coupling constant (J1,2) of 4.6 Hz for the t-Araf residue in 1 shows that the t-Araf units are β-linked.³ Analysis of 4 by 1H NMR shows that the 5-Araf residues are α-linked and that the 2-Araf residues (5 and 6) are α-linked, regardless of whether they are attached to C-3 or C-5 of the 3,5-branched Araf. The β configuration of the t-Araf and α configuration of the 2-Araf units were confirmed by 1H NMR analysis of 11. 1H NMR analysis of 9 showed that the 3,5-linked Araf residues are α-linked. 1H NMR analyses of 28, 31, and 36 demonstrated that all of the galactosyl residues are β-linked.

Determination of the Anomeric Configurations of the Glycosyl Residues in the Arabinogalactan by 13C NMR-The 13C NMR spectrum of the entire arabinogalactan, solubilized by base treatment of whole mycobacterial cell walls, is shown in Fig. 6. The majority of the C-1 signals appear between δ108 and δ109, corresponding to α-Araf/β-Galf residues³ (25-28). The two signals at δ106.8 and δ106.6 were assigned to C-1 of α-linked 2-Araf residues. This assignment is possible because substitution at C-2 but not at C-3 or C-5 causes the chemical shift of C-1 to move upfield by 1-2 ppm (25, 26, 29). Since the only 2-linked glycosyl residues in the arabinogalactan are 2-Araf (Table I), the signals at δ106.8 and δ106.6 must arise from the C-1 of the 2-Araf residues. The assignment was confirmed using 13C/1H correlation NMR and COSY NMR to trace the connectivity of the C-1s at δ106.8 and δ106.6 to the glycosyl-substituted carbons at δ88.2 and δ87.9. The 13C/1H correlation spectrum (Fig. 7) showed connectivity between C-1 resonances at δ106.8 and δ106.6 and corresponding H-1 resonances at δ5.16 and δ5.25, respectively. The small coupling constant values (J1,2 = 2.3 and 2.5 Hz, respectively) measured from the two-dimensional COSY spectrum (Fig. 8) confirmed (27, 29) that the C-1 resonances at δ106.8 and δ106.6 were from an α-Araf. Furthermore, the COSY 1H/1H correlation (Fig. 8) showed that the chemical shift of the H-2 was at δ4.19 for both residues. The 13C/1H correlation (data not presented) then allowed the resonances of the corresponding C-2s at δ88.2 and δ87.9 to be identified, and their chemical shift value confirmed glycosyl substitution at position 2 of the Ara (Table I and Ref. 29).

³ The non-ring oxygens on C-1 and C-2 of both α-Araf and β-Galf are trans, and thus α-Araf and β-Galf give similar chemical shift values for C-1 and H-1 and similar 3J1,2 coupling constant values. For α-Araf/β-Galf (trans), C-1 has its resonance at ~δ108 and 3J1,2 is approximately 3 Hz or less (25-29). For β-Araf/α-Galf (cis), the chemical shift of C-1 is ~δ102 and the 3J1,2 is 4 Hz or greater (25-29). The value of the chemical shift of H-1 cannot be used to distinguish α- and β-Araf or Galf (25-29).
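The criteria stated above for distinguishing the trans (α-Araf/β-Galf) and cis (β-Araf/α-Galf) arrangements can be expressed as a small decision rule. The sketch below is illustrative only: the function names are hypothetical, and the 105 ppm cut-off for C-1 is an assumed midpoint chosen so that 2-substituted α-Araf residues (shifted upfield by 1-2 ppm from ~δ108) still fall on the trans side. Coupling constants between 3 and 4 Hz are reported as ambiguous because the text gives no rule for that range.

```python
TRANS = "trans (alpha-Araf or beta-Galf)"
CIS = "cis (beta-Araf or alpha-Galf)"
AMBIGUOUS = "ambiguous"


def from_coupling(j12_hz):
    """Classify from 3J(1,2): about 3 Hz or less is trans, 4 Hz or more is cis."""
    if j12_hz <= 3.0:
        return TRANS
    if j12_hz >= 4.0:
        return CIS
    return AMBIGUOUS


def from_c1_shift(c1_ppm, cutoff_ppm=105.0):
    """Classify from the C-1 shift; the 105 ppm cut-off is an assumed midpoint."""
    return TRANS if c1_ppm > cutoff_ppm else CIS


def classify(j12_hz, c1_ppm=None):
    """Combine both criteria; any disagreement is reported as ambiguous."""
    by_coupling = from_coupling(j12_hz)
    if c1_ppm is None:
        return by_coupling
    return by_coupling if by_coupling == from_c1_shift(c1_ppm) else AMBIGUOUS


# Values quoted in the text.
print(classify(4.6))         # t-Araf residue in fragment 1 (1H NMR only)
print(classify(2.3, 106.8))  # 2-Araf residue, C-1 at 106.8 ppm
print(classify(2.5, 106.6))  # 2-Araf residue, C-1 at 106.6 ppm
```

Requiring both criteria to agree when both are available is a conservative design choice for the example; it mirrors the way the text uses the coupling constants to confirm the assignment made from the C-1 shifts.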
The signals at δ101.9 and δ101.8 in the 13C NMR spectrum of the arabinogalactan (Fig. 6) are consistent with the presence of β-Araf/α-Galf³ residues (25, 26, 28). Pyranosyl residues would also give signals in this chemical shift region; however, the glycosyl-linkage analysis data (Table I), our previous data (17), and the structures of 1-36 and 39-43 show that all of the Ara and Gal residues are furanoid. The assignment of the signals at δ101.9 and δ101.8 to the C-1 of β-Araf/α-Galf is therefore unequivocal.
The 1H NMR data (Table II) clearly established that the t-Araf residues are in the β configuration, and, thus, the two C-1 signals at ~δ102 must result from t-β-Araf. This assignment is supported by the signal at δ64.1 (Fig. 6), which corresponds to the C-5 (primary alcohol carbon) signal of t-β-Araf (δ64.2 for the methyl glycoside, Ref. 28) but not to that of t-α-Araf (δ62.4 for the methyl glycoside, Ref. 28). The signal at δ64.1 cannot result from any other primary alcohol carbon in the arabinogalactan; the primary alcohol carbons of 5- and 3,5-Araf and 6- and 5,6-Galf have chemical shifts at ~δ70 (25), and the primary alcohol carbons of 2-linked-α-Araf (29) and 5-linked-β-Galf (26) have their resonances at ~δ62.
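The argument for the signal at δ64.1 amounts to matching an observed shift against the closest reference value. A minimal sketch of that comparison follows, using only the reference shifts quoted above; the dictionary layout and function name are assumptions for illustration, and the approximate values (~δ70, ~δ62) are entered as 70.0 and 62.0.

```python
# Reference primary alcohol carbon shifts (ppm) as quoted in the text.
REFERENCE_PRIMARY_CARBON_PPM = {
    "t-beta-Araf": 64.2,
    "t-alpha-Araf": 62.4,
    "5- and 3,5-Araf; 6- and 5,6-Galf": 70.0,
    "2-linked alpha-Araf; 5-linked beta-Galf": 62.0,
}


def closest_assignment(observed_ppm, references=REFERENCE_PRIMARY_CARBON_PPM):
    """Return (residue type, absolute deviation in ppm) for the nearest reference."""
    name = min(references, key=lambda key: abs(references[key] - observed_ppm))
    return name, abs(references[name] - observed_ppm)


name, deviation = closest_assignment(64.1)
print(f"{name} (deviation {deviation:.1f} ppm)")
```

A nearest-value match is only as good as the separation between the candidates; here the next closest reference (δ62.4) lies 1.7 ppm away, which is why the text treats the assignment as secure.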
13C NMR Confirms Structural Motif A-The different chemical shifts (δ106.6 and δ106.8) of the C-1 resonances of the two 2-linked-α-Araf residues (Fig. 6) of motif A are due to slight differences in their chemical environments. One 2-linked-α-Araf residue is attached to C-3 and the other to C-5 of the 3,5-linked-α-Araf unit. In addition, the C-2 resonances of the 2-linked-α-Arafs are clearly observable and again are at two slightly different chemical shifts (δ88.2 and δ87.9), and, finally, the two C-1 resonances of the two t-β-Arafs are also seen at δ101.9 and δ101.8. The intensities of the resonances at δ87.9, 88.2, 101.8, 101.9, 106.6, and 106.8 are approximately equal and are consistent with them being part of the same structure, structural motif A.⁴

DISCUSSION
The arrangement of the arabinosyl and galactosyl residues within the arabinogalactan of the cell wall of mycobacteria has, until now, not been studied in detail. This arabinogalactan is unusual in that, unlike most bacterial polysaccharides (30, 31), it is not composed of an oligosaccharide repeating unit. The present work confirms (6) the presence of distinctive galactan and arabinan segments and establishes that the arabinogalactan is composed of a few distinct, defined structural motifs.
Structural motif A is the most unexpected and significant. It has been shown that arabinosyl residues are responsible for the antigenicity of arabinogalactan and that serological activity resides largely in a fraction containing 2-linked arabinosyl residues (6, 14). Thus, it is logical to speculate that part, or all, of structural motif A is the major humoral immunological epitope of arabinogalactan and, consequently, of whole mycobacteria (18). Monoclonal antibodies raised against lipoarabinomannan (22) also react with the purified cell walls,⁵ suggesting an arabinose-containing epitope common to lipoarabinomannan and arabinogalactan. Structural motifs B and C account for the bulk of the internal portions of the arabinan segments of the arabinogalactan and are consistent with previous studies (6) [...] 5)-Galf-(1→6)-Galf were revealed by close examination of the mass spectra of the individual per-O-alkylated oligogalactosyl alditols. The possibility that the 2,3,5-tri-O-methyl galactose (6-Galf) released by acid hydrolysis was preferentially degraded was examined by subjecting the permethylated arabinogalactan to various hydrolytic conditions, but the ratio of 5-Galf to 6-Galf remained constant. In addition, no new acid-degradable sugars were revealed by mild acid hydrolysis. The evidence, from oligoglycosyl alditol analysis, for structural motif D is strong, but the existence of an undetected arrangement involving 5-Galf is still possible.
The structures of the galactosyl-rhamnosyl oligosaccharides (41-43) are consistent with an alternating 6-Galf and 5-Galf sequence in which the 5-Galf reducing end is linked to C-4 of a rhamnosyl residue. The cell walls of Nocardia rubra contain similar galactosyl-rhamnosyl attachments (32), which suggests that a rhamnosyl residue is involved in a direct or indirect linkage to the peptidoglycan in some actinomycetes. The characterization of 42 (Fig. 4E) as a monoglycosyl methylrhamnoside suggests that cleavage occurred between the rhamnosyl residue and the peptidoglycan moieties during the base-catalyzed methylation procedure. This cleavage is consistent with a phosphodiester link. The 1H and 13C NMR analyses established that the 2-, 5-, and 3,5-linked Araf residues are α-linked. In contrast, the terminal Araf residues are β-linked, a result that could not be predicted from optical rotation measurements (6, 10). All the Galf residues are β-linked, a result which is consistent with previous studies (15).
The manner in which structural motifs A-E are combined together within the arabinogalactan (Fig. 9) is not known. The arabinogalactan polymer contains approximately 100 residues based on glycosyl composition (Ara/Gal/Rha/Man, 62:35:1.4:1.4), glycosyl linkage composition (Table I) and the assumption of a single rhamnosyl residue/molecule.
The presence of 2 mol % of 5,6-Galf (Table I) suggests that approximately two arabinan chains are attached to the galactan core. However, the exact relationship between the arabinan chains and the galactan backbone is not known. Each of the two arabinan chains illustrated in Fig. 9 depicts a different arrangement of structural motifs A, B, and C. However, which one of these is accurate, or whether a more complex or hybrid combination is correct, is not known. The arrangement whereby the linear 5-Araf occurs in tetraglycosyl units is supported by the existence of 24 (Fig. 4). It should be stressed that the arrangements of the structural motifs in the arabinogalactan are uncertain and the model is meant to provide a framework for the design of future experiments. The complete elucidation of the structure of the arabinogalactan will require the isolation of oligoglycosyl fragments encompassing branched and linear regions of both the arabinan and galactan. In addition, the complete structural characterization of the arabinogalactan requires that oligoglycosyl units (37 and 38 and 1 and 2, Fig. 4) that do not conform to the model proposed in Fig. 9 are accounted for. However, 37 and 38 are minor products and probably represent minor populations of arabinogalactan in which mannosyl or mannobiosyl rather than 2-linked arabinofuranosyl residues occupy non-reducing termini; indeed, mannose is a component of the arabinogalactan of some corynebacteria (16).

FIG. 9. An illustration of some of the ways in which the major structural motifs of the cell wall arabinogalactan may be assembled. The purpose of this illustration is to stimulate ideas on how the structural units may be joined together. This is not intended to be a final representation of the structure of arabinogalactan.
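The estimate of roughly two arabinan chains per molecule can be reproduced with one line of arithmetic. The sketch below assumes the figures quoted in the text (about 100 residues per molecule and 2 mol % of 5,6-Galf) and treats each 5,6-Galf as one arabinan attachment point; the function name is illustrative only.

```python
def estimated_branch_points(residues_per_molecule, branch_mol_percent):
    """Expected number of branched residues in one polymer molecule."""
    return residues_per_molecule * branch_mol_percent / 100.0


# Figures quoted in the text: ~100 residues per molecule, 2 mol % 5,6-Galf.
chains = estimated_branch_points(100, 2.0)
print(f"Estimated arabinan attachment points: {chains:.0f}")
```

Because both inputs are approximate, the result should be read as an order-of-magnitude check on the proposed model, not as a measured chain count.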
The proposed structures (Fig. 9) are also of value in determining where the mycolyl residues that are part of the native mycolylarabinogalactan are attached. It has been reported that the mycolic acids are attached to O-5 of arabinosyl residues (3, 4, 33-35), and it can be seen from Fig. 9 that position 5 of both the t-Araf and the 2-Araf, both present in structural motif A, is available for such attachment. The diarabinosidyl mycolate (mycolyl&Araf-(1→3)-Ara), isolated by Markovits and Vilkas (33), is in accord with structural motif A only if the mycolic acid is attached to C-5 of the 2-Araf.
Considerable progress has recently been made in defining the individual, extractable entities of mycobacterial cell walls and relating these to aspects of the pathogenesis and immunogenesis of disease (18). However, the massive insoluble mycolylarabinogalactan-peptidoglycan framework had defied even primary structural definition. Clearly, the structure of these molecules must be understood before their role in the refractoriness of mycobacteria to chemotherapy, the peculiar persistence of mycobacteria within the macrophage, and the propensity for disease recrudescence can be addressed.