Functional Analysis of Glycosyltransferases Encoded by the Capsular Polysaccharide Biosynthesis Locus of Streptococcus pneumoniae Serotype 14*

Bacteria belonging to the species Streptococcus pneumoniae vary in their capsule. Presently, 90 capsular serotypes are known, all possessing their own specific polysaccharide structure. Little is known about the biosynthesis of these capsular polysaccharides. The cps locus of S. pneumoniae serotype 14 was cloned. So far, 7 open reading frames have been sequenced, cps14B to cps14H. The gene products are similar to proteins involved in bacterial polysaccharide biosynthesis, both of Gram-negative and -positive micro-organisms. Gene-specific mutants were created for cps14D to cps14H by insertional mutagenesis. All mutants no longer agglutinated with a monoclonal antibody against type 14 capsule polysaccharides. The biosynthetic function of cps14E and cps14G was determined by analysis of the intermediates in the synthesis of the oligosaccharide subunit, formed in membrane preparations of the wild-type and mutant strains and in membrane preparations of Escherichia coli expressing the pneumococcal glycosyltransferases. The enzyme encoded by cps14E is a glucosyl-1-phosphate transferase that links glucose to a lipid carrier, the first step in the biosynthesis of the type 14 repeating unit. cps14G encodes a β-1,4-galactosyltransferase, the enzyme responsible for the second step in the subunit synthesis, the transfer of galactose to lipid-linked glucose.

The expression of a polysaccharide capsule is a common feature of many bacterial species, particularly those causing serious invasive infections of man (1). The capsule makes the bacterium resistant to complement-mediated opsonophagocytosis (2). In addition, some bacteria express capsular polysaccharides (CPs) that mimic host molecules, thereby avoiding the specific immune system of the host (3). Bacterial CPs are generally composed of repeating oligosaccharides, consisting of two to ten monosaccharides, sometimes complemented with other components. Because of the variable number and different types of monosaccharides within a subunit and the different ways these monosaccharides can be linked together, an enormous variety of polysaccharide structures can be formed. The species Streptococcus pneumoniae (the pneumococcus), a human pathogen causing invasive diseases such as pneumonia, meningitis, and otitis media, comprises 90 serotypes (4), each one having its own specific CP. The capsule of the pneumococcus is an important virulence factor. This became clear from the observation that unencapsulated mutants lose their virulence (5). Also, antibodies directed against CPs are protective (6), and this protection is serotype-specific. CPs from 23 serotypes are included in a vaccine (7). Although this vaccine protects healthy adults against infections caused by the serotypes included in the vaccine, its efficacy is poor in high-risk groups such as young children (8,9).
The chemical structures of most pneumococcal CP types have been determined (10), but little is known about the genes involved in CP synthesis (cps), and the biosynthesis itself. Biosynthesis should require a complex enzymatic pathway, starting with the uptake or synthesis of the monosaccharides and their activation by conversion to nucleotide derivatives. A membrane-bound transferase complex would catalyze the successive linkage of the monosaccharides to a membrane-bound lipid carrier, followed by polymerization of the subunits and subsequent export and attachment of the complete CP to the cell surface.
Classic genetic experiments performed by Avery et al. (11) indicated that the cps genes are clustered on the chromosome of the pneumococcus. Recent reports by others (12,13) and ourselves (14) confirm this observation. Also in other bacterial species (e.g. Escherichia coli, Haemophilus influenzae), cps genes are clustered. Between these loci, there appears to be a considerable degree of sequence homology. Moreover, they have a common genetic organization involving three functional regions (15,16). The central region encodes serotype-specific biosynthetic functions and is flanked on both sides by two conserved regions presumed to encode proteins for common functions like CP transport. Little is known about the genetic organization of cps loci in Gram-positive bacteria, including S. pneumoniae.
We study the capsule biosynthesis of S. pneumoniae serotype 14, whose CP has a linear backbone of →6)-β-D-GlcpNAc-(1→3)-β-D-Galp-(1→4)-β-D-Glcp-(1→ repeating units, with monosaccharide side chains of β-D-Galp-(1→ linked to C4 of each N-acetylglucosamine residue. We recently identified a chromosomal region of serotype 14 involved in CP synthesis and cloned at least a large part of the cps locus into a low copy number cosmid vector (14). Two genes, cps14E and cps14F, involved in CP synthesis were identified. Cps14E encodes a glycosyltransferase. In this communication, we report on the structure of five other genes of the serotype 14 cps locus. We also determined the functions of gene products by inactivating the genes and by functional expression of these genes in E. coli.

MATERIALS AND METHODS
Bacterial Strains and Growth Conditions-S. pneumoniae serotype 14 was strain NCTC11902. A serotype 4 strain was obtained from A. J. W. van Alphen (Academic Medical Center Amsterdam, The Netherlands). The pneumococcal strain CP1200 has been described previously (17). Pneumococci were grown at 37°C in Todd-Hewitt broth supplemented with 0.5% yeast extract (THY-medium) or on blood agar plates. Tetracycline was used at a concentration of 10 µg/ml for selection of the S. pneumoniae mutants. E. coli CG120 (18), containing pAM120, was a gift from C. G. Rubens (University of Washington, Seattle, WA). E. coli DH5α (19) was used as a host for pBluescript II KS (Stratagene), and all clones were grown in or on LB or TB media (19) containing the appropriate antibiotic concentration.
DNA Techniques-Cloning, transformation, Southern blot analysis, and most other DNA techniques were performed as described before (19). Chromosomal DNA was isolated as described by Ausubel et al. (20). Restriction endonucleases and T4 DNA ligase were purchased from Pharmacia-LKB. High-fidelity PCR was performed with Pfu DNA polymerase (Stratagene).
DNA Sequencing and Analysis-Plasmid DNA was sequenced using an A.L.F. DNA sequencer (Pharmacia-LKB), and sequencing data were assembled using pc/gene (IntelliGenetics). The programs TFASTA and BLAST were used to compare sequences to databases available at the National Center for Biotechnology Information, Bethesda. Hydrophobic stretches within proteins were predicted by the method of Klein et al. (21). The nucleotide sequences of cps14B to cps14H have been deposited with GenBank™/EBI Data Bank.
Construction of Gene-specific Knock-out Mutants-Construction of cps14E and cps14F mutants has been described (14). Insertional mutations in cps14D, cps14G, and cps14H were constructed in vitro by blunt ligation of a 4.9-kilobase HincII fragment of pAM120 containing the tetracycline resistance gene of Tn916 (tetM) in the HpaI site of cps14D in pMK104, in the SacII site of cps14G in pGV100, and in the SpeI site of cps14H in pGV100, respectively. After transformation to E. coli DH5α, clones were selected in which the direction of transcription of the tetracycline resistance gene was the same as that of the adjacent cps genes. This yielded the plasmids pKOM13, pKOM01, and pKOM08, respectively. The cps14D, cps14G, and cps14H insertion mutations were introduced into the chromosome of the wild-type by natural transformation as described by Yother et al. (22), using the three constructs. Pneumococcal strain CP1200 was used as a donor strain for competence factor. Transformants were selected for tetracycline resistance, and mutations were confirmed by Southern hybridization analysis.
Construction of Plasmid pMK113-An in-frame deletion in cps14F (from nucleotide 3564 to 3893) was introduced in plasmid pMK100REV by performing a high fidelity PCR with the primers 14FBGLP1 (5′-ATAAAGATCTATGAGTTAAATGACCTCC-3′) and 14FBGLP2 (5′-CAGGAGATCTGTACAATGGGAAGAAATG-3′), which both contain an internal BglII site, and pMK100REV as target DNA. The PCR product was digested with BglII, religated, and transformed to E. coli DH5α. The resulting plasmid of one of the clones, designated pMK113, was sequenced to confirm the in-frame deletion in cps14F and to check whether any mutations were introduced into cps14E and cps14G.
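The primer design described above can be checked with a few lines of code. The following is a minimal sketch, not part of the original procedure: it locates the BglII recognition sequence (AGATCT) in the two primers and tests whether the stated deletion length is a multiple of three. It assumes that the coordinates 3564 and 3893 are both inclusive, and it ignores the nucleotides contributed by the BglII site itself; the function names are hypothetical.

```python
# Recognition sequence of BglII.
BGLII_SITE = "AGATCT"

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "14FBGLP1": "ATAAAGATCTATGAGTTAAATGACCTCC",
    "14FBGLP2": "CAGGAGATCTGTACAATGGGAAGAAATG",
}


def site_positions(seq, site=BGLII_SITE):
    """Return the 0-based start index of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def deleted_length(first_nt, last_nt):
    """Number of nucleotides removed, assuming both coordinates are inclusive."""
    return last_nt - first_nt + 1


primer_sites = {name: site_positions(seq) for name, seq in PRIMERS.items()}
deletion_nt = deleted_length(3564, 3893)
# A deletion keeps the reading frame only if its length is a multiple of 3.
deletion_in_frame = deletion_nt % 3 == 0

print(primer_sites)
print(deletion_nt, deletion_in_frame)
```

Under the inclusive-coordinate assumption the deletion spans 330 nucleotides (110 codons), which is consistent with the description of the deletion as in-frame.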
Membrane Preparations-Pneumococcal membranes were prepared as described (14). E. coli membranes were isolated by the same procedure with some modifications. Cultures were grown overnight in TB containing 50 µg/ml ampicillin at 37°C and then diluted 10-fold in the same medium. Cells were grown to mid-log phase (A600 = 0.4-0.6), and then expression of the cps genes was induced by adding isopropyl-β-D-thiogalactopyranoside (IPTG) to a final concentration of 0.5 mg/ml. The incubation was extended for 2 h at 37°C and then cells were harvested by centrifugation. The cells were resuspended in 10 ml of 0.7 M sucrose, 50 mM Tris-Cl, pH 8.0, 1 mM EDTA; 20 mg of lysozyme was added, and the suspension was incubated for 3 h at 4°C. The rest of the procedure was as described for pneumococcal membranes (14). The pneumococcal and E. coli membranes were stored at −70°C and proved to be stable for several months.
Glycosyltransferase Activity Assays-Glycosyltransferase activity was determined essentially as described previously (14). For each reaction, 40 µl of membrane preparation (approximately 25 µg of protein) was incubated at 10°C for 1 h with 0.025 µCi UDP-[14C]glucose (Amersham Life Science, Inc., 296 mCi/mmol), and/or 0.025 µCi UDP-[14C]galactose (Amersham, 305 mCi/mmol), or 0.025 µCi UDP-[14C]N-acetylglucosamine (Amersham, 251 mCi/mmol) as indicated in each experiment, and 10 mM MgCl2 in a final volume of 50 µl. When indicated, unlabeled UDP-glucose was added to a final concentration of 500 µM. Reactions were stopped by the addition of 1 ml chloroform:methanol (2:1). This solution was extracted three times with 0.2 ml of PSUP (1.5 ml of chloroform, 25 ml of methanol, 23.5 ml of water, and 0.183 g of KCl). The incorporation of 14C into the glycolipid fraction in the organic phase was measured in a scintillation counter (Beckman).
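The amounts of labeled substrate in these assays follow from the stated activities and specific activities. The sketch below works through that arithmetic; it assumes the activity is 0.025 µCi per reaction, the final volume is 50 µl, and the unlabeled UDP-glucose concentration is 500 µM (the micro prefixes are taken to have been lost from the printed units). The function name is hypothetical.

```python
def labeled_conc_uM(activity_uCi, specific_activity_mCi_per_mmol, volume_ul):
    """Concentration (µM) of a radiolabeled substrate in the reaction.

    mCi/mmol is numerically equal to µCi/µmol, so dividing the activity in
    µCi by the specific activity gives the amount in µmol.
    """
    amount_umol = activity_uCi / specific_activity_mCi_per_mmol
    volume_l = volume_ul * 1e-6
    return amount_umol / volume_l  # µmol per liter = µM


# Specific activities (mCi/mmol) as listed in the text.
SPECIFIC_ACTIVITY = {
    "UDP-[14C]glucose": 296,
    "UDP-[14C]galactose": 305,
    "UDP-[14C]N-acetylglucosamine": 251,
}

UNLABELED_UDP_GLC_UM = 500  # assumed to be micromolar

for name, sa in SPECIFIC_ACTIVITY.items():
    conc = labeled_conc_uM(0.025, sa, 50)
    excess = UNLABELED_UDP_GLC_UM / conc
    print(f"{name}: {conc:.2f} µM labeled, {excess:.0f}-fold unlabeled UDP-glucose excess")
```

Under these assumptions each labeled nucleotide sugar is present at roughly 2 µM, so the unlabeled UDP-glucose is in about 300-fold molar excess and would strongly dilute any labeled UDP-glucose formed from UDP-[14C]galactose.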
Analysis of Lipid-linked Intermediates by Thin Layer Chromatography-Lipid-linked intermediates were hydrolyzed from the lipid carriers by mild acid hydrolysis: one-fifth of the glycolipid fraction was dried in a Speed-Vac and resuspended in 100 µl of 1-butanol. 100 µl of 0.05 M trifluoroacetic acid was added, and the solution was heated for 20 min at 90°C. The solution was dried again in a Speed-Vac, and the pellet was resuspended in 40% 2-propanol containing 5 mg/ml unlabeled carrier glucose, galactose, and lactose, and subjected to thin layer chromatography (TLC). TLC was carried out on HPTLC silica gel (Merck), developed in 1-butanol/ethanol/water (5:3:2), sprayed with En3Hance spray (DuPont) and autoradiographed for 1-2 days. To visualize unlabeled sugar standards, the TLC plate was sprayed with 5% H2SO4 in ethanol and heated to 100°C for 10 min.
Reduction Analysis of the [14C]-labeled Lactose Intermediate-[14C]-labeled lactose, released by mild acid hydrolysis of the lipid carrier of wild-type pneumococcal membranes incubated with UDP-[14C]glucose and UDP-[14C]galactose, was scraped off a TLC plate and eluted in water. Half of the sample was reduced with sodium borohydride as described previously (23) and desalted using a column of mixed-bed resin (IWT TMD-8, Aldrich). Both samples (reduced and unreduced lactose) were subjected to strong acid hydrolysis. The dried saccharides were dissolved in 80 µl H2O, 20 µl of 100% trifluoroacetic acid was added, and the solution was heated for 4 h at 100°C. The solution was dried again, and the pellet was resuspended in 40% 2-propanol containing 5 mg/ml unlabeled glucose, galactose, sorbitol, and lactose, serving as carriers and internal standards. The sample was analyzed by TLC as described above, but to achieve a better separation of glucose, galactose, and sorbitol, the TLC plate was developed twice in ethyl acetate/methanol/water/acetic acid (7:2:1:0.1).

The Biosynthesis of the Serotype 14 Polysaccharide Subunit Starts with Glucose-In this study, we focussed on the first two steps in the type 14 CP biosynthesis. Lipid-linked oligosaccharide intermediates were prepared by glycosyltransferase assays, performed on membrane preparations. Membranes were incubated with 14C-labeled UDP-activated monosaccharides that are part of the serotype 14 polysaccharide subunit: glucose, galactose, or N-acetylglucosamine. Serotype 4, which contains none of these monosaccharides in its polysaccharide structure, was used as a negative control. The incorporation of 14C label into the glycolipid fraction was measured (Fig. 1A). Serotype 14 membranes efficiently incorporated [14C]glucose and [14C]galactose but hardly any [14C]N-acetylglucosamine. The incorporation of all labeled sugars into the glycolipid fractions of serotype 4 membranes was much lower than that of serotype 14. Our data agree well with those of Distler and Roseman (24), confirming that the 14C-labeled monosaccharides are incorporated into a serotype-specific glycolipid fraction and that either glucose or galactose is the first sugar linked to a lipid carrier. To address this further, the glycolipid fractions were subjected to mild acid hydrolysis and then analyzed by TLC and autoradiography (Fig. 1A). Incubation with only UDP-[14C]glucose resulted in incorporation of mainly glucose (lane 1), whereas UDP-[14C]galactose gave mainly lactose (β-Gal(1→4)β-Glc) and some glucose (lane 2). UDP-glucose will be formed from UDP-galactose by a UDP-glucose-4-epimerase activity, which is apparently present in this system (24) (Fig. 1B). These data show that glucose is the reducing terminal sugar attached to a lipid carrier. In the second biosynthesis step, galactose is linked to this glucose residue, resulting in lipid-linked Galβ(1-4)Glc (lactose).
Sequence Analysis of Seven Open Reading Frames of the cps Locus of Serotype 14 -We previously cloned the cps locus of serotype 14 into the low copy number vector pPR691, yielding cosmid clone cMK02 (14). Restriction fragments of this clone were subcloned in pBluescript, mapped (Fig. 2), and sequenced. Seven complete ORFs were found in this region. No transcription terminator signals were detected within the sequenced region, suggesting that the ORFs are part of one operon. Each ORF is preceded by a ribosome binding site sequence, and they were all closely coupled, being separated by a maximum of 18 nucleotides. The first four ORFs were almost identical to cps19fB-E of S. pneumoniae serotype 19F (13) and therefore designated cps14B-E. This homology abruptly ends from the start of cps14F. cps14F-H do not show any homology with cps genes of serotype 19F. Gene-specific knock-out mutants of cps14D-H no longer agglutinated with a monoclonal antibody against type 14 CP (see below), confirming that the genes are essential for type 14 CP synthesis. An overview of the properties of the ORFs and their translation products is given in Table I. The cps14B gene product is homologous to proteins encoded by genes of several other Gram-positive bacteria involved in CP or exopolysaccharide synthesis (Table I), but their function is unknown.
The proteins encoded by cps14C and cps14D are homologous to a family of ExoP-like proteins (Table I). They represent two domains that are mostly divided over two proteins, but they are also found combined in one protein (ExoP). The amino acid sequence AISPSSPNIKRNTLIGFLAG (residues 168-187) of Cps14C matches a consensus sequence motif identified for proteins involved in the chain length determination of polysaccharides (30,26). Furthermore, the hydrophobicity plot of Cps14C resembles those of these proteins, with two hydrophobic segments at its N and C termini, and a hydrophilic domain in the central part. These data suggest that Cps14C is involved in the chain length determination of type 14 CP. Cps14D contains nucleotide-binding motifs as reported for most of the proteins homologous to Cps14D. The sequences GEGKTT (amino acids 46-51) and YIIVD (amino acids 128-132) match the consensus proposed for the Walker nucleotide-binding motifs A and B (34). The amino acid sequence FARAGYKTLLIDGDTR downstream of the Walker A motif matches 90% of the conserved positions of another sequence motif found in ATPases (LAXXGXKXLLIDXDXR) reported by Huang and Schell (31). Rhizobium meliloti strains carrying truncated ExoP proteins devoid of the C-terminal domain, including the nucleotide-binding motif, produced smaller amounts of exopolysaccharide with a lower molecular weight, indicating a less efficient transport and an altered chain length regulation (30). These data indicate that Cps14D may have a role in chain length determination and export of type 14 CP.
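The 90% figure quoted for the ATPase motif can be reproduced by counting matches at the non-wildcard positions of the consensus. The following is a minimal sketch under two assumptions: X denotes any residue, and the hyphen that appears inside the printed consensus is a line-break artifact, so the motif is read as the 16-residue string LAXXGXKXLLIDXDXR. The function name is hypothetical.

```python
def motif_identity(sequence, consensus, wildcard="X"):
    """Count matches between a sequence and a consensus motif.

    Only positions where the consensus specifies a residue (that is, any
    position that is not the wildcard) are scored. Returns a tuple
    (number of matches, number of scored positions).
    """
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    scored = [(s, c) for s, c in zip(sequence, consensus) if c != wildcard]
    matches = sum(1 for s, c in scored if s == c)
    return matches, len(scored)


matches, scored = motif_identity("FARAGYKTLLIDGDTR", "LAXXGXKXLLIDXDXR")
percent_identity = 100 * matches / scored
print(matches, scored, percent_identity)  # prints: 9 10 90.0
```

Nine of the ten specified positions agree; the single mismatch is the first residue (F in Cps14D, L in the consensus).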
Cps14E, a glycosyltransferase, and its homology with other bacterial transferases has been described previously (14). Cps14E contains four potential membrane spanning stretches.
Cps14F shows significant homology with the N-terminal part of SpsK of Sphingomonas S88 (35). It contains one hydrophobic segment by which it may be anchored into the bacterial membrane. Cps14G is homologous to the C-terminal part of SpsK of Sphingomonas S88. These homologies suggest that the gene products of cps14F and cps14G act together. They may have been one gene originally, or spsK may have evolved from a gene fusion. Based on marginal similarities with putative UDP binding sites of other glycosyltransferases, it was suggested that SpsK is a glycosyltransferase (35). Also in Cps14G, a small region shows similarity with a proposed UDP binding site domain of UDP-glycosyltransferases (36). Fig. 3 shows sequence similarities. The deduced amino acid sequence of cps14H shows low homology with the O-antigen polymerases encoded by the rfc genes of Shigella flexneri (37) and Salmonella typhimurium (38). The hydrophobicity plot of Cps14H shows 9 to 11 potential membrane spanning regions (data not shown), characteristic for the O-antigen polymerases identified so far. Cps14H may be the type 14 polysaccharide polymerase.
Subunit Synthesis in Gene-specific Knock-out Mutants-The gene-specific knock-out mutants Spn#E and Spn#F, in which cps14E and cps14F are inactivated, have been described previously (14). Three more mutants were created as described under "Experimental Procedures," Spn#D, Spn#G, and Spn#H, in which cps14D, cps14G, and cps14H were disrupted. The correct location of the tetM insertions on the chromosome was confirmed by Southern hybridization analysis (data not shown). All five mutants were not agglutinated by Hasp14.1, a monoclonal antibody specific for serotype 14 CP, indicating that all five genes are involved in CP synthesis. Membranes of wild-type bacteria and the five mutants were prepared and incubated with radioactively labeled UDP-glucose and UDP-galactose. Membranes of wild-type bacteria and the mutants Spn#D, Spn#F, Spn#G, and Spn#H showed comparable levels of incorporation but Spn#E demonstrated a reduced transferase activity (Fig. 4), as described earlier (14). The lipid-linked intermediates of these mutant membranes were removed from the lipid carrier by mild acid hydrolysis and characterized by thin-layer chromatography (Fig. 4). Wild-type membranes and membranes of Spn#D both showed glucose and lactose as lipid-linked intermediates. These products were absent in the Spn14#E lane. Thus the cps14E gene product is responsible for the first step of CP biosynthesis, the transfer of glucose-1-phosphate onto the lipid carrier. We assume that the biosynthesis of other pneumococcal polysaccharides (e.g. C-polysaccharide) also proceeds via lipid-linked intermediates but at a much lower level than CP production. This may explain the low background of incorporated glucose into the glycolipid fraction of mutant Spn#E.
Figure legend (fragment): pGV100 is a 2.4-kilobase EcoRI chromosomal DNA fragment cloned in pBluescript. pMK100, pMK102 to pMK106, pMK108, and pMK110 are subclones of cosmid clone cMK02. E, EcoRI; H, HindIII; R, RsaI; S, SpeI; and X, XbaI.
Membranes of both Spn#F and Spn#G incorporated only glucose. This result shows that at least Cps14G is involved in the second step in the biosynthesis of serotype 14 CP, the addition of galactose to lipid-linked glucose. Since we cannot exclude the possibility that insertion of the tetracycline resistance cassette in cps14F has a polar effect on cps14G, we cannot draw any conclusions from this experiment whether cps14F is involved in galactosyltransferase activity or not. Therefore, a non-polar in-frame deletion was made in cps14F in an E. coli clone that is able to express cps14E to cps14G (see below). Membranes of Spn14#H demonstrated both glucose and lactose as lipid-linked intermediates, which shows that Cps14H (the putative CP polymerase) is not involved in the first two biosynthesis steps.
Expression of Pneumococcal Glycosyltransferase Genes in E. coli-It appeared that pneumococcal glycosyltransferases can be expressed in E. coli. Six different clones containing cps genes of serotype 14 were used (Fig. 5A). Membranes of the E. coli cells were isolated 2 h after induction with IPTG, and transferase assays were performed using labeled UDP-glucose and UDP-galactose and labeled UDP-galactose with an excess of unlabeled UDP-glucose. The incorporation of 14C label into the glycolipid fraction was measured (Table II), and the oligosaccharide intermediates formed were hydrolyzed and characterized by TLC (Fig. 5B). As a marker, the lipid-linked intermediates hydrolyzed from pneumococcal membranes incubated with 14C-labeled UDP-glucose and UDP-galactose were run on the same TLC.
Clone pMK104 contains a major part of cps14B, the complete cps14C and cps14D genes and a small part of cps14E (287 bp) in the wrong orientation with respect to the lac promoter of pBluescript. This clone was used as a negative control since it contains no putative glycosyltransferase genes, and indeed, no glucose or galactose was incorporated.
Clone pMK112 contains the larger part of cps14E, cps14F, and half of cps14G (the first 276 bp) in the correct orientation with respect to the lac promoter. In this clone, the first 283 nucleotides of cps14E are missing, but a start codon (ATG) with a ribosome binding site (AGAAG) is present, resulting, when expressed, in a polypeptide missing the first 98 amino acids of Cps14E. When cultures were induced with IPTG, membranes of this clone showed considerable glycosyltransferase activity, but only [14C]glucose and not [14C]galactose was recovered from the glycolipid fraction. When 14C-labeled UDP-galactose and an excess of unlabeled UDP-glucose were added to these membranes, no radioactive monosaccharides could be detected in the glycolipid fraction.
Clones pMK100REV and pMK100 only differ in orientation of the insert. They contain the same major part of cps14E as clone pMK112, plus the entire cps14F and cps14G genes, and a part of cps14H. Membranes of clone pMK100REV showed a significant glycosyltransferase activity. Even with an excess of unlabeled glucose, [14C]galactose was incorporated. The TLC shows that lactose could be released from the lipid fraction. Clone pMK100 shows a weaker lactose spot on the TLC, indicating that there is some expression of the cps14 genes, independent of the lac promoter. Also in clone pMK113, a derivative of clone pMK100REV containing an in-frame deletion in cps14F, lactose could be hydrolyzed from the lipid fraction although the incorporation of 14C label is lower than in clone pMK100REV. These data confirm that cps14E encodes a glucosyl-1-phosphate transferase and that cps14G encodes a β-1,4-galactosyltransferase. Moreover, these transferase assays show that Cps14F is not required for the transfer of galactose to lipid-linked glucose. Clone pGV100 contains a small part of the carboxyl-terminal end of cps14E (577 bp), the entire cps14F and cps14G genes, and a large part of cps14H. This clone did not show any glycosyltransferase activity. Since no functional Cps14E is expressed in this clone, lipid-linked glucose is not present as an acceptor for Cps14G activity, and therefore lipid-linked lactose is not formed.

DISCUSSION
In this study, we characterized a chromosomal region involved in capsular polysaccharide biosynthesis of S. pneumoniae serotype 14. Seven complete open reading frames (cps14B-H), which seem to belong to a single transcriptional unit, were found. Five gene-specific knock-out mutants were created by insertion of a tetracycline resistance cassette (tetM) in cps14D-H. All five mutants did not react with a type 14 specific monoclonal antibody, indicating that cps14D-H are part of the cps locus of serotype 14.
The first four ORFs, cps14B-E, are almost identical to cps19fB-E (97% identity at the DNA level) of serotype 19F. This homology ends abruptly from the start of cps14F, suggesting that a recombination event in this area led to exchange of cps genes. Data suggesting that horizontal transfer of CP biosynthesis genes occurs in pneumococci have been reported before (39). In Gram-negative bacteria (e.g. E. coli and H. influenzae), serotype-specific genes that are involved in CP synthesis are located between two conserved regions that are presumed to code for common functions (15,16). Recombination between the flanking conserved CP genes would be expected to result in another capsular type, and evidence for the in vivo horizontal transfer of serotype-specific CP genes of H. influenzae has been reported (40). Cross hybridization experiments between pneumococcal cps genes and chromosomal DNA of several serotypes (13) showed that the cps14/19fA-D genes are conserved in many serotypes. Whether this forms one of the two conserved regions as observed in the organization of cps loci in Gram-negative bacteria remains to be clarified.
In earlier work, we showed that cps14E encodes a glycosyltransferase (14), and we suggested that both Cps14E and Cps19fE are glucosyltransferases since glucose is the only monosaccharide that serotypes 14 and 19F have in common. In this study, we have shown that the subunit synthesis of serotype 14 starts with the addition of glucose to a lipid carrier and that Cps14E is responsible for this first step in the oligosaccharide synthesis. Also in serotype 19F, the subunit synthesis starts with the addition of glucose to a lipid carrier (data not shown), and therefore, it seems likely that Cps19fE catalyzes the first step of CP synthesis in serotype 19F. The CP structure of serotype 14 is very similar to that of Streptococcus agalactiae type III; the only difference is an extra sialic acid residue linked to the galactose side group of S. agalactiae type III CP. Cps14E has a considerable similarity with CpsD of S. agalactiae type III (14). Rubens et al. (25) found reduced galactosyltransferase activity in a CpsD mutant, indicating that cpsD encodes a galactosyltransferase. This suggests that CP biosynthesis in both streptococcal species is rather different. However, the possibility cannot be excluded that in their study, too, UDP-galactose was converted into UDP-glucose by an epimerase and that glucose was transferred to a lipid carrier. E. coli clone DH5α(pMK112) contains a major part of cps14E downstream of the lac promoter of pBluescript, and, when induced, membranes of this clone showed considerable glucosyltransferase activity. Most likely, a polypeptide, missing the first 98 amino acids of Cps14E, is expressed that still has glucosyltransferase activity. This is in agreement with the observation that expression of only the 3′-half of wbaP (rfbP) of Salmonella enterica, a gene homologous to cps14E and involved in the first step of O-antigen synthesis, results in the synthesis of a polypeptide with glycosyltransferase activity (41).
Both the cps14F and the cps14G mutant still showed glucosyltransferase activity, but both mutants were blocked in the second step of the subunit synthesis, the transfer of galactose to lipid-linked glucose. Insertion of tetM in cps14D has no polar effect on cps14E, which is situated immediately downstream of cps14D, since mutant Spn#D is still able to incorporate glucose and galactose into the glycolipid fraction. However, the possibility that the lack of galactosyltransferase activity of Spn#F is caused by a polar effect on cps14G, rather than the cps14F mutation itself, could not be excluded. Therefore, a non-polar in-frame deletion was made in cps14F in E. coli clone DH5α(pMK100REV), resulting in clone DH5α(pMK113). Membranes of E. coli DH5α(pMK100REV), which contains cps14E to cps14G, showed glucosyl and galactosyl transferase activity. Both activities were still present in clone DH5α(pMK113), which shows that Cps14F is not required for galactosyltransferase activity. Thus cps14G solely encodes the β-1,4-galactosyltransferase activity required for the second step in the biosynthesis of the oligosaccharide subunit in S. pneumoniae serotype 14, the transfer of galactose to lipid-linked glucose. The homology of Cps14F and Cps14G to the N-terminal half and C-terminal half of the Sphingomonas SpsK protein, respectively, suggests a combined function for both proteins. The function of Cps14F is not clear, but the reduced glycosyltransferase activity in clone DH5α(pMK113) compared with the activity of DH5α(pMK100REV) indicates that Cps14F has an enhancing role in glycosyltransferase activity. It is unlikely that cps14F encodes another glycosyltransferase since preliminary sequencing data revealed the presence of two ORFs downstream of cps14H, which both show homology to several other glycosyltransferases. These putative genes would encode the β-1,3-N-acetylglucosaminyltransferase and the β-1,4-galactosyltransferase, required for the last two sugar additions in the oligosaccharide subunit synthesis of S. pneumoniae serotype 14 CP. Cross-hybridizations between cps14F and cps14G probes and chromosomal DNA of several other pneumococcal serotypes showed that only serotypes 13, 15A, 15B, 15C, and 15F contain cps14F- and cps14G-like sequences. Both probes reacted with EcoRI fragments of the same size, indicating that both genes also exist together in these serotypes. In addition, these five types all contain Galβ(1-4)Glc as part of their subunit structure (10), and transferase assays performed with membranes of these serotypes showed that lipid-linked Galβ(1-4)Glc is formed (data not shown). These data suggest that genes homologous to cps14G encode similar β-1,4-galactosyltransferases, required for the CP synthesis in these pneumococcal serotypes.
Figure legend (fragment): Table II). Glc, glucose; Gal, galactose; and Gal-Glc, lactose.
Table II (incorporation of 14C label into the glycolipid fraction of E. coli membranes; two values per clone)
Clone        Value 1   Value 2
pMK112       121       3
pMK100REV    174       116
pMK100       53        51
pMK113       85        64
pGV100       8         4
pMK104       4         2
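The comparison between pMK113 and pMK100REV drawn in the Discussion can be made explicit from the Table II values. The sketch below is only illustrative: the numbers are the two values listed per clone in Table II, their units and column headings are not stated here, and treating the columns as two assay conditions is an assumption. The helper name is hypothetical.

```python
# Table II values as listed per clone: (first value, second value).
# Units and column meanings are not given in the text and are left open.
TABLE_II = {
    "pMK112": (121, 3),
    "pMK100REV": (174, 116),
    "pMK100": (53, 51),
    "pMK113": (85, 64),
    "pGV100": (8, 4),
    "pMK104": (4, 2),
}


def relative_incorporation(clone, reference, column):
    """Incorporation of one clone as a fraction of a reference clone."""
    return TABLE_II[clone][column] / TABLE_II[reference][column]


for column in (0, 1):
    ratio = relative_incorporation("pMK113", "pMK100REV", column)
    print(f"column {column + 1}: pMK113 / pMK100REV = {ratio:.2f}")
```

In both columns the cps14F deletion clone retains roughly half of the incorporation seen with pMK100REV, which is the reduction the text interprets as an enhancing, but not essential, role for Cps14F.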