Molecular Cloning and Enzymatic Characterization of a UDP-GalNAc:GlcNAcβ-R β1,4-N-Acetylgalactosaminyltransferase from Caenorhabditis elegans

A common terminal structure in glycans from animal glycoproteins and glycolipids is the lactosamine sequence Galβ4GlcNAc-R (LacNAc or LN). An alternative sequence that occurs in vertebrate as well as in invertebrate glycoconjugates is GalNAcβ4GlcNAc-R (LacdiNAc or LDN). Whereas genes encoding β4GalTs responsible for LN synthesis have been reported, the β4GalNAcT(s) responsible for LDN synthesis has not been identified. Here we report the identification of a gene from Caenorhabditis elegans encoding a UDP-GalNAc:GlcNAcβ-R β1,4-N-acetylgalactosaminyltransferase (Ceβ4GalNAcT) that synthesizes the LDN structure. Ceβ4GalNAcT is a member of the β4GalT family, and its cDNA is predicted to encode a 383-amino acid type 2 membrane glycoprotein. A soluble, epitope-tagged recombinant form of Ceβ4GalNAcT expressed in CHO-Lec8 cells was active using UDP-GalNAc, but not UDP-Gal, as a donor toward a variety of acceptor substrates containing terminal β-linked GlcNAc in both N- and O-glycan type structures. The LDN structure of the product was verified by co-chromatography with authentic standards and 1H NMR spectroscopy. Moreover, Chinese hamster ovary CHO-Lec8 and CHO-Lec2 cells expressing Ceβ4GalNAcT acquired LDN determinants on endogenous glycoprotein N-glycans, demonstrating that the enzyme is active in mammalian cells as an authentic β4GalNAcT. The identification and availability of this novel enzyme should enhance our understanding of the structure and function of LDN-containing glycoconjugates.

Many of the functional moieties of complex glycoconjugates are in the terminal sequences of N- and O-glycans of glycoproteins and in glycolipids, which are recognized by a growing number of carbohydrate binding proteins (1-4). A common terminal motif that is modified in a variety of ways by the additions of other sugars and sulfate groups is the lactosamine sequence Galβ4GlcNAc-R (LacNAc or LN), which is generated by a large family of UDP-Gal:GlcNAcβ-R β1,4-galactosyltransferases (β4GalTs) acting on terminal GlcNAc residues (5). However, another common terminal motif found in vertebrate and invertebrate glycoconjugates is the GalNAcβ4GlcNAc-R (LacdiNAc or LDN) sequence. The LDN motif occurs in mammalian pituitary glycoprotein hormones, where the terminal GalNAc residues are 4-O-sulfated (6) and function as recognition markers for clearance by the endothelial cell Man/S4GGnM receptor (7). However, nonpituitary mammalian glycoproteins also contain LDN determinants (8-11), indicating that expression of LDN determinants in vertebrate glycoconjugates is more widespread than once thought. In addition, LDN and modifications of LDN sequences are common antigenic determinants in many parasitic nematodes and trematodes (12-17).
The LDN structure can be considered a variant of the more typical LN structure generated by a family of β4GalTs that includes the best characterized of all glycosyltransferases, the β4GalT I or lactose synthase (18-26). As more members of this family have been studied and the cDNAs encoding them have been cloned, it is evident that they share highly homologous regions within their amino acid sequences (27-36). Interestingly, these regions of homology are also found within the amino acid sequence of a snail UDP-GlcNAc:GlcNAcβ-R β1,4-N-acetylglucosaminyltransferase (37-39). This latter finding raised the possibility that the β4GalNAcT enzyme(s) might also have amino acid sequence homology to members of the β4GalT family. However, despite many studies reporting on the activity of a putative β4GalNAcT capable of generating LDN sequences (11, 40-46), the gene(s) encoding the putative β4GalNAcT responsible for LDN synthesis has not been identified.
In searching for the putative β4GalNAcT required for LDN synthesis, we examined genes in Caenorhabditis elegans. The C. elegans genome contains three open reading frames that encode proteins with sequence homology to the β4GalT family.
One of these open reading frames (ORF R10E11.4; sqv-3) is predicted to encode a protein involved in vulval invagination (47) and is likely to be a UDP-Gal:xylose β-R β1,4-galactosyltransferase (33,48). Another of these open reading frames (ORF W02B12.11) encodes a protein for which no enzymatic activity has yet been reported. The third open reading frame (ORF Y73E7A.7) was identified more recently than the two mentioned above and therefore had not been reported in previous studies (27,31). In this study, we have cloned a cDNA corresponding to the latter open reading frame and demonstrate that it encodes a β4GalNAcT, which we have termed Ceβ4GalNAcT. Ceβ4GalNAcT is active when expressed in mammalian cells in generating LDN determinants on N-glycans of glycoproteins.

EXPERIMENTAL PROCEDURES
Materials-All chemicals and reagents used in this study, unless otherwise indicated, were from Sigma. The C. elegans cDNA library was a gift from Dr. Robert Barstead (Oklahoma Medical Research Foundation, Oklahoma City, OK). The QIAquick gel extraction kit was from Qiagen (Valencia, CA). Restriction enzymes were from New England Biolabs (Beverly, MA). The pCR 2.1 vector was from Invitrogen. The pcDNA3.1(+)-TH was a gift from Dr. Alireza R. Rezaie (Department of Biochemistry and Molecular Biology, St. Louis University School of Medicine, St. Louis, MO). FuGENE 6 and Complete protease inhibitor mixture were from Roche Molecular Biochemicals. N-Glycanase was from Glyko (Novato, CA). HighSignal West Pico Chemiluminescent Substrate was from Pierce. GlcNAcβ1-3GalNAcα1-O-pNP (core 3-O-pNP) and GlcNAcβ1-6GalNAcα1-O-pNP (core 6-O-pNP) were obtained from Toronto Research Chemicals (Toronto, Canada). Acceptor compounds (see Table II) 1-3, 5, 9, and 12 were purchased from Sigma, 4 was from Koch-Light Laboratories, and 6-8 were from Toronto Research Chemicals. Compounds 10 and 11 were a kind gift from Dr. L. Anderson (University of Wisconsin, Madison, WI), and 14-17 were from Dr. J. Lönngren (University of Stockholm). Compounds 13 (39) and 18-21 (32) were synthesized as described previously. Radiolabeled nucleotide sugars were obtained from PerkinElmer Life Sciences and were diluted with unlabeled nucleotide sugars (Sigma) to give the desired specific radioactivity.
Cloning and Sequencing of the Ceβ4GalNAcT cDNA-A BlastP search of the NCBI nonredundant protein database for homologues of the human β4GalT I (accession number CAA39074) identified a hypothetical protein encoded by an open reading frame in the C. elegans genome designated Y73E7A.7. A cDNA was amplified by PCR from a mixed stage C. elegans cDNA library using primers corresponding to the 5′- and 3′-ends of this open reading frame (5′-GCCACCATGGCTTTTCGTCATTTGGC-3′; 5′-CTAAAAACACGTTGGAAAGTCC-3′). Amplification was carried out at 95°C for 2 min 30 s, followed by 35 cycles of 95°C for 50 s, 53°C for 50 s, and 72°C for 1 min 50 s, and then a final extension at 72°C for 10 min. The PCR product was purified from an agarose gel slice using a QIAquick gel extraction kit, cloned into the pCR 2.1 vector, and sequenced on both strands at the Sequencing Facility of the Oklahoma Medical Research Foundation (Oklahoma City, OK).
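The primer pair and cycling program described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are illustrative, and the Wallace rule used for the melting temperature is only a rough rule of thumb that is not mentioned in the text. It shows that the forward primer carries an ATG start codon after a six-nucleotide leader and that the reverse primer, read on the coding strand, ends in a TAG stop codon.

```python
# Primer sequences as quoted in the text (joined across the line break).
FORWARD = "GCCACCATGGCTTTTCGTCATTTGGC"
REVERSE = "CTAAAAACACGTTGGAAAGTCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature: 2 degrees per A/T, 4 per G/C.

    This is an assumption for illustration; it is crude for primers
    longer than about 20 nt.
    """
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


def program_seconds(initial: int, cycles: int, steps: tuple, final: int) -> int:
    """Total hold time of a PCR program, ignoring ramp times."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    print("forward primer length:", len(FORWARD))
    print("ATG starts at index:", FORWARD.find("ATG"))
    print("reverse primer on coding strand:", reverse_complement(REVERSE))
    # 2 min 30 s, then 35 x (50 s + 50 s + 1 min 50 s), then 10 min
    total = program_seconds(150, 35, (50, 50, 110), 600)
    print("program hold time (min):", total / 60)
```

Ramp times between temperatures are not included, so the real run would take somewhat longer than the computed hold time.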
Construction of an Expression Vector Encoding a Soluble, Epitope-tagged Form of Ceβ4GalNAcT-A PsiI (partial)/PvuII DNA fragment starting at bp 87 of the Ceβ4GalNAcT open reading frame and extending beyond the stop codon was subcloned into the EcoRV site of the pcDNA3.1(+)-TH vector. The resulting vector (pCMV-SH-Ceβ4GalNAcT) encodes a fusion protein, designated SH-Ceβ4GalNAcT, which consists of a signal peptide at the N terminus followed by an HPC4 epitope and then the catalytic domain of the Ceβ4GalNAcT (beginning at Lys30, the first amino acid after the transmembrane domain). The HPC4 epitope is recognized by the Ca2+-dependent monoclonal antibody HPC4 (49,50). SH-Ceβ4GalNAcT is under the transcriptional control of the cytomegalovirus promoter, which is present in the vector.
Expression of SH-Ceβ4GalNAcT-CHO-Lec8 and CHO-Lec2 cells were transfected with pCMV-SH-Ceβ4GalNAcT using FuGENE 6, according to the manufacturer's instructions, and cultured in Dulbecco's modified Eagle's medium containing 10% fetal calf serum and 600 μg/ml Geneticin to select for stably transformed cells. After 4 weeks of culturing in medium containing Geneticin, the cells were cultured in the same medium without Geneticin, and the culture medium was harvested every 3 days and used to purify SH-Ceβ4GalNAcT. To assay intracellular β4GalNAcT activity and for Western blots, cells were washed with 75 mM sodium cacodylate, pH 7.0, and lysed in a buffer of 50 mM sodium cacodylate, pH 7.0, 20 mM MnCl2, 1% Triton X-100, 1× Complete protease inhibitor mixture (EDTA-free). The lysates were centrifuged at 12,000 × g for 3 min, and the supernatants were used for further analyses.
Purification of SH-Ceβ4GalNAcT-Medium containing SH-Ceβ4GalNAcT was centrifuged at 1,500 × g for 5 min to remove cellular debris and then incubated with HPC4-UltraLink beads (5 mg of HPC4 antibody/ml of beads; 0.1 μl of beads/ml of medium) for 1 h at room temperature on a rotating platform. The beads were collected by centrifugation at 600 × g for 3 min and washed three times with 10 ml of 100 mM sodium cacodylate, pH 7.0, 2 mM CaCl2. The beads were then resuspended in the same buffer with the addition of 20 mM MnCl2 and used as the enzyme source. For Western blot analysis, the bound material was released by incubating the beads in a buffer of 50 mM sodium cacodylate, pH 7.0, 20 mM EDTA for 10 min at room temperature and then collecting the supernatant.
SDS-PAGE and Western Blot Analyses-Cell lysates were treated with N-glycanase in a buffer of 20 mM sodium phosphate, pH 7.5, 50 mM β-mercaptoethanol, 0.1% SDS, 0.75% Nonidet P-40 for 3 h at 37°C. Control treatments were carried out in the same way but without adding N-glycanase. The lysates were then mixed with loading buffer, resolved by SDS-PAGE (4-20% gradient), and transferred to a nitrocellulose membrane. The membrane was blocked with 5% bovine serum albumin in a buffer of 20 mM Tris-HCl, pH 7.2, 150 mM NaCl, 2 mM CaCl2, 0.05% Tween 20 for 5 h at 4°C. It was then incubated with the primary antibody (mouse monoclonal anti-LDN IgM SMLDN1.1 (16) or HPC4 IgG) in the same buffer (without bovine serum albumin) for 1 h at room temperature, washed in the same buffer, and incubated with the secondary antibody (horseradish peroxidase-conjugated, goat anti-mouse IgM or IgG) as before. The membrane was then washed again, incubated in HighSignal West Pico Chemiluminescent Substrate for 2 min at room temperature, and exposed to a BioMax film (Eastman Kodak Co.) for 1 min. The film was then developed using a processing machine (Konica SRX-101).
β4GalNAcT Assays-Standard assays were performed essentially as described previously (45) in a 25-μl reaction mixture containing 2.5 μmol of sodium cacodylate, pH 7.2, 12.5 nmol of UDP-[3H]GalNAc (2.5 Ci/mol), 1 μmol of MnCl2, 0.1 μmol of ATP, 0.1 μl of Triton X-100, 2 μl of beads, and acceptor substrate, containing 25 nmol of terminal GlcNAc at the nonreducing end unless otherwise indicated. Control assays lacking the acceptor substrate were carried out to correct for incorporation into endogenous acceptors, and all assays were carried out in duplicate. All assays were linear with time for up to 180 min. After incubation at 37°C for 180 min, the reaction was stopped. When oligosaccharides or glycopeptides were the acceptor, the labeled product was separated from unincorporated label by chromatography on a 1-ml column of Dowex 1-X8 (Cl− form) according to Easton et al. (51). When oligosaccharides with a hydrophobic aglycon (pNP) were used as the acceptor, the product was isolated using Sep-Pak C-18 cartridges (Waters) as described (52). The isolated products were assayed for incorporation of radioactivity by liquid scintillation.
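The amounts in the standard assay translate directly into the concentrations quoted in the table footnotes (0.5 mM donor, 1 mM acceptor). The sketch below works through that arithmetic under two stated assumptions: that the reaction volume is 25 μl with the donor and acceptor given in nmol, and that the specific radioactivity of 2.5 Ci/mol is taken at face value. The function names and the example counts are hypothetical, and counting efficiency is ignored.

```python
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in 1 Ci


def millimolar(amount_nmol: float, volume_ul: float) -> float:
    """Concentration in mM; nmol per microliter is numerically equal to mM."""
    return amount_nmol / volume_ul


def dpm_per_pmol(specific_activity_ci_per_mol: float) -> float:
    """Convert a specific radioactivity in Ci/mol to dpm per pmol."""
    return specific_activity_ci_per_mol * DPM_PER_CURIE / 1e12


def pmol_transferred(sample_dpm: float, blank_dpm: float,
                     specific_activity_ci_per_mol: float = 2.5) -> float:
    """Product formed, after subtracting the no-acceptor control."""
    net = sample_dpm - blank_dpm
    return net / dpm_per_pmol(specific_activity_ci_per_mol)


if __name__ == "__main__":
    volume_ul = 25.0
    print("donor (mM):", millimolar(12.5, volume_ul))
    print("acceptor (mM):", millimolar(25.0, volume_ul))
    print("dpm per pmol GalNAc:", dpm_per_pmol(2.5))
    # Hypothetical counts, for illustration only.
    print("pmol transferred:", pmol_transferred(1210.0, 100.0))
```

The blank subtraction mirrors the control assays lacking acceptor substrate that the protocol uses to correct for incorporation into endogenous acceptors.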
High pH Anion Exchange Chromatography with Pulsed Amperometric Detection (HPAEC-PAD)-The product catalyzed by SH-Ceβ4GalNAcT using GlcNAcβ1-O-pNP as acceptor was isolated using a Sep-Pak C-18 cartridge (1 cm3) and lyophilized. Three nmol of the product (dissolved in water) were analyzed by a Dionex HPAEC-PAD system, using a PA-1 column with a 100 mM NaOH solution at a flow rate of 1 ml/min. The standard containing the authentic LDN structure GalNAcβ1-4GlcNAcβ1-O-pNP was synthesized using bovine β4GalT I and GlcNAcβ1-O-pNP as the acceptor for UDP-GalNAc in the standard assay described above. Commercially acquired GlcNAcβ1-3GalNAcα1-O-pNP (core 3-O-pNP) and GlcNAcβ1-6GalNAcα1-O-pNP (core 6-O-pNP) were also used as standards.
Large Scale Synthesis of Product for 1H NMR Analysis-Synthesis was carried out overnight at 37°C in a 1-ml reaction mixture containing 50 μmol of sodium cacodylate, pH 7.0, 300 nmol of GlcNAcβ1-S-pNP, 1 μmol of UDP-GalNAc, 20 μmol of MnCl2, 5 μmol of ATP, 3 μmol of NaN3, and 100 μl of beads. The product was then isolated using a Sep-Pak C-18 cartridge (1 cm3) and lyophilized.
400-MHz 1H NMR-150 nmol of the product catalyzed by SH-Ceβ4GalNAcT using GlcNAcβ1-S-pNP as acceptor were treated with D2O (99.75 atom %; Merck) three times with intermediate lyophilization. Finally, the sample was redissolved in 400 μl of D2O (99.96 atom %; Sigma-Aldrich). 1H NMR spectroscopy was performed on a Bruker MSL 400 spectrometer operating at 400 MHz at a probe temperature of 300 K. Resolution enhancement was achieved by Lorentzian to Gaussian transformation. Chemical shifts are expressed in ppm downfield from internal sodium 4,4-dimethyl-4-silapentane-1-sulfonate but were actually measured by reference to internal acetone (δ = 2.225 ppm in D2O).

RESULTS
Isolation of the cDNA Encoded by Y73E7A.7 (Ceβ4GalNAcT)-A potential C. elegans open reading frame designated Y73E7A.7 was identified by a BlastP search as encoding a homologue of the human β4GalT I. An identical cDNA (GenBank accession number AY130767) was amplified by PCR from a mixed stage C. elegans cDNA library using primers corresponding to the 5′- and 3′-ends of this open reading frame, establishing that the gene is expressed in vivo. The cDNA of Y73E7A.7 encodes a predicted 383-amino acid protein with a single transmembrane domain in a type 2 topology, which is a common topological motif in glycosyltransferases. The protein encoded by Y73E7A.7 is predicted to contain six potential N-glycosylation sites and two DVD motifs, which are thought to participate in metal ion binding (53) (Fig. 1). Curiously, the last four potential N-glycosylation sites share an identical sequon (NQT), the significance of which is not clear at this time. The protein sequence encoded by Y73E7A.7 is 35.5% identical to human β4GalT I (Fig. 2A) and is more closely related to the first four members of the β4GalT family (I, II, III, and IV) than to the other three (Fig. 2B).
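Potential N-glycosylation sites and DVD motifs of the kind counted above can be located with a simple pattern scan. The following is a minimal sketch assuming the conventional sequon definition N-X-S/T with X not proline; the test sequence is a short hypothetical peptide made up for illustration, not the Y73E7A.7 protein, and the function names are likewise illustrative.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps
# matches from consuming residues, so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")
DVD = re.compile(r"(?=DVD)")


def find_sequons(protein: str) -> list:
    """Return (1-based position of Asn, tripeptide) for each potential sequon."""
    return [(m.start() + 1, protein[m.start():m.start() + 3])
            for m in SEQUON.finditer(protein)]


def find_dvd_motifs(protein: str) -> list:
    """Return the 1-based start position of each DVD motif."""
    return [m.start() + 1 for m in DVD.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical toy sequence; NPS is skipped because X is proline.
    toy = "MKNQTALDVDLNPSGNQTWDVDKNAS"
    print(find_sequons(toy))
    print(find_dvd_motifs(toy))
```

A sequon found this way is only a potential site; whether it is occupied has to be established experimentally, for example by the N-glycanase shift reported later in the text.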
Expression and Purification of a Soluble, Recombinant Form of the Protein Encoded by Y73E7A.7 (SH-Ceβ4GalNAcT)-To assess whether Y73E7A.7 encodes an active β4-galactosyltransferase or possibly a β4-N-acetylgalactosaminyltransferase, a soluble, recombinant form of the protein was generated lacking the cytoplasmic N terminus and transmembrane domain and containing the HPC4 peptide epitope at the new N terminus. This construct was stably expressed in Chinese hamster ovary CHO-Lec8 cells. These cells are impaired in the transport of UDP-Gal into the Golgi (54) and consequently generate hybrid- and complex-type N-glycans containing terminal GlcNAc and O-glycans containing the simple Tn antigen GalNAcα1-Ser/Thr (55-57). The transfected cells expressing Y73E7A.7, but not the control mock-transfected cells, acquired a novel intracellular GalNAcT activity in the cell extracts capable of utilizing UDP-GalNAc as the donor and GlcNAcβ1-S-pNP as the acceptor (Fig. 3A). The recombinant protein containing the HPC4 epitope from extracellular medium was bound by HPC4-conjugated beads, confirming the β4GalNAcT activity of the enzyme encoded by Y73E7A.7 (Fig. 3A). A Western blot of the material bound to the HPC4-conjugated beads (Fig. 3B) confirmed that it corresponded to the predicted size of the HPC4 epitope-tagged protein (43.1-kDa peptide plus N-glycans) as discussed below. These data demonstrate that Y73E7A.7 encodes an active β4GalNAcT. The enzyme was designated the C. elegans UDP-GalNAc:GlcNAcβ-R β1,4-N-acetylgalactosaminyltransferase (Ceβ4GalNAcT), and the soluble, HPC4 epitope-tagged version was designated SH-Ceβ4GalNAcT.
Donor and Substrate Specificity of SH-Ceβ4GalNAcT-The enzyme purified from the medium using HPC4-conjugated beads was used in assays to further characterize its activity. In assays to determine its specificity for nucleotide-sugar donors (Table I), SH-Ceβ4GalNAcT efficiently utilized UDP-GalNAc but did not significantly utilize UDP-Gal, UDP-GlcNAc, or UDP-Glc. To define the acceptor specificity of Ceβ4GalNAcT, the enzyme was tested with a wide variety of acceptors (Table II). SH-Ceβ4GalNAcT efficiently utilizes free GlcNAc and all substrates containing terminal β-linked GlcNAc in both N- and O-glycan type structures. SH-Ceβ4GalNAcT less effectively utilizes α-linked GlcNAc or 6-sulfated GlcNAc and does not utilize acceptors with terminal β-linked Gal, Glc, or GalNAc. The acceptor substrate specificity of SH-Ceβ4GalNAcT is therefore similar to the broad specificity reported for human β4GalT I (32). In contrast, the snail β4-GlcNAcT has a marked preference for acceptors with β1,6-linked terminal GlcNAc (39) (see Table II for a side-by-side comparison).
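The relative activities reported for the four nucleotide-sugar donors (Table I: UDP-GalNAc 100, UDP-GlcNAc 0.7, UDP-Glc 0.2, UDP-Gal 1, with 100% corresponding to 5.9 nmol/min/ml of bead suspension) can be converted to absolute rates and fold preferences. The sketch below does only that arithmetic; the dictionary and function names are illustrative and not taken from the study.

```python
# Relative activities (%) and the 100 % reference rate from Table I.
REFERENCE_RATE = 5.9  # nmol/min/ml of bead suspension
RELATIVE_ACTIVITY = {
    "UDP-GalNAc": 100.0,
    "UDP-GlcNAc": 0.7,
    "UDP-Glc": 0.2,
    "UDP-Gal": 1.0,
}


def absolute_rate(relative_percent: float,
                  reference: float = REFERENCE_RATE) -> float:
    """Convert a relative activity in percent to nmol/min/ml."""
    return relative_percent / 100.0 * reference


def fold_preference(donor_a: str, donor_b: str,
                    table: dict = RELATIVE_ACTIVITY) -> float:
    """How many times more activity is seen with donor_a than with donor_b."""
    return table[donor_a] / table[donor_b]


if __name__ == "__main__":
    for donor, percent in RELATIVE_ACTIVITY.items():
        print(donor, round(absolute_rate(percent), 4), "nmol/min/ml")
    print("GalNAc over Gal:", fold_preference("UDP-GalNAc", "UDP-Gal"))
```

Values near the bottom of the table are close to background, so the fold preferences for those donors should be read as lower bounds rather than precise ratios.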
In view of the sequence homology between Ceβ4GalNAcT and the β4GalT family, we examined whether the modifier protein α-lactalbumin would affect the acceptor specificity of SH-Ceβ4GalNAcT. α-Lactalbumin, which is expressed in lactating mammary glands, associates with β4GalT I and switches its acceptor specificity from GlcNAc-R to free Glc, thus forming lactose synthase (58). However, unlike its effect on β4GalT I, α-lactalbumin does not induce SH-Ceβ4GalNAcT to utilize Glc as an acceptor instead of GlcNAc (Table III). α-Lactalbumin does appear to slightly depress activity of Ceβ4GalNAcT toward the free GlcNAc acceptor, suggesting a possible weak interaction between the enzyme and α-lactalbumin.
To further establish the structure of the product generated by SH-Ceβ4GalNAcT using GlcNAcβ1-S-pNP as acceptor, the product was analyzed by 1H NMR spectroscopy (Fig. 5). The spectrum shows two H-1 doublets at δ = 5.146 ppm and 4.540 ppm. The coupling constants of the H-1 doublets (10.5 and 8.5 Hz, respectively) indicate that both C-1 atoms are in the β-anomeric configuration (59). The doublet at 5.146 ppm and the signal at δ = 2.013 ppm can be assigned to the H-1 and the CH3-NAc of GlcNAcβ1-S-pNP by analogy to the resonance positions in GlcNAcβ1-4GlcNAcβ1-S-pNP (38). The doublet at δ = 4.540 ppm and the signal at δ = 2.077 ppm have shifts that are close to those reported for a β4-linked GalNAc residue (44,45). The NMR spectrum therefore confirms that the analyzed product is GalNAcβ1-4GlcNAcβ1-S-pNP.
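The assignment of anomeric configuration from the H-1 coupling constants follows a familiar rule of thumb for gluco- and galacto-configured pyranoses: a large H-1/H-2 coupling reflects a trans-diaxial arrangement (β), whereas a small coupling reflects an axial-equatorial arrangement (α). The sketch below encodes that heuristic. The cut-off values are assumptions chosen for illustration, not values given in the text or in reference 59, and the rule does not apply to manno-configured sugars.

```python
def classify_anomer(j_h1_h2_hz: float,
                    beta_min_hz: float = 7.0,
                    alpha_max_hz: float = 4.5) -> str:
    """Classify an anomeric proton from its H-1/H-2 coupling constant.

    Thresholds are illustrative assumptions for gluco/galacto pyranoses.
    """
    if j_h1_h2_hz >= beta_min_hz:
        return "beta"
    if j_h1_h2_hz <= alpha_max_hz:
        return "alpha"
    return "ambiguous"


if __name__ == "__main__":
    # Coupling constants reported for the two H-1 doublets of the product.
    for shift_ppm, j_hz in ((5.146, 10.5), (4.540, 8.5)):
        print(shift_ppm, "ppm:", classify_anomer(j_hz))
```

Both reported doublets fall well above the β cut-off used here, consistent with the conclusion drawn in the text.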
In Vivo Synthesis of LDN Structures on N-Glycans by SH-Ceβ4GalNAcT-Since SH-Ceβ4GalNAcT was active in cell extracts when expressed in CHO-Lec8 cells (Fig. 3A), we examined whether it can generate LDN structures on endogenous glycan acceptors in animal cells. Cell lysates from nontransfected CHO-Lec8 and CHO-Lec2 cells and transfected CHO-Lec8 and CHO-Lec2 cells expressing SH-Ceβ4GalNAcT were examined for the presence of LDN determinants by a Western blot analysis using a monoclonal antibody SMLDN1.1 against LDN (16) (Fig. 6A). As indicated above, the CHO-Lec8 cells are deficient in UDP-Gal transport into the Golgi (54), whereas the CHO-Lec2 cells are deficient in CMP-sialic acid transport into the Golgi and hence generate nonsialylated glycans terminating in Gal residues (60). Nontransfected CHO-Lec8 and CHO-Lec2 cells did not express detectable levels of LDN determinants as detected by SMLDN1.1 (Fig. 6A). By contrast, both cell lines expressing SH-Ceβ4GalNAcT expressed the LDN epitope on several glycoproteins. It would be predicted that the Ceβ4GalNAcT might only add GalNAc to N-glycans in CHO cells, since CHO cells produce O-glycans of the core 1 structure (Galβ3GalNAcα1-Ser/Thr) lacking in GlcNAc residues (61,62). Cell extracts derived from CHO cell lines expressing SH-Ceβ4GalNAcT were treated with N-glycanase to determine whether LDN determinants were present in N-glycans. N-Glycanase treatment quantitatively removed the LDN-reactive epitopes from glycoproteins, demonstrating that LDN was expressed exclusively on N-glycans by the SH-Ceβ4GalNAcT. Transfected CHO-Lec2 cells expressed lower levels of LDN determinants than transfected CHO-Lec8, possibly due to competition from endogenous β4GalTs, since the cells expressed equivalent amounts of SH-Ceβ4GalNAcT as detected by a Western blot using the HPC4 antibody (Fig. 6B).
The latter experiment also confirmed the molecular weight of SH-Ceβ4GalNAcT, demonstrating that N-glycanase treatment shifted the 59.4-kDa protein to 43.1 kDa, the predicted peptide size of SH-Ceβ4GalNAcT.

DISCUSSION
The results presented here provide several new insights into the biosynthesis of animal cell glycoproteins. We have identified a specific N-acetylgalactosaminyltransferase Ceβ4GalNAcT from C. elegans capable of utilizing UDP-GalNAc as the donor for the transfer of GalNAc residues to terminal GlcNAc residues in a wide variety of acceptors to generate the LacdiNAc (LDN) sequence GalNAcβ4GlcNAc-R. The enzyme is a member of the β4-galactosyltransferase family, although Ceβ4GalNAcT is unable to utilize UDP-Gal as the donor. In vertebrate cells, the recombinant form of Ceβ4GalNAcT is fully functional and capable of generating the LDN structure in complex-type N-glycans of glycoproteins. This represents the first identification of a β4GalNAcT capable of generating the LDN sequence in animal glycoconjugates.
Although the LacNAc (LN) sequence Galβ4GlcNAc-R is a more general terminal modification in vertebrate glycoconjugates, the LDN sequence also occurs in several vertebrate glycoproteins and glycolipids, including pituitary glycoprotein hormones (63) and other glycoconjugates (8, 11, 64-66). A hormone-specific β4GalNAcT enzyme, active in the pituitary gland and other tissues, acts preferentially on glycoproteins containing a specific peptide motif (46, 63, 67-70). The GalNAc residue added to these hormones is subsequently 4-O-sulfated (71-73), and the resulting terminal GalNAc-4-SO4 acts as a clearance signal that regulates their circulatory half-lives (6, 74-76). The addition of the LDN motif to other glycoproteins, such as glycodelin (9, 66) and protein C (8), may also be cell- and protein-specific and may be important to the functional activities of these glycoproteins. In addition to the hormone-specific β4GalNAcT, a motif-independent β4GalNAcT activity has been detected in extracts from many cells (69), including human 293 cells (11), bovine mammary gland (43), snails (40,41), insect cells (45), and schistosomes (42,44). The LDN motif is also a more common structural feature in invertebrate glycoconjugates compared with the LN motif, especially as seen in many parasitic nematodes and trematodes (12-17, 77). However, neither the enzyme(s) nor gene(s) encoding the enzyme responsible for LDN synthesis in invertebrates have previously been defined.
FIG. 3. Expression and purification of the protein encoded by Y73E7A.7 (SH-Ceβ4GalNAcT). A, intracellular extracts of wild-type CHO-Lec8 cells (Lec8) and CHO-Lec8 cells expressing a soluble, HPC4 epitope-tagged form of the protein encoded by Y73E7A.7 (SH-Ceβ4GalNAcT; Lec8/GT) were tested for GalNAcT (striped bars) and GalT (black bars) activities using GlcNAcβ1-S-pNP as acceptor. The material captured by HPC4 beads from the extracellular medium from both cell types was also tested for these activities. The activity is indicated in pmol of donor sugar transferred/h for 10^5 cells (extracts) or 10 ml of medium (beads). B, Western blot using the HPC4 monoclonal antibody of the material captured on HPC4 beads from 10 ml of medium from Lec8/GT cells. The positions of molecular mass markers are indicated on the left in kDa.
TABLE I
Acceptor         Donor         Relative activity (%) a
GlcNAcβ-S-pNP    UDP-GalNAc    100
GlcNAcβ-S-pNP    UDP-GlcNAc    0.7
GlcNAcβ-S-pNP    UDP-Glc       0.2
GlcNAcβ-S-pNP    UDP-Gal       1
a Assays were carried out in duplicate as described under "Experimental Procedures" using SH-Ceβ4GalNAcT attached to HPC4 beads with a donor concentration of 0.5 mM and an acceptor concentration of 1 mM. For comparison, 100% activity corresponds to 5.9 nmol/min/ml of bead suspension.
TABLE II
[Table body not recovered; surviving fragment: Manβ1-4GlcNAcβ1-4GlcNAc-Asn-glycopeptide 48 365 GlcNAcβ1-2Manα1-3 }]
a Assays were carried out in duplicate as described under "Experimental Procedures" using SH-Ceβ4GalNAcT attached to HPC4 beads with a donor concentration of 0.5 mM and an acceptor concentration of 1 mM terminal GlcNAc. For comparison, 100% activity (using free GlcNAc as acceptor) corresponds to 2.1 nmol/min/ml of bead suspension.
b Also for comparison, relative activities with the same acceptors for human β4GalT I (32) and L. stagnalis β4GlcNAcT (39) are taken from previous publications.
TABLE III
[Table body not recovered]
a Assays were carried out in duplicate as described under "Experimental Procedures" using SH-Ceβ4GalNAcT attached to HPC4 beads with a UDP-GalNAc concentration of 0.5 mM. For comparison, 100% activity corresponds to 2.1 nmol/min/ml of bead suspension.
Ceβ4GalNAcT is clearly a member of the β4GalT family of enzymes with homology to the other members found in various species ranging from C. elegans to mammals. Curiously, the β4GalT I or lactose synthase is capable of utilizing both UDP-Gal and UDP-GalNAc, and in the presence of α-lactalbumin, this enzyme is stimulated to utilize UDP-GalNAc as the donor to generate LDN with free GlcNAc as the acceptor (78). Thus, we considered the possibility that the LDN structure might not be generated by a separate enzyme specific for UDP-GalNAc. Therefore, it is especially interesting that the Ceβ4GalNAcT, although a member of the β4GalT family, does not utilize UDP-Gal. Two recent crystallographic studies on β4GalT I have shed light on the amino acid residues that are important in donor and acceptor recognition by the enzymes of the β4GalT family. The first study demonstrated that changing a tyrosine residue (Tyr289) in the bovine β4GalT I to isoleucine altered its donor specificity from UDP-Gal to UDP-GalNAc (21). It is noteworthy that the Ceβ4GalNAcT contains an isoleucine residue (Ile257) at the corresponding position. The second study identified 12 amino acids in the bovine β4GalT I that constitute its acceptor binding site (79). These amino acids vary considerably among members of the β4GalT family, and Ceβ4GalNAcT has between 2 and 4 of these residues in common with each of the other members of this family. The specific amino acids in Ceβ4GalNAcT responsible for its sugar nucleotide and acceptor specificity await identification.
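Identifying the residue at the "corresponding position" in a homologue, as done above for Tyr289 of bovine β4GalT I and Ile257 of Ceβ4GalNAcT, amounts to walking along a pairwise alignment while counting ungapped residues in each sequence. The sketch below illustrates the bookkeeping on a short hypothetical alignment; the sequences and the function name are invented for illustration and are not the real enzyme sequences.

```python
def aligned_residue(ref_aln: str, qry_aln: str, ref_pos: int) -> tuple:
    """Find what the query has opposite a given reference residue.

    ref_aln and qry_aln are the two rows of a pairwise alignment with '-'
    for gaps. ref_pos is the 1-based residue number in the ungapped
    reference. Returns (reference residue, query residue or '-',
    1-based query residue number or None if the query has a gap there).
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("alignment rows must have equal length")
    ref_count = 0
    qry_count = 0
    for ref_char, qry_char in zip(ref_aln, qry_aln):
        if ref_char != "-":
            ref_count += 1
        if qry_char != "-":
            qry_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return (ref_char, qry_char, qry_count if qry_char != "-" else None)
    raise ValueError("ref_pos is beyond the end of the reference sequence")


if __name__ == "__main__":
    # Hypothetical toy alignment: reference Tyr at position 6 pairs with
    # a query Ile at position 5.
    reference = "MKTW-AYLQ"
    query = "M--TGAILQ"
    print(aligned_residue(reference, query, 6))
    print(aligned_residue(reference, query, 2))
```

Because residue numbers shift with every gap, the two positions generally differ, which is why the corresponding residues in the two enzymes carry different numbers.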
It is noteworthy that the soluble form of the Ceβ4GalNAcT, when expressed in CHO-Lec8 or CHO-Lec2 cells, is capable of generating LDN epitopes on cellular glycoproteins. Interestingly, a significant amount of the total Ceβ4GalNAcT was present in cell extracts compared with extracellular media (Fig. 3A). This implies that the soluble enzyme is sufficiently retained in the cell to allow productive interactions with intracellular acceptor glycoproteins. The mode of retention of the soluble Ceβ4GalNAcT in CHO cells is not known. Targeting and retention in the Golgi apparatus for many glycosyltransferases requires membrane anchoring, although other domains of the enzymes are also important (80,81). Similarly, we previously observed that the soluble form of the α1,3-galactosyltransferase is also functional within cells (82). However, soluble forms of some other glycosyltransferases inefficiently glycosylate intracellular acceptors (83,84). It is conceivable that the high concentration of potential terminal GlcNAc-R acceptors in CHO-Lec8 and CHO-Lec2 cells could cause the retention of Ceβ4GalNAcT in appropriate Golgi compartments, based on the observation that many glycosyltransferases show affinity for their acceptor substrates and can be purified by affinity chromatography on immobilized acceptors (85). The Ceβ4GalNAcT could also interact with some other Golgi-resident protein, such as another glycosyltransferase, as proposed in the kin recognition hypothesis (86). Overall, our results support the possibility that Golgi retention of glycosyltransferases is likely to be a complex event mediated in part by multiple domains of the enzymes and not necessarily by the transmembrane domains.
Although Ceβ4GalNAcT is able to act on most of the common types of mammalian N- and O-glycans, there is only a limited knowledge of the glycan structures produced in C. elegans. It has been reported that the LDN motif appears at the reducing end of unusual O-glycans of C. elegans with the predicted sequence R-GalNAcβ4GlcNAc-Ser/Thr (87). Whether Ceβ4GalNAcT is responsible for synthesis of this type of structure is currently unknown, as are the enzymes that can potentially act to extend a glycan from the LDN motif.
The availability of a recombinant, well characterized β4GalNAcT active in mammalian cells should help advance our understanding of this type of glycosyltransferase and the structures and functions of LDN-containing glycans. The enzyme can be a valuable tool for both the in vitro and in vivo synthesis of LDN-based glycan structures, which may be used for further studies on their function in both vertebrates and invertebrates, as well as for studying LDN-containing antigenic glycans and pharmaceutical or commercial products.