Cloning and characterization of cDNA encoding 3-methylcholanthrene inducible rat mRNA for UDP-glucuronosyltransferase.

We have isolated cDNA clones of the mRNA for rat UDP-glucuronosyltransferase that catalyzes the glucuronidation of 4-nitrophenol, by using synthetic oligonucleotides as hybridization probes. The complete nucleotide sequence of the 1,927-base pairs cDNA insert has been determined. With untranslated sequences of 124 and 216 base pairs in the 5'- and 3'-terminal regions, respectively, the cDNA insert contained 1,587 base pairs that encode a complete primary structure of a putative precursor form of 4-nitrophenol UDP-glucuronosyltransferase with a calculated molecular weight of 60,114. The cDNA sequence also indicates the presence of 25 amino acids preceding the sequence determined by microsequence of the isolated protein. This extrapeptide, for the most part, consists of hydrophobic amino acids which are characteristic of the signal peptides as found for secretory proteins and most transmembrane proteins. Furthermore, the deduced amino acid sequence contains a putative halt transfer signal of a hydrophobic segment (residues 487-510), which is flanked on both sides by the peptide segments of highly charged amino acid residues (residues 463-486 and 511-529). These features are consistent with the properties of transmembrane proteins. Specific cDNA probes were used to analyze the induction of the enzyme in rat tissues by treatment with 3-methylcholanthrene. RNA blot analysis showed that 3-methylcholanthrene increased 10- to 15-fold the amount of hybridizable mRNA in liver. The livers and kidneys from 3-methylcholanthrene-treated rats were found to contain almost the same amount of hybridizable mRNA, although the basal level in the kidney was much higher than that of the liver, and the amounts in the lung were much lower than that of the liver and kidney.

UDP-glucuronosyltransferase (UDPGT) catalyzes glucuronidation of bilirubin and steroids as well as xenobiotic compounds including carcinogens (1-3). The enzyme is mainly located in the endoplasmic reticulum of hepatocytes (4). The isoenzymes, which exhibit different substrate specificities, have been purified in several laboratories (5-13). Some of the isoenzymes are increased in amount by treatment of animals with inducers such as 3-methylcholanthrene (3-MC) and phenobarbital (14, 15). These inducers are also known to induce the synthesis of cytochrome P-450, another drug-metabolizing enzyme. Recent work on the cDNA cloning of UDPGT has also established the existence of isoenzymes (16, 17). These facts indicate that UDPGT constitutes a multigene family, as reported for the cytochrome P-450 isoenzymes (18). However, the relationship between individual isoenzymes and their corresponding cDNAs has not been substantiated, because detailed structural analysis of the purified proteins has yet to be obtained. Falany and Tephly (11) and Roy Chowdhury et al. (13) have systematically purified the isoenzymes of UDPGT from rat liver microsomes, using chromatofocusing and affinity chromatography. The isoform with Mr 55 kDa has a high activity toward 4-nitrophenol (4-NP) as a substrate (11, 13), and it is inducible by treatment with 3-MC (8, 11, 19). The cytochromes P-450c and P-450d, located in the endoplasmic reticulum, are also induced by treatment with 3-MC (20, 21), and the regulatory DNA elements responsive to 3-MC have been identified (22). It would be interesting to study whether or not these drug-metabolizing enzymes share a common regulatory sequence that is responsive to 3-MC. As the first step toward elucidating the regulatory mechanism of the UDPGT gene, we have attempted to isolate a cDNA clone of the 3-MC-induced UDPGT mRNA.
In this study, we determined partial amino acid sequences for 4-NP UDPGT isoenzymes. On the basis of this information, we synthesized an oligomer probe to isolate its cDNA clone. The sequence analysis of the cloned cDNA enabled us to deduce the complete amino acid sequence of the enzyme and to suggest the existence of a precursor form of the enzyme. The mechanism of translocation of UDPGT into the luminal side of the endoplasmic reticulum and its orientation in the membrane are discussed on the basis of the primary structure of UDPGT.

EXPERIMENTAL PROCEDURES
Purification of Rat 4-NP UDP-Glucuronosyltransferase-Hepatic microsomes were prepared from female Wistar rats injected daily for 3 days with 3-MC (4 mg/100 g body weight), and solubilized in 0.5% Emulgen 913. The isoform of UDPGT with Mr 55 kDa was purified from Fraction A (eluate at pH 8.9) by chromatofocusing chromatography as described by Falany and Tephly (11). The purified enzyme had a high activity for 4-NP as a substrate and an Mr of 55 kDa as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis.
The Amino Acid Sequence of the NH2 Terminus and Tryptic Peptides-For the amino acid sequence analysis and compositional studies, the enzyme was further purified by high performance liquid chromatography (HPLC) using a μBondapak phenyl column (4.1 × 300 mm). Protein was eluted with a linear gradient of buffer A (0.1% trifluoroacetic acid, pH 2.0) and buffer B (trifluoroacetic acid/H2O/CH3CN = 0.1:9.9:90, v/v/v, pH 2.0) over 90 min. The amino acid sequence was determined by automated Edman degradation using the Quadrol-Polybrene program on a modified Beckman 890C sequencer as described previously (23). Approximately 1.2 nmol of protein was applied to the sequencer after acetone precipitation.
Trypsin digestion was performed in 0.2 M NH4HCO3 buffer, pH 8.0, with a protein/enzyme ratio of 50:1, w/w, for 24 h at 37 °C. The digest was subjected to reverse phase HPLC using an Ultrasphere C8 column (4.6 × 250 mm, 5-μm particle size), and peptides were eluted with a linear gradient from buffer A (0.1% trifluoroacetic acid) to 60% buffer B (trifluoroacetic acid/H2O/CH3CN = 0.1:9.9:90, v/v/v, pH 2.0). The peptides were detected by UV absorbance at 206 nm and were rechromatographed on an Ultrasphere C18 column prior to sequence analysis. The peptide sequences were determined on a gas-phase sequenator built by the City of Hope (24). The COOH-terminal amino acid of the protein was determined by hydrazinolysis as previously described by Haniu et al. (23).
Construction of cDNA-Total RNA was prepared by the vanadyl-ribonucleoside complex procedure (25) from livers of female Wistar rats previously treated with 3-MC. Poly(A)+ RNA was isolated from total RNA by chromatography on oligo(dT)-cellulose (P-L Biochemicals, type 7) as described (26). Double-stranded cDNA was synthesized from the total poly(A)+ RNA (10 μg) according to the method of Land et al. (27). In order to isolate longer cDNA, the cDNA was size-fractionated by centrifugation in a sucrose density gradient (5-25% sucrose in 50 mM Tris-HCl, pH 7.5, containing 1 mM EDTA). The sized cDNA (1.5-3.5 kb long) was introduced into pBR322 at the PstI site by the dG/dC-tailing method and then used for transformation of E. coli HB101. Following replica plating, one set of the colonies was lysed, and the DNA was denatured and fixed on the filters as described (28). For hybridization with the 18-nucleotide oligomers (Fig. 1), the filters were prehybridized at 65 °C for 17 h in a hybridization solution containing 10× Denhardt's solution (0.2% each of bovine serum albumin, polyvinylpyrrolidone, and Ficoll 400), 1 M NaCl, 10 mM EDTA, 0.1% sodium N-laurylsarcosinate, 30 μg/ml salmon sperm DNA, and 50 mM Tris-HCl, pH 8.0. They were then hybridized with the 32P-labeled 18-mer oligonucleotide mixture (specific activity, 3 × 10' cpm/μg) in the hybridization solution (3.2 × 10⁶ cpm/0.5 ml of filter paper). The filters were washed twice with 6× SSC (SSC, 0.15 M NaCl and 0.015 M sodium citrate) containing 0.1% sodium lauryl sulfate for 30 min at 37 °C. The hybridization signal was detected by autoradiography at -70 °C as described (28).
Bacterial colonies were grown in Luria-Bertani medium supplemented with tetracycline at 7.5 μg/ml. Plasmid DNA was purified by the alkaline lysis method (28).
Sequence Analysis of the cDNA-The plasmid containing the cDNA insert of 4-NP UDPGT was digested with several restriction enzymes (28) and sequenced by the dideoxynucleotide chain termination method (29). The cDNA segments (see Fig. 2, Appendix) subcloned into pUC18 were also sequenced by the chemical method of Maxam and Gilbert (30).
RNA Blot Analysis-Total RNA was extracted from the liver, kidney, and lung of untreated or 3-MC-treated female rats by the guanidine thiocyanate method of Chirgwin et al. (31). The total RNA (18 μg) was denatured (28) for electrophoresis on a 0.8% agarose gel containing 2.2 M formaldehyde, and then transferred to a nitrocellulose filter. The filter carrying the transferred RNA was hybridized with either 32P-labeled Probe I, the PstI-PvuII fragment (nucleotides 35-371), or Probe II, the PvuII-DraI fragment (nucleotides 372-1777), derived from the cDNA clone. The size markers were mouse rRNAs.

NH2-terminal Sequence and Synthesis of the 18-Base Pair cDNA Probe-The sequence of the NH2-terminal 26 amino acids of the UDPGT isoenzyme (Mr 55 kDa) (see "Experimental Procedures") was determined by automated microsequence analysis (Table I). The histidine at the COOH terminus was detected by hydrazinolysis with an approximately 50% recovery. Furthermore, internal peptides derived from tryptic digests of the isoenzyme were purified by HPLC and their amino acid sequences were determined (data not shown). The sequence with the least codon degeneracy was chosen for the synthesis of a mixture of 18-mer oligonucleotide probes (Fig. 1).
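The choice of the least degenerate stretch can be illustrated with a minimal sketch. It assumes the standard genetic code and scans a peptide for the six-residue window (18 bases) that can be encoded by the fewest distinct DNA sequences. The function names and the example peptide are hypothetical and are not taken from the paper; the actual probe sequence is the one shown in Fig. 1.

```python
# Number of codons per amino acid in the standard genetic code
# (stop codons excluded, so the values sum to 61).
CODON_COUNT = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide):
    """Number of distinct DNA sequences that can encode the peptide."""
    total = 1
    for aa in peptide:
        total *= CODON_COUNT[aa]
    return total


def best_window(peptide, n_residues=6):
    """Return (0-based start, window, degeneracy) of the least degenerate
    window; six residues correspond to an 18-mer probe."""
    best = None
    for i in range(len(peptide) - n_residues + 1):
        window = peptide[i:i + n_residues]
        d = degeneracy(window)
        if best is None or d < best[2]:
            best = (i, window, d)
    return best


# Hypothetical peptide used only for illustration.
example_peptide = "LSRMWKDEFYLA"
print(best_window(example_peptide))  # (3, 'MWKDEF', 16)
```

Under these assumptions a window rich in Met and Trp and free of Leu, Arg, and Ser gives the smallest oligonucleotide mixture, here 16 sequences.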
Screening of the cDNA Library-The cDNA library (≈50,000 clones) was screened for the 4-NP UDPGT-specific sequence using the mixture of 18-mer oligonucleotides. Eleven colonies that gave a positive signal above background were obtained. Plasmid DNAs were prepared from the individual colonies and digested with PstI, and the identity of the plasmid DNAs was confirmed by DNA blot hybridization with the same synthetic oligomer used as the probe. The isolated plasmids contained cDNA inserts ranging in size from 0.5 to 1.1 kb. However, none of these cDNA inserts covered the full length of the corresponding mRNA, because RNA blot analysis with a cDNA insert of 1.1 kb used as a probe suggested that the length of the mRNA is about 2 kb (data not shown). To isolate longer cDNAs, the cDNAs were size-selected for species longer than 1.5 kb (see "Experimental Procedures"). The cDNA library made from the sized cDNA (≈50,000 clones) was screened using the 32P-labeled short cDNA (1.1 kb) of UDPGT as a probe. A total of 28 clones were obtained, and their plasmids contained cDNA inserts ranging in size from 1.5 to 2.0 kb.
Nucleotide Sequence of Rat UDPGT cDNA and Its Deduced Amino Acid Sequence-The restriction map and the strategy for sequencing the 2.0-kb insert are shown in Fig. 2, Appendix. All parts of the sequence were determined at least twice in both directions. The amino acid sequences determined for the isolated tryptic peptides are found within the deduced amino acid sequence of an open reading frame extending from the first to the 1587th nucleotide (529 amino acids), as indicated in Fig. 3, Appendix. Furthermore, the sequence surrounding the putative initiator codon, which appears downstream from a nonsense codon, conforms to the consensus sequence (A/G)NNATG(A/G) characteristic of an active start codon (32). These observations lead us to conclude that the cDNA sequence cloned here is for the mRNA of 4-NP UDPGT, and that the open reading frame starting from nucleotide 1 indeed encodes the primary structure of 4-NP UDPGT. It follows, therefore, that the coding sequence for the NH2-terminal sequence determined from the purified enzyme starts 75 bases downstream of the initiation codon. This result indicates that the 25 amino acids preceding the NH2-terminal sequence form an extra peptide that is absent from the mature enzyme. This extra peptide is mostly composed of hydrophobic amino acids and has Gly at the carboxyl terminus. These features are consistent with the properties of a signal peptide for translocation of newly synthesized proteins across a membrane (33, 34). The termination codon occurs at positions 1588-1590, followed by a 3'-untranslated sequence consisting of about 210 nucleotides. A putative recognition sequence, AATAAA, for addition of poly(A) is observed 188 nucleotides downstream from the termination codon. However, the poly(A) tail was not found at the 3' end of the cloned cDNA. RNA blot analysis of the total RNA from rat livers with the cloned cDNA used as a probe leads us to estimate the length of the UDPGT mRNA to be 2 kb. The cDNA cloned here therefore probably represents a nearly full-length copy of the mRNA, lacking a few nucleotides at the extreme 5' and 3' ends in addition to the missing poly(A) sequence.
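The coordinates quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text and abstract; the variable names, the helper function, and the test sequences are hypothetical. Note that the three segment lengths sum to the reported insert length only if the 216-bp 3'-terminal region is taken to include the termination codon, which is an inference rather than a statement in the paper.

```python
import re

# Values stated in the text and abstract.
FIVE_PRIME_UTR = 124    # bp of 5'-untranslated sequence
CODING = 1587           # bp, nucleotides 1-1587 of the open reading frame
THREE_PRIME_REGION = 216  # bp of 3'-terminal sequence (assumed to include the stop codon)
SIGNAL_PEPTIDE = 25     # residues preceding the mature NH2 terminus

insert_length = FIVE_PRIME_UTR + CODING + THREE_PRIME_REGION  # total cDNA insert, bp
precursor_residues = CODING // 3                 # residues in the precursor
mature_offset = SIGNAL_PEPTIDE * 3               # bases from the ATG to the mature NH2 terminus
mature_residues = precursor_residues - SIGNAL_PEPTIDE  # residues 26-529
stop_codon = (CODING + 1, CODING + 3)            # nucleotide positions of the stop codon

# Start-codon context (A/G)NNATG(A/G) written as a regular expression.
START_CONTEXT = re.compile(r"[AG][ACGT]{2}ATG[AG]")


def has_start_context(seq):
    """True if the sequence contains an ATG in the consensus context."""
    return START_CONTEXT.search(seq.upper()) is not None


print(insert_length, precursor_residues, mature_offset, stop_codon)
```

The computed values (1,927 bp, 529 residues, a 75-base offset, and a stop codon at 1588-1590) agree with the figures reported in the text.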
In agreement with the previous report that the enzyme is a glycoprotein (13), several potential glycosylation sites (Asn-X-Ser/Thr) were identified at positions 281, 291, and 429 in the predicted primary structure (Fig. 3).
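A scan for the Asn-X-Ser/Thr motif can be sketched as follows. The function name and the test peptide are hypothetical; the pattern is implemented exactly as stated in the text, with X standing for any residue, and a lookahead is used so that overlapping motifs are all reported.

```python
import re


def find_sequons(protein):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    The zero-width lookahead lets overlapping motifs (for example
    N-N-T-S, which contains two) be reported separately.
    """
    pattern = re.compile(r"(?=N[A-Z][ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


# Hypothetical peptide used only for illustration.
print(find_sequons("MNGSANNTSK"))  # [2, 6, 7]
```

Applied to the deduced 529-residue sequence of Fig. 3, such a scan would be expected to return the positions listed above, although that has not been verified here because the sequence itself is not reproduced in the text.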
A hydropathy analysis shows that several segmental portions with high hydrophobicity are distributed along the length of the molecule (Fig. 4). Interestingly, the most hydrophobic portion of the mature protein (487-510) is located in the COOH-terminal region and is sandwiched between two highly positively charged segments.
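A hydropathy profile of this kind can be sketched as a sliding-window average. The text does not name the hydropathy scale or window length used for Fig. 4, so the Kyte-Doolittle scale and a 19-residue window are assumptions here, and the function names and test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window; entry i covers residues
    i .. i + window - 1 (0-based)."""
    scores = [KD[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(protein, window=19):
    """Return (first residue, last residue, mean score), 1-based inclusive.

    The protein must be at least `window` residues long.
    """
    profile = hydropathy_profile(protein, window)
    i = max(range(len(profile)), key=profile.__getitem__)
    return i + 1, i + window, profile[i]


# Hypothetical sequence: a hydrophobic core flanked by charged residues.
toy = "KRKRK" + "L" * 19 + "DEDED"
print(most_hydrophobic_window(toy)[:2])  # (6, 24)
```

In this toy case the hydrophobic core flanked by charged residues is picked out as the peak, which is the kind of pattern described for residues 487-510.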
The Induction of 4-NP UDPGT by the Treatment of 3-MC-It is known that some species of UDPGT are synthesized in response to various exogenous inducers. The activity of 4-NP UDPGT has been reported to be increased in livers of experimental animals by the administration of 3-MC (14, 15). In order to investigate the mechanism of the induced synthesis of 4-NP UDPGT by treatment with inducers, RNA blot analysis was performed on total RNAs prepared from the livers, kidneys, and lungs of rats treated and not treated with 3-MC.
Probes for the RNA blot analysis were chosen by comparing cDNA segments of 4-NP UDPGT with those of androsterone UDPGT (Jackson and Burchell (17)). Probe I (nucleotides 35-371) was chosen as a specific probe for 4-NP UDPGT because this part of the sequence is highly variable between the two. Probe II (nucleotides 372-1777) was chosen as a general probe for the mRNAs of the UDPGT family because it contains a highly conserved sequence. If one assumes that the nucleotide sequence of the conserved region codes for amino acid sequences with functions common to the UDPGT molecules, Probe II has the potential of detecting mRNA species of UDPGT other than the two described, since multiple forms of UDPGT have been reported (11, 13).
In all experiments, the same amount of total RNA was applied to the RNA blot analysis. As shown in Fig. 5, the two probes detected a major hybridization band at 20 S with a minor band at 16 S and other faint bands. The chain length (about 2,000 nucleotides) of the major band estimated from its mobility correlates well with that determined from the sequence analysis. The nature of the minor band at 16 S has not yet been determined. Probe II yielded much denser and broader bands than Probe I in all tissues examined, suggesting the presence of species of UDPGT other than the one whose sequence was determined here. Without 3-MC treatment, a very low level of hybridization signal was detected in the liver RNA, especially with the probe specific for 4-NP UDPGT, whereas a high level of the signal was observed in the kidney RNA. On the other hand, upon treatment with 3-MC, the intensity of the hybridization signals for both liver and kidney RNAs increased to the same levels with both the specific and general probes. Thus, the induction level was much higher in livers (10- to 15-fold) than in kidneys (%fold). The UDPGT mRNA level in lungs was much lower than that in the liver and kidney, even in animals treated with the inducer (data not shown).

DISCUSSION
A mixture of 18-mer oligonucleotides derived from sequence analysis of tryptic peptides of 4-NP UDPGT has been used in the isolation of a cDNA for the corresponding mRNA of rat livers. Several lines of evidence confirm that the cloned cDNA is indeed a cDNA copy of the 4-NP UDPGT mRNA: 1) the predicted sequence of amino acids 26-51 corresponds exactly to the sequence determined from the NH2-terminal end of the purified enzyme, 2) partial amino acid sequences of 18 tryptic peptides (147 amino acids) from the purified protein are distributed throughout the predicted sequence, and 3) histidine, the COOH-terminal amino acid of the predicted sequence, agrees with that of the purified enzyme. The calculated molecular mass of the enzyme composed of amino acids 26-529 is 57.5 kDa, which is in agreement with that determined for the purified enzyme by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (Mr 55 kDa). Recently, Jackson and Burchell (17) and Mackenzie (35) have reported the primary structures of androsterone UDPGT and 4-methylumbelliferone UDPGT, respectively. The deduced amino acid sequences of these enzymes are compared in Fig. 6. The alignment of the three isoenzymes reveals the following facts. 1) The 4-NP UDPGT shares 40% identical amino acid residues with 4-methylumbelliferone UDPGT and 37% with androsterone UDPGT. The 4-methylumbelliferone UDPGT and androsterone UDPGT show 58% amino acid similarity. 2) The predicted sequence of androsterone UDPGT presumably lacks the sequence corresponding to the putative signal sequence and several consecutive amino acids in the mature enzyme. 3) Homologous and variable amino acids are not evenly distributed along the sequences. Clustering of homologous amino acids is noted in the COOH-terminal half of the molecules, except for the extreme COOH-terminal region of about 40 amino acids, while variable amino acids are more frequently distributed in the NH2-terminal half of the molecules. 4) Although the constituent amino acids vary, a region of high hydrophobicity is sandwiched between two groups of highly charged amino acids in the COOH-terminal region of the three molecules, a structure reminiscent of a halt-transfer signal in transmembrane proteins (41).
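Percentage identities such as those quoted in point 1) can be computed from a pairwise alignment along the following lines. The text does not state how gapped positions were counted, so excluding any column that contains a gap is an assumption of this sketch; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns without gaps.

    Both inputs must be rows of the same alignment (equal length).
    Columns in which either sequence has a gap are ignored (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if x != gap and y != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical two-row alignment used only for illustration.
print(percent_identity("MKTL-V", "MRTLAV"))  # 80.0
```

A different denominator, such as the length of the shorter sequence, would give somewhat different percentages for the same alignment.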
From these results, it is concluded that these three species of UDPGT diverged from a common ancestor. The functions of the conserved and variable regions have not yet been identified. Because UDP-glucuronic acid is a common donor substrate for the enzymes, at least some part of the conserved region may be involved in the binding of the nucleotide and glucuronic acid moieties (36). UDPGT is a membrane-bound enzyme that is distributed in the endoplasmic reticulum and the nuclear envelope (4). The enzyme activity of UDPGT in isolated microsomes is markedly stimulated by treatment with detergents, which affect the integrity of the membrane structure (37, 38). These observations suggest that the enzyme is located on the luminal side of the endoplasmic reticulum, the situation being similar to the case of nucleoside diphosphatase as reported by Kuriyama (39). The amino acid sequence predicted from the cDNA supports this proposed disposition of the enzyme in ER membranes. The NH2-terminal extra peptide may function as a signal peptide, as reported originally in the case of secretory proteins (40), so that the newly synthesized enzyme molecules may be translocated cotranslationally across the membranes. The extra peptide is probably cleaved off from the mature protein during the translocation process.
In the region proximal to the COOH terminus, the hydrophobic domain (amino acid 487-510) was surrounded on both sides by short segments of highly charged amino acids (Figs. 3 and 6). This structure is characteristic of a halt-transfer signal as described by Sabatini et al. (41) and may function as such to anchor the UDPGT molecule to the membrane with the COOH-terminal-charged amino acids on the cytoplasmic surface. The general structure of the enzyme resembles those of most transmembrane proteins on the cell surface including glycophorin (42), histocompatibility antigens (43), and many receptor proteins (44)(45)(46). These transmembrane proteins are transferred to the plasma membrane after they have been integrated into the structure of the ER membrane, whereas UDPGT remains in the ER membrane. Unknown sorting out mechanisms may function in this process.
Another microsomal enzyme, NADPH-cytochrome P-450 reductase, is bound to the cytoplasmic surface of the ER membrane (39, 47), and the same is true of the cytochrome P-450s, although they may be somewhat buried in the membrane (48). UDPGT is known to be involved in drug metabolism in concert with cytochrome P-450; that is, phase I and II reactions take place successively in the endoplasmic reticulum. The asymmetrical arrangement of these drug-metabolizing enzymes in the membranes may ensure the vectorial excretion of drug metabolites from cytosol compartments to the lumen of the ER, as depicted in Fig. 7. Furthermore, this model requires the existence of a translocator or some other mechanism for transport across the membrane of UDP-glucuronic acid, a substrate for UDPGT, which is synthesized in the cytosol compartments.
FIG. 7. The model indicates that a drug substrate (S) is hydroxylated by the cytochrome P-450 system (Phase I reaction) and the hydroxylated product (SOH) is glucuronidated by UDPGT (Phase II reaction), which is located on the luminal side of the ER membrane. The postulated translocation protein (T) catalyzes a translocation reaction of UDPGA from the cytoplasmic compartment by a coupled exchange with UDP or some other mechanism. Alternatively, free UDP generated by the action of UDP-glucuronosyltransferase may be hydrolysed by NDPase (39). The charged segment at the COOH terminus of the GT is marked with a +. Fp, NADPH-cytochrome P-450 reductase; P-450, cytochrome P-450; GT, glucuronosyltransferase; T, translocation protein for UDPGA; S, drug substrate; SOH, hydroxylated product; SO-GA, glucuronide conjugate of SOH; NDPase, nucleoside diphosphatase; ER, endoplasmic reticulum.
Several UDPGT isoenzymes share substrate specificity toward 4-NP (11, 13). Of these, the isoform whose sequence was deduced here has the highest activity for 4-NP (11). It has been reported that the activity of 4-NP glucuronidation is increased in livers of rats treated with 3-MC (14, 15). In this study, we examined the effect of 3-MC treatment on the amount of the 4-NP UDPGT mRNA in rat liver, kidney, and lung with the cloned cDNA used as a probe. Probe I, specific for the 4-NP UDPGT mRNA, PstI (35)/PvuII (371), which has a nucleotide sequence relatively divergent from that of androsterone UDPGT (Fig. 6), was used in Northern blot analysis of total RNA prepared from untreated and 3-MC-treated rats. The level of the transcriptional induction differed among the tissues examined. The mRNA specific for 4-NP UDPGT was already expressed in significant amounts in kidney of untreated rats, whereas the level of the mRNA was very low in liver of the same animals. Upon treatment of rats with 3-MC, the level of the mRNA was increased to the same extent in liver and kidney (Fig. 5). Accordingly, the induction level was much higher in liver (10- to 15-fold) than in kidney (%fold). The content of the mRNA was always extremely low in lung of treated or untreated rats (data not shown). On the other hand, when Probe II, PvuII (372)/DraI (1777), which contains a sequence highly homologous with that of androsterone UDPGT, was used, the induction level was approximately %fold, even in livers, because of the relative abundance of the mRNA hybridizable with this probe in the tissues of untreated animals. These results are consistent with the data that 3-MC treatment increases approximately &fold the activity for 4-NP in rat liver microsomes.
These observations indicate that mRNAs for species of UDPGT other than 4-NP UDPGT, with some affinity for 4-NP, are present in considerable amounts in the tissues of untreated rats, and that the synthesis of the mRNA for 4-NP UDPGT is specifically induced in liver in response to 3-MC. This is in contrast to the case of kidney tissue, where a considerable amount of the mRNA was already expressed in the absence of the inducer. The two forms of cytochrome P-450, P-450c and P-450d, are known to be induced in rat livers in response to 3-MC (49, 50). Therefore, it follows that the two species of cytochrome P-450 and 4-NP UDPGT, which catalyze the two successive reactions in drug metabolism, are coinduced transcriptionally in livers of 3-MC-treated rats. It will be interesting to clarify the mechanism underlying the coinduction of the two kinds of drug-metabolizing enzymes by 3-MC.