Cloning and sequencing of the cDNA for rat liver 3α-hydroxysteroid/dihydrodiol dehydrogenase.

Rat liver 3α-hydroxysteroid dehydrogenase (3α-HSD, EC 1.1.1.50) is an NAD(P)+-dependent oxidoreductase that can terminate androgen action by converting 5α-dihydrotestosterone to 3α-androstanediol. It is identical to dihydrodiol dehydrogenase and can function as a 9-, 11-, and 15-hydroxyprostaglandin dehydrogenase. Its reactions are potently inhibited by the nonsteroidal anti-inflammatory drugs (NSAIDs). A cDNA (2.1 kilobases) for 3α-HSD was cloned from a rat liver cDNA expression library in λgt11. Portions of the cDNA insert, which contained an internal EcoRI site, were subcloned into pGEM3, and dideoxysequencing revealed that the cDNA contains an open reading frame of 966 nucleotides which encodes a protein of 322 amino acids with a monomer Mr of 37,029. The identity of this clone was confirmed by locating two tryptic peptides and two endoproteinase Lys-C peptides from purified 3α-HSD within the nucleotide sequence. The amino acid sequence of rat liver 3α-HSD bears no significant homology with 3β-, 17β-, or 11β-hydroxysteroid dehydrogenases but has striking homology with bovine lung prostaglandin F synthase (69% homology at the amino acid level and 74% homology at the nucleotide level), which is a member of the aldehyde/aldose reductase family. This sequence homology supports previous correlates which suggest that in rat 3α-HSD may represent an important target for NSAIDs. The nucleotide sequence also encodes three peptides that have been identified by affinity labeling with either 3α-bromoacetoxyandrosterone (substrate analog) or 11α-bromoacetoxyprogesterone (glucocorticoid analog) to comprise the active site (see accompanying article: Penning, T. M., Abrams, W. R., and Pawlowski, J. E. (1991) J. Biol. Chem. 266, 8826-8834). The sequence data presented suggest that 3α-HSD, prostaglandin F synthase, and aldehyde/aldose reductases are members of a common gene family.

* This research was supported by National Institutes of Health Grants GM33464 and CA39504, Research Career Development Award CA01335 (to T. M. P.), and a Pharmaceutical Manufacturers Association advanced predoctoral fellowship (to J. E. P.). A preliminary account of this work has been presented at the 5th International Conference of the Inflammation Research Association in conjunction with the van Arman Scholarship Competition and will be published in extended abstract form in Agents and Actions (in press). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18

3α-HSD has been a focus of study because of its perceived role in the termination of androgen action. By catalyzing the conversion of 5α-dihydrotestosterone (a potent androgen) to 3α-androstanediol (a weak androgen), the enzyme can reduce the levels of 5α-dihydrotestosterone in androgen-responsive tissues (1-3).
The most thoroughly characterized 3α-HSD is the homogeneous enzyme from rat liver cytosol (4-6). Rat liver 3α-HSD is a monomeric (Mr 34,000) NAD(P)+-dependent oxidoreductase which is abundantly expressed. Immunotitration data with polyclonal antisera suggest that it represents 1-3% of the cytosolic protein in this tissue (7). This high level of expression makes it possible to obtain milligram quantities of the purified enzyme at concentrations approaching 10 mg/ml for x-ray crystallographic studies (6).
The abundance of rat liver 3α-HSD has facilitated the study of the diverse functions of the enzyme. First, it can function as a 3α-hydroxysteroid dehydrogenase and can metabolize androgens (8) and glucocorticoids (9) and is involved in the biosynthesis of bile acids (10, 11). Second, it can function as a dihydrodiol dehydrogenase, and by oxidizing trans-dihydrodiols of polycyclic aromatic hydrocarbons it can suppress formation of the ultimate carcinogens, the anti-diol epoxides (12-14). Third, it can function as a 9-, 11-, and 15-hydroxyprostaglandin dehydrogenase (15) and may be involved in regulating levels of inflammatory prostaglandins.
Rat liver 3α-HSD also satisfies several criteria expected of a target enzyme for nonsteroidal anti-inflammatory drugs (NSAIDs) (4-6). The enzyme is potently inhibited at its active site by NSAIDs in rank order of their pharmacological potency (4). Concentrations of drugs required to inhibit 3α-HSD are similar to or lower than those required to inhibit cyclooxygenase. 3α-HSD also binds arachidonic acid and prostaglandins with affinity in the low micromolar range (5) and can transform prostaglandins through its hydroxyprostaglandin dehydrogenase activity (15). An indomethacin-sensitive 3α-HSD has also been shown to be widely distributed in rat tissues including prostate, spleen, heart, testis, small intestine, stomach, lung, and brain (16).
To gain insights into the possible connection between the hydroxysteroid, dihydrodiol, and hydroxyprostaglandin dehydrogenase activities of 3α-HSD, the enzyme structure is currently being analyzed in this laboratory. We now describe the isolation of a cDNA clone for 3α-HSD which has allowed us to deduce the amino acid sequence of this protein.
Screening of the cDNA Library. A λgt11 expression library was screened as described (17) using a 1:250 dilution of rabbit anti-rat 3α-HSD serum. Immunopositive clones were detected using goat anti-rabbit IgG horseradish peroxidase conjugate and 4-chloro-1-naphthol plus H2O2. Positive clones were plaque-purified and grown in Escherichia coli strain Y1090. Phage DNA was isolated using the plate lysate method (17). A full-length clone was obtained by rescreening approximately 300,000 plaques from the cDNA library using 32P-labeled 36-mer oligonucleotide probes (Fig. 4) directed against the 5' ends of two partial-length immunopositive clones. The probes were end-labeled using T4 polynucleotide kinase and [γ-32P]ATP. The filters were hybridized with oligonucleotide probes in 6× SSC (1× SSC = 0.15 M NaCl, 0.015 M sodium citrate) containing 0.5% nonfat dried milk for 16 h at 68 °C. Filters were sequentially washed in 1× SSC, 0.1% SDS (30 min at 68 °C) and 0.1× SSC, 0.1% SDS (30 min at 68 °C) and autoradiographed.
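The hybridization and wash buffers above are all defined as multiples of 1× SSC, so their molar composition follows by simple scaling. The sketch below is only an illustration of that arithmetic: the 1× SSC composition is taken from the text, while the function names and the 20× stock concentration are assumptions introduced here and are not stated in the paper.

```python
# Composition of 1X SSC as defined in the text (mol/L).
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold` X SSC."""
    return {name: conc * fold for name, conc in SSC_1X.items()}


def stock_volume_ml(final_fold, final_volume_ml, stock_fold=20.0):
    """Volume of stock SSC needed for a dilution (C1 * V1 = C2 * V2).

    The 20X stock is a hypothetical default; the paper does not say
    which stock solution was used.
    """
    return final_fold * final_volume_ml / stock_fold


if __name__ == "__main__":
    # Hybridization buffer (6X) and the two wash buffers (1X, 0.1X).
    for fold in (6, 1, 0.1):
        print(fold, ssc_molarity(fold))
```

Under these assumptions, the 6× SSC hybridization buffer corresponds to about 0.9 M NaCl and 0.09 M sodium citrate, and the final 0.1× SSC wash to about 0.015 M NaCl, which is why the second wash is the more stringent one.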
Subcloning. cDNA inserts were isolated from the phage vector DNA by EcoRI digestion. The inserts were sized and separated by gel electrophoresis using 1.4% agarose, 1× TBE (90 mM Tris borate plus 2 mM EDTA). Since all of the positive clones contained an internal EcoRI site, thus generating two fragments (except clone 3α-HSD-I), Southern analyses using the oligonucleotide probes were performed to determine the orientation of the fragments. For sequencing, cDNA fragments were subcloned into the EcoRI site of pGEM3 using T4 ligase. Transformed cells (E. coli strain HB101) were selected using ampicillin resistance, and plasmid DNA was isolated (17). To facilitate sequencing, nested deletions were constructed using exonuclease III and S1 nuclease (Erase-A-Base, Promega Biotec) following the manufacturer's instructions.
Sequencing. Dideoxysequencing (18) was performed on alkali-denatured supercoiled plasmid DNA containing cDNA inserts using either the primers for the SP6 and T7 promoters of the pGEM3 vector or synthetic 17-mer oligonucleotide primers. Sequencing was performed using the Klenow fragment of DNA polymerase and reverse transcriptase (GemSeq, Promega Biotec).
Northern Analysis. Total RNA was extracted from rat liver following homogenization into 6 M guanidine thiocyanate and ethanol precipitation (19). Aliquots (4-10 μg) were size-fractionated by electrophoresis using 1% agarose gels containing 2.2 M formaldehyde (20). After electrophoresis the RNA was transferred to Nytran filters and hybridized to cRNA probes.
cRNA probes were made using 3α-HSD-II (Fig. 2), which was linearized with either HindIII or PvuII restriction endonucleases. Using SP6 and T7 RNA polymerases (Riboprobe Gemini System II, Promega Biotec) with [α-32P]CTP, sense and anti-sense riboprobes were synthesized, respectively. Labeled RNA-DNA hybrids were separated from the unincorporated nucleotides by chromatography on Quick Spin columns (Sephadex G-50 UNA grade, Boehringer Mannheim), and the probes were denatured by heating at 100 °C before hybridization. Hybridization was conducted for 16 h at 5 °C, and washes were performed for 1 h at 68 °C using the solutions  (22). The reaction mixture was dialyzed against 10 mM potassium phosphate, pH 7.6, containing 1 mM EDTA and digested with TPCK-treated trypsin. TPCK-treated trypsin was assayed before use using N-benzoyl-L-arginine ethyl ester as substrate and was added to the digest at zero time and again at 2 h to a final concentration of 2% (w/w). The digestion was allowed to proceed for 18 h at 37 °C. Peptide mapping was performed on a C18 μBondapak column (Waters, Milford, MA) linked to a gradient high pressure liquid chromatography system (Beckman binary 110A pumps operated by a 421 controller and linked to a 164 variable wavelength detector). Peptides were eluted using the gradients indicated and were detected by monitoring their absorbance at 214 nm.
Peptide Sequencing. Peptides were covalently attached to either an arylamine or diisothiocyanate Sequelon membrane via their free carboxyl and amino groups, respectively, for solid-phase sequencing (23). Automated Edman degradation was performed on a Milligen/Biosearch 6600 Prosequencer (Burlington, MA). The PTH-derivatives liberated after each cycle were separated by RP-HPLC on a Milligen-Sequetag column (3.9 × 300 mm) using a 35 mM ammonium acetate, pH 4.8, acetonitrile gradient. PTH-derivatives were detected at 269 and 313 nm and quantified by comparison with a mixture containing 100 pmol of each standard PTH-derivative.
Computer Analysis. Analyses of sequence homologies between 3α-HSD and entries in the GenBank and EMBL data banks were performed using the Wilbur and Lipman similarity search program of the IntelliGenetics Software package (Mountain View, CA). The IntelliGenetics package was also used to perform Chou-Fasman analysis (24).

RESULTS AND DISCUSSION
Cloning Strategy. Approximately 500,000 plaques from a λgt11 expression library prepared from female Sprague-Dawley rat liver mRNA were screened using a high-titer, monospecific polyclonal antiserum which was raised against purified rat liver cytosol 3α-HSD (7). The ability of this antiserum to detect only a single protein (Mr 34,000) from male and female rat liver cytosol by Western blot analysis and its ability to immunotitrate 3α-HSD activity in in vitro assays have been described elsewhere (7).
To confirm the identity of the cDNA clones, peptide sequence data were obtained from purified 3α-HSD. For these studies, homogeneous 3α-HSD was reduced, carboxymethylated, and subjected to trypsin digestion. Tryptic peptide maps were obtained by RP-HPLC (Fig. 1A), and two tryptic peptides (T24 and a second peptide) were subsequently purified (Fig. 1, B and C) for sequencing. The sequence obtained for T24 corresponds to a peptide of 12 amino acids: NH2-Leu-Trp-Ser-Thr-Phe-His-Arg-Pro-Glu-Leu-Val-Arg-CO2H, while the sequence of the second peptide corresponds to a peptide of 19 amino acids: NH2-His-Phe-Asp-Ser-Ala-Tyr-Leu-Tyr-Glu-Val-Glu-Glu-Glu-Val-Gly-Gln-Ala-Ile-Arg-CO2H (Table I). Two peptides were also purified by RP-HPLC following endoproteinase Lys-C digestion of 3α-HSD (data not shown), which gave the following sequences ...

Table I footnotes: The average repetitive yield was ≥94%. The yield is the net of the previous cycle except where presented in parentheses, where it is the gross yield. One peptide was sequenced on two different occasions and gave identical sequence information. PTH-dehydroalanine was detected at 313 nm in one cycle. A significant α-aminobutyric acid peak, a degradation product of threonine, was detected at 313 nm in one cycle.

... sequences of the two endoproteinase Lys-C peptides found in native 3α-HSD. These findings confirmed that 3α-HSD-I and 3α-HSD-IIb contained the coding region for the COOH terminus of the protein. By contrast, 3α-HSD-IIa contained a continuation of the open reading frame but was too short to encode the whole protein.
To isolate a full-length clone, two 36-mer oligonucleotide probes were synthesized, which were complementary to either the 5' end of 3α-HSD-I or the 5' end of 3α-HSD-IIa. Clones which hybridized to both oligonucleotide probes and thereby contained coding sequence at both the 5' and 3' ends were selected for further study. The clone containing the largest insert (3α-HSD-III) was isolated and yielded fragments of 1.1 and 1.0 kb in length following EcoRI digestion. Both fragments were subcloned into pGEM3 for sequence analysis, and it was determined that 3α-HSD-III contained the entire coding region for 3α-HSD. The alignment of the clones based on their restriction maps and sequence data is shown in Fig. 2.

Northern Analysis and Detection of 3α-HSD mRNA. Using clone 3α-HSD-II, cRNA probes were synthesized using the SP6 and T7 promoters of pGEM3 and the appropriate RNA polymerases to generate sense and anti-sense probes, respectively. Northern analysis of total rat liver RNA revealed the presence of a single mRNA species of 2.7 kb which could only be detected with the anti-sense probe. These results confirmed the identity of the coding strand. The ability to detect 3α-HSD mRNA in extracts of total RNA supports the view that 3α-HSD and its message are abundantly expressed. The 3α-HSD mRNA is significantly larger than the cDNA (2.1 kb), which indicates that portions of either the 5'- or 3'-untranslated regions are absent from the cDNA.
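The size bookkeeping in the two paragraphs above (an insert cut once by EcoRI into 1.1- and 1.0-kb fragments, and a 2.7-kb mRNA compared with a 2.1-kb cDNA) can be sketched in a few lines. This is only an illustration: the function names and the toy sequence are hypothetical, and the only outside fact used is the standard EcoRI recognition site GAATTC, which is cleaved after the first G.

```python
def digest_fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths from cutting a linear sequence at every `site`.

    cut_offset=1 places the cut after the first base of the site
    (G^AATTC), as for EcoRI.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def missing_untranslated_kb(mrna_kb, cdna_kb):
    """Length of message not represented in the cDNA, rounded to 0.1 kb."""
    return round(mrna_kb - cdna_kb, 1)


if __name__ == "__main__":
    toy_insert = "AAAA" + "GAATTC" + "CCCCC"  # hypothetical 15-nt insert
    print(digest_fragment_lengths(toy_insert))  # one internal site, two fragments
    print(missing_untranslated_kb(2.7, 2.1))
```

With the values reported in the text, the two EcoRI fragments sum to the 2.1-kb insert, and roughly 0.6 kb of the 2.7-kb message is unaccounted for by the cDNA.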
Figure legend (fragment): Sequencing was also performed using oligonucleotide primers, and nested deletions were constructed using exonuclease III/S1 nuclease. The open reading frame (dark box) has been sequenced from both strands, and approximately 420 base pairs (bp) from the 3'-untranslated region have been sequenced.

Sequencing of 3α-HSD cDNA. Clones 3α-HSD-I to 3α-HSD-III were subjected to dideoxysequencing using several strategies. First, restriction sites were utilized to generate a series of deletion subclones. Second, nested deletions were created using exonuclease III and S1 nuclease. Third, oligonucleotides complementary to either strand were synthesized as sequencing primers. The different regions of the cDNA sequenced are shown in Fig. 3. Using these strategies, the nucleotides spanning the open reading frame in both strands were sequenced. Sequencing revealed that the open reading frame is 966 nucleotides long and encodes a protein of 322 amino acids (Fig. 4). Although the start and stop codons are clearly evident, a polyadenylation signal (AATAAA) was not found in the 3'-untranslated region. The initiation codon was determined to be the ATG at position 1, since there are three in-frame stop codons between this initiation codon and the next in-frame upstream ATG codon. The assigned initiation codon is located within a potential eukaryotic translation initiation consensus sequence (26). The molecular weight predicted for the protein from the cDNA is 37,029, which is 9% higher than that determined by SDS-polyacrylamide gel electrophoresis. The identity of the clone was confirmed by locating the sequences of the two tryptic peptides (T24 and the 19-residue peptide) and the two endoproteinase Lys-C peptides (endoproteinase Lys-C1 and endoproteinase Lys-C2) from 3α-HSD within the nucleotide sequence.
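The numerical claims in this paragraph can be checked with a short sketch: 966 nucleotides correspond to 322 codons, a predicted Mr of 37,029 is about 9% above the Mr of 34,000 reported earlier for the purified enzyme, and the initiation codon argument amounts to counting in-frame stop codons upstream of the assigned ATG. The function names and the toy sequence below are hypothetical; the actual cDNA sequence (Fig. 4) is not reproduced in this text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def codons_in_orf(nucleotides):
    """Number of codons in an open reading frame of the given length."""
    if nucleotides % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return nucleotides // 3


def percent_difference(predicted, observed):
    """Percent by which `predicted` exceeds `observed`."""
    return 100.0 * (predicted - observed) / observed


def upstream_inframe_stops(seq, atg_index):
    """Count in-frame stop codons between the ATG at `atg_index` and the
    nearest in-frame upstream ATG (or the 5' end of the sequence)."""
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at atg_index")
    stops = 0
    pos = atg_index - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon == "ATG":
            break
        if codon in STOP_CODONS:
            stops += 1
        pos -= 3
    return stops


if __name__ == "__main__":
    print(codons_in_orf(966))
    print(round(percent_difference(37029, 34000), 1))
    toy = "ATGTAACCCTGATAGATGGCC"  # hypothetical 5' region, not the real cDNA
    print(upstream_inframe_stops(toy, 15))
```

In the toy sequence the ATG at index 15 is preceded by three in-frame stop codons before the next upstream in-frame ATG, which mirrors the reasoning used in the text to assign the initiation codon.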
The amino acid composition predicted for 3α-HSD from its cDNA (Leu33, Lys28, Asp?, Asn14, Glu20, Gln12, Val?, Ile?, Ala18, Phe17, Ser?, Arg?, Thr16, Gly14, Pro14, Tyr13, Cys9, His7, Met?, and Trp?) is in close agreement with the amino acid composition determined from complete acid hydrolysis of 3α-HSD (Leu3?, Lys27, Asx38, Glx3?, Val23, Ile15, Ala19, Phe16, Ser15, Arg1?, Thr15, Gly?, Pro17, Tyr12, Cys9, His?, Met5) (6). Since the NH2 terminus of 3α-HSD is not amenable to peptide sequencing, presumably due to a blocked amino acid, it is of interest that the chemically determined composition shows the presence of 5 methionine residues. These 5 methionines are all present in the cDNA provided the initiation codon encodes a methionine, which would then be the NH2-terminal amino acid. The deduced amino acid sequence also contains three peptides that have been identified by affinity-labeling studies with bromoacetoxysteroids to comprise the active site, and these are located in the COOH-terminal portion of the open reading frame (see accompanying paper (45)).
Sequence Analysis. 3α-HSD bears no significant sequence homology with other HSDs that have been recently cloned and sequenced, including human placental 3β-hydroxysteroid dehydrogenase (27), rat liver 11β-hydroxysteroid dehydrogenase (28), and human placental 17β-hydroxysteroid dehydrogenase (29, 30). Comparison of the homology between 3α-HSD and entries in the GenBank and EMBL data banks identifies the greatest degree of homology with prostaglandin F synthase from bovine lung (69% at the amino acid level and 74% at the nucleotide level) (31), which is a member of the aldehyde reductase family. Members of the aldehyde reductase family that have been sequenced from human placenta (32), bovine lens (33), and rat lens (34) share at least 58% homology with 3α-HSD at the amino acid level. Homology is also seen between 3α-HSD and rho-crystallin from the European common frog (35) (Fig. 5). This homology is worthy of mention since the crystallins represent structural proteins of the lens, they are abundantly expressed, and one of the crystallins is known to display lactate dehydrogenase activity (36). Sequence comparison with other trans-dihydrodiol dehydrogenases is not possible since our data represent the first complete sequence for an enzyme with this function. It is noteworthy that minor forms of mouse and guinea pig liver dihydrodiol dehydrogenase, which display 17β-HSD activity, co-purify with aldehyde reductase (37, 38). In addition, partial amino acid sequence data obtained by fast atom bombardment and plasma desorption mass spectrometry of peptides isolated from various isoforms of rabbit dihydrodiol dehydrogenase give sequences that are highly conserved both in our primary structure and in the primary structures of mammalian aldehyde reductases (39).
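Percentages such as those quoted above are typically obtained by counting matching positions in a pairwise alignment. The sketch below shows that calculation for two already-aligned sequences. It is not the Wilbur and Lipman search used in the paper, it assumes that "homology" here means identical residues at aligned positions, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap positions at which two sequences match.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length; columns containing a gap in either sequence are
    skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration only.
    print(percent_identity("MDSISLRV", "MDPKSQRV"))
    print(percent_identity("MD-SIS", "MDPSIA"))
```

Whether gapped columns are excluded from the denominator, as here, or counted as mismatches changes the reported percentage, so the convention should be stated alongside any such figure.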
Despite sequence homology with the aldehyde reductases, 3α-HSD has several properties which distinguish it from members of this family. These include its dual pyridine nucleotide specificity, its ability to catalyze both oxidation and reduction reactions at physiological pH, and its ability to reduce aromatic ketones. In analyzing the primary structures of short chain alcohol dehydrogenases, Jörnvall and co-workers (40) have described a motif Tyr-X-X-X-Lys which is completely conserved. This consensus sequence is found in 3α-HSD at amino acid residues 205-209 as Tyr-Cys-Lys-Ser-Lys and is also found in 3β-, 11β-, 17β-, and prokaryotic 20β-hydroxysteroid dehydrogenases (41) in the motif Tyr-X-X-Ser-Lys. It is also present in 15-hydroxyprostaglandin dehydrogenase from human placenta (40). This consensus sequence is also present in the aldose reductases and rho-crystallin, which bear sequence homology with 3α-HSD. However, the conserved Tyr and Lys residues are not found in prostaglandin F synthase, the protein with the most homology with 3α-HSD, suggesting that this sequence may contribute to the differences in the activities of these two enzymes.
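A motif such as Tyr-X-X-X-Lys is easy to locate in a one-letter protein sequence with a regular expression, where X becomes the wildcard ".". The sketch below uses Python's standard `re` module with a lookahead so that overlapping occurrences are reported. The function name and the toy sequence are hypothetical; only the pentapeptide Tyr-Cys-Lys-Ser-Lys (YCKSK) and the two motif patterns come from the text.

```python
import re


def find_motif(protein, pattern):
    """Return (1-based position, matched residues) for every occurrence of
    `pattern` in `protein`, including overlapping occurrences."""
    # A capturing group inside a lookahead lets finditer report matches
    # that overlap one another.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(protein)]


TYR_X3_LYS = "Y...K"      # Tyr-X-X-X-Lys
TYR_X2_SER_LYS = "Y..SK"  # Tyr-X-X-Ser-Lys

if __name__ == "__main__":
    toy = "AAYCKSKGG"  # hypothetical fragment containing YCKSK
    print(find_motif(toy, TYR_X3_LYS))
    print(find_motif(toy, TYR_X2_SER_LYS))
```

Applied to the full deduced sequence (Fig. 4, not reproduced here), a search of this kind would be expected to report the Tyr-Cys-Lys-Ser-Lys match at residue 205 described in the text, along with any other chance occurrences of the pattern.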
Many pyridine nucleotide-dependent oxidoreductases contain a structural domain which is responsible for binding the co-factor (Rossmann-fold (42)) and which is characterized by a pattern of secondary structure corresponding to β-α-β-α-β. Chou-Fasman predictions of the secondary structure of 3α-HSD reveal that this structural domain may reside in the NH2-terminal half of the protein. Wierenga et al. (43) have identified a "fingerprint" of amino acids present in NAD(P)+-binding proteins, and 3α-HSD contains 2 glycine residues (Gly-20 and Gly-22) which are completely conserved in their model. This further supports the view that the co-factor binding site is located at the NH2 terminus of 3α-HSD.
Analysis of the deduced amino acid sequence described in this paper and the isolation of the active site peptides in the accompanying paper (45) indicate that the pyridine nucleotide-binding site may reside at the NH2 terminus while the steroid-binding site may reside at the COOH terminus of 3α-HSD. We have recently completed a detailed analysis of the kinetic mechanism for 3α-HSD which predicts an Ordered Bi Bi mechanism in which the pyridine nucleotide binds before the substrate (44). These findings imply that during catalysis the pyridine nucleotide binds first to the NH2 terminus, the steroid substrate binds to the COOH terminus, and the two structural domains are brought together to form the central complex so that dehydrogenation can proceed. Confirmation of these predictions will have to await the elucidation of the x-ray crystallographic structure of the enzyme.