Structure of the Rat PRPS1 Gene Encoding Phosphoribosylpyrophosphate Synthetase Subunit I*

Phosphoribosylpyrophosphate (PP-Rib-P) synthetase (EC 2.7.6.1) subunit I gene (PRPS1) is constitutively expressed in various tissues (Taira, M., Iizasa, T., Yamada, K., Shimada, H., and Tatibana, M. (1989) Biochim. Biophys. Acta 1007, 203-208). We report here the exon-intron organization and the promoter sequence of the rat PRPS1 gene. The gene spans about 22 kilobases and is split into 7 exons ranging in size from 99 to 251 base pairs (bp), except for exon 7 (1008 bp). A putative PP-Rib-P binding site is encoded in exon 5. The exon-intron boundaries are similar to the consensus sequences for mammalian introns. S1 nuclease and primer extension assays using RNA from rat Yoshida ascites sarcoma cells identified four possible transcription start points closely spaced between 126 and 129 bp upstream of the ATG initiation codon. In the region upstream of the transcriptional start sites, we observed a TATA-like sequence (TAATTTAAT) at nucleotide -28, a CCAAT element (AGCCAATC) at nucleotide -80, and three GC boxes (putative Sp1-binding sites) at nucleotides -103, -43, and -10. A comparison of the PRPS1 promoter region with those of other housekeeping genes revealed homology to that of the β-actin gene.

D-Ribose 5-phosphate + ATP → PP-Rib-P + AMP

Since PP-Rib-P is an essential substrate that supplies the ribose moiety for nucleotide formation, this reaction connects two major metabolic pathways, the pentose phosphate shunt and nucleotide synthesis. Thus, PP-Rib-P synthetase is an indispensable housekeeping enzyme. Such a reaction would be expected to be under strict metabolic control, with regard to both enzymatic activity and gene expression.  (Switzer and Gibson, 1978; Hove-Jensen et al., 1986) and mammalian tissues (Fox and Kelley, 1971; Roth et al., 1974; Kita et al., 1989). The enzymatic activity is modulated by many effectors: Mg2+ and inorganic phosphate as activators, and ADP, 2,3-bisphosphoglycerate, or GDP as competitive or noncompetitive inhibitors (Becker et al., 1979). The enzyme is an oligomeric complex composed of subunits of about 34 kDa, and we reported that the rat liver enzyme exists as complex aggregates of 34-, 38-, and 40-kDa components, the 34-kDa species being the catalytic subunit (Kita et al., 1989).
During rat cDNA cloning experiments and amino acid sequencing of the purified enzyme, we noted the presence of two distinct types of the 34-kDa subunit, PRS I and PRS II (Taira et al., 1987; Kita et al., 1989). The predicted proteins (both 317 residues) differed by only 13 amino acids. The nucleotide sequences of the two cDNAs suggested that PRS I and PRS II mRNAs were encoded by two distinct genes, designated PRPS1 and PRPS2, respectively. Human gene mapping showed that PRPS1 and PRPS2 were located in different regions of the X chromosome (Taira et al., 1989a).
Either PRPS1 or PRPS2 mRNA, or both, was detected in almost all tissues of the rat, and levels increased after partial hepatectomy (Taira et al., 1987, 1989b). We report here the isolation and structural analysis of the entire rat PRPS1 gene as well as the determination of transcriptional start sites. The coding region is contained within a 22-kilobase (kb) DNA segment and is divided into 7 exons. The sequence of the promoter region suggests the existence of a TATA box, a CCAAT element, and Sp1-binding sites.

To obtain single-stranded DNA prior to labeling a recessed 5' end at the SacI site, the DNA fragment was strand-separated by electrophoresis on a 7 M urea, 5% polyacrylamide gel layered on a 7% gel (49:1). The two single-stranded fragments were visualized with ethidium bromide and eluted from the gel. One of these, which did not hybridize to 32P-end-labeled Primer 1, was used as the S1 probe. The probe was end-labeled with [γ-32P]ATP and T4 polynucleotide kinase to a specific activity of 9 × 10' cpm/μg. Unincorporated [γ-32P]ATP was removed with a Quick Spin Column.
S1 nuclease protection analysis using poly(A)+ RNA from YS cells was carried out as follows (Berk and Sharp, 1977). The poly(A)+ RNA (5 μg) was mixed with 6-30 × 10^4 cpm of probe in 50 μl of a solution comprising 56% deionized formamide, 0.4 M NaCl, and 1 mM EDTA. The hybridization mixture was incubated first at 90 °C for 3 min and then at 56 °C for 12 h.

[Restriction map residue: EcoRI, HindIII, SacI, and BglII sites; graphic not reproduced.]

S1 mapping (lanes 1-4) and primer extension (lanes 5 and 6) were performed as described under "Experimental Procedures." The 32P-labeled probe (the antisense strand of the RsaI/SacI fragment from +156 to -226) was annealed to yeast tRNA (lane 2) or YS cell poly(A)+ RNA (lanes 3 and 4). Lane 1, S1 nuclease minus and RNA minus; lanes 2, 3, and 4, 500, 250, and 500 units, respectively, of S1 nuclease. The 32P-labeled Primer 1, complementary to positions +133 to +156, was annealed to 5 μg of YS cell poly(A)+ RNA (lane 5) or yeast tRNA (lane 6). To compare the S1 digestion products directly with the primer extension products, the 5' end of the S1 probe was prepared to coincide with the 5' end of Primer 1 as shown in the lower portion of the figure. Primer 1 was also used in dideoxynucleotide sequencing reactions (lanes A, G, C, and T). The left photograph corresponds to the region indicated by the asterisk in the right one. The upper nucleotide sequence is complementary to the one read from the right photograph. Arrows indicate positions of the fragments generated by the S1 mapping and the primer extension. The position of the longest fragment in the S1 mapping was assigned as the start site (position +1) in the genomic gene.
was hybridized to poly(A)+ RNA (5 μg) in 5 μl of a solution containing 10 mM Tris-Cl, pH 7.4, 100 mM NaCl. Mixtures were heated at 90 °C for 3 min and transferred to 37 °C for 2 h. The primer extension reactions were carried out at 37 °C for 60 min in 50 mM Tris-Cl, pH 8.  (Agarwal et al., 1981). The mixture was passed through a Quick Spin Column and ethanol-precipitated. The extension products were analyzed by electrophoresis on a 7 M urea, 5% polyacrylamide gel as described above for S1 nuclease mapping.

RESULTS AND DISCUSSION
Isolation of Genomic Clones-About 2 × 10^5 recombinant phages from a female rat liver genomic library were screened using 32P-labeled rat PRPS1 and PRPS2 cDNA fragments as probes. Thirteen positive recombinant phages were obtained; the PRPS1 cDNA probe hybridized strongly to 12 clones, whereas the PRPS2 probe hybridized to only one. The former 12 clones were subjected to further analysis by restriction enzyme mapping and Southern blot analysis. Nine of the 12 clones formed an overlapping set spanning about 44 kb and covering the entire PRPS1 gene (Fig. 1A). Their partial nucleotide sequences were identical to that of rat PRPS1 cDNA, showing that this gene is rat PRPS1. To define the positions and boundaries of the PRPS1 exon blocks, the restriction fragments that hybridized with the cDNA probe were subcloned (Fig. 1A) and their sequences determined (Fig. 1B). The restriction maps and the exon/intron organization of the PRPS1 gene are shown in Fig. 1A. Thus, 7 exons were distributed over about 22 kb. BamHI fragments of 10 and 12 kb corresponded to 2 of the 8 bands obtained in the genomic Southern blot analysis published previously (Taira et al., 1989b).
A putative PP-Rib-P binding site (Argos et al., 1983; Hove-Jensen et al., 1986) was encoded in exon 5 (indicated by the thick bar). Segments 1, 2, and 3 of a putative ATP binding site (Fry et al., 1986) could correspond to three regions (Taira et al., 1987; indicated by thin bars in Fig. 2). These regions were encoded in two exons, 4 and 5. A highly conserved region involved in the binding of the divalent cation and nucleotides has been reported, the amino acid sequence being DLHASQIQGFFDIPVD (Bower et al., 1989). This domain was divided between two exons, 3 and 4 (thick broken bar in Fig. 2).

Location of Transcription Initiation Site-To determine the 5' end of the PRPS1 mRNA, S1 nuclease protection assays were performed with poly(A)+ RNA isolated from YS cells, which express PRPS1 mRNA at a high level (Taira et al., 1989b).
RsaI site is shown only at position -226). Sp1, sequences conforming to the consensus Sp1 site; CCAAT, a putative CCAAT element; and TATA, a TATA-like sequence, are boxed. The approximate lengths of introns are indicated. Underlines, GT and AG residues present at the 5' and 3' boundaries of introns and putative lariat branch points of A in the 3' splice acceptor sequences. Broken thick bar, a divalent cation-nucleotide binding site; thin bars numbered 1-3, regions corresponding to segments 1-3, respectively, of the ATP binding site; thick bar, a putative PP-Rib-P binding site.

The DNA fragment (RsaI/SacI, 382 bp) labeled at the SacI site, a position 24 bases downstream from the translation initiation codon, was used as a probe. Protected fragments were sized by comparison with the genomic nucleotide sequence ladder initiated from Primer 1, which was labeled at the same position as the S1 probe (Fig. 3). A major protected fragment of 147 bases and minor protected fragments of 148, 146, and 145 bases were detected (lanes 3 and 4). These results suggested that transcription of the PRPS1 gene started from the tetranucleotide TGTA at positions +1 through +4 (as described below) in YS cells.
For confirmation, primer extension analysis was performed. A major extension product of 148 bases was obtained upon reverse transcription with Primer 1, in addition to minor extension products of 146, 145, and 144 bases. These values are in good agreement with those from the S1 protection assay. Therefore, the 5' end of the TGTA sequence suggested by the S1 mapping is defined as +1, which is located 129 bases upstream from the translation initiation codon ATG.
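The coordinate bookkeeping used here can be checked with a short sketch. It assumes the usual promoter numbering in which there is no position 0 (-1 is followed directly by +1), and it infers that the A of the ATG codon lies at +130 from the statement that +1 is 129 bases upstream of it; the function names and the constant ATG_POS are illustrative and do not come from the paper.

```python
# Minimal sketch of promoter coordinate arithmetic (numbering without a position 0).
# ATG_POS = +130 is an assumption inferred from "+1 is 129 bases upstream of ATG".

ATG_POS = 130


def span_length(start: int, end: int) -> int:
    """Count bases from start to end inclusive, skipping the nonexistent position 0."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    low, high = sorted((start, end))
    length = high - low + 1
    if low < 0 < high:
        length -= 1  # the span crosses the -1/+1 junction
    return length


def bases_upstream_of_atg(position: int) -> int:
    """Number of bases from a start point up to, but not including, the A of ATG."""
    return ATG_POS - position


# The RsaI/SacI probe runs from -226 to +156.
probe_length = span_length(-226, 156)

# The four candidate start points are the TGTA tetranucleotide at +1 to +4.
start_distances = [bases_upstream_of_atg(p) for p in range(1, 5)]

print(probe_length)
print(start_distances)
```

Under these assumptions the probe length reproduces the stated 382 bp, and the four start points fall 126 to 129 bases upstream of the ATG, as in the abstract.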
Predicted Promoter Elements-Several possible promoter elements were identified in the 5'-flanking region (Fig. 2). The TAATTTAAT sequence at positions -30 to -21 is embedded in a GC-rich region (from -43 to -5). Though this sequence is not similar to the canonical TATA box, TATA(A/T)A(A/T)A (Breathnach and Chambon, 1981), it seems to serve as the TATA box for the following reasons: 1) the 5' end of the TAATTTAAT sequence is located at -30, which corresponds to the general location of the TATA box at positions -34 to -26 (Breathnach and Chambon, 1981); 2) the mRNA start site of the PRPS1 gene was confined to a narrow region rather than spread over multiple regions, whereas multiple initiation sites have been noted in several genes lacking a TATA box (Yamaguchi et al., 1987); and 3) the ATTTA motif in the TAATTTAAT sequence is found in the reported TATA-like sequences of 14 of 168 genes (Bucher and Trifonov, 1986), including the SV40 early promoter (TATTTAT) (Mathis and Chambon, 1981). A CCAAT sequence was found at position -80. This promoter element generally occurs between positions -70 and -80 in eukaryotic genes (Breathnach and Chambon, 1981). Furthermore, the sequence CGGGTCCAGCCAATCCGGA (positions -89 to -71) resembles a CCAAT consensus sequence, (C/T)AG(C/T)NNN(A/G)RCCAATCNNNR, which is bound by the CCAAT-binding protein CP2 (Chodosh et al., 1988).
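How closely the -89 to -71 sequence follows the CP2 consensus can be made explicit with a position-by-position comparison. The sketch below rewrites the consensus with the standard single-letter ambiguity codes (Y = C/T, R = A/G, N = any base); the function name and the code table are illustrative choices, not part of the original analysis.

```python
# Minimal sketch: compare a site with a degenerate consensus, position by position.
# Ambiguity codes follow the standard nucleotide nomenclature (subset used here).

AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def mismatch_positions(site: str, consensus: str) -> list[int]:
    """Return 1-based positions where the site base is not allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(site, consensus))
        if base not in AMBIGUITY[code]
    ]


prps1_site = "CGGGTCCAGCCAATCCGGA"     # positions -89 to -71, from the text
cp2_consensus = "YAGYNNNRRCCAATCNNNR"  # (C/T)AG(C/T)NNN(A/G)RCCAATCNNNR

mismatches = mismatch_positions(prps1_site, cp2_consensus)
print(len(prps1_site) - len(mismatches), "of", len(prps1_site), "positions match")
print("mismatches at site positions:", mismatches)
```

With the sequences as printed in the text, the two differences fall in the 5' portion of the site, while the CCAAT core and its 3' flank agree with the consensus, which is consistent with describing the match as a resemblance rather than an identity.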
The GC boxes (GGGCGG, putative Sp1-binding sites) were found at positions -103 (reverse orientation), -43 (reverse orientation), and -10, all of which match the Sp1 consensus sequence, (G/T)(G/A)GGCG(G/T)(G/A)(G/A)(C/T) (Briggs et al., 1986), perfectly, except for an A at the 3' end of the site at position -10. It is tempting to speculate that these putative Sp1 sites may facilitate the recognition of TAATTTAAT, as has been suggested for the SV40 early promoter region, which contains six Sp1 sites and a downstream "weak" TATA box of sequence TATTTAT (Vigneron et al., 1984; Mathis and Chambon, 1981).
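A GC box in reverse orientation appears on the strand being read as the reverse complement of GGGCGG, that is, CCGCCC. The sketch below shows how both orientations can be located in a sequence. The test sequence is a made-up toy string, not the PRPS1 promoter, and the function names are illustrative.

```python
# Minimal sketch: locate the GC-box core GGGCGG in either orientation.
# The sequence scanned below is a hypothetical toy example, not genomic data.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_gc_boxes(seq: str, core: str = "GGGCGG") -> list[tuple[int, str]]:
    """Return (0-based offset, orientation) for each occurrence of the core."""
    reverse_core = reverse_complement(core)
    hits = []
    for offset in range(len(seq) - len(core) + 1):
        window = seq[offset:offset + len(core)]
        if window == core:
            hits.append((offset, "forward"))
        elif window == reverse_core:
            hits.append((offset, "reverse"))
    return hits


toy_sequence = "AACCGCCCTTTTGGGCGGAA"  # hypothetical sequence for illustration
gc_boxes = find_gc_boxes(toy_sequence)
print(reverse_complement("GGGCGG"))
print(gc_boxes)
```

Offsets reported by this sketch are 0-based string indices; converting them to promoter coordinates relative to +1 would require the actual genomic sequence and the start-site position.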
Thus, the rat PRPS1 gene seems to possess three kinds of fundamental promoter elements which may play a role in expression of this housekeeping gene. In this regard, we compared its promoter sequence with those of other housekeeping genes and found that the rat PRPS1 sequence is homologous to those of the rat and chicken β-actin genes (Nudel et al., 1983; Kost et al., 1983); CCAAT boxes, TC-rich regions, and TATA boxes embedded in GC-rich regions correspond to each other (Fig. 4). The significance of this homology, as well as the activities of the PRPS1 promoter elements, remains to be clarified.