Multiple mRNAs from the Punch Locus of Drosophila melanogaster Encode Isoforms of GTP Cyclohydrolase I with Distinct N-terminal Domains*

The GTP cyclohydrolase I gene of Drosophila melanogaster, the Punch locus, encodes alternative transcripts of 1.7 and 1.75 kilobases (kb). These transcripts are differentially expressed throughout Drosophila development. cDNA clones representing these transcripts, corresponding genomic regions, and polymerase chain reaction-amplified primer extensions of the 5' ends of the RNAs were sequenced. Both RNAs contain five exons and derive from primary transcripts that extend over approximately 8 and 4 kb for the 1.7- and 1.75-kb RNAs, respectively. Their 5'-most exons are unique and spliced onto common 3' exons. The cDNAs each contain a single long open reading frame, which can be translated into a polypeptide of 273 amino acids for the 1.7-kb mRNA and 308 amino acids for the 1.75-kb mRNA. The unique exons confer distinct N-terminal domains to each predicted protein. Sequence comparisons reveal that the Drosophila GTP cyclohydrolase isoforms encoded by the multiple transcripts are highly similar to GTP cyclohydrolases from humans, rodents, and bacteria, with one significant exception. The N-terminal domains encoded by the transcript-specific 5' exons cannot be aligned with the N termini of any other

The first enzyme in the pteridine biosynthetic pathway is GTP cyclohydrolase I, which converts GTP to dihydroneopterin triphosphate, the rate-limiting step in the production of pteridines and therefore in the pteridine-limited synthesis of the catecholamines and serotonin (Nichol et al., 1985). It is also induced very early in the responses of some cell types to cytokines, suggesting that it plays a critical role in these responses as well (Ziegler, 1987; Werner et al., 1990; Huber and Tratkiewicz, 1990).
Pteridines serve as eye pigments in Drosophila, in addition to having functions that parallel those defined in mammalian cells (O'Donnell et al., 1989). Drosophila GTP cyclohydrolase I shares many catalytic characteristics with the mammalian enzyme, and it is likely that the regulation of GTP cyclohydrolase expression is also the central event in pteridine-related functions in Drosophila (Weisberg and O'Donnell, 1986; Hatakeyama et al., 1989). This expectation led to a genetic analysis of GTP cyclohydrolase in Drosophila, which established that the enzyme is encoded by the Punch (Pu) locus (so named because some alleles impart a mutant eye color) (Mackay and O'Donnell, 1983). More than 60 mutant alleles of this gene were generated, and the ensuing genetic characterization demonstrated that Pu is a complex locus (Mackay et al., 1985; Reynolds and O'Donnell, 1988). This genetic designation indicates that some of the functions of the locus can be mutated independently of other functions, even though genetic complementation tests place all alleles within a single gene. Most mutant alleles are lethal during late embryogenesis, when homozygous embryos exhibit phenotypes characteristic of catecholamine defects. Some mutant alleles can be separated into distinct classes based upon their interallelic complementation characteristics, their effects on subsets of Pu mutant phenotypes, and their distinct effects on the GTP cyclohydrolase polypeptide (Reynolds and O'Donnell, 1987; O'Donnell et al., 1989). Two functionally specific classes of mutations were also identified. One affects only the expression of GTP cyclohydrolase in the developing eye, where it participates in the production of pteridine pigments (Mackay and O'Donnell, 1983). The other class exhibits arrest of embryonic nuclear divisions and segment pattern defects (Reynolds and O'Donnell, 1987; O'Donnell et al., 1989; X. Chen, E. R. Reynolds, and J. O'Donnell, in preparation).
FIG. 1. [Restriction map of the Pu region; the beginning of the legend, which ends with a list of restriction enzymes ("... SalI)."), is missing.] The locations of chromosomal breakpoints and other restriction fragment length changes that alter Pu function are indicated below the restriction map, with the symbols referring to the designation for each mutant allele. Small deletions are shown as thin bars; insertions are shown as triangles. Other arrows show the placement of translocation and inversion breakpoints. Pu alleles r1, r331, and rAA4 affect only (or primarily) the ability of the locus to perform the eye pigmentation function. All remaining alleles disrupt vital functions during embryogenesis and also affect adult GTP cyclohydrolase.
A molecular analysis of the Pu locus was initiated to assess the basis of the genetic complexity. The locus was localized in a 120-kb chromosomal walk to a 30-kb region delineated by the positioning of a number of Pu mutations that cause restriction fragment length polymorphisms (McLean et al., 1990). The placement of the mutant alleles, shown in Fig. 1, suggested that the genetic complexity of Pu is rooted in a complex structural organization, since each allele affecting only eye pigmentation maps within a central element that is flanked by regions to which lethal Pu alleles map. Organizational complexity was also suggested by the identification and preliminary mapping of 7-12 developmentally regulated mRNAs from this region. Because the biochemistry and function of GTP cyclohydrolase have been best characterized in adult Drosophila heads, initial studies have concentrated on transcripts expressed there. In particular, our attention was drawn to the 1.7-kb transcript, which is found primarily in young adult heads during the period coinciding with the transient elevation of GTP cyclohydrolase activity during eye pigmentation.
We had previously shown that the product of Pu that is responsible for this activity is a 39-kDa polypeptide and that this polypeptide is produced by the 1.7-kb RNA (Weisberg and O'Donnell, 1986; McLean et al., 1990). The 39-kDa polypeptide and the 1.7-kb RNA are both absent in mutants homozygous for eye pigmentation-specific Pu alleles. Therefore, we set out to isolate cDNA clones representing the 1.7-kb RNA. Here we report the isolation and structural characterization of those cDNAs, as well as a related class corresponding to the 1.75-kb RNA (Fig. 1). The transcripts are interrupted by several introns, and each has an alternative 5' exon spliced to common 3' exons. The inferred protein sequence derived from these transcripts predicts polypeptides with identical C-terminal and distinct N-terminal domains. We find that the sequence common to these two alternate forms is highly similar to those derived from cDNA clones of rat, mouse, and human GTP cyclohydrolase, as well as bacterial forms of the enzyme.

EXPERIMENTAL PROCEDURES
Isolation of cDNA Clones-cDNA clones were isolated from a λgt10 library representing adult Drosophila head transcripts (a gift from A. Cowman). Plaque lifts were performed using MSI nylon filters, according to the procedure described by Sambrook et al. (1989). The filters were probed with cloned Drosophila genomic DNA from coordinates +8 to +13 (McLean et al., 1990) that had been random-primer labeled according to the procedure of Vogelstein (1983, 1984). After verification of positive plaques by rescreening isolates with probes from region 0 to +3.5 and +3.5 to +7, DNA was prepared from the recombinant phage with Lambdasorb™ phage adsorbent (Promega), according to the supplier's protocol. After restriction enzyme mapping of all clones, selected representatives of each cDNA class were digested with EcoRI, and the resulting fragments were gel-purified and subcloned into the Bluescript phagemid (Stratagene).
Amplification and Cloning of 5' Ends of Pu Transcripts-The amplification of cDNA ends (Frohman et al., 1988) was performed using a 5'-RACE kit (Life Technologies, Inc.). Briefly, 1 µg of poly(A+) RNA isolated from the heads of 0-12-h-old adult Drosophila was reverse-transcribed with Superscript reverse transcriptase, using the primer SK-12 (5'-GGAACGTGCACTTCTCGTG-3') located at the 5' end of Exon II, which is common to the 1.7- and 1.75-kb RNAs. After RNase H digestion, the cDNA was purified on a Glassmax cartridge and tailed with dC. The 5' ends were then amplified with Ampli-Taq using an upstream anchor primer supplied with the kit and downstream exon-specific primers linked to restriction enzyme cloning sites. For amplification of the 5' end of the 1.7-kb RNA, the exon-specific primer was 5-2 (5'-GGGAA?TCA?TGGTGGACTCTGC-3'); for the 1.75-kb RNA, the exon-specific primer was SK-8 (5'-GGAACGTGCACTTCTCGTGG-3').
These sequences are located near the 5' ends of Exons Ia and Ib, respectively. After a 40-cycle PCR amplification, the products were purified in Glassmax columns, digested with appropriate restriction enzymes, and cloned into the pBluescript SK+ vector. The ligation products were transformed into electrocompetent XL1-Blue cells. White colonies were screened by Southern blot analysis with Exon Ia- and Ib-specific probes.
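As a rough illustration of how a primer such as SK-8 relates to the RNA it anneals to, the sketch below computes the reverse complement, GC fraction, and an approximate melting temperature for the SK-8 sequence as printed above. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here as a quick estimate; it is not a method described in this paper, and the function names are illustrative only.

```python
# Minimal sketch: basic properties of an antisense primer.
# SK8 is the sequence as printed in the text, written 5'->3'.
SK8 = "GGAACGTGCACTTCTCGTGG"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in deg C (Wallace rule, an assumption here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# The reverse complement is the sense-strand sequence the primer would anneal to.
print(reverse_complement(SK8))
print(gc_fraction(SK8), wallace_tm(SK8))
```

The Wallace rule is only meaningful for short oligonucleotides and ignores salt and primer concentration, so the value should be read as an order-of-magnitude guide.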
Nucleic Acid Hybridizations-DNA and RNA isolations and subsequent Southern and Northern blotting were performed as described by McLean et al. (1990).
DNA Sequence Analysis-The nucleotide sequences of cDNA inserts in pBluescript and genomic DNA inserts in pBluescribe (Stratagene) were determined using the Sequenase kit, version 2.0 (U. S. Biochemical Corp.), the Taq polymerase sequencing kit (Stratagene), or the Bst sequencing kit (Bio-Rad), according to the suppliers' protocols (Sanger et al., 1977). Regions of high secondary structure were sequenced using 7-deaza-dGTP or dITP analogues. Sequencing strategies are diagrammed in Fig. 2. For most genomic regions that were co-extensive with cDNA sequences, a single strand was sequenced and the sequence was verified by alignment with the cDNA sequence. In regions where alignment was questionable or where genomic sequence was not overlapped by cDNA sequence, both strands of the genomic DNA were sequenced. Both strands of the representative clones from each cDNA class were sequenced.
In Vitro Transcription and Translation-In vitro transcription of cDNA clones was performed using the mCAP™ RNA capping kit from Stratagene, following the supplier's protocol. In vitro translation reactions were carried out with a rabbit reticulocyte translation kit (Life Technologies, Inc.), incorporating L-[4,5-³H]leucine (161 Ci/mmol).
Computer Sequence Analysis-Sequence alignments and homology and structural analyses were performed using the GCG (Genetics Computer Group) Sequence Analysis Software Package (Devereux et al., 1984) and the BLAST network service at the NCBI (Altschul et al., 1990).

RESULTS AND DISCUSSION
Isolation of cDNA Clones and Initial Characterization-An adult head cDNA library was chosen for the isolation of recombinant molecules because GTP cyclohydrolase and the 1.7-kb RNA are abundant in heads within the first several hours following eclosion. The library was screened initially with a genomic DNA probe that readily detected a 3' portion of the 1.7-kb RNA, and which we now know is complementary to most of Exon V (genomic clone coordinates +8 to +13) (McLean et al., 1990). Approximately 70,000 plaques were screened, and 10 positive cDNAs were isolated. Eight of these clones were subjected to further analysis. To determine which transcripts the clones represent, each was rescreened with upstream genomic DNA that detects either the 1.7-kb RNA only or both the 1.7- and 1.75-kb RNAs in adult heads (Fig. 1). The clones were then hybridized to dot blots of genomic DNA from 120 kb of the region in and around the Pu locus. All of the cDNA clones also were hybridized to each other to ascertain cross-complementarity and to Northern blots to detect complementary RNA.
FIG. 2. [Beginning of legend missing.] Exons Ia and Ib are unique to the 1.7- and 1.75-kb transcripts, respectively, with each appearing to initiate from separate promoters. These exons are alternatively spliced to the same splice acceptor site at the 5' end of Exon II. The open portions of these two exons as well as the 3' portion of Exon V are untranslated sequences. All other sequences contain open reading frame. The arrows beneath the molecular map of the transcripts represent genomic subclones and cDNA fragments that were sequenced, using either primers within the vectors or internal primers generated from prior sequencing.
All but one of the cDNA clones exhibited cross-complementarity and detected the 1.7-kb transcript in the heads of newly eclosed adult Drosophila. These cDNAs varied in size from under 1 kb to approximately 1.6 kb. Restriction map comparisons and Southern blot analysis indicated that the shorter inserts were prematurely terminated cDNAs. The 1.6-kb inserts, which were in the appropriate size range for either the 1.7- or 1.75-kb RNAs, fell into two groups. Representatives of one group appeared to coincide with the 1.7-kb transcript, extending over at least 8-10 kb of the genome, into that portion of the locus defined earlier as the "eye pigmentation-specific" region (Fig. 1) (McLean et al., 1990). cDNAs in the other group were of comparable size to the first group but extended only over 4-5 kb of the genomic sequence, appearing to coincide with the 1.75-kb RNA. The remaining clone represents a portion of a 9-kb transcript derived from the neighboring tud locus (Golumbeski et al., 1991) and was not analyzed in greater detail.
Expression of Transcripts Corresponding to the Isolated cDNA Clones-Representatives of each class of cDNAs were used as probes in a developmental Northern analysis. All hybridized to the 1.7-kb adult head transcript and to a 1.75-kb transcript expressed in embryos, larvae, and adults (Fig. 3A), supporting the preliminary conclusion based on genomic DNA probing of RNA blots that the 1.7- and 1.75-kb transcripts are structurally related. Assignment of clones into 1.7- and 1.75-kb classes was verified after the completion of sequencing by probing Northern blots with the PCR-amplified unique portions of each clone. Each was complementary to the expected single transcript (Fig. 3B). Fig. 3B also illustrates the differential regulation of the two transcripts. The 1.7-kb RNA, which we previously identified as the transcript required for eye pigment production, is clearly far more abundant in the heads of young adults than in any other tissue or developmental stage. The 1.75-kb RNA is more abundant than the 1.7-kb RNA in larvae and in embryos.
FIG. 3. [Beginning of legend missing.] ... adult heads; AB, adult bodies. A, the detection of 1.7- and 1.75-kb RNAs by a full-length cDNA clone representing the 1.7-kb transcript. This blot demonstrates that the two RNAs show extensive sequence similarity. The probe used as a loading control in this experiment was the Drosophila ras homolog (Dras) (Mozer et al., 1985). B, the 1.7-kb RNA was detected using PCR-amplified Exon Ia as a probe. Expression of this RNA is primarily restricted to adult heads. Longer exposures reveal very low levels of expression in third instar larvae and in 12-24-h embryos. The 1.7-kb transcript observed in the adult body lane is due to incomplete separation of heads from these tissues. Other preparations show no evidence of 1.7-kb RNA expression in these tissues. The 1.75-kb RNA was detected using PCR-amplified Exon Ib as a probe. Expression of this transcript is observed throughout development, but the highest levels of expression are observed in late third instar larvae and in adult heads. The rp49 ribosomal protein gene is used as a probe for the RNA loading control.
Sequence Analysis-To facilitate the remaining discussion of the analysis, the Pu cDNA clones will henceforth be referred to as the 1.7 and 1.75 cDNAs. Several representatives of each cDNA class and genomic clones from the relevant portions of the Pu region were sequenced, following the strategy in Fig. 2. The 1.7 cDNA is 1587 nucleotides in length, and the 1.75 cDNA is 1601 nucleotides. The clones therefore appear to be somewhat shorter than predicted from the size of the corresponding transcripts. Using primers near the 5' ends of each cDNA, primer extensions and amplifications were performed using the 5'-RACE technique (Frohman et al., 1988). The amplified products were 175 base pairs in length for the 1.7-kb RNA and 260 base pairs in length for the 1.75-kb RNA, including linkers and 5' tails. Sequencing of multiple clones from each amplification revealed that the full-length products are 1632 bases and 1739 bases, sizes corresponding closely to those determined by Northern blot analysis.
Comparison of the cDNA and genomic sequences confirms that the 1.7-kb transcript arises from exons spanning at least 8-10 kb of the Pu region. This RNA initiates in a genomic region at approximately coordinate +1. The first exon of the 1.75-kb transcript is found in an intron of the 1.7-kb transcript, at approximately coordinate +5.5 (Fig. 2). Hence, these two RNAs appear to arise from distinct promoters. The genomic DNA upstream of the transcription units has been sequenced, but no functional analysis has yet been performed. Sequences from nucleotide 362 for the 1.7 RNA and 469 for the 1.75 RNA to their 3' ends are identical. Both forms of the transcript are composed of five exons, with Exons Ia and Ib being alternatively spliced at the Exon II acceptor site. Introns b, d, and e have been completely sequenced, as well as extensive regions 5' and 3' in Introns a and c. All introns have consensus splice donor and acceptor sites at their 5' and 3' ends (Mount, 1982).
The sequences of the two transcripts, portions of the introns, and the inferred protein sequence are shown in Fig. 4. An initiator ATG is found beginning at base 159 in the 1.7-kb RNA. In the 1.75-kb RNA, an in-frame ATG is found at base 169. The methionine codon is preceded by GAAC in the 1.7-kb transcript and by CGAG in the 1.75-kb transcript. Both sequences are in reasonable agreement with the C(A/g)A(C/A) consensus sequence for translation starts in Drosophila (Cavener and Ray, 1991).
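The comparison with the translation-start consensus can be made explicit with a position-by-position check. The sketch below is only an illustration: it encodes the C(A/g)A(C/A) consensus quoted above as a list of allowed bases, treating the lowercase g as an allowed (less frequent) alternative, which is an assumption about the notation. The function name is hypothetical.

```python
# Allowed bases at positions -4..-1 relative to the ATG,
# taken from the consensus C(A/g)A(C/A) quoted in the text.
CONSENSUS = [{"C"}, {"A", "G"}, {"A"}, {"A", "C"}]


def context_matches(context):
    """Return one True/False per position for a 4-base upstream context."""
    if len(context) != len(CONSENSUS):
        raise ValueError("context must be exactly 4 bases long")
    return [base in allowed for base, allowed in zip(context.upper(), CONSENSUS)]


# Upstream contexts reported for the two transcripts.
for transcript, context in (("1.7-kb", "GAAC"), ("1.75-kb", "CGAG")):
    hits = context_matches(context)
    print(transcript, context, f"{sum(hits)} of {len(hits)} positions match")
```

Under this reading, each context matches the consensus at three of four positions, which fits the description of "reasonable agreement" rather than a perfect match.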
cDNAs representing both transcripts terminate with poly(A) tails. An AATAAA polyadenylation signal is located 19 bases from the poly(A) addition point. Of interest in the 3'-untranslated region is the occurrence of four repeats of ATTTA (underlined in Fig. 4), a sequence similar to that shown to confer rapid turnover of mRNAs (Shaw and Kamen, 1986). These signals are consistent with the behavior of the 1.7-kb message, which is expressed abundantly in the heads of newly eclosed Drosophila and at 10% of that level 24 h later (McLean et al., 1990). A second potential polyadenylation signal is located 226 nucleotides 5' of the terminal adenylation signal. Since there is an unusually long distance between this sequence and the poly(A) addition point, it probably is not used in these particular transcripts. However, we observe additional RNA size classes at other developmental stages (O'Donnell et al., 1989), so it is possible that the polyadenylation signal is used in other Pu transcripts. Three of the four putative instability signals would be eliminated by termination at the earlier site.
The transcript lengths predicted from the cDNA molecules and their primer extensions are similar to those observed in Northern blot studies. Primer extension experiments using three additional methods yield the same size products. The proposed initiator codons are located within acceptable translation initiation contexts, and there are no other methionine codons upstream of these. The predicted molecular masses of the polypeptides encoded by these transcripts are approximately 30 and 34 kDa for the 1.7- and 1.75-kb RNA, respectively. These are in reasonable agreement with the nearly 26-kDa molecular mass predicted for the polypeptide deduced for the full-length rat liver GTP CH cDNA (Hatakeyama et al., 1991). There are, however, several complications in this apparently straightforward analysis.
First, the predicted protein from the 1.7-kb transcript is 23% lower in molecular size than the 39-kDa GTP CH polypeptide detected in adult heads or translated in vitro from the 1.7-kb RNA. We know of no posttranslational modifications that could account for this discrepancy. These proteins do not appear to be glycosylated, nor are they sufficiently phosphorylated to account for the difference between the expected and observed molecular weights. RNA was transcribed in vitro from 1.7 and 1.75 cDNA clones, and these transcripts were used as templates for in vitro translation reactions. The putative initiator methionine codons are the only methionine codons in the 5' portion of either clone. Both transcripts produce discrete polypeptides. The product derived from the 1.7 cDNA migrates as a 35-kDa protein, and the 1.75 cDNA product migrates as a 40-kDa protein (data not shown). In both cases, the apparent molecular weight is significantly greater than predicted. The discrepancy between predicted and observed molecular weight is due to anomalous electrophoretic mobility of the protein. This anomaly is not restricted to the Drosophila enzyme. Hatakeyama et al. (1991) find that the predicted molecular mass of the mature rat liver GTP CH derived from full-length cDNA molecules is 25.8 kDa, 14% lower than the molecular weight of the purified liver enzyme. Similarly, the predicted molecular mass of GTP CH derived from a human liver cDNA is 27.9 kDa, while the purified enzyme has a subunit molecular mass of 50 kDa, a 44% discrepancy (Togari et al., 1992; Schoedon et al., 1989).
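The mass figures quoted above can be checked with simple arithmetic. The sketch below is a back-of-the-envelope illustration, not the calculation used in the paper: it assumes an average residue mass of about 110 Da (a common rule of thumb, introduced here as an assumption) and applies it to the 273- and 308-residue polypeptides given in the abstract, then expresses a predicted mass as a percentage below an observed one.

```python
# Assumption: mean mass of an amino acid residue is roughly 110 Da.
AVG_RESIDUE_MASS_KDA = 0.110


def predicted_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from its length."""
    return n_residues * AVG_RESIDUE_MASS_KDA


def percent_below(predicted_kda, observed_kda):
    """How far the predicted mass falls below the observed mass, in percent."""
    return 100.0 * (observed_kda - predicted_kda) / observed_kda


# Polypeptide lengths from the abstract: 273 aa (1.7-kb) and 308 aa (1.75-kb).
for transcript, n_residues in (("1.7-kb", 273), ("1.75-kb", 308)):
    print(transcript, round(predicted_mass_kda(n_residues), 1), "kDa")

# Drosophila 1.7-kb product: ~30 kDa predicted versus 39 kDa observed.
print(round(percent_below(30.0, 39.0)), "% below observed")
# Human liver enzyme: 27.9 kDa predicted versus 50 kDa observed.
print(round(percent_below(27.9, 50.0)), "% below observed")
```

With these inputs the estimates come out near 30 and 34 kDa, and the discrepancies near 23% and 44%, matching the values stated in the text.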
A second unusual feature of the Drosophila sequence is that there are 52 codons of potential open reading frame upstream of the initiator methionine codon in the 1.7-kb transcript. This feature is shared with the mammalian counterparts. The full-length rat liver cDNA molecule consists of open reading frame extending to the 5' end of the molecule, 42 codons upstream of the initiator methionine. Because the rat liver open reading frame aligns with the N-terminal sequence obtained from the purified rat liver GTP CH (Hatakeyama et al., 1991), it appears that the upstream sequences are not translated in this form of the enzyme. A similar situation exists with respect to the human GTP CH cDNA clones, which also consist of open reading frame from the initiator methionine to the 5' end of the molecules, a total of 22 codons upstream of the initiator AUG (Togari et al., 1992). In the mouse GTP CH cDNA molecules, the first nonsense codon is 19 codons upstream of the initiator methionine (Nomura et al., 1993). The total body of biochemical evidence makes it unlikely that these upstream sequences are translated in these forms of the enzyme, even though our alignments of the mammalian sequences show them to be very highly conserved. Since we have recently isolated numerous additional cDNAs from embryonic libraries that display further alternative 5' splicing in regions of open reading frame, we suggest the possibility that much of the 5' open reading frames in all of these transcripts might, in fact, be utilized in other forms of the enzyme expressed in other tissues or at different periods of development.
Analysis of the Deduced Protein Sequences-Potential secondary structure determinations of the deduced proteins reveal few structural features of note, with one exception. The unique region of the 1.75-kb product contains a strongly hydrophilic region derived in large part from a high arginine content in the residue 59-87 domain (12 arginines/29 residues). By comparison, the N-terminal domain of the deduced sequence from the 1.7-kb transcript is quite hydrophobic (30/66 residues are nonpolar), suggesting that the two proteins function in rather different cellular environments. The organization and extent of the hydrophobic areas, however, are not consistent with the structure of an integral membrane protein.
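The kind of composition-based comparison described above can be sketched as follows. This is an illustration only: the Kyte-Doolittle hydropathy scale and the particular set of residues counted as nonpolar are assumptions chosen here, not the methods named in the paper, and the two peptides are hypothetical examples rather than the Pu N-terminal domains.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Assumption: residues treated as nonpolar for the composition count.
NONPOLAR = set("AVLIMFWPG")


def fraction(seq, residues):
    """Fraction of positions in seq occupied by any of the given residues."""
    seq = seq.upper()
    return sum(aa in residues for aa in seq) / len(seq)


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy; negative values are hydrophilic."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


# Hypothetical peptides, not taken from the Pu sequences.
arg_rich = "RSRRPARGRR"
nonpolar_rich = "MAVLPIGAFT"
print(fraction(arg_rich, {"R"}), round(mean_hydropathy(arg_rich), 2))
print(fraction(nonpolar_rich, NONPOLAR), round(mean_hydropathy(nonpolar_rich), 2))

# Proportions quoted in the text: 12/29 arginines and 30/66 nonpolar residues.
print(round(100 * 12 / 29), round(100 * 30 / 66))
```

The quoted proportions correspond to roughly 41% arginine in the residue 59-87 region and roughly 45% nonpolar residues in the 1.7-kb N-terminal domain.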

FIG. 4. [Nucleotide sequences of the 1.7- and 1.75-kb transcripts, portions of the introns, and the inferred protein sequence. The ATTTA repeats in the 3'-untranslated region are underlined. The sequence listing is not legible in this copy.]
... product, which is solely responsible for eye pigmentation, is quite hydrophobic but is not an integral membrane protein (Weisberg and O'Donnell, 1986).
Sequence Comparisons-Searches of the sequence data banks revealed that the deduced sequences of the Pu proteins are highly similar to both mammalian and bacterial GTP cyclohydrolases (Fig. 5). We show only the alignment of the rat liver and human GTP cyclohydrolase sequences with the deduced Drosophila sequence; mouse brain GTP cyclohydrolase is virtually identical to the rat liver sequence (Nomura et al., 1993), and the bacterial sequences are discussed below. The rat liver and human GTP cyclohydrolase sequences (Hatakeyama et al., 1991; Togari et al., 1992) are approximately 60% identical and 75% similar to both Drosophila sequences. The similarities between the mammalian and Drosophila sequences extend throughout the entire length of the deduced sequences and include long stretches of complete identity. The region of highest similarity (74% identity; 87% similarity) begins about midway through Exon 2 and extends to the end of the coding region. The more N-terminal sequences are much less similar and cannot be aligned to any significant extent. A line below the sequence beginning at position 142 in the aligned sequences shown in Fig. 5 designates a domain also noted by Hatakeyama et al. (1991) in the rat liver cDNA sequence, which is very similar to the dihydrofolate binding region of dihydrofolate reductase and may be involved in pteridine binding. Two highly conserved residues, a tryptophan at position 152 (in the human and rat alignments) and a phenylalanine at position 161, are thought to interact with the pterin moiety of dihydrofolate. We find the phenylalanine but not the tryptophan residue in this region of the Drosophila sequence.

FIG. 5. [Alignment of the deduced Drosophila GTP cyclohydrolase sequences with the rat liver and human GTP cyclohydrolase sequences. The alignment is not legible in this copy.]
Although they have not been included in the sequence alignments shown in Fig. 5, the GTP cyclohydrolases from Bacillus subtilis and Escherichia coli are highly similar to the Drosophila, as well as the mammalian, sequences. The B. subtilis GTP cyclohydrolase, a product of the mtrA gene, is 46% identical and 65% similar to the Drosophila protein, while the E. coli enzyme is 36% identical and 54% similar (Gollnick et al., 1990; Babitzke et al., 1992; Katzenmeier et al., 1991).
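Percent identity and percent similarity figures of this kind are computed from a pairwise alignment. The sketch below shows one simple way to do so, under stated assumptions: the conservative-substitution groups, the choice to ignore gapped columns in the denominator, and the toy aligned sequences are all introduced here for illustration and need not match the scoring used by the GCG programs cited in this paper.

```python
# Assumption: residues within a group count as "similar" to one another.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def is_similar(a, b):
    """True if two residues are identical or fall in the same group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_similarity(aligned_1, aligned_2):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_1) != len(aligned_2):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_1.upper(), aligned_2.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in pairs)
    similar = sum(is_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Toy alignment (hypothetical sequences, '-' marks a gap).
identity, similarity = identity_similarity("MKT-LLV", "MRTALIV")
print(round(identity, 1), round(similarity, 1))
```

Different programs normalize by alignment length, shorter-sequence length, or ungapped columns, so percentages from different sources are only approximately comparable.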
To date, the human gene is the only other GTP cyclohydrolase gene reported to undergo alternative splicing (Togari et al., 1992). Interestingly, the human forms differ only at their 3' ends, while the Drosophila forms described here, as well as others expressed during embryogenesis, differ only at their 5' ends, all resulting in distinct N-terminal protein domains. We note that the human forms diverge from one another at a point that is precisely equivalent to the point at which the Drosophila transcripts splice Exon 4 to Exon 5 (see Fig. 4). One of the human forms continues on with an open reading frame that is essentially identical to that of Drosophila, suggesting the possibility that the organization of at least the 3' portion of the gene is conserved between the human and Drosophila genomes, with identically positioned introns. Although we have not yet detected any evidence of alternative splicing at this point nor any alignments of the remaining human transcripts with Drosophila intron sequence, it is conceivable that such forms would have been missed by our cDNA library screening strategy. 3'-RACE screening of RNA from defined points in Drosophila development should help to resolve this issue.
Another potentially important similarity between the Drosophila and human GTP cyclohydrolase sequences is in the positioning of putative phosphorylated residues (Fig. 5). The longest human form of the enzyme has seven possible phosphorylation sites (Togari et al., 1992). Each of these sites has a counterpart in the Drosophila sequence, whether or not the amino acid sequence at that position is conserved. The three most C-terminal sites (beginning at residues 195, 231, and 296), as well as a fourth site at position 149, are identical to those observed in the human protein. Although all three aligned sequences also have a potential phosphorylation site at position 168, the SLED sequence in Drosophila is replaced with TISD in mammals. Finally, there are two phosphorylation sites in the mammalian sequences (at positions 99 and 136 in Fig. 5) that are not conserved in Drosophila. Interestingly, in each case, residues that could be phosphorylated are found almost immediately upstream in the Drosophila protein, at positions 95 and 130. Additional potential phosphorylation sites are located in the unique N-terminal domains of the Drosophila isoforms. Although we have preliminary evidence that the Drosophila protein is phosphorylated, the phosphorylated residues have not been identified and the function of phosphorylation of GTP CH is not known. Nevertheless, the close correspondence between potential phosphorylation sites in the human and Drosophila proteins suggests an important role for this modification in GTP CH function.
Conclusions-Our previous genetic and biochemical studies of the Pu locus and Drosophila GTP cyclohydrolase raised an interesting paradox. On the one hand, the enzymatic characteristics of this protein are highly conserved. For instance, the catalytic properties and reaction parameters of rat liver GTP cyclohydrolase are remarkably similar to those of the Drosophila GTP cyclohydrolase that is used for eye pigmentation (Weisberg and O'Donnell, 1986; Hatakeyama et al., 1989). Regardless of the physiological roles of the pteridine end products, GTP cyclohydrolase seems to carry out its catalytic functions in the same way in every organism studied to date. It seems clear that the functions for which the pteridines are targeted are closely regulated by pteridine availability and GTP cyclohydrolase activity (Mackay and O'Donnell, 1983; Schoedon et al., 1987; Werner et al., 1989; Werner et al., 1990). However, these functions occur in vastly different cell types and cellular environments. Mutational analysis of the Pu locus suggested that, in spite of the commonality in enzyme mechanisms observed for GTP cyclohydrolase, the gene itself is structurally and functionally complex (Mackay and O'Donnell, 1983; Mackay et al., 1985; Reynolds and O'Donnell, 1988). Furthermore, the nature of the genetic data suggested either that the locus expressed multiple proteins or a single protein with multiple functional domains. The observation that the two transcripts described here encode proteins with distinct N termini is consistent with this prediction, as is recent evidence that proteins containing these domains are expressed in vivo, and that the expression patterns for the two proteins are distinct (O'Donnell et al., 1993). One of these, the product of the 1.7-kb mRNA, is responsible for eye pigmentation (Weisberg and O'Donnell, 1986; McLean et al., 1990).
Furthermore, we find that all alternate forms analyzed thus far predict proteins with additional N-terminal variation. We suggest that these domains are regulating associations and/or subcellular localization of the proteins in Drosophila, a hypothesis that we are testing with available genetic tools and molecular reagents. The recent discovery by Harada et al. (1993) of BH4-dependent complex formation between GTP cyclohydrolase and an inhibitor protein in rat liver is consistent with our hypothesis. The discrete nature of the N-terminal domains provides us with probes of individual GTP cyclohydrolase functions. Because most of these functions have counterparts in mammals, it will be most interesting to learn whether homologous 5' coding sequences are present in mammals.