Isolation of a Genomic Clone for Drosophila sn-Glycerol-3-phosphate Dehydrogenase Using Synthetic Oligonucleotides*

A genomic clone containing Drosophila sn-glycerol-3-phosphate dehydrogenase sequences has been isolated using a mixture of synthetic tridecanucleotides as a hybridization probe. The clone as well as the synthetic probe mixture was found to hybridize to an abundant poly(A)+ RNA of 1700 bases. A partial DNA sequence obtained for a 40-amino acid region containing the oligonucleotide hybridization site was found to agree with the known Drosophila protein sequence data for this region of the protein. In situ hybridization of this clone to the polytene chromosomes of wild type flies and of flies bearing chromosomal aberrations that delimit the Gpdh+ locus has allowed us to place the gene decisively in the distal region of 26A on the left arm of the second chromosome.
Glycerol-3-phosphate dehydrogenase (sn-glycerol-3-phosphate:NAD+ 2-oxidoreductase, EC 1.1.1.8) of Drosophila melanogaster consists of a family of three distinct isozymes, designated GPDH-1, -2, and -3,¹ which are uniquely distributed with respect to developmental and tissue-specific expression (1). Extensive biochemical and genetic analyses have demonstrated that the primary structures of GPDH-1 and -3 are nearly identical and that each isozyme is encoded by the same structural gene (1-4). The structural gene, Gpdh+, maps to a single site on the left arm of the second chromosome (5) and has been tentatively localized to the cytogenetic region 25F5 on the basis of dosage responses to segmental aneuploidy (1,6,7) and fine structure mapping (8). In addition, a number of variants have been isolated that affect the quantitative, temporal, and tissue-specific expression of the structural gene (9-12). This system is therefore under two levels of control. The expression of total enzymatic activity as a function of development is controlled by cis-acting genetic elements tightly linked to the structural gene, while the switch in isozyme expression during development is epigenetic, acting at the level of either post-transcriptional or post-translational processing.
¹ The abbreviations used are: GPDH-1, -2, and -3, isozymic forms of Drosophila sn-glycerol-3-phosphate dehydrogenase; CRM, glycerol-3-phosphate dehydrogenase specific cross-reacting material; kb, kilobase pair; SDS, sodium dodecyl sulfate.
In this article, we report the isolation of a genomic clone containing GPDH coding sequences using a set of mixed oligonucleotides deduced from an evolutionarily conserved region of the Drosophila GPDH amino acid sequence. This clone now provides us with the opportunity to examine the underlying molecular mechanisms involved in the genetic programming of the GPDH locus.

RESULTS
Design of Synthetic Oligonucleotide Mixture and Screening of Genomic Library-The general approach used for the isolation of a clone containing Drosophila GPDH took advantage of the partial amino acid sequence for the Drosophila protein and its homology to the rabbit muscle protein (16,17). Tryptic peptide T-11 of the Drosophila protein contains a short sequence of amino acids that represents a region of low codon redundancy in the corresponding mRNA (Fig. 1). This region was chosen for the design of a set of mixed tridecanucleotide probes complementary to all possible coding sequences for this portion of the mRNA. Before screening, the fidelity of the oligonucleotide mixture as a hybridization probe was tested, under conditions that would eliminate mismatching, by Northern blot analysis of poly(A)+ RNA derived from two wild type strains of Drosophila (Fig. 2). In each case, the oligonucleotide mixture hybridized to a unique RNA species of approximately 1700 nucleotides in length. Therefore, one sequence in the oligonucleotide mixture was the exact complement to a distinct RNA.
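The probe design described above can be sketched in code. The following is a minimal illustration rather than the procedure actually used: it enumerates every 13-nucleotide coding sequence for a stretch of peptide under the standard genetic code, reports the probe mixture as reverse complements, and scans a peptide for the window of lowest redundancy. The peptide shown is hypothetical, since the sequence of tryptic peptide T-11 (Fig. 1) is not reproduced in this text, and all function names are illustrative.

```python
from itertools import product

# Standard genetic code in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def coding_sequences(peptide, length=13):
    """All distinct `length`-nt sequences that could encode the start of `peptide`."""
    n_codons = -(-length // 3)  # ceiling division: 13 nt spans 5 codons
    if len(peptide) < n_codons:
        raise ValueError("peptide too short for the requested probe length")
    choices = [CODONS[aa] for aa in peptide[:n_codons]]
    return {"".join(codons)[:length] for codons in product(*choices)}


def probe_mixture(peptide, length=13):
    """Probes complementary to every possible coding sequence."""
    return sorted(reverse_complement(s) for s in coding_sequences(peptide, length))


def best_window(peptide, length=13):
    """Return (redundancy, start index) of the least redundant window."""
    n_codons = -(-length // 3)
    return min(
        (len(coding_sequences(peptide[i:i + n_codons], length)), i)
        for i in range(len(peptide) - n_codons + 1)
    )


# Hypothetical peptide, used only to exercise the functions.
EXAMPLE_PEPTIDE = "LSRMWDKMLL"
redundancy, start = best_window(EXAMPLE_PEPTIDE)
print(redundancy, start, probe_mixture(EXAMPLE_PEPTIDE[start:start + 5]))
```

Windows rich in Met and Trp (one codon each) or in two-codon residues give the smallest mixtures, which is why a region of low codon redundancy is the natural choice for a short mixed probe.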
The tridecanucleotide mixture was subsequently used as a hybridization probe to isolate 17 clones from a screen of an estimated 7 × 10^4 phage of a Drosophila genomic library in Charon 4 (18), representing approximately 7 haploid genome equivalents of fly DNA. We estimate that an oligonucleotide sequence of 15 bases would be the minimum length required to represent a unique DNA sequence in the Drosophila genome. A sequence of only 13 bases, as represented by our mixed synthetic probe, would be expected to identify more than one genomic clone. Therefore, we used in situ hybridization to polytene chromosomes as a method to rescreen the clones isolated from the genomic library.
In Situ Hybridization of Genomic Clones to Polytene Chromosomes-Seven of the original 17 positive clones were randomly chosen for rescreening by in situ hybridization to Drosophila polytene chromosome preparations. The GPDH locus has been tentatively localized to cytogenetic region 25F5 (8). On this basis, any putative GPDH-positive clone would be expected to hybridize close to or within this cytogenetic region. One of the seven clones, designated λDm60a(c), met this criterion by hybridizing to region 26A on the left arm of the second chromosome (Fig. 3A). In addition, this clone did not cross-hybridize with any of the other 16 clones as evidenced by blot hybridization analysis (data not shown).
Clone λDm60a(c) was further characterized by in situ hybridization to determine its location relative to chromosomal aberration breakpoints which delimit the GPDH locus. Fig. 3B illustrates hybridization to only the wild type homologue of a Df(2L)PM101/+ heterozygote with a proximal breakpoint at 26A2-5. This result is consistent with genetic data showing that Df(2L)PM101 deletes the structural gene for GPDH, giving rise to a null activity phenotype.³ Similarly, when hybridization is to T(Y;2)D222/+ males with the autosomal breakpoint at 26A2-5 on 2L, the label is clearly located on the YP2D element (data not shown). This result is consistent with the observation of position effect variegation of GPDH expression in lines containing this translocation, and with the interpretation that the GPDH locus is translocated to the Y chromosome (8). Finally, hybridization to Df(2L)50078a/+ heterozygotes with a proximal breakpoint at 25F4-26A1 illustrates that the label extends across both homologues (data not shown), indicating that λDm60a(c) sequences are proximal to this deficiency. In summary, these results are consistent with a cytological location of λDm60a(c) sequences to 26A, and more specifically distal to the autosomal breakpoint of T(Y;2)D222 and the proximal breakpoint of Df(2L)PM101.
Hybrid Selection of GPDH mRNA-The 11.5-kb insert from λDm60a(c) was isolated by restriction digestion with EcoRI. The entire fragment was subcloned into the plasmid vector pUC9 (i.e. pDm60a(c)) and tested for its ability to hybrid select a poly(A)+ RNA that would translate into a GPDH polypeptide. Fig. 4 illustrates that mRNA selected by the recombinant plasmid directs the de novo synthesis of an immunoprecipitable 32-kDa protein when injected into oocytes, and that this protein co-migrates with an immunoprecipitable protein isolated from flies labeled in vivo, which has previously been shown to be the authentic GPDH polypeptide (15). In addition, mRNA selected by pDm60a(c) directed the synthesis of catalytically active GPDH in the oocyte translation system. Furthermore, the active enzyme displayed an electrophoretic mobility in native polyacrylamide gels which was identical to that of the electrophoretic allele maintained in the stock of flies from which the RNA was isolated, i.e. line RI-09-GpdhS; line WGM74-GpdhF. These results clearly demonstrate that the 11.5-kb insert contains sequences complementary to translatable GPDH mRNA.
Identification and DNA Sequence of the Site Complementary to the Oligonucleotide Probe-Digestion of the 11.5-kb EcoRI insert of pDm60a(c) with PstI generates five DNA fragments (Fig. 5A). Southern blot analysis of this gel using the oligonucleotide mixture as a hybridization probe identified a 1.2-kb PstI fragment as containing sequences complementary to the oligonucleotide. Since the oligonucleotide mixture was deduced from known amino acid sequence data, the 1.2-kb PstI genomic fragment should contain coding sequences of the gene.
A 120-base pair partial DNA sequence which extends through the oligonucleotide hybridization site of the 1.2-kb PstI fragment is illustrated in Fig. 5B. Translation of this DNA sequence yields a contiguous 40-amino acid sequence that is homologous to residues 203-243 of the rabbit muscle protein. Within this region, our data differ from the rabbit protein sequence by 12 amino acid residues. In addition, our data are in exact agreement with the known Drosophila protein sequence for this region of the protein (17).⁴
RNA Blot Analysis-Poly(A)+ RNA prepared from three different strains of flies was examined for the presence of transcripts homologous to pDm60a(c) (Fig. 6). An abundant RNA of approximately 1700 nucleotides, of which approximately 860 would be required to code for a polypeptide of 32 kDa, is identified on this blot. These data corroborate our observation in Fig. 2 using the oligonucleotide mixture as a hybridization probe.
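The coding-capacity figure quoted above follows from simple arithmetic, sketched here under a stated assumption: an average amino acid residue mass of roughly 110 Da, a common rule of thumb. The value the authors used is not given in the text, and the function name is illustrative.

```python
def coding_nucleotides(mass_da, avg_residue_da=110.0):
    """Approximate nucleotides needed to encode a polypeptide of given mass.

    avg_residue_da is an assumed average residue mass, not a value
    taken from the paper.
    """
    residues = mass_da / avg_residue_da
    return 3 * residues  # three nucleotides per codon


TRANSCRIPT_NT = 1700        # approximate transcript length from the blot
POLYPEPTIDE_DA = 32_000     # 32-kDa GPDH polypeptide

coding = coding_nucleotides(POLYPEPTIDE_DA)
noncoding = TRANSCRIPT_NT - coding
print(round(coding), round(noncoding))
```

With residue masses between about 110 and 112 Da the estimate falls between roughly 857 and 873 nucleotides, in line with the approximately 860 quoted above, so about half of the 1700-nucleotide transcript would be non-coding sequence, including the poly(A) tail.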
There is considerable variation in the intensity of the hybridization signal to RNA isolated from each of the strains in Fig. 6. Lane A represents a wild type line which expresses high GPDH activity, RI-09. Lane B represents a variant that has been shown to express one-third to one-fourth the GPDH activity of line RI-09, i.e. BI-114 (12). Lane C represents a CRM-negative mutant at the GPDH locus, i.e. JH-231 (14). Trace amounts of RNA from line JH-231 can be observed only on overexposed autoradiograms. As a control, this filter was washed free of pDm60a(c) probe and rehybridized with a clone containing the entire Drosophila ADH gene, SAC1 (data not shown). This experiment demonstrated that the differences in signal intensity observed using pDm60a(c) as probe are due to differing amounts of GPDH-specific hybridizable mRNA and not to different amounts of total poly(A)+ RNA.

DISCUSSION
Using a mixture of short synthetic oligonucleotides, we have isolated a clone from a Drosophila genomic library that contains the gene encoding glycerol phosphate dehydrogenase. Four independent lines of evidence support this conclusion. First, in situ hybridization of the clone to Drosophila polytene chromosomes gives a cytological map position of 26A on 2L, which is consistent with genetic fine structure mapping data (8). Second, the cloned 11.5-kb insert contains sequences complementary to an RNA that translates into a 32-kDa polypeptide with properties identical to authentic GPDH isolated from flies. Third, the amino acid sequence deduced from a 120-nucleotide sequence that contains the hybridization site to the synthetic oligonucleotide mixture matches the known amino acid sequence for that region of the Drosophila protein. Finally, hybridization intensity of the nick-translated clone to an abundant 1700-nucleotide poly(A)+ RNA on Northern blots is consistent with steady state levels of CRM and translatable GPDH-mRNA in genetic variants of the gene.
The size of the Drosophila haploid genome, 1.65 × 10^8 base pairs, would necessitate a minimum sized oligonucleotide of 15 bases to represent a unique genomic sequence (29). However, we have demonstrated in this study that significantly shorter oligonucleotides are potentially useful probes in screening a Drosophila genomic library when coupled with in situ hybridization to polytene chromosomes as a secondary screening method. Rescreening, using biotinylated probes, can be accomplished in a matter of days. Therefore, if a partial amino acid sequence has been determined for a given protein and if the corresponding gene has been genetically mapped, the use of short oligonucleotide probes of mixed sequence provides a powerful approach to the isolation of a specific gene from the Drosophila genome. In fact, short probes may represent the method of choice when codon ambiguity is high throughout a known protein sequence. It is also worth noting that our tridecanucleotide mixture hybridizes to a single unique RNA species on Northern blots (Fig. 2). In this regard, λDm60a(c) was the only genomic clone selected with this probe that contained sequences complementary to RNA, while the remaining isolates either do not code for RNA or code for RNA of such low abundance that they are beyond our ability to detect using standard RNA blotting techniques.
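The 15-base threshold can be reproduced with a short calculation. This is a sketch under simplifying assumptions that are not stated in the text: bases occur at random with equal frequency, both DNA strands are counted as potential targets, and "unique" means that fewer than one chance occurrence is expected. The genome size is taken as 1.65 × 10^8 base pairs, and the function names are illustrative.

```python
GENOME_BP = 1.65e8  # Drosophila haploid genome size, in base pairs


def expected_chance_matches(length, genome_bp=GENOME_BP, strands=2):
    """Expected random occurrences of one specific oligonucleotide.

    Assumes equal base frequencies and independent positions.
    """
    return strands * genome_bp / 4 ** length


def min_unique_length(genome_bp=GENOME_BP, strands=2):
    """Shortest oligonucleotide expected to occur less than once by chance."""
    n = 1
    while expected_chance_matches(n, genome_bp, strands) >= 1:
        n += 1
    return n


print(min_unique_length())                      # threshold length
print(round(expected_chance_matches(13), 1))    # chance sites per 13-mer
```

Under these assumptions each individual 13-mer is expected to occur several times in the genome by chance, and a mixed probe multiplies that figure by the number of sequences in the mixture. This is consistent with the recovery of multiple positive clones and with the need for a secondary screen.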
Kotarski et al. (8) have previously mapped the GPDH locus to band 25F5 based on the observation that second chromosomes carrying either Df(2L)50078a or Df(2L)50075a, with proximal breakpoints at 25F4-26A1, also lack a wild type GPDH allele. However, two other deficiencies, Df(2L)2802 and a deficiency associated with the autosomal breakpoint of T(Y;2)H151, have the same cytological breakpoints as Df(2L)50078a and Df(2L)50075a and carry wild type GPDH alleles. It was therefore concluded that the GPDH locus must lie between different proximal, but cytologically indistinguishable, breakpoints of these two classes of deficiencies. It has recently been discovered that Df(2L)50078a and 50075a are on chromosomes containing a P-factor transposable element inserted in 26A just proximal to each deficiency breakpoint.³ If the GPDH null alleles associated with each of these deficiencies are due to P-factor inactivation, then the actual cytological location of the GPDH locus would be more proximal than 25F5. This conclusion is supported by in situ hybridization of λDm60a(c) to region 26A of Df(2L)50078a, indicating that the GPDH locus is outside of and proximal to this deficiency.
We have used the GPDH clone pDm60a(c) to examine steady state levels of hybridizable transcripts in two previously characterized genetic variants of the Gpdh+ locus. Line BI-114 represents a low activity allele that acts in cis to the structural element and alters the rate of GPDH polypeptide synthesis but not degradation throughout development (12). Such a tightly linked regulatory element would be expected to operate at the level of transcription. The significant reduction in GPDH-specific hybridizable RNA shown in Fig. 6 provides a direct correlation between the rate of GPDH polypeptide synthesis and the available pool of GPDH-mRNA in this line, providing evidence for control at the transcriptional or steady state mRNA level.
Line JH-231 represents a null activity allele at the Gpdh locus previously characterized as CRM-negative and with no detectable GPDH-translatable RNA (14, 15). These observations are extended in this study by demonstrating an almost complete lack of detectable RNA by blot hybridization. These combined results provide strong evidence that line JH-231 represents a true null allele at the level of transcription or transcript accumulation. This mutation arose spontaneously in flies isolated from wild populations (30), and it remains a distinct possibility that this mutation arose through insertional inactivation by a transposable element. However, line JH-231 has been shown not to contain a P-factor element in region 26A.³
It has been firmly established that the family of isozymes associated with this enzyme system are all encoded by the same structural gene (1). At the protein level, a C-terminal heterogeneity is observed in that GPDH-1, the adult isozyme, is extended by amino acid residues -Gln-Asn-Leu-COOH when compared to GPDH-3, the larval isozyme (4, 31). This C-terminal heterogeneity appears to represent the biochemical basis for each isozyme and provides a secondary level of control for the developmental and tissue-specific expression of the structural gene. However, the molecular mechanism(s) directing the generation of each isozyme remain to be elucidated. The isolation of a GPDH clone now provides us with the material to examine the molecular mechanisms controlling GPDH isozyme differentiation by DNA sequence analysis of the 3' coding and noncoding end of the gene and the corresponding mRNAs.