Structure and Expression of Gene Coding for Sex-specific Storage Protein of Bombyx mori*

A major plasma protein termed "SP 1" accumulates in a sex- and stage-specific manner in the larval hemolymph of the silkworm, Bombyx mori. We have cloned the genomic sequence coding for SP 1 and analyzed its primary structure. The SP 1 mRNA sequence is encoded by five exons interspersed with four introns. The initiation site for SP 1 gene transcription was identified at the nucleotide level. A sequence homologous to the SV 40 enhancer "core" structure exists in two adjacent locations in the first intron. The 5' flanking region of the SP 1 gene contains a sequence highly homologous to the putative hormone-receptor complex binding site found in the ecdysteroid-sensitive genes of Drosophila melanogaster. The developmental change in the level of the SP 1 mRNA precursor in the fat body faithfully reflects that of SP 1 mRNA, indicating that the biosynthesis of SP 1 in B. mori is regulated in a sex- and stage-specific fashion at the level of transcription.


(Received for publication, October 29, 1987)
Hiroshi Sakurai, Tomoko Fujii, Susumu Izumi, and Shiro Tomino
From the Department of Biology, Tokyo Metropolitan University, Setagaya-ku, Tokyo 158, Japan

In contrast to those in vertebrate sera, the major protein components in insect hemolymph are not only limited in number but also undergo qualitative as well as quantitative changes during postembryonic development (1-3). In holometabolous insects, specific proteins termed "storage proteins" comprise the major protein components of the larval hemolymph (2-4). These proteins are synthesized in large quantities by the fat body of actively feeding larvae and released into the hemolymph. At the conclusion of the feeding period, however, they are selectively taken up by the fat body cells and stored there in the form of protein granules that are required for the development of adult tissues (3-5).
In the silkworm, Bombyx mori, storage proteins are known to occur in two forms, referred to as SP 1 and SP 2 (6). Both proteins have molecular weights of approximately 500,000, and their native molecules are each composed of six identical subunits with molecular weights around 80,000 (6). SP 1 is characterized by an exceptionally high content of methionine, while the amino acid composition of SP 2 is analogous to that of the dipterous storage proteins, being rich in phenylalanine and tyrosine (6-8). Recently, the structural gene coding for SP 1 has been mapped at a site proximal to the tub gene on the 23rd chromosome, whereas the SP 2 gene is located on the 3rd chromosome (9). The B. mori SP 1 provides a unique opportunity for studying the mechanisms that bring about the expression of secondary sexual characters in insects, since this protein exhibits stage-specific sexual dimorphism in the hemolymph. Hemolymphs of both sexes of B. mori contain nearly equal amounts of SP 1 until the end of the fourth larval instar. In the fifth (final) instar larvae, however, the amount of SP 1 greatly increases in females, while it markedly declines in males (6, 10). Our previous study employing sex-mosaic individuals of B. mori provided evidence that the sex-dependent synthesis of SP 1 is primarily determined by the sex chromosome composition in each fat body cell and is developmentally regulated without the participation of any sex-specific humoral factors functioning like the sex hormones in vertebrates (10).
The present experiments were undertaken to investigate further the mechanisms underlying the sex-dependent expression of SP 1 in the B. mori silkworm and to elucidate the molecular processes involved in the expression of secondary sexual characters in insects. For this purpose, we cloned the genomic sequence coding for SP 1 and analyzed its structure. Evidence is also presented that the expression of the SP 1 gene is regulated in a sex- and stage-specific fashion at the level of transcription in the fat body.

RESULTS AND DISCUSSION
Cloning of SP 1 Gene Sequence-To facilitate cloning of the SP 1 gene, an attempt was made to construct a B. mori mini gene library from DNA fragments enriched for the SP 1 gene sequence. The EcoRI digest of the fat body DNA was fractionated by preparative gel electrophoresis as detailed under "Materials and Methods" in the Miniprint, by which the SP 1 genomic DNA was enriched about 20-fold over total genomic DNA. The DNA fragments enriched for the SP 1 gene sequence were ligated with Charon 4A arm DNAs, and the resulting recombinant DNA was introduced into λ phage particles (18). The library was screened for the SP 1 mRNA sequence by plaque hybridization, and four plaques that gave a positive hybridization signal were isolated.
The genomic DNA inserts were subjected to restriction mapping analysis as shown in Fig. 1. One of these clones contained the 11-kb EcoRI fragment, and three others carried the 9-kb fragments which are identical with respect to the [...] Comparison of physical maps between the 9- and the 11-kb DNA inserts indicated that the major portions of the two DNA inserts are structurally homologous, except for the 3' flanking segments of the mRNA-coding region. Moreover, a partial nucleotide sequence of the 11-kb DNA completely matched that of the 9-kb DNA (data not shown). This could mean that the two DNA fragments represent duplicate genes coding for identical SP 1, as observed with the genes for dipterous storage proteins (4, 20, 21). Alternatively, they might be in allelic relation, since these clones were isolated from a library which had been constructed from DNA of heterozygous individuals (Tokai × Asahi). An analogous phenomenon has been observed with the sericin genes of B. mori (22).

The "Materials and Methods" are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.
Exon/Intron Composition of SP 1 Gene-Structure of the 9-kb DNA insert was analyzed in detail. The size and number of exons in the SP 1 gene were confirmed by the method of RNase mapping (17) as illustrated in Fig. 2. The SP 1 genomic DNA was digested with BglII, and each of the 4- and the 5-kb fragments was inserted into the multicloning sites of pSP64 and pSP65. The recombinant plasmids were transcribed by the SP6 RNA polymerase, and RNAs complementary to the SP 1 mRNA (cRNA) were synthesized. Hybridization of cRNAs with poly(A)+ RNA from the male fifth instar larvae (Fig. 2, lane 3) failed to produce any detectable signal on the autoradiogram, since, as described previously (11), the male fifth instar larvae are virtually free of the SP 1 mRNA. When poly(A)+ RNA from the fourth instar larvae was subjected to hybridization analysis (Fig. 2, lanes 1 and 2), the electrophoretic patterns of the hybridized fragments were essentially identical with those observed with poly(A)+ RNA from the female fifth instar larvae (Fig. 2, lane 4).
The transcript of the 4-kb DNA, which carries the upstream half of the 9-kb SP 1 gene, yielded three protected fragments of 123, 134, and 269 bases, respectively (Fig. 2A). The cRNA corresponding to the downstream region of the 9-kb DNA protected three bands of 161, 565, and 1077 nucleotides, respectively (Fig. 2B). Since it was known from the results of preliminary S1 mapping experiments that the BglII restriction site lies in the 834-bp exon, the 269- and the 565-base fragments seen on the radioautograms are most likely derived from the 834-bp exon. Recently, an additional SP 1 cDNA clone (pBmSP1C2) carrying the 161- and the 1077-bp exons at the 3' proximal region of the SP 1 gene [...] the 834-bp third exon and that the 161- and the 1077-bp fragments are the fourth and the fifth exons, respectively.

Accurate locations of exons in the SP 1 gene were assigned by comparing its nucleotide sequence with that of the SP 1 mRNA. The nucleotide sequence of the SP 1 mRNA was determined by the primer extension technique in the presence of dideoxy- and deoxynucleotide triphosphates to yield its cDNA sequence. The primer DNA lying in the third exon was hybridized with total RNA from the female fifth instar larvae, and cDNA was synthesized. The SP 1 mRNA sequence homologous to the genomic sequence occurs at about 630 bp upstream from the 5' splicing site of the third exon. Sequence analysis was likewise repeated using primer DNA derived from the second exon, and it was concluded that the first exon is followed by a 1074-bp intron. The sizes of the first and the second exons were determined to be 123 and 134 bp, respectively, from the results of RNase mapping and direct sequencing of mRNA.

FIG. 2 (legend, partial). [...] pSP65BgA DNA was digested with HindIII and transcribed in vitro as described above. After hybridization of cRNA with poly(A)+ RNA followed by RNase treatment, the protected fragments were separated on a 4% acrylamide denaturing gel. The fat body poly(A)+ RNAs used for hybridization were prepared from male fourth instar larvae (lane 1), female fourth instar larvae (lane 2), male fifth instar larvae (lane 3), and female fifth instar larvae (lane 4). The RsaI-digested pBR322 and the HaeIII-digested pBR322 were coelectrophoresed as size markers (the left-most lanes). Numbers to the right of the radioautograms represent sizes of the protected fragments. These values were determined by sequence analysis.
The mRNA-coding sequence of the SP 1 gene stretches over a 4.8-kb region of the 9-kb DNA fragment and is separated into five exons interspersed with four introns (Fig. 3A). From the estimated lengths of the exons, the total length of SP 1 mRNA is calculated to be about 2300 nucleotides. This value agrees well with that estimated by Northern hybridization of the fat body RNA (11) as well as that calculated from the molecular weight of the SP 1 subunit (6). The splice junctions of the SP 1 gene are summarized in Fig. 3B. Nucleotide sequences at the 5' and the 3' ends of the introns are homologous to those reported for other genes in B. mori (22, 24, 25) and to the consensus sequence for general splice junctions (26).
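The length estimate above can be checked with a short calculation. The sketch below sums the five exon sizes reported in the text and confirms that the two fragments flanking the BglII site add up to the third exon. The comparison with the subunit molecular weight rests on an assumed mean residue mass of about 110 Da, a common rule of thumb that is not taken from the paper; the function names are illustrative only.

```python
# Exon sizes in bp, 5' to 3', as reported in the text.
EXON_LENGTHS_BP = [123, 134, 834, 161, 1077]

# Assumption (not from the paper): mean amino acid residue mass of ~110 Da.
MEAN_RESIDUE_DA = 110.0


def mrna_length(exon_lengths):
    """Length of the spliced transcript, ignoring any poly(A) tail."""
    return sum(exon_lengths)


def coding_nt_for_mass(subunit_da, mean_residue_da=MEAN_RESIDUE_DA):
    """Approximate number of coding nucleotides for a polypeptide of this mass."""
    codons = round(subunit_da / mean_residue_da)
    return 3 * codons


total = mrna_length(EXON_LENGTHS_BP)
print(total)                             # 2329, i.e. about 2300 nucleotides
print(269 + 565 == EXON_LENGTHS_BP[2])   # True: the BglII fragments rebuild the 834-bp exon
print(coding_nt_for_mass(80_000))        # 2181 coding nucleotides for an 80,000-Da subunit
```

Under these assumptions, a subunit of about 80,000 would need roughly 2200 coding nucleotides, which leaves room for untranslated regions within a transcript of about 2300 nucleotides.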
Transcription Initiation Site of SP 1 mRNA-The transcription initiation site of SP 1 mRNA was deduced by direct sequencing of mRNA. The 73-base primer complementary to the sequence in the first exon was hybridized with the fat body RNA, and the primer was extended along the mRNA sequence. As depicted in Fig. 4B, the primer DNA was elongated by some 40 nucleotides and terminated in sequence ladders, indicating that the sequence around the cap site of the SP 1 mRNA is AGUGUG and that adenine is the putative transcription initiation residue. The result was further confirmed by S1 nuclease mapping. The DNA fragment labeled at the StyI site was cleaved at the AluI site within the 5' flanking region of the SP 1 gene, and the resultant 186-base probe DNA was hybridized with RNA. After treatment with varying concentrations of S1 nuclease, the protected fragments were electrophoresed in parallel with the mRNA sequence ladders. Among the cluster of fragments seen in the radioautogram, the most prominent was one with the terminal "A" residue (Fig. 4A). The presence of multiple protected fragments probably represents nondiscrete digestion by S1 nuclease at the protected end rather than multiple transcription initiation sites on the SP 1 mRNA. Taken together, these results identify the initiation site for SP 1 gene transcription as the adenine at position +1 shown in Fig. 5. This is in agreement with the report that the cap sites of most eukaryotic mRNAs are adenine (27). The sequence TTCAGTG (underline indicates initiation site) in the SP 1 mRNA is highly homologous to the consensus sequence ATCAGTY ("Y" represents pyrimidine) at the cap site of insect mRNAs (28).

FIG. 4 (legend, partial). [...] The protected fragments were electrophoresed on the 8% acrylamide, 7 M urea sequence gel, and radioactivity was detected by autoradiography. The arrow to the left indicates the transcription initiation site. B, a labeled 73-nucleotide RsaI/StyI (position +40 to +113) single-stranded DNA was hybridized with the same RNA as in A and extended by reverse transcriptase in the presence of dideoxy- and deoxynucleotide triphosphates as described under "Materials and Methods." The extended cDNAs were analyzed by gel electrophoresis as above. The cDNA sequence and the corresponding mRNA sequence are indicated to the right of the figure.
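The cap-site comparison (TTCAGTG against the insect consensus ATCAGTY) can be made explicit position by position. The following is a minimal sketch, assuming the two sequences exactly as printed in this text and standard IUPAC ambiguity codes; the helper name `consensus_matches` is hypothetical and is not part of any library.

```python
# Subset of IUPAC nucleotide codes sufficient for this comparison.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def consensus_matches(seq, consensus):
    """Return one boolean per position: does the base satisfy the consensus code?"""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [base in IUPAC[code] for base, code in zip(seq, consensus)]


sp1_cap = "TTCAGTG"          # SP 1 sequence around the cap site, as printed
insect_consensus = "ATCAGTY"  # insect cap-site consensus, as printed

matches = consensus_matches(sp1_cap, insect_consensus)
print(matches)                          # [False, True, True, True, True, True, False]
print(sum(matches), "of", len(matches)) # 5 of 7
```

With the sequences as transcribed here, the central TCAGT block matches the consensus at every position, and the two flanking positions differ.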
Nucleotide Sequence around Transcription Initiation Site-The TATA box is the best-established cis-acting element of genes for accurate transcription (29, 30). In the SP 1 gene, the sequence TATATATA (TATA box-like structure) is identified at position -31 (Fig. 5). Some eukaryotic class II genes carry a CCAAT sequence (CAT box), which may enhance transcription activity, at 80-100 nucleotides upstream from the cap site (31). The SP 1 gene, however, lacks such a sequence. The nucleotide sequence homologous to the SV 40 enhancer "core" sequence (32) is detected in two locations in the first intron of the SP 1 gene (Fig. 5).
We have suggested the possibility that cellular factor(s), together with the balance in the hemolymph concentrations of ecdysteroid and juvenile hormone, participate in the sexually dimorphic expression of SP 1 in B. mori (10). In this connection, it is of particular interest that a sequence TTTCCAT---ATGGTAG exists in the 5' flanking region of the SP 1 structural gene (Fig. 5). This sequence bears striking homology with the TTTCCAT---ATCGAAA sequence, which has been predicted to be the binding site of the ecdysteroid-receptor complex in some of the ecdysteroid-sensitive genes of Drosophila melanogaster (33). It is not certain at present, however, whether this structure functions as a regulatory element in the sex- and/or stage-dependent synthesis of SP 1.
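The similarity between the SP 1 flanking sequence and the predicted Drosophila binding site can be summarized per half-site. The sketch below assumes the two 7-base half-sites as transcribed in this text and ignores the spacer between them, whose length is not recoverable here; the function name is illustrative.

```python
def identical_positions(a, b):
    """Count positions at which two equal-length sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


# (upstream half-site, downstream half-site), spacer omitted.
sp1_site = ("TTTCCAT", "ATGGTAG")         # B. mori SP 1 5' flanking region
drosophila_site = ("TTTCCAT", "ATCGAAA")  # predicted ecdysteroid-receptor site

upstream = identical_positions(sp1_site[0], drosophila_site[0])
downstream = identical_positions(sp1_site[1], drosophila_site[1])

print(upstream, "of 7 in the upstream half-site")      # 7 of 7
print(downstream, "of 7 in the downstream half-site")  # 4 of 7
print(upstream + downstream, "of 14 overall")          # 11 of 14
```

On this count the upstream half-site is identical in both species, and the divergence is confined to the downstream half-site.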

In the 5' proximal region of the SP 1 mRNA there is a sequence GUUCUUCAAAC (position +16 to +26), which is partially complementary to the purine-rich region of 18 S rRNA (Ref. 34, Fig. 5). The first AUG codon for translation initiation (35) occurs 27 nucleotides from its 5' end, and the sequence around this region, AAACAUG, is homologous with the consensus sequence AAYCAUG ("Y" represents pyrimidine) for the site of translation initiation (31).

Primary Structure of SP 1 Deduced from Nucleotide Sequence-A partial amino acid sequence of SP 1 was predicted from the cloned DNA sequence (Fig. 5). The deduced primary structure at the amino-terminal domain indicates a high content (>10%) of methionine residues, which is consistent with the amino acid composition of SP 1 (6). Occurrence in [...]

[...] accumulates in hemolymph in a sex- and stage-specific fashion (6, 10). Our previous study demonstrated that the sexual dimorphism in SP 1 expression is regulated at the level of mRNA (11). To confirm the specific step at which the cellular level of SP 1 mRNA is regulated, the amount of precursor RNA for SP 1 mRNA in the fat body was measured by the method of intron-labeled S1 mapping.
As is evident from Fig. 6, the SP 1 precursor RNA was detectable in the fat body RNA of the male as well as the female fourth instar larvae. In the fifth instar larvae, the amount of the SP 1 precursor RNA markedly increases in females, while it is scarcely detectable in males. The precursor RNA reappears in the fat body of both sexes at the early pupal stage. The result of S1 nuclease mapping is consistent with that of RNA blot analysis (11), indicating that the male fifth instar larvae contain neither SP 1 mRNA nor its precursor. Therefore, it can be safely concluded that the sex- and developmental stage-specific expression of SP 1 is regulated at the level of transcription in the fat body.

MATERIALS AND METHODS

[...] nitrogen and DNA was extracted in the presence of proteinase K according to the procedure described by Ohshima and Suzuki (12). Since this DNA preparation was contaminated with a large amount of polysaccharides, the solution was clarified by centrifugation at 120,000 × g for 30 min. The fat body DNA (2 mg) was digested to completion with 2,000 units of EcoRI at 37°C for 3 hr. The DNA fragments were electrophoresed on a 0.6% agarose gel (1.5 × 10 cm surface area and 6 cm length) equilibrated with TAE buffer (16) and fractionated by use of an automatic preparative gel electrophoresis apparatus as described previously (11).
The SP 1 gene sequence in the eluate was located by dot blot hybridization of an aliquot from each fraction with the 32P-labeled SP 1 cDNA probe. Fractions containing the SP 1 genomic DNA sequence were pooled and DNA was recovered by ethanol precipitation. [...] The recombinant phages carrying this sequence were identified by plaque hybridization with the 32P-labeled pBmSP1C1 cDNA as described (16). Plaques which gave positive hybridization signals were picked up and the phages were re-screened. After three cycles of screening, well-isolated plaques were picked and stocked.
DNA sequencing
DNA sequence was determined by the dideoxynucleotide chain termination method of Sanger et al. (19).

RNase mapping
The 32P-labeled cRNA probe (10-20 ng, 1 × 10⁵ cpm) and 5 µg of poly(A)+ RNA were mixed and evaporated. The residue was dissolved in 30 µl of a hybridization solution containing 80% formamide, 0.4 M NaCl, 40 mM PIPES-NaOH, pH 6.4, and 1 mM EDTA, and the mixture was incubated at 85°C for 5 min to denature RNAs. The hybridization reaction was then carried out at 55°C for 12-24 hours. After incubation, unhybridized RNAs were digested with RNase A and T1, and the protected fragments were isolated according to the method of Melton et al. (17). The protected cRNAs were dissolved in a formamide-dye mixture ([...] formamide, 0.05% bromophenol blue and 0.05% xylene cyanol), separated on 4% or [...] autoradiography.

Primer extension
The 5' end of the StyI site (nucleotide position +113 from the cap site) within the first exon of the SP 1 gene was labeled with [γ-32P]ATP. The labeled DNA was digested with RsaI (position +40) and the 73-base single-stranded primer DNA was isolated. This primer DNA (1 × 10⁵ cpm; specific activity, 2 × 10⁷ cpm/µg) was mixed with 30 µg of total fat body RNA from the female fifth instar larvae, lyophilized, and dissolved in 30 µl of the hybridization solution. Nucleic acids were denatured and then hybridized at 37°C for 16 hr. The hybrid was precipitated with ethanol after 10-fold dilution with ice-cold TE and dissolved in 20 µl of 2 × RT buffer (1 × RT buffer: 100 mM Tris-Cl, pH 8.3, 10 mM MgCl2, 10 mM dithiothreitol, 50 mM KCl, [...]

S1 nuclease mapping
The DNA fragment labeled at the StyI site (position +113) was digested with AluI (position -74) and the 186-base single-stranded DNA (1 × 10⁵ cpm; specific activity, 5 × 10⁶ cpm/µg) was isolated. This DNA probe was hybridized with 30 µg of the fat body RNA from the female fifth instar larvae as described above. After hybridization, the reaction mixtures were each diluted with 300 µl of an ice-cold S1 nuclease buffer (30 mM sodium acetate, pH 4.6, 280 mM NaCl, 4.5 mM ZnSO4, and 20 µg/ml of heat-denatured salmon sperm DNA) containing 25, 100, and 400 units/ml of S1 nuclease, respectively. The mixtures were incubated at 37°C for 1 hr. The fragments protected from S1 nuclease digestion were purified and separated on an 8% acrylamide/7 M urea gel.
Detection of precursor RNA
[...] treated with 100 units/ml of S1 nuclease at 37°C for 1 hr as above, and the DNA-RNA hybrids protected from S1 nuclease digestion were separated on a 4% acrylamide/7 M urea gel and subjected to autoradiography.