Cloning of a cDNA for a Novel Insulin-like Peptide of the Testicular Leydig Cells*

We have isolated complementary DNA clones coding for a novel member of the insulin-like hormone superfamily from a boar testis cDNA library. Northern blot analysis and in situ hybridization revealed that the gene is expressed exclusively in prenatal and postnatal Leydig cells. We have tentatively proposed the name Leydig insulin-like (Ley I-L) for the gene and its encoded protein. The Leydig insulin-like protein is synthesized as a 131-amino acid preproprotein, which contains a 24-amino acid signal peptide. Comparison of the deduced amino acid sequence of pro-Leydig insulin-like protein with members of the insulin-like hormone superfamily predicts that the biologically active protein, after proteolytic removal of the C-peptide, consists of a 32-residue-long B-chain and a 26-residue-long A-chain and has a molecular size of 6.25 kDa.

The insulin-related gene family, comprising insulin, relaxin, and insulin-like growth factors I and II (IGF I and II),¹ represents a group of structurally related polypeptides whose biological functions have diverged. They are regulators of both growth and development in many different tissues (1). Some of these factors have been found to play a crucial role in spermatogenesis. Insulin can stimulate the conversion of glucose to lactate in Sertoli cells (2), lactate being the main energy source for spermatogenesis (cf. Ref. 3), while IGF I and II act as mitogens in spermatogonial proliferation (4).
We report here a novel member of the insulin-related gene family which is expressed only in pre- and postnatal testicular Leydig cells. As deduced from the cDNA sequence, the preproprotein is composed of 131 amino acids and contains a hydrophobic signal peptide. Sequences corresponding to the B- and A-chains of insulin and relaxin are located at the N- and C-terminal regions of the proprotein, respectively, and are separated by a long C-peptide. Such a protein could play a very important role in testicular function.

EXPERIMENTAL PROCEDURES
Isolation of cDNA Clones for Porcine Ley I-L--63,000 unamplified recombinants of a randomly primed boar testis cDNA library (5) were plated at a density of 500 plaque-forming units per 132-mm diameter plate and screened by the +/- screening method (6). The 32P-labeled cDNAs to be used as probes were synthesized from poly(A)-rich RNA prepared from boar testis and liver using cloned Moloney murine leukemia virus reverse transcriptase (Bethesda Research Laboratories) and mixed phosphorylated hexanucleotides p(dN)6 from Pharmacia LKB Biotechnology Inc. as primers (7). Hybridization of the two labeled probes with the duplicate filters was carried out at 65 °C overnight in the following solution: 5 x SSC (1 x SSC: 0.15 M NaCl, 15 mM sodium citrate, pH 7.0), 5 x Denhardt's solution (0.2% (w/v) polyvinylpyrrolidone, 0.2% (w/v) bovine serum albumin, 0.2% (w/v) Ficoll), 0.1% SDS, and salmon sperm DNA at 200 µg/ml. The filters were washed twice at 65 °C to a final stringency of 0.2 x SSC, 0.1% SDS, dried, and exposed to Kodak X-Omat films at -80 °C for 1 week with intensifying screens. Several cDNA clones that gave strong hybridization signals in the autoradiographic films with the testis cDNA probe only were used to characterize the gene expression pattern by Northern blot analysis. EcoRI fragments from these cDNA clones were isolated, 32P-labeled with a multiprime labeling kit (Amersham), and used to hybridize Northern blots containing total RNA from testis, brain, and liver under the hybridization conditions described above. The 450-bp cDNA fragment of the clone BLey I-L1 was found to hybridize with a 0.9-kb testicular RNA, while no hybridization signal was obtained with the RNA of somatic tissues. The other cDNA clones were found either to contain repetitive sequences or to hybridize with transcripts from somatic tissues. The 450-bp cDNA fragment of the clone BLey I-L1 was used for further screening of a randomly primed boar testis cDNA library. Two cDNA clones, BLey I-L2 and BLey I-L3, were isolated and sequenced (Fig. 1).

* This work was supported by Grant En 84/19-2 from the Deutsche Forschungsgemeinschaft (to W. E.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) X58369.
¹ The abbreviations used are: IGF I, insulin-like growth factor I; IGF II, insulin-like growth factor II; Ley I-L, Leydig insulin-like; bp, base pair(s); kb, kilobase(s).
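The salt concentrations implied by the hybridization and wash conditions can be worked out with a few lines of Python. This is only a sketch: it assumes that the composition quoted in parentheses (0.15 M NaCl, 15 mM sodium citrate) describes 1 x SSC, which is the usual definition, and the function name is hypothetical.

```python
# Composition of 1 x SSC in millimolar.
# Assumption: the figures quoted in the text describe the 1 x solution.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold):
    """Return the millimolar composition of `fold` x SSC."""
    return {solute: conc * fold for solute, conc in SSC_1X_MM.items()}


hybridization = ssc_composition(5)   # 5 x SSC used for hybridization
final_wash = ssc_composition(0.2)    # 0.2 x SSC used for the final wash

for label, comp in (("5 x SSC", hybridization), ("0.2 x SSC", final_wash)):
    parts = ", ".join(f"{conc:g} mM {solute}" for solute, conc in comp.items())
    print(f"{label}: {parts}")
```

Under that assumption, the final wash is 25-fold lower in salt than the hybridization solution, which is what makes it the more stringent step.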

Amplification of the 3' End of the Ley I-L cDNA--Total RNA (200 ng) from boar testis was reverse-transcribed with rTth of the GeneAmp Kit (Perkin-Elmer Cetus Instruments) at 70 °C for 15 min using an oligo(dT) primer. The cDNA was used as template for amplification with the oligo(dT) primer and a gene-specific primer containing the sequence from 516 to 537 (Fig. 2). The amplified product was electrophoresed in a 1% low melting agarose gel, purified by GENECLEAN (Bio 101 Inc.), and directly sequenced (8) using a gene-specific primer.

DNA Sequencing--To determine the nucleotide sequences of the cDNA clones, several subclones were constructed in pUC18 (Pharmacia) (Fig. 1). Plasmid DNA was used for double-strand DNA sequencing using Sequenase T7 DNA polymerase under conditions recommended by the supplier (United States Biochemical Corp.). Either internal primers, designed from the cDNA sequence, or external universal primers from the pUC18 vector were employed. Sequences were determined completely on both strands several times. Sequence data were analyzed with the help of the DNASTAR computer program.

In Vitro Transcription and Translation--The 737-bp EcoRI fragment of the BLey I-L2 cDNA clone containing the open reading frame sequence was subcloned into pGEM-3Zf(+) (Promega Biotec). The recombinant plasmid was linearized by digestion with HindIII or with EcoRI, and the linear DNAs were transcribed in vitro with SP6 RNA polymerase and T7 RNA polymerase, respectively, in the presence of m7GpppG using the transcript kit (Boehringer Mannheim). The capped RNA was treated with 1.5 units of deoxyribonuclease I for 15 min at 37 °C, extracted with phenol/chloroform, and subsequently concentrated by ethanol precipitation. The integrity and quantity of the RNA products were evaluated by electrophoresis on a 1% agarose gel containing formaldehyde (9). The transcripts (200 ng) were translated in a rabbit reticulocyte lysate (Boehringer Mannheim) by the procedure recommended by the manufacturer in the presence of [35S]methionine (Amersham). Radiolabeled translation products were separated by electrophoresis through a 15% polyacrylamide gel containing 0.1% SDS (10). Gels were fixed in 20% trichloroacetic acid and 50% isopropyl alcohol, treated with Enlightning autoradiographic enhancer (Du Pont NEN), dried, and exposed to Kodak XAR film.

Northern Blot Analysis--Total RNA was extracted (11) from different porcine tissues (testis, ovary, brain, spleen, muscle, heart, lung, liver, kidney, hypophysis, hypothalamus), from testes of different species (human, baboon, bull, boar, goat, rabbit, hamster, mouse, rat), from testes of sexually mature and prepubertal bulls (1-8 months old), from round spermatids and pachytene spermatocytes isolated by centrifugal elutriation (12) from mature bull testis, and from Leydig cells isolated from 2-week-old immature porcine testes by collagenase treatment (13). The RNA was size-fractionated by electrophoresis on a 1% agarose gel containing formaldehyde (9), transferred to nitrocellulose filters (14), and hybridized with the BLey I-L2 cDNA probe under the same conditions as used for library screening. To ensure that all RNA samples contained hybridizable RNA, the human α-actin (15) or the human acrosin cDNA (16) probes were used for reprobing of the Northern blots.
In Situ Hybridization--Sections (4 µm) were cut and placed on 3-aminopropyltriethoxysilane-coated slides. To produce a porcine Ley I-L riboprobe, a 450-bp BamHI/EcoRI fragment of the clone BLey I-L1 (Fig. 1) was cloned into the BamHI and EcoRI sites of pSPT18 (Boehringer Mannheim). This plasmid DNA was digested with SalI and served as a template for T7 RNA polymerase to produce a 450-bp "sense" riboprobe. The EcoRI-linearized plasmid DNA was used as a template for SP6 RNA polymerase to produce a 450-bp "antisense" riboprobe. The digoxigenin-UTP-labeled Ley I-L RNA was prepared by in vitro transcription using the kit from Boehringer Mannheim. Antisense and sense RNA probes were degraded to an average length of 100 bases and then used to hybridize the testis sections. Prehybridization, hybridization, and washes were performed as described by Hemmati-Brivanlou et al. (17).

RESULTS
Isolation and Nucleotide Sequence of the Porcine Ley I-L cDNA--We have used a differential cDNA screening method to isolate clones representing genes expressed exclusively in testis but not in other tissues. A randomly primed boar testis cDNA library was screened using radiolabeled first strand cDNA probes prepared from poly(A)-rich RNA of testis and liver. One cDNA clone, BLey I-L1, was found to identify testis-specific transcripts. Using the BLey I-L1 cDNA, two further cDNA clones could be isolated from the cDNA library and were sequenced according to the sequencing strategy given in Fig. 1.
Since these cDNA clones were isolated from a randomly primed cDNA library, and therefore lacked 3' sequences, a rapid amplification of cDNA ends (RACE) by the polymerase chain reaction was performed (18). The polymerase chain reaction product was directly sequenced (8) (Fig. 1). The Ley I-L cDNA sequence contains a 5'-untranslated region of 5 bp, followed by an open reading frame of 393 bp, a termination codon, and a 3'-untranslated region of 355 bp (Fig. 2). The suggested translation start site ATG at positions 6 to 8 is flanked by sequences which are identical with the Kozak translation consensus sequence (CC(G/A)CC(ATG)G) (19). The consensus polyadenylation sequence, AATAAA, is located 8 bp 5' of the poly(A) tract. In order to verify that a 5'-untranslated region is included in the cDNA sequence, we performed primer extension analysis. This experiment revealed that mRNA transcription starts 14 nucleotides upstream of the translation initiation codon ATG (data not shown).
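The segment lengths quoted for the cDNA can be checked against one another with simple arithmetic. The following Python sketch uses only the numbers given in the text; the variable names are illustrative, and the 3-bp termination codon is counted separately from the 393-bp open reading frame, as the wording of the text suggests.

```python
# Segment lengths of the Ley I-L cDNA as stated in the text (in bp).
UTR5_BP = 5     # 5'-untranslated region
ORF_BP = 393    # open reading frame, termination codon not included
STOP_BP = 3     # termination codon
UTR3_BP = 355   # 3'-untranslated region, poly(A) tract not included

# The open reading frame must be a whole number of codons.
assert ORF_BP % 3 == 0
codons = ORF_BP // 3

# 1-based positions of the initiation codon, directly after the 5'-UTR.
atg_first = UTR5_BP + 1
atg_last = UTR5_BP + 3

# Total length of the cloned sequence without the poly(A) tract.
total_bp = UTR5_BP + ORF_BP + STOP_BP + UTR3_BP

print(f"codons in ORF: {codons}")
print(f"ATG at positions {atg_first} to {atg_last}")
print(f"cDNA length without poly(A): {total_bp} bp")
```

The codon count agrees with the 131-amino acid preproprotein, and the ATG position agrees with the stated positions 6 to 8. The summed length is shorter than the 0.9-kb transcript seen on Northern blots, which is expected since the sum leaves out the poly(A) tract and the additional 5' nucleotides indicated by primer extension.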
Deduced Amino Acid Sequence of Porcine Prepro-Ley I-L--The Ley I-L cDNA sequence predicts a 131-amino acid protein (Fig. 2) of 14.134 kDa. To confirm the validity of the open reading frame, in vitro transcription was carried out using Ley I-L2 cDNA cloned into a pGEM plasmid as a template, and both sense and antisense Ley I-L RNAs were translated in vitro. On an SDS-polyacrylamide gel, one translation product with a molecular mass of about 14 kDa was detected (Fig. 3), in close agreement with the molecular mass predicted from the open reading frame. A search of protein data banks (SWISS-PROT and NBRF-PIR) with the deduced amino acid sequence revealed homology only between the Ley I-L protein and members of the insulin-related hormone superfamily from several species. Considerable homology between Ley I-L and members of the insulin-related superfamily exists in the B- and A-chains, which contain the sequence features that define the insulin-related hormone superfamily (1). Analysis of the hydropathy profile (20) of Ley I-L indicates that the protein contains a hydrophobic domain at the N terminus similar to that of other signal peptide sequences. The signal peptide of Ley I-L comprises 24 amino acids and contains residues -12 (Leu) and -1 (Ala) (Fig. 2), which are conserved in all members of the insulin-related superfamily (21). Alignments of the predicted primary amino acid sequence of pro-Ley I-L with members of the insulin-like superfamily revealed the B- and A-chains to be located in Ley I-L at amino acids 1 to 32 and 82 to 107, respectively (Fig. 2). Thus, 17 and 15 amino acids (including the 6 cysteine residues) are identical with those in equivalent positions in porcine relaxin (34%) (22, 23) and insulin (30%) (24), respectively. The homology to IGF I (25, 26) and II (26) is decreased to 28% (Fig. 4).
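The domain boundaries given above fix the layout of the whole precursor, which a short Python sketch can make explicit. It uses only the coordinates stated in the text (numbering starts at the first residue after the signal peptide); the C-peptide span is not stated here but follows as the region between the two chains, and the names are illustrative.

```python
# Figures stated in the text.
PREPRO_LENGTH = 131   # residues in the preproprotein
SIGNAL_LENGTH = 24    # residues in the signal peptide
B_CHAIN = (1, 32)     # first and last residue, proprotein numbering
A_CHAIN = (82, 107)


def span_length(span):
    """Number of residues in an inclusive (first, last) span."""
    first, last = span
    return last - first + 1


# The proprotein is what remains after removal of the signal peptide.
pro_length = PREPRO_LENGTH - SIGNAL_LENGTH

# Derived here, not stated at this point in the text:
# the C-peptide fills the gap between the B-chain and the A-chain.
c_peptide = (B_CHAIN[1] + 1, A_CHAIN[0] - 1)

for name, span in (("B-chain", B_CHAIN), ("C-peptide", c_peptide), ("A-chain", A_CHAIN)):
    print(f"{name}: residues {span[0]}-{span[1]} ({span_length(span)} residues)")
print(f"proprotein: {pro_length} residues")
```

The A-chain ends exactly at the last residue of the proprotein, so the three segments account for the full 107 residues without any C-terminal extension.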
The Expression of the Ley I-L Gene--To analyze Ley I-L gene expression, blots prepared with total RNA of different porcine tissues were hybridized with the 32P-labeled Ley I-L cDNA. The results obtained with RNA of ovary, liver, muscle, brain, kidney, spleen, and testis are given in Fig. 5. The Ley I-L gene is expressed exclusively in testis as a 0.9-kb RNA. To determine the cellular localization of the Ley I-L transcript in the testis, hybridization with total RNA of spermatocytes, spermatids, and Leydig cells was performed and resulted in a signal only with RNA from Leydig cells (Fig. 5). Northern blot analysis performed with total RNA of bovine testes of 1- to 8-month-old animals demonstrated equal levels of Ley I-L gene expression during postnatal testicular development (data not shown). In situ hybridization on testis sections of a 3-month-old porcine embryo with a digoxigenin-UTP-labeled antisense RNA probe of Ley I-L RNA showed expression of the Ley I-L gene in Leydig cells, but not in other testicular cells (Fig. 6). These results suggest that the Ley I-L gene is expressed exclusively in prenatal and postnatal Leydig cells.
In order to assess whether the Ley I-L gene is conserved in mammalian species and to demonstrate whether this gene is expressed in testis of different mammalian species, the porcine Ley I-L cDNA was used as a probe in low stringency Northern hybridization experiments with testicular RNA of different mammalian species. Cross-hybridization was detected with a 0.9-kb transcript of testicular RNA from human, baboon, bull, sheep, and goat, but not with that from hamster, rabbit, rat, and mouse (Fig. 7). Also at the level of genomic DNA, no hybridization signal was obtained with the DNA of these latter species (data not shown).

DISCUSSION
In this report we describe a gene encoding a protein which is exclusively expressed in prenatal and postnatal Leydig cells. This protein is designated as Leydig insulin-like peptide (Ley I-L). A comparison of the deduced amino acid sequence of Ley I-L with that of members of the insulin-like hormone superfamily (insulin, relaxin, and IGF I and II) indicates that Ley I-L is a novel member of the insulin-like hormone superfamily, containing a signal peptide, B-chain, connecting C-peptide, and A-chain.
In the members of the insulin-like superfamily, the signal peptide facilitates the secretion of the prohormone, and the connecting C-peptide mediates correct folding of the protein and the formation of the three disulfide bridges in the active hormone. In proinsulin and prorelaxin, the B- and A-chains are located at the N- and C-terminal ends, respectively, and are separated by a long C-peptide which is removed during processing to form the active hormone. Pro-IGF I and pro-IGF II contain a small C-peptide and two additional domains (D and E) at the C terminus, of which the E-peptide is removed during processing while the C-peptide is maintained in the active protein (1). The primary structure of pro-Ley I-L shows more structural similarity to proinsulin and prorelaxin than to pro-IGF I and pro-IGF II. Pro-Ley I-L contains the conserved amino acids of the B- and A-chains at the N- and C-terminal regions, which are separated by a long C-peptide (Fig. 2). We therefore suggest that the mode of in vivo processing of pro-Ley I-L and the resulting structure of the Ley I-L peptide are similar to those of insulin.
It is known that the processing of proinsulin occurs at dibasic amino acids at the N- and C-terminal ends of the C-peptide. In pro-Ley I-L, cleavage at the C-peptide/A-chain junction could occur after Arg-81 within a group of 6 basic residues and after Gly-32 at a single basic residue at the B-chain/C-peptide junction (Fig. 2). Our assumption that this single basic residue is used for cleavage is supported by our observation that two basic amino acids in the human pro-Ley I-L are present at the same position (Arg-32 and Arg-33). Furthermore, proteolytic processing at single basic residues has been reported in the generation of other hormones including IGF I (27), IGF II (28), epidermal growth factor (29), and growth releasing factor (30). If the predicted proteolytic sites of pro-Ley I-L are correct, the active Ley I-L peptide is composed of a B-chain with 32 residues and an A-chain with 26 residues. The putative protein structure would be expected to have a molecular size of 6.25 kDa and an isoelectric point of 7.7. The C-peptide which is removed during the processing of pro-Ley I-L contains 49 residues and shows no sequence homologies to that of proinsulin (33 residues) (31) and prorelaxin (104 residues) (32).
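As a rough plausibility check on the predicted sizes, the residue counts can be converted to approximate masses. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this paper; it ignores the actual amino acid composition and the disulfide bonds, so only approximate agreement with the reported values should be expected.

```python
# Assumption (not from the paper): average mass of one amino acid residue.
AVG_RESIDUE_DA = 110.0


def approx_mass_kda(n_residues):
    """Approximate polypeptide mass in kDa from the residue count alone."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


# Residue counts stated in the text.
B_CHAIN_RESIDUES = 32
A_CHAIN_RESIDUES = 26
PREPRO_RESIDUES = 131

mature_residues = B_CHAIN_RESIDUES + A_CHAIN_RESIDUES
mature_kda = approx_mass_kda(mature_residues)
prepro_kda = approx_mass_kda(PREPRO_RESIDUES)

print(f"mature peptide: {mature_residues} residues, about {mature_kda:.2f} kDa")
print(f"preproprotein: {PREPRO_RESIDUES} residues, about {prepro_kda:.2f} kDa")
```

Both estimates fall within a few percent of the values reported in the text (6.25 kDa for the processed peptide and 14.134 kDa for the preproprotein), which is as close as a composition-independent estimate can be expected to come.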
Further sequence analyses suggest that not only the primary structure but also the tertiary structure of Ley I-L, insulin, and relaxin may be similar. The porcine Ley I-L has retained not only the 6 cysteine residues but also the glycines at B8 and B20 of porcine insulin and relaxin (Fig. 4), which provide unique torsion angles for chain folding. In addition, the residues B6, B11, B12, B14, B15, B18, A1, and A16, which contribute to the hydrophobic core of insulin (33) and relaxin (34), are all hydrophobic in Ley I-L with the exception of A1 (an isoleucine). Moreover, a histidine residue at B10 of insulin, which was demonstrated to be essential for the binding of zinc in the hexamer structure (35), is found in an equivalent position in Ley I-L (Fig. 4). Such strong common structural features between Ley I-L, insulin, and relaxin would predict that the two disulfide cross-links that connect the A- and B-peptides of Ley I-L occur between Cys-10 and Cys-92 and between Cys-22 and Cys-96, and that the intrachain disulfide bond within the A-chain occurs between Cys-91 and Cys-105 (Fig. 2).
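The cysteine positions above are given in proprotein numbering, whereas insulin and relaxin residues are conventionally named by chain (B8, A16, and so on). A small Python sketch can convert between the two, using the chain boundaries stated earlier (B-chain 1 to 32, A-chain 82 to 107). The helper name is hypothetical, and the first B-chain cysteine is read here as Cys-10, which is an assumption where the source print is unclear.

```python
# Chain boundaries in proprotein numbering, as stated in the text.
B_CHAIN = (1, 32)
A_CHAIN = (82, 107)


def chain_position(residue):
    """Convert a proprotein residue number to a chain label such as 'A11'."""
    if B_CHAIN[0] <= residue <= B_CHAIN[1]:
        return f"B{residue - B_CHAIN[0] + 1}"
    if A_CHAIN[0] <= residue <= A_CHAIN[1]:
        return f"A{residue - A_CHAIN[0] + 1}"
    raise ValueError(f"residue {residue} lies outside the B- and A-chains")


# Cysteines named in the text; Cys-10 is an assumed reading (see lead-in).
cysteines = [10, 22, 91, 92, 96, 105]
labels = {residue: chain_position(residue) for residue in cysteines}

for residue, label in labels.items():
    print(f"Cys-{residue} -> {label}")
```

In chain numbering the six cysteines then fall at two B-chain and four A-chain positions, which makes them easier to compare with the aligned insulin and relaxin sequences in Fig. 4.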
Further evidence that Ley I-L belongs to the insulin-like hormone superfamily comes from cloning of its gene.² Alignment of the Ley I-L gene with those of the insulin-like family indicates that Ley I-L, insulin (36), and relaxin (32, 37) share a similar gene organization. The coding sequences exist in two exons. The intron interrupts the coding sequence of the C-peptide at an equivalent position in the three genes, namely at the Sth, 7th, and 14th amino acid codon, respectively, and after the first nucleotide in the codon.

Northern blot experiments and in situ hybridization revealed that the Ley I-L gene is expressed only in prenatal and postnatal Leydig cells. The Leydig cells represent the endocrine tissue of the testis and provide hormones which play a role in spermatogenesis and in differentiation and maintenance of the male phenotype (38). The Ley I-L peptide could play an important role in these developmental processes. The cloning of the Ley I-L cDNA is the initial step toward a better understanding of the role of this peptide in testicular function. In addition, the deduced amino acid sequence allows us to produce Ley I-L-specific antiserum, which will be a tool for the purification of the native protein and for the study of its receptor localization and thereby its biological function.