Prostatic steroid-binding protein. Isolation and characterization of C3 genes.

Prostatic steroid-binding protein, whose expression is stimulated by androgens, consists of two subunits, one containing the polypeptides C1 and C3 and the other containing the polypeptides C2 and C3. We have isolated and sequenced cDNA clones specific for C3 mRNA and used them to isolate and characterize genomic clones for two C3 genes. Both genes are 3.2 kilobases with identical exon/intron arrangements, which is similar to the organization of the C1 and C2 genes, suggesting that they may have arisen by duplications of an ancestral gene. Finally, homologous human genes have not been detected.

as modified by Land et al. (14) was used as previously described (16). Double-stranded cDNA molecules were inserted into the PstI site of pAT153 (17) and colonies were first screened by in situ hybridization (18). Recombinant plasmids were isolated (19) and bound to diazobenzyloxymethyl paper (20) to identify them by mRNA purification. First, they were hybridized with total prostate poly(A)-containing RNA (6 µg) and washed; the bound RNA was then eluted and translated in a cell-free system derived from wheat germ (12).
DNA Sequencing-DNA sequence analysis of pA34 was carried out using the method of Maxam and Gilbert (21) by sequencing in both directions from the BstEII, XbaI, and BglII restriction enzyme sites (Fig. 2).
Genomic DNA Cloning and Screening-Sprague-Dawley rat DNA, obtained from the ventral prostate of a single animal known to contain four C3-related EcoRI fragments, was digested to completion with EcoRI and ligated into the purified arms of bacteriophage λgt WES (22). A partial EcoRI and partial HaeIII rat DNA library constructed with liver DNA from a single Sprague-Dawley rat was also used (23). Screening was by the method of Benton and Davis (24). Initially, 32P-labeled total cDNA was used as the DNA probe because we wished to isolate all three prostatic binding protein genes, namely C1, C2, and C3. After the clones were purified, they were distinguished using specific [32P]cDNA plasmids labeled by nick translation (25).
Restriction Enzyme Mapping-Rat liver and prostate DNAs were isolated using the method of Blin and Stafford (26) and, in cases where digestion with restriction enzymes proved difficult, were further purified through CsCl gradients. Rat DNA (20 µg) and recombinant phage DNA were digested with restriction enzymes and separated by electrophoresis on agarose gels. Transfer to nitrocellulose was as described by Southern (27). Hybridization was carried out with nick-translated 32P-labeled DNA probes and, in the case of cell DNA blots, dextran sulphate (28) was included in the hybridization buffers.
Analysis of R-loops in the Electron Microscope-DNA samples (λ11B and λ61) were hybridized at 25 µg/ml with prostate mRNA at 50 µg/ml in 70% formamide, 0.4 M NaCl, and 0.1 M 1,4-piperazinediethanesulfonic acid buffer, pH 7.2. After incubating for 1 h at 47.5 °C, the sample was diluted and spread on H2O as described by Wahli et al. (29). The nucleic acids were visualized for electron microscopy by shadowing with platinum.
Analysis of Nuclear RNA on Agarose Gels-Rat prostate nuclei were prepared by the citric acid method described by Busch (30) and nuclear RNA was isolated as described by Roop et al. (31). The RNA samples were made 5 mM in methylmercury hydroxide and separated by electrophoresis on agarose gels containing 5 mM methylmercury hydroxide. After staining with ethidium bromide to visualize the 18 S and 28 S rRNA bands, which were used as markers, the RNA was transferred to diazobenzyloxymethyl paper as previously described (12, 32). Hybridization with [32P]pA34 and autoradiography were then carried out (12).

RESULTS
Identification and Characterization of C3 cDNA Clones-Total prostate cDNA was cloned in the PstI site of the plasmid pAT153, and tetracycline-resistant colonies, which contained prostate DNA sequences as shown by in situ hybridization, were selected for further study. Individual plasmids were identified by mRNA purification, and restriction enzyme analysis suggested that four contained C3 cDNA of approximately 600 nucleotides, which is similar to the size estimated for C3 mRNA (12). One clone, pA34, was selected for DNA sequencing (Fig. 1). From the DNA sequence, it was possible to predict a polypeptide sequence and this agreed completely with the sequence of the secreted C3 polypeptide (32). In addition, it is likely that the protein is translated with a so-called signal peptide of 18 amino acids with an AUG start codon at nucleotide position 55. Thus, C3 mRNA comprises a coding region of 285 nucleotides with a UAA termination
Isolation of C3 Genomic Clones-Initially, we isolated genomic clones from two amplified Sprague-Dawley rat DNA libraries which were constructed from a partial EcoRI digest and a partial HaeIII digest of liver DNA, cloned into the purified arms of bacteriophage λ Charon 4A (23). By screening 600,000 plaques of each DNA library we identified thirteen EcoRI clones and three HaeIII clones which hybridized with [32P]pA34. Preliminary restriction enzyme analysis of the clones indicated that they covered similar regions of the rat genome and together comprised approximately 22 kb¹ of rat DNA as shown in λ11B and λ5D (Fig. 2). However, restriction enzyme analysis of DNA from individual animals suggested that most rats possess other C3-related sequences (Fig. 3). One possible explanation for failing to isolate clones containing such sequences was that they had been lost during amplification of the original libraries, and we therefore constructed our own nonamplified EcoRI library in bacteriophage λgt WES.
Thus far, we have identified one additional 12.5-kb clone (λ61) which is distinct from those previously isolated.
¹ The abbreviation used is: kb, kilobases.
Characterization of C3 Genes-The organization of the C3 genes in representative clones λ11B, λ5D, and λ61 and various derivative plasmid subclones has been investigated by restriction enzyme digestion and blot hybridization and by analysing R-loops in the electron microscope.
The cleavage sites of EcoRI, BamHI, BglII, BstEII, HindIII, XbaI, PstI, and MspI were mapped and the DNA fragments that contain coding sequence were identified by hybridizing blots with 32P-labeled pA34, which indicated that the C3 genes consisted of three exons separated by two intervening sequences (Fig. 2). The orientation of the gene was obtained by using 5'- and 3'-specific DNA probes. The cDNA clone pA34 was digested with PstI to yield 140-nucleotide and 450-nucleotide DNA fragments which represent the 5' and 3' ends of the mRNA, respectively (Fig. 2). The exon/intron arrangement was confirmed with λ61 by electron microscopy. Measurements of 28 R-loops (Fig. 4) indicated that the gene was 3.2 ± 0.2 kb and consisted of 3 exons of approximately 0.13, 0.21, and 0.21 kb in a 5' to 3' orientation separated by intervening sequences of 1.72 ± 0.10 kb and 0.77 ± 0.08 kb.
In contrast, analysis of R-loops formed with λ11B only partially confirmed the Southern blotting data and suggested that two exons of 0.15 and 0.20 kb were separated by an intervening sequence of 2.9 kb. However, a portion of mRNA appeared to form a "bridge" between the two exons, which presumably is due to mismatching with the middle exon in 11B and results in failure to form an R-loop. It is noteworthy that although we detect this middle exon on blots, a comparison of an MspI blot of the two genomic clones shows that the 500-base pair fragment which contains this exon hybridizes more strongly to [32P]pA34 in λ61 than in λ11B. It is unlikely that the 11B gene is expressed in vivo in amounts comparable with the 61 gene because the presence of 11B mRNA in the ventral prostate would have led us to observe at least some complete R-loops showing annealing of mRNA to the middle exon.
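The R-loop figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it sums the reported exon and intervening-sequence lengths for the 61 and 11B clones and compares them with the stated gene length of 3.2 ± 0.2 kb. All names (`summed_length`, `within_error`, and the constants) are hypothetical, and the comparison of the 11B "intron" with the 61 span is only an illustrative reading of the reported values.

```python
# Lengths in kb, taken from the R-loop measurements reported in the text.
EXONS_61 = [0.13, 0.21, 0.21]      # 5' to 3'
INTRONS_61 = [1.72, 0.77]          # 5' and 3' intervening sequences
GENE_KB, GENE_ERR_KB = 3.2, 0.2    # reported overall gene length

EXONS_11B = [0.15, 0.20]           # only two exons formed R-loops
INTRONS_11B = [2.9]                # single apparent intervening sequence


def summed_length(exons, introns):
    """Total length implied by the individual exon and intron measurements."""
    return round(sum(exons) + sum(introns), 2)


def within_error(value, centre, err):
    """True if value lies inside centre +/- err (small float tolerance)."""
    return abs(value - centre) <= err + 1e-9


total_61 = summed_length(EXONS_61, INTRONS_61)
total_11b = summed_length(EXONS_11B, INTRONS_11B)
mrna_61 = round(sum(EXONS_61), 2)

# In clone 61, the region between the outer exons is
# intron 1 + middle exon + intron 2; in 11B this whole region
# would appear as one unhybridized stretch if the middle exon fails to anneal.
span_between_outer_exons_61 = round(sum(INTRONS_61) + EXONS_61[1], 2)

if __name__ == "__main__":
    print("61 total (kb):", total_61)
    print("11B total (kb):", total_11b)
    print("61 exon sum (kb):", mrna_61)
    print("61 span between outer exons (kb):", span_between_outer_exons_61)
    print("61 within reported error:", within_error(total_61, GENE_KB, GENE_ERR_KB))
```

Under these figures the parts of the 61 gene sum to 3.04 kb, inside the reported 3.2 ± 0.2 kb, and the exons sum to 0.55 kb, close to the approximately 600-nucleotide cDNA. The single 2.9-kb intervening sequence seen in 11B is of the same order as the 2.7-kb span between the outer exons of 61.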
These results can be rationalized, when analysis of the C3 gene in individual animals is taken into account, by postulating that there are two C3 genes per haploid genome, each of which can show allelic differences. As indicated in Fig. 3, individual rats contain various combinations of EcoRI fragments of 10.5, 11, 12.5, and 13 kb, the latter two being unresolved but forming a doublet which is more intense and broader than the former individual bands (see tracks 5 and 9). In view of the R-loop analysis, we conclude that the gene represented in λ61 but not λ11B is probably expressed in normal prostate to produce C3 and the animals in tracks 1 and 4 are homozygous for these two genes. Other animals are heterozygous and show polymorphism at the EcoRI sites flanking the gene to produce fragments of 10.5 and 11 kb. HindII digests of cell DNA also support this conclusion (data not shown). Finally, during the course of blotting experiments, we noted that λ11B contains an additional gene which is expressed in ventral prostate but which appears to share no homology with the C3 gene (data not shown).
Analysis of Nuclear RNA-We have analyzed nuclear RNA isolated from rats of different hormonal states on Northern blots to investigate the effect of androgens on C3 RNA levels and to identify potential primary RNA transcripts. We were unable to detect C3 RNA in rats castrated for 6 days but testosterone restored RNA species of approximately 650, 1,500, and 4,000 nucleotides even within 1 h (Fig. 5). The 650-nucleotide species represents mature mRNA (12) and we assume that the 4,000-nucleotide species represents the primary transcript and that the 1,500-nucleotide species represents a spliced mRNA precursor. This agrees reasonably well with the proposed organization of the C3 gene and suggests that the large 5' intervening sequence is removed prior to the 3' intervening sequence.
Identification of Human Prostatic Steroid-binding Protein Genes-An analogous steroid-binding protein has been reported in human prostatic tissue (33) and therefore it was of interest to investigate whether there were homologous C3 genes in human DNA. However, analysis of DNA blots with [32P]pA34 both at high and low stringency (1 × SSC at 65 °C) failed to reveal specific hybridizing fragments in human DNA (Fig. 3). Since we know that cross-hybridization of the C1 and C2 genes occurs at 1 × SSC at 65 °C and they share 75% homology (16), we assume that equivalent human genes, if they exist, must share less than 75% homology with the rat prostatic steroid-binding protein genes.
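The inference about splicing order drawn from the nuclear RNA sizes can be illustrated numerically. The sketch below is an assumption-laden illustration, not the authors' calculation: it takes the R-loop exon and intervening-sequence lengths reported for the 61 gene, predicts the size of a partially spliced intermediate for each possible order of intron removal, and picks the order whose prediction lies closer to the observed 1,500-nucleotide species. It ignores poly(A) tails and measurement error, and all names are hypothetical.

```python
# Sizes in nucleotides, converted from the R-loop measurements (kb) in the text.
EXONS_NT = [130, 210, 210]
INTRON_5P_NT = 1720          # large 5' intervening sequence
INTRON_3P_NT = 770           # smaller 3' intervening sequence
OBSERVED_INTERMEDIATE_NT = 1500


def intermediate_sizes():
    """Predicted size of an intermediate that retains exactly one intron."""
    exon_total = sum(EXONS_NT)
    return {
        "5' intron removed first": exon_total + INTRON_3P_NT,
        "3' intron removed first": exon_total + INTRON_5P_NT,
    }


def closest_order(observed_nt):
    """Splicing order whose predicted intermediate is nearest the observed size."""
    sizes = intermediate_sizes()
    return min(sizes, key=lambda order: abs(sizes[order] - observed_nt))


if __name__ == "__main__":
    for order, size in intermediate_sizes().items():
        print(f"{order}: {size} nt")
    print("Closest to observed:", closest_order(OBSERVED_INTERMEDIATE_NT))
```

With these inputs, an intermediate retaining only the 3' intervening sequence would be about 1,320 nucleotides and one retaining only the 5' intervening sequence about 2,270 nucleotides, so the observed 1,500-nucleotide species is nearer the former, in line with removal of the 5' intervening sequence first.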

DISCUSSION
We conclude from R-loop analysis and Southern blotting data that there are two C3 genes per haploid genome, both of which we have isolated from rat DNA libraries. However, in noninbred strains of rat, such as Sprague-Dawley, it is often difficult to decide whether a particular gene is represented by multiple copies or whether it exhibits polymorphism without doing breeding experiments.
It is conceivable that both C3 genes are expressed in vivo to produce the two C3 polypeptide chains in prostatic steroid-binding protein. However, the R-loop analysis leads us to think that the 11B gene is not transcribed to a level comparable with that of the 61 gene; thus, the latter is probably responsible for the production of both C3 polypeptide chains. It should be noted, though, that only the C3 polypeptide in the S subunit of prostatic binding protein has been sequenced (32) and its similarity with the chain in the F subunit is based solely on electrophoretic mobility (11).
The similarity in exon/intron arrangements in λ61 and λ11B and the DNA sequence homologies, at least in the 5' and 3' exons, suggest that the two genes have arisen from the duplication of an ancestral gene. Interestingly, the C1 and C2 genes also have similar exon/intron arrangements to one another and share considerable DNA sequence homologies, which suggests that they also arose by gene duplication (16, 34). More remarkable is the fact that these two genes and the C3 genes are all 3.2 kb and contain three exons separated by introns of 1.7-1.8 kb at the 5' end and 0.8-0.9 kb at the 3' end. Obviously, it is conceivable that these two pairs of genes are themselves derived by duplication of an ancestral gene but, if this is the case, there has been considerable divergence between the C3 genes on the one hand and the C1 and C2 genes on the other hand. Comparison of the DNA sequence of pA34 with the cDNA sequences of C1 and C2 (16) indicates several homologies. For example, in the first exon, nucleotides 58-79 in C1 share 75% homology with nucleotides 61-82 in C3 and, in the second exon, nucleotides 210-235 in C2 share 76% homology with 229-251 in C3, but these homologies are not as extensive as those found between C1 and C2. It is also possible that all four genes are linked but we have no evidence for this inasmuch as there is no overlap between genomic clones.
Finally, it is surprising that we failed to isolate genomic clones which contained C3 homologous EcoRI bands in cell DNA of 10.5 and 11 kb since they should have been efficiently packaged in λgt WES. We believe these fragments contain alleles of the genes isolated and are constructing and screening other nonamplified libraries to investigate this. Also, we are sequencing λ61 and λ11B to compare the two C3 genes and confirm whether or not λ61 can encode C3.