The Primary Structure of the 32-kDa Subunit of Human Replication Protein A*

Replication protein A (RP-A) is a complex of three polypeptides of molecular mass 70, 32, and 14 kDa, which is absolutely required for simian virus 40 DNA replication in vitro. We have isolated a cDNA coding for the 32-kDa subunit of RP-A. An oligonucleotide probe was constructed based upon a tryptic peptide sequence derived from whole RP-A, and clones were isolated from a λgt11 library containing HeLa cDNA inserts. The amino acid sequence predicted from the cDNA contains the peptide sequence obtained from whole RP-A along with two sequences obtained from tryptic peptides derived from sodium dodecyl sulfate-polyacrylamide gel-purified 32-kDa subunit. The predicted molecular mass of the protein is 29,228 daltons, in good agreement with the molecular mass of the 32-kDa subunit determined by SDS-polyacrylamide gel electrophoresis.

No significant homology was found with any of the sequences in the GenBank data base. The protein predicted from the cDNA has an N-terminal region rich in glycine and serine along with two acidic and two basic segments. Monoclonal antibodies have been raised against the 70- and 32-kDa subunits of RP-A. The cloned cDNA has been overexpressed in bacteria using an inducible T7 expression system. The protein made in bacteria is recognized by a monoclonal antibody that is specific for the 32-kDa subunit of RP-A. This monoclonal antibody against the 32-kDa subunit inhibits DNA replication in vitro.
Efforts to understand the replication of chromosomal DNA in animal cells have long been frustrated by the large size and complexity of their genomes. In order to overcome this problem, many investigators have used viral genomes as model systems for studying the processes involved in DNA replication. Simian virus 40 (SV40) has been extensively studied as a model system because the replication of its genome shares many features with the replication of animal cell chromosomes (DePamphilis and Wassarman, 1982; Kelly et al., 1988). SV40 DNA replication requires only a single virus-encoded protein, large T antigen, the other replication functions being provided by the host cell.
In order to identify the cellular proteins involved in SV40 DNA replication, our laboratory has used a hypotonic extract of human tissue culture cells in which SV40 DNA is able to replicate in vitro Kelly, 1984, 1985). DNA replication in vitro shares many characteristics with SV40 DNA replication in vivo, including a requirement for the SV40 origin of replication, a requirement for large T antigen, and a requirement that extracts be made from cells that are permissive for viral replication in vivo Kelly, 1984, 1985; Stillman and Gluzman, 1985; Wobbe et al., 1985). Our laboratory has undertaken fractionation of crude extracts made from HeLa cells and reconstitution of replication activity. Through this approach, we have obtained evidence for the involvement of at least seven cellular factors in the complete replication of SV40 origin-containing DNA (Wold et al., 1989). Five of these have been highly purified: DNA polymerase α-primase complex Wold et al., 1989); topoisomerases I and II (Yang et al., 1987); proliferating cell nuclear antigen (Prelich et al., 1987; Wold et al., 1988, 1989); replication protein C (the catalytic subunit of protein phosphatase 2A) (Virshup and Kelly, 1989); and replication protein A (RP-A) (Wold and Kelly, 1988; Virshup et al., 1989), also known as replication factor A (Fairman and Stillman, 1988) or HeLa SSB (Ishimi et al., 1988).
RP-A is absolutely required for DNA replication in a reconstituted system (Wold and Kelly, 1988;Fairman and Stillman, 1988;Ishimi et al., 1988). It is believed that RP-A participates in a very early step in initiation because it is required for T antigen-dependent origin-dependent unwinding of the DNA template (Wold and Kelly, 1988). It is not known whether RP-A participates in the elongation step of replication as well as in initiation.
RP-A is purified from HeLa cells as a complex consisting of three subunits of 70, 32, and 14 kDa (Wold and Kelly, 1988; Fairman and Stillman, 1988; Ishimi et al., 1988). This complex is very tightly associated since all three subunits cosediment in glycerol gradients run in 6 M urea (Fairman and Stillman, 1988) or in 1.7 M urea with 0.5 M KCl (Wold and Kelly, 1988). RP-A is an SSB, demonstrating a 1000-fold greater affinity for single-stranded DNA than for double-stranded DNA as measured by a nitrocellulose filter binding assay in which the binding of radiolabeled denatured DNA was competed with unlabeled single- and double-stranded DNA (Wold et al., 1989). Other investigators have reported that the affinity of RP-A for single-stranded DNA is 30-fold greater than for double-stranded DNA, as determined by measuring the DNA concentration at which 50% of the DNA is bound to a nitrocellulose filter (Fairman and Stillman, 1988). When the subunits of RP-A were separated on an SDS-polyacrylamide gel, transferred to nitrocellulose, and probed with radiolabeled single-stranded DNA, only the 70-kDa subunit was detected (Wold et al., 1989), indicating that the single-stranded DNA-binding activity may reside exclusively in this subunit. Other SSBs, including Escherichia coli SSB and the adenovirus DNA-binding protein, are able to substitute for RP-A in the initial unwinding step in SV40 DNA replication, but they are unable to substitute for RP-A in the complete DNA replication reaction Dean et al., 1987; Virshup and Kelly, 1989). This suggests either that a specific replication complex is formed with RP-A which cannot form with other SSBs or that RP-A has some additional essential activity beyond single-stranded DNA binding. RP-A was assayed for various enzymatic activities including ATPase, GTPase, 3'→5' exonuclease, endonuclease, helicase, and topoisomerase I; none of these activities was found (Wold et al., 1989).
In order to understand better the role of RP-A in DNA replication as well as the nature of the strong interactions among the three subunits, we have undertaken the isolation and sequencing of the cDNAs coding for the three polypeptide subunits. We report here the cloning of a cDNA coding for the 32-kDa subunit.

MATERIALS AND METHODS
Purification of RP-A-RP-A was purified as described (Wold and Kelly, 1988) except that the 1.3 M KSCN wash from the Affi-Gel Blue column was concentrated and desalted by passing it over a small (0.2 ml) hydroxylapatite column and eluting bound material with buffer F containing 70 mM potassium phosphate (Wold and . Peak fractions had a protein concentration as high as 1 mg/ml.
Peptide Sequencing-Peptide sequence was obtained from both whole RP-A and from the separated subunits. Approximately 1 nmol (150 µg) of RP-A was reduced in 100 mM Tris (pH 8.0), 1% SDS, 20 mM dithiothreitol for 60 min at 60 °C. After cooling to room temperature, cysteine residues were carboxymethylated by the addition of 0.05 volume of 0.44 M iodoacetamide (freshly made) with incubation at room temperature in the dark for 30 min. Then 9 volumes of -20 °C ethanol were added along with 3.85 µg (2.5% w/w relative to RP-A) of L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated trypsin (Sigma; prepared from bovine pancreas). The protein was allowed to precipitate overnight at -20 °C and then was collected by centrifugation and resuspended in 0.3 ml of 100 mM NH4HCO3 (pH 7.5). An additional 3.85 µg of trypsin was added to make a final ratio of 1 µg of trypsin/20 µg of RP-A, and incubation was carried out for 4 h at 37 °C. Samples were frozen in liquid N2 and stored at -70 °C.
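The trypsin quantities quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original protocol; the variable names are illustrative, and the values are those stated in the text (150 µg of RP-A and two additions of 3.85 µg of trypsin).

```python
# Quantities taken from the digestion protocol described in the text.
rpa_ug = 150.0                  # mass of RP-A digested, in micrograms
trypsin_per_addition_ug = 3.85  # mass of trypsin in each addition
n_additions = 2                 # one with the ethanol, one after resuspension

# Weight percentage of trypsin relative to RP-A for a single addition.
single_addition_pct = 100.0 * trypsin_per_addition_ug / rpa_ug

# Substrate-to-enzyme weight ratio after both additions.
total_trypsin_ug = trypsin_per_addition_ug * n_additions
substrate_per_ug_trypsin = rpa_ug / total_trypsin_ug

print(f"single addition: {single_addition_pct:.2f}% w/w")
print(f"final ratio: 1 ug trypsin per {substrate_per_ug_trypsin:.1f} ug RP-A")
```

Both figures come out close to the rounded values given in the text (about 2.5% w/w per addition and about 1 µg of trypsin per 20 µg of RP-A overall).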
For sequencing the separated subunits, 150 µg was reduced, carboxymethylated, and ethanol precipitated as above, without trypsin, and then was subjected to SDS-polyacrylamide gel electrophoresis in a Mini Protean II (Hoefer Scientific) using a 10% running gel and a 5% stacking gel (Laemmli, 1970). The protein was then transferred to nitrocellulose, and portions of the filter containing each separated subunit were excised and individually blocked with polyvinylpyrrolidone (Aebersold et al., 1987). Trypsin freshly made up in Tris (100 mM, pH 8.2), 5% acetonitrile was then added, and digestion was allowed to proceed overnight at 37 °C.
Replication was allowed to proceed for 2 h at 37 °C, and samples were treated as described (Wold et al., 1989).

RESULTS
Cloning of a cDNA for the 32-kDa Subunit of RP-A-A 22-amino acid peptide sequence was obtained from a tryptic digest in solution of whole RP-A. That sequence was read as Ser/Gln-Ala-Val-Asp-Phe-Leu-Ser-Asn-Glu-Gly-Ala-Ile-Tyr-Ser-Thr-Val-Asp-Asp-Asp-His-Phe-Lys.
Using the table of preferred codon choice for human coding sequences (Lathe, 1985), a single nondegenerate 48-residue-long oligonucleotide of sequence 5'-GCTGTGGACTTCCTGTCCAATGAGGGCGCCATCTACTCCACAGTGGAC-3' was synthesized based upon amino acids 2-17. This probe was used to screen a λgt11 HeLa cell cDNA library at low stringency (a final wash at 42 °C in 0.2 × SSC). Approximately 200,000 plaques were screened, and 21 plaques that hybridized with the probe were plaque purified through three cycles of screening with the oligonucleotide.
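As a consistency check on the probe design, the 48-mer can be translated and compared with residues 2-17 of the sequenced peptide. The sketch below is illustrative only: the helper `translate` and the constant names are assumptions introduced here, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1 into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# The 48-mer probe from the text.
PROBE = "GCTGTGGACTTCCTGTCCAATGAGGGCGCCATCTACTCCACAGTGGAC"

# Residues 2-17 of the sequenced tryptic peptide, in one-letter code
# (Ala-Val-Asp-Phe-Leu-Ser-Asn-Glu-Gly-Ala-Ile-Tyr-Ser-Thr-Val-Asp).
PEPTIDE_2_17 = "AVDFLSNEGAIYSTVD"

probe_translation = translate(PROBE)
print(len(PROBE), probe_translation, probe_translation == PEPTIDE_2_17)
```

The probe is 48 nucleotides long and translates to exactly the 16 residues it was designed from, as expected for a nondegenerate "guessmer" built from a single preferred codon per residue.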
DNA was prepared from each of these bacteriophages and digested with EcoRI. The HeLa cDNA inserts released by this digestion were subcloned into the EcoRI site of pBS for further analysis. In order to determine whether any of these cDNAs encoded the amino acid sequence that had been obtained by peptide sequencing, the inserts were subjected to DNA sequencing using primers homologous to the polylinker region of the pBS vector. One of the cDNA inserts was found to be homologous with the oligonucleotide probe at 37 out of 48 positions.
Translation of the DNA sequence yielded the amino acid sequence Gln-Ala-Val-Asp-Phe-Leu-Ser-Asn-Glu-Gly-His-Ile-Tyr-Ser-Thr-Val-Asp-Asp-Asp-His-Phe-Lys. This is identical to the amino acid sequence obtained by direct peptide sequencing, except that residue 11 was erroneously determined to be alanine by peptide sequencing rather than histidine as predicted from the cDNA. These two residues have very similar retention times (12.02 min for alanine versus 12.24 min for histidine).
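The comparison between the directly sequenced peptide and the cDNA-derived sequence can be made explicit with a short position-by-position check. This is a sketch for illustration; the function name `mismatches` and the handling of the ambiguous first residue (treated as a match if either alternative agrees) are assumptions made here.

```python
# Three-letter residues as read by peptide sequencing (position 1 was
# ambiguous, Ser or Gln) and as predicted from the cDNA.
PEPTIDE = ("Ser/Gln Ala Val Asp Phe Leu Ser Asn Glu Gly Ala "
           "Ile Tyr Ser Thr Val Asp Asp Asp His Phe Lys").split()
PREDICTED = ("Gln Ala Val Asp Phe Leu Ser Asn Glu Gly His "
             "Ile Tyr Ser Thr Val Asp Asp Asp His Phe Lys").split()


def mismatches(observed, predicted):
    """Return (position, observed, predicted) for residues that disagree.

    An ambiguous observed call such as "Ser/Gln" counts as a match when
    any of its alternatives equals the predicted residue. Positions are
    1-based, matching the numbering used in the text.
    """
    result = []
    for pos, (obs, pred) in enumerate(zip(observed, predicted), start=1):
        if pred not in obs.split("/"):
            result.append((pos, obs, pred))
    return result


print(mismatches(PEPTIDE, PREDICTED))
```

The only disagreement reported is at residue 11 (Ala read by peptide sequencing, His predicted from the cDNA), in line with the text.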
The recombinant plasmid consisting of the cDNA insert described above cloned into the EcoRI site of pBS was designated pLE1. The complete sequence of both strands of the cDNA was determined by using a series of deletions constructed at unique restriction sites within the insert. The insert was found to contain 1512 base pairs between flanking EcoRI sites (Fig. 1). Within this sequence there is an open reading frame beginning with the first AUG at nucleotide 78 and ending with a TAA at nucleotide 888. No other long open reading frame is found within this cDNA (Fig. 1). The cDNA contains 77 nucleotides of 5'-untranslated sequence and 622 nucleotides of 3'-untranslated sequence. The five nucleotides immediately upstream of the first AUG codon, CCAAG, are a reasonable match for the eucaryotic translation initiation consensus sequence CC(A/G)CC (Kozak, 1984). In particular, the 32-kDa subunit cDNA has an A at position -3, which is the case for 79% of eucaryotic initiator codons. The derived amino acid sequence strongly suggests that the cloned cDNA codes for the 32-kDa subunit of RP-A. The single long open reading frame of the cDNA specifies a polypeptide 270 amino acids long, including the N-terminal methionine residue. The predicted molecular mass of this protein is 29,228 daltons, in good agreement with the molecular mass of 32,000 daltons determined from SDS-polyacrylamide gel electrophoresis.
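The coordinates quoted for the insert are mutually consistent, which can be verified with simple arithmetic. The sketch below assumes that "nucleotide 888" refers to the first base of the TAA stop codon; this reading is an assumption, but it is the one that reproduces the stated 622 nucleotides of 3'-untranslated sequence. All variable names are illustrative.

```python
INSERT_LENGTH = 1512     # base pairs between the flanking EcoRI sites
FIRST_AUG = 78           # position of the A of the first AUG
STOP_CODON_START = 888   # assumed: first base of the TAA stop codon
PREDICTED_MASS_DA = 29228

# Untranslated sequence upstream of the initiator codon.
five_prime_utr = FIRST_AUG - 1

# Coding nucleotides, excluding the stop codon itself.
coding_nt = STOP_CODON_START - FIRST_AUG
codons = coding_nt // 3

# Untranslated sequence downstream of the three-base stop codon.
three_prime_utr = INSERT_LENGTH - (STOP_CODON_START + 2)

# Average residue mass implied by the predicted molecular mass.
mean_residue_mass = PREDICTED_MASS_DA / codons

print(five_prime_utr, coding_nt, codons, three_prime_utr)
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

Under that assumption the numbers close exactly: 77 + 810 + 3 + 622 = 1512, and 810 coding nucleotides correspond to the 270 residues stated in the text.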
The open reading frame contains two amino acid sequences that are identical to those derived from sequencing tryptic peptides derived from SDS-polyacrylamide gel-purified 32-kDa subunit. The peptide sequences Ile-Met-Pro-Leu-Glu-Asp-Met-Asn-Glu-Phe and Ala-Pro-Thr-Asn-Ile-Val-Tyr-Lys were obtained from two different tryptic fragments derived from SDS-polyacrylamide gel-purified 32-kDa subunit. As is shown in Fig. 1, both of these sequences are present in the coding region of the cloned cDNA, and both are preceded by a lysine residue, as would be expected for tryptic peptides. For this reason, along with confirmatory evidence based on a monoclonal antibody that specifically recognizes the 32-kDa subunit (see below), we are confident that we have isolated a cDNA coding for the 32-kDa subunit of RP-A.
Two significant features of the predicted amino acid sequence were noted. The N terminus of the protein is unusually rich in glycine and serine. Of the first 30 amino acids, 9 (30%) are glycine, and 7 (23%) are serine. Outside of this region, the protein as a whole is not especially glycine- or serine-rich, being 7.4% glycine and 10.4% serine, values that approximate those determined from human protein-coding sequences (Doolittle, 1986). An additional feature of this protein is the presence of several regions of very high acidic or basic character. There are two clusters of basic residues: one between position 37 and position 45 (net charge of +5), and one between amino acids 127 and 145 (net charge of +5). There are two acidic domains: one between residues 95 and 123 (net charge of -7), and one between position 247 and 270 (net charge of -4). Overall, however, the 32-kDa subunit of RP-A contains approximately equal amounts of acidic and basic residues (28 acidic, 30 basic).
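The composition percentages and net charges quoted above follow from simple residue counts. The sketch below illustrates the calculation; it is not the authors' procedure. The text does not state how charge was tallied, so the convention used here (Asp and Glu as -1, Lys and Arg as +1, His not counted) is an assumption. Because the full 270-residue sequence is not reproduced in this text, the charge function is applied to the 22-residue tryptic peptide predicted from the cDNA rather than to the clusters themselves.

```python
ACIDIC = set("DE")   # Asp, Glu counted as -1
BASIC = set("KR")    # Lys, Arg counted as +1 (assumption: His is not counted)


def net_charge(seq: str) -> int:
    """Net charge of a one-letter sequence under the convention above."""
    basic = sum(1 for residue in seq if residue in BASIC)
    acidic = sum(1 for residue in seq if residue in ACIDIC)
    return basic - acidic


def percent(count: int, total: int) -> float:
    """Percentage of `count` out of `total`."""
    return 100.0 * count / total


# The tryptic peptide as predicted from the cDNA, in one-letter code.
PEPTIDE = "QAVDFLSNEGHIYSTVDDDHFK"

print("peptide net charge:", net_charge(PEPTIDE))
print("Gly in first 30 residues: %.0f%%" % percent(9, 30))
print("Ser in first 30 residues: %.0f%%" % percent(7, 30))
```

The same `net_charge` function could be applied to any segment of the predicted protein, such as residues 95-123, once the full sequence is available.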
The predicted amino acid sequence was also searched for zinc fingers (Berg, 1986) and leucine zippers (Landschultz et al., 1988), functional domains found in many DNA-binding proteins, as well as for the adenine nucleotide-binding consensus sequence (Walker et al., 1982). None of these was found. We also failed to note any internally repeated domain within this protein.
Fig. 1. Nucleotide sequence of the 32-kDa subunit cDNA and the deduced amino acid sequence. Nucleotides are numbered in the 5' to 3' direction, beginning and ending with the EcoRI sites used to insert the cDNA in λgt11. The deduced amino acid sequence is shown immediately above the corresponding nucleotide sequence. The two amino acid sequences in boldface were read from tryptic peptides derived from SDS-polyacrylamide gel-purified RP-A 32-kDa subunit. The amino acid sequence in italics was read from a tryptic peptide of whole RP-A.
Expression of the 32-kDa Subunit of RP-A in Bacteria-The cDNA coding for the 32-kDa subunit of RP-A was cloned into pET-8c, a bacterial expression vector in which exogenous genes can be expressed from the bacteriophage T7 φ10 promoter. The plasmid is carried in E. coli BL21(DE3), a protease-deficient strain in which the bacteriophage T7 RNA polymerase is carried on a λ prophage under the control of the lac operon. The orientation of the cloning sites in the vector is such that the T7 Tφ transcription termination sequence lies immediately downstream of the inserted gene. Logarithmically growing bacteria are induced with IPTG, an inducer of the lac operon, resulting in production of the T7 RNA polymerase and transcription of the inserted gene. The T7 expression vector used carries the AUG initiator codon from the T7 gene 10 protein, along with its efficient translation initiation signals, downstream from the phage φ10 promoter. This AUG codon lies within an NcoI recognition site within the vector pET-8c. This site was converted to a BamHI site into which was ligated the BamHI site of pLE1, which contains the RP-A 32-kDa subunit cDNA cloned into the polylinker EcoRI site, to produce the expression plasmid pLE2. Due to the way in which pLE2 was constructed, the open reading frame, beginning from the pET-8c AUG codon, contains 105 nucleotides between the pET-8c AUG and the first AUG codon of the cDNA open reading frame. As a result the translation product will consist of an N-terminal fusion of 35 amino acids onto the 270 amino acids of the RP-A 32-kDa subunit. Of these 35 additional amino acids, 10 are derived from the vector, the BamHI linker, and pBluescript polylinker sequences, while 25 are coded for by the 5'-untranslated region of the 32-kDa subunit cDNA.
Bacteria containing either pLE2 or the vector pET-8c were induced with IPTG, grown for 2 h at 37 °C, and lysed. The lysates were electrophoresed on SDS-polyacrylamide gels, and the separated proteins were transferred to a nitrocellulose membrane. The filter was probed with a monoclonal antibody, designated 71, which was prepared as described under "Materials and Methods." When human RP-A is electrophoresed under these conditions, the monoclonal antibody recognizes only the 32-kDa subunit (Fig. 2, lane 1). The principal new protein produced when bacteria containing pLE2 are induced is a 32-kDa protein that is recognized by monoclonal antibody 71 (Fig. 2, lane 2). We estimate that approximately 2 µg of recombinant protein is produced/ml of cells. This 32-kDa protein recognized by the anti-RP-A monoclonal antibody is produced only at very low levels in uninduced cells containing pLE2 (Fig. 2, lane 4) and is not seen in bacteria containing the vector pET-8c, whether or not they are induced (Fig. 2, lanes 3 and 5). In addition to the predominant 32-kDa product, a slightly smaller species that is also recognized by the monoclonal antibody is produced in considerably smaller amounts upon induction of bacteria containing pLE2. The fact that the cloned cDNA codes for the production of a 32-kDa protein that is recognized by a monoclonal antibody against the 32-kDa subunit of RP-A confirms the identity of the cDNA that we have cloned. Since the bacterial expression construct has an additional 35 amino acids of open reading frame which are not present in the human cDNA, it might have been expected that the bacterially expressed 32-kDa subunit would be somewhat larger than the protein isolated from human cells.
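How much larger the fusion product would be expected to run can be roughly estimated from the construct arithmetic. The sketch below is illustrative only: it checks that the 105 upstream nucleotides account for the 35 extra residues and then scales the predicted mass of the native protein by chain length, which assumes that the N-terminal extension has the same average residue mass as the rest of the protein. That assumption makes the mass figure a crude estimate, not a measured or reported value.

```python
UPSTREAM_NT = 105            # nucleotides from the pET-8c AUG to the cDNA's first AUG
NATIVE_RESIDUES = 270        # residues in the 32-kDa subunit open reading frame
VECTOR_LINKER_RESIDUES = 10  # from vector, BamHI linker, and polylinker
UTR_DERIVED_RESIDUES = 25    # from the cDNA 5'-untranslated region
NATIVE_MASS_DA = 29228       # predicted mass of the native protein

# The upstream sequence must be a whole number of codons to stay in frame.
extra_residues, remainder = divmod(UPSTREAM_NT, 3)
fusion_residues = extra_residues + NATIVE_RESIDUES

# Crude estimate: scale the native mass by chain length (see assumption above).
mean_residue_mass = NATIVE_MASS_DA / NATIVE_RESIDUES
estimated_fusion_mass = fusion_residues * mean_residue_mass

print(extra_residues, remainder, fusion_residues)
print(f"estimated fusion mass: {estimated_fusion_mass / 1000:.1f} kDa")
```

On this rough basis the full-length fusion protein would be expected to be several kilodaltons heavier than the native polypeptide predicted from the cDNA.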
It may be that the bacterially expressed protein has undergone some proteolysis in the bacteria or upon cell lysis. It is also possible that the mammalian protein may contain some post-translational modifications, not produced in bacteria, which cause it to run a few kilodaltons larger on SDS-polyacrylamide gels than would be predicted by the size of the cDNA open reading frame. It is apparent from Fig. 2 that several proteins present in uninduced bacteria are recognized to some extent by the anti-32-kDa subunit monoclonal antibody.
These species are not detected when the Western blot is probed only with rabbit anti-mouse IgG and protein A without the monoclonal antibody (data not shown). The strongest of these bands runs with a mobility just slightly less than that of the 32-kDa subunit of RP-A; this species may be a bacterial protein that shares some immunological characteristics with the mammalian protein.
A Monoclonal Antibody against the 32-kDa Subunit of RP-A Inhibits DNA Replication in Vitro-In order to examine further the role of the 32-kDa subunit of RP-A in DNA replication in vitro, we examined whether monoclonal antibody against the 32-kDa subunit of RP-A would inhibit DNA replication in a cell-free extract. A crude HeLa cytoplasmic extract was preincubated with either 2.8 or 5.6 µg of ammonium sulfate-concentrated supernatant from the hybridoma line producing monoclonal antibody 71, and the additional components required for in vitro replication were then added. Replication was allowed to proceed for 2 h at 37 °C. The products of replication were electrophoresed on an agarose gel and analyzed by autoradiography.
The monoclonal antibody against the 32-kDa subunit of RP-A strongly inhibited DNA replication. Preincubation with 5.6 µg of anti-32-kDa subunit monoclonal antibody produced a 74% inhibition compared with preincubation with antibody buffer alone, whereas preincubation with 2.8 µg of antibody resulted in a 72% inhibition of DNA replication (Fig. 3). This inhibition could be completely overcome by the addition of 200 ng of purified RP-A prior to the start of the replication reaction, indicating that the inhibition is due to a specific interaction between the antibody and RP-A and not some nonspecific inhibitory component in the antibody preparation.

DISCUSSION
We report the cloning of a cDNA that encodes the 32-kDa subunit of human RP-A, a protein absolutely required for SV40 DNA replication in vitro. The sole long open reading frame within the cDNA would code for a protein of molecular mass 29,228 daltons, a number in good agreement with the molecular mass of the 32-kDa subunit as measured on SDS-polyacrylamide gels. Within the open reading frame are found two peptide sequences identical to those obtained from direct peptide sequencing of the 32-kDa subunit purified by SDS-polyacrylamide gel electrophoresis. The cloned cDNA encodes the production in E. coli of a 32-kDa protein that is recognized by a monoclonal antibody specific for the 32-kDa subunit of RP-A. The predicted amino acid sequence exhibits no significant homologies with any of the proteins in the sequence banks.
One interesting feature of the derived amino acid sequence is the presence of two acidic regions: one of 29 amino acids with a net charge of -7, and one of 24 amino acids with a net charge of -4. Similar acidic stretches are found in several yeast transcriptional activator proteins, including GAL 4 (Laughon and Gesteland, 1984), GCN 4 (Hinnebusch, 1984), and PHO 4 (Legrain et al., 1986). For example, a 19-amino acid region of GCN 4, with a net charge of -5, appears to be sufficient to allow transcriptional activation (Hope and Struhl, 1986). Current models of transcriptional activation propose that the transcriptional activators interact with DNA through a DNA-binding domain and interact with other proteins through the acidic activator domain. It is possible that the acidic regions of the 32-kDa subunit of RP-A are involved in protein-protein interactions with the other subunits of RP-A and perhaps with other proteins involved in replication. Now that the cDNA for the 32-kDa subunit is in hand and has been expressed in bacteria, mutagenesis can be undertaken to define the sequences within this protein necessary for the intersubunit interactions as well as interactions with other replication proteins.
We have obtained the peptide sequence from the gel-purified 70- and 14-kDa subunits and are currently engaged in isolating cDNAs encoding these two proteins.
The exact role of the 32- and 14-kDa subunits of RP-A in DNA replication in vitro is still unknown. We have presented evidence that a monoclonal antibody directed against the 32-kDa subunit is able to inhibit replication, indicating that it does play an essential role in replication.
RP-A is required both for complete replication and for the presynthetic template-unwinding reaction carried out in the presence of origin-containing DNA and large T antigen. The RP-A complex has single-stranded DNA-binding activity that is intrinsic to the 70-kDa subunit (Wold et al., 1989). The single-stranded DNA-binding activity of the 70-kDa subunit seems to be sufficient for presynthetic template unwinding, since single-stranded DNA-binding proteins from heterologous sources will substitute for RP-A in this reaction. The single-stranded DNA-binding activity, however, appears not to be sufficient for the complete replication reaction, since SSBs from heterologous sources will not substitute for RP-A in replication.
It is possible that the 32- and 14-kDa subunits provide some enzymatic activities that are necessary for replication but not for unwinding.
Another possibility is that the proteins involved in SV40 DNA replication form a large multiprotein complex such as has been proposed to be involved in the replication of E. coli (Baker et al., 1986) and bacteriophage λ DNA (Dodson et al., 1986). There might then be very specific interactions between RP-A and other replication proteins which cannot be reproduced by heterologous SSBs. The 32- and 14-kDa subunits of RP-A might play an essential role in mediating these protein-protein interactions.
The ability to overexpress these proteins and carry out mutagenesis should allow delineation of the precise nature of the interactions in which these molecules participate.
We would also like to thank Vicki Saylor for production of monoclonal antibodies.