Arrangement of a highly repeated DNA sequence in the genome and chromatin of the African green monkey.

The DNA of the African green monkey contains three components that are distinguishable by the kinetics of reassociation. The rapidly reassociating component represents about 20% of the total DNA and is composed almost entirely of a sequence (AGMr(HindIII)-1) which is repeated 6.8 x 10^6 times. The majority of the AGMr(HindIII)-1 sequences are organized in long tandem repeats of a segment 172 base pairs in length. However, a fraction of the AGMr(HindIII)-1 sequences is interspersed with another 37% of the genome. The structure of the chromatin containing the AGMr(HindIII)-1 sequence is indistinguishable from that containing total DNA. Furthermore, there is nothing inherent in the nucleotide sequence of AGMr(HindIII)-1 which specifies a unique location for nucleosomes.

(Received for publication, November 8, 1978)

Dinah S. Singer+

From the Laboratory of Biochemistry, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20205
The genomes of the African green monkey and a derived cell line, BSC-1, both contain a class of highly repeated DNA sequence: digestion of total DNA with the restriction enzyme, Endo R.HindIII, yields a series of discrete DNA fragments (1). The larger fragments appear to be integral multiples of the smallest fragment of 172 base pairs, designated AGMr(HindIII)-1 (2). In partial digests of total DNA with Endo R.HindIII, fragments up to 29 times the monomer length have been observed (3), suggesting that these sequences can occur in long tandem arrays. Upon complete digestion of total DNA with Endo R.HindIII, only monomers, dimers, and a trace amount of trimers are observed; approximately 8% of the DNA is released into both monomers and dimers, in a ratio of 7:1, respectively (2). The AGMr(HindIII)-1 sequences contained within the monomer band have been isolated and the complete nucleotide sequence specifying the most abundant nucleotide residue at each position has been reported (2). Sequence analysis revealed that the many repeats of this complex unit are not all identical but represent a set of closely related segments. The sequence is not internally repetitive although it does contain clusters of pyrimidines and purines.
Earlier studies identified a fraction of the African green monkey DNA, called Component α, as a buoyant density shoulder of the main band DNA (4). Subsequent studies have demonstrated that Component α is composed almost entirely of AGMr(HindIII)-1 sequences, since digestion of Component α DNA with Endo R.HindIII results in a nearly quantitative conversion of the DNA into AGMr(HindIII)-1 and -2 (3).
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
+ Present address, Immunology Branch, National Cancer Institute, National Institutes of Health, Bethesda, Md. 20205.
In the present study, the organization of the class of AGMr(HindIII)-1 sequences within the total DNA and chromatin has been examined.
Studies of eukaryotic genome organization have used two major approaches: analysis of the genomic DNA sequence arrangement and analysis of the structure and arrangement of chromatin. The DNA of a wide variety of eukaryotes has been shown to consist of different classes of DNA sequence. The number and nature of these classes for a given species depend on the experimental technique used. The eukaryotic genome can be divided into several repetition frequency classes of DNA sequence, distinguished by reassociation kinetics, ranging from presumed single copy to highly repetitive sequences (8). Highly repeated, complex sequence classes of DNA can be interspersed in a regular fashion with lower repetition frequency classes of DNA (9). Alternatively, analysis of total DNA on buoyant density gradients often reveals one or more satellites of the main band DNA (4, 10). In many cases, these satellite DNA sequences consist of simple, short sequences organized in long tandem arrays (11, 12). Finally, restriction enzyme analysis of total genomic DNA can reveal regularities in the spacing of restriction enzyme sites (1, 13).

Table I.

Concentration of AGMr(HindIII)-1 DNA in the Total Genome-The concentration of AGMr(HindIII)-1 DNA in the total genome was determined by analysis of the kinetics of reassociation of 32P-labeled AGMr(HindIII)-1 tracer with an excess of randomly sheared, total BSC-1 DNA (Fig. 1).
AGMr(HindIII)-1 is calculated to be 19.3% of the total DNA from the relative rates of reassociation of the tracer with pure AGMr(HindIII)-1 and with total DNA (Table II). Reassociation of the 32P-labeled AGMr(HindIII)-1 tracer with total DNA occurs over the same range of Cot as does reassociation of the major rapidly reassociating component of the total DNA (Table I). Therefore, AGMr(HindIII)-1 DNA clearly derives from this component.
Furthermore, the rapidly reassociating component of the total DNA is composed almost entirely of AGMr(HindIII)-1 sequences, since 20.8% of the total DNA is in this component (Table I) and AGMr(HindIII)-1 DNA represents 19.3% of the total DNA. The kinetic parameters of the reassociation studies with the 32P-labeled AGMr(HindIII)-1 DNA tracer are summarized in Table II.

Genome Size-From these data, an estimate of the total genome size of BSC-1 cells can be made by two independent methods. By comparing the rate of reassociation of slowly reassociating DNA sequences (presumed to be single copy) of the BSC-1 genome with that of a known standard, in this case sea urchin DNA (K = 1.25 x 10^-3, 8.3 x 10^8 base pairs per haploid genome (42)), the size of the BSC-1 genome is estimated to be 6.1 x 10^9 base pairs per haploid genome (Table III). Using the reassociation of AGMr(HindIII)-1 as a standard, the size of the BSC-1 genome is calculated to be 3.3 x 10^9 base pairs per haploid genome (Table III). The accuracy of the reassociation kinetics studies is probably no greater than within a factor of two.
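The first genome size estimate rests on the usual assumption that, under matched reassociation conditions, the single-copy rate constant is inversely proportional to genome size. A minimal sketch of that proportionality follows. The sea urchin values are those quoted in the text; the BSC-1 single-copy rate constant is not given in this section, so the value used here is hypothetical and was chosen only so that the result lands near the reported 6.1 x 10^9 base pairs. The function name is illustrative.

```python
# Sea urchin standard, as quoted in the text.
K_STANDARD = 1.25e-3       # single-copy rate constant of the standard
GENOME_STANDARD_BP = 8.3e8  # base pairs per haploid genome


def genome_size_from_rate(k_single_copy: float) -> float:
    """Estimate haploid genome size from a single-copy rate constant.

    Assumes rate constant and genome size are inversely proportional,
    and that sample and standard were measured under the same conditions.
    """
    return GENOME_STANDARD_BP * K_STANDARD / k_single_copy


# HYPOTHETICAL rate constant for BSC-1 single-copy DNA (not from the source).
k_bsc1_assumed = 1.7e-4

estimate = genome_size_from_rate(k_bsc1_assumed)
print(f"Estimated genome size: {estimate:.2e} base pairs per haploid genome")
```

Because the estimate scales linearly with the ratio of the two rate constants, a twofold uncertainty in either rate carries over directly into the genome size.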

Reiteration Frequency of AGMr(HindIII)-1-It is also possible to calculate the reiteration frequency of the AGMr(HindIII)-1 sequence in the genome by independent methods. The various calculations are summarized in Table III. The rate of reassociation of a highly repeated DNA component relative to that of single copy DNA has been widely used as a measure of reiteration frequency. In this case, the relative rates indicate a reiteration frequency of 1.07 x 10^6 copies/haploid genome for sequences homologous to AGMr(HindIII)-1. This value is in good agreement with earlier estimates (44). However, it is also possible to calculate the reiteration frequency directly from the known fragment length of AGMr(HindIII)-1 (172 base pairs), the fraction of the genome which it represents (0.193), and the estimated genome size (6.1 x 10^9 base pairs/haploid genome). In this case, the reiteration frequency is estimated to be 6.8 x 10^6. This is a frequency seven times greater than that determined by kinetic parameters alone. Even if the lower genome size estimate is used in the calculation, a reiteration frequency of 3.7 x 10^6 is obtained. Possible sources of the discrepancies in the two methods of analysis will be discussed later.
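The direct calculation of reiteration frequency is simple arithmetic and can be checked as follows. This is a sketch using only the numbers quoted above (unit length, genome fraction, the two genome size estimates, and the kinetic estimate); the function and variable names are illustrative.

```python
UNIT_LENGTH_BP = 172      # length of AGMr(HindIII)-1
GENOME_FRACTION = 0.193   # fraction of total DNA homologous to AGMr(HindIII)-1
KINETIC_ESTIMATE = 1.07e6  # copies/haploid genome from relative rates


def copies_per_haploid_genome(genome_size_bp: float) -> float:
    """Copies = (fraction of genome x genome size) / unit length."""
    return GENOME_FRACTION * genome_size_bp / UNIT_LENGTH_BP


copies_high = copies_per_haploid_genome(6.1e9)  # larger genome size estimate
copies_low = copies_per_haploid_genome(3.3e9)   # smaller genome size estimate
fold_difference = copies_high / KINETIC_ESTIMATE

print(f"Using 6.1e9 bp: {copies_high:.2e} copies")
print(f"Using 3.3e9 bp: {copies_low:.2e} copies")
print(f"Direct / kinetic estimate: {fold_difference:.1f}-fold")
```

The ratio of the direct to the kinetic estimate comes out at roughly six- to sevenfold, which is the discrepancy the text refers to.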

Interspersion of AGMr(HindIII)-1 with Unrelated Sequences in the Genome-The organization of sequences homologous to AGMr(HindIII)-1 within the total genome was investigated by the interspersion analysis as first described by Davidson (9). Previous studies have demonstrated that at least a part of the family of AGMr(HindIII)-1 sequences is arranged in long tandem repeats (2, 3). The results shown in Fig. 2 [...]tion of the labeled tracer is retained on hydroxyapatite following hybridization at a Cot of 3 x 10^-[exponent illegible]. This indicates that sequences homologous to AGMr(HindIII)-1 are spaced at 2200-nucleotide intervals. The curve intersects the ordinate at a value of 15%, indicating that about 15% of the total genome is homologous to AGMr(HindIII)-1. This value is in reasonable agreement with the results obtained by reassociation kinetics. Above a tracer length of 2200 nucleotides, the proportion of the tracer which is hybridized remains constant at 52%; it is calculated that AGMr(HindIII)-1 sequences are interspersed with another 37% (52% - 15%) of the total BSC-1 genome. Furthermore, these data indicate that there is no other detectable interspersion of sequences homologous to AGMr(HindIII)-1 at intervals greater than 2200 nucleotides. The data do not exclude the possibility that each 2200-base pair interval contains several AGMr(HindIII)-1 sequences in tandem, or alternatively, less than a complete copy of AGMr(HindIII)-1.
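The interspersion figures can be checked numerically. The sketch below takes the plateau (52%) and ordinate intercept (15%) quoted above and counts the 2200-base pair intervals under the assumption of one complete 172-base pair copy per interval, which is the assumption adopted in the calculation that follows and which, as noted, the data do not establish. The genome size is the larger estimate (6.1 x 10^9 base pairs); names are illustrative.

```python
PLATEAU = 0.52        # fraction of long tracer retained on hydroxyapatite
INTERCEPT = 0.15      # ordinate intercept: fraction homologous to AGMr(HindIII)-1
INTERVAL_BP = 2200    # spacing of AGMr(HindIII)-1 homologous sequences
UNIT_BP = 172         # length of one AGMr(HindIII)-1 copy
GENOME_BP = 6.1e9     # haploid genome size, larger estimate


def interspersed_intervals(genome_bp: float) -> float:
    """Number of intervals, ASSUMING one complete copy per interval."""
    unrelated_fraction = PLATEAU - INTERCEPT      # about 0.37 of the genome
    unrelated_bp_per_interval = INTERVAL_BP - UNIT_BP  # 2028 base pairs
    return genome_bp * unrelated_fraction / unrelated_bp_per_interval


n_intervals = interspersed_intervals(GENOME_BP)
print(f"Interspersed unrelated fraction: {PLATEAU - INTERCEPT:.2f}")
print(f"Intervals of {INTERVAL_BP} bp: {n_intervals:.2e}")
```

If each interval instead held several tandem copies, or only part of a copy, the number of interspersed copies would differ from the interval count accordingly.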
However, if it is assumed that only a single complete copy of AGMr(HindIII)-1 occurs within every 2200-base pair interval, then the number of AGMr(HindIII)-1 sequences interspersed with unrelated sequences can be calculated as follows. Each 2200-base pair interval will contain 172 base pairs of AGMr(HindIII)-1 and 2028 base pairs of unrelated sequences. Altogether, these interspersed unrelated sequences comprise 37% of the total BSC-1 genome. Using the estimated genome size of 6.1 x 10^9 base pairs/haploid genome (Table III, Method I), there are (6.1 x 10^9) x (0.37)/2028 = 1.1 x 10^6 intervals of 2200 base pairs. Therefore, the number of AGMr(HindIII)-1 sequences interspersed with unrelated sequences is also 1.[...]

The organization of the family of AGMr(HindIII)-1 sequences in chromatin has been investigated and compared with the organization of total chromatin. Digestion of BSC-1 nuclei with staphylococcal nuclease releases deoxyribonucleoprotein complexes corresponding to nucleosomal core particles, monomers, and multimers. The repeat length for BSC-1 nucleosomes was determined by comparing the electrophoretic mobilities of DNA fragments purified from nucleosomal core particles, monomers, dimers, and trimers, respectively, with the electrophoretic mobilities of standards of known length in polyacrylamide gels (Fig. 3, left panel). The length of core particle DNA was estimated to be 145 base pairs, and the monomer, dimer, and trimer nucleosomal DNA lengths were 185, 370, and 560 base pairs, respectively. Analysis of the length of the DNA fragments using denaturing gels gave the same estimated sizes (data not shown). Therefore, the approximate nucleosomal repeat length for bulk chromatin in BSC-1 DNA is 185 base pairs, consistent with lengths reported by others (24, 45).
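The repeat length follows from dividing each multimer length by its nucleosome number. A small sketch using the fragment lengths reported above; the implied linker length (repeat minus core) is simple subtraction and is not a value stated in the text. Variable names are illustrative.

```python
CORE_BP = 145  # core particle DNA length reported in the text

# Nucleosome number -> measured DNA length in base pairs (from the text).
multimer_lengths_bp = {1: 185, 2: 370, 3: 560}

# Repeat length implied by each multimer class.
repeat_per_class = {n: length / n for n, length in multimer_lengths_bp.items()}
mean_repeat_bp = sum(repeat_per_class.values()) / len(repeat_per_class)

# Linker implied by the monomer length; derived here, not reported in the source.
implied_linker_bp = multimer_lengths_bp[1] - CORE_BP

for n, repeat in repeat_per_class.items():
    print(f"{n}-mer: {repeat:.1f} bp per nucleosome")
print(f"Mean repeat: {mean_repeat_bp:.1f} bp; implied linker: {implied_linker_bp} bp")
```

All three classes give values within about 2 base pairs of one another, which is why a single approximate repeat length of 185 base pairs is quoted.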
The presence of AGMr(HindIII)-1 sequences in nucleosomal core particle, monomer, and dimer DNA was demonstrated by hybridization of 32P-labeled AGMr(HindIII)-1 to these DNA fragments. Purified nucleosomal DNA fragments corresponding to core particle, monomer, and dimer were separated by electrophoresis in a polyacrylamide gel, transferred to a nitrocellulose filter, and hybridized with 32P-labeled AGMr(HindIII)-1 DNA. The results (Fig. 3, right panel) indicate that AGMr(HindIII)-1 sequences occur in all three nucleosomal classes. Furthermore, the size distribution of DNA fragments containing AGMr(HindIII)-1 sequences in each nucleosomal DNA class is indistinguishable from that of the total nucleosomal DNA. Therefore, the highly repeated DNA fraction of the BSC-1 genome is packaged into nucleosomes with the same average repeat length as the total DNA.

Extent of Packaging of AGMr(HindIII)-1 into Nucleosomes-It is not known whether all DNA sequence classes are packaged in chromatin such that they are equally susceptible to staphylococcal nuclease digestion. To examine this question for the AGMr(HindIII)-1 DNA sequence class, the concentrations of AGMr(HindIII)-1 sequences in both total DNA and [...]

Table legend. Reassociation rates of 32P-labeled AGMr(HindIII)-1 in the presence of various fractions of chromosomal DNA. Reassociation rates of the sequences homologous to AGMr(HindIII)-1 in various DNA fractions were determined in two ways within a given assay. In the first determination the reassociation rate of 32P-labeled AGMr(HindIII)-1, present in tracer amounts, was measured in the presence of an excess (2,500- to 10,000-fold) of the driver DNA fraction. In all cases, the driver DNA fraction was 3H-labeled. The reassociation rate of the rapidly reassociating component of the 3H-labeled driver DNA was measured in parallel with that of the 32P-labeled tracer. The values in Experiment I were all determined with one preparation of 32P-labeled AGMr(HindIII)-1 with a specific activity of 1.8 x 10^7 cpm/µg. The values in Experiment II were determined with another preparation of 32P-labeled tracer with a specific activity of 1.5 x 10^[exponent illegible] cpm/µg. The measured reassociation rate of the tracer has been observed to vary by no more than about [illegible]-fold between different tracer preparations. It is likely that the difference reflects the method of preparation of the labeled tracer. The data in Experiment 1 were derived from Fig. 1; the data in Experiment 2 were derived from Fig. 4.

[...] showed no detectable hybridization to the 32P-labeled tracer over the range of Cot tested (Fig. 1, Table IV). Total DNA and nucleosomal core particle DNA reannealed with the AGMr(HindIII)-1 tracer with rates of 382 and 458 M^-1 s^-1, respectively (Fig. 1). These results demonstrate that the representation of AGMr(HindIII)-1 sequences in the nucleosomal core particle DNA fraction is indistinguishable from its representation in the total genomic DNA.

In a variety of tissues, nucleosomes containing actively transcribed DNA sequences appear to be differentially sensitive to the action of DNase I (32, 46). To test the possibility that nucleosomes containing different DNA sequence classes are differentially sensitive to DNase I, the concentration of AGMr(HindIII)-1 sequences in DNA that resisted digestion of BSC-1 nuclei with DNase I was determined.
BSC-1 nuclei were treated with DNase I under conditions which solubilized 20% of the total DNA (in control experiments, it was shown that the rates of digestion of AGMr(HindIII)-1 and total DNA by DNase I are equal; data not shown). The resistant DNA was then purified and the concentration of AGMr(HindIII)-1 sequences determined by the kinetics of reassociation with 32P-labeled AGMr(HindIII)-1 DNA. In two separate experiments, the rate of reassociation of the tracer in the presence of the DNase I-resistant DNA was indistinguishable from its rate of reassociation with total DNA (Fig. 4, Table IV). Therefore, nucleosomes containing the highly reiterated DNA class of the BSC-1 genome are not differentially sensitive to the action of DNase I.
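Because the tracer's reassociation rate in driver excess scales with the concentration of homologous sequences in the driver, the ratio of two rates estimates relative representation. The sketch below applies this to the rates reported above for total DNA and core particle DNA (382 and 458 M^-1 s^-1). Treating a twofold window as "indistinguishable" is an assumption borrowed from the accuracy stated earlier for the kinetics studies, not a criterion the text defines; names are illustrative.

```python
RATE_TOTAL_DNA = 382.0     # M^-1 s^-1, tracer driven by total DNA
RATE_CORE_PARTICLE = 458.0  # M^-1 s^-1, tracer driven by core particle DNA
FOLD_TOLERANCE = 2.0       # ASSUMED tolerance, from the stated factor-of-two accuracy


def relative_representation(rate_sample: float, rate_reference: float) -> float:
    """Ratio of tracer rates, taken as the relative sequence concentration."""
    return rate_sample / rate_reference


def indistinguishable(ratio: float, tolerance: float = FOLD_TOLERANCE) -> bool:
    """True when the ratio lies within the tolerance in either direction."""
    return 1.0 / tolerance <= ratio <= tolerance


ratio = relative_representation(RATE_CORE_PARTICLE, RATE_TOTAL_DNA)
print(f"Core particle / total DNA rate ratio: {ratio:.2f}")
print(f"Within a {FOLD_TOLERANCE:.0f}-fold window: {indistinguishable(ratio)}")
```

A sequence class that was strongly depleted from, or enriched in, the nuclease-resistant fraction would give a ratio well outside this window.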

Figure legend (fragment). The specific activity of the 32P-labeled AGMr(HindIII)-1 tracer was 1.5 x 10^[exponent illegible] cpm/µg. Driver DNA excess ranged from 2,500- to 10,000-fold. Preparation of total BSC-1 DNA (O) was as described under "Experimental Procedures."

Spatial Relationship between Nucleosomes and Protected DNA Sequences-The results presented thus far indicate that the organization of the highly reiterated DNA fraction of AGMr(HindIII)-1 sequences in nucleosomes is grossly indistinguishable from that of the total DNA. Therefore, this sequence can be used to investigate a more general question, namely, does a given DNA sequence uniquely define the position which a nucleosome occupies on that sequence? Two extreme possibilities can be considered: 1) the arrangement of nucleosomes is nonrandom such that a particular region of the AGMr(HindIII)-1 sequence tends to be associated with the same regions of the nucleosomal core, or 2) the arrangement of nucleosomes on the family of AGMr(HindIII)-1 sequences is completely random with respect to sequence.

These alternatives can be distinguished experimentally as follows. Staphylococcal nuclease digestion of BSC-1 nuclei will solubilize the internucleosomal DNA linker, leaving core particle DNA intact. If the same region of the AGMr(HindIII)-1 sequence is always associated with the core particle, the resulting core particle DNA fragments of AGMr(HindIII)-1 will be in register with respect to the AGMr(HindIII)-1 sequence. Thus, reannealed core particle DNA will only form AGMr(HindIII)-1 duplexes 145 base pairs in length. On the other hand, if the arrangement of nucleosomes is random with respect to sequence, nuclease digestion will release a series of fragments of AGMr(HindIII)-1 which are circularly permuted with respect to the AGMr(HindIII)-1 sequence. In this case, reassociated core particle DNA duplexes will be concatamers of the AGMr(HindIII)-1 sequences.
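The logic of this test can be illustrated with a toy model. The sketch below treats the satellite as a perfectly periodic 172-base pair tandem array and each core fragment as a 145-base pair window starting at some phase within the repeat. It is a simplification: it ignores sequence divergence among repeats and any details of strand pairing, and the function names are illustrative. Two fragments with the same phase anneal flush, whereas fragments with different phases pair with single-stranded tails that remain free to pair with further fragments, which is the route to concatamers.

```python
REPEAT_BP = 172  # AGMr(HindIII)-1 unit length
CORE_BP = 145    # core particle DNA length


def best_overlap(phase_a: int, phase_b: int) -> int:
    """Longest duplex two core fragments can form on a periodic array.

    phase_a and phase_b are fragment start positions within the repeat.
    Fragment B can pair displaced to one side by d or to the other by
    REPEAT_BP - d, where d is the phase difference.
    """
    d = (phase_b - phase_a) % REPEAT_BP
    overlap_one_side = max(CORE_BP - d, 0)
    overlap_other_side = max(CORE_BP - (REPEAT_BP - d), 0)
    return max(overlap_one_side, overlap_other_side)


def tail_length(phase_a: int, phase_b: int) -> int:
    """Single-stranded tail left on each fragment after best pairing."""
    return CORE_BP - best_overlap(phase_a, phase_b)


# In-register fragments: flush 145 bp duplex, no tails.
print("same phase:", best_overlap(10, 10), "bp duplex,", tail_length(10, 10), "nt tail")

# Circularly permuted fragments: partial duplexes with tails.
for phase in (50, 100):
    print(f"phase offset {phase}:", best_overlap(0, phase), "bp duplex,",
          tail_length(0, phase), "nt tail")
```

In this model only the in-register case yields products limited to 145 base pairs; any spread of phases leaves tails, so the appearance of material longer than 145 base pairs after reannealing argues for permuted fragments.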
Thus, the fragment lengths of core particle DNA prepared [...] After electrophoresis on polyacrylamide gels, the DNA in the gel was transferred to a nitrocellulose filter and hybridized to 32P-labeled AGMr(HindIII)-1 DNA. The results (Fig. 5) demonstrate that AGMr(HindIII)-1 sequences have formed concatamers following denaturation and reannealing. In a similar experiment (data not shown), DNA obtained from trimmed nucleosomes prepared according to the method of Whitlock and Simpson (33) was used. Concatamers of AGMr(HindIII)-1 were again observed following reassociation.
These results indicate that most of the AGMr(HindIII)-1 sequences in core particle DNA are circularly permuted and therefore that there is no unique arrangement of nucleosomes on these sequences. The generation of concatamers of AGMr(HindIII)-1 might be expected to regenerate some Endo R.HindIII sites. Endo R.HindIII digestion of concatamers obtained from reannealed core particle DNA might then release fragments of 172 base pairs containing AGMr(HindIII)-1. These would be observed as a discrete band of lower mobility than the starting material of 145 base pairs. Such a band was not observed following Endo R.HindIII digestion of concatamers of DNA derived from core particles prepared according to the method of Varshavsky et al. (22) (Fig. 5) but has been observed with concatamers of DNA derived from trimmed nucleosomes (data not shown). This difference probably derives from the fact that preparation of trimmed nucleosomes results in a smaller size distribution of DNA fragments than the Varshavsky method.
The observed concatamer formation is not due to any inherent property of the AGMr(HindIII)-1 sequence itself (such as inverted repeats), since reassociation of purified AGMr(HindIII)-1 does not result in concatamer formation (Fig. 5b). This result was expected from the known nucleotide sequence of AGMr(HindIII)-1 (2). The observation that direct digestion of mononucleosomal DNA (185 base pairs) with Endo R.HindIII generates a smear of DNA fragments smaller than 185 base pairs which contain AGMr(HindIII)-1 sequences but no discrete band (data not shown) is consistent with the conclusion that the arrangement of nucleosomes on AGMr(HindIII)-1 DNA is random with respect to sequence.
It is impossible to exclude the possibility that randomization of nucleosomes on the AGMr(HindIII)-1 sequence occurred during the isolation procedures. However, attempts were made to minimize this possibility. All isolation procedures were carried out in low ionic strength buffers, and similar results were obtained when core particle DNA was prepared by two different methods.

DISCUSSION

The Arrangement of AGMr(HindIII)-1 in the BSC-1 Genome

The arrangement of a highly repeated DNA sequence, AGMr(HindIII)-1, in the BSC-1 genome was studied by two approaches. The analyses both of the reassociation kinetics of the total DNA and of the products of restriction enzyme digestion provide a description of the organization of a well defined family of sequences within the genome, and also reveal the limitations of each approach.

Sequences Homologous to AGMr(HindIII)-1 Are Interspersed in the Genome-Analysis of the total BSC-1 DNA by reassociation kinetics reveals the presence of three distinct kinetic components: a rapidly reassociating fraction, a moderately rapidly reassociating fraction, and a slowly reassociat-