Base Stacking and Even/Odd Behavior of Hairpin Loops in DNA Triplet Repeat Slippage and Expansion with DNA Polymerase*

Repetitions of CAG or CTG triplets in DNA can form intrastrand hairpin loops with combinations of normal and mismatched base pairs that easily rearrange. Such loops may promote primer-template slippage in DNA replication or repair to give triplet-repeat expansions like those associated with neurodegenerative diseases. Using self-priming sequences (e.g. (CAG)16(CTG)4), we resolve all hairpin loops formed and measure their slippage and expansion rates with DNA polymerase at 37 °C. Comparing CAG/CTG loop structures with GAC/GTC structures, having similar hydrogen bonding but different base stacking, we find that CAG, CTG, and GTC triplets predominantly form even-membered loops that slip in steps of two triplets, whereas GAC triplets favor odd-numbered loops. Slippage rates decline as hairpin stability increases, supporting the idea that slippage initiates more easily in less stable regions. Loop stabilities (in low salt) increase in the order GTC < CAG < GAC < CTG, while slippage rates decrease in the order GTC > CAG ≈ GAC > CTG. Loops of GTC compared with CTG melt 9 °C lower and slip 6-fold faster. We interpret results in terms of base stacking, by relating melting temperature to standard enthalpy changes for doublets of base pairs and mispairs, considering enthalpy-entropy compensation.

Repetitive DNA sequences such as tandemly repeated triplets of bases are abundant and highly polymorphic in the human genome, probably because of strand slippage promoted by repetition in DNA replication, repair, or recombination (1-6). Occasionally, a repetitive region within a human gene expands sufficiently to cause inherited human disease (7). At least 12 neurological disorders are associated with triplet repeat expansions within human genes (5).
Eight such disorders arise by the expansion of CAG repetitions in gene regions encoding glutamine repeats in protein. In each case, involving a different gene (Huntington's disease, spinobulbar muscular atrophy, dentatorubral-pallidoluysian atrophy, and spinocerebellar ataxias 1, 2, 3, 6, and 7), the number of CAG repeats in tandem is expanded from a normal range of 5-30 to a disease-causing range of 40-100 (5). The resultant polyglutamine expansion in protein may cause disease by forming insoluble nuclear protein aggregates (8-10), possibly with cross-linking by transglutaminase (11, 12), an enzyme abundant in the brain (13, 14), enabling glutamine reaction with lysine to form a peptide-like cross-link between polypeptides.
Another four disorders involve other triplet repetitions expanded in noncoding regions of genes. Myotonic dystrophy, for example, is associated with repeating CTG triplets expanded in an untranslated 3′-terminal gene region, from a normal range of 5-40 repeats to a disease-causing range of 50-3000 (5). Expansions of this magnitude are also observed in CGG repeats and CCG repeats, found in 5′-untranslated regions of genes associated with fragile X syndromes A and E, respectively (5). Similarly large increases are observed for GAA repetitions in an intron of a gene associated with Friedreich's ataxia (5). In each case, the noncoding region involved is transcribed into RNA but not translated into protein. The increased number of repetitions in RNA may cause disease by interfering with RNA transcription or processing within cell nuclei (15-18).
In DNA undergoing replication or repair, repeating triplets may enable one DNA strand to slip relative to the other so that the number of triplet repeats can be expanded by DNA polymerase (2, 4, 19). Also, because CNG triplets associated with disease can form intrastrand hairpin folds with secondary structure (20-22), it is of interest to determine how such folding affects slippage and expansion with polymerase. Recently, we developed a convenient in vitro assay, using self-priming repeat sequences, to measure rates of slippage and expansion in relation to hairpin structure (4).
In an earlier study of the major (CAG/CTG) class of disease-associated triplets (4), we examined DNA polymerase extension products of the self-priming sequence, (CTG)16(CAG)4, which forms a well defined series of hairpin loops. Using high resolution gel electrophoresis to analyze product lengths, we found that the CTG repeats form loops that are predominantly even-membered (i.e. have even rather than odd numbers of bases in the hairpin bend). The products of polymerase extension were observed to slip and expand in steps of two triplets at rates of ~1 step/min at 37 °C. In the present study, a similar analysis is made for the complementary case, (CAG)16(CTG)4, and "sister" cases obtained by replacing CAG/CTG with GAC/GTC. Since these cases exhibit the same hydrogen bonding with different base stacking, they reveal the influence of base stacking on the even-odd character of hairpin folds and slippage rates leading to repeat expansions with polymerase.
DNA Polymerase and Substrates-The KF exo− polymerase used to achieve rapid primer 3′ extension on template, an Escherichia coli DNA polymerase I Klenow fragment mutant (D355A,E357A) devoid of 3′→5′ as well as 5′→3′ exonuclease activity, was purified from overproducing strains (23). The dNTP substrates used in extension experiments were purchased from Amersham Pharmacia Biotech, along with ddNTPs employed for sequence analysis of extended products.

Methods
Melting Curves-Thermal denaturation profiles were obtained for each DNA 60-mer at the same strand concentration (2 μM) in 0.02 M Na+ phosphate buffer (4), by measuring UV absorbance (A260) versus temperature, from 20 to 85 °C at 2 °C/min.
Extension Reactions-Radiolabeled DNA samples at 10 nM strand concentration, in polymerase reaction buffer (50 mM Tris-HCl, pH 7.5, 10 mM MgCl2), were incubated at 37 °C for 5 min to allow equilibration. A 120-μl aliquot was then micropipetted into a 0.5-ml polypropylene microcentrifuge tube containing 30 μl of polymerase plus dNTPs in reaction buffer at 37 °C, at which point (within 3 s) timing of the reaction (t) was started. The concentrations of DNA polymerase KF exo− and each dNTP in the reaction mixture were 60 and 400 nM, respectively. After short intervals of reaction time (t = 15 s, 30 s, etc.), a 5-μl aliquot of reaction mixture was removed and added (within 2 s) to 10 μl of 20 M formamide solution containing 20 mM EDTA to quench the reaction.
Denaturing Gel Electrophoresis-Extension products of radiolabeled DNA were separated into bands of increasing chain length by electrophoresis at 2000 V on a 12% polyacrylamide slab gel (40 cm × 40 cm × 0.2 mm) containing 16 M formamide as denaturant, in TBE buffer (90 mM Tris borate, pH 8.3, 2 mM Na2EDTA). Gels were dried on paper and scanned by a Molecular Dynamics Storm 860 PhosphorImager. FragmeNT Analysis software (Molecular Dynamics) was used to integrate the intensity of each radioactive band in a gel lane, expressed as a percentage of total integrated band intensities in the lane.
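The normalization described above, in which each band is expressed as a percentage of the total integrated intensity in its lane, can be sketched as follows. The function name and the example intensities are hypothetical and are not taken from the FragmeNT software or from the measured data.

```python
def band_percentages(band_intensities):
    """Express each integrated band intensity as a percentage of the lane total."""
    total = sum(band_intensities)
    if total <= 0:
        raise ValueError("lane total must be positive")
    return [100.0 * value / total for value in band_intensities]


# Hypothetical integrated intensities for bands I_0, I_1, I_2 in one gel lane.
lane = [200.0, 120.0, 80.0]
percentages = band_percentages(lane)
print(percentages)
```

Normalizing within each lane makes intensities from different time points comparable even when slightly different amounts of sample are loaded.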
Resolution of Hairpin Loops-In each of the four cases studied here, the initial 60-mer sequence (A) forms a series of self-primed hairpin loops, A_n (n = 0, 1, 2, etc.), where n is the number of overhanging template triplets at the 5′-end, as illustrated (Fig. 1, a-d), with an asterisk indicating the 5′-end labeled with 32P. The loops are rapidly extended to their corresponding blunt-end products by adding DNA polymerase and sufficient amounts (0.4 μM each) of the three dNTPs needed to reach near maximum velocity of primer extension on template, with minimal background "pause" banding caused by misinsertion or terminal transferase activity (4). In 15 s of reaction time (t) with polymerase, the primer 3′-end of A_n is extended by n primer triplets to form the corresponding blunt-end product, A_n plus n triplets, measured as a discrete gel band intensity (I_n) by denaturing gel electrophoresis and PhosphorImager analysis. Each I_n (n = 0, 1, 2, etc.) changes as reaction time t is increased, because each blunt-end product undergoes slippage and further extension with polymerase. By analyzing I_n versus t and extrapolating back to t = 0, we evaluate I_n^0, indicating the approximate amount of loop A_n present initially, before polymerase was added.
The loops are classified as even-numbered or odd-numbered, depending on whether an even or odd number of bases is enclosed within the loop. The even-numbered loops (n = 0, 2, ..., 10) have the same kind of hairpin bend made with an even number of unpaired bases, minimally four as shown (Fig. 1, a-d). The odd-numbered loops (n = 1, 3, ..., 11) have another kind of bend made with an odd number of unpaired bases (minimally three). The other loops shown (n = 12-15) have one or more primer triplets in the bend region where there is less opportunity for correct base pairing; these are much less stable loops, indicating how bend regions change as primer triplets enter the bend region in extreme cases of slippage.
Evaluation of Band Intensity I_0-The initial blunt-ended loop, A_0, which cannot be extended unless slippage occurs, is seen as band intensity I_0 remaining after 15 s of reaction. As A_0 undergoes slippage to an extendable form (A_1, A_2, etc.), I_0 declines in a simple manner, suggesting a first order differential equation, dI_0/dt = −k_0 I_0, or in terms of measured differences (Δ) as follows,

$$\frac{\Delta I_0}{\Delta t} = -k_0 \bar{I}_0 \qquad \text{(Eq. 1)}$$

where k_0 is the slippage rate constant and $\bar{I}_0$ is the average intensity value measured in a short time interval, Δt, of 15 or 30 s. For Δt between reaction times t_x and t_y, the intensity change is ΔI_0 = I_0(t_y) − I_0(t_x), and the corresponding $\bar{I}_0$ on the right side of Equation 1 is the average value, $\bar{I}_0$ = 0.5(I_0(t_x) + I_0(t_y)).
In our experiments, I_0 decays to a low residual "background" intensity (I_0b) that remains nearly constant, indicating that a small fraction (<1%) of initial 60-mers are unreactive at the 3′-end. To take I_0b into account, we modify Equation 1 as follows.

$$\frac{\Delta I_0}{\Delta t} = -k_0 (\bar{I}_0 - I_{0b}) \qquad \text{(Eq. 2)}$$
The resultant integrated equation is then the following,

$$\ln(I_0 - I_{0b}) = \ln I_0^0 - k_0 t \qquad \text{(Eq. 3)}$$

where I_0^0 is the initial intensity corrected for background. To evaluate I_0b and rate constant k_0, we plot ΔI_0/Δt versus I_0 and fit a straight line by least squares, obtaining slope −k_0 and intercept k_0 I_0b according to Equation 2. Then, using Equation 3, we plot ln(I_0 − I_0b) versus t and linearly extrapolate to t = 0 to obtain I_0^0, indicating the amount of A_0 present initially, as a percentage of total loops A_n (n = 0-11).
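The two-step evaluation just described (a straight-line fit of ΔI_0/Δt against the interval-averaged I_0, followed by a log-linear extrapolation to t = 0) can be sketched in a few lines. This is a minimal illustration on synthetic data, not the analysis code used in the study; the function names and the test values (k_0 = 1.0 per min, I_0b = 0.5%, I_0^0 = 20%) are assumptions chosen for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_k0_and_background(times, i0):
    """Fit dI0/dt (finite differences) against interval-averaged I0.

    The slope is -k0 and the intercept is k0 * I0b.
    """
    rates, averages = [], []
    for j in range(len(times) - 1):
        rates.append((i0[j + 1] - i0[j]) / (times[j + 1] - times[j]))
        averages.append(0.5 * (i0[j] + i0[j + 1]))
    slope, intercept = fit_line(averages, rates)
    k0 = -slope
    return k0, intercept / k0


def extrapolate_to_zero(times, i0, i0b):
    """Fit ln(I0 - I0b) against t and return the background-corrected value at t = 0."""
    points = [(t, math.log(v - i0b)) for t, v in zip(times, i0) if v > i0b]
    slope, intercept = fit_line([p[0] for p in points], [p[1] for p in points])
    return math.exp(intercept)


# Synthetic decay curve sampled every 0.25 min (illustrative values only).
TRUE_K0, TRUE_I0B, TRUE_I00 = 1.0, 0.5, 20.0
times = [0.25 * j for j in range(1, 9)]
i0 = [TRUE_I0B + TRUE_I00 * math.exp(-TRUE_K0 * t) for t in times]

k0, i0b = estimate_k0_and_background(times, i0)
i00 = extrapolate_to_zero(times, i0, i0b)
print(k0, i0b, i00)
```

Because the derivative is replaced by a finite difference over each interval, the fitted k_0 comes out slightly below the true value when intervals are long relative to 1/k_0, which is one reason to keep Δt short.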
Slippage Rate Constant Evaluation-The rate constant k_0 indicates the rate of A_0 slippage, resulting in extension with DNA polymerase. Two k_0 components are examined, k_01 contributing to band intensity I_1 (by A_0 → A_1 slippage and extension by 1 triplet) and k_02 contributing to I_2 (by A_0 → A_2 slippage and extension by 2 triplets). To determine the relative magnitude of these components, we start with the differential equations, dI_1/dt = k_01 I_0 − k_1 I_1 and dI_2/dt = k_02 I_0 + k_12 I_1 − k_2 I_2, and replace differentials by measured differences, as in Equation 1, as follows.

$$\frac{\Delta I_1}{\Delta t} = k_{01} \bar{I}_0 - k_1 \bar{I}_1 \qquad \text{(Eq. 4)}$$

$$\frac{\Delta I_2}{\Delta t} = k_{02} \bar{I}_0 + k_{12} \bar{I}_1 - k_2 \bar{I}_2 \qquad \text{(Eq. 5)}$$

Equation 4, divided through by I_1, is used to evaluate k_01 and k_1 by a linear least squares fit to a plot of ΔI_1/(I_1 Δt) versus (I_0/I_1). To evaluate k_02 and k_2 in a similar manner, we examine linear versions of Equation 5 obtained for three different approximations: (a) k_02 >> k_12, (b) k_02 ≈ k_12, and (c) k_02 << k_12. Experimental plots of ΔI_2/(I_2 Δt) against (a) I_0/I_2, (b) (I_0 + I_1)/I_2, and (c) I_1/I_2 are then compared to determine which plot gives the best linear least squares fit for k_2 evaluation. In all cases where even-numbered loops predominate, approximation a gives the best fit, indicating that extension following slippage is more frequent in steps of two triplets than in steps of one triplet. Only in the case of (GAC)16(GTC)4, where odd-numbered loops are favored over even-numbered, do we find approximations b or c giving a better fit.
Having evaluated k_0, k_1, and k_2 as described above, we use a similar approach to obtain other k_n values for n > 2. In each case, starting with the general form of Equation 5, we apply approximations (a) k_{n−2,n} >> k_{n−1,n}, (b) k_{n−2,n} ≈ k_{n−1,n}, and (c) k_{n−2,n} << k_{n−1,n} to determine which gives the best k_n estimate by linear least squares fit. In all cases where approximation a applies to n = 2 (i.e. k_02 >> k_12 in Equation 5), we find that the corresponding general approximation (k_{n−2,n} >> k_{n−1,n}) also applies to Equation 6 for evaluating other even-numbered k_n values (n = 4, 6, etc.). By using the k_n values measured with corresponding simplified (linear) versions of Equations 5 and 6, we extrapolate I_n back from t = 15 s to t = 0, to estimate I_n^0, indicating the initial amount of loop A_n present before extension with polymerase.
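The comparison of approximations a, b, and c for k_2 amounts to three straight-line fits of ΔI_2/(I_2 Δt) against different predictors, with the best fit chosen by goodness of fit. The sketch below illustrates this on made-up data constructed so that approximation a holds exactly. The residual sum of squares is used as the selection criterion here as an assumption, since the text does not state which fit statistic was compared, and all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (slope, intercept, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss


def compare_approximations(i0, i1, i2, d_i2_dt):
    """Fit dI2/(I2 dt) against the predictor of each approximation (a, b, c).

    i0, i1, i2 are interval-averaged band intensities; d_i2_dt holds the
    finite-difference rates of change of I2 over the same intervals.
    In each fit the slope estimates the feeding rate constant and the
    negative intercept estimates k2.
    """
    y = [rate / mid for rate, mid in zip(d_i2_dt, i2)]
    predictors = {
        "a": [a / c for a, c in zip(i0, i2)],               # k02 >> k12
        "b": [(a + b) / c for a, b, c in zip(i0, i1, i2)],  # k02 ~ k12
        "c": [b / c for b, c in zip(i1, i2)],               # k02 << k12
    }
    results = {}
    for label, x in predictors.items():
        slope, intercept, rss = fit_line(x, y)
        results[label] = {"k_in": slope, "k2": -intercept, "rss": rss}
    best = min(results, key=lambda label: results[label]["rss"])
    return best, results


# Made-up interval averages (percent of lane total), built so that only A_0
# feeds band 2 with k02 = 0.8 per min and k2 = 0.3 per min.
i0 = [40.0, 30.0, 22.0, 16.0]
i1 = [2.0, 3.0, 3.5, 3.0]
i2 = [20.0, 24.0, 26.0, 27.0]
d_i2_dt = [(0.8 * a / c - 0.3) * c for a, c in zip(i0, i2)]

best, results = compare_approximations(i0, i1, i2, d_i2_dt)
print(best, results[best])
```

The same function applies to higher even n by passing I_{n−2}, I_{n−1}, and I_n in place of I_0, I_1, and I_2.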

RESULTS
Each of the self-priming sequences examined here, like (CTG)16(CAG)4 previously studied (4), forms a series of hairpin loops, A_n (n = 0, 1, 2, etc.), with stable p/t¹ duplex enclosing a less stable t/t loop domain (Fig. 1, a-d). The subscript n in A_n indicates that there are n overhanging template triplets on which DNA polymerase can extend the primer 3′-end rapidly (in seconds) to yield blunt-end product (A_n + n triplets) proportional to the amount of A_n present. With increasing reaction time, each blunt-end product undergoes slippage, allowing more triplets to be added by DNA polymerase, as shown in Figs. 2 and 3.
In each case (Fig. 1, a-d), the even- and odd-numbered loops A_n up to n = 11 have all four primer triplets correctly hydrogen-bonded and stacked in Watson-Crick bp (indicated by dots) with four antiparallel template triplets located n triplets from the 5′-32P-end (*). The p/t duplex in A_n (n = 0, 1, ..., 11) encloses a t/t loop domain containing 12 − n template triplets. The t/t domain has the hairpin bend made with an even or odd number of unpaired bases and also has mispaired bases held between correct bp formed between opposing (antiparallel) template triplets. Opposing triplets of type CTG or GTC form mispairs of type T opposite T (T/T) held between correct G/C and C/G bp (dots), as shown (Fig. 1, a and b) for sequences (CTG)16(CAG)4 and (GTC)16(GAC)4, respectively. On the other hand, opposing triplets of type CAG or GAC form mispairs of A opposite A (A/A), as shown (Fig. 1, c and d) for (CAG)16(CTG)4 and (GAC)16(GTC)4, respectively.
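The bookkeeping for loops A_n (overhang of n template triplets, four primer triplets in the p/t duplex, 12 − n template triplets in the t/t domain, and an even or odd bend) can be made concrete with a short sketch. The function names and dictionary keys are hypothetical; the counts follow the description of the 60-mers given above and apply only to n = 0-11, where all four primer triplets are paired.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def self_priming_sequence(template, n_template=16, n_primer=4):
    """Build 5'-(template)16(primer)4-3', with the primer triplet complementary to the template triplet."""
    primer = reverse_complement(template)
    return template * n_template + primer * n_primer


def describe_loop(n, n_template=16, n_primer=4):
    """Summarize hairpin loop A_n for loops with all primer triplets paired (n = 0..11)."""
    max_n = n_template - n_primer - 1
    if not 0 <= n <= max_n:
        raise ValueError(f"n must be between 0 and {max_n}")
    even = n % 2 == 0
    return {
        "overhang_triplets": n,
        "loop_domain_triplets": n_template - n_primer - n,
        "parity": "even" if even else "odd",
        "min_unpaired_bases_in_bend": 4 if even else 3,
        # Length of the blunt-end product after extension by n triplets.
        "product_length_nt": 3 * (n_template + n_primer + n),
    }


for triplet in ("CTG", "GTC", "CAG", "GAC"):
    print(triplet, reverse_complement(triplet), len(self_priming_sequence(triplet)))
print(describe_loop(2))
```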
Loops A_12 to A_15 (Fig. 1, a-d), having fewer than four of the primer triplets correctly bound to template triplets, are much less stable and are hardly apparent initially. These loops are included to show how bends change in extreme cases of slippage after more than 11 triplets are added by DNA polymerase in longer reaction times (Figs. 2b and 3b).
Thermal Melting Profiles Reveal the Relative Stability of the Two Domains-To measure the relative stability of loop structures formed in the absence of polymerase, thermal denaturation curves were obtained for each 60-mer sequence in low salt buffer (0.02 M Na+), by plotting UV absorbance (A260) versus temperature (Figs. 2a and 3a). In each case, as found previously for (CTG)16(CAG)4 (4), two sigmoidal transitions are evident, the first with melting temperature (T_m) below 60 °C and the second with T_m near 80 °C. The first transition indicates the melting of the t/t loop domain containing the base pairs and mispairs of template triplet interactions. As anticipated from previous work (22) and seen by comparing Figs. 2a and 3a, this domain is most stable for CTG triplets (T_m = 56 °C) followed by GAC (54 °C), CAG (52 °C), and last GTC (47 °C). The second transition indicates the dissociation of stable p/t duplex, which melts at nearly the same temperature (T_m ≈ 79 °C) in all four cases.

¹ The abbreviations used are: p/t, primer-template DNA duplex; t/t, loop domain of template triplet interactions; bp, base pair(s).

FIG. 1. Possible hairpin loops formed by self-priming sequences of CAG/CTG and GAC/GTC triplet repeats. A, self-priming DNA sequence having 16 repeats of a template triplet followed by four repeats of complementary primer triplet; A_n, hairpin fold of sequence A, having n overhanging template triplets on which DNA polymerase can rapidly add n complementary primer triplets to the primer 3′-end. a, loops A_n (n = 0, 1, 2, etc.) with hydrogen-bonded base pairs for sequence 5′-(CTG)16(CAG)4-3′; b, corresponding loops for "sister" sequence of same base composition, (GTC)16(GAC)4; c and d, corresponding loops for respective "complementary" cases, (CAG)16(CTG)4 and (GAC)16(GTC)4. The hairpin bends in even-numbered loops (n = 0, 2, ..., 14) are made with an even number of unpaired bases, minimally four as shown; those in odd-numbered loops (n = 1, 3, ..., 15) are made with an odd number of unpaired bases (at least three). The hydrogen-bonded, Watson-Crick base pairs are indicated by dots in a horizontal series; three dots between triplets in parentheses indicate stably paired primer-template triplets in the p/t duplex domain; two dots between triplets in parentheses indicate less stably paired template triplets in the t/t loop domain. The asterisk indicates the 5′-end of template labeled with 32P. Note that loops A_n have n template triplets available for extending primer 3′-end on template with DNA polymerase (e.g. polymerase KF exo−). In the presence of three dNTPs (N = C, G, and A or T), polymerase KF exo− rapidly extends A_n by n primer triplets to form product blunt-end hairpins. The products in each case, A_n plus (CAG)n (a), A_n plus (GAC)n (b), A_n plus (CTG)n (c), or A_n plus (GTC)n (d), are observed as a series of band intensities I_n (n = 0, 1, 2, etc.) by denaturing gel electrophoresis (Figs. 2b and 3b). The changes in I_n with increasing reaction time are measured to determine rates at which successive blunt-end product hairpins undergo slippage resulting in further polymerase-catalyzed expansion. Extrapolation of I_n to zero reaction time yields I_n^0, the initial amount of A_n formed by each sequence.
If slippage in the stable p/t duplex is promoted by slippage in the less stable t/t domain, as we suggested previously (4), then (GTC)16(GAC)4, having the lowest T_m for the first transition (47 °C, Fig. 3a), should also have the highest rate of slippage and expansion with polymerase. This is evidently the case, as seen by comparing gel band patterns (Figs. 2b and 3b) obtained by primer extension with DNA polymerase KF exo− at 37 °C, for reaction times ranging from 0.5 to 16 min.
Polymerase Extension Products Resolved by Gel Electrophoresis and PhosphorImager Analysis-The addition of DNA polymerase KF exo− and appropriate (0.4 μM) dNTP substrates results in rapid extension of loops A_n to their blunt-end products, A_n + n triplets, resolved as band intensities I_n by electrophoresis and 32P PhosphorImager analysis (Figs. 2b and 3b). After stopping the reaction at various times and resolving product bands in gel lanes, we evaluate band intensities I_0, I_1, I_2, etc. by integration in each lane, to measure the relative amounts of products A_0, A_1 + 1 triplet, A_2 + 2 triplets, etc. as a function of time. The I_n values obtained at times of 0.25 and 0.5 min, before products rearrange significantly by slippage, are used for extrapolation back to zero time, to obtain I_n^0, indicating the initial amounts of A_n present when polymerase was added.
In the case of (CTG)16(CAG)4, whose band patterns are shown in Fig. 2b (left), we see that the loops are predominantly of the even-numbered type, as previously reported (4). The complementary case, (CAG)16(CTG)4, shown in Fig. 2b (right), also forms mainly even-numbered loops, as does (GTC)16(GAC)4, shown in Fig. 3b (left). After 0.5 min of reaction time, the most intense bands correspond to even numbers of triplets added (0, 2, ..., 10). As reaction time is increased, the band intensities gradually change as blunt-end products rearrange by slippage and are expanded further, mainly in steps of two triplets.
Extension by Slippage Compared with Melting Temperature-By examining band intensity changes with reaction time (Fig. 2b), we see that expansion by slippage is several times faster for (CAG)16(CTG)4 than for (CTG)16(CAG)4. A faster slippage rate is in keeping with the observation (Fig. 2a) that the first transition has a 4 °C lower T_m value, 52 °C, compared with 56 °C for (CTG)16(CAG)4. The second transition in both cases is the same (T_m = 79 °C).
The "sister" sequence (GTC)16(GAC)4, which has the same base composition as (CTG)16(CAG)4, has a much lower first transition (T_m = 47 °C), with no significant change in the second transition (T_m = 78 °C), as seen in Fig. 3a. The first T_m is 9 °C below that for (CTG)16(CAG)4, and the rate of expansion by slippage is also much faster, as found by comparing the gel band patterns in Figs. 3b (left) and 2b (left). Nevertheless, (GTC)16(GAC)4 still forms even-numbered loops preferentially and slips in steps of two triplets, as do (CTG)16(CAG)4 and (CAG)16(CTG)4.

In contrast, (GAC)16(GTC)4, which has the same base composition as (CAG)16(CTG)4, tends to form odd-numbered loops. Unlike the other three cases, the most intense bands in this case (Fig. 3b, right) correspond to odd numbers of triplets added (1, 3, 5, etc.). The changes in band intensity with time indicate that expansion by slippage is much slower than observed for (GTC)16(GAC)4 in Fig. 3b (left), being comparable with that of (CAG)16(CTG)4 in Fig. 2b (right). A slower slippage rate is consistent with the melting curve in Fig. 3a, showing that the first transition for (GAC)16(GTC)4 has T_m = 54 °C, about 7 °C higher than observed for (GTC)16(GAC)4 (Fig. 3a) and 2 °C above that found for (CAG)16(CTG)4 (Fig. 2a).
Initial Blunt-end Hairpin Amount and Slippage Rate-In all four cases, after 15 s of reaction, band intensity I_0, representing blunt-ended hairpin A_0, shows simple exponential decay to a small residual "background" intensity I_0b, which we evaluate by Equation 2 under "Experimental Procedures." The rate constant k_0 for this decay is obtained from a linear least squares fit to a plot of ln(I_0 − I_0b) versus reaction time, according to Equation 3. Extrapolation to zero time yields an I_0^0 value indicating the initial amount of A_0 present at the time polymerase was added.
By evaluating two components of k_0, namely k_01 for slippage by one triplet (A_0 → A_1) and k_02 for slippage by two triplets (A_0 → A_2), as described by Equations 4 and 5, we find that k_01 is an order of magnitude smaller than k_02 in every case except (GAC)16(GTC)4. For the latter case, k_01 appears equal to or greater than k_02, verifying that this case differs from the rest by forming odd-numbered loops equally well or better than even-numbered loops. A plot of k_0 against T_m for the first transition (Fig. 4) shows that (GAC)16(GTC)4 has a k_0 value (open circle) somewhat greater than expected from the trend indicated by the other three cases (solid circles). The slippage rate for (GAC)16(GTC)4 (open circle) is equivalent to that of (CAG)16(CTG)4 (solid circle), although the first transition T_m is 2 °C higher, suggesting that the ability to slip by one triplet as well as two enhances the rate.
Decline in Slippage Rate with Increasing Number of Triplets Added-While k_0 measures the slippage rate within the initial blunt hairpin A_0, the k_n values obtained for n = 1, 2, etc. measure the slippage rates within the extended blunt-end hairpins, A_n plus n triplets. The k_n values found in each case, by applying linear versions of Equation 6 under "Experimental Procedures," are shown plotted against increasing n (Fig. 5). As expected, extension increases hairpin stability, so k_n decreases as n increases. The k_n values decline monotonically to similar low values at n = 10 for the two complementary cases with CTG and CAG triplets and to similarly low values at n = 11 for the sister cases with GTC and GAC triplets. For n > 11, the k_n value shows a more pronounced decline on a log scale (not shown), in keeping with our expectation that primer triplets entering the hairpin bend lose base pairing with template triplets, thereby making less stable bends as in loops A_12 to A_15 (Fig. 1, a-d).

FIG. 2. Comparison of (CAG)16(CTG)4 and (CTG)16(CAG)4 melting transitions and corresponding band patterns showing triplets added with increasing time of DNA polymerase reaction. a, melting curves obtained by plotting UV absorbance A260 versus temperature, at the same (2 μM) strand concentration in low salt buffer, showing that (CAG)16(CTG)4 has lower first T_m than (CTG)16(CAG)4 (52 versus 56 °C) but the same second T_m (79 °C). b, patterns of bands on denaturing gel, obtained by reaction with DNA polymerase KF exo− at 37 °C, using 0.4 μM each of three dNTPs (N = C, G, and A or T) required for correct extension of primer triplets on template triplets. The outer lanes marked ddG show bands corresponding to 1, 2, etc. primer triplets added, found with ddGTP included in the reaction mixture to cause termination with dideoxyguanosine.

FIG. 3. Comparison of (GAC)16(GTC)4 and (GTC)16(GAC)4 melting transitions and corresponding band patterns showing triplets added with increasing time of DNA polymerase reaction. a, melting curves obtained by plotting UV absorbance A260 versus temperature at the same (2 μM) strand concentration in low salt buffer, showing that (GAC)16(GTC)4 has higher first T_m than (GTC)16(GAC)4 (54 versus 47 °C) but the same second T_m (78 °C). b, patterns of bands on denaturing gel, obtained by reaction with DNA polymerase KF exo− at 37 °C, using 0.4 μM each of three dNTPs (N = C, G, and T or A) required for correct extension of primer triplets on template triplets. The outer lanes marked ddC show bands corresponding to 1, 2, etc. primer triplets added, found by including ddCTP in the reaction mixture to cause termination with dideoxycytidine.
Estimated Amounts of Loops Present Initially-Using Equation 6 with observed k_n values, we extrapolate each band intensity I_n to time 0 of reaction, in order to estimate I_n^0, the amount of loop A_n initially present. The I_n^0 values shown plotted versus n in a histogram (Fig. 6) were obtained using gel band patterns like those in Figs. 2b and 3b, but for shorter (15- rather than 30-s) intervals of reaction time, starting at 15 s (data not shown).
Loop Stability in Relation to Base Stacking-If the stabilities of hairpin loops depended simply on the number of possible hydrogen-bonded bp shown (Fig. 1, a-d), then the predicted order of stability would be A_0 > A_1 > A_2, etc. In this case, initial band intensities should decrease in the order I_0^0 > I_1^0 > I_2^0, etc., and slippage by one triplet (A_0 → A_1) should require less energy and occur more readily than slippage by two triplets (A_0 → A_2). As seen in Fig. 6, none of the four cases examined conforms to this simple model. Only in one case, (GAC)16(GTC)4, is I_1^0 greater than I_2^0, and in this case, I_1^0 is also greater than I_0^0. In the other three cases, I_1^0 is much less than I_0^0 and also less than I_2^0, indicating that odd-numbered loops are less stable than even-numbered ones. Since the four cases have the same hydrogen bonding in different nearest neighbor sequence contexts, we see that nearest neighbor base stacking has a strong influence on the even-odd character of hairpin loops and their stabilities.
We note that the even-numbered loops A_0, A_2, ..., A_8 each have a four-base bend held by two stacked bp (two adjacent dots) next to it (Fig. 1, a-d). The odd-numbered loops, A_1, A_3, ..., A_9, however, each have a three-base bend held by only a single bp (lone dot). If this lone bp is unstable because of unfavorable base stacking, there will be seven rather than three unpaired bases in the odd-numbered bend. In this case, with seven versus four unpaired bases in odd versus even bends, A_1 may become less stable than A_2, A_3 less stable than A_4, etc. so that slippage in steps of two triplets becomes favorable (A_0 → A_2 → A_4 etc.). This kind of behavior, which we first observed (4) for (CTG)16(CAG)4 (Fig. 1a), we now also observe for (GTC)16(GAC)4 and (CAG)16(CTG)4 (Fig. 1, b and c) but not for (GAC)16(GTC)4 (Fig. 1d). To explain this, we examine the stacking of nearest neighbor doublets of base pairs and mispairs.
Evaluation of Nearest Neighbor Doublet Stabilities-Both CAG repeats (CAGCAG...) and CTG repeats (CTGCTG...) contain the nearest neighbor doublet, GC. This self-complementary doublet (5′-GC-3′), when hydrogen-bonded to its own kind in antiparallel, forms the strongly stacked bp doublet, 5′-GC/CG-5′, which has the highest T_m value found for nearest neighbor doublets of DNA base pairs in low to physiological salt concentrations (24). The T_m value obtained for the GC/CG bp doublet is almost as high in 0.02 M salt (136 °C) as in 1 M salt, 139 °C (Table I).
The strong base stacking within the bp doublet GC/CG is largely responsible for the stability of both the p/t duplex domains of (CTG)16(CAG)4 and (CAG)16(CTG)4 and their t/t loop domains (Fig. 1, a and c). The p/t duplex in each case has the same kind of repeating unit, in which the strongly stacked GC/CG doublet is accompanied by two correct but less strongly stacked bp doublets, CT/GA and CA/GT, with T_m values of only 58 and 55 °C, respectively, in 0.02 M salt (Table I). The observed T_m for the p/t duplex in each case, 79 °C (Fig. 2a), is close to the average value for these three doublets, (136 °C + 58 °C + 55 °C)/3 = 83 °C. The latter is the predicted T_m for an infinitely long duplex of CAG/CTG repeats in 0.02 M salt (24).

FIG. 4. Rate constant k_0 for A_0 slippage and expansion in relation to first T_m in melting profile. Results shown as solid circles are for the three cases that predominantly form even-numbered loops, mainly A_0, with k_0 measuring slippage by two triplets, A_0 → A_2. These results indicate that k_0 declines with increasing T_m of the first transition seen in Figs. 2a and 3a. The lone open circle is for the case of (GAC)16(GTC)4, which favors odd-numbered loops, mainly A_1 (Fig. 1d). In this case, k_0 measures slippage by both one triplet (A_0 → A_1) and two triplets (A_0 → A_2).

FIG. 5. Decline in slippage rate with increasing number of triplet repeats added by DNA polymerase. The rate constant k_n for slippage and extension of blunt-end hairpin, A_n plus n triplets, was evaluated from changes in band intensity I_n with DNA polymerase reaction time, as described by Equation 6, using gel data shown (Figs. 2b and 3b) and additional data at 0.25-min intervals (not shown).

FIG. 6. Initial amounts of even- and odd-numbered loops A_n obtained for CAG/CTG and GAC/GTC triplet repeat sequences. The results shown are expressed as percentage of total intensity of bands found by extrapolation to zero reaction time with DNA polymerase, using rate constants (Fig. 5) and band intensities observed in short reactions of 0.25 min (not shown) and 0.5 min (Figs. 2b and 3b).
In the t/t loop domains of (CAG)16(CTG)4, the GC/CG doublet is flanked by A/A mispairs (Fig. 1c). The poor stacking of A/A between C/G and G/C bp yields the weak mispaired doublets, CA/GA and AG/AC, which are equivalent. Considering the observed T_m of 52 °C (Fig. 2a) as approximately the average value for all of the doublets involved in t/t domain melting, we can evaluate T_m for CA/GA as follows.
We can also evaluate T_m for CA/GA by including the observed T_m for p/t duplex (79 °C, Fig. 2a), described as follows.
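The equations for these two evaluations are not reproduced in this text. The sketch below shows one reading of them, under two stated assumptions: the T_m of a repeating unit is the simple average of its three doublet T_m values (with CA/GA and AG/AC contributing equally), and the second evaluation replaces the tabulated GC/CG value with an effective value chosen so that the p/t average matches the observed 79 °C. Under these assumptions the two estimates bracket the 13 ± 3 °C value that the text goes on to report for CA/GA, but the actual equations used may differ.

```python
# Doublet melting temperatures in 0.02 M salt, as quoted in the text (deg C).
TM_GC_CG = 136.0
TM_CT_GA = 58.0
TM_CA_GT = 55.0

# Observed transition temperatures for (CAG)16(CTG)4 (deg C).
TM_PT_OBSERVED = 79.0  # second transition, p/t duplex
TM_TT_OBSERVED = 52.0  # first transition, t/t loop domain


def mean(values):
    return sum(values) / len(values)


# Predicted T_m of an infinitely long CAG/CTG duplex: average of the three doublets.
tm_pt_predicted = mean([TM_GC_CG, TM_CT_GA, TM_CA_GT])

# Assumed method 1: t/t unit = GC/CG plus two equivalent mispaired doublets,
# so 3 * T_m(t/t) = T_m(GC/CG) + 2 * T_m(CA/GA).
tm_ca_ga_method1 = (3 * TM_TT_OBSERVED - TM_GC_CG) / 2

# Assumed method 2: rescale GC/CG so that the p/t average equals the observed 79 deg C.
tm_gc_cg_effective = 3 * TM_PT_OBSERVED - TM_CT_GA - TM_CA_GT
tm_ca_ga_method2 = (3 * TM_TT_OBSERVED - tm_gc_cg_effective) / 2

midpoint = mean([tm_ca_ga_method1, tm_ca_ga_method2])
half_range = abs(tm_ca_ga_method2 - tm_ca_ga_method1) / 2

print(tm_pt_predicted, tm_ca_ga_method1, tm_ca_ga_method2, midpoint, half_range)
```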
The two methods of evaluation agree that CA/GA has a very low T m value, 13 ± 3°C in 0.02 M salt. From this value, we predict a corresponding low enthalpy change for CA/GA doublet melting, ΔH° = 1.0 ± 0.3 kcal/mol (Table I). Thus, in low salt (0.02 M), we find that GA/CA is the strongest, while CA/GA is the weakest, of the mispaired doublets formed in hairpin folds of CAG/CTG and GAC/GTC triplet repeats.
DISCUSSION
The major (CAG/CTG) class of DNA triplet repeats found expanded in neurodegenerative diseases has the ability to form stable hairpin loops on each strand, as shown in previous studies (21,22,25). The formation of such loops when repeating sequences are being replicated or repaired in vivo may promote primer-template slippage like that demonstrated here in vitro, enabling DNA polymerase to catalyze repeat expansions (4,26,27). In the present study, comparing self-priming loops of (CTG) 16 (CAG) 4 and (CAG) 16 (CTG) 4, we find that even-numbered loops starting with A 0 (Fig. 1, a and c) are strongly favored in both cases, promoting slippage and expansion in steps of two triplets, A 0 → A 2 → A 4, etc. (Fig. 2b). The CAG loop domains (Fig. 1c), having a 4°C lower melting temperature (52 versus 56°C) (Fig. 2a), also show a 4-fold higher rate of slippage and expansion (Figs. 2b and 5). As seen in Fig. 5, slippage rate declines as hairpin structures become more stable with increasing numbers of triplets added, adding support to the idea that slippage initiates in less stable regions of loop structure (4).
By replacing CAG/CTG with GAC/GTC, we find that even-membered loop domains are still favored by GTC repeats (Fig. 1b) but not by GAC repeats (Fig. 1d). Compared with even-numbered CTG loops (Fig. 1a), the GTC loops are considerably less stable and much more prone to slippage, melting at a 9°C lower temperature (47°C, Fig. 3a) and showing a 6-fold higher rate of slippage and expansion (Figs. 3b and 5).
For GTC, CAG, and CTG triplets, all of which form even-numbered loops almost exclusively (Fig. 6), a trend of decreasing slippage rate with increasing loop domain melting temperature is indicated (Fig. 4, solid circles).
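The inverse relationship between loop melting temperature and slippage rate for the three even-loop cases can be illustrated with the numbers given in the text. In this sketch the rates are the fold-changes relative to CTG loops quoted above, not the absolute k 0 values plotted in Fig. 4, and the function name is hypothetical.

```python
# (repeat, loop-domain T m in °C, slippage rate relative to CTG loops).
# T m values and fold-changes are those quoted in the text; the
# absolute rate constants of Fig. 4 are not used here.
even_loop_cases = [
    ("GTC", 47.0, 6.0),
    ("CAG", 52.0, 4.0),
    ("CTG", 56.0, 1.0),
]

def rate_falls_as_tm_rises(cases):
    """Return True if the rate strictly decreases as T m increases."""
    ordered = sorted(cases, key=lambda row: row[1])  # sort by T m
    rates = [row[2] for row in ordered]
    return all(a > b for a, b in zip(rates, rates[1:]))

print(rate_falls_as_tm_rises(even_loop_cases))  # True
```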
The GAC triplets, on the other hand, forming odd-numbered as well as even-numbered loops, appear to slip somewhat more rapidly than expected (Fig. 4, open circle). To explain why even-numbered loops are strongly favored by CAG, CTG, and GTC triplet repeats, but not by GAC repeats, we examine the base stacking properties of nearest neighbor doublets of base pairs and mispairs formed in each case.
Stacking of Nearest Neighbor Doublets-In the p/t duplex formed by complementary primer and template triplet repeats, the stacking of Watson-Crick bp can be described in terms of three bp doublets. For CAG/CTG repeats, these doublets are GC/CG, CA/GT, and CT/GA; for GAC/GTC repeats, they are CG/GC, GA/CT, and GT/CA. As seen in Table I, GC/CG is much stronger than CG/GC in 0.02 M salt, but CA/GT and CT/GA are considerably weaker than GA/CT and GT/CA. The resultant average doublet strength is nearly the same in each case. This similarity in average doublet T m explains why the p/t duplex melts at a similar high temperature (~79°C) in all four cases examined (Figs. 2a and 3a). Because the p/t duplex is about equally stable in all cases, p/t slippage is probably controlled by the less stable t/t domain, whose T m shows an inverse relationship to slippage rate (Fig. 4).
The hairpin loop t/t domains have lower melting temperatures because they have mispaired bases (A/A or T/T) in place of correct (A/T or T/A) base pairs. The stacking of a correct C/G bp followed by a mispair of type A/A or T/T corresponds to doublet CA/GA or CT/GT, respectively, while the stacking of G/C followed by A/A or T/T corresponds to GA/CA or GT/CT, respectively. As shown (Table I), CA/GA stacking is very weak in 0.02 M salt (T m = 13°C), whereas GA/CA stacking is comparatively very strong (T m = 50°C). This large difference in base stacking helps explain why GAC repeats form much more stable odd-numbered loops than CAG repeats.
Even-Odd Character of Hairpin Bends Explained by Mispair Stacking Energies-In loop domains of CAG or CTG repeats, each strongly stacked GC/CG doublet (T m = 136°C) is accompanied by two weakly stacked doublets of type CA/GA (T m = 13 ± 3°C) or CT/GT (T m = 19 ± 3°C). Because the latter doublets are so poorly stacked (T m well below 37°C), we see that the lone C/G bp holding the three-base CTG or CAG bend in odd-numbered loops (Fig. 1, a and c) is probably unstable at physiological temperature. Accordingly, odd-numbered loops starting with A 1 should have seven unpaired bases rather than only three (Fig. 1, a and c). Thus, with more unpaired bases in the bend, A 1 becomes less stable than A 2 as well as A 0, both of which have similar four-base bends firmly held by the strongly stacked GC/CG doublet (Fig. 1, a and c). The enlarged odd-numbered bend with poor stacking is probably the reason why odd-numbered loops are so disfavored for (CTG) 16 (CAG) 4 and (CAG) 16 (CTG) 4, so that slippage occurs primarily in even-numbered steps of two triplets (Fig. 2b).
In GAC and GTC loop domains, on the other hand, the corresponding mispaired doublets, GA/CA and GT/CT, have much better stacking as indicated by their high T m values of 50 ± 6 and 40 ± 6°C, respectively (Table I). The T m of GA/CA appears high enough to maintain a stable 3-base bend in odd-numbered loops as shown (Fig. 1d). However, GT/CT, being only marginally stable at 37°C (T m = 40°C), is less likely to keep such a bend (Fig. 1b) from opening up to seven unpaired bases. This may be the reason why GTC repeats still prefer even-numbered loops, whereas GAC repeats favor odd-numbered loops (Figs. 3b and 6).
Remarkably, GA/CA has T m and ΔH° values almost as high as the normal doublets, CA/GT and CT/GA, in 0.02 M salt (Table I). This means that the A/A mispair, although not hydrogen-bonded, is strongly stacked in the doublets GA/CA and AC/AG, which are equivalent. With T m = 50 ± 6°C, well above 37°C, the stacking appears strong enough to hold the lone G/C bp needed to form a stable 3-base bend, as shown (Fig. 1d) for (GAC) 16 (GTC) 4.
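The even/odd argument amounts to comparing each mispaired doublet's T m with the reaction temperature of 37°C. The sketch below makes that comparison explicit using the T m values and uncertainties quoted in the text. The three-way rule (treating a doublet as "marginal" when 37°C falls inside its uncertainty range) is an illustrative assumption, not a criterion stated by the authors.

```python
REACTION_TEMP = 37.0  # °C, temperature of the polymerase reactions

# Mispaired doublet: (T m, uncertainty) in °C, 0.02 M salt, from the text.
mispair_doublets = {
    "CA/GA": (13.0, 3.0),  # CAG loops
    "CT/GT": (19.0, 3.0),  # CTG loops
    "GA/CA": (50.0, 6.0),  # GAC loops
    "GT/CT": (40.0, 6.0),  # GTC loops
}

def three_base_bend_status(tm, err, temp=REACTION_TEMP):
    """Classify whether the doublet can hold a 3-base bend at `temp`.

    Assumed rule: 'stable' if the whole uncertainty range lies above
    temp, 'unstable' if it lies below, otherwise 'marginal'.
    """
    if tm - err > temp:
        return "stable"
    if tm + err < temp:
        return "unstable"
    return "marginal"

status = {name: three_base_bend_status(tm, err)
          for name, (tm, err) in mispair_doublets.items()}
for name, result in status.items():
    print(name, result)
# CA/GA unstable
# CT/GT unstable
# GA/CA stable
# GT/CT marginal
```

With this rule, only GA/CA is classed as able to hold a 3-base bend, in line with the observation that GAC repeats alone favor odd-numbered loops.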
Dependence on Salt Concentration-It has been known for some time that normal doublets CA/GT, CT/GA, and CG/GC show similar percentage increases in T m (°C) as salt concentration is raised (24). For Na+ raised from 0.02 to 1 M, the increase is about 100% in all three cases (Table I), so we expect a 100% increase for CA/GA and CT/GT as well. On the other hand, no increase for GA/CA and GT/CT is expected, since GT/CA, GA/CT, and GC/CG show almost no sensitivity to salt concentration. The resultant ΔH° values calculated by Equation 10 from T m in 1 M Na+ (Table I, last column) are consistent with recently published values (28), shown in parentheses.
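The salt extrapolation described above can be sketched numerically. This example assumes the "about 100%" increase is applied as an exact doubling of T m in °C for the salt-sensitive mispaired doublets and that the other two are unchanged. The resulting figures are illustrative and are not taken from Table I.

```python
# Mispaired doublet T m values (°C) in 0.02 M salt, from the text.
TM_LOW_SALT = {"CA/GA": 13.0, "CT/GT": 19.0, "GA/CA": 50.0, "GT/CT": 40.0}

# Doublets expected to respond to salt like CA/GT, CT/GA, and CG/GC.
SALT_SENSITIVE = {"CA/GA", "CT/GT"}

def tm_in_1m_salt(doublet):
    """Estimate T m in 1 M Na+ under the assumed doubling rule."""
    tm = TM_LOW_SALT[doublet]
    # Assumption: a 100% increase in T m (°C) for sensitive doublets,
    # no change for the others.
    return tm * 2.0 if doublet in SALT_SENSITIVE else tm

tm_high_salt = {name: tm_in_1m_salt(name) for name in TM_LOW_SALT}
print(tm_high_salt)
# {'CA/GA': 26.0, 'CT/GT': 38.0, 'GA/CA': 50.0, 'GT/CT': 40.0}
```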
Conclusion-Our analysis of melting profiles and polymerase extensions of self-priming sequences of DNA triplet repeats reveals several novel features of the CAG/CTG triplet class implicated in disease. (a) The intrastrand loops formed by this class of triplet repeats are stabilized largely by the strongly stacked GC/CG bp doublet. (b) The accompanying doublets, CA/GA and CT/GT, containing A/A and T/T mispairs, respectively, are very weakly stacked in low salt solution. (c) The weak stacking of CA/GA and CT/GT destabilizes the C/G bp that holds three-base bends of odd-numbered loops, making the four-base even-numbered bends much more favorable for both CAG repeats and CTG repeats. (d) The weaker stacking of CA/GA than CT/GT makes CAG loop domains less stable and more slippery than their CTG counterparts, resulting in a 4°C lower melting temperature and a 4-fold higher rate of repeat expansion by slippage in DNA polymerase reactions.
We also find that GAC/GTC triplet repeats, not yet implicated in disease, have very different base-stacking properties, although forming similar hydrogen-bonded base pairs. (a) Their intrastrand loops are mainly stabilized by the correct doublet, CG/GC, which has considerably weaker base stacking than GC/CG in low salt. (b) The weaker stacking of CG/GC is compensated by stronger stacking in the accompanying doublets with mispairs, GA/CA and GT/CT. (c) The stacking of GA/CA in low salt is surprisingly high, comparable with that of normal CA/GT and CT/GA doublets in correct duplexes of CAG and CTG repeats. (d) The strong GA/CA stacking preferentially stabilizes odd-numbered GAC loops and makes them equivalent to even-numbered CAG loops in stability, although CG/GC is less stable than GC/CG. (e) The GT/CT doublet, being weaker than GA/CA, is unable to stabilize odd-numbered bends sufficiently, so that GTC repeats still primarily form even-numbered loops, which melt 9°C lower than CTG loops and show a 6-fold higher rate of expansion by slippage.
Several models have been proposed to indicate how hairpin loop formation may contribute to the expansion of trinucleotide repeats with DNA polymerase (4,26,27,29). The data presented here should be helpful in testing such models. Our data reveal that nearest neighbor base stacking has a strong influence on loop stability and slippage rate, leading to repeat expansions. These data were obtained in 0.02 M salt in order to observe the melting of both loop and duplex domains in our self-priming sequences of triplet repeats. Since salt can strongly affect the stacking of base pairs and mispairs, additional data at salt concentrations closer to 0.15 M are needed to make more reliable predictions of DNA triplet-repeat folding and slippage under physiological conditions.