Methyl-directed DNA Mismatch Correction

In 1964 Robin Holliday (1) proposed the correction of DNA base pair mismatches within recombination intermediates as the basis for gene conversion. The existence of the mismatch repair systems implied by this proposal is now well established. Activities that recognize and process base pairing errors within the DNA helix have been identified in bacteria, fungi, and mammalian cells. However, the functions and mechanisms of such systems are best understood in Escherichia coli (2-6), an organism that possesses at least three distinct mismatch correction pathways. These three systems are involved not only in the processing of recombination intermediates (5, 7, 8) but also contribute in a major way to the genetic stability of the organism, a function anticipated for mismatch repair by Tiraby and Fox (9) and by Wagner and Meselson (10). The significance of mismatch correction in the maintenance of low spontaneous mutability becomes apparent when one considers that seven E. coli mutator genes (dam, mutD, mutH, mutL, mutS, mutU, and mutY) have been implicated in mismatch repair (6, 8, 11-15). This minireview will summarize information on the most extensively studied E. coli system for mismatch correction, the methyl-directed pathway for processing of DNA biosynthetic errors and intermediates in genetic recombination. A discussion of other E. coli mismatch correction systems may be found in the recent literature (6, 8, 16) and in several recent reviews (2-5, 7). Mismatch repair pathways in other organisms and descriptions of the structural properties of mispaired bases may also be found in several of these reviews (2, 4).


Paul Modrich
From the Department of Biochemistry, Duke University Medical Center, Durham, North Carolina 27710

Biology of Methyl-directed DNA Mismatch Correction
Methyl-directed Processing of Biosynthetic Errors-The mutator phenotype associated with genetic defects in mismatch repair implies that correction serves to eliminate base pairing errors that would otherwise be fixed as mutations. The existence of the methyl-directed pathway was predicted in 1976 by Wagner and Meselson (10) who suggested that strand-specific processing of mismatches within newly synthesized DNA could serve to eliminate DNA biosynthetic errors and thus contribute to the overall fidelity of chromosome replication (Fig. 1). This proposal included the suggestion that the requisite strand specificity could be based on a special relationship between the repair system and the replication complex or, since DNA methylation is postsynthetic, on the transient undermethylation of newly synthesized sequences.
The properties of the E. coli methyl-directed pathway are in complete accord with the Wagner-Meselson hypothesis. Transfection of E. coli with artificially constructed DNA heteroduplexes, in which the two strands were in defined states of modification, has established the existence of a strand-specific mismatch repair pathway that is directed by the state of adenine methylation within d(GATC) sequences (14, 17-22). Transfection with heteroduplexes methylated on only one DNA strand (the hemimethylated analogue of newly replicated DNA) has shown that mismatch repair occurs on the unmethylated strand, with the modified strand serving as template. When neither strand bears d(GATC) modification, mismatch correction occurs but in this case displays little strand bias. Heteroduplexes methylated on both DNA strands (the analogue of resting DNA) are refractory to repair, with exceptions to this rule reflecting the action of alternate correction systems (4-6, 8, 23, 24). In addition to establishing effects of d(GATC) methylation, such experiments have implicated mutH, mutL, mutS, and mutU (mutU is also referred to as uvrD) gene products in methyl-directed correction (11-15) and have suggested that methyl-directed correction occurs by an excision repair mechanism in which several thousand nucleotides of the unmethylated DNA strand are excised and resynthesized (10).
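The dependence of strand choice on d(GATC) methylation described above can be summarized as a small decision rule. The following Python sketch is illustrative only: the function name and the strand labels "top" and "bottom" are assumptions introduced here, and the rule ignores the alternate correction systems that account for the exceptions noted in the text.

```python
def repaired_strand(top_methylated: bool, bottom_methylated: bool) -> str:
    """Return which strand of a mismatched heteroduplex is expected to be corrected.

    Encodes the transfection results summarized in the text:
    - both strands methylated (resting DNA): refractory to repair
    - hemimethylated (newly replicated DNA): the unmethylated strand is repaired
    - unmethylated: repair occurs with little strand bias
    """
    if top_methylated and bottom_methylated:
        return "none"
    if top_methylated:
        return "bottom"   # methylated top strand serves as template
    if bottom_methylated:
        return "top"      # methylated bottom strand serves as template
    return "either"


if __name__ == "__main__":
    for state in [(True, True), (True, False), (False, True), (False, False)]:
        print(state, "->", repaired_strand(*state))
```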
Transfection analysis has also shown that the efficiency of methyl-directed repair depends on the nature of the mismatch, thus proving that mismatch recognition occurs during the course of the reaction. Of the eight possible base-base mispairs, the G-T and A-C transition mismatches and the G-G and A-A transversion mispairs are typically well corrected, while the T-T, C-T, and G-A transversion pairing errors are weaker substrates (18, 21, 25). The C-C transversion mismatch appears to be subject to little if any methyl-directed repair. The specificity of the methyl-directed system is not restricted to base-base mismatches because transfection experiments have also demonstrated efficient correction of mismatches involving insertion/deletion of a few nucleotides (20, 26).
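The classification of the eight base-base mispairs given above can be written as a lookup table. This is a minimal sketch: the category labels and the function names are chosen here for illustration, and the table records only the typical behaviour stated in the text, not sequence-context effects.

```python
PURINES = {"A", "G"}

# Typical substrate quality of each mispair, as stated in the text.
# frozenset keys make the lookup independent of base order (G-T == T-G).
EFFICIENCY = {
    frozenset("GT"): "well corrected",
    frozenset("AC"): "well corrected",
    frozenset("GG"): "well corrected",
    frozenset("AA"): "well corrected",
    frozenset("TT"): "weak substrate",
    frozenset("CT"): "weak substrate",
    frozenset("GA"): "weak substrate",
    frozenset("CC"): "little or no repair",
}


def mismatch_kind(a: str, b: str) -> str:
    """Purine-pyrimidine mispairs are transition mismatches; all others are transversions."""
    return "transition" if (a in PURINES) != (b in PURINES) else "transversion"


def describe_mismatch(a: str, b: str) -> tuple:
    """Return (kind, typical efficiency) for a base-base mispair."""
    key = frozenset((a, b))
    if key not in EFFICIENCY:
        raise ValueError(f"{a}-{b} is not one of the eight base-base mispairs")
    return mismatch_kind(a, b), EFFICIENCY[key]


if __name__ == "__main__":
    for pair in ["GT", "AC", "GG", "AA", "TT", "CT", "GA", "CC"]:
        print(pair[0] + "-" + pair[1], describe_mismatch(*pair))
```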
Methyl-directed Repair as an Antirecombination Activity-In addition to its presumed role in replication fidelity, evidence is also available implicating methyl-directed mismatch correction in the processing of recombination intermediates. Feinstein and Low (27) have demonstrated that mutants deficient in methyl-directed mismatch repair display elevated frequencies of genetic recombination when acting as recipients in Hfr crosses. Radman (7) has suggested that such findings are due to an antirecombination activity of the methyl-directed pathway. According to this proposal, provocation of methyl-directed repair by mismatches located in regions of heteroduplex would lead to excision of the invading strand within a recombination intermediate resulting in abortion of the exchange event. This is not unreasonable because excision repair tracts associated with methyl-directed correction may be several kilobases in length (10, 28, 29) and because the analogue of the methyl-directed pathway in Streptococcus pneumoniae was identified by virtue of its ability to destroy recombination intermediates in precisely this manner (2). The implication of the antirecombination hypothesis is that methyl-directed correction functions to ensure the fidelity of not only chromosome replication but genetic exchanges as well.

Mechanism of Methyl-directed Mismatch Correction
Analysis of the mechanism of the methyl-directed reaction has been facilitated by the development of a biochemical assay that permits mismatch correction to be scored in a cell-free system (17, 28). This method, which is illustrated in Fig. 2 and is based on the placement of mismatches within recognition sites for restriction endonucleases, utilizes 6.4-kilobase heteroduplexes derived from bacteriophage f1. Such molecules are subject to repair in E. coli extracts by a reaction that displays the features of methyl-directed mismatch correction observed in biological experiments (17, 23, 30).
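The logic of this assay, as detailed in the legend to Fig. 2, can be sketched in a few lines of Python. The 9-nucleotide sequence below is an assumption constructed here so that the HindIII (AAGCTT) and XhoI (CTCGAG) recognition sequences overlap at the mismatched position; it is not taken from the cited work, and the function and variable names are likewise illustrative.

```python
# Recognition sequences of the two endonucleases (both are palindromic,
# so scanning one strand of the duplex is sufficient).
HINDIII_SITE = "AAGCTT"
XHOI_SITE = "CTCGAG"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical viral (V) strand, written 5'->3'. Index 5 is the mismatched
# position: T on the V strand lies opposite G on the complementary (C) strand.
V_STRAND = "AAGCTTGAG"
MISMATCH_INDEX = 5
C_STRAND_BASE = "G"


def score_repair(corrected_strand):
    """Return the restriction sensitivity expected after correction.

    corrected_strand is "C" (C strand repaired, V strand as template),
    "V" (V strand repaired, C strand as template), or None (no repair).
    """
    if corrected_strand is None:
        return "resistant"  # the mispair blocks cleavage by both enzymes
    if corrected_strand == "C":
        # The C-strand G is replaced by A, so the V strand is unchanged.
        v_after = V_STRAND
    elif corrected_strand == "V":
        # The V-strand T is replaced by the complement of the C-strand G.
        new_base = COMPLEMENT[C_STRAND_BASE]
        v_after = V_STRAND[:MISMATCH_INDEX] + new_base + V_STRAND[MISMATCH_INDEX + 1:]
    else:
        raise ValueError("corrected_strand must be 'C', 'V', or None")
    if HINDIII_SITE in v_after:
        return "HindIII-sensitive"
    if XHOI_SITE in v_after:
        return "XhoI-sensitive"
    return "resistant"


if __name__ == "__main__":
    for strand in ("C", "V", None):
        print(strand, "->", score_repair(strand))
```

Because each outcome produces a different cleavable site, digestion with the two enzymes reports not only whether repair occurred but also which strand was corrected.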
Requirements for Methyl-directed Repair in Vitro-Repair of f1 heteroduplexes in E. coli extracts mirrors the response to d(GATC) modification observed in transfection experiments. Repair of hemimethylated molecules is restricted to the unmethylated strand, correction of unmethylated heteroduplexes occurs with little strand bias, and DNAs modified on both strands are extremely poor substrates for repair (17, 23, 28, 30, 31). The cell-free reaction also displays the same genetic requirements observed in vivo. Extracts prepared from mutH, mutL, mutS, or mutU (uvrD) mutant strains are inactive, but mixed extracts complement in vitro (17). Exploitation of the cell-free assay has also implicated the E. coli single-strand binding protein (SSB) and the γ/τ subunits of the replicative DNA polymerase III holoenzyme in methyl-directed repair (4, 28, 32). The involvement of SSB in mismatch correction in vivo has not been addressed, but biological experiments have provided additional evidence for involvement of DNA polymerase III in the reaction. Schaaper (15) has recently demonstrated that a mutation in the mutD locus, which encodes the ε subunit and editing exonuclease of DNA polymerase III (33), also results in a defect in methyl-directed mismatch correction in vivo.

Fig. 2. The assay for correction is based on placement of base-base mismatches within restriction endonuclease recognition sites (17, 23, 30). In the example shown a G-T mismatch resides within overlapping sequences recognized by HindIII and XhoI endonucleases. Although the presence of the mispair renders this site resistant to cleavage by either endonuclease, repair occurring on the complementary (C) DNA strand yields an A-T base pair and generates a HindIII-sensitive site, whereas correction on the viral (V) strand produces a G-C pair and XhoI sensitivity. The shorter distance between the mismatch and the d(GATC) site in the molecule shown is 1024 base pairs. The mispair is nevertheless subject to efficient methyl-directed repair by E. coli extracts and by the purified system described in the text (23, 32).

The abbreviation used is: SSB, single-strand binding protein.

The complementation between mut⁻ extracts has permitted isolation of MutH (25-kDa polypeptide), MutL (70-kDa), and MutS (97-kDa) proteins in near-homogeneous and biologically active forms (34-36). These three proteins together with the DNA helicase II product of the mutU (uvrD) gene (37-39), SSB, and DNA polymerase III holoenzyme are able to mediate methyl-directed mismatch correction, but this reaction is inhibited by the presence of DNA ligase and occurs with lower efficiency than the reaction observed in crude extracts (32). This finding has led to isolation of a 55-kDa stimulatory protein that also may be involved in methyl-directed repair. In the presence of ATP, the four deoxynucleoside triphosphates, and NAD+ cofactors, this set of proteins (Fig. 1) catalyzes a methyl-directed reaction that is identical in terms of efficiency and substrate specificity to that observed in E. coli lysates (32). DNA ligase is not required for mismatch repair per se, but correction by the purified system in the presence of this activity restores the repaired strand to a covalently closed state.

Substrate Specificity of Methyl-directed Mismatch Correction-Cell-free systems, including the purified system, have been found to process G-T, A-C, G-G, A-A, A-G, T-T, and T-C in a methyl-directed manner. As observed in biological experiments, some A-G, T-T, and T-C mispairs are weak substrates for in vitro repair, a consequence of the sequence environment in which the mismatch is embedded (2-4, 21). Several C-C mismatches have been tested, but none has been subject to significant methyl-directed repair in vitro (23, 24, 32), findings that also parallel those obtained in biological experiments.
The nature of d(GATC) site involvement in the reaction has been clarified by use of heteroduplexes containing only a small number of d(GATC) sequences. Thus f1 heteroduplexes containing four d(GATC) sites were subject to methyl-directed repair both in vivo and in vitro, despite the fact that the distance between the mismatch and the closest d(GATC) sequence in these experiments was 1000 base pairs (17). Moreover, analysis of repair DNA synthesis associated with correction of these molecules indicated that incision of an unmethylated DNA strand in the vicinity of a d(GATC) site is associated with methyl-directed repair (28). This led to the suggestion that cleavage of the unmethylated strand at a d(GATC) site could represent the basis of strand-specific excision repair (28). As discussed below, this idea has received additional support and is a key feature of current models for the mechanism of the methyl-directed reaction.
Direct involvement of d(GATC) sequences in methyl-directed repair predicts that heteroduplexes lacking such sequences would be refractory to correction by the mutHLSU pathway. This has been confirmed in vivo (40) and in vitro using extracts (30) and the purified system (32). Simple heteroduplexes containing a small number of d(GATC) sequences have also been used to examine the dependence of methyl-directed repair on the number of such sites and their location relative to the mismatch. Analysis by Lahue et al. (30) of a set of highly homologous f1 heteroduplexes containing zero, one, two, or four d(GATC) sites demonstrated that the presence of a single d(GATC) sequence can be sufficient to support efficient methyl-dependent, mutHLS-dependent correction in vitro. A different result has been obtained upon transfection of E. coli with φX174 derivatives containing one or two d(GATC) sites (41). The two-site heteroduplex was subject to mismatch correction, whereas the molecule containing only one d(GATC) sequence was not detectably repaired in vivo. The basis of the differing response of single d(GATC) site molecules in vitro and in vivo is not clear. While the cell-free reaction mimics in vivo repair in terms of genetic requirements, mismatch specificity, and strand-specific response to d(GATC) methylation, it may not completely duplicate biological correction. Alternatively, the discrepancy may reflect the fact that only one heteroduplex containing a single d(GATC) site has been tested by transfection assay (41). In vitro analysis has shown that individual d(GATC) sequences vary greatly in their ability to promote correction, a presumed consequence of the sequence environment in which such sites are embedded and/or their location relative to the mismatch (30). In fact, local environment of d(GATC) sequences may be a major factor controlling the activity of such sites in mismatch repair. 
Different d(GATC) sequences are subject to differential recognition by the mutH gene product (as discussed below, the protein responsible for strand discrimination), with the efficiency of recognition of individual sites by this protein roughly correlating with their propensity to promote a repair event (35). Given the relative insensitivity of the transfection assay, the failure to detect in vivo correction of a heteroduplex containing one d(GATC) sequence (41) could reflect the presence of a "weak" site within the single construct tested by this method.

MutH, MutL, and MutS Proteins and Initiation of Methyl-directed Repair-As discussed above, methyl-directed mismatch correction is mediated by eight proteins (32) and involves recognition of DNA sites that can be separated by a thousand base pairs or more (17, 30, 32, 40). The overall mechanism of the reaction remains to be established, but events involved in initiation of correction have been partially defined by analysis of mut gene products and characterization of putative intermediates in the reaction. Analysis of purified mutH, mutL, and mutS gene products (34-36) has indicated that in the presence of ATP, these three proteins interact with a heteroduplex in a complex manner such that mismatch recognition at one point on the helix promotes incision of the unmethylated strand at a d(GATC) sequence located some distance away (Fig. 3).
Native MutS protein forms oligomers in solution (34), but the biologically active aggregation state of this 97-kDa polypeptide remains to be identified. This protein displays a weak ATPase that cosediments with MutS activity and is capable of binding to the eight possible base-base mismatches as judged by DNase I footprint analysis (23, 34). The affinity of the protein for a mispair varies with the nature of the mismatch (23, 34). G-T is the tightest binding mispair and C-C is the weakest. Affinities for the other mismatches generally correlate with their efficiencies of correction, but exceptions exist, indicating that factors other than MutS recognition contribute to overall repair efficiency (23).
A simple activity has also been identified in near-homogeneous preparations of the 25-kDa MutH protein that can account for its involvement in methyl-directed mismatch correction (35). This activity, a Mg²⁺-dependent endonuclease, cleaves 5' to the dG of d(GATC) sites leaving 3'-OH, 5'-PO4 termini. Symmetrically methylated d(GATC) sequences are resistant to this activity, hemimethylated sites are cleaved on the unmethylated strand, and unmethylated sites are usually incised on only one of the two DNA strands. Such findings would be consistent with biological results suggesting that MutH functions in strand discrimination (la), but the MutH-associated endonuclease has several unexpected properties. The activity is extremely weak (turnover number of about one scission per h per mol of MutH) and does not depend upon the presence of a mismatch within the DNA substrate. Nevertheless, it has been argued that this d(GATC) endonuclease represents the biological activity of MutH in mismatch correction, with the anomalous properties of the isolated protein attributed to its existence in a largely inactive state that would undergo activation during the assembly of a repair complex on a heteroduplex (4, 35). Two lines of evidence support this view. First, as detailed below, the MutH-associated endonuclease does undergo a mismatch-dependent activation in the presence of other mut gene products. Second, both biological (41) and biochemical (32) experiments have shown that in the absence of ligase the presence of one single-strand break suffices to determine the strand specificity of repair and bypasses the requirement for MutH in correction. Analysis of the purified system has demonstrated that repair of a covalently closed heteroduplex, in the presence or absence of DNA ligase, only occurs in the presence of MutH and requires a d(GATC) sequence that is unmethylated on at least one strand (32). 
In contrast, heteroduplexes containing a nick in one DNA strand are subject to MutH-independent correction both in vivo (41) and in vitro (32) provided that DNA ligase is absent. Nick-directed heteroduplex repair, which requires MutL, MutS, DNA helicase II, SSB, and pol III holoenzyme, supports the view that the function of MutH in mismatch correction is endonucleolytic incision of the unmethylated strand.⁵

Fig. 4. The model in panel A has been proposed by Lu et al. and Su et al. (29). In this scheme MutH incision of the unmethylated strand at a hemimethylated d(GATC) sequence provides a single-strand break that serves as the site of initiation of excision, with removal of the unmethylated strand proceeding toward the mismatch. 5'→3' repair DNA synthesis initiates in the vicinity of the mismatch or the d(GATC) site depending on whether the unmethylated strand was incised 3' or 5' to the mispair. While permitting this sort of bidirectional excision, this proposal is not meant to imply that the repair system possesses bidirectional capability, but rather that this possibility has not been excluded. The model in panel B has been described by Längle-Rouault et al. (41). The distinct features of this model are the obligate involvement of two d(GATC) sites and initiation of repair DNA synthesis only in the vicinity of such a sequence. This proposal also included the interesting suggestion that DNA helicase II might enter the helix at the mismatch in a reaction promoted by MutS and/or MutL proteins. Helicase II, the mutU gene product, translocates 3'→5' along a DNA chain in an ATP-driven reaction (42), serving to unwind the two DNA strands upon entry into regions of secondary structure (43). Thus, strand separation mediated by this activity and proceeding bidirectionally from the mismatch would serve to displace a section of unmethylated strand between the two d(GATC) sites.

Purified MutL protein exists in solution as a dimer of a 70-kDa polypeptide (36). No simple activity has been attributed to MutL, but two in vitro effects of the protein have been demonstrated, both of which are dependent on the presence of other mut gene products. Using footprinting methods, Grilley et al. (36) have found that MutL binds to MutS-heteroduplex complexes in the presence of ATP. It has also been shown that the MutH-associated d(GATC) endonuclease undergoes activation in a reaction that requires MutS, MutL, ATP, and the presence of a mispair within the DNA substrate. The simplest scheme for initiation of methyl-directed repair that is consistent with this set of observations invokes mismatch recognition by MutS as a primary event. Interaction of MutS with a mispair provokes binding of MutL and MutH to the heteroduplex, with MutL serving to interface mismatch recognition by MutS with MutH-mediated incision occurring at a d(GATC) sequence (Fig. 3). This mechanism and the experimental findings on which it is based imply that a signal is transduced between the two DNA sites. 
Although the molecular basis of signal transduction and the involvement of the three proteins in the process are unclear, the following discussion will demonstrate that the nature of the subsequent excision-resynthesis reaction can place restrictions on the mechanism of signal transmission between these sites.
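The response of the MutH-associated endonuclease to the methylation state of d(GATC) sites, as described above, can be expressed as a short Python sketch. The function name, the strand labels, and the simplification that every site on a molecule shares the same methylation state are assumptions made here for illustration; the sketch reports which strand is a candidate for incision at each site and does not model the mismatch-, MutS-, MutL-, and ATP-dependent activation of the enzyme.

```python
def muth_incisions(top_strand: str, top_methylated: bool, bottom_methylated: bool):
    """List candidate MutH incision sites on a duplex.

    top_strand is the 5'->3' sequence of one strand. Because d(GATC) is
    palindromic, each occurrence marks a site on both strands. Returns a list
    of (site_start_index, strand) tuples; cleavage occurs 5' to the dG of the
    site on the indicated strand.
    """
    if top_methylated and bottom_methylated:
        return []                       # symmetrically methylated sites are resistant
    if top_methylated:
        strand = "bottom"               # hemimethylated: cut the unmethylated strand
    elif bottom_methylated:
        strand = "top"
    else:
        strand = "top or bottom"        # unmethylated: usually only one strand is cut

    cuts = []
    start = top_strand.find("GATC")
    while start != -1:
        cuts.append((start, strand))
        start = top_strand.find("GATC", start + 1)
    return cuts


if __name__ == "__main__":
    duplex = "AAGATCTTGATCA"  # arbitrary example sequence with two d(GATC) sites
    print(muth_incisions(duplex, top_methylated=True, bottom_methylated=False))
```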
⁵ This conclusion bears on the discrepancy between in vivo and in vitro data with respect to substrate activity of single d(GATC) site heteroduplexes. As described in the text, single d(GATC) site substrates have been found to be substrates for in vitro repair (30), but the one construct tested by in vivo transfection assay was not corrected at detectable levels (41). In contrast, heteroduplexes lacking a d(GATC) sequence, but containing one single-strand break, are subject to MutH-independent repair under conditions of ligase deficiency both in vivo (41) and in vitro (32). Given the apparent equivalence of a persistent strand break and MutH action at a d(GATC) sequence, the failure to observe repair of the single d(GATC) site heteroduplex upon transfection is also at odds with the in vivo finding that one nick is sufficient to support correction. These several discrepancies are readily resolved if the weak substrate activity of this heteroduplex is attributed to poor MutH recognition of the particular d(GATC) sequence present in the construct.

Possible Mechanisms for Excision Repair Associated with Methyl-directed Correction-Only limited information is available concerning the nature of excision and resynthesis steps associated with methyl-directed correction. Since initiation events have been attributed to MutH, MutL, and MutS, the functions of the other five proteins required for reconstitution of the reaction (DNA helicase II, SSB, DNA polymerase III holoenzyme, DNA ligase, and the 55-kDa stimulatory protein) are presumably restricted to excision and resynthesis steps. However, this inference does not preclude involvement of MutH, MutL, or MutS in this latter stage of the reaction.

Two working models for the mechanism of the excision repair reaction have been described in the literature. Neither has been confirmed, but they will be considered briefly to illustrate current thinking concerning this phase of methyl-directed correction. These proposals, shown in Fig. 4, invoke MutH cleavage at d(GATC) sequences as the determinant of strand-specific excision but differ with respect to the number of such sites required. The mechanism shown in Fig. 4A, which has been proposed by Lu et al.

the d(GATC) site and the mismatch is followed by initiation of repair synthesis near the original location of the mispair (29). However, it is important to note that these experiments provide no information on the nature of the excision process leading to the formation of the gapped precursor for DNA synthesis. Furthermore, since only one heteroduplex was examined in these experiments (29), it would be premature to conclude that repair DNA synthesis associated with methyl-directed correction always initiates in the vicinity of the mispair. It is clear that much remains to be learned about the nature of excision-resynthesis and the manner in which this stage of the reaction is coupled to the early events of mismatch identification and d(GATC) recognition and incision. The recent availability of a pure system capable of supporting methyl-directed mismatch correction should facilitate analysis of such questions by permitting further