Molecular Basis for DNA Double-Strand Break Annealing and Primer Extension by an NHEJ DNA Polymerase

Summary
Nonhomologous end-joining (NHEJ) is one of the major DNA double-strand break (DSB) repair pathways. The mechanisms by which breaks are competently brought together and extended during NHEJ are poorly understood. As polymerases extend DNA in a 5′-3′ direction by nucleotide addition to a primer, it is unclear how NHEJ polymerases fill in break termini containing 3′ overhangs that lack a primer strand. Here, we describe, at the molecular level, how prokaryotic NHEJ polymerases configure a primer-template substrate by annealing the 3′ overhanging strands from opposing breaks, forming a gapped intermediate that can be extended in trans. We identify structural elements that facilitate docking of the 3′ ends in the active sites of adjacent polymerases and reveal how the termini act as primers for extension of the annealed break, thus explaining how such DSBs are extended in trans. This study clarifies how polymerases couple break-synapsis to catalysis, providing a molecular mechanism to explain how primer extension is achieved on DNA breaks.

(B) In the previously reported catalytically incompetent synaptic structure (PDBID: 2R9L; Brissett et al., 2007), the protein monomers are again in a face-to-face orientation but are rotated by 180 degrees relative to one another.
depicted with protein side-chain neighbors that are within 4Å of the strand. Residues are coloured blue for the protein monomer that binds the ds/ss junction and makes the templating Loop 1 contacts, and yellow for the protein monomer that accepts the incoming primer strand.

(B) A zoomed-out view of the template/primer DNA strand contacts with a translucent solvent accessible surface. Neighborhood contacts of less than 4Å are tinted yellow.
(C) A view of the supporting structural role played by Loop 2 (cyan) in guiding the incoming primer strand (red). The translucent solvent accessible surface is coloured green for the area that is in contact with the 3'OH.
(E) Representation of the current structure as a binary (enzyme/substrate) complex.
(F) Conformational changes of ligands involved in nucleotide binding. Overlay of the preternary active site (pink, magenta; PDBID: 3PKY) components onto the current structure (yellow, tan, green).

Supplemental Experimental Procedures

Crystallization studies
Mt-PolDom was expressed and purified as previously described (Pitcher et al., 2007). The oligonucleotides used to generate the DNA for crystallisation were the following: T (5'-GCCGCAGATC-3'), and 5'-phosphorylated D (5'-GCGGC-3'). T/D duplex DNA was prepared by mixing equal amounts of the oligonucleotides to give a final solution of 3 mM, heating this solution to 95°C, and annealing slowly over 45 minutes to 4°C in a PCR machine. The crystals grew to an average size of 175 µm x 75 µm. The crystals belong to space group P2₁ with unit cell dimensions a = 87.58Å, b = 80.11Å, c = 118.39Å, α = γ = 90°, β = 111.62°. The statistics for data processing are summarized in Table 1.
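The pairing implied by the two oligonucleotides can be checked directly from their sequences. The following is a minimal sketch, taking T and D exactly as printed above and assuming that D pairs with the 5' end of T; the function and variable names are illustrative and are not part of the original procedures.

```python
# Illustrative check of the T/D duplex design. Sequences are copied from the text;
# all function and variable names are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


T_STRAND = "GCCGCAGATC"  # template strand T, 5'->3'
D_STRAND = "GCGGC"       # 5'-phosphorylated downstream strand D, 5'->3'

# Assumption: D anneals to the 5' end of T, leaving the rest of T single-stranded.
duplex_region = reverse_complement(D_STRAND)
pairs_with_5prime_end = T_STRAND.startswith(duplex_region)
overhang = T_STRAND[len(D_STRAND):]  # single-stranded 3' overhang of T

# Longest 3'-terminal stretch of the overhang that equals its own reverse
# complement, i.e. a stretch that could pair with the same overhang
# presented by an opposing break.
self_complementary = ""
for start in range(len(overhang)):
    candidate = overhang[start:]
    if candidate == reverse_complement(candidate):
        self_complementary = candidate
        break

print(duplex_region, pairs_with_5prime_end, overhang, self_complementary)
```

With the sequences as printed, the sketch reports a five-nucleotide 3' overhang whose last four bases are self-complementary. This would be consistent with two such overhangs annealing in trans and leaving a single unpaired base on each strand.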

Structure solution and refinement of an in trans Mt-PolDom DNA synaptic complex
The structure of the Mt-PolDom-DNA complex was determined by molecular replacement using the program PHASER (McCoy et al., 2007). The crystallographic model of (apo) Mt-PolDom (PDB id: 2IRU) was used as a molecular replacement search model. A final refined model at 2.4Å resolution, with an R factor of 19.21% and R free of 24.18%, was obtained.
Crystals of Mt-PolDom complexed with DNA contained four protein molecules and eight DNA strands in the asymmetric unit, giving a VM of 2.78 Å³ Da⁻¹, corresponding to 55.34% (v/v) water content. The structure comprises amino acid residues 10-293, with no density observed for 9 amino acid residues at the N-terminal end and 7 amino acid residues at the C-terminal end. The terminal two bases from the downstream strand (D) of the duplex are not observed in the electron density. 94.9% of residues in the structure are in the favoured region of the Ramachandran plot, with 4.4% of residues in the allowed region and 0.7% of residues being outliers (Table 1).
On comparing the Loop 2 (Figures 2, S2, cyan) conformation of the current structure to those of the previously reported structures, we observed that it is similar to the Apo, NTP-bound co-crystal and synaptic structures, leaving the pre-ternary complex (PDBID: 3PKY) as the only complex with the unique open Loop 2 conformation (Brissett et al., 2011).
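The Matthews coefficient quoted above relates the unit cell volume to the mass contained in the cell, VM = V / (Z x M), where V is the cell volume, Z the number of asymmetric units per cell and M the mass of one asymmetric unit. The sketch below recomputes the monoclinic cell volume from the reported cell dimensions and illustrates how VM maps onto an approximate solvent content. The mass of the asymmetric unit is not stated in the text, so the code back-calculates the mass implied by the reported VM rather than assuming one. Z = 2 (the number of general positions in P2₁) and the constant 1.23 (the usual protein-only approximation) are assumptions introduced here; the latter may be why the estimate differs slightly from the reported 55.34% for a protein-DNA crystal.

```python
import math

# Illustrative recalculation; function names and the constant 1.23 are
# assumptions for this sketch, not values taken from the original procedures.


def monoclinic_cell_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume (cubic angstroms) of a monoclinic cell with alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume: float, z: int, asu_mass_da: float) -> float:
    """VM in cubic angstroms per dalton."""
    return cell_volume / (z * asu_mass_da)


def solvent_fraction(vm: float, constant: float = 1.23) -> float:
    """Approximate solvent fraction from VM (protein-only approximation)."""
    return 1.0 - constant / vm


# Cell dimensions as reported in the crystallization section.
volume = monoclinic_cell_volume(87.58, 80.11, 118.39, 111.62)

Z = 2               # asymmetric units per cell, assuming space group P2(1)
REPORTED_VM = 2.78  # value quoted in the text

# Mass of one asymmetric unit implied by the reported VM (not a measured value).
implied_asu_mass = volume / (Z * REPORTED_VM)
estimated_solvent = solvent_fraction(REPORTED_VM)

print(f"cell volume: {volume:.0f} A^3")
print(f"implied asymmetric-unit mass: {implied_asu_mass:.0f} Da")
print(f"estimated solvent fraction: {estimated_solvent:.3f}")
```

Under these assumptions the cell volume comes out at roughly 7.7 x 10⁵ Å³ and the estimated solvent fraction at about 56%, close to the reported water content.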

Protein-DNA contacts in the annealed break DNA complex
As has been noted previously (Brissett et al., 2007), Mt-PolDom interacts with the DNA duplex predominantly via contacts with the recessed 5' phosphate moiety (Asn 13, Lys 16, Lys 26, Arg 53, Pro 55; the last four being invariant in LigDs; Figs. 2, 3 and S2). There are no significant differences between the three DNA-bound PolDom structures (synaptic, pre-ternary and annealed break); the notable changes mainly arise in the pre-ternary complex (Brissett et al., 2011). Further contacts with the template strand are depicted in Figures S4 and S5; most of these are highly conserved among LigD members. The main-chain and side-chain atoms of Lys 66 are in non-bonded contact with A6 and maintain the templating base in its spatial orientation. Other contacts with the DNA, including those made by Gln 67 and Thr 88, are still maintained. As expected, the protein-DNA contacts made by monomer B are almost identical to those for monomer A.
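The figure legends describe contacts as protein neighbours lying within 4Å of the DNA strand. A minimal sketch of such a distance-cutoff search is given below. The atom labels and coordinates are invented purely for illustration and are not taken from the deposited structure; a real analysis would read the coordinates from the PDB entry.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    """A labelled point in space; labels and coordinates here are hypothetical."""
    label: str
    x: float
    y: float
    z: float


def distance(p: Atom, q: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the coordinates."""
    return math.dist((p.x, p.y, p.z), (q.x, q.y, q.z))


def contacts_within(protein_atoms: List[Atom], dna_atoms: List[Atom],
                    cutoff: float = 4.0) -> List[Tuple[str, str, float]]:
    """Return (protein label, DNA label, distance) for pairs closer than the cutoff,
    sorted with the closest pair first."""
    pairs = []
    for p in protein_atoms:
        for d in dna_atoms:
            separation = distance(p, d)
            if separation < cutoff:
                pairs.append((p.label, d.label, separation))
    return sorted(pairs, key=lambda item: item[2])


# Invented toy coordinates (angstroms), not from any structure.
protein = [Atom("P1", 0.0, 0.0, 0.0), Atom("P2", 10.0, 0.0, 0.0)]
dna = [Atom("D1", 3.0, 0.0, 0.0), Atom("D2", 10.0, 4.5, 0.0)]

contacts = contacts_within(protein, dna)
for protein_label, dna_label, separation in contacts:
    print(f"{protein_label} - {dna_label}: {separation:.2f} A")
```

The all-against-all loop is adequate for a sketch; for a full structure a spatial index would normally be used instead.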

Formation of functional NHEJ complexes on short overhangs: role of 5' phosphate binding and dimeric versus monomeric configurations
Superposition of the gapped substrate crystallised with Pol β on the structure of the microhomology-mediated synapsis by Mt-PolDom shows the possible new location of the upstream portion of the substrate, which would now be covered, and thus footprinted, by one PolDom monomer (Fig. S3A). This footprint size could be compatible with NHEJ reactions involving very short protrusions (Fig. S3B) that could be handled either by a single monomer or by a dimeric arrangement such as that previously described (Brissett et al., 2007). Figure S4 highlights the alternative rotamer conformation that His 83 adopts, which differs from that observed in all previous Mt-PolDom structures. This rotamer brings Cε1 into non-bonding contact with the OH of Tyr 90, and Nδ1 forms a potential H-bond with Oγ1 of Thr 88. In previous PolDom/DNA-bound structures, Thr 88 hydrogen bonds with a backbone phosphate oxygen of the templating strand; in the current structure this hydrogen bond is lost. The backbone carbonyl of His 83 hydrogen bonds with the phosphate oxygen O1P of templating strand T8. The terminal guanidinium nitrogens of Arg 84 interact with the phosphate oxygens O1P of templating strand T8 and A9 (Fig. 6A, S5 & S6A-B). Also, the Oγ of Ser 85 hydrogen bonds with the phosphate oxygens O1P and O2P of templating strand T8.

3'-protrusions in the template strands become primers during PolDom-mediated end-synapsis
The overall topologies of the DNA-bound Mt-PolDom complexes are different when considering the complexes that have a potential primer strand. The path that the templating DNA adopts differs between the annealed break and synaptic complexes. This is despite the residues involved in splaying the DNA at the ds/ss junction being in the same orientation in both complexes. The difference in the paths of the two templating strands is due to the interactions with the apical Loop 1 residues (His 83, Arg 84 and Ser 85, Fig. S4). As shown in more detail in Figs. S5A and S5B, Loop 1 (coloured blue with yellow side-chains) directs the path of the templating strand and "hands off" the strand to the opposite protein monomer (via Loop 2, coloured cyan with light blue side-chains). As the templating strand is passed to the opposite protein monomer, it becomes the primer strand as it enters the active site. The opposite protein monomer interacts with the incoming primer via the Loop 2 residues Met 215, Lys 217 and Arg 220. The primer terminus (3'OH) interacts with the active site via Asp 227, Ser 229 and Lys 235. Another view of the side-chains that interact with the template/primer strand is shown in Figure S6A, where the DNA is seen to be in contact with protein along most of its length. This is more apparent when viewing the solvent accessible surfaces, in which the patches of yellow (Figs. S5B and S6B) depict areas of close contact. Figure S6B depicts how the annealed break is protected from the environment by a combination of Loop 1 and Loop 2 elements that make a continuous protein surface. Figure S6C depicts a reverse angle view of the incoming primer entering the active site of Mt-PolDom. The feature to note here is that Met 215 and Lys 217 cradle the incoming primer and direct the 3' terminus into the active site.
Previously, we reported that Loop 2 exists as a 3₁₀ helix in all of the determined Mt-PolDom structures, except the pre-ternary complex (Brissett et al., 2011), in which the helix unravels and adopts a random coil conformation. This significant conformational change results in Cα position shifts of up to ~6Å, inducing a significant repositioning of two conserved residues, Lys 217 and Arg 220. On comparison, the current Loop 2 conformation is similar to the Apo, NTP-bound co-crystal and synaptic DNA structures. Although it has been shown that Arg 220 regulates the competency of the active site (Brissett et al., 2011), the importance of the highly conserved Lys 217 remained uncertain.
Notably, this positively charged residue contacts the 3'OH of G13 (template strand) in the PolDom-DNA complex featuring an imperfect synapsis of two DNA ends (Brissett et al., 2007). In the fully complementary synapsis presented here, Loop 2 is also implicated in maintaining the position of the incoming primer. This is exemplified by contacts with the conserved residues Met 215, Lys 217 and Arg 220 (Fig. 6B, S6C), where Cγ of Met 215 makes a non-bonding contact with O1P of C10 from the incoming primer strand. Nζ of Lys 217 hydrogen bonds with O2P of C10 and O5* of A9 of the incoming primer strand. For Arg 220, Nη1 hydrogen bonds with O1P and O2P of C10, whilst Nη2 hydrogen bonds with O2P of C10 of the incoming primer strand (Fig. S6A, S6C).

In trans docking of 3' hydroxyl of the incoming primer in the polymerase active site
The 3' hydroxyl directly interacts with the active site residues Asp 227, Ser 229 and Lys 235 and indirectly with Gln 230 (Fig. 7A). Site-directed mutants Q230A and K235A show wild-type-like activity on gapped substrates but very poor activity on annealed breaks such as that found in the current structure (Fig. 7C, D). The preformed template/primer stabilisation hypothesis explains this effect, but it should also be considered that these residues orientate the primer terminus and keep the 3' hydroxyl in a 'stand-by' position prior to catalysis, which is specifically required in NHEJ reactions.
Comparison of this residue network, from previously published structures (Figs. S7A-C), demonstrates that the orientation depends on what moiety occupies the active site at the time.
For instance, in the apo structure (Fig. 7A), Lys 235 points away from the site, indicating that this residue has a direct purpose in ligating a hydroxyl moiety in the active site. This is borne out by a water molecule being coordinated by Lys 235 and the other residues in the network in the synaptic complex (Fig. S7B) and GTP-bound co-crystal (Fig. S7C). This coordinated water occupies almost the exact position of the 3'-hydroxyl of the incoming primer and is displaced when the primer is bound.
In the dGTP-bound and pre-ternary co-crystal structures (Fig. S7A), we observe Gln 230 adopting an mm-40 rotamer (as opposed to the mt-30 rotamer observed in the other structures; Emsley et al., 2010). We conclude that Gln 230 is involved in binding/recognition of the incoming NTP as well as in orienting the primer terminus. Figure S7D places the 3'-hydroxyl of the incoming primer in the current structure into context with active site residues from Pol λ and Pol µ. This demonstrates that the incoming primer adopts an orientation that could be tolerated by polymerases from the Pol X family. From this we can conclude that the primer terminus positioning in the current complex is compatible with catalysis.
The current structure represents a binary-type complex (Fig. S7E), as it lacks metal ions and an NTP, and the bound DNA provides both primer and template. In Fig. S7E, the current complex is placed in the same orientation as in Fig. 3A of Brissett et al., 2011. Loop 2 and the active site residues are in the same conformation as in the synaptic (PolDom-DNA binary) complex. The fact that Loop 2 does not change from this conformation suggests that the primer is oriented prior to binding of the NTP and catalysis. Active site residues, in general, maintain their conformations when NTP is bound (Fig. S7F). Only Lys 175, Arg 244 (triphosphate tail binding), Asp 139 (catalytic metal binding) and Gln 230 (NTP binding/recognition) have altered conformations.