Structural and dynamic basis of DNA capture and translocation by mitochondrial Twinkle helicase

Abstract Twinkle is a mitochondrial replicative helicase which can self-load onto and unwind mitochondrial DNA. Nearly 60 mutations on Twinkle have been linked to human mitochondrial diseases. Using cryo-electron microscopy (cryo-EM) and high-speed atomic force microscopy (HS-AFM), we obtained the atomic-resolution structure of a vertebrate Twinkle homolog with DNA and captured in real-time how Twinkle is self-loaded onto DNA. Our data highlight the important role of the non-catalytic N-terminal domain of Twinkle. The N-terminal domain directly contacts the C-terminal helicase domain, and the contact interface is a hotspot for disease-related mutations. Mutations at the interface destabilize Twinkle hexamer and reduce helicase activity. With HS-AFM, we observed that a highly dynamic Twinkle domain, which is likely to be the N-terminal domain, can protrude ∼5 nm to transiently capture nearby DNA and initialize Twinkle loading onto DNA. Moreover, structural analysis and subunit doping experiments suggest that Twinkle hydrolyzes ATP stochastically, which is distinct from related helicases from bacteriophages.


INTRODUCTION
A mitochondrion, the powerhouse of a cell, harbors its own genome, which is replicated and maintained differently from its nuclear counterpart. The circular double-stranded mitochondrial genome contains two separate origins (1,2). The replication of the two DNA strands is asynchronous and does not involve Okazaki fragment synthesis (3,4). The mitochondrial replisome shares a distant homology with the replisomes from bacteria and bacteriophages (2,5). Its core components include DNA polymerase γ, Twinkle helicase and mitochondrial single-stranded DNA-binding protein (6). DNA synthesis on both strands is initiated with RNA transcripts produced by the mitochondrial RNA polymerases (7,8). Mutations and deletions in the mitochondrial genome are correlated with numerous neuromuscular diseases and premature aging (9).
Twinkle is a mitochondrial replicative helicase (10,11). The expression level of Twinkle is correlated with the copy number of mitochondrial DNA (12), and the homozygous deletion of Twinkle is embryonically lethal (13). Nearly 60 point mutations in Twinkle have been linked to various human mitochondrial diseases (11,14). Patients with Twinkle mutations exhibit reduced mitochondrial DNA copy numbers or partial deletions of the mitochondrial DNA (9). Moreover, Twinkle also participates in mitochondrial DNA repair (15,16) and degradation (17). Consistent with its diverse functions, Twinkle interacts with a number of proteins involved in mitochondrial DNA replication and repair (6,16-18).
Helicases are chemo-mechanical motors that couple ATP hydrolysis to directional translocation on DNA or RNA (19,20). Based on sequence conservation, helicases are divided into six superfamilies (SFs), with SF1 and SF2 helicases being monomeric and SF3-SF6 helicases being hexameric (20). Replicative helicases are all hexameric and belong to SF3, SF4 and SF6. The six helicase subunits form a ring- or lock-washer-shaped hexamer, binding ATP at each of the subunit interfaces and DNA in its central channel. DNA-binding loops from different subunits stagger like a staircase to interact with the DNA backbone. While hexameric helicases may display different directionality, step sizes and other mechanistic details, a sequential hand-over-hand translocation mechanism has been proposed for most of them (21-27). Following ATP hydrolysis and product release at one end of the hexamer, the DNA-binding loops or the entire subunit move from one end of the DNA to the other end to form a new ATP-binding site. Sequential hydrolysis along the hexamer leads to unidirectional and processive translocation on DNA. A similar hand-over-hand translocation mechanism has been proposed for many hexameric protein translocases (28). Furthermore, the hexameric rings of replicative helicases pose topological restrictions for their loading onto the DNA substrates. To overcome this challenge, bacterial (SF4), archaeal (SF6) and eukaryotic nuclear (SF6) helicases utilize specialized loaders to crack their rings open (29). In contrast, Twinkle (SF4) and bacteriophage T7 gp4 (SF4) helicases are capable of self-loading onto DNA (30,31), but their mechanisms are poorly understood.
Twinkle is a member of the SF4 family helicases and shares low homology with bacterial DnaB and bacteriophage T7 gp4 (10,32). The N-terminal domain (NTD) of Twinkle is derived from the TOPRIM family primase (33). However, key residues in the Twinkle primase active site from vertebrates are mutated, and no primase activity has been detected (11). The helicase domain is located on the C-terminal half of Twinkle. Similar to gp4 and DnaB (21,22), Twinkle forms a homo-hexamer and migrates in the 5′ to 3′ direction on single-stranded DNA (ssDNA) (32,34). Interestingly, the reported Twinkle helicase activity is significantly lower than that of DnaB or gp4 (35). In addition, Twinkle can bind to and migrate on double-stranded DNA (dsDNA) and Holliday junction DNA (35,36). To date, structures of Twinkle in the absence of DNA have been reported (34,37-39). However, how Twinkle functions mechanistically on DNA and how disease-related mutations affect Twinkle function are still largely unclear. Here we report cryo-electron microscopy (cryo-EM), biochemical and high-speed atomic force microscopy (HS-AFM) characterization of a vertebrate Twinkle homolog. Our data elucidate unique structural and dynamic features of Twinkle helicase, and support a novel molecular model explaining Twinkle self-loading, DNA unwinding and human disease-related mutations in Twinkle.

Sequence analysis
Twinkle homolog sequences from Arabidopsis thaliana, bacteriophage T7, Dictyostelium discoideum, Drosophila melanogaster, Gallus gallus and Homo sapiens (Hs) were used as baits for the BLAST search (https://www.uniprot.org/blast/). The target database was set to be the UniProtKB reference proteomes plus Swiss-Prot, with the E-threshold set to be 10. Redundant sequences were eliminated, and sequences with <400 residues or >800 residues were removed. The remaining sequences were aligned with the MUSCLE sequence alignment tool (40). The sequence identity matrix was plotted with Matlab.
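The filtering and identity-matrix steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: "redundant" is taken to mean exact duplicate sequences, identity is computed over alignment columns where neither sequence has a gap, and the function names are hypothetical. The actual analysis used BLAST, MUSCLE and Matlab, and its exact identity definition is not given in the text.

```python
from itertools import combinations


def filter_sequences(seqs, min_len=400, max_len=800):
    """Drop exact duplicates and sequences shorter than min_len or longer than max_len.

    `seqs` maps a sequence name to its unaligned residue string.
    Treating redundancy as exact duplication is an assumption of this sketch.
    """
    seen = set()
    kept = {}
    for name, seq in seqs.items():
        if seq in seen or not (min_len <= len(seq) <= max_len):
            continue
        seen.add(seq)
        kept[name] = seq
    return kept


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_matrix(aligned):
    """Return (names, symmetric matrix of pairwise percent identities)."""
    names = list(aligned)
    n = len(names)
    matrix = [[100.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        value = percent_identity(aligned[names[i]], aligned[names[j]])
        matrix[i][j] = matrix[j][i] = value
    return names, matrix
```

The resulting matrix is symmetric with 100 on the diagonal, which is the form a heat-map plot of pairwise identities would take as input.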

Plasmid construction
The Lates calcarifer (Lc) Twinkle (Entrez ID 108894827, synthesized by GeneUniversal Inc.) was cloned into a modified pET28a vector with an N-terminal histidine tag and a PreScission protease cleavage site. Mutations of LcTwinkle were generated using methods described in the QuikChange mutagenesis kit (Agilent). Sequences of the LcTwinkle constructs were confirmed by sequencing the entire reading frame of each construct. The active-site mutants of Homo sapiens (Hs) (UniProt Q86RR1), Mus musculus (Mm) (UniProt Q8CIW5) and Danio rerio (Dr) (UniProt A0A0R4ICC1) Twinkle were all synthesized by GeneUniversal Inc. and cloned into the same pET28a vector.

Protein expression and purification
The LcTwinkle plasmid was transformed into Escherichia coli BL21 (DE3) (Novagen). Isopropyl-β-D-thiogalactopyranoside (1 mM) was added to the cell culture at an optical density of 0.8 for protein induction. The E. coli cells were further incubated at 16 °C for 16 h under shaking at 150 rpm for protein expression. The cells were harvested and disrupted by sonication in a Lysis buffer containing 50 mM Tris-HCl, pH 8.0, 1 M NaCl and 5% glycerol. The soluble fraction was loaded onto a 5 ml HisTrap column (GE Healthcare), which was pre-equilibrated with the Lysis buffer plus 20 mM imidazole. The column was subsequently washed with 300 ml of Lysis buffer plus 50 mM imidazole, and the protein was eluted with 5 ml of Lysis buffer plus 300 mM imidazole. The eluted protein was treated with 10 U of PreScission protease (Sigma) at 4 °C for 2 h. Afterward, the protein was diluted into a MonoQ buffer containing 50 mM Tris (pH 8.0), 0.1 mM ATP, 1 mM MgCl₂, 150 mM KCl, 3 mM dithiothreitol (DTT) and 5% glycerol, and loaded onto a MonoQ column (GE Healthcare) pre-equilibrated with the MonoQ buffer. LcTwinkle was eluted with a KCl concentration gradient. The purified LcTwinkle was aliquoted and flash-frozen at -80 °C for further study. Mutant LcTwinkle proteins were expressed and purified according to the same protocol. Hs, Mm and Dr Twinkle were expressed and purified similarly to LcTwinkle. After elution from HisTrap columns, the proteins were mixed with ssDNA (5′-TGGTCTTTTTTTTTTTTTTTTTTTTTTTTT-3′) at a 1:1.5 molar ratio in a buffer containing 50 mM Tris (pH 8.0), 150 mM KCl, 3 mM DTT, 1 mM ATP and 2 mM MgCl₂. Following incubation on ice for 10 min, the LcTwinkle-DNA complex was loaded onto a Superose6 increase column (10/300 GL, GE Healthcare) equilibrated in the same buffer.

ATPase assay
LcTwinkle ATPase activity was assayed with radioactive [γ-³²P]ATP and thin-layer chromatography. All reactions were performed in a reaction buffer containing 25 mM Tris-HCl (pH 8.0), 150 mM KCl, 12 mM MgCl₂, 3 mM DTT, 0.05-0.5 µM of wild-type (WT) or mutant LcTwinkle, 1 µM ssDNA (5′-GGATTATTTACATTGGCAGATTCACC-3′) and the desired concentration of ATP (with 0.2 µCi of [γ-³²P]ATP). Reactions were carried out at 20 °C for 30 min and terminated with EDTA. A 2 µl aliquot of terminated reaction mix was spotted onto polyethylenimine-cellulose plates (Merck, Germany). ATP and released phosphate were then separated chromatographically in a buffer of 0.5 M LiCl. Plates were exposed to a phosphor screen (GE Healthcare) for 4 h. Phosphor screens were imaged using a Sapphire Biomolecular Imager-RGB IS1025 (Azure BioSystems). The data were analyzed with the AzureSpot software supplied with the Sapphire imager. The data were fitted to the Michaelis-Menten equation with GraphPad (GraphPad Software LLC). All results were based on at least three independent tests.
Nucleic Acids Research, 2022, Vol. 50, No. 20 11967
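For readers who want to reproduce the kinetic fit without GraphPad, the Michaelis-Menten equation v = Vmax·[S]/(K_M + [S]) can be fitted by least squares with a short script. The sketch below is not GraphPad's algorithm: it is a swapped-in approach that solves Vmax in closed form for each trial K_M and then runs a golden-section search over log K_M. It assumes strictly positive substrate concentrations and a single minimum of the residual within the search range; all function names are hypothetical.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def _best_vmax(km, s_vals, v_vals):
    # For a fixed K_M the model is linear in Vmax, so least squares is closed-form.
    x = [s / (km + s) for s in s_vals]
    return sum(v * xi for v, xi in zip(v_vals, x)) / sum(xi * xi for xi in x)


def _sse(km, s_vals, v_vals):
    vmax = _best_vmax(km, s_vals, v_vals)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_vals, v_vals))


def fit_michaelis_menten(s_vals, v_vals, iters=200):
    """Return (Vmax, K_M) minimizing the sum of squared residuals.

    Golden-section search over log(K_M), spanning 100-fold below the lowest
    and 100-fold above the highest substrate concentration (an assumption).
    """
    if len(s_vals) != len(v_vals) or len(s_vals) < 2:
        raise ValueError("need at least two matched (s, v) points")
    if min(s_vals) <= 0:
        raise ValueError("substrate concentrations must be positive")
    lo = math.log(min(s_vals) / 100.0)
    hi = math.log(max(s_vals) * 100.0)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if _sse(math.exp(c), s_vals, v_vals) < _sse(math.exp(d), s_vals, v_vals):
            hi = d
        else:
            lo = c
    km = math.exp((lo + hi) / 2.0)
    return _best_vmax(km, s_vals, v_vals), km
```

With replicate measurements, each replicate can be passed as its own (s, v) point; the units of K_M follow the units of the substrate concentrations supplied.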

DNA unwinding assay
The helicase reactions were carried out in a buffer containing 50 mM Tris-HCl (pH 8.0), 50 mM KCl, 5% glycerol, 0.1 mg/ml bovine serum albumin (BSA), 5 mM MgCl₂ and 4 mM ATP. Each reaction contained 25 nM FAM-labeled fork DNA substrate, which was annealed from 5′-FAM-CCTAGCTCAGGTTCAGTACTCGAACTCTACATAACTATACATGAATATCATAACTAATAA-3′ and 5′-TTATTAGTTATGATATTCATGTATAGTTATCATCTCAAGCTCATG-3′. The reactions were all performed at 37 °C for 30 min. The reactions were stopped with a quench buffer containing 20 mM EDTA, 1% SDS and 0.2 mg/ml proteinase K. The substrate and product were separated on a 12% native polyacrylamide gel, and the images were analyzed using a Sapphire Biomolecular Imager-RGB IS1025 (Azure BioSystems). All results were based on at least three independent tests.

DNA binding assay
The DNA binding assay was performed in 100 µl of reaction buffer containing 25 mM Tris-HCl (pH 8.0), 150 mM KCl, 0.5 mM ATP, 1 mM MgCl₂, 20 nM Cy5-labeled ssDNA (5′-Cy5-GGATTATTTACATTGGCAGATTCACC-3′) and varying concentrations of WT or mutant LcTwinkle. The reaction mix was loaded onto 96-well plates and the fluorescence anisotropy was measured with 645 nm excitation and 670 nm emission using an Infinite M1000 Pro microplate reader (TECAN). The K_D was determined as described in (35) with GraphPad software (GraphPad Software LLC).

Subunit doping
Cyan fluorescent protein (CFP)-Twinkle and yellow fluorescent protein (YFP)-Twinkle were purified similarly to WT Twinkle. For the fluorescence resonance energy transfer (FRET) experiments, the individual CFP-Twinkle and YFP-Twinkle proteins or the mixture of the two were prepared in a buffer containing 500 mM KCl, 50 mM Tris-HCl (pH 8.0) and 3 mM DTT; 1 mM ATP and 2 mM MgCl₂ were added when indicated. The protein concentration was 0.2 mg/ml for each of the proteins. The protein sample was scanned at room temperature using a fluorometer (Cary Eclipse Fluorescence Spectrophotometer, Agilent, Santa Clara, CA, USA). During the scan, the excitation wavelength was set to 430 nm and the emission between 430 and 570 nm was collected. For subunit doping experiments, WT and mutant Twinkles were buffer exchanged to 500 mM KCl, 50 mM Tris-HCl (pH 8.0) and 3 mM DTT. The total protein concentration after mixing was 2 µM. All mixed proteins were incubated at room temperature for 5 min before the activity assay. The ATPase, helicase and DNA binding activities were determined as described above. Mixed LcTwinkle at 0.2 µM was used for the helicase assay and at 0.05 µM for the ATPase assay.

Negative staining EM
WT and mutant Twinkle were diluted to 0.02 mg/ml in a buffer containing 50 mM Tris (pH 8.0), 150 mM KCl, 3 mM DTT, 1 mM ATP and 2 mM MgCl₂. The ssDNA substrate (5′-TGGTCTTTTTTTTTTTTTTTTTTTTTTTTT-3′) was added at a 1:2 molar ratio relative to the LcTwinkle hexamers. A 3 µl aliquot of LcTwinkle-DNA sample was deposited onto a freshly glow-discharged CF400-CU carbon grid (EMS Inc.). After absorbing the extra protein solution with filter paper, 3 µl of 2% uranyl acetate was added to the grid. Following a 20 s incubation, the extra uranyl acetate was removed with filter paper. The grids were imaged on a JEOL2100 microscope operated at 200 kV and ×400 000 magnification. The contrast transfer function (CTF) estimation and 2D classification were performed with RELION 3.0 (41).

Cryo-EM sample preparation
E325Q LcTwinkle was mixed with ssDNA (5′-TGGTCTTTTTTTTTTTTTTTTTTTTTTTTT-3′) at a 1:1.5 molar ratio in a buffer containing 50 mM Tris (pH 8.0), 150 mM KCl, 3 mM DTT, 1 mM ATP and 2 mM MgCl₂. Following incubation on ice for 10 min, the LcTwinkle-DNA complex was loaded onto a Superose6 increase column (10/300 GL, GE Healthcare) equilibrated in the same buffer. The eluted LcTwinkle-DNA complex was concentrated to 1 µM (∼0.4 mg/ml), as determined by the Bradford assay. A 3 µl aliquot of LcTwinkle-DNA sample was deposited onto a freshly glow-discharged Quantifoil R1.2/1.3 300 mesh grid and blotted using a Vitrobot (FEI) with the standard Vitrobot filter paper, Ø55/20 mm (Ted Pella). The blotting time was set to 4 s, the blotting force was set to 4 and the blotting was done under 100% humidity at 20 °C. The grids were flash-frozen in liquid ethane and stored in liquid nitrogen.
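The two concentration figures quoted above (1 µM and ∼0.4 mg/ml) are related through the molar mass of the particle being counted. The short sketch below shows the conversion. The ∼400 kDa value it returns is only what the two quoted numbers imply together, not a mass reported in the text, and the function names are hypothetical.

```python
def mg_per_ml(conc_uM, mass_kDa):
    """Convert a molar concentration in µM to mg/ml for a molar mass in kDa.

    1 µM of a 1 kDa species is 1e-6 mol/l * 1e3 g/mol = 1e-3 g/l = 1e-3 mg/ml.
    """
    return conc_uM * mass_kDa / 1000.0


def implied_mass_kDa(conc_uM, conc_mg_per_ml):
    """Molar mass (kDa) implied by a paired molar and mass concentration."""
    return conc_mg_per_ml * 1000.0 / conc_uM


# Values quoted in the text: 1 µM corresponds to roughly 0.4 mg/ml.
complex_mass = implied_mass_kDa(1.0, 0.4)
print(f"implied particle mass: ~{complex_mass:.0f} kDa")
```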

Cryo-EM data collection and processing
A total of 6490 micrographs of LcTwinkle-DNA complexes were collected on a Titan Krios electron microscope operated at 300 kV (cryo-EM core facility at the University of Texas McGovern Medical School) using the super-resolution mode with a nominal magnification of 130 K (calibrated pixel size of 1.07 Å on the sample level, corresponding to 0.535 Å in super-resolution mode). Movies were recorded with a K2 Summit camera, with the dose rate at the detector set to 7 e⁻/s/Å². The total exposure time for each movie was 7 s, which was fractionated into 35 frames. The defocus values ranged between 0.6 and 3 µm. MotionCor2 (42) was used for drift correction and electron dose weighting for all movies. The defocus values were estimated on non-dose-weighted micrographs with Gctf (43). A total of 1 976 802 particles were picked from 6334 manually screened micrographs with Gautomatch (developed by Kai Zhang, http://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch/). After 2D classification with RELION (41), 939 599 particles were selected. An ab initio hexamer LcTwinkle model was generated in cryoSPARC (44). The particles from 2D classification were classified into six 3D classes with the RELION 3D classification tool (41). Particles from two of the six classes (one in hexamer form and the other in heptamer form) were selected for subsequent processing. Further 3D classification was performed with cryoSPARC (44) to clean the datasets and separate different conformations of LcTwinkle. For the LcTwinkle-DNA complex, the particles from cryoSPARC were analyzed with CTF refinement and polishing in RELION (41). Final refinements were done with the non-uniform refinement in cryoSPARC (44).
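The acquisition numbers above can be cross-checked with simple arithmetic. The sketch below takes the dose rate exactly as stated in the text (per Å² per second) and derives the total dose, the per-frame dose and exposure, the super-resolution pixel size, and the fractions of micrographs and particles retained. The variable names are illustrative only.

```python
# Values as stated in the text.
DOSE_RATE = 7.0          # electrons per second per square angstrom
EXPOSURE_S = 7.0         # total exposure per movie, seconds
N_FRAMES = 35            # frames per movie
PIXEL_A = 1.07           # calibrated pixel size, angstrom

total_dose = DOSE_RATE * EXPOSURE_S        # electrons per square angstrom
dose_per_frame = total_dose / N_FRAMES     # electrons per square angstrom per frame
frame_time = EXPOSURE_S / N_FRAMES         # seconds per frame
superres_pixel = PIXEL_A / 2.0             # super-resolution halves the pixel size

# Data reduction during screening and 2D classification.
micrograph_keep = 6334 / 6490
particle_keep = 939_599 / 1_976_802

print(f"total dose: {total_dose:.0f} e-/A^2, per frame: {dose_per_frame:.2f} e-/A^2")
print(f"frame time: {frame_time:.2f} s, super-resolution pixel: {superres_pixel:.3f} A")
print(f"micrographs kept: {micrograph_keep:.1%}, particles kept after 2D: {particle_keep:.1%}")
```

The halved pixel size reproduces the 0.535 Å figure given in the text.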

Model building and refinement
A homology model of LcTwinkle was generated with the I-TASSER server (https://zhanglab.dcmb.med.umich.edu/I-TASSER/) and used for the initial rigid-body search. ssDNA coordinates adapted from the gp4 DNA structure (PDB ID: 6N7V) were used as the starting model for DNA. Each of the protein chains and the DNA were manually docked into the cryo-EM density maps in Chimera (45). The models were first manually adjusted in COOT (46) and then refined in Phenix (47), with real-space refinement and secondary structure and geometry restraints. Due to the low resolution of the NTDs, polyalanine models were used during the refinement, except for the residues at the NTD-C-terminal domain (CTD) interfaces with good side chain densities. For the lower resolution LcTwinkle-DNA₂, LcTwinkle₆ and LcTwinkle₇ structures, the monomers from the refined LcTwinkle-DNA model were docked into the cryo-EM density with Chimera (45) and the connections were adjusted in COOT (46). Statistics of all cryo-EM data collection and structure refinement are shown in Supplementary Table S1.

High-speed atomic force microscopy imaging in liquids
LcTwinkle (35 nM hexamer) was incubated with the linear DNA substrate containing a 37 nt ssDNA gap positioned at 23% from one DNA end (48) in Twinkle Reaction Buffer (20 mM HEPES pH 7.6, 150 mM NaCl, 7.5 mM MgCl₂) containing ATP (4 mM) for 1 min. The sample was diluted in Twinkle Reaction Buffer and deposited onto a freshly prepared 1-(3-aminopropyl)silatrane (APS)-treated mica surface. The sample was incubated with mica (SPI) for 2 min. The APS-mica surface containing the sample was washed with Twinkle Reaction Buffer and scanned in Twinkle Reaction Buffer using a Cypher VRS AFM (Asylum Research). BlueDrive Photothermal Excitation was used to drive a BioLever fast (AC10DS) cantilever with the resonance frequency (f) at ∼1500 kHz and the spring constant (k) at ∼0.1 N/m. The images were scanned at 0.8-2 frames/s. The images were analyzed using commercially available Asylum MFP3D software.

The overall structure of Twinkle helicase
Previously, poor protein solubility hampered structural studies of Twinkle (34,49). Although a high-resolution structure of HsTwinkle was reported recently (39), it is of the disease-related mutant W315L and lacks DNA and ATP. We attempted to identify a Twinkle homolog similar to HsTwinkle and amenable to high-resolution structural determination. Sequence alignment suggested that vertebrate Twinkle proteins share high identities and may have similar structures and functional properties (Supplementary Figure S1). We picked Hs, Mm, Dr and LcTwinkle homologs to test their stability and protein-DNA complex formation. LcTwinkle shares 65% identity and 90% similarity with HsTwinkle, with most of the essential residues for Twinkle activities and disease-related mutations conserved (Supplementary Figure S1B). Two variants of LcTwinkle sequences were predicted by different databases, one longer variant and one shorter variant lacking the N-terminal mitochondrial targeting sequence (MTS) and the zinc-binding domain (ZBD) (Figure 1A; Supplementary Figure S1C). The core regions, including the helicase and primase-like domains, are identical in the two variants. The shorter LcTwinkle variant was chosen for cloning. Hs, Mm, Dr and LcTwinkle homologs with active site mutations corresponding to gp4 E343Q (50) were preliminarily purified with an affinity column. Considering the similarities between gp4 and Twinkle, the same ssDNA substrate as in gp4 structure determination (5′-TGGTCTTTTTTTTTTTTTTTTTTTTTTTTT-3′) (21) was used for Twinkle complex formation, although the primase recognition motif TGGTC turned out to be unnecessary in the end. Notably, among all Twinkle proteins tested, only LcTwinkle migrated as a single sharp peak in the gel filtration column in the presence of ATP and ssDNA (Figure 1B; Supplementary Figure S2).
WT and E325Q LcTwinkle were thoroughly purified for further characterization. The purified WT LcTwinkle displayed robust ATP-dependent helicase activity (Figure 1C). The E325Q mutation eliminated the unwinding activity of LcTwinkle (Figure 1C). Although WT LcTwinkle can bind ssDNA in the presence of non-hydrolyzable ATP analogs (Supplementary Figure S2D), the resulting ssDNA complex was unstable. It migrated as a much higher molecular weight peak in gel filtration (Supplementary Figure S2E). Consistently, the WT LcTwinkle-ssDNA-ATPγS complexes appeared as aggregates on cryo-EM grids (Supplementary Figure S2F). Thus, the purified E325Q LcTwinkle-DNA-ATP complex from gel filtration (Figure 1B) was used for cryo-EM analysis. We determined the LcTwinkle-DNA complex structure with cryo-EM at 3.5 Å resolution (Figure 1D; Supplementary Table S1). The local resolutions of the C-terminal helicase domains (CTDs) were near 3 Å (Supplementary Figure S3E). Atomic structures of all six CTDs can be built (Figure 1D). The N-terminal primase-like domains (NTDs) were at 5-8 Å resolution (Supplementary Figure S3E). The α-helices in the NTDs can be traced, and bulky side chains near the NTD-CTD interfaces are visible (Supplementary Figure S4A). The homology model of the LcTwinkle NTD can be confidently docked into the cryo-EM densities in four out of the six subunits (Figure 1D). The CTDs of LcTwinkle are similar to the apo structures of HsTwinkle and AlphaFold (51) predicted structures, with root mean square deviations (RMSDs) of ∼1 Å (Supplementary Figure S5B), whereas the NTDs are less conserved and the RMSDs are ∼2.5 Å (Supplementary Figure S5C).
LcTwinkle forms a homo-hexamer wrapping DNA in its central channel (Figure 1D). The six subunits are labeled A-F following their order on DNA in the 3′ to 5′ direction. The NTDs are attached to the side of the CTDs. The domain structure of LcTwinkle is consistent with previously published low-resolution structural models of Twinkle (34,37,38), but different from the apo structure of HsTwinkle and predicted Twinkle structures (39), where the NTD interacts with the CTD from the same subunit (Supplementary Figure S5A). This arrangement of the NTDs is also distinct from all the other replicative helicases, in which the NTDs are all located on top of the helicase rings at the 5′ side of the DNA (Supplementary Figure S5D-I) (21-24,26). Compared with gp4, the LcTwinkle NTD swings nearly 90° to the side (Supplementary Figure S5J). Furthermore, the pseudo-primase active site is flipped to face downward, making interactions with the unwound DNA impossible (Supplementary Figure S5J). An extended linker connects the NTD and the CTD (N-C linker) in a domain-swapped manner (Figure 1D, E). The N-C linker from one subunit interacts with the CTD of the second subunit on its 5′ side of DNA. The CTDs form a non-planar lock-washer-shaped hexamer (Figure 1D), similar to gp4 and DnaB (21,22). The subunits at the two ends of the lock-washer are bridged by the N-C linker, making the hexamer a complete circle (Figure 1D). ATPase sites are located at each of the subunit interfaces (Figure 2A), and five ATP molecules can be refined (Supplementary Figure S4E). A 12 nt ssDNA binds to the central channel of LcTwinkle, with an average of 2 nt per subunit (Figure 2A). The DNA conformation in Twinkle is similar to that in gp4, DnaB, CMG and MCM helicases, but distinct from the DNA in E1 helicase (Supplementary Figure S5K).
Careful 3D classification yielded three additional structures of LcTwinkle with distinct conformations (Figure 1F-H; Supplementary Figure S3A). One structure obtained at 7.5 Å resolution contains DNA but exhibits altered CTDs (LcTwinkle-DNA₂). In LcTwinkle-DNA₂, the terminal helicase subunit (subunit F) has departed from subunit E and travelled halfway to the 3′ end of the DNA (Figure 1F; Supplementary Figure S6A). This structure may represent an intermediate state during Twinkle translocation. Although the LcTwinkle sample was prepared in the presence of DNA, two apo LcTwinkle structures were identified, with one in the hexamer form (LcTwinkle₆) and the other existing as a heptamer (LcTwinkle₇) (Figure 1G, H). These structures are similar to what was previously observed with apo Twinkle (34,37-39). LcTwinkle₆ stays in a similar non-planar lock-washer configuration as the LcTwinkle-DNA complex (Figure 1G; Supplementary Figure S6B), different from the planar rings formed by gp4 or DnaB in the absence of DNA (52-54). LcTwinkle₇ at 8.5 Å resolution contains an enlarged central channel of ∼40 Å (Figure 1H), similar to the HsTwinkle heptamer (39). Interestingly, a structural gap is also present in LcTwinkle₇. However, the NTDs are completely disordered in both apo structures, suggesting that the NTDs are highly dynamic without DNA (Figure 1G, H). It has been reported that in the absence of ssDNA, the NTD conformation is controlled by ATP binding (38). The NTDs are ordered without ATP but become disordered in the presence of ATP (38,39), consistent with our structures in the presence of ATP.

ATP-and DNA-binding interfaces of Twinkle
The LcTwinkle helicase contains five intact ATP-binding sites, each formed by residues from two adjacent subunits (Figure 2A). One Mg 2+ is associated with each ATP molecule (Supplementary Figure S4E). The Walker A motif from one subunit wraps the triphosphate group of the ATP with its backbone amide groups ( Figure 2B). The K301 side chain on the Walker A motif directly interacts with the ␥ -phosphate of the ATP. The T302 helps coordinate the Mg 2+ ion. The D396 on the Walker B motif is ∼4Å from the Mg 2+ and may coordinate the Mg 2+ through watermediated interaction. The mutated Q325 interacts with the ␥ -phosphate group of the ATP and Mg 2+ . Two side chains, F501 and R481, sandwich the adenine base of the ATP (Figure 2B; Supplementary Figure S7A). Similar to HsTwinkle (35), LcTwinkle is promiscuous and can utilize nucleotides with either an adenine or thymine base ( Supplementary Figure S7B). In contrast, gp4 prefers dTTP over ATP (55). The R481 on LcTwinkle comes from a secondary structural element different from that in gp4 and the F501 is the equivalent of the gp4 Y535 (Supplementary Figure S7A). Mutations of F501A and R481A reduce the K M of ATP by ∼10-fold (Supplementary Table S2). Surprisingly, F501Y reduces ATP binding as well and making LcTwinkle favor the adenine over the thymine base ( Supplementary Figure S7B). The K487 and R489 residues (the arginine finger) from the adjacent subunits contact the ␥ -phosphate of the ATP. In addition, the H438 side chain and the Q461 backbone are also in proximity to the ␥ -phosphate and may stabilize the water for nucleophilic attack during ATP hydrolysis ( Figure 2B). Mutations of R489A and H438A significantly reduce the ATPase activity and eliminate DNA binding and helicase activities of LcTwinkle, confirming their critical roles in ATP sensing and translocation ( Figure 2C, D; Supplementary Table S2).
DNA binding in LcTwinkle is mediated by two adjacent loops (LoopD1 and LoopD2 in Figure 2A and B). The G455 and the A457 backbone amide groups and the S456 side chain from LoopD1, and the R440 and the K441 from LoopD2 interact with the phosphate backbones of DNA ( Figure 2B). In addition, the Y412 side chain contacts the O4 atom on the sugar ring ( Figure 2B). The DNAbinding residues in each subunit span a distance of 4 nt, with residues on LoopD1 and the R440 on LoopD2 interacting with two nucleotides, and the K441 side chain grabbing two additional nucleotides on the 3 side of DNA ( Figure  2B). DNA bound to subunit F lacks the interaction with K441 due to the absence of a 5 -end neighbor. Two similar DNA-binding loops were found in gp4 helicase (Supplementary Figure S7C) (21). However, the corresponding position of F454 is an arginine in gp4, which is essential for gp4 DNA binding (56). LoopD2 in gp4 contains additional positively charged residues and packs on the 3 side of the coiled ssDNA (Supplementary Figure S7C) (57). Besides directly binding DNA, the positively charged LoopD2 was proposed to facilitate the large-scale domain movement during translocation and prevent DNA from slipping out of gp4 (Supplementary Figure S7D) (27,57). In contrast, LoopD2 in LcTwinkle is shorter and negatively charged. The DNA is exposed when observed from the C-terminal side of LcTwinkle (Supplementary Figure S7E). Overall, our structure suggested that LcTwinkle has reduced DNA interactions compared with that of gp4. Direct mutagenesis confirmed the structural model of Twinkle-DNA interaction. Mutations R440A and S456A eliminate DNA binding and helicase activities ( Figure 2C, D). K441A LcTwinkle had low solubility and could not be purified.
Each LcTwinkle helicase subunit is similar within the LcTwinkle hexamer, with an average RMSD of ∼0.2Å (Supplementary Figure S6C). The DNA-binding loops are superimposable, except that K441 from subunit A at the DNA 3 end takes a different conformation ( Figure 2E). Moreover, the ATP-binding sites at the domain interfaces  Figure  S6D). This is distinct from most other hexameric helicases, where gradual conformational changes of ATP-binding sites along the helicase hexamer were thought to correspond to sequential ATP hydrolysis and unidirectional translocation (21,(23)(24)(25)(26). When the planar apo gp4 is used as a reference, one subunit in an LcTwinkle dimer rotates 14 • relative to its neighboring subunit (Supplementary Figure S6E). As a comparison, the gp4 subunits rotate 17-23 • from the planar conformation (Supplementary Figure S6E). The almost identical configurations of the ATP-binding sites in the LcTwinkle hexamer suggested that all subunits have a similar chance of hydrolyzing ATP. Moreover, the similar hexameric structures with and without DNA (Supplementary Figure S6B) indicated that DNA binding would not significantly stimulate Twinkle ATPase activity. Notably, gp4 dTTPase activity is stimulated 40-to 100-fold by DNA, whereas Twinkle ATPase activity is only increased by <2fold upon DNA binding (Supplementary Table S2) (35,50).
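The RMSD values quoted for subunit comparisons follow the usual definition: the square root of the mean squared distance between matched atoms. The sketch below assumes the two coordinate sets are already superimposed and listed in the same atom order; it does not perform the optimal superposition that structure-analysis programs apply first, and the function names are hypothetical.

```python
import math
from itertools import combinations


def rmsd(coords_a, coords_b):
    """RMSD between two matched, pre-superimposed lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def average_pairwise_rmsd(subunits):
    """Mean RMSD over all unordered pairs of subunit coordinate sets."""
    pairs = list(combinations(subunits, 2))
    if not pairs:
        raise ValueError("need at least two subunits")
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)
```

For a hexamer, `average_pairwise_rmsd` averages over the 15 possible subunit pairs; whether the quoted average was computed over all pairs or against a single reference subunit is not stated in the text.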
To further investigate the mechanism of LcTwinkle ATP hydrolysis and translocation, we performed subunit doping experiments (50,58). When mutated enzymes with defected catalysis were titrated to the WT enzymes, the change in the activity level of the hetero-oligomer correlates with the subunit cooperativity. If the nucleotide hydrolysis is stochastic within the oligomer, the activity will decrease linearly; on the other hand, when the nucleotide hydrolysis is highly correlated, the activity will change exponentially. Previous subunit doping experiments indicated that the dTTPase activity is highly correlated in gp4 helicase (50), whereas ATP hydrolysis and translocation in archaeal MCM helicase is only moderately correlated (58). A prerequisite for subunit doping experiments is the efficient formation of the hetero-oligomeric helicase complex. To confirm that different types of LcTwinkle can form proper heterooligomers, we constructed and purified LcTwinkle with either N-terminal CFP or YFP (Supplementary Figure S8A,  B). Mixing of the two types of LcTwinkle produced a FRET signal at a wavelength of ∼536 nm, corresponding to YFP emission, while the CFP peak at ∼470 nm was reduced (Supplementary Figure S8C). Moreover, the FRET signal emerged within 2 min upon mixing (Supplementary Figure  S8D), suggesting efficient and fast hetero-oligomer formation. Similar to the gp4 subunit doping experiments (50), we picked E325Q LcTwinkle, which eliminates ATP hydrolysis but does not affect the ATP or the DNA binding ( Figure  2C). When E325Q LcTwinkle is titrated to the WT LcTwinkle, both the ATPase activity and the helicase activity drop linearly ( Figure 2G, H). The linear decrease suggested a mechanism of stochastic ATP hydrolysis within the LcTwinkle hexamer. Similar linear decreases for the ATPase and helicase activities were observed with H438A and R489A, despite their different roles in ATP hydrolysis and translocation (Supplementary Figure S8). 
In addition, we tested subunit doping with the DNA-binding mutant R440A (Supplementary Figure S8I, J). When R440A LcTwinkle was titrated, the DNA-binding affinity was reduced, but much more slowly than the predicted linear decrease. The helicase activity decreases linearly with the increasing amount of R440A LcTwinkle, similar to that of the ATPase site mutants.
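The two limiting behaviors contrasted in the subunit doping analysis can be illustrated with a minimal numerical sketch. It is not the analysis used in this study: it assumes that WT and mutant subunits mix randomly into hexamers (binomial statistics), and the function names are hypothetical. Under these assumptions, independent (stochastic) subunits give an activity of exactly 1 - f, where f is the mutant fraction, whereas a fully coupled ring that is inactivated by a single mutant subunit gives (1 - f)^6, the steep non-linear drop expected for highly correlated hydrolysis.

```python
from math import comb


def hexamer_distribution(f_mut, n=6):
    """Probability of k = 0..n mutant subunits per ring.

    Assumption: WT and mutant subunits mix randomly (binomial statistics).
    """
    return [comb(n, k) * f_mut**k * (1 - f_mut) ** (n - k) for k in range(n + 1)]


def activity_independent(f_mut, n=6):
    """Stochastic limit: every WT subunit contributes equally,
    regardless of its neighbors. The ensemble average is 1 - f_mut."""
    probs = hexamer_distribution(f_mut, n)
    return sum(p * (n - k) / n for k, p in enumerate(probs))


def activity_all_or_none(f_mut, n=6):
    """Fully coupled limit: one mutant subunit inactivates the whole ring,
    so only all-WT rings are active. The result is (1 - f_mut)**n."""
    return hexamer_distribution(f_mut, n)[0]


if __name__ == "__main__":
    print("f_mut  independent  all-or-none")
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{f:5.2f}  {activity_independent(f):11.3f}  {activity_all_or_none(f):11.3f}")
```

In this idealized picture, a measured curve that tracks the first column points to uncorrelated hydrolysis, while a curve closer to the second points to strong coupling between subunits; intermediate cooperativity falls between the two.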

Disease-related Twinkle mutations
Nearly 60 mutations on Twinkle have been implicated in human diseases (Figure 1A; Supplementary Figure S1B) (11). Except for four mutations on the extreme N- and C-termini, all other disease-related mutations on HsTwinkle can be mapped onto the LcTwinkle structure, with most of them conserved (Figure 3; Supplementary Figure S1B).
Besides mutations at the domain-swapped interface, 11 mutations are located on the CTD (Figure 3B). Residues W321 (HsTwinkle W441), N318 (HsTwinkle N438), S306 (HsTwinkle S426) and F304 (HsTwinkle F424) are at the core of the CTD and may be important for CTD folding and stability. R423 (HsTwinkle R543), T291 (HsTwinkle T411), R446 (HsTwinkle K566), N279 (HsTwinkle N399) and R280 (HsTwinkle R400) are near the subunit-subunit interface on the CTD. These residues may contribute to the CTD subunit-subunit interactions and ATP hydrolysis. In addition, seven mutations are mapped to the surface of the NTD, and most of them are charged residues (Figure 3C). Of note, A243 (human disease W363L), R228 (human disease G348R) and R446 (human disease K566R) are different from the corresponding human residues. The differences in A243 (HsTwinkle W363) and R228 (HsTwinkle G348) are possibly due to changes in their surrounding residues (Supplementary Figure S9L, M). R446 (HsTwinkle K566) is part of LoopD2 but does not directly contact DNA (Supplementary Figure S7C). How the human K566R mutation would affect Twinkle function is unclear.

HS-AFM imaging of LcTwinkle DNA binding dynamics
Our recent AFM imaging in liquids showed that human Twinkle subunits could self-assemble into hexamers and higher order complexes, and switch between open and closed configurations (59). However, due to the low temporal resolution of conventional AFM imaging, we could not elucidate the real-time conformational changes of Twinkle during DNA binding. To overcome this technical barrier, we used HS-AFM imaging in liquids to simultaneously obtain structural and dynamic information of LcTwinkle. A linear DNA (2030 bp with a short 37 nt ssDNA gap) (59) and WT LcTwinkle were deposited onto an APS-mica surface (60) in the presence of ATP. The AFM images were recorded at ∼0.8-2 frames/s to visualize the dynamics of LcTwinkle. Under this imaging condition, both DNA and LcTwinkle were mobile on the APS-mica surface.
In the absence of a proximal DNA molecule (LcTwinkle-DNA distance >200 nm), LcTwinkle exhibited limited conformational changes manifested as opening and closing of gaps between subunits (Supplementary Video S1; Figure 4A Figure 4E). The average length of the protrusion was 4.9 ± 1.8 nm (Figure 4E, n = 27). When DNA moved around or multiple DNA segments were nearby, we observed frequent domain protrusions from multiple subunits of LcTwinkle (Supplementary Figure S10; Supplementary Video S4). In addition, during the DNA search, some Twinkle rings were stretched to an almost linear form (Figure 4C; Supplementary Video S5). HS-AFM imaging also revealed the capture and the release of DNA at the central channel of LcTwinkle (Figure 4D; Supplementary Video S6; n = 7 events). The DNA binding through domain protrusion appeared to proceed with the loading of DNA into the central channel (Figure 4D, panels I-IV). It is worth noting that the ring opening of LcTwinkle allowed the entry or release of DNA from its central channel (Figure 4D, panels III and V). In addition, after DNA left the central channel, LcTwinkle transiently interacted with the protruding domain again (Figure 4D, panels VII and VIII).

Most replicative helicases contain NTDs in addition to the C-terminal helicase domains (Supplementary Figure S5) (21-24,26). The gp4 NTD encodes a primase domain, whereas the NTDs in bacterial, archaeal and eukaryotic replicative helicases physically interact with the primases (52,61). The NTDs are always on the 5′ side of the DNA relative to the helicase domains to assist primer synthesis (Supplementary Figure S5). Like gp4, Twinkle helicases from plants and lower eukaryotes can catalyze primer synthesis (62,63). However, the primase active site residues are mutated in vertebrate Twinkle (11). Furthermore, the NTD rotates 90° and flips 180° away from the unwound DNA (Supplementary Figure S5J). The unique NTD arrangement is consistent with the lack of Okazaki fragment synthesis in animal mitochondria (3,4). Moreover, the NTD conformations are likely to be controlled by both DNA and ATP. The NTDs are on top of the CTD ring in apo HsTwinkle structures (39). In our LcTwinkle structures prepared with a high concentration of ATP, the NTDs are disordered in the absence of DNA (Figure 1G) but attach to the side of the CTD ring in the presence of DNA (Figure 1D).
Although they do not have enzymatic activities, the Twinkle NTDs probably play important roles in Twinkle oligomerization, additional DNA binding and protein-protein interactions, similar to the NTDs in gp4 and DnaB (52,64). Our LcTwinkle structure shows that the NTD and the N-C linker directly contact the CTD in its neighboring subunit in the DNA-bound state (Figures 1D and 3A). The interaction stabilizes Twinkle hexamers around DNA. Mutations at the interface alter oligomerization of Twinkle and reduce helicase unwinding (Supplementary Figure S9) (38). The domain-swapped interface is a hotspot for disease-related mutations (Figure 3A). In addition, the NTD may facilitate Twinkle DNA capture and loading (as discussed below). Moreover, strand annealing and exchange activities have been reported for human Twinkle (35), and these activities require more than one DNA-binding interface. The six NTDs may capture multiple ssDNA molecules to assist their pairing. In addition, Twinkle is reported to interact with multiple factors in DNA replication and repair (6,16-18). Any proteins operating on the lagging strand DNA may contact Twinkle through the NTD. Seven disease-related mutations are mapped onto the NTD. They are far away from the CTD or the NTD-CTD interface, and thus not likely to be directly involved in Twinkle unwinding. These mutations may affect Twinkle protein or DNA interactions.
Loading of replicative helicases onto the genomic DNA is a prerequisite and often a critical regulatory step in DNA replication (29). In bacteria, archaea and eukaryotic nuclei, an origin recognition complex recognizes the replication origin and recruits the helicase. In addition, a helicase loader helps open the helicase ring for loading (29). Neither an origin recognition complex nor a helicase loader has been identified in mitochondria. Twinkle itself must search for and capture the DNA substrate in 3D. Remarkably, HS-AFM imaging in liquids uncovered a novel proximal DNA-induced conformational change of Twinkle. Using HS-AFM, we observed that when LcTwinkle is close to DNA (LcTwinkle-DNA distance <20 nm), a domain could move freely in solution and protrude ∼5 nm in length, away from the major portion of LcTwinkle oligomers, to search for and capture nearby DNA (Figure 4). We propose that the mobile domain is the NTD. The NTDs do not interact with each other (Figure 1D), while the CTDs can oligomerize in the presence of ATP, which was included in our AFM imaging buffer. Our apo LcTwinkle and previous apo HsTwinkle structures (39) confirmed the dynamic nature of the NTDs (Figure 1G, H). The N-C linker spans >30 residues and can account for the long-distance protrusion when it becomes fully extended. Therefore, we termed the domain protrusion observed in HS-AFM the 'N-protrusion' (Figure 4; Supplementary Videos S2-S6). The N-protrusion is probably guided by electrostatic interactions between the NTDs and DNA. There are several positively charged patches on the NTD that may interact electrostatically with the negatively charged DNA over long distances (Figure 4F). It is worth noting that the Debye screening length around dsDNA is ∼1 nm at the ionic strength used in HS-AFM imaging (65). However, the Debye screening length for DNA on the mica surface is unknown.
In addition, since both proteins and DNA were mobile during AFM imaging, the precise distance between LcTwinkle and DNA that activates N-protrusion could be significantly shorter than what we measure based on individual AFM images. The length of the N-protrusion could also be augmented by mobile DNA attached to the NTD.
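Two length scales in this argument can be checked with rough arithmetic. The sketch below is only an order-of-magnitude illustration under stated assumptions: the Debye length uses the standard approximation for a monovalent salt in water at about 25 °C (0.304 nm divided by the square root of the ionic strength in mol/l), the ionic strength of 0.1 M is an illustrative value and not the reported imaging buffer, and the rise of 0.36 nm per residue for a fully extended peptide is an assumed textbook-style figure. The function names are hypothetical.

```python
from math import sqrt


def debye_length_nm(ionic_strength_molar):
    """Debye screening length in nm.

    Assumption: monovalent salt in water at ~25 C, for which
    lambda_D ~= 0.304 / sqrt(I) with I in mol/l.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    return 0.304 / sqrt(ionic_strength_molar)


def contour_length_nm(n_residues, rise_per_residue_nm=0.36):
    """Upper-bound length of a fully extended peptide linker.

    Assumption: ~0.36 nm per residue for an extended chain.
    """
    return n_residues * rise_per_residue_nm


if __name__ == "__main__":
    # Illustrative ionic strength only; the actual buffer is not restated here.
    print(f"Debye length at 0.1 M: {debye_length_nm(0.1):.2f} nm")
    # The N-C linker spans >30 residues according to the text.
    print(f"Extended 30-residue linker: {contour_length_nm(30):.1f} nm")
```

Under these assumptions, a fully extended 30-residue linker is roughly twice the ∼5 nm protrusion observed, so the linker length alone does not limit the N-protrusion, whereas the protrusion extends several screening lengths away from the ring.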
The extraordinary conformational change of the N-protrusion suggests a plausible mechanism for Twinkle DNA loading in the absence of a helicase loader (Figure 5). Individual NTDs from an apo Twinkle molecule either sit on their own CTD without domain swap (39) or extend out to search for nearby DNA (Figure 4; Supplementary Videos S2-S6). After DNA capture, the protruded NTD helps bring the DNA into the central channel for loading (Figure 4D; Supplementary Video S6). In addition, the Twinkle hexameric ring connected by the N-C linkers must open to allow DNA entrance. Apo LcTwinkle is likely to be prone to ring opening due to the lack of domain-swapped N-C interactions. HS-AFM revealed that LcTwinkle could switch between open and closed ring conformations in the absence of DNA (Figure 4A, panels I and II; Supplementary Video S2). Our HS-AFM data also confirmed that Twinkle could capture DNA at its central channel through ring opening (Figure 4D; Supplementary Video S6) without subunit association or dissociation. Taken together, the cryo-EM and HS-AFM data suggest two separate DNA-binding regions on Twinkle, the NTD for initial transient DNA capture and the helicase central channel for DNA unwinding. Our model of Twinkle loading (Figure 5) is also consistent with a model proposed for gp4 helicase, which involves the initial binding of the DNA to the NTD outside of the helicase ring, a conformational transition followed by the migration of the DNA into the central channel, and ring closure (31). Although we did not directly observe them, we cannot exclude alternative possibilities of LcTwinkle loading by assembly of monomers or dissociation of a subunit from a heptamer.
ATP binding and hydrolysis power helicase translocation on DNA. While several models of ATP hydrolysis have been proposed for hexameric ATPases, the sequential ATP hydrolysis model is widely accepted (66). In contrast, our structural and biochemical analyses suggest that ATP hydrolysis is stochastic in LcTwinkle. Random ATP hydrolysis was also reported for the hexameric ClpX peptide translocase, where the loss of catalytic residues in one or more ATP-binding sites only reduces but does not eliminate translocation (67). Similarly, ATP-binding sites in the hetero-hexameric CMG helicase are not all required for its translocation (26). Helicase translocation is associated with frequent futile cycles, slipping and backtracking (55). Possibly, ATP hydrolysis near the 5′ end of DNA will lead to translocation of one to two subunits, whereas ATP hydrolysis in the middle or close to the 3′ end may result in futile ATPase cycles (Figure 5). The en bloc movement of multiple subunits also explains how inactive subunits can be tolerated in Twinkle, CMG and ClpX. Twinkle can move on DNA unidirectionally in the 5′ to 3′ direction (32). Our LcTwinkle-DNA structure suggested that subunit F on the 5′ end of DNA has reduced DNA binding, as one of the key DNA-binding residues K441 from a neighboring subunit is missing at the 5′ end of DNA. It is possible that the subtle difference in DNA binding renders subunit F most mobile. On the other hand, the directional movement could be affected by the NTDs. SF4 G40P helicase has a minimal NTD and can translocate in both directions on ssDNA (68). In addition, a comparison of DNA-binding loops in gp4 and LcTwinkle also suggested that DNA may easily slide out from the Twinkle DNA-binding channel (Supplementary Figure S7E). Thus, our structural and biochemical analyses suggest that Twinkle is inefficient in unwinding DNA due to the stochastic ATP hydrolysis and reduced DNA binding.
Indeed, Twinkle ATP hydrolysis is 20- to 100-fold slower than that of gp4 or DnaB (35). It is possible that Twinkle only stays in a partially active form and additional binding partners or post-translational modifications fully activate Twinkle.

Figure 5. Model of LcTwinkle loading and translocation. Twinkle in the absence of DNA is prone to ring opening (State I). When DNA is nearby, the positively charged NTD captures the negatively charged DNA over long distances and the CTD hexamer opens further for DNA loading (State II). After loading, Twinkle forms a lock-washer-shaped hexamer with the NTDs attaching to the side of the CTDs and the DNA binding to the CTD central channel (State III). ATP hydrolysis is stochastic in Twinkle. When ATP is hydrolyzed at the subunit interfaces near the DNA 5′ end (State IV and State VI), one or two subunits dissociate from their neighbor and travel to the other end of the DNA (State V and State VII). However, when ATP is hydrolyzed in the middle or close to the DNA 3′ end (State VIII), the translocation is unfavored, leading to a futile cycle of ATP hydrolysis.
In summary, our biochemical, biophysical and structural data illustrate unique structural and dynamic features of the mitochondrial replicative helicase Twinkle. Our data highlight the important role of the enzymatically inactive NTD in Twinkle loading and unwinding, and the NTD-CTD interface represents a hotspot for human disease-related mutations.

DATA AVAILABILITY
All original data and materials are available upon request. The 3D cryo-EM density maps for LcTwinkle complexes have been deposited in the EM Database under the accession codes EMD-27842, EMD-27843, EMD-27844 and EMD-27845, and the coordinates for the structure have been deposited in the Protein Data Bank under accession code PDB 8E2L.