Nonstructural N- and C-tails of Dbp2 confer the protein full helicase activities

Human DDX5 and its yeast ortholog Dbp2 are ATP-dependent RNA helicases that play a key role in normal cell processes, cancer development, and viral infection. The crystal structure of the RecA1-like domain of DDX5 is available but the global structure of DDX5/Dbp2 subfamily proteins remains to be elucidated. Here, we report the first X-ray crystal structures of the Dbp2 helicase core alone and in complex with ADP at 3.22 Å and 3.05 Å resolutions, respectively. The structures of the ADP-bound post-hydrolysis state and apo-state demonstrate the conformational changes that occur when the nucleotides are released. Our results showed that the helicase core of Dbp2 shifted between open and closed conformation in solution but the unwinding activity was hindered when the helicase core was restricted to a single conformation. A small-angle X-ray scattering experiment showed that the disordered amino (N) tail and carboxy (C) tails are flexible in solution. Truncation mutations confirmed that the terminal tails were critical for the nucleic acid binding, ATPase, and unwinding activities, with the C-tail being exclusively responsible for the annealing activity. Furthermore, we labeled the terminal tails to observe the conformational changes between the disordered tails and the helicase core upon binding nucleic acid substrates. Specifically, we found that the nonstructural terminal tails bind to RNA substrates and tether them to the helicase core domain, thereby conferring full helicase activities to the Dbp2 protein. This distinct structural characteristic provides new insight into the mechanism of DEAD-box RNA helicases.

Human DDX5 and its yeast ortholog Dbp2 are ATPdependent RNA helicases that play a key role in normal cell processes, cancer development, and viral infection. The crystal structure of the RecA1-like domain of DDX5 is available but the global structure of DDX5/Dbp2 subfamily proteins remains to be elucidated. Here, we report the first X-ray crystal structures of the Dbp2 helicase core alone and in complex with ADP at 3.22 Å and 3.05 Å resolutions, respectively. The structures of the ADP-bound post-hydrolysis state and apo-state demonstrate the conformational changes that occur when the nucleotides are released. Our results showed that the helicase core of Dbp2 shifted between open and closed conformation in solution but the unwinding activity was hindered when the helicase core was restricted to a single conformation. A small-angle Xray scattering experiment showed that the disordered amino (N) tail and carboxy (C) tails are flexible in solution. Truncation mutations confirmed that the terminal tails were critical for the nucleic acid binding, ATPase, and unwinding activities, with the C-tail being exclusively responsible for the annealing activity. Furthermore, we labeled the terminal tails to observe the conformational changes between the disordered tails and the helicase core upon binding nucleic acid substrates. Specifically, we found that the nonstructural terminal tails bind to RNA substrates and tether them to the helicase core domain, thereby conferring full helicase activities to the Dbp2 protein. This distinct structural characteristic provides new insight into the mechanism of DEAD-box RNA helicases.
DEAD-box RNA helicases represent the largest helicase family in the helicase superfamily 2 (1). The DEAD-box family is implicated in many aspects of RNA metabolism, such as remodeling the RNA-protein complex, ribosome biogenesis, mRNA and microRNA processing, RNA editing, and RNA degradation (2). Human DDX5, yeast Dbp2, and fly Rm62 represent a subfamily of DEAD-box proteins. Dbp2 is a bona fide RNA helicase that is now known to be involved in multiple pathways of RNA metabolism. It has been implicated in cotranscriptional mRNA-binding protein assembly (3) and facilitates efficient transcription termination via the Nrd1-Nab3-Sen1 complex (4). In particular, Dbp2 regulates lncRNA-DNA hybrid and R-loop formation and loss of Dbp2 leads to the accumulation of RNA-DNA hybrids in vivo (5). Moreover, Dbp2 possesses RNA-DNA unwinding and annealing activities in vitro (6). Given that DDX5 resolves Rloops at DNA double-strand breaks to promote DNA repair (7), Dbp2 may be involved in DNA double-strand break repair. Dbp2 also can disrupt RNA and DNA G-quadruplex (G4) structures with two G-tetrad stacks (8); G4 sequences are widely present in various virus genomes (9) and Dbp2 is also implicated in RNA virus infections (10). Moreover, the abnormal expression of DDX5 appears in various cancer cells, such as breast cancers (11), colon cancers (12), prostate cancer (13), lung cancer (14), gastric cancer (15), and leukemia (16). Therefore, DDX5 is a promising drug target for cancer diagnosis. In cells lacking Dbp2, ectopic expression of DDX5 could rescue defects in growth and gene expression (17). The complementation of Dbp2 with DDX5 illustrates the functional conservation of DDX5 and Dbp2 across evolution and confirms that the DDX5/Dbp2 subfamily shares highly conserved protein sequences and cellular functions.
Similar to other DEAD-box proteins, DDX5/Dbp2 is composed of two RecA-like domains that include 11 conserved motifs. Motifs Q, I, Ia, GG, Ib, II, and III are found in the first RecA-like domain (also named the DEAD domain or RecA1) and motifs IV, V, QxxR, and IV are found in the second RecAlike domain (also named cHelicase or RecA2), these motifs are involved in RNA and ATP binding (18). In addition to the conserved helicase core, the N-and C-terminal tails of DEADbox proteins are highly variable, giving them a role in augmenting enzymatic activity or regulating specific protein and nucleic acid interactions (19,20). N-and C-tails of DEAD-box proteins commonly contain low-complexity sequences (21). Low-complexity sequences rich in arginine and glycine, called the RGG motif regulate nucleic acid and protein interactions through arginine methylation and partner-binding proteins (22). Dbp2 contains two RGG motifs located in the N-and C- ‡ These authors contributed equally to this work as Joint First Authors. * For correspondence: Xu-Guang Xi, xxi01@ens-cachan.fr; Stephane Rety, stephane.rety@ens-lyon.fr; Xi-Miao Hou, houximiao@nwsuaf.edu.cn.
tails, including the C-terminal RGG motif which is critical for binding and destabilizing G4 DNA and RNA substrates (23). Dbp2 harbors two inverse activities, ATP-dependent unwinding and ATP-independent annealing activities but which domains perform these functions is still unclear. In particular, more detail is needed on the role of the N-and C-tails of Dbp2 in RNA remodeling.
Understanding the tertiary structure of Dbp2 is critical in the search for small molecule inhibitors of DDX5-associated cancers. Currently, only the crystal structure of the RecA1 domain (residues 79-303) of DDX5 is available. Here, we report the crystal structure of the Dbp2 helicase core, which consists of two RecAlike domains and N-/C-terminal extensions (NTE/CTE). A cyclic open-closed motion of the Dbp2 helicase core is required for rapid and efficient unwinding activity. Our structural-guided truncated mutants in combination with nucleic acid binding, ATPase activity, unwinding, and annealing studies show that the helicase core itself has no binding, unwinding, and annealing activities. In contrast, the terminal tails are essential for nucleic acid binding and ATPase activity and regulate the unwinding; the C-tail is specifically responsible for annealing. Furthermore, the intrinsically disordered N-and C-tails become compact upon RNA binding and RNA substrates induce conformational changes that confer full helicase activities to the Dbp2 protein. This systematic and thorough study revealed unique structural characteristics of Dbp2, thus providing invaluable insights into the mechanism of the DEAD-box protein.

Results
Dbp2 is an RNA unwinding helicase, particularly efficient on the forked RNA/DNA duplex unwinding Prior to characterizing the structure of the Dbp2 protein, we first investigated the intrinsic enzymatic activities of a highly purified recombinant Dbp2 from Saccharomyces cerevisiae. We used a fluorescence anisotropy assay with various RNA, DNA, and hybridized RNA/DNA (R/D) substrates to gain insight into the binding properties of Dbp2 (Fig. 1A). Apparent dissociation constants (K d ) from the protein concentration-dependent changes in fluorescence anisotropy show that Dbp2 displayed significantly higher binding affinities to the 22 nt of ssRNA (R-S22) and RNA G4 (R-G4 Tel ) than to short length ssRNA (R-S12) and ssDNA (D-S12). Dbp2 showed relatively weak binding to dsDNA (D-D12) (Fig. 1B, upper panel; Table S1).
The RNA-dependent ATPase activities of Dbp2 were measured using a coupled spectrophotometric ATPase assay. In the absence of RNA, Dbp2 had undetectable ATPase activity that was stimulated by the addition of R-S22 ( Table S1). Other short DNA or RNA substrates (R/D-D12, R-S12, D-S12, and D-D12) resulted in very weak ATPase activity (1.04 ± 0.515.68 ± 0.43 μmol ADP/min/μM Dbp2) in agreement with the binding studies.
The helicase activity of Dbp2 was further characterized with respect to substrate specificity and directionality. A stopped- Figure 1. Biochemical characterization of Dbp2. A, schematic presentation of the RNA and DNA substrates used in the binding, ATPase, unwinding, and annealing experiments. S represents a single strand; n, the length of single and duplex strands; -D, duplex strand; Oh, overhanging strand; Y, fork-structured strand. B, upper panel, the binding affinities of different substrates to Dbp2. The K d values were measured in three independent replicates; bottom panel, ATPase activities of Dbp2 in the presence of different nucleic acids, error bars represent SD from four independent reactions. C, Dbp2-catalyzed unwinding kinetics with RNA and DNA duplex substrates and RNA/DNA hybrid duplexes. D, Dbp2-catalyzed annealing kinetics with RNA-RNA strands and RNA-DNA strands. A stopped-flow kinetic trace was fitted from the average of 10 individual traces, data show mean ± SD from three independent experiments. flow fluorescence assay was used to determine the unwinding amplitudes (Am) and rates of different RNA duplexes and R/D hybrid duplexes. As shown in Figure 1C, the results indicate that Dbp2 generally displayed higher unwinding amplitudes and rates toward R/D duplexes than to RNA duplexes. Although the forked DNA duplex (D-Y) was not an appropriate substrate, unwinding of RNA duplexes occurred with both 3 0 and 5 0 overhang substrates, indicating that Dbp2 does not have polarity preference (comparison of substrate R-3 0 Oh versus R-5 0 Oh). Interestingly, Dbp2 displayed its highest unwinding amplitude and rates on the forked R/D duplex (R/D-Y). Altogether, these results indicate that Dbp2 is an RNA unwinding helicase, particularly the forked RNA/DNA duplex.
Furthermore, Dbp2 facilitates not only the disruption but also the formation of RNA duplexes (6). Using a stopped-flow FRET experiment, we tested the annealing kinetics of Dbp2 for different types of substrates ( Fig. 1D and Table S1). The results show that Dbp2 more efficiently annealed blunt-end RNA-RNA and RNA-DNA substrates than other types of substrates. These results together demonstrate that Dbp2 is an active RNA helicase in vitro.

Crystal structure of Dbp2 catalytic core
To obtain structural information on the DDX5/Dbp2 subfamily, Dbp2 from S. cerevisiae was used for a crystallographic study. Dbp2 shares 45.7% sequence identity with human DDX5 (Fig. S1). Full-length Dbp2 and several different truncated constructions were expressed and purified for crystallization but no crystals were obtained. Therefore, the full-length Dbp2 underwent limited proteolysis and a compact segment was purified for the crystallographic study (Fig. S2).
The crystal structures of Dbp2 catalytic core apo and ADPbound states were solved by molecular replacement and refined to 3.22 Å and 3.05 Å resolutions, respectively (Table S2), using the N-terminal domain of DDX5 as a search model (PDB 4A4D). The helicase core of Dbp2 consists of two RecA-like domains (RecA1: aa 101-319; RecA2: aa 332-465), an NTE (aa 53-100), a CTE (aa 466-496), and a linker (aa 320-331) ( Fig. 2A). The Dbp2 catalytic core apo (Dbp2-apo) crystallized in space group C121 with 2.5 molecules in the asymmetric unit: two molecules (aa 57-496) in nearly identical conformation (rmsd 1.11 Å) and a half molecule (aa 57-326). This molecule fragment composed of NTE-RecA1 is likely generated by the limited proteolysis of the linker between the RecA1 and RecA2 domains. The Dbp2 catalytic core bound with ADP (Dbp2-ADP) crystallized in space group P2 1 2 1 2 1 with six molecules (aa 53-496) in the asymmetric unit, arranged in a D3 hexamer (Fig S4A). For Dbp2-ADP structure, because of the crystals suffered from severe twinning and the structure was difficult to refine, we collected many crystals prepared in different conditions. Finally, we obtained the crystal of Dbp2 in complex with ADP and a 16 nt polyGT ssDNA (ssDNA is not visible in the density). ADP is no clearly visible after molecular replacement in Fo-Fc difference map but could be refined with good density (Final 2Fo-Fc map). Density for ADP was verified with calculation of a composite omit map (Fig. S3), meanwhile, comparison of ADP site with DDX17 structure (Protein Data Bank 6uv3) showed a good conservation of the nucleotide-binding site (Fig S4B).
In the crystal structure of Dbp2-apo, two RecA-like domains adopted an open state, with the NTE subdomain located on the exterior surface of RecA1 and the CTE subdomain inserted between two RecA-like domains (Fig. 2B). To estimate the position and intrinsic mobility of the two RecA-like domains, we calculated the distance between the centers of mass of the RecA1 and RecA2. This distance was 42.9 Å in the open state, showing that the two domains were far apart, as shown in Figure 2B. A flexible linker connects to the core domain but the conformation of Dbp2-apo is rigid. In the Dbp2-apo structure, the RecA1 and RecA2 domains are restrained by the linker and CTE subdomains. The RecA1, RecA2, NTE, CTE subdomains, and linker form an interaction network, stabilizing the conformation (Fig S4C).
When Dbp2 bound to ADP, the two RecA-like domains underwent a large conformational change, distance between the centers of mass of the RecA1-RecA2 domains was reduced to 30.8 Å (Fig. 2C) and the RecA2 domain rotated by 143.9 ( Fig. 2D). Thus, in presence of ADP, the RecA2 domain moved closer to the RecA1 domain and the helicase core adopted a more compact conformation than in the Dbp2-apo structure. Small-angle X-ray scattering (SAXS) analysis showed that Dbp2-ADP is monomeric in solution (Table S3) and the observed hexamer form is probably due to crystal packing. Although the conformation of the helicase core in the crystal structure of Dbp2-ADP was in a semi-closed state, it was in a completely open state in solution ( Fig S4, D and E). Previous studies have shown that conformation of the helicase core exists in the preferred orientation and moves independently (24). Hence, both conformations may exist upon the binding of ADP.
Based on the structural homology alignment with DALI (http://ekhidna.biocenter.helsinki.fi/dali_server/), the closest structure to Dbp2 is the DEAD-box protein DDX17 (25), the human paralog of DDX5, which has similar roles in RNA metabolism and cancer progression (26). As expected, the intrinsic mobility of the RecA2 domain was observed in the unbound and ATP-bound states of DDX17, and the distance between the centers of mass of the RecA1-RecA2 domains shifted from 42.9 Å to 29.2 Å (Fig. 2, E and F), while the RecA2 domain rotated by 170.0 . Moreover, the conformational change of RecA2 between Dbp2-apo and Dbp2-ADP was also similar to the difference between DDX17-apo and DDX17-ADP⋅BeF 3 --U10 (Fig. 2, D and G). Then, a comparison of the position of the RecA2 domain in the Dbp2-apo and DDX17apo structures showed a difference in the distance between the center of mass of RecA2 of 44.9 Å and a difference in orientation of 170.9 (Fig. 2H). Although the apo-structures of Dbp2 and DDX17 were both in the open state, the conformation of the helicase core was different in the RecA2 position. In contrast, comparing structures of Dbp2-ADP and DDX17-ATP analog (Fig. 2I), the distance between the center of mass of RecA2 of both structures was only 22.1 Å and the angle between the RecA2 domains was 50.4 , showing that both conformations are closer than conformations between Dbp2-apo and DDX17-apo structures. Furthermore, we found that the P loop of these structures is associated with adenosine nucleotides. Thus, we compared the P loop of Dbp2 and DDX17 with or without ADP and ADP⋅BeF 3 -, demonstrating that the P loop gradually moves to the left in the presence of the adenosine nucleotide and with an increase in the number of phosphate groups ( Fig S4F). These structures show that the helicase core of Dbp2/DDX17 alone tends to be in a fully open state, but, in the presence of adenosine nucleotides, the helicase core tends to be in a closed state.

The NTE and CTE of Dbp2 regulate ATPase activity
The overall structure of the Dbp2 helicase core is similar to that of other DEAD-box proteins and in particular to the structure of DDX17. However, the relative orientation and location of the NTE and CTE showed differences in detail. In the structure of Dbp2-apo, the CTE consists of an α-helix and a disordered loop that inserts into the cleft of the helicase core. Residues R495 and R496 of the CTE form five hydrogen bonds with V128, E131, and T137 of the Q motif in the RecA1 domain (Fig. 3A, black box). Simultaneously, we observed the interaction of ADP with the Q motif of the RecA1 domain in the Dbp2-ADP structure (Fig. 3A, blue box). Structural superposition showed that residues R495 and R496 occupy the ADP binding site, and ADP binding resulted in the CTE moving to the side of the RecA2 domain (Figs. 2J and 3A). Further, the R495A/R496A double mutant also showed increased ATPase activity in vitro (Fig. 3C). In contrast, the CTEs of both DDX17-apo and ATP-bound state were always on the side of the RecA2 domain (Fig. 3B). Moreover, we found multiple interactions in the CTE of the four structures, particularly with residue Q484 in Dbp2 corresponding to residue Q450 in DDX17, which is involved in conserved interactions in the four structures (Table S4). Mutant Q484A showed reduced ATPase activity compared with full-length Dbp2 (Fig. 3C). Therefore, our structural and in vitro biochemical studies both indicate that the CTE is critical in regulating the ATPase activity of Dbp2.
The NTE consists of universally conserved aromatic residues (Figs. 3D and S4G) and the orientation of the NTE found in the crystal structures of the Dbp2 catalytic core was similar to those in human DDX5 (27) and DDX17 (25). A previous study on DDX17 indicates that the NTE regulates ATPase activity (25); therefore, we tested whether the NTE of Dbp2 has a similar function. We investigated the roles of the conserved aromatic residues using a mutational analysis. The binding activities of mutants showed visibly reduced affinity, except for E80A/H81A (Fig. 3E). ATPase activity was abrogated for mutants F73A and F77A/Y78A, but activity in E80A/H81A showed a decrease of roughly 2.2-fold and in Y78E a decrease of 3-fold, but activity was nearly unaffected in Y78A (Fig. 3F). The unwinding activity was nearly abrogated in all mutants, except for E80A/H81A ( Fig. 3G and Table S5). Interestingly, the Y56E mutant in DDX17 showed an increase in ATPase and unwinding activities, but the corresponding mutant Y78E in Dbp2 showed a severe decrease in ATPase activity and a complete loss of binding and unwinding activities. Because mutations can introduce disruptions in protein folding, we performed CD experiments to determine if the NTE and CTE mutants introduced changes in the secondary structure. The results showed that the CD spectra of mutants were similar to the WT Dbp2 (Fig S4, H and I). Taken together, our results indicate that the NTE is involved in the unwinding activity of Dbp2 by modulating both substrate binding and ATPase activity.  (Fig. 4, B and C, apo). The individual smFRET traces for Dbp2 showed that the FRET value fluctuated between 0.41 and 0.93 (Fig. 4D), indicating that the helicase core undergoes dynamic conformation changes in solution. This is consistent with crystal studies that report DEAD-box proteins in different conformations (28). We then examined the distributions of three FRET states of Dbp2 upon binding of ADP⋅AlF 4 , ADP⋅AlF 4 and ssRNA, or ADP. As shown in Figure 4B, upon binding ADP⋅AlF 4 , the lowest FRET peak was still present and transited slightly to higher FRET peak (Fig. 4, B and C, ADP⋅AlF 4 ). In contrast, Dbp2 in complex with ADP⋅AlF 4 and ssRNA showed a remarkable increase in the E FRET 0.93 peak (Fig. 4, B, C and E, ADP⋅AlF 4 + ssRNA), which is consistent with the crystal structure of DDX17-ADP⋅BeF 3 --U10 in the closed conformation. The FRET distribution of Dbp2-ADP was similar to that of Dbp2-ADP⋅AlF 4 except for a decrease in the E FRET 0.93 peak (Fig. 4, B and C, ADP). These results demonstrate that the helicase core of Dbp2 without ligands in the solution adopts different conformations and that the simultaneous presence of an ATP analog and ssRNA promotes a relatively closed conformation. A previous study showed that DEAD-box protein unwinds nucleic acid duplexes using ATP-dependent conformational changes of the helicase core (29). To test whether the conformational change of the helicase core of Dbp2 is essential for unwinding activity, we used intramolecular crosslinking to lock the helicase core of Dbp2 in a single conformation and then detected for unwinding activity. In the crystal structure of Dbp2-ADP, residue Y221 of the RecA1 domain is close to residues G392 and D393 of the RecA2 domain (Fig. 4F); therefore, the spatial distance is in favor of an interdomain disulfide bridge formation between the RecA1 and RecA2 domains. To increase the chance of disulfide bridge formation, we constructed a three-cysteine-residues mutant Y221C/ G392C/D393C to form intramolecular disulfide bonds. First, we replaced the sequence of the linker with that of the 3C protease recognition site to verify the formation of disulfide bonds in the Y221C/G392C/D393C mutant. If Y221C/G392C/ D393C forms a disulfide bond, the position of the linker can be cleaved by a 3C protease but the disulfide bond still connects the helicase core; otherwise, the helicase core is split into two parts. SDS-PAGE analysis showed that WT Dbp2 could be cleaved by 3C protease either in the presence or absence of reducing agent tris (2-carboxyethyl) phosphine (TCEP), but the mutant could not be cleaved in the absence of TCEP (Fig. 4G), even upon increasing the amount of 3C protease ( Fig  S4E). However, Y221C/G392C/D393C was cleaved in the presence of TCEP, suggesting that mutant Y221C/G392C/ D393C could form disulfide bonds that can be disrupted by TCEP (Fig S5F). Then, we measured the unwinding activity of Dbp2 and Y221C/G392C/D393C in the presence and absence of TCEP. As a control, Dbp2 and Y221C, G392C/D393C mutants displayed normal ATP-dependent unwinding activity in the absence of TCEP (Fig S5, G and H) but the unwinding activity of Y221C/G392C/D393C was negligible in the absence of 2 mM TCEP (Fig. 4H), suggesting that Dbp2 is inactive in a single conformation. These results provide direct evidence that the unwinding activity of the helicase core of Dbp2 depends on dynamic conformational changes.

Inherently disordered N-and C-tails confer full helicase activities to the Dbp2 protein
To probe the spatial conformation of N-and C-tails relative to the full-length Dbp2 in solution, SAXS assays were performed. Fig S6A is the experimental SAXS profile of Dbp2. The Ensemble Optimization Method (30) analysis was carried out for Dbp2 by introducing flexibility in the N-and C-tails, and in the linker between the helicase core. An atomic model was finally obtained with χ 2 of 1.62 (Fig. 5A). The full-length Dbp2 underwent dynamic changes in solution. The Kratky curve in Fig S6B further indicates that the full-length Dbp2 is disordered, because the protein with a compact spherical structure will generate a symmetrical curve. Consistently, the N-and C-tails of Dbp2 predicted by AlphaFold (https:// alphafold.ebi.ac.uk/P24783) were also nonstructural (Fig. 5B). Inspection of the linear sequence of amino acids revealed that there are two and four conserved RGG motifs in the N-and C-tails, respectively. To further identify, which motifs play a crucial role in the regulation of helicase activity, the RGG tails were modified, resulting in six truncated versions, in which the RGG motif at the N-tail (Dbp2 53-546 and Dbp2  ) and the C-tail (Dbp2 1-496 and Dbp2 53-496 ) were deleted, respectively (Figs. 5C and S6C). The mutants (Dbp2 1-514 , Dbp2 1-524 , and Dbp2 1-530 ) were the modified versions in which the RGG motifs in the C-tail were sequentially truncated, while maintaining N-tail integrity. Although we were unable to obtain the separate N-and C-tails (Dbp2 1-52 and Dbp2 497-546 ), they were fused with the MBP tag to purify.
Thus, the above mutants were characterized in parallel to the WT enzyme under the same experimental conditions. Characterizations of the WT and modified proteins revealed three surprising results. First, in the RNA binding assay, although the helicase core protein (Dbp2 53-496 ) was totally inactive, the isolated N-and C-tails displayed RNA-binding activities comparable to the WT protein (Fig. 5D). Further, we found that the C-tail peptide 514 to 524 is critical to the binding of Dbp2 on R-S22 and peptide 496 to 524 is critical to the binding of R-G4 Tel (Fig. 5, B and E). Second, we examined the effects of N-and C-tails on the ATP-hydrolysis, unwinding, and annealing activities of Dbp2. Compared with the fulllength Dbp2, the helicase core showed very poor ATP hydrolysis activity (Fig. 5F); meanwhile, no unwinding and annealing activities could be detected (Figs. 5G and S6D). In contrast, upon removal of the N-tail or C-tail, the protein retained its ATP hydrolysis and unwinding activities to varying degrees, but the C-tail was essential for the annealing activity ( Fig. S6D and Table S6). Third, the most striking result was that the helicase core is entirely inactive without the assistance of the terminal tails (Fig. 5, A-I). To further confirm these results, a kinetic binding assay with labeled RNAs was carried out under equilibrium conditions using both anisotropy and SDS gel assays. In addition, parallel binding assays were performed under the same experimental conditions with and without proteolytic trypsin. Figure 5, H and I showed that Dbp2 binding activity gradually decreased with increasing time of trypsin cleavage.
Taken together, the N-and C-tails are critical to the RNA substrate binding, ATP hydrolysis, unwinding, and annealing activities of Dbp2. These results clearly indicate that the disordered terminal tails play an essential role in helicase activity: although the helicase core is inactive, the disordered terminal tails confer full helicase activity to the Dbp2 protein.
The structurally disordered N-and C-tails become compact upon RNA binding and coordinate the helicase core domain What is the potential molecular mechanism by which the "nonstructural N-and C-tails" confer full helicase activities to Dbp2? Using a computationally efficient approach that samples conformational space in a random manner (31), we found that the C β distances between the N-terminal T2 and C141 of RecA1 is 25.9 Å and the C β distances between the C-terminal Y546 and C373 of RecA2 is 40.8 Å; these distances are very suitable for use as pairs in FRET (Fig S6E) (32). Then, we constructed two mutants, T2C/C141 and C373/Y546C, which were stochastically labeled with Cy3 and Cy5 (Fig. 6, A and E). In T2C/C141, in addition to the mutation of threonine into cysteine at the second amino acid, all other cysteines were mutated into alanine. In C373/Y546C, in addition to tyrosine at amino acid 546 being mutated into cysteine, all other cysteines were mutated into alanine. Both mutants had normal helicase activity compared with WT Dbp2 (Fig S6F). The successfully fluorophore-labeled T2C/C141 or C373/Y546C was then immobilized on the surface via the biotinylated Histag antibody.
We first measured the apo state of T2C/C141 and C373/ Y546C. Both mutants showed a wide and flat FRET distribution with single peaks of E FRET at 0.66 for T2C/C141 and E FRET at 0.61 for C373/Y546C, respectively (Fig. 6, D and H, apo). In addition, the characteristic FRET traces of the two mutants show a high degree of fluctuations (Fig. 6, B, C, F and G, apo), which was similar to the helicase core conformation of Dbp2 in Figure 4B, indicating that the N-and C-tails are also highly dynamic in solution.
Next, we examined the FRET distribution of T2C/C141 and C373/Y546C in the presence of substrates. When R-S22*, R-G4 Tel *, and 3 0 R-G4 Tel * were added at saturated concentration to T2C/C141, the FRET distribution presented narrow peaks with higher E FRET at 0.7, 0.71, and 0.78, respectively, indicating that the RNA substrate induced the N-tail to approach the RecA1 domain (Fig. 6, B-D). The fluctuation range of FRET traces also narrowed, suggesting that the structure shifted from a highly dynamic to a static state. This phenomenon is consistent with the single band in the nondenatured PAGE when Dbp2 1-496 binds to the RNA substrate (Fig. S7, A-D). In contrast, after adding D-D12*, the FRET histogram of T2C/C141 was similar to the apo state, consistent with the weak binding activity of Dbp2 to duplex DNA (Fig. 1B). After adding saturated concentrations of R-S22*, R-G4 Tel *, and 3 0 R-G4 Tel * to C373/Y546C, two peaks appeared in the FRET distribution (Fig. 6, F-H). The FRET level after binding with substrate also rapidly shifted between the two states (Fig. 6, F-H), indicating that the C-tail may exist in two different conformations when it is adjacent to the RecA2 domain. These observations are consistent with the two bands observed in the nondenatured PAGE when Dbp2 53-546 was associated with the RNA substrate (Fig S7, A-D). As a control, the addition of D-D12* had little effect on C373/Y546C. Altogether, ssRNA, TelG4 RNA, and 3 0 TelG4 RNA bind to the N-and C-tails, which become compact and tether the substrates to the helicase core. As a result, the conformation of Dbp2 changes from a highly dynamic to a relatively static state after forming a complex with the substrates.

Discussion
Dbp2, a DEAD-box protein in S. cerevisiae, has functions in regulating mRNP assembly (6), transcription initiation (5), and metabolic gene expression (33). Here, we not only reported the crystal structures of the Dbp2 catalytic core alone and in complex with ADP but also determined the function of the terminal tails. We solved the first crystal structure of Dbp2 to date, illustrating that the protein is rigid and in an open state. Similar conformations have also been found in Prp5 (34) and Prp28 (28).
Interestingly, the addition of ADP changes the conformation of the helicase core from an open state to a nearly closed state, which is different from the ADP-bound structures of DDX19 (35), DDX21 (36) and Prp5 (34). Dbp2 bound and unbound ADP structures give important insights into the conformational changes between nucleotide-free and ADP-bound states. Similarly, smFRET experiments also captured the conformational changes of the helicase core in the absence and presence of nucleotides and ssRNA. Consequently, our study explains how the DEAD-box protein unfolds a duplex during a cycle of ATPdependent conformational changes.
In many crystal structures of DEAD-box proteins, the NTEs and CTEs participate in structural plasticity at the terminal boundary of the helicase core. Moreover, they are essential for RNA binding, ATPase, and unwinding activities (25,27,35,37,38). The NTE is conserved between the Dbp2-apo and Dbp2-ADP structures and enwinds the exterior of the RecA1 domain, reaching near the RNA substrate binding area. In contrast, the CTE has two conformations: one in which Dbp2apo is wedged between the helicase core and one in which the addition of ADP induces it to move out of the way of the ATP binding site. Mutational analyses also suggest that NTE and CTE are involved in regulating ATPase activity. Interestingly, the NTEs of Dbp2 and DDX17 simultaneously regulate ATPase and unwinding activities but they display different effects. Hence, knowing the structure and function of individual DEAD-box subfamily proteins provides a better understanding of the function of DEAD-box proteins in vivo.
As an RNA helicase, Dbp2 simultaneously possesses the ability of duplex disruption and formation; substrates with tails or blunt-ends are involved in the unwinding and annealing activities of Dbp2. Most DEAD-box proteins exclusively unfold RNA duplex substrates, but a few show both unwinding and annealing activity, such as Mss116p (39), CsdA (40) and Ded1 (41). There are several possible mechanisms that may mediate between the unwinding and annealing activities, such as protein oligomerization, ATP and ADP concentration, and the presence of an annealing domain (42). Here, upon truncation of the C-tail in Dbp2, annealing activity barely detectable, and unwinding activity was reduced by half. Hence, our results suggest that the C-tail of Dbp2 regulates the equilibrium between unwinding and annealing activities.
Due to the intrinsically disordered N-and C-terminal tails of DEAD-box protein, no structures of an intact full-length DEAD-box protein are available to date. Hence, we used SAXS and smFRET experiments to characterize the structural and conformational changes of full-length Dbp2 in solution. A previous study showed that the helicase core adopts a single conformation (43); our results showed that full-length Dbp2 adopts three flexible and dynamic conformations. Furthermore, we found that the N-and C-tails act as substrate binding sites and tether RNA substrates to the core domain. This result is consistent with previous studies on Mss116p and CsdA, in which C-tails can bind nonspecifically or specifically to RNA substrates and flexibly tether the substrates to the helicase core for the unwinding of neighboring duplexes (24,44). Intrinsically disordered N-and C-tails in DEAD-box proteins contribute to mediating-specific protein and RNA interactions. Like other DEAD-box proteins, the N-tails of DDX43 (45) and LAF1 (46) contribute to RNA binding, the C-tails of DDX17 (25), DDX21 (47), DDX55 (48), and CsdA (44) are critical for RNA-binding activity, while the helicase core barely displays any RNA-binding activity. Hence, it is interesting to note that the nonstructured N-and C-tails likely act as tentacles to capture the RNA substrates and facilitate the unwinding, ATPase, and annealing activities of DEAD-box proteins. Our study demonstrates that nonstructured N-and C-tails display full Dbp2 activity, which may be a key to understanding the disordered terminal tails in DEAD-box protein function and their role in RNA metabolism.

Protein expression and purification
The complementary DNA fragments of Dbp2 1-546 , Dbp2 53-546 , Dbp2 1-496 , Dbp2 1-514 , Dbp2 1-524 , Dbp2 1-530 , and Dbp2  were amplified by PCR and ligated into the pET21a HMT vector. The fused protein includes a His 6 -tag at the N terminus and a TEV protease cleavage site between Dbp2 and MBP. Protein expressions were done in Escherichia coli strain C2566H grown in LB medium at 37 C to an A 600 of 0.56 and induced with 0.2 mM IPTG at 18 C overnight. Cells were collected and resuspended in buffer (50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 10% Glycerol, 5 mM imidazole) and then lysed by sonication. After high-speed centrifugation, the supernatants were loaded on a Ni 2+ charged immobilized metal ion affinity chromatography column (GE Healthcare) and washed with 1M NaCl to remove bound nucleic acids. The eluted proteins were cleaved using TEV proteases for 16 h. The proteins were further purified through an SP ion exchange chromatography (GE Healthcare), while Dbp2 53-496 used Q ion exchange chromatography to purify. Purified proteins were concentrated and stored at −80 C. All mutants were purified using the same protocol as Dbp2.

Limited proteolysis and purification
Dbp2 proteins were incubated with trypsin at a range of concentrations (0, 0.00001, 0.0001, 0.001, 0.01, 0.1 mg/ml) in 50 μl reaction buffer (20 mM Tris-HCl pH 8.0, 600 mM NaCl, 10 mM CaCl 2 ) at 16 C for 1 h, then 1 mM ethylene diamine tetraacetic acid was added into reactions to stop the reactions. The SDS-PAGE analysis indicated that 0.0001 mg/ml of trypsin was the optimal enzyme concentration. Then, we tested the time to trypsin cleavage. Finally, we determined the reaction condition 0.0001 mg/ml trypsin to react for 2.5 h at 16 C. For large-scale proteolysis, proteins, and trypsin were expanded proportionately. Proteolytic fragments were purified by Q ion exchange chromatography. The eluted proteins were combined and concentrated to 25 mg/ml for crystallization.
Crystallization, data collection, and structure determination Dbp2 proteolytic fragment was stored in the buffer (20 mM Hepes pH 7.0, 150 mM NaCl) for crystal screening. Crystals were grown using the sitting-drop vapor diffusion technique at 20 C. The helicase core of Dbp2 was diluted to 8 mg/ml and in complex with 20 mM ADP. Crystals of the helicase core of Dbp2 were grown in a reservoir solution containing 0.12 M monosaccharides, 1 M sodium Hepes; 3-morpholinopropane-1-sulfonic (acid) pH 7.5, 19% w/v PEG 500MME PEG 20000. Crystals of Dbp2 helicase core in complex with ADP and 16 nt polyGT ssDNA were grown in reservoir solution containing 2.02 M (NH4) 2 SO 4 and 100 mM Tris pH 7.5. The crystals are acquired after 1 to 2 days.
For data collection, 20% ethylene glycol was added to the reservoir solution and the crystals were immediately mounted and flash cooled in liquid N 2 . X-ray diffraction data were collected on beamline BL18U1 beamline at Shanghai Synchrotron Radiation Facility (SSRF) and were processed using the XDS package (http://xds.mpimf-heidelberg.mpg.de/) (49). The Dbp2 structure was solved by molecular replacement. Phenix was used for refined (50) and Coot was used for model building (51). For twinning problems of Dbp2-ADP-ssDNA crystal, we made a lot of efforts and optimally selected P2 1 2 1 2 1 space group. Crystals had unit cell 168.84 168.84 145.93 90.0 90.0 90.0 and could be apparently indexed in space group P4 3 2 1 2 with three molecules in the asymmetric unit. Though data collection statistics were good, solving the structure in this space group by molecular replacement are impossible with no solution found. Apparent P4 3 2 1 2 could be due to perfect merohedral twinning and possible space groups can be P4 3 or P2 1 2 1 2 1 as described in (52). We tried subgroups P4 3 with twin law h, -k, -l, and P2 1 2 1 2 1 with twin law -h, l, k and six molecules in the asymmetric unit for both cases. Twin fraction is 0.43 as calculated by Britton analyses. P2 1 2 1 2 1 gives better refinement statistics and we choose this space group. The detailed structure determination and refinement statistics are shown in Table S2.

Small-angle X-ray scattering
The full-length Dbp2 was prepared at 5.0 mg/ml in 25 mM Hepes pH 7.5, 500 mM NaCl, and 20% glycerol buffer. SAXS data were collected at the BL19U2 beamline of the Shanghai Synchrotron Radiation Facility. The SAXS data for Dbp2  with ADP was collected at beamline SWING (SOLEIL Synchrotron, Saint-Aubin). The scattering processing was measured as described (53). The SAXS data were analyzed using ATSAS (54) and model building was done with Ensemble Optimization Method (30). All SAXS data are shown in Table S3.

CD spectroscopy
CD spectra measurements were performed using a spectropolarimeter (Applied Photophysics). Purified Dbp2, NTE, and CTE mutants contained N-terminal MBP tag dialyzed in sodium phosphate buffer (Na 2 HPO 4 /NaH 2 PO 4 , pH 7.5) at 25 C with a protein concentration of 3 μM. CD spectra were collected in the wavelength range of 190 to 260 nm. CD results were analyzed using the CAPITO software (https://capito. software/) (55).

ATPase activity assay
The ATPase activity of Dbp2 and mutants was measured using an ATPase/GTPase activity assay kit (Sigma MAK113). Briefly, Dbp2 and several mutants were diluted at a final concentration of 0.1 μM in assay buffer (25 mM Tris-HCl pH 7.5, 30 mM KCl, 2 mM MgCl 2 ). Dbp2 and several deletion constructions were diluted in the assay buffer at a final concentration of 0.5 μM. The assay reaction mixture contained 0.75 μM 16 bp RNA-DNA hybrid duplex, 1.5 mM ATP, and assay buffer was added to bring the volume to 20 μl. The reactions were processed at 24 C for 15 min and were quenched by adding 100 μl regent. The absorbance was read out at 620 nm. The standard curve equations were used to calculate the free phosphate.

Fluorescent labeling
The purified C141/C373 and T2C/C141 proteins contain a His 6 -tag at the C terminus and the C373/Y546C protein contains a His 6 -tag at the N terminus. Before labeling, mutant proteins were incubated with 1 mM TCEP for 30 min at room temperature to reduce disulfide bonds. For fluorescent labeling, the double-Cys mutants were randomly labeled by sequentially adding Cy3-and Cy5-maleimide fluorophore (Lumiprobe). The dye and protein molar ratio are 5: 1. The reaction proceeded overnight at the room temperature while stirring in the dark. Free Cy3 and Cy5 fluorophore was removed by Ni affinity chromatography. The fluorescentlabeled proteins were eluted from Ni affinity chromatography. The 12% SDS-PAGE detected the labeled extent of proteins, the labeling efficiency was calculated by gel analysis and it was more than 80%. The concentrations of labeled proteins were measured by Thermo Fisher Scientific Nanodrop 2000c. Labeled proteins were stored at −80 C in 50 mM Tris-HCl, pH 8.0, 1 mM DTT, 600 mM NaCl, and 20% (vol/ vol) glycerol.

smFRET measurements
smFRET assay was performed with a home-built objectivetype total-internal-reflection microscopy (56). In short, 50 nM C141/C373 was added to the chamber and allowed to be immobilized for 10 min, then free C141/C373 molecule was removed by wash buffer 25 mM Tris-HCl pH 7.5, 30 mM KCl, 2 mM MgCl 2 , and an oxygen scavenging system (0.8% Dglucose, 1 mg/ml glucose oxidase, 0.4 mg/ml catalase, and 1 mM Trolox). Imaging was initiated after C141/C373. Singlemolecule measurements were performed using an exposure time of 100 ms. FRET analysis was referred to describe previously (57). The smFRET experiments of T2C/C141 and C373/Y546C were performed in similar process.

Stopped-flow FRET measurements
A stopped-flow FRET assay was used to measure the unwinding kinetic rate constants of Dbp2 and its variants. The unwinding experiments were performed as described (58). Briefly, the reaction buffer contains 25 mM Tris-HCl (pH 7.5), 30 mM KCl, and 2 mM MgCl 2 . 25 nM Dbp2 and 8 nM substrates (Table S7) were mixed in the reaction buffer. After incubation for 5 min at 25 C. The reaction was initiated by adding 2 mM ATP. Unwinding kinetic rate constants and amplitude were analyzed as described (59).
The annealing experiment was measured using a stoppedflow FRET assay. The whole operation is done in a similar way to the unwinding experiment. The experiments were completed at 25 C in a buffer of 25 mM Tris-HCl (pH 7.5), 30 mM KCl, and 2 mM MgCl 2 . Two complementary strands containing one of them were mixed with 25 nM Dbp2 and another strand was added to initiate the reaction. The data analysis refers to describe previously (60).

Electrophoretic mobility shift assay
Fam-labeled R-S22, R-G4 Tel , 3 0 R-G4 Tel , and 5 0 R-G4 Tel were mixed with varying amounts of proteins in the binding buffer containing 25 mM Tris-HCl (pH 7.5), 30 mM KCl, 2 mM MgCl 2 . The complexes were separated by native PAGE and visualized using a ChemiDoc MP (BIO-RAD).

Nucleic acid binding assay
A fluorescence anisotropy assay was used to determine the apparent dissociation constants of the proteins under the equilibrium nucleic acid binding conditions. Protein concentration-dependent changes in fluorescence anisotropy were measured with fluorescein-labeled nucleic acid by fluorescence polarization assay using an Infinite F200 instrument (TECAN). Varying amounts of protein were added to a 150 μl aliquot of buffer (25 mM Tris-HCl, pH 7.5, 20 mM NaCl/ 30 mM KCl, 2 mM MgCl 2 , and 1 mM DTT) containing 5 nM fluorescein-labeled nucleic acid (Table S7). Protein and substrate were incubated at 25 C for 5 min. After 5 min, the steadystate fluorescence anisotropy (r) was measured. Calculation of the equilibrium dissociation constant as described (61).

Data availability
All data are contained within the article. Data necessary to validate protein structure determination and modeling can be obtained in the Protein Data Bank under the following accession number: Dbp2-apo: 8ARK; Dbp2-ADP: 8ARP.
Supporting information-This article contains supporting information.