Structural insights into the functions of the FANCM-FAAP24 complex in DNA repair

Fanconi anemia (FA) is a genetically heterogeneous disorder associated with deficiencies in the FA complementation group network. FA complementation group M (FANCM) and FA-associated protein 24 kDa (FAAP24) form a stable complex that anchors the FA core complex to chromatin during the repair of DNA interstrand crosslinks. Here, we report the first crystal structure of the C-terminal segment of FANCM in complex with FAAP24. The C-terminal segment of FANCM and FAAP24 both consist of a nuclease domain at the N-terminus and a tandem helix-hairpin-helix (HhH)2 domain at the C-terminus. The FANCM-FAAP24 complex exhibits an architecture similar to that of ApXPF. However, variations in several key residues and in the electrostatic properties of the active-site region render the nuclease domain of FANCM catalytically inactive, accounting for the lack of nuclease activity. We also show that the first HhH motif of FAAP24 is a potential binding site for DNA, which plays a critical role in targeting FANCM-FAAP24 to chromatin. These results provide mechanistic insights into the functions of FANCM-FAAP24 in DNA repair.

FANCM is a key component of the FA core complex. It contains a DEAH helicase domain at the N-terminus, and a putative nuclease domain and a tandem helix-hairpin-helix (HhH)2 domain at the C-terminus. The DEAH helicase domain harbors an ATP-dependent DNA-remodeling translocase activity, which is important for the FA core complex to monoubiquitinate FANCI-FANCD2 (15-19). FANCM interacts with the FANCM-associated histone-fold proteins 1 and 2 (MHF1-MHF2) complex via the region following the helicase domain (residues 661-800) and with FAAP24 via the C-terminal domains, forming two stable complexes that bind different DNA substrates and are crucial for FANCM to facilitate repair of DNA interstrand crosslinks (ICLs) (3-5). FANCM can also interact with the RMI1-RMI2 complex via its MM2 motif (residues 1218-1251) to recruit the BLM-RMI1-Topo IIIα dissolvasome to ICL-stalled replication forks (20-22).
The FANCM-FAAP24 complex binds preferentially to single-stranded DNA (ssDNA). It plays a key role in the recruitment of the FA core complex to damaged DNA and the monoubiquitylation of FANCD2 (5,18). It is also essential for ataxia-telangiectasia and Rad3-related (ATR)-mediated S phase checkpoint signaling (23-25). Moreover, in addition to their coordinated functions, the two proteins have distinct, non-overlapping functions: FAAP24 promotes ATR-mediated checkpoint activation, particularly in response to DNA ICL agents, whereas FANCM participates in recombination-independent ICL repair by facilitating recruitment of lesion incision activities (26). The FANCM-MHF complex binds preferentially to branched DNA and double-stranded DNA (dsDNA) in vitro and stimulates the DNA branch migration and replication fork reversal activities of FANCM (3,4). The MHF1-MHF2 complex assumes a heterotetrameric architecture similar to that of the histone (H3-H4)2 heterotetramer and interacts with FANCM to form a binding site for DNA (27,28).
FANCM and FAAP24 are suggested to belong to the XPF family (5,16). All members of this family contain an excision repair cross-complementation group 4 (ERCC4) nuclease domain and a tandem helix-hairpin-helix (HhH)2 domain, and exist as heterodimers in eukaryotes and homodimers in archaea. These proteins can be divided into two groups: catalytic members and non-catalytic members. The catalytic members such as XPF and Mus81 contain a conserved GDXnERKX3D motif at the active site of the nuclease domain and exhibit nuclease activity, and the non-catalytic members such as Eme1, Eme2 and ERCC1 lack the catalytic motif (29,30). Previous data showed that FANCM has a divergent sequence CDXnERRX3E at the equivalent region of the catalytic motif and FAAP24 lacks the catalytic motif, and the FANCM-FAAP24 complex has no detectable nuclease activity in vitro (5,16). The (HhH)2 domain is composed of two tandem HhH motifs, which can mediate the binding of different DNA structures and dimerization of the two proteins (31-34).
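The motif comparison above can be illustrated with a small pattern search. The sketch below is only an illustration, not the procedure used in this work: the spacer length allowed for Xn (1 to 30 residues) is an assumption, and the three toy sequences are invented for the example and are not real protein sequences.

```python
import re

# Catalytic consensus quoted in the text: G-D-Xn-E-R-K-X3-D.
# ASSUMPTION: Xn is allowed to span 1-30 residues; the text does not give n.
CATALYTIC = re.compile(r"GD[A-Z]{1,30}?ERK[A-Z]{3}D")

# Divergent pattern quoted in the text for FANCM: C-D-Xn-E-R-R-X3-E.
FANCM_LIKE = re.compile(r"CD[A-Z]{1,30}?ERR[A-Z]{3}E")


def classify(seq: str) -> str:
    """Label a one-letter amino acid sequence by the motif it contains."""
    seq = seq.upper()
    if CATALYTIC.search(seq):
        return "catalytic motif"
    if FANCM_LIKE.search(seq):
        return "divergent (FANCM-like) motif"
    return "no motif"


# HYPOTHETICAL toy sequences, invented for this example only.
toy_catalytic = "MKVGDAAAAAAERKSLSDLL"
toy_fancm_like = "MKVCDAAAAAAERRSLSELL"
toy_none = "MKVPDAAAAAALYVSLSDLL"

for name, seq in [("toy_catalytic", toy_catalytic),
                  ("toy_fancm_like", toy_fancm_like),
                  ("toy_none", toy_none)]:
    print(name, "->", classify(seq))
```

A match to the catalytic pattern only indicates that the sequence signature is present; as the structural analysis in this work shows, activity also depends on the geometry and electrostatics of the surrounding region.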
We report here the crystal structure of the human FANCM-FAAP24 complex, which together with the biological data provides molecular insights into its functions in DNA repair. The FANCM-FAAP24 heterodimer structurally resembles the ApXPF homodimer. Although the putative active site of FANCM has a structure similar to that of XPF, differences in several key residues and in the electrostatic surface of the surrounding region render the nuclease domain inactive, accounting for the lack of nuclease activity. The first HhH motif of FAAP24 is a potential DNA binding site, which is essential for proper localization of FANCM-FAAP24 to chromatin in DNA repair.

Cloning, expression and purification
The gene fragments encoding the C-terminal regions of human FANCM (residues 1727-2048, 1799-2048 and 1813-2031) and human FAAP24 (full-length and residues 17-215) were inserted into the pET-Duet plasmid (Novagen) and the pET-28a plasmid (Novagen), respectively; the latter attaches a His6 tag at the C-terminus of FAAP24. The two plasmids were cotransformed into Escherichia coli BL21 (DE3) Codon-Plus strain (Novagen). The transformed cells were grown at 37 °C in lysogeny broth (LB) medium until OD600 reached 0.8 and then induced with 0.2 mM isopropyl β-D-1-thiogalactopyranoside. The cells were collected by centrifugation, suspended in buffer A [20 mM Tris-HCl (pH 8.0), 500 mM NaCl, 10% glycerol and 1 mM phenylmethylsulfonyl fluoride] and lysed by sonication. The FANCM-FAAP24 complex was purified by affinity chromatography using a Ni-NTA column (Qiagen), with buffer A supplemented with 20 mM imidazole serving as washing buffer and buffer A supplemented with 200 mM imidazole serving as elution buffer. The eluted sample was further purified by gel filtration using a Superdex 200 16/60 column (GE Healthcare) pre-equilibrated with buffer B [10 mM HEPES (pH 7.5), 100 mM NaCl and 2 mM MgCl2]. For the Se-Met derivative proteins, the cells were grown in M9 medium supplemented with amino acids Lys, Thr, Phe, Leu, Ile, Val, Se-Met and 1% lactose. The FANCM and FAAP24 mutants containing different point mutations were generated using the QuikChange Site-Directed Mutagenesis Kit (Stratagene). The resultant protein samples were of >95% purity as analyzed by SDS-PAGE.

Crystallization, data collection and structure determination
Crystallization was performed using the hanging drop vapor diffusion method at 20 °C. Crystals of both native and Se-Met FANCM-FAAP24 were grown from drops consisting of 1 µl of protein solution (~6 mg/ml) and 1 µl of reservoir solution containing 1.0 M (NH4)2SO4 and 0.1 M HEPES (pH 7.2). For diffraction data collection, the crystals were cryoprotected using the reservoir solution supplemented with 30% glycol and then flash-cooled in liquid nitrogen. Three Se-derivative datasets of 3.4 Å resolution and a native dataset of 2.9 Å resolution were collected at 100 K at beamline 17U of the Shanghai Synchrotron Radiation Facility, China. The diffraction data were processed with HKL2000 (35). The statistics of the diffraction data are summarized in Table 1.
The structure of FANCM-FAAP24 was solved by the multi-wavelength anomalous dispersion method using Phenix (36), which identified 7 Se atoms and yielded a figure of merit of 0.44. The structure model was refined against the native dataset using Phenix (36) and Refmac5 (37). Model building was performed using Coot (38). The final structure model contains residues 1815-2030 of FANCM and residues 18-213 of FAAP24, except that three surface-exposed loops (residues 1903-1915 and 1965-1969 of FANCM, and residues 147-152 of FAAP24) could not be modeled owing to poor electron density. Structure analysis was carried out using programs in CCP4 (39) and the PISA server (40). All the structure figures were prepared using PyMOL (http://www.pymol.org). The statistics of the structure refinement and the quality of the final structure model are also summarized in Table 1.

Electrophoretic mobility shift assay
Electrophoretic mobility shift assays (EMSA) were performed as described previously (5). The ssDNA, dsDNA and splayed-arm DNA substrates were also prepared as described previously (5,41). For comparison, all DNAs were labeled with 5′-biotin in oligo 1 and oligo 3. Each reaction mixture (20 µl) contained 10 nM 5′-biotin-labeled DNA in a buffer containing 25 mM Tris-HCl (pH 7.5), 5 mM EDTA and 6% glycerol. Reactions were initiated by adding different amounts of FANCM-FAAP24 and then incubated for 30 min at 4 °C.
The reaction mixture was loaded on a 6% neutral polyacrylamide gel prepared in 0.5× TBE buffer [45 mM Tris-borate (pH 8.0) and 1 mM EDTA]. Gels were run in 0.5× TBE buffer at 100 V for 100-120 min at 4 °C and then transferred to a positively charged nylon membrane (GE Healthcare) in 0.5× TBE buffer at 100 V for 60 min at 4 °C. The nylon membrane was cross-linked at 120 mJ/cm2 using a UVC 5000 Crosslinker (Hofer). The biotin-labeled DNA was detected using the LightShift Chemiluminescent EMSA Kit (Thermo Scientific).
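As a worked illustration of the quantities involved in the binding reactions, the following sketch converts the stated DNA concentration into an absolute amount and expresses a protein titration as a protein:DNA molar ratio. It assumes the reaction volume is 20 microlitres, and the protein concentrations in the loop are hypothetical placeholders, since the amounts of FANCM-FAAP24 added are not specified in this section.

```python
def amount_fmol(conc_nM: float, volume_ul: float) -> float:
    """Amount of solute in femtomoles.

    nM x microlitre = (1e-9 mol/l) x (1e-6 l) = 1e-15 mol = 1 fmol.
    """
    return conc_nM * volume_ul


def molar_ratio(protein_nM: float, dna_nM: float) -> float:
    """Protein:DNA molar ratio in a single reaction."""
    if dna_nM <= 0:
        raise ValueError("DNA concentration must be positive")
    return protein_nM / dna_nM


DNA_NM = 10.0       # labeled DNA concentration stated in the text
REACTION_UL = 20.0  # ASSUMPTION: reaction volume taken as 20 microlitres

print(f"DNA per reaction: {amount_fmol(DNA_NM, REACTION_UL):.0f} fmol")

# HYPOTHETICAL protein concentrations, for illustration only.
for protein_nM in (10.0, 50.0, 250.0):
    ratio = molar_ratio(protein_nM, DNA_NM)
    print(f"{protein_nM:6.1f} nM protein -> {ratio:.1f} protein per DNA")
```

Expressing a titration as a molar ratio makes it easier to judge whether multiple shifted bands could arise from more than one protein molecule binding a single DNA substrate.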

Cell culture, transfection and immunofluorescence
The genes of the full-length FANCM and the C-terminal segment of FANCM (residues 1799-2048, FANCM L2) in wild-type, truncated and mutant forms were subcloned into the pEGFP-C3 vector (Clontech), which attaches an enhanced green fluorescent protein (EGFP) tag at the N-terminus, and the genes of the full-length FAAP24 in wild-type, truncated and mutant forms were subcloned into the pcDNA3 vector (Invitrogen), which attaches a Flag tag at the N-terminus. HEK293T cells were cultured on coverslips in Dulbecco's modified Eagle's medium (Hyclone) supplemented with 10% fetal bovine serum (Biochrom AG) and transfected with 2 µg of plasmids using Lipofectamine 2000 (Invitrogen). Twenty-four hours after transfection, the cells were fixed with 4% paraformaldehyde (Sigma), washed three times with phosphate buffered saline (PBS), permeabilized using PBS supplemented with 0.1% Triton X-100 and then blocked using 2% bovine serum albumin in PBS supplemented with 0.1% Triton X-100. Flag-FAAP24 was stained with mouse anti-Flag M2 mAb (1:1000, Sigma) followed by Alexa Fluor 647 goat anti-mouse mAb (1:1000, Invitrogen). DNA was stained with 4′,6-diamidino-2-phenylindole (DAPI, 1:2000, Sigma). Images were acquired using a 63× oil immersion lens on a Leica TCS SP5 II confocal microscope (Leica).

Structure of the FANCM-FAAP24 complex
It was reported previously that two forms of the C-terminal fragment of human FANCM (residues 1727-2048, FANCM L1; residues 1799-2048, FANCM L2) can form stable complexes with full-length FAAP24 (residues 1-215, FAAP24 FL), which bind different types of DNA structures with preferences for ssDNA, splayed-arm DNA and a 3′-flap DNA 5-10-fold better than for dsDNA (5). We obtained these two forms of the FANCM-FAAP24 complex with high purity, stability and homogeneity and confirmed their binding to different types of DNA structures (Figure 1A and Supplementary Figure S1), but failed to obtain any crystals. We then constructed a slightly shorter form of FANCM (residues 1813-2031, FANCM S) and an N-terminally truncated form of FAAP24 (residues 17-215, FAAP24 S), and the resultant FANCM S -FAAP24 S complex led to successful crystallization and structure determination (Figure 1B). We also confirmed that the FANCM S -FAAP24 S complex retains the ability to bind different types of DNA substrates (Figure 1B and Supplementary Figure S1). It is noteworthy that there are several shifted bands in the EMSA results. As the FANCM-FAAP24 complex can bind different types of DNA structures in a non-sequence-specific way, and the DNA substrates are relatively long, it is possible that a single DNA substrate binds one or more protein molecules under the assay conditions, leading to multiple shifted bands. This speculation is supported in part by the observation that the bands with larger molecular masses appear to be more intense at higher protein concentrations. The FANCM S -FAAP24 S complex appears to bind the DNA substrates to form large protein-DNA complexes with higher molecular masses, which barely move in the gels. For simplicity, we will refer to this complex as the FANCM-FAAP24 complex hereafter unless otherwise specified.
The crystal structure of the FANCM-FAAP24 complex was solved using the multi-wavelength anomalous dispersion method and refined to 2.9 Å resolution (Figure 1B and Table 1). There is one FANCM-FAAP24 complex in the asymmetric unit, which contains residues 1815-2030 of FANCM and residues 18-213 of FAAP24 except that three solvent-exposed loops (residues 1903-1915 and 1965-1969 of FANCM, and residues 147-152 of FAAP24) could not be modeled owing to poor electron density. [...] (Supplementary Figure S2C). The two nuclease domains and the two (HhH)2 domains form two compact heterodimers, respectively.

Interactions between FANCM and FAAP24
The interactions between the nuclease domain and the (HhH)2 domain of FANCM are mediated mainly by residues from α4 and the β2-β3, β3-β4, α4-β5 and α6-α7 loops of the nuclease domain, and α8 and the α7-α8 and α10-α11 loops of the (HhH)2 domain, which involve numerous hydrophobic contacts and two hydrogen bonds and bury a total of 655 Å2 of solvent-accessible surface area (Supplementary Table S1). The N-terminal end of α8 of the (HhH)2 domain is embedded in a hydrophobic surface patch formed by the β2-β3, β3-β4 and α6-α7 loops of the nuclease domain, which contribute more than half of the hydrophobic contacts and one hydrogen bond at this interface (Supplementary Table S1). Helix α4 and the following α4-β5 loop of the nuclease domain mainly interact with the α7-α8 and α10-α11 loops of the (HhH)2 domain, which also contribute nearly half of the hydrophobic contacts and one hydrogen bond at this interface (Supplementary Table S1).
The two nuclease domains form a heterodimer with pseudo 2-fold symmetry such that the C-terminal edges of the two β-sheets (β6) are in contact with each other (Figures 1B and 2B). The interaction interface involves mainly residues from α4, α5 and β6 of FANCM and α3′, α4′, β6′ and the α4′-α5′ loop of FAAP24 and buries a total of 1021 Å2 solvent-accessible surface area (Figures 1B and 2B and Supplementary Table S2). The interactions are largely hydrophobic but also comprise 5 hydrogen bonds and 1 salt bridge. Specifically, the interactions between β6 and α5 of FANCM and β6′, α3′ and α4′ of FAAP24 contribute more than two-thirds of the hydrophobic contacts and four hydrophilic interactions at this interface. The interactions between α4 of FANCM and the α4′-α5′ loop of FAAP24 contribute the rest of the hydrophobic contacts and two hydrogen-bonding interactions.
The two (HhH)2 domains also form a heterodimer with pseudo 2-fold symmetry, and the interaction interface involves mainly residues from α7, α9 and α11, and the connecting loops of FANCM and α5′, α6′, α7′ and α9′, and the connecting loops of FAAP24, and buries a total of 950 Å2 solvent-accessible surface area (Figures 1B and 2C and D and Supplementary Table S2). Similarly, the interactions are largely hydrophobic but also comprise 11 hydrogen bonds. Helix α9 and the preceding loop of FANCM mainly interact with α7′, α9′, the α7′-α8′ loop, and the C-terminus of FAAP24, which contribute one fourth of the hydrophobic contacts and five hydrogen bonds at this interface. Helix α11 and the C-terminus of FANCM mainly interact with α5′, α7′, and the α4′-α5′ and α6′-α7′ loops of FAAP24, which contribute more than half of the hydrophobic contacts and five hydrogen bonds. The extensive hydrophilic and hydrophobic interactions between the two (HhH)2 domains are in agreement with the observation that the (HhH)2 domains mediate the dimerization in other XPF family members (31,32,34).
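The interface figures quoted in this section can be collected in one place to compare the relative contributions of the contacts. The sketch below only tabulates the numbers stated in the text; the class and field names are illustrative choices, and the share calculation is a simple proportion rather than an energetic estimate.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    buried_area: float      # buried solvent-accessible surface area, in square angstroms
    hydrogen_bonds: int
    salt_bridges: int = 0
    intermolecular: bool = True


# Values as quoted in the text of this section.
INTERFACES = [
    Interface("FANCM nuclease / FANCM (HhH)2", 655, 2, intermolecular=False),
    Interface("FANCM nuclease / FAAP24 nuclease", 1021, 5, salt_bridges=1),
    Interface("FANCM (HhH)2 / FAAP24 (HhH)2", 950, 11),
]


def intermolecular_area(interfaces) -> float:
    """Total area buried between FANCM and FAAP24 across the listed interfaces."""
    return sum(i.buried_area for i in interfaces if i.intermolecular)


total = intermolecular_area(INTERFACES)
for i in INTERFACES:
    if i.intermolecular:
        share = i.buried_area / total
        print(f"{i.name}: {i.buried_area:.0f} A^2 ({share:.0%} of heterodimer interface), "
              f"{i.hydrogen_bonds} H-bonds, {i.salt_bridges} salt bridges")
```

On these numbers the two domain pairs bury comparable areas, while the (HhH)2 pair carries roughly twice as many hydrogen bonds as the nuclease pair.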

Structural comparisons with other XPF family members
Previously, the structures of ApXPF (Aeropyrum pernix, Ap) and DrMus81-HsEme1 (Danio rerio, Dr; Homo sapiens, Hs) containing both the nuclease and the (HhH)2 domains, the dimeric (HhH)2 domains of human XPF and human XPF-ERCC1, and the central domain of human ERCC1 have been determined (31-34,42,43). Structural comparisons show that the nuclease and the (HhH)2 domains of both FANCM and FAAP24 resemble those of ApXPF, DrMus81 and HsEme1, with root-mean-square deviations of 1.5-2.5 Å (Supplementary Figure S3A and B). The structure of the FANCM-FAAP24 heterodimer has the highest similarity to that of the ApXPF homodimer (Supplementary Figure S3C and D), even though they share only ~25% sequence identity (Supplementary Figure S4A). It was suggested previously that FANCM and XPF may have a common ancestor with a helicase domain at the N-terminus and a nuclease domain at the C-terminus (16). Although the structure of human XPF-ERCC1 containing both the nuclease and the (HhH)2 domains is not yet available, the structural similarities between FANCM-FAAP24 and ApXPF and between the dimeric (HhH)2 domains of FANCM-FAAP24 and human XPF-ERCC1 support this notion and further suggest that the overall structure of XPF-ERCC1 should be similar to that of FANCM-FAAP24.
Structural comparisons also show that the position of the dimeric (HhH)2 domains relative to the dimeric nuclease domains in FANCM-FAAP24 is different from those in ApXPF and DrMus81-HsEme1. When these complexes are superimposed based on the dimeric nuclease domains, the position of the (HhH)2 domains in FANCM-FAAP24 differs by a rotation of 60° from that in the dsDNA-bound ApXPF, a rotation of 125° from that in the apo ApXPF, and a rotation of 125° from that in DrMus81-HsEme1 (Supplementary Figure S3E-G). The previous structural data have shown that when dsDNA binds to ApXPF, the (HhH)2 domains are rotated toward the nuclease domains, and the dsDNA is bound in the surface groove formed between the nuclease and the (HhH)2 domains (31). The variations in the relative arrangements of the nuclease and the (HhH)2 domains in these complexes might reflect their different binding abilities and/or specificities for different DNA structures.
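A domain rotation of this kind is commonly reported as the angle of the rotation matrix that maps one copy of a domain onto the other after the reference domains have been superimposed. The sketch below shows only that last step, using the standard relation trace(R) = 1 + 2 cos(theta). It is a generic illustration and not necessarily how the angles above were obtained; the helper `rot_z` merely builds test matrices and is not part of any structure-analysis package.

```python
import math


def rotation_angle_deg(R) -> float:
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rot_z(deg: float):
    """Illustrative helper: rotation by `deg` degrees about the z axis."""
    t = math.radians(deg)
    return [
        [math.cos(t), -math.sin(t), 0.0],
        [math.sin(t), math.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Example matrices with the rotation magnitudes mentioned in the text.
for angle in (60.0, 125.0):
    print(f"{angle:.0f} deg matrix -> {rotation_angle_deg(rot_z(angle)):.1f} deg recovered")
```

In practice the matrix R would come from a least-squares superposition of the (HhH)2 domains performed after the nuclease domains have been aligned; the angle alone does not describe the rotation axis or any accompanying translation.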

Molecular basis for the lack of nuclease activity
The catalytic members of the XPF family contain a conserved GDXnERKX3D motif in the active site of the nuclease domain, which is responsible for the nuclease activity. The previous mutagenesis studies of HsXPF showed that the first two acidic residues of the catalytic motif participate in metal binding; however, the functional roles of the two basic residues Arg and Lys and the last acidic residue Asp in catalysis are not yet fully understood (29). Consistently, the structure of ApXPF-dsDNA showed that the side chains of the first two acidic residues (Asp52 and Glu62) and the main chain of the first basic residue (Arg63) are involved in the binding of Mg2+, but the side chains of the two basic residues (Arg63 and Lys64) and the last Asp (Asp68) are not (31). Sequence comparison of FAAP24 from different vertebrates shows that the catalytic motif is completely absent in FAAP24 (Supplementary Figure S4B). Although the nuclease domain of FAAP24 has a structure similar to that of ApXPF and DrMus81, the corresponding region of the catalytic motif becomes PDXnLYVX3D in FAAP24 (Figure 3A and Supplementary Figure S4A), suggesting that it lacks the capability to bind the metal ion required for the nuclease activity.
On the other hand, sequence comparison of FANCM from different vertebrates shows that the catalytic motif has evolved to CDXnERRX3E in FANCM; in particular, the conserved Lys of the motif is substituted with Arg (Arg1866) and the last Asp is substituted with Glu (Glu1870) in human FANCM, both of which are shown to be involved in catalysis (Supplementary Figure S4A and C). Structural comparison shows that FANCM also has an active-site structure similar to that of ApXPF and DrMus81, and in particular all the key residues of the catalytic motif, including the two altered residues, assume similar side-chain conformations (Figure 3A). Nevertheless, no metal ion is found at the active site of FANCM, despite the presence of 2 mM MgCl2 in the protein buffer solution. We speculate that the altered residues in the catalytic motif of FANCM might change the coordination environment of the metal ion and/or the precise geometry of the active site, which could contribute in part to the lack of nuclease activity of FANCM.
Furthermore, we found that the electrostatic surface surrounding the putative active site in FANCM-FAAP24 is substantially different from that in ApXPF and DrMus81-HsEme1 (Figure 3B and Supplementary Figure S5). The electrostatic surfaces surrounding the active site in ApXPF and DrMus81-HsEme1 are largely positively charged. In ApXPF-dsDNA, the dsDNA is bound in the surface groove between the nuclease and the (HhH)2 domains and interacts mainly with several basic residues and the G-I-G hairpin from the (HhH)2 domain of monomer A, but it is positioned far from the active site of the nuclease domain (Figure 3B and Supplementary Figure S3E). Structural analysis of ApXPF-dsDNA showed that there is a pseudo 2-fold symmetry between the two (HhH)2 domains, implying that the (HhH)2 domain of monomer B could also bind DNA via the equivalent region. The positively charged surface region formed by the equivalent basic residues of monomer B is located in the vicinity of the active site, and a modeled dsDNA could bind to this region and would be positioned close to the active site (Figure 3B) (31). In addition, structural analysis of the HsXPF (HhH)2-ssDNA complex showed that the basic residues in the equivalent regions of both (HhH)2 domains participate in the binding of ssDNA (Supplementary Figure S5) (31,33). Moreover, mutagenesis studies of DrMus81-HsEme1 suggested that several basic residues surrounding the active site are involved in DNA binding (Supplementary Figure S5) (32). These data together indicate that the positively charged surface surrounding the active site is involved in the binding of DNA, which is also essential for the nuclease activity.
In sharp contrast, structural analysis of FANCM-FAAP24 shows that the equivalent region surrounding the active site is relatively hydrophobic (Figure 3B). In addition, although the nuclease and the (HhH)2 domains form a surface groove on the top side and a surface cleft on the bottom side, both are composed largely of acidic and hydrophobic residues. These results suggest that the active-site region in FANCM-FAAP24 is unable to bind a DNA substrate, which is in agreement with the previous biochemical data showing that the C-terminal segment of FANCM (FANCM L1) exhibits little ability to bind ssDNA, dsDNA or splayed-arm DNA (5). Furthermore, as all members of the XPF family in eukaryotes exist as heterodimers, we also examined the DNA-binding ability of the dimeric nuclease domains of FANCM-FAAP24. Our EMSA results show that the dimeric nuclease domains of FANCM-FAAP24 have no detectable ability to bind different types of DNA structures (Supplementary Figure S6), which is also consistent with the structural data. Taken together, we suggest that the variation of two key residues at the active site and the lack of DNA-binding ability of the surrounding region render the nuclease domain of FANCM catalytically inactive and account for the lack of nuclease activity of FANCM-FAAP24.
As reported previously, FANCM-FAAP24 targets the FA core complex to stalled replication forks during DNA replication at S phase (5,18). For repairing stalled replication forks, Mus81-Eme1 is required for the first incision to form a double-strand break (DSB) (44), and XPF-ERCC1 is required for the second incision (45-47). Thus, it seems that the nuclease activity of FANCM-FAAP24 is not required in the fork cleavage reaction. It is possible that the nuclease domain of FANCM degenerated and became catalytically inactive during evolution to avoid redundant function, and FANCM may function mainly via its translocase activity in DNA repair. Indeed, it was recently shown that in addition to its coordinated function with FAAP24 in the activation of the FA pathway, FANCM can function alone in recombination-independent ICL repair by facilitating recruitment of lesion incision activities via its translocase activity (26).

Potential DNA binding site
The previous and our own biochemical data have shown that FANCM-FAAP24 could bind different types of DNA structures (5) (Figure 1A and Supplementary Figure S1). The previous biochemical data have shown that the (HhH)2 domain of FAAP24 is essential for targeting FANCM-FAAP24 to ICL DNA (24). Thus, we first investigated whether the (HhH)2 domains of FANCM and FAAP24 are essential for their localization to chromatin (Figure 4A). Our results show that the wild-type FANCM and FAAP24 co-localize to chromatin; however, deletion of the (HhH)2 domain of either FANCM (FANCM [...]
We next tried to identify the potential DNA binding site(s) in the (HhH)2 domains of FANCM-FAAP24. Analysis of the electrostatic surface of the (HhH)2 domains reveals two positively charged surface patches (Figure 4B), and residues composing these patches are conserved in different vertebrate species (Supplementary Figure S4B and C). Region I is mainly composed of residues from helix α9 of FANCM and helix α9′ of FAAP24, both of which belong to the second HhH motif. Region II is mainly composed of residues from helix α5′ and the α5′-α6′ loop of the first HhH motif of FAAP24. Particularly, two basic residues (Lys171 and Lys173) in region II are highly conserved, and the strictly conserved Gly168-Val169-Gly170 residues form a GhG hairpin (where h indicates a hydrophobic residue) with their main-chain amino groups exposed on the surface (Supplementary Figure S7A), which may be equivalent to the GhG hairpin in ERCC1 and XPF that is involved in DNA binding (31,34,43).
Figure 3 (caption fragment). Mg2+ in FANCM-FAAP24 was modeled based on ApXPF-dsDNA and is shown with a green sphere.
To assess the functional importance of these two regions, we examined whether mutations in the two regions affect the localization of the FANCM-FAAP24 complex to chromatin and the binding ability to DNA (Figure 4C [...]
The previous biological data have shown that FANCM can interact with the MHF1-MHF2 complex via the region following the helicase domain (residues 661-800) and with FAAP24 via the C-terminal region (residues 1799-2048) (3-5). To exclude the possible effect of the intrinsically expressed MHF1-MHF2 on the localization of FANCM to chromatin, we constructed a series of GFP-FANCM L2 in wild-type, truncated and mutant forms similar to those of full-length FANCM and examined the localization of FANCM-FAAP24 to chromatin. The immunofluorescence assay results show that the wild-type, truncated and mutant FANCM L2 exhibit chromatin localization patterns similar to those of the corresponding full-length FANCM, indicating that the interaction between the overexpressed GFP-FANCM and the intrinsically expressed MHF1-MHF2 has no notable effect on the localization of FANCM to chromatin (Supplementary Figure S7D). As both GFP-FANCM and Flag-FAAP24 were overexpressed in our assays and detected with the tag-specific antibodies, it is understandable that the signals for GFP-FANCM interacting with other intrinsically expressed protein partner(s) would be comparatively very low and buried in the background. Our structural and functional data together indicate that region II (helix α5′ and the α5′-α6′ loop) of FAAP24 plays a critical role in DNA binding and is essential for the localization of FANCM-FAAP24 to chromatin. Most recently, Wienk et al. (48) reported the solution structure of the (HhH)2 domain of FAAP24 and showed that the first HhH motif is involved in the binding of ssDNA, which is consistent with our results.
As region II of FAAP24 is independent of FANCM, we speculate that the DNA-binding ability of this region is essential not only for targeting FANCM-FAAP24 to chromatin but also for the independent function of FAAP24 in promoting ATR-mediated checkpoint activation (26).
The previous biochemical data demonstrated that FANCM-FAAP24 binds preferentially to ssDNA, splayed-arm DNA and a 3′-flap DNA substrate 5-10-fold better than to dsDNA (5), whereas FANCM-MHF binds preferentially to branched DNA and dsDNA (3,4). Both FAAP24 and MHF help to target FANCM to DNA. It is possible that FANCM-MHF-FAAP24 may work together as a molecular sensor, which scans along chromatin to detect stalled replication forks caused by ICLs (4). The two regions of FANCM interacting with MHF and FAAP24 may be located in proximity in the 3D structure such that MHF and FAAP24 could bind to dsDNA and ssDNA regions of one stalled replication fork, respectively, with high affinity and then cooperate to anchor FANCM to the branch point of the replication fork.
In summary, our structural and biological data demonstrate that like other XPF family members, both FANCM and FAAP24 consist of a nuclease domain and an (HhH)2 domain. The overall structure of FANCM-FAAP24 exhibits an architecture similar to that of ApXPF. Although the active site of FANCM has a structure similar to that of ApXPF and DrMus81, the variation of two key residues at the active site of FANCM and the inability of the surrounding region to bind DNA render the nuclease domain catalytically inactive. Moreover, our data indicate that the region formed by helix α5′ and the α5′-α6′ loop of FAAP24 is a potential binding site for ssDNA and is critical for the localization of FANCM-FAAP24 to chromatin in DNA repair.

ACCESSION NUMBERS
The coordinates and structure factors of the FANCM-FAAP24 complex have been deposited in the RCSB Protein Data Bank with the accession code 4M6W.