The canonical non-homologous end joining factor XLF promotes chromosomal deletion rearrangements in human cells

Clastogen exposure can result in chromosomal rearrangements, including large deletions and inversions that are associated with cancer development. To examine such rearrangements in human cells, here we developed a reporter assay based on endogenous genes on chromosome 12. Using the RNA-guided nuclease Cas9, we induced two DNA double-strand breaks, one each in the GAPDH and CD4 genes, that caused a deletion rearrangement leading to CD4 expression from the GAPDH promoter. We observed that this GAPDH–CD4 deletion rearrangement activates CD4+ cells that can be readily detected by flow cytometry. Similarly, double-strand breaks in the LPCAT3 and CD4 genes induced an LPCAT3–CD4 inversion rearrangement resulting in CD4 expression. Studying the GAPDH–CD4 deletion rearrangement in multiple cell lines, we found that the canonical non-homologous end joining (C-NHEJ) factor XLF promotes these rearrangements. Junction analysis uncovered that the relative contribution of C-NHEJ appears lower in U2OS than in HEK293 and A549 cells. Furthermore, an ATM kinase inhibitor increased C-NHEJ-mediated rearrangements only in U2OS cells. We also found that an XLF residue that is critical for an interaction with the C-NHEJ factor X-ray repair cross-complementing 4 (XRCC4), and XRCC4 itself are each important for promoting both this deletion rearrangement and end joining without insertion/deletion mutations. In summary, a reporter assay based on endogenous genes on chromosome 12 reveals that XLF-dependent C-NHEJ promotes deletion rearrangements in human cells and that cell type–specific differences in the contribution of C-NHEJ and ATM kinase inhibition influence these rearrangements.

conversely, DSBs can be an initiating event for cancer-associated mutations and chromosomal rearrangements (1,2). Indeed, end joining (EJ) repair of DSBs that causes insertion/deletion (indel) mutations or chromosomal rearrangements could result in the loss of tumor suppressor genes or the formation of oncogenic fusion genes (3–7). For example, deletion rearrangements have been observed in a wide array of tumor types, and 0.5 megabase pairs (Mbp) was the average size of somatic deletions found in a set of more than 700 cancer lines (6). As another example, radiation-associated secondary malignancies carry a relatively high frequency of deletion mutations with microhomology at the junctions and balanced inversion rearrangements (3). Thus, characterizing the mechanisms of DSB repair via EJ is critical for developing strategies to improve cancer radiotherapy and to understand cancer etiology (2,8).
The relative contribution of C-NHEJ and Alt-EJ to the etiology of chromosomal rearrangements has remained unclear. Cancer-associated chromosomal rearrangements often show evidence of microhomology, which could reflect a substantial contribution of Alt-EJ (5,7). However, rearrangements without microhomology are also readily detected (7). Furthermore, although microhomology is not required for C-NHEJ, this pathway can nevertheless mediate these repair events (22). The relative contributions of these pathways to rearrangement formation could also be specific to individual cell types and/or species. Indeed, experiments monitoring rearrangement frequencies induced by targeted DSBs (e.g. I-SceI or the RNA-guided nuclease Cas9) have revealed such distinctions. Specifically, in mouse embryonic stem cells (mESCs), chromosomal translocations are relatively independent of the C-NHEJ pathway (23). In contrast, human cells appear to show a greater role for C-NHEJ in translocation formation, as detected using PCR-based assays (24).
Apart from cell type per se, DNA damage response signaling pathways appear to affect the contribution of C-NHEJ versus Alt-EJ in rearrangement formation. Namely, a prior study from our laboratory found that ATM inhibits chromosomal rearrangements mediated by C-NHEJ in mESCs, using an assay for a 0.4 Mbp deletion (25). In particular, ATM appears important to inhibit a rearrangement junction type that is a hallmark of C-NHEJ: EJ of blunt DSBs without indels, i.e. No Indel EJ (25,26). We have sought to perform an analogous set of experiments in human cells. Thus, we present an assay for a deletion rearrangement in human cells that uses endogenous genes and describe the influence of XLF and inhibition of the ATM kinase on this rearrangement.

Reporter assay for chromosomal rearrangements in human cells based on endogenous genes, with CD4 expression as the readout
We have sought to understand the mechanisms of chromosomal rearrangement formation in human cells via a reporter assay that uses endogenous genes. For this, we chose the CD4 gene on human chromosome 12, because expression of this gene is largely restricted to cells in the immune system (27,28). Furthermore, expression of this protein is readily detected by flow cytometry using commercial antibodies. Indeed, EJ reporter cassettes have been previously developed using the CD4 gene as the readout (17,29). Thus, we posited that rearrangements that link an active promoter with the CD4 coding region would lead to CD4 expression that could be detected by flow cytometry. To detect deletions, we used the promoter for GAPDH, which is in the same orientation, 0.25 Mbp upstream of CD4 (Fig. 1a). To detect inversions, we used the promoter for LPCAT3, which is in the opposite orientation, 0.23 Mbp downstream of CD4 (Fig. 1a). These locations are based on Genome Reference Consortium Human Build 38 (assembly accession reference: GCA_000001405.27).
We first tested this approach in the human osteosarcoma cell line U2OS, which is a common cell line for studies of the DNA damage response, as these cells retain intact cell cycle checkpoints (26, 30–32). These cells also use the alternative lengthening of telomere pathway of telomere maintenance, and hence are also a model system to examine this aspect of the DNA damage response (33). We used a version of this cell line that is stably transfected with pFRT/lacZeo (U2OS Flp-In T-Rex) (26,30), which is relevant for another assay described below (i.e. integration of the EJ7-GFP reporter). With this cell line, we found that expressing Cas9 with sgRNAs targeting GAPDH and CD4, or LPCAT3 and CD4, substantially induces CD4+ cells (Fig. 1b). We isolated CD4+ U2OS cells via flow cytometry and used PCR to confirm the expected rearrangement (Fig. 1a). We also expressed Cas9 and the individual sgRNAs, and found that targeting GAPDH and LPCAT3 alone did not induce CD4+ cells (Fig. 1b). Finally, expressing Cas9 and the sgRNA targeting CD4 alone induced CD4+ cells above background levels, but the frequency of CD4+ cells was much lower than that of the rearrangements (Fig. 1, b and c).
Note that this assay relies on determining the percentage of cells that are CD4+. In contrast, this analysis does not provide a measure of how many chromosomes per cell have undergone the rearrangement, because each cell likely has multiple copies of chromosome 12 (34). Namely, a cell with multiple copies of this region of chromosome 12 likely has a greater probability of forming the rearrangement, and hence becoming CD4+, than a cell that has fewer copies of this chromosomal region. Thus, this potential effect of chromosome 12 ploidy on this assay system should be considered when comparing the overall frequencies of CD4+ cells between different cell types.
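The ploidy consideration above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each copy of the chromosome 12 region rearranges independently and with the same per-copy probability; neither assumption is established by the data presented here, and the function name and the example probability are hypothetical.

```python
def prob_cell_positive(per_copy_prob: float, copies: int) -> float:
    """Chance that at least one of `copies` chromosome 12 copies rearranges.

    Illustrative assumption (not from the study): copies rearrange
    independently and with the same per-copy probability.
    """
    if not 0.0 <= per_copy_prob <= 1.0:
        raise ValueError("per_copy_prob must be between 0 and 1")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    # Complement of "no copy rearranges".
    return 1.0 - (1.0 - per_copy_prob) ** copies


if __name__ == "__main__":
    # Hypothetical per-copy probability of 1%, for 2, 3, and 4 copies.
    for copies in (2, 3, 4):
        print(copies, prob_cell_positive(0.01, copies))
```

Under these assumptions, the fraction of CD4+ cells rises with copy number even when the per-copy rearrangement probability is unchanged, which is why comparisons of overall CD4+ frequencies between cell types call for caution.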
We then examined the feasibility of this approach in three other cell lines: the A549 lung cancer cell line (35), an adenovirus immortalized human embryonic kidney cell line that is stably transfected with pFRT/lacZeo (HEK293 Flp-In T-Rex) (36), and an SV40 immortalized human fibroblast cell line (GM00637, Coriell). In all experiments, CD4+ frequencies are normalized to transfection efficiency using a parallel transfection with a GFP expression vector (Fig. 1c). Beginning with the GAPDH-CD4 deletion rearrangement, in each of these cell lines, we found that targeting pairs of DSBs to GAPDH and CD4 caused a significant induction of CD4+ cells, compared with targeting a DSB to CD4 alone (Fig. 1c). Regarding the LPCAT3-CD4 inversion rearrangement, we found that targeting DSBs to LPCAT3 and CD4 in A549, HEK293, and GM00637 cells did not cause a significant induction of CD4+ cells, compared with targeting a DSB to CD4 alone (Fig. 1c). In summary, targeting Cas9-induced DSBs to specific sites on chromosome 12 in several human cell lines is sufficient to induce chromosomal rearrangements that can be detected through CD4 expression. However, detection of the GAPDH-CD4 deletion rearrangement was more robust in each cell type, compared with the LPCAT3-CD4 inversion rearrangement (Fig. 1c). Indeed, the inversion rearrangement was only significantly induced in U2OS cells, compared with targeting a DSB to CD4 alone.
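The normalization to transfection efficiency mentioned above can be sketched as follows. The text does not spell out the arithmetic, so this sketch assumes the simplest form: the raw percentage of CD4+ cells divided by the fraction of GFP+ cells in the parallel transfection. The function name and the example values are hypothetical.

```python
def normalize_to_transfection(percent_cd4_pos: float, percent_gfp_pos: float) -> float:
    """Scale a raw %CD4+ value by the transfection efficiency.

    Assumed formula (not stated in the text):
        normalized %CD4+ = raw %CD4+ / (%GFP+ / 100)
    where %GFP+ comes from a parallel transfection with a GFP vector.
    """
    if not 0.0 < percent_gfp_pos <= 100.0:
        raise ValueError("percent_gfp_pos must be in the interval (0, 100]")
    if percent_cd4_pos < 0.0:
        raise ValueError("percent_cd4_pos must be non-negative")
    return percent_cd4_pos / (percent_gfp_pos / 100.0)


if __name__ == "__main__":
    # Hypothetical values: 1.2% CD4+ cells with 60% GFP+ cells in parallel.
    print(normalize_to_transfection(1.2, 60.0))
```

Normalizing in this way allows frequencies to be compared between conditions or cell lines that differ in how efficiently they take up the Cas9/sgRNA plasmids.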

The C-NHEJ factor XLF promotes deletion rearrangements in both U2OS and HEK293, whereas an ATM kinase inhibitor causes an increase in deletion rearrangements only in U2OS cells
Using the GAPDH-CD4 rearrangement assay, we sought to examine the influence of XLF on deletion rearrangements. We examined XLF, because this factor has emerged as a key stabilizing factor in the C-NHEJ complex (26, 37–39). Thus, we used Cas9 and a sgRNA targeting XLF (40) to generate knockout cell lines (XLF-KO) for both U2OS and HEK293 cells (Fig. 2a). We compared XLF-KO cells to both the parental cell line and to cells with expression of XLF WT (Fig. 2a). For the latter, an expression vector for XLF WT with an N-terminal 3×FLAG (3XF) immunotag is included in the transfection with the Cas9/sgRNA plasmids. From these experiments, both U2OS and HEK293 XLF-KO cells showed a significantly lower frequency of the deletion rearrangement versus the parental cell line (Fig. 2b). Furthermore, expression of XLF WT in the XLF-KO cells restored the rearrangement frequency to near the levels of the parental cells (Fig. 2b). Thus, XLF promotes deletion rearrangements in both HEK293 and U2OS cells.

XLF-mediated deletion rearrangement
In parallel with the above experiments, we also examined the effect of treating cells with a small molecule inhibitor of the ATM kinase KU-55933 (ATMi) (41), because a prior study from our laboratory found that ATMi treatment of mESCs causes an increase in deletion rearrangements in a manner that depends on several C-NHEJ factors, including XLF (25). Notably, the transfections without ATMi were treated with the vehicle used for ATMi (i.e. DMSO). From these experiments, we found that ATMi treatment of U2OS cells caused an increase in the frequency of the GAPDH-CD4 deletion rearrangement in parental cells (Fig. 2b), but not the XLF-KO cells, unless these cells were complemented with the XLF WT expression vector (Fig. 2b). In contrast, ATMi treatment failed to cause an increase in the frequency of deletion rearrangements in HEK293 cells, irrespective of the presence of XLF. Thus, ATM kinase activity appears important to suppress XLF-dependent deletion rearrangements in U2OS cells.
We also tested other small molecule kinase inhibitors to evaluate the specificity of the effect of ATMi treatment in U2OS cells. For one, we tested a second ATM inhibitor, KU-60019 (ATMi-2) (42). We also tested a kinase inhibitor of DNA-PKcs, NU7441 (DNAPKi) (43). ATM and DNA-PKcs are both phosphoinositide 3-kinase-related protein kinases (44). DNA-PKcs associates with KU to form the DNA-dependent protein kinase, which catalyzes several autophosphorylation events on DNA-PKcs (45). We examined the effect of each of these kinase inhibitors on the frequency of the GAPDH-CD4 rearrangement in U2OS cells. We found that treatment with KU-60019 (ATMi-2) caused a significant increase in the frequency of the GAPDH-CD4 rearrangement, similar to treatment with

Figure 1. A reporter assay for chromosomal rearrangements in human cells that uses the endogenous CD4 gene. a, shown is a schematic for examining deletion and inversion rearrangements in human cells using endogenous genes on chromosome 12. Relative positions are based on Genome Reference Consortium Human Build 38. Scissors indicate sgRNA target sites for the Cas9 nuclease. Also shown are PCR amplification products from sorted CD4+ U2OS cells to detect the GAPDH-CD4 deletion rearrangement and the LPCAT3-CD4 inversion rearrangement, with an amplicon of RAD52 as a control. Amplification products from untransfected (UT) cells are also shown. b, shown are representative flow cytometry plots for U2OS cells that were either untransfected (UT), or transfected with expression plasmids for Cas9/sgRNAs targeting the GAPDH, LPCAT3, or CD4 locus only, or Cas9/sgRNAs targeting the GAPDH and CD4 loci (GAPDH plus CD4) or the LPCAT3 and CD4 loci (LPCAT3 plus CD4). c, shown is the frequency of CD4+ cells, normalized to transfection efficiency with parallel transfections with a GFP expression plasmid, for four different human cell lines. Shown is the mean with S.D. *, p ≤ 0.04, using an unpaired t test with Holm-Sidak correction; ns, not significant. n ≥ 6 for U2OS, and n ≥ 3 for HEK293, A549, and GM00637 cells.

Influence of cell type, XLF, and ATMi treatment on rearrangement junctions
We then sought to examine how XLF disruption and ATMi treatment affected rearrangement junctions in both U2OS and HEK293 cells, because junction patterns can provide insight into the EJ pathways that mediate the rearrangements. For this, we performed the GAPDH-CD4 rearrangement assay described above, isolated the CD4+ cells by flow cytometry sorting, amplified the GAPDH-CD4 rearrangement junction, and examined the amplicons by deep sequencing (Fig. 3a). For each condition, we examined amplicons from three independent transfections (i.e. biological replicates). We determined the frequency of distinct junction types for each amplicon, and then calculated the mean/S.D. of these frequencies from the three biological replicates. For this analysis, we categorized the junctions into three types: 1) rearrangements without insertion or deletion mutations at the edges of the two Cas9-induced DSBs (i.e. No Indel), 2) insertions, and 3) deletions and complex junctions. Regarding the last, complex junctions involve deletions with inserted nucleotides and/or mutations at the junction.

Fig. 1c. n.s., not significant. Shown is the mean with S.D. *, p ≤ 0.005 (n = 8 for U2OS cells, and n = 6 for HEK293 cells), using an unpaired t test with Holm-Sidak correction. c, shown is the percentage of CD4+ cells in U2OS cells transfected with the GAPDH and CD4 sgRNA/Cas9 expression vectors and treated with different small molecule kinase inhibitors: KU-55933 (ATMi), another ATM kinase inhibitor KU-60019 (ATMi-2), and the DNA-PKcs inhibitor NU7441 (DNAPKi). The control sample was treated with vehicle (DMSO), and CD4+ frequencies were normalized to transfection efficiency as in Fig. 1c. Shown is the mean with S.D. *, p ≤ 0.001 (n = 5), using an unpaired t test with Holm-Sidak correction.
Beginning with U2OS cells, each of these three junction types is readily detectable: 27% No Indel, 8.5% insertions, and 64.4% deletions and complex junctions (Fig. 3b). In contrast, the U2OS XLF-KO cells show a marked reduction of the No Indel (1.4%) and insertion (2.6%) junction types, and an increase in deletions and complex junctions (96%) (Fig. 3b). Thus, XLF is important for the No Indel and insertion junction types. Conversely, U2OS cells treated with ATMi showed a significant increase in the No Indel and insertion junction types (68.5 and 20.9%, respectively), and a decrease in deletions and complex junctions (10.5%) (Fig. 3c). Accordingly, ATMi treatment caused an increase in the junction types that are promoted by XLF, which supports the notion that ATM kinase activity is important to suppress XLF-mediated rearrangements.
With HEK293 cells, we found that compared with U2OS cells, these cells show a substantial frequency of the No Indel junction type (75%, 2.7-fold higher than U2OS), an increase in the insertion junction types (17%, 2-fold greater than U2OS), along with a marked reduction in deletions and complex junction types (7.8%, 8.2-fold lower than U2OS) (Fig. 3d). In contrast, the HEK293 XLF-KO cells show a marked reduction in the No Indel and insertion junction types (1.92 and 0.9%, respectively), as we found with the U2OS XLF-KO cells (Fig. 3e). Finally, ATMi treatment in HEK293 cells did not obviously affect the types of junctions (Fig. 3f).
In summary, XLF is required for the No Indel and insertion junction types in both HEK293 and U2OS cells. Furthermore, we found distinctions among the junction patterns between these cell lines. Namely, HEK293 cells show an increase in the XLF-dependent junction types (No Indel and insertions), compared with U2OS cells.
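The three junction categories used in this analysis (No Indel, insertions, and deletions and complex junctions) can be illustrated with a minimal sketch. This is not the pipeline used for the deep sequencing data; it assumes each read spans the junction from the start of a left (GAPDH-side) flank to the end of a right (CD4-side) flank, with each flank ending or starting exactly at its Cas9 cut site. All names and the toy sequences in the usage note are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

CATEGORIES = ("No Indel", "insertion", "deletion or complex")


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(read: str, left: str, right: str) -> str:
    """Assign a junction read to one of the three categories.

    `left` ends at the upstream cut site and `right` starts at the
    downstream cut site, so `left + right` is the No Indel junction.
    """
    p = common_prefix_len(read, left)               # bases retained from the left flank
    s = common_prefix_len(read[::-1], right[::-1])  # bases retained from the right flank
    s = min(s, len(read) - p)                       # do not count shared bases twice
    inserted = len(read) - p - s
    deleted = (len(left) - p) + (len(right) - s)
    if deleted == 0 and inserted == 0:
        return "No Indel"
    if deleted == 0:
        return "insertion"
    return "deletion or complex"


def junction_percentages(reads, left: str, right: str) -> dict:
    """Percentage of reads in each category for one amplicon (replicate)."""
    reads = list(reads)
    if not reads:
        raise ValueError("no reads to classify")
    counts = Counter(classify_junction(r, left, right) for r in reads)
    return {c: 100.0 * counts[c] / len(reads) for c in CATEGORIES}


def summarize(replicates) -> dict:
    """Mean and S.D. per category across replicate percentage dictionaries."""
    return {
        c: (mean([rep[c] for rep in replicates]), stdev([rep[c] for rep in replicates]))
        for c in CATEGORIES
    }
```

With toy flanks such as `left = "ACGTACGTAC"` and `right = "TTGCAGGCTA"`, the read `left + right` is classified as No Indel and `left + "GG" + right` as an insertion. Substitutions at the junction appear as a loss of flank bases together with extra bases, so they fall into the combined deletions and complex junctions category, in line with the definition given above.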
We then examined microhomology use in the deletion mutations. For this, we isolated all sequences that represented ≥0.25% of the total deletions or complex junctions per amplicon and then determined the microhomology usage for each of the simple deletion sequences. It is not possible to assign microhomology to complex deletion mutations with insertions, or for simple insertions, because the sequence of the inserted nucleotides prior to ligation is unknown. From the analysis of simple deletions, we found that junctions from U2OS cells rarely exhibited events that utilized 0 nt (5.8%) or 1 nt of microhomology (5.5%), whereas events with 2, 3, and ≥4 nt of microhomology were more prevalent (22.4, 22.2, and 43%, respectively) (Fig. 4a). Interestingly, XLF loss does not affect the spectrum of microhomology use in U2OS cells (Fig. 4a). Conversely, ATMi treatment in U2OS cells caused an increase in deletions with no microhomology (0 nt, 31%, 5.4-fold higher), and a decrease in the use of ≥4 nt of microhomology (28%, 1.6-fold lower) (Fig. 4b).
Compared with U2OS cells, we found that HEK293 cells exhibited a bias toward junctions with simple deletions with 0 to 1 nt of microhomology (40.9 and 22.9%, respectively, 7- and 4.1-fold higher than U2OS cells, respectively) (Fig. 4c) and a decrease in junctions with 3 and ≥4 nt of microhomology (6.7 and 12.1%, respectively, 3.3- and 3.6-fold lower than U2OS cells, respectively) (Fig. 4c). Finally, HEK293 XLF-KO cells, as well as HEK293 parental cells treated with ATMi, showed minor changes in the use of microhomology at the junctions (i.e. a decrease in 2 nt of microhomology, compared with parental cells) (Fig. 4, d and e, respectively). In summary, junctions without microhomology were rare for U2OS cells, but prevalent for U2OS cells treated with ATMi, and for HEK293 cells.
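One common way to score microhomology at a simple deletion junction is to count the bases that could have come from either flank. The sketch below follows that overlap approach and bins the result as 0, 1, 2, 3, or ≥4 nt. It is an illustration under stated assumptions, not the analysis code used for the sequencing data: reads are assumed to span from the start of the left flank to the end of the right flank, and the function names and toy sequences are hypothetical.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def microhomology_length(read: str, left: str, right: str) -> int:
    """Microhomology (nt) at a simple deletion junction.

    `left` ends at the upstream cut site and `right` starts at the
    downstream cut site. Bases at the junction that match both flanks
    are counted once from each side, so the overlap is the microhomology.
    """
    if len(read) >= len(left) + len(right):
        raise ValueError("read is not a deletion relative to the No Indel junction")
    p = common_prefix_len(read, left)               # longest match to the left flank
    s = common_prefix_len(read[::-1], right[::-1])  # longest match to the right flank
    overlap = p + s - len(read)
    if overlap < 0:
        # Unmatched bases in the middle: a complex junction, for which
        # microhomology cannot be assigned.
        raise ValueError("junction has inserted or mutated bases")
    return overlap


def microhomology_bin(nt: int) -> str:
    """Bin label matching the 0, 1, 2, 3, and >=4 nt categories."""
    return ">=4" if nt >= 4 else str(nt)
```

For example, with the toy flanks `left = "GATTACAGGT"` and `right = "AGGTCCATGA"`, the read `"GATTACATGA"` joins the flanks at a shared `CA` dinucleotide and scores 2 nt. A read with inserted bases raises an error, mirroring the point made above that microhomology cannot be assigned to complex junctions.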
Given that we observed differences in the junction patterns, and effects of ATMi treatment, between U2OS and HEK293 cells, we next sought to examine a third cell type. Specifically, we used the A549 lung cancer cell line, because we found that the GAPDH-CD4 rearrangement assay is feasible in these cells (Fig. 1c). We also chose A549 cells as a means to evaluate a second cancer cell line, in addition to U2OS. First, we examined whether ATMi treatment affected the frequency of the GAPDH-CD4 rearrangement in A549 cells, and found that such treatment had no obvious effect (Fig. 5a). This finding is similar to our observations with HEK293 cells (Fig. 2b).

Fig. 3, sequences that represented ≥0.25% of total deletions and complex junctions were isolated and analyzed for microhomology. From these sequences, shown are the percentage junctions with 0, 1, 2, 3, and ≥4 nt of microhomology. As in Fig. 3, n = 3 independent biological replicates, n.s., not significant. Shown is the mean with S.D. b, microhomology use in U2OS parental (DMSO) versus ATMi (KU-55933). Analysis as in (a), *, p < 0.04, n.s., not significant, using an unpaired t test with Holm-Sidak correction. c, microhomology use in U2OS parental versus HEK293 parental. Analysis as in (a), statistics as in (b). d, microhomology use in HEK293 parental versus HEK293 XLF-KO. Analysis as in (a), statistics as in (b). e, microhomology use in HEK293 parental (DMSO) versus ATMi. Analysis as in (a), statistics as in (b). As in Fig. 3, the U2OS and HEK293 parental junctions are repeated in the different panels to facilitate the various comparisons.
We then examined deletion rearrangement junctions for A549 cells that were either treated with ATMi or vehicle (DMSO). Namely, we performed amplicon deep sequencing analysis on CD4+ cells isolated by cell sorting, as described above. From this analysis, we found that A549 cells showed an average of 62.8% No Indel, 28.7% insertions, and 8.5% deletions and complex junctions (Fig. 5b). From microhomology analysis of the simple deletions, we found that events with 0 nt of microhomology were predominant (72.3%), whereas the events with 1, 2, 3, and ≥4 nt of microhomology showed frequencies of 13.9, 6.3, 1.4, and 6.1%, respectively (Fig. 5c). For both junction type and microhomology usage, ATMi treatment did not cause a significant effect (Fig. 5, b and c). Notably, the frequencies of junction types and microhomology use for A549 cells are similar to HEK293 cells (Fig. 5, d and e). Altogether, these findings indicate that A549 cells show similar results as HEK293 cells, regarding junction patterns, and the lack of an effect of ATMi treatment on deletion rearrangement frequency.

An XLF residue critical for the interaction with XRCC4 is important for both No Indel EJ and rearrangement formation
The above findings indicate that XLF is important to promote deletion rearrangements in human cells, as well as the No Indel junction type. To further examine if these functions are interrelated, we tested whether disruption of key residues in XLF may affect both rearrangement formation and No Indel EJ. In particular, we examined the XLF-L115D mutation, which is in the N-terminal globular head domain and has been shown to block the interaction of XLF with the C-NHEJ factor XRCC4 (39, 47), which we confirmed using co-immunoprecipitation analysis in U2OS cells (Fig. 6a). We also examined the XLF-K160D mutation, which is predicted to weaken the XLF homodimer interface because of disruption of a predicted salt bridge between the monomers (26,48).
We confirmed that these mutants are expressed using their N-terminal 3×FLAG immunotag (Fig. 6b) and examined their relative ability to promote rearrangements (GAPDH-CD4 rearrangement assay) and No Indel EJ using the EJ7-GFP reporter assay (26). As above, the GAPDH-CD4 rearrangement assay refers to co-expression of Cas9 with sgRNAs targeting GAPDH and CD4 to induce the GAPDH-CD4 rearrangement, and subsequently quantifying the frequency of CD4+ cells, as in Fig. 1c. In the EJ7-GFP reporter, the GFP coding sequence has been disrupted by inserting a 46-nt spacer between the first two bases (GG) and the final base (C) of the codon for glycine 67 (Fig. 6c). We use sgRNAs to target two Cas9-induced DSBs to precisely excise the 46-nt spacer, such that No Indel EJ between the distal DSBs restores the GGC codon, and hence GFP+ expression. The EJ7-GFP reporter was integrated into cells using the Flp/FRT system (26).
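The logic of the EJ7-GFP reporter can be sketched with a toy sequence model. The flanking and spacer sequences below are hypothetical placeholders rather than the actual reporter sequence; only the layout (GG, a 46-nt spacer, then C) follows the description above, and the function names are illustrative.

```python
SPACER_LEN = 46  # spacer length stated for the EJ7-GFP reporter


def build_reporter(upstream: str, spacer: str, downstream: str) -> str:
    """Place the spacer between GG and C of the interrupted glycine codon."""
    if len(spacer) != SPACER_LEN:
        raise ValueError("spacer must be 46 nt")
    return upstream + "GG" + spacer + "C" + downstream


def no_indel_ej(reporter: str, spacer_start: int) -> str:
    """Join the two distal DSB ends after precise excision of the spacer."""
    return reporter[:spacer_start] + reporter[spacer_start + SPACER_LEN:]


def codon_restored(repaired: str, codon_start: int) -> bool:
    """True if the GGC (glycine) codon is intact at its expected position."""
    return repaired[codon_start:codon_start + 3] == "GGC"


if __name__ == "__main__":
    upstream = "ATGACC"                  # hypothetical placeholder sequence
    downstream = "TACTGA"                # hypothetical placeholder sequence
    spacer = ("ACGT" * 12)[:SPACER_LEN]  # hypothetical placeholder spacer
    reporter = build_reporter(upstream, spacer, downstream)
    spacer_start = len(upstream) + 2     # spacer begins right after "GG"
    repaired = no_indel_ej(reporter, spacer_start)
    print(codon_restored(repaired, len(upstream)))
```

In this model, any loss or gain of bases at either DSB end leaves the codon position without an intact GGC, which is why the reporter reads out EJ without indels specifically.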
From these experiments in both U2OS and HEK293 cells, we found that the XLF-L115D mutant showed a marked defect in promoting both the GAPDH-CD4 rearrangement, as well as No Indel EJ using the EJ7-GFP assay, compared with XLF WT (Fig. 6, d and e). Similarly, the XLF-K160D mutant showed a significant defect in promoting these EJ events, but retained partial activity, and indeed for HEK293 cells with the GAPDH-CD4 rearrangement assay, this mutant was not statistically significantly different from XLF WT (Fig. 6e). These findings indicate that a residue of XLF that is critical for the interaction with XRCC4 is important for both deletion rearrangements, as well as No Indel EJ.
Based on these findings with the XLF-L115D mutant, we posited that XRCC4 itself is critical for these EJ events. To test this hypothesis, we used Cas9 and sgRNAs targeting XRCC4 to disrupt the XRCC4 gene in the HEK293 EJ7-GFP cell line (XRCC4-KO cell line) (Fig. 6f). Using this cell line, we examined the frequency of the GAPDH-CD4 rearrangement, as well as No Indel EJ using the EJ7-GFP assay. From these experiments, we found that loss of XRCC4 caused a reduction in both the GAPDH-CD4 rearrangement, and No Indel EJ, compared with the parental HEK293 cell line (Fig. 6f). Furthermore, expression of XRCC4 in the XRCC4-KO cell line caused a significant increase in these EJ events (Fig. 6f). These findings indicate that XRCC4 promotes the deletion measured by the GAPDH-CD4 rearrangement assay, as well as No Indel EJ.

Figure 6. Disrupting the XLF-XRCC4 interaction (XLF-L115D) and loss of XRCC4 itself each cause defects in end joining without indels and rearrangement formation. a, shown is the effect of the XLF-L115D mutation on forming a co-immunoprecipitation complex with KU70 and XRCC4. Lysates were prepared from U2OS XLF-KO cells transfected with a control EV, or 3XF-XLF WT and L115D expression vectors. A fraction of the lysate was used to examine the proteins in the input, and the rest was used for a FLAG-immunoprecipitation (FLAG-IP). Shown are immunoblot signals for input and FLAG-IP samples for FLAG (XLF), KU70, and XRCC4. b, shown are FLAG immunoblots confirming expression of 3XF-XLF WT, L115D, and K160D in U2OS XLF-KO (left) and HEK293 XLF-KO (right) cells, with tubulin loading control. c, shown is a schematic of the EJ7-GFP assay for end joining without insertion/deletion mutations (i.e. No Indel EJ). d, analysis of XLF mutants in U2OS XLF-KO cells. For the GAPDH-CD4 rearrangement assay, cells were transfected with Cas9 and sgRNAs targeting GAPDH and CD4 (GAPDH plus CD4), as in Fig. 1c, along with either a control EV, a 3XF-XLF WT, or a mutant (L115D or K160D) expression vector. Similarly, U2OS XLF-KO cells with the EJ7-GFP reporter were transfected with the Cas9/sgRNA vectors for this assay, along with the other vectors for complementation analysis. Shown are the frequencies of CD4+ cells for the GAPDH-CD4 rearrangement assay (top) or GFP+ cells for the EJ7-GFP assay (bottom), normalized to transfection efficiency, as in Fig. 1c. n = 6. Shown is the mean with S.D. *, p < 0.0005, using an unpaired t test with Holm-Sidak correction. e, analysis of XLF mutants in HEK293 XLF-KO cells. Shown are repair frequencies for the GAPDH-CD4 rearrangement and EJ7-GFP assays, as in (d). *, p < 0.004, n.s., not significant. f, analysis of an XRCC4-KO HEK293 cell line. Shown are repair frequencies for the GAPDH-CD4 rearrangement and EJ7-GFP assays, following transfecting cells with the Cas9/sgRNA plasmids as in (d), along with either control EV, or an expression vector for XRCC4. n = 6, *, p < 0.004. Also shown is XRCC4 immunoblot analysis, with actin loading control, of the XRCC4-KO cell line transfected with EV or XRCC4 expression vector, and the parental HEK293 cells transfected with EV.

Discussion
Defining the mechanisms of DSB-induced chromosomal rearrangements provides insight into cancer etiology, as well as the consequences of clastogen exposure to genome stability. Here, we have described an assay to examine such rearrangements in human cells using the CD4 gene and flanking promoters in the GAPDH and LPCAT3 genes. Because these genes are already present on chromosome 12 (i.e. are endogenous to this chromosome), this assay has the potential to be versatile across human cells that do not already express CD4. Indeed, we have confirmed that this assay is robust in four different human cell lines. Thus, this assay could be used to examine the role of individual genes and small molecules on the frequency of chromosomal rearrangements in multiple cell types. Furthermore, in addition to frequency measurements, isolation of CD4+ cells enables the analysis of repair junction patterns, which can provide insight into the relative contribution of C-NHEJ to rearrangements.
Nonetheless, certain limitations of this assay should be considered. For one, the four cell lines tested here are readily transfected with simple lipofection, whereas other means of introducing the sgRNAs/Cas9 may be necessary for other cell types (e.g. nucleofection or viral transduction). Additionally, we found that these four cell lines showed very low background affinity for the phycoerythrin-CD4 antibody, whereas this assay is likely not feasible for cell lines with higher background staining. Furthermore, we found that an sgRNA targeting Cas9 to the CD4 gene alone is able to induce CD4+ cells, albeit at a low level. The mechanism of such expression of CD4 is unclear. Nevertheless, the frequency of this event is much lower than for the GAPDH-CD4 rearrangement, and hence does not appear to interfere with examining the frequency of this rearrangement. In any case, to adapt this approach to other cell lines, it will be critical to include analysis of the sgRNA targeting CD4 alone (see Fig. 1, b and c). Along these lines, although targeting DSBs to LPCAT3 and CD4 in U2OS cells induced CD4+ cells at a substantially higher frequency, compared with targeting a DSB to CD4 alone, the LPCAT3-CD4 inversion events are lower than for the GAPDH-CD4 deletion rearrangement. Thus, the GAPDH-CD4 rearrangement assay is more robust, and hence was the focus of our mechanistic studies.
With the GAPDH-CD4 rearrangement assay, we sought to examine the relative contribution of the C-NHEJ pathway to rearrangements in two cell types commonly used for studies of the DNA damage response (i.e. U2OS and HEK293). The relative contribution of C-NHEJ to rearrangement formation has been shown to vary among cell lines, but the primary focus has been on comparing mouse versus human cells. Namely, C-NHEJ appears dispensable for chromosomal translocations in mESCs, but was shown to promote translocations in a set of human cell lines (23,24). Consistent with this notion, in a prior study, our laboratory found that the C-NHEJ factor XLF is dispensable for a 0.4-Mbp deletion rearrangement in mESCs (25), whereas here we found that XLF promotes the GAPDH-CD4 rearrangement in both U2OS and HEK293 cells.
XLF not only promoted a higher frequency of this deletion rearrangement, but also was critical for the No Indel junction type in both U2OS and HEK293 cells. Other studies also support the notion that C-NHEJ factors are critical for No Indel EJ events (25,49,50), including a study with the EJ7-GFP assay used here (26). Indeed, we found that an XLF mutation that disrupts the interaction with XRCC4 (i.e. L115D), as well as loss of XRCC4 itself, each cause a reduction in both the GAPDH-CD4 rearrangement and No Indel EJ (EJ7-GFP assay), indicating that these EJ events rely on similar mechanisms. Consistent with a key role for this XLF residue in EJ, the XLF-L115D mutant has also been shown to be defective in promoting resistance to the clastogen Zeocin (51). In addition to L115D, we found similar results with a mutation that disrupts the XLF dimer interface (K160D), although the effects were more modest, and not significant for the GAPDH-CD4 rearrangement in HEK293 cells. Altogether, these findings support recent studies that XLF is a key stabilizing factor for the C-NHEJ complex and bridging DNA ends (39,47).
However, although XLF promotes the deletion rearrangement in both U2OS and HEK293 cells, from junction analysis, C-NHEJ rearrangements appear less frequent in U2OS cells. Specifically, three different junction types that appear to be mediated by C-NHEJ were lower in U2OS cells: No Indel junctions, insertion mutations, and deletion mutations without microhomology. Each of these EJ events is likely dependent on C-NHEJ, as they involve ligation without an annealing intermediate to stabilize the junction (11). Consistent with this notion, the No Indel junction depends on XLF, as described above, and similarly insertions are also promoted by XLF. Deletion mutations without microhomology have been shown to be dependent on C-NHEJ in several studies (17,29), although we did not observe an obvious effect of XLF on microhomology use in this study, indicating a potential redundancy for XLF for such EJ events.
The mechanisms that underlie this difference in junction patterns between U2OS and HEK293 cells are unclear. Indeed, there are several possibilities, because the origins of these cell lines are very distinct (e.g. osteosarcoma cells versus an adenovirus-immortalized kidney cell line, respectively) (36,52). Furthermore, the junction patterns for the HEK293 cell line appear similar to those of the A549 lung cancer cell line. Notably, U2OS cells treated with an ATM kinase inhibitor (i.e. ATMi) show junction patterns similar to those of HEK293 and A549 cells. Namely, U2OS cells treated with ATMi showed a high frequency of the No Indel junction, which was similar to that of HEK293 and A549 cells. ATMi treatment of U2OS cells also caused an overall higher frequency of the GAPDH-CD4 rearrangement, in a manner dependent on XLF. These effects of ATMi treatment are similar to a prior report from our laboratory on a deletion rearrangement in mESCs (25), as well as other reports that ATM is important to limit toxic C-NHEJ events (53). Thus, a role of ATM kinase activity in suppressing C-NHEJ-mediated rearrangements appears to be conserved in mouse and human cells.
However, ATMi treatment did not have substantial effects on HEK293 and A549 cells, perhaps because these cells already show a high frequency of C-NHEJ-mediated rearrangements. Thus, we speculate that HEK293 and/or A549 cells could be deficient in some aspect of the ATM-mediated DNA damage response signaling that is important to suppress C-NHEJ rearrangements. Conversely, U2OS cells may be hyperactive for this aspect of ATM-mediated signaling. In any case, examining these possibilities will require further insight into the mechanisms by which ATM kinase signaling suppresses C-NHEJ rearrangements. We suggest that the assay systems described here provide a platform for such further mechanistic studies in multiple human cell lines.

Cell lines and plasmids
The following human cell lines were authenticated by short tandem-repeat profiling: HEK293 Flp-In T-REx (36), U2OS Flp-In T-REx (26,30), and A549 (35). The GM00637 SV40 transformed human fibroblast cell line was acquired directly from the Coriell repository. The HEK293 and U2OS cells were cultured as described (54), and the same medium was used for A549 cells. GM00637 cells were cultured using minimum Eagle's medium supplemented with nonessential amino acids, 12.5% FBS, 1% penicillin/streptomycin, and 0.015% plasmocin (Invivogen). The EJ7-GFP reporter was integrated into the chromosomal FRT site of the HEK293 Flp-In T-REx cells, as previously described for the generation of the U2OS EJ7-GFP cell line (26).
For Cas9/sgRNA expression, the px330 plasmid was used (Addgene 42230) (55). These sgRNA sequences were used to target GAPDH (px330-GAPDH, 5′-GTATAGAAACCGGGGGCGCGG; the first G base is not in the target locus but is required for transcription of the sgRNA), CD4 (px330-CD4, 5′-GGCGTATCTGTGTGAGGACT), and LPCAT3 (px330-LPCAT3, 5′-GATAGCGTTTTGCCCGCATT). The pCAGGS-3×FLAG-XLF (human) and empty vector (EV, pCAGGS-BSKX) were described previously (26,56) and used to generate the XLF mutants by inserting gBLOCK fragments (Integrated DNA Technologies). To generate the XLF-KO cell lines, an sgRNA sequence targeting XLF (5′-GTTGGTTTCAGATCTTCAAC) (40) was introduced into px330 (px330-XLF). To generate the HEK293 XLF-KO cell line, px330-XLF (200 ng) was co-transfected with pgk-puro (54) (60 ng) using 1.8 µl of Lipofectamine 2000 into the HEK293 EJ7-GFP cell line seeded on a 24-well plate. These cells were subsequently selected in puromycin (3 µg/ml) for 3 days and plated at low density without puromycin selection to isolate individual clones. For the U2OS cell line, px330-XLF (800 ng) was co-transfected with dsRED (150 ng) using 6 µl of Lipofectamine 2000 into U2OS Flp-In T-REx cells seeded on a 6-well plate. dsRED-expressing cells were subsequently sorted (ARIA sorter, BD Biosciences) and plated at low density to isolate individual clones. Clones were screened by immunoblotting (see below). The EJ7-GFP reporter was integrated into the U2OS XLF-KO cell line, as described above. This cell line was used only for the EJ7-GFP reporter analysis. To generate the HEK293 XRCC4-KO cell line, a procedure similar to that described above for the HEK293 XLF-KO cell line was used, except with two sgRNAs targeting XRCC4, cloned into px330: 5′-GATGACATGGCAATGGAAAA and 5′-GTTAAACGTGTATACATCAGC. Also, in this case, the transfection was scaled to a 12-well plate using 400 ng of each px330 plasmid and 100 ng of pgk-puro plasmid.
The pCAGGS-XRCC4 expression vector was described previously (26). The 7a and 7b sgRNA/Cas9 plasmids for inducing DSBs in the EJ7-GFP reporter were described previously (26).
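The note above about the extra 5′ G on the GAPDH sgRNA can be illustrated with a short sketch. The function name `add_leading_g` is hypothetical and is not part of any published pipeline. The premise that a leading G is added only when the 20-nt target does not already begin with G is an assumption based on the description of the GAPDH sgRNA.

```python
def add_leading_g(protospacer: str) -> str:
    """Return the sgRNA sequence to clone for a given target sequence.

    Assumption (for illustration only): a 5' G is prepended when the
    target does not already start with G, so that the transcribed
    sgRNA begins with G.
    """
    seq = protospacer.upper()
    if seq.startswith("G"):
        return seq          # already begins with G; clone as-is
    return "G" + seq        # extra G is not part of the target locus


# GAPDH target as implied by the text (sgRNA sequence minus the extra G)
gapdh_target = "TATAGAAACCGGGGGCGCGG"
gapdh_sgrna = add_leading_g(gapdh_target)

# CD4 sgRNA already starts with G, so it is left unchanged
cd4_sgrna = add_leading_g("GGCGTATCTGTGTGAGGACT")
```

In this sketch the GAPDH sequence gains one base (21 nt in total), whereas the CD4 sequence is returned unchanged (20 nt).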

CD4 rearrangement and EJ7-GFP assays
Cells were seeded on a 24-well plate and subsequently transfected with 200 ng each of px330-GAPDH and px330-CD4, or px330-LPCAT3 and px330-CD4, along with 20 ng of EV or pCAGGS-3×FLAG-XLF expression vector (WT or mutant), with 1.8 µl Lipofectamine 2000 in a total of 0.6 ml. Transfections with one px330 plasmid included 220 ng of EV to maintain equivalent plasmid concentrations. Similarly, to determine transfection efficiency, 200 ng of GFP expression vector (pCAGGS-NZEGFP) (54) and 220 ng of EV were used. Experiments comparing HEK293 parental versus XRCC4-KO used 100 ng of EV or pCAGGS-XRCC4. The transfection mixes were removed after 4 h, the wells were washed with DMEM, and complete media was added with 10 µM ATMi (KU-55933, Selleck Chemicals, S1092) (41), 5 µM ATMi-2 (KU-60019, Selleck Chemicals, S1570) (42), 10 µM DNAPKi (NU7441, Selleck Chemicals, S2638) (43), or vehicle (DMSO). ATMi and DMSO treatments were for 3 days, at which point the cells were processed for analysis or cultured in untreated media for subsequent isolation by cell sorting. For CD4 staining, cells were harvested, washed with PBS, incubated in 100 µl staining buffer (10% FBS, 1% sodium azide) with 2 µl phycoerythrin-CD4 antibody (BioLegend, 317410) for 20 min on ice, followed by three washes with staining buffer. For flow cytometry analysis, cells were fixed by mixing 200 µl cells in staining buffer with 90 µl 10% formaldehyde (e.g. VWR International, MKH12108) and then analyzed with a CyAn ADP cytometer (Dako). For isolation of CD4+ cells, the staining was performed as for the analysis but using double the volume of staining buffer and antibody, without sodium azide in the staining buffer and without fixation, and cells were sorted using an ARIA III or ARIA SORP (BD Biosciences). For isolation of CD4+ cells from A549 cells, the transfections were scaled 4-fold onto a 6-well plate.
For the EJ7-GFP reporter, transfections were performed as for the CD4 assay, but using the 7a and 7b sgRNA/Cas9 plasmids and without DMSO or ATMi treatment. Analysis of GFP+ cells was performed as described (26).

Rearrangement junction analysis
Genomic DNA was isolated from CD4+ or control cells as described (54) and used for PCR amplification (Platinum HiFi Supermix, Thermo Fisher). Amplicons of the deletion rearrangement products were generated with primers P1 (5′-ctactagcggttttacgggc-3′) and P2 (5′-ctgacctctggaagctcaca-3′). Control amplification of RAD52 used these primers: 5′-aagtcccctcctttcctctg and 5′-ctcgctcaccctcactcttc. For sequencing library preparation, purified amplicons underwent two rounds of PCR. The first round of PCR used nested primers (5′-ctacacgacgctcttccgatctaagaccttgggctgggact and 5′-gtgactggagttcagacgtgtgctcttccgatctaccttacctctgggcttgc) with Illumina universal sequences, with amplification for five cycles. The PCR products were purified with 1.0× Ampure XP Beads (BD Biosciences, A63882), followed by a second round of PCR with barcoded index primers, with amplification for four cycles. The final purified libraries were validated with the Agilent Bioanalyzer High Sensitivity DNA Kit (Agilent, 5067-4627) and quantified with Qubit and qPCR. The libraries were sequenced on an Illumina HiSeq 2500 with HiSeq Rapid SBS Kit v2 in paired-end mode, with 101 cycles of read 1, 7 cycles of index read, and 101 cycles of read 2. Real-Time Analysis 2.2.38 software was used for image analysis and base calling.
Paired-end amplicon reads of 2 × 101 bp were merged using PEAR (paired-end read merger v0.9.5), and the merged amplicon sequences were processed with customized scripts that tallied each unique sequence and its number of occurrences among the amplicons, aligned each unique sequence to the reference sequence with Novoalign (v3.02.07, Novocraft Technologies), and generated detailed information on mutations and indels using SAMtools (v0.1.19) and VarScan (v2.3.9). The sequencing reads of each amplicon (>1.4 million reads per sample) were examined to determine the frequencies (i.e. percentage) of distinct junction categories. For each condition, such analysis from three biological replicates (i.e. amplicon deep sequencing from three independent transfections) was used to calculate the mean and S.D. for the frequencies of distinct junction categories.
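The tallying and summary steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the customized scripts used in the study. The function names, the toy reference junction `REF`, and the length-based `toy_classify` rule are all hypothetical stand-ins; in the actual analysis, junction categories were derived from alignment with Novoalign, SAMtools, and VarScan. The use of the sample S.D. (n − 1 denominator) is also an assumption.

```python
from collections import Counter
from statistics import mean, stdev

REF = "ACGT"  # hypothetical reference junction, for illustration only


def toy_classify(seq):
    """Toy stand-in for alignment-based junction classification."""
    if seq == REF:
        return "No Indel"
    return "Insertion" if len(seq) > len(REF) else "Deletion"


def tally_unique(reads):
    """Count occurrences of each unique merged amplicon sequence."""
    return Counter(reads)


def category_percentages(seq_counts, classify):
    """Percentage of reads falling into each junction category."""
    totals = Counter()
    for seq, n in seq_counts.items():
        totals[classify(seq)] += n
    total_reads = sum(totals.values())
    return {cat: 100.0 * n / total_reads for cat, n in totals.items()}


def summarize(replicates):
    """Mean and sample S.D. of each category across replicates."""
    categories = sorted(set().union(*replicates))
    summary = {}
    for cat in categories:
        values = [rep.get(cat, 0.0) for rep in replicates]
        summary[cat] = (mean(values), stdev(values))
    return summary


# One toy replicate: three reads matching the reference, one insertion
pct = category_percentages(tally_unique(["ACGT"] * 3 + ["ACGGT"]), toy_classify)

# Three toy replicates, already expressed as percentages
stats = summarize([{"No Indel": 70.0}, {"No Indel": 80.0}, {"No Indel": 90.0}])
```

Tallying unique sequences before classification means that each distinct junction is classified once, which matters when a sample contains more than a million reads but far fewer distinct sequences.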
For each immunoprecipitation, 2 × 10⁵ U2OS cells were seeded on two 10-cm plates, and each plate was transfected with 3 µg of 3×FLAG-XLF expression vector or EV with 12 µl of Lipofectamine 2000. Two days after transfection, protein was extracted using NETN buffer along with Dounce homogenization, followed by incubation with anti-FLAG M2 magnetic beads (Sigma, M8823) overnight at 4°C. Beads were washed twice with NETN buffer, and proteins were eluted with 100 mM glycine (pH 2.5) and neutralized with 1 M Tris-HCl (pH 10.8).