NuRD and CAF-1-mediated silencing of the D4Z4 array is modulated by DUX4-induced MBD3L proteins

The DUX4 transcription factor is encoded by a retrogene embedded in each unit of the D4Z4 macrosatellite repeat. DUX4 is normally expressed in the cleavage-stage embryo, whereas chromatin repression prevents DUX4 expression in most somatic tissues. Failure of this repression causes facioscapulohumeral muscular dystrophy (FSHD) due to mis-expression of DUX4 in skeletal muscle. In this study, we used CRISPR/Cas9 engineered chromatin immunoprecipitation (enChIP) locus-specific proteomics to characterize D4Z4-associated proteins. These and other approaches identified the Nucleosome Remodeling Deacetylase (NuRD) and Chromatin Assembly Factor 1 (CAF-1) complexes as necessary for DUX4 repression in human skeletal muscle cells and induced pluripotent stem (iPS) cells. Furthermore, DUX4-induced expression of MBD3L proteins partly relieved this repression in FSHD muscle cells. Together, these findings identify NuRD and CAF-1 as mediators of DUX4 chromatin repression and suggest a mechanism for the amplification of DUX4 expression in FSHD muscle cells.


Introduction
Repetitive DNA sequences make up the majority of the human genome (Birney et al., 2007;de Koning et al., 2011), and these ubiquitous but understudied elements play a critical role in important biological processes such as embryogenesis and cellular reprogramming (Chuong et al., 2017;Elbarbary et al., 2016;Gerdes et al., 2016). For example, each unit of the D4Z4 macrosatellite repeat array contains a copy of the double homeobox 4 (DUX4) retrogene that is expressed in the germline and in four-cell human embryos where DUX4 activates a cleavage-specific transcriptional program (De Iaco et al., 2017;Hendrickson et al., 2017;Snider et al., 2010;Whiddon et al., 2017). This is in contrast to somatic tissues where DUX4 is silenced via repeat-mediated epigenetic repression of the D4Z4 arrays (Das and Chadwick, 2016;Daxinger et al., 2015;Snider et al., 2010;van Overveld et al., 2003;Zeng et al., 2009). To date, little is understood about how the epigenetic repression of DUX4 is relieved at specific times during germline and early embryo development, or what the mechanisms of establishing and maintaining epigenetic repression during later development and in somatic tissues are.
Facioscapulohumeral muscular dystrophy (FSHD) is caused by the mis-expression of DUX4 in skeletal muscle and provides an experimentally tractable context in which to identify mechanisms that normally repress DUX4 in somatic cells as well as mechanisms that might regulate this repression during development. In individuals with FSHD, the epigenetic repression of DUX4 is incomplete as a consequence of having fewer than 11 D4Z4 repeats (FSHD type 1, FSHD1) or mutations in trans-acting chromatin repressors of D4Z4 (FSHD type 2, FSHD2), either of which results in ectopic expression of DUX4 in skeletal muscle when combined with a permissive chromosome 4qA haplotype that provides a polyadenylation site for the DUX4 mRNA (Lemmers et al., 2012;Lemmers et al., 2010;van den Boogaard et al., 2016). The mis-expression of DUX4 in skeletal muscle has many consequences that include induction of a cleavage-stage transcriptional program, suppression of the innate immune response and nonsense-mediated RNA decay (NMD) pathways, inhibition of myogenesis, and induction of cell death through mechanisms that involve the accumulation of aberrant and double-stranded RNAs (Bosnakovski et al., 2008;Feng et al., 2015;Geng et al., 2012;Kowaljow et al., 2007;Rickard et al., 2015;Shadle et al., 2017;Snider et al., 2009;Wallace et al., 2011;Winokur et al., 2003;Young et al., 2013). These cellular insults lead to progressive muscle weakness initiating in the face and upper body but eventually involving nearly all skeletal muscle groups.
A particular sequence of repetitive DNA that plays an important role in human health contains a gene called DUX4 in each repeat. DUX4 is normally active in stem cells and in early-stage embryos. This gene is then switched off or 'silenced' during later stages of development and in most cells of the body. However, in some individuals the DUX4 gene inappropriately activates in muscle cells. This causes a disease known as facioscapulohumeral muscular dystrophy (FSHD), in which muscle weakness begins in the face and upper body and eventually spreads to other muscles. Currently, there is no cure for FSHD.
Proteins that bind to DNA can control the activity of nearby genes. Little is known about which proteins silence DUX4 at the appropriate time and in the right cells, so Campbell et al. set out to identify the proteins that attach to the repetitive DNA sequences containing DUX4. Further investigation showed that several of these proteins play an important role in keeping DUX4 turned off, including two protein complexes called NuRD and CAF-1. These complexes are necessary to silence DUX4 in human muscle cells and stem cells. Campbell et al. also identified a protein that can increase the activity of the DUX4 gene in FSHD muscle cells by overcoming the silencing activity of the NuRD complex.
Overall, the results presented by Campbell et al. provide the groundwork for developing new treatments for FSHD. The next step will be to discover ways of enhancing the ability of NuRD and CAF-1 to silence the DUX4 gene.
siRNA-directed silencing has also been demonstrated to play a role in repressing the D4Z4 array (Snider et al., 2009). The genetic lesions that cause FSHD disrupt these regulatory pathways, resulting in D4Z4 DNA hypomethylation; reduced H3K9me2/3 and H3K27me3 levels; and loss of HP1γ, EZH2, SMCHD1 and cohesin binding, which together culminate in ectopic DUX4 expression (Cabianca et al., 2012;Daxinger et al., 2015;Jones et al., 2014;Lemmers et al., 2012;van den Boogaard et al., 2016;van Overveld et al., 2003;Zeng et al., 2009). Although each of the above-mentioned studies tested specific factors based on knowledge of their role in chromatin, to date no studies have taken an agnostic approach to identify how these individual components might be integrated into repressive complexes or to understand how these complexes might be regulated.
Here, we report a locus-specific proteomics-based characterization of proteins that bind the D4Z4 array in human myoblasts and identify the NuRD and CAF-1 complexes as individually necessary to maintain DUX4 repression in skeletal muscle and induced pluripotent stem (iPS) cells. Further, we show that DUX4-mediated induction of the MBD3L family of factors relieves this repression and amplifies DUX4 expression. Together, these findings identify multiprotein complexes that regulate DUX4 expression and reveal a process for DUX4 amplification in FSHD muscle cells that provides a new candidate target for therapeutics.
A total of 261 proteins were identified (Supplementary file 1), including known D4Z4-associated factors SMCHD1, CBX3/HP1γ and the cohesin complex components SMC1A, SMC3, RAD21 and PDS5B (Lemmers et al., 2012;Zeng et al., 2009) (Table 1). BRD3 and BRD4 were also identified (Supplementary file 1) and BET inhibitor compounds have recently been shown to regulate D4Z4 repression. D4Z4-bound proteins were enriched in gene ontology categories that included telomere maintenance and chromatin silencing (Supplementary file 2), consistent with the subtelomeric localization and transcriptionally repressed state of the D4Z4 array. Strikingly, CHD4, HDAC2, MTA2 and RBBP4, which comprise many of the components of the Nucleosome Remodeling Deacetylase (NuRD) complex (Basta and Rauchman, 2015), were among the isolated proteins (Table 1). While each of these factors was identified as associated with the D4Z4 repeat in more than one gD4Z4 sample, they were either absent or present in only a single replicate from the gMYOD1 pulldowns (Supplementary file 1).
Occupancy of CHD4, HDAC2 and MTA2 at the D4Z4 array was confirmed by chromatin immunoprecipitation (ChIP) in MB2401 myoblasts, an independent control muscle cell line (Figure 1B-D). The NuRD complex can be recruited to methylated DNA by the MBD2 subunit (Le Guezennec et al., 2006;Zhang et al., 1999), and indeed, ChIP showed MBD2 enrichment at the D4Z4 region in MB2401 control myoblasts (Figure 1E). Together, these data demonstrate that the D4Z4 macrosatellite repeat is bound by the MBD2/NuRD complex in control human muscle cells.
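The replicate pattern described above (present in more than one gD4Z4 sample, absent or present in only a single gMYOD1 replicate) can be illustrated with a minimal sketch. The function name, thresholds-as-parameters and the placeholder protein labels (`P1` to `P5`) are assumptions made for illustration only; the actual filtering criteria used for the enChIP-MS data are not spelled out in this section, and the toy replicates below are not experimental results.

```python
from collections import Counter


def filter_locus_specific(target_reps, control_reps, min_target=2, max_control=1):
    """Keep proteins seen in >= min_target target pulldowns
    and in <= max_control control pulldowns (hypothetical criteria)."""
    # Count, per protein, the number of replicates in which it was detected.
    target_counts = Counter(p for rep in target_reps for p in set(rep))
    control_counts = Counter(p for rep in control_reps for p in set(rep))
    return sorted(
        p
        for p, n in target_counts.items()
        if n >= min_target and control_counts.get(p, 0) <= max_control
    )


# Toy replicate sets with placeholder names (not real data).
gd4z4_reps = [{"P1", "P2", "P3"}, {"P1", "P3"}, {"P1", "P2", "P4"}]
gmyod1_reps = [{"P2", "P5"}, {"P2"}, {"P3"}]

hits = filter_locus_specific(gd4z4_reps, gmyod1_reps)
print(hits)
```

In this toy input, `P2` is dropped because it appears in two control replicates, and `P4` is dropped because it appears in only one target replicate.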

MBD2/NuRD complex components mediate transcriptional repression of the D4Z4 array
The NuRD complex represses gene transcription via the concerted effort of the core subunits HDAC1 and HDAC2; CHD3 or CHD4; MBD2 or MBD3; MTA1, MTA2 or MTA3; RBBP4 and RBBP7; and GATAD2A and GATAD2B (Basta and Rauchman, 2015) (Figure 2A). In MB2401 control myoblasts, small interfering RNA (siRNA) depletion of the lysine deacetylases HDAC1 or HDAC2 had no significant effect on DUX4 mRNA levels, whereas concurrent HDAC1/HDAC2 knockdown increased DUX4 mRNA 100-fold resulting in the activation of DUX4 target genes ZSCAN4 and TRIM43 (Figure 2B and Figure 2-figure supplement 1A). In contrast, in MB073 FSHD1 and MB200 FSHD2 myoblasts, singular HDAC1 or HDAC2 depletion led to a ≥20-fold activation of DUX4 mRNA while dual HDAC1/HDAC2 knockdown increased DUX4 levels more than 140-fold, with comparable

Figure 1. NuRD complex components bind the D4Z4 macrosatellite repeat. (A) Schematic summary of the enChIP procedure. A 3xFLAG-dCas9-HA-2xNLS fusion protein (FLAG-dCas9) consisting of an N-terminal triple FLAG (3xFLAG) epitope tag, catalytically inactive Cas9 endonuclease (dCas9), C-terminal human influenza hemagglutinin (HA) epitope tag and tandem nuclear localization signal (2xNLS) is expressed with one or more guide RNA (gRNA) in an appropriate cell context. Cells are crosslinked, chromatin is fragmented and complexes containing FLAG-dCas9 are immunoprecipitated with an anti-FLAG antibody. After reversing the crosslinks, molecules associated with the targeted genomic region are purified and identified by downstream analyses including mass spectrometry and next-generation sequencing. Adapted from Fujita et al. (2016). (B-E) ChIP-qPCR enrichment of NuRD complex components CHD4 (B), HDAC2 (C), MTA2 (D) and MBD2 (E) along the D4Z4 repeat in MB2401 control myoblasts. The Chr18q12 amplicon contains no CpG dinucleotides and serves as a negative control site, while the TMEM130 promoter is occupied by NuRD complex components in published ENCODE datasets (ENCODE Project Consortium, 2012) and functions as a positive control locus. Error bars denote the standard deviation from the mean of three biological replicates. Statistical significance was calculated by comparing the specific pulldown to the IgG control at each site using a two-tailed, two-sample Mann-Whitney U test. *, p ≤ 0.05; ns, not significant, p > 0.05. See also

Collectively, these results indicate that HDAC1 and HDAC2 are associated with, and function to transcriptionally repress, the D4Z4 array. These data also show that the D4Z4 repeat in control myoblasts is more resistant to de-repression than the D4Z4 repeat in FSHD cells, which are sensitized because of a shortened array (FSHD1) or SMCHD1 mutation (FSHD2).
We next evaluated the necessity of the ATP-dependent chromatin remodelers CHD3 and CHD4 for D4Z4 repeat repression. Depleting CHD4 from MB2401 control myoblasts had no effect on DUX4 expression (Figure 2E and , consistent with its absence from the gD4Z4 enChIP purifications and the mutually exclusive nature of CHD3 and CHD4 within the NuRD complex. Together, these results reveal that CHD4 binds the D4Z4 repeat and is necessary to silence DUX4 expression in FSHD cells, whereas control myoblasts have a more stably repressed D4Z4 array. Similar to CHD4, depleting methyl-CpG-binding protein MBD2 from MB2401 control myoblasts had no effect on DUX4 mRNA levels (Figure 2H and Figure 2-figure supplement 5A). However, depleting MBD2 from MB073 FSHD1 myoblasts moderately, but significantly, increased DUX4 expression, whereas DUX4 was not de-repressed when MBD2 was knocked down in MB200 FSHD2 myoblasts (Figure 2I-J and Figure 2-figure supplement 5B-C). This difference suggests a possible D4Z4 context-dependent effect that was not observed for the single-copy NuRD complex-bound gene TMEM130 following MBD2 knockdown (Figure 2-figure supplement 5). We further observed that depletion of MBD3, which can recruit the NuRD complex to unmethylated DNA (Baubec et al., 2013;Le Guezennec et al., 2006;Saito and Ishikawa, 2002), did not de-repress DUX4 in MB2401 control, MB073 FSHD1 or MB200 FSHD2 myoblasts (Figure 2-figure supplement 6). Together, these data show that MBD2 occupies the D4Z4 array and is necessary for DUX4 repression in at least some contexts, and suggest that factors in addition to MBD2 might recruit components shared by the NuRD complex to silence the D4Z4 macrosatellite repeat.

Silencing the D4Z4 array requires components of the MBD1/CAF-1 complex
The NuRD complex is known to cooperate with other complexes to carry out its cellular functions.
For example, NuRD and the CAF-1 chromatin assembly complex work together in several molecular processes (Helbling Chadwick et al., 2009;Yang et al., 2015) and share a core subunit, RBBP4, which was identified as associated with the D4Z4 repeat by gD4Z4 enChIP purification (Table 1). Notably, although knockdown of CHAF1A or CHD4 alone did not induce DUX4 expression in MB2401 control myoblasts (Figure 2E and Figure 3B), simultaneous depletion increased DUX4 mRNA levels over 150-fold (Figure 3H and Together, these results indicate that a combination of MBD1- and MBD2-mediated recruitment of the CAF-1 and NuRD repressive complexes, respectively, work together to silence the D4Z4 repeat in skeletal muscle cells. To extend these studies, we depleted CHD4, CHAF1A, MBD2, or MBD1 in five additional FSHD cell lines: one FSHD1 cell line (54-2) with three 4qA D4Z4 repeats (compared to the 8 repeats of the MB073 line), and four FSHD2 lines (2305, 2453, 2338, and 1881) with different SMCHD1 mutations and repeat sizes ranging from 11 to 15 D4Z4 units (Supplementary file 3). All five lines showed de-repression of DUX4 upon knockdown of MBD2 or CHAF1A, and all but one (2453, an FSHD2 cell line) showed increased DUX4 expression following CHD4 depletion, whereas de-repression following MBD1 knockdown was evident in the FSHD1 and two of the FSHD2 cell lines (Figure 3-figure supplements 4-7). Taken together, these data indicate that the NuRD and CAF-1 complexes have combined roles in repressing DUX4, and that the relative necessity of specific components of each pathway might vary depending on the cellular context, or possibly the efficiency of each knockdown.

Components shared by the NuRD and CAF-1 complexes mediate D4Z4 repeat repression
To repress transcription, core members of the NuRD and CAF-1 complexes utilize a shared set of auxiliary factors, namely the tripartite motif-containing protein TRIM28, the lysine methyltransferase SETDB1, and the lysine demethylase KDM1A (Ivanov et al., 2007;Loyola et al., 2009;Sarraf and Stancheva, 2004;Schultz et al., 2001;Wang et al., 2009;Yang et al., 2015). Knockdown of TRIM28, SETDB1 or KDM1A de-repressed DUX4 in MB073 FSHD1 and MB200 FSHD2 myoblasts to   , implicating them in facilitating silencing of the D4Z4 array. Of these factors, only KDM1A knockdown de-repressed DUX4 mRNA in the MB2401 control myoblasts, indicating a necessary role for this demethylase in maintaining repression of both normal and pathological D4Z4 alleles in muscle cells. In support of these expression data, peptides for TRIM28 were present in gD4Z4 enChIP pulldowns, although they did not meet our filtering criteria to be included in the list of D4Z4-associated proteins.
Similarly, SIN3A peptides were found in a gD4Z4 pulldown before our final filtering steps. The transcriptionally repressive SIN3 complex shares core proteins HDAC1, HDAC2, RBBP4, and RBBP7 with the NuRD complex and is also composed of SDS3, SAP18, SAP30 and SIN3A or SIN3B subunits (Grzenda et al., 2009) (Figure 4-figure supplement 4A). Therefore, we tested its role in D4Z4 repeat repression and found that SIN3A or SIN3B depletion led to the activation of DUX4 and DUX4 target genes in FSHD cells (Figure 4-figure supplement 4B-G), supporting a role for the SIN3 complex in D4Z4 regulation. Taken together, these data indicate that D4Z4 array silencing is mediated by multiple chromatin regulatory factors that act together with core components of the NuRD complex and also depend on the CAF-1 chromatin assembly complex to achieve full epigenetic repression.

Proteins that repress the D4Z4 array in myoblasts also silence DUX4 in iPS cells
We previously reported that DUX4 is expressed at very low levels in human iPS cell populations and, similar to the expression pattern in FSHD myoblasts, this represents the occasional expression in a small number of cells (JWL, unpublished data). We have more recently shown that DUX4 is present in four-cell human embryos and that when expressed in iPS cells or muscle cells it activates a cleavage-stage transcriptional program similar to the program expressed in a subset of 'naïve' iPS or embryonic stem (ES) cells (Hendrickson et al., 2017;Whiddon et al., 2017).
To determine whether factors responsible for silencing the D4Z4 repeat in myoblasts have a similar function in a model of early development, we knocked down components of the NuRD and CAF-1 complexes in human eMHF2 iPS cells, which were derived from an unaffected (non-FSHD) individual, and assessed the impact on DUX4 expression. Similar to our myoblast results, depletion of HDAC1/ HDAC2, CHD4, CHAF1A, SETDB1 or SIN3B de-repressed DUX4 in iPS cells; whereas, unlike in myoblasts, knockdown of KDM1A in iPS cells had a more minor effect on the levels of DUX4 mRNA ( To determine whether iPS cells have a greater necessity for NuRD and CAF-1 components to maintain DUX4 repression compared to somatic cells, we transduced a human foreskin fibroblast cell line (HFF3) with the reprogramming factors Oct4, Sox2, Nanog, and Lin28 to generate isogenic iPS cell clones ( Figure 5-figure supplement 2). Notably, depletion of NuRD and CAF-1 complex components did not lead to DUX4 de-repression in the parental HFF3 fibroblast line, whereas the HFF3 iPS lines responded similarly to the eMHF2 iPS line ( Figure 5-figure supplement 3). These results indicate that the NuRD and CAF-1 complexes that silence the D4Z4 macrosatellite array in muscle cells also contribute to the regulation of this locus in human iPS cells, and that iPS cells have decreased D4Z4 repression compared to their somatic counterpart, similar to the decreased repression in FSHD myoblasts compared to control myoblasts.

MBD3L2 de-represses the D4Z4 repeat
In prior studies of DUX4-induced gene expression, we identified the MBD3L family (MBD3L2, MBD3L3, MBD3L4, and MBD3L5) as a direct target of DUX4 that was expressed in FSHD, but not control, muscle cells and muscle biopsies, and activated by exogenous DUX4 in cultured human myoblasts (Yao et al., 2014;Young et al., 2013). MBD3L family proteins can replace MBD2 or MBD3 in the NuRD complex but they lack the CpG-binding domain and antagonize NuRD-mediated transcriptional repression, possibly by preventing the complex from being recruited to its DNA targets (Jiang et al., 2002;Jin et al., 2005). To determine whether MBD3L proteins de-repress the NuRD complex-regulated D4Z4 array, we transduced control and FSHD myoblasts with a lentiviral vector delivering a doxycycline-inducible MBD3L2 transgene and, after selecting for transgene-expressing cells, analyzed DUX4 mRNA and protein after 48 hr of doxycycline treatment. Similar to the knockdown of NuRD complex members, expression of MBD3L2 induced DUX4 5- to 18-fold in MB073 FSHD1 and MB200 FSHD2 myoblasts and increased by 10-fold the number of myoblast nuclei expressing DUX4 protein, whereas DUX4 was not de-repressed in MB2401 control myoblasts (Figure 6A-E and Figure 6-figure supplement 1A-C).
When cultured in low mitogen differentiation media, myoblasts fuse to form multinucleated myotubes, and DUX4 expression increases in FSHD myotubes compared to myoblasts (Balog et al., 2015). To determine whether the DUX4-induced MBD3L proteins might contribute to the increased DUX4 expression in myotubes, we expressed short hairpin RNA (shRNA) to inhibit MBD3L RNAs in MB073 FSHD1 and MB200 FSHD2 myotubes and found that these decreased DUX4 and DUX4 target gene expression by ~50% and ~30%, respectively (Figure 6F-G, Figure 6-figure supplement 1D-E and Figure 6-figure supplement 2). Together, these data implicate MBD3L2 in the regulation of the D4Z4 array and demonstrate that endogenous DUX4-induced MBD3L proteins contribute to the amplification of DUX4 expression in FSHD myotubes.

Discussion
In this study, enChIP-MS identified factors that co-purified with the D4Z4 macrosatellite array in human myoblasts, and subsequent ChIP and knockdown studies revealed that the NuRD and CAF-1 complexes repress DUX4 expression from the D4Z4 repeat in skeletal muscle and iPS cells. To some extent, each complex appears to have a parallel, or redundant, function in DUX4 repression because knockdown of both pathways was necessary to induce DUX4 expression in MB2401 control myoblasts. The distinctive mutations causing FSHD, or other factors such as the distribution of DNA methylation on the D4Z4, might preferentially weaken different specific components of each pathway, as evidenced by the relative necessity for CHD4, MBD2 or MBD1 in different FSHD cell lines. However, the variable efficiencies of the individual knockdowns in each cell type and experiment might also contribute to these apparent differences. It is also important to note that CAF-1 is a chromatin assembly complex and that the knockdowns were performed in replicating myoblasts; therefore, CAF-1 knockdown might not have the same consequence in post-mitotic myotubes. Overall, despite the relative differences in the necessity of the specific protein knockdown of individual components of the NuRD or CAF-1 complexes in different FSHD cells, the data show that these complexes together are necessary to maintain D4Z4 repression. These two complexes also have shared auxiliary components, for example, TRIM28, SETDB1, and KDM1A, and knockdown of these factors also induced DUX4 expression in FSHD cells, with KDM1A knockdown being sufficient on its own to induce DUX4 in control myoblasts.
Our enChIP pulldowns identified several D4Z4-associated proteins that are involved in epigenetic silencing of variegated gene expression in mice (Blewitt et al., 2005;Daxinger et al., 2013). One gene in this group of Modifier of murine metastable epiallele (Momme) genes, Smchd1, was shown to directly repress DUX4 in human cells and to be a causative gene for FSHD2 (Lemmers et al., 2012). In addition to finding SMCHD1 associated with the D4Z4 in our enChIP, we identified the Momme genes PBRM1, RIF1, SMARCA4, SMARCA5 and UHRF1 as D4Z4-associated by enChIP-MS, and implicated the Momme genes HDAC1, SETDB1 and TRIM28 in the regulation of DUX4 through knockdown experiments. The convergence and striking overlap of the results of these two complementary approaches to understanding variegated gene expression suggest that conserved machinery may be responsible for repressing this type of locus across species. The presence of chromatin remodelers and positive transcriptional regulators, such as SMARCA5, BRD3 and BRD4, at the D4Z4 locus in the control cells used for the enChIP also indicates a dynamic balance between activators and repressors, which is consistent with the identification of sense and anti-sense transcripts associated with the D4Z4 repeats in both control and FSHD cells. Our findings also suggest that PBRM1, RIF1, SMARCA4, SMARCA5 and UHRF1 are candidates for playing a role in DUX4 regulation and deserve additional attention in future studies.
The necessity of the NuRD complex to maintain repression of DUX4 in FSHD cells suggests that the DUX4-mediated induction of the MBD3L family of factors might amplify DUX4 expression within a nucleus or facilitate the internuclear spreading of DUX4 in multinucleated myotubes. MBD3L factors replace MBD2 or MBD3 in the NuRD complex and antagonize its normal repressive function (Jiang et al., 2002;Jin et al., 2005). In this study, we showed that expression of MBD3L2 was sufficient to amplify DUX4 expression in FSHD cells and knockdown using an shRNA that targets the entire family showed that expression of the MBD3L family was necessary for the full induction of DUX4 expression in FSHD myotubes. The fact that DUX4 induces high expression of the clustered MBD3L genes reveals a positive feed-forward mechanism that might facilitate spreading of DUX4 expression between nuclei in myotubes. In FSHD myotubes, DUX4 expression apparently initiates in a single nucleus and the protein then spreads to adjacent nuclei in the syncytium. Similarly, MBD3L proteins are detected as spreading to adjacent nuclei (AEC, unpublished data) where they would facilitate DUX4 expression. In this manner, each DUX4 expressing nucleus would act to progressively amplify DUX4 expression in its neighbors, spreading DUX4 expression along the myofiber. This might be similar to, and additive with, the prior observation that the DUX4-mediated inhibition of NMD can amplify DUX4 expression by stabilizing the DUX4 mRNA, which is itself a target of NMD (Feng et al., 2015). It is interesting to speculate that the internuclear amplification of DUX4 expression might contribute to the susceptibility of skeletal muscle to damage in FSHD.
Together, our data provide several complementary approaches to the challenge of creating an FSHD therapeutic. One strategy would be to enhance D4Z4 repression by designing drugs that increase NuRD complex-mediated repression. Although drugs that decrease epigenetic repression are in clinical use, including some that target members of the NuRD complex (SAHA, targeting HDAC1/2; ORY-1001, targeting KDM1A; GSK126, EZH2 inhibitor), drugs that enhance epigenetic repression have received less attention. This is partly due to concerns that they might also suppress important tumor suppressor genes, but the fact that mutations in SMCHD1 and DNMT3B that cause FSHD have limited genome-wide consequences suggests that some factors might be relatively specific for repressing repetitive regions of the genome. A second strategy would be to prevent the amplification of DUX4 after it stochastically 'bursts' on in a myotube nucleus. This might be accomplished by inhibiting the production of MBD3L proteins with small molecules or interfering RNAs. Alternatively, myoblast transplantation with cells containing larger D4Z4 repeat sizes or 4qB alleles might provide 'decoy' nuclei that would absorb MBD3L factors and not activate DUX4; a similar decoy approach could use autologous transplants following deletion of the D4Z4 array and/or the MBD3L cluster.
Although little is known about the regulation of DUX4 expression in cleavage-stage embryos and the testis luminal cells, it is evident from this study that the expression of DUX4 in a small percentage of iPS cells or ES cells shares mechanisms of molecular regulation with skeletal muscle cells. This also indicates similarities between the regulation of the human DUX4 retrogene and the mouse Dux retrogene that is also in a macrosatellite array, although thought to have arisen from a separate retrotransposition of the DUXC gene (Clapp et al., 2007;Leidenroth et al., 2012;Leidenroth and Hewitt, 2010). It was previously shown that CAF-1 depletion in mouse ES cells resulted in the expression of genes specific to two-cell embryos (Ishiuchi et al., 2015), and later shown that induction of these genes was blocked by simultaneous knockdown of mouse Dux along with Chaf1a (Hendrickson et al., 2017). Trim28, Kdm1a, and HDAC inhibitors have been shown to regulate Zscan4 and the early cleavage program in mouse ES cells (Macfarlan et al., 2012), and for Trim28 this activity was shown to be mediated through the induction of mouse Dux (De Iaco et al., 2017). Similarly, the NuRD complex and MBD3 have been shown to inhibit cellular reprogramming in mouse ES cells, and, conversely, reprogramming to a naive stem cell state was facilitated by inhibition of these complexes (Luo et al., 2013;Rais et al., 2013). The fact that inhibiting NuRD or CAF-1 activity potentiates stem cell reprogramming in mouse ES/iPS cells and, as shown in this report, potentiates human DUX4 expression, suggests that DUX4 itself might facilitate reprogramming to the naive state and that mouse Dux and human DUX4 might be subject to similar regulation, a finding not entirely obvious given that these retrogenes are thought to have been generated by independent retrotranspositions of the parental DUXC gene, as noted above.
In summary, we identified components of the NuRD and CAF-1 complexes as necessary to maintain repression of DUX4 expression from the D4Z4 repeat. In control myoblasts, either pathway was sufficient to maintain repression of DUX4, whereas in FSHD cells inhibition of either pathway resulted in higher levels of DUX4 expression. These same mechanisms repress DUX4 expression in iPS cells. In addition, the DUX4 induction of the NuRD antagonist MBD3L family further de-repressed DUX4 in FSHD cells. Together, these findings provide the basis for therapies directed at repressing DUX4 in FSHD and reveal a mechanism for the regulation of DUX4 in stem cells.

(Hendrickson et al., 2017) or derived in-house from normal HFF3 foreskin fibroblasts reprogrammed via lentiviral transduction of Oct4, Sox2, Nanog and Lin28 (Yu et al., 2007), and grown in DMEM:Nutrient Mixture F-12 (1:1, Gibco) with 100 U/100 µg penicillin/streptomycin, 10 mM MEM Non-Essential Amino Acids (Gibco), 100 mM sodium pyruvate (Thermo Fisher Scientific, Waltham, MA, USA), 20% KnockOut Serum Replacement (Gibco), 1 mM 2-mercaptoethanol and 4 ng/ml recombinant human basic fibroblast growth factor under hypoxic (5% O2) conditions on 0.1% gelatin-coated plates pre-seeded with 1.3 × 10⁴ cells/cm² of irradiated mouse embryonic fibroblasts. While the full haplotypes are unknown, eMHF2 cells utilize DUX4 exon 3, suggesting a 4qA161S allele, while HFF3 cells use DUX4 exon 3b, suggesting a 4qA161L allele (Lemmers et al., 2018

Cloning, virus production and transgenic cell line generation
To construct FLAG-dCas9-gRNA plasmids, the lentiCRISPRv2 vector (a gift from Feng Zhang, Addgene plasmid #52961) (Sanjana et al., 2014) was digested with AgeI and BamHI, and PCR was used to amplify AgeI-3xFLAG-EcoRI from a synthesized template and EcoRI-dCas9-BamHI from pHR-SFFV-KRAB-dCas9-P2A-mCherry (a gift from Jonathan Weissman, Addgene plasmid #60954) (Gilbert et al., 2014). The three fragments were ligated together to create a 3xFLAG-dCas9-HA-2xNLS vector, and D4Z4 or MYOD1 gRNAs were then inserted by digesting 3xFLAG-dCas9-HA-2xNLS with BsmBI and ligating it to annealed gRNA oligos. To construct the doxycycline-inducible MBD3L2 plasmid, the MBD3L2 coding region was subcloned into the NheI and SalI sites of the pCW57.1 vector (a gift from David Root, Addgene plasmid #41393). The pGIPZ-shControl and -shMBD3L vectors were obtained from the Fred Hutchinson Cancer Research Center Genomics Shared Resource. Lentiviral particles were produced in 293T cells by co-transfecting the appropriate lentiviral vector with pMD2.G (a gift from Didier Trono, Addgene plasmid #12259) and psPAX2 (a gift from Didier Trono, Addgene plasmid #12260) using Lipofectamine 2000 (Invitrogen, Carlsbad, CA) following the manufacturer's instructions. To generate polyclonal transgenic cell lines, myoblasts were transduced with lentivirus in the presence of 8 µg/ml polybrene and selected using 2 µg/ml puromycin. Monoclonal transgenic lines were generated by transducing at a low cell density using a low multiplicity of infection (MOI <1) and allowing cells that survived selection to form colonies before individual clones were isolated using cloning cylinders.
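The rationale for transducing at a low MOI when deriving monoclonal lines can be illustrated with a simple Poisson model of integration events per cell. This is a minimal sketch under the common textbook assumption that the number of viral integrations per cell is Poisson-distributed with mean equal to the MOI; the source does not state this model, and the function names and the MOI values below are illustrative only.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations (Poisson assumption)."""
    return math.exp(-moi) * moi**k / math.factorial(k)


def transduced_fraction(moi):
    """Fraction of cells with at least one integration."""
    return 1.0 - poisson_pmf(0, moi)


def single_copy_fraction(moi):
    """Among transduced cells, the fraction carrying exactly one integration."""
    return poisson_pmf(1, moi) / transduced_fraction(moi)


# Illustrative MOI values (assumptions, not taken from the methods above).
for moi in (0.1, 0.3, 1.0):
    print(
        f"MOI {moi}: transduced {transduced_fraction(moi):.3f}, "
        f"single-copy among transduced {single_copy_fraction(moi):.3f}"
    )
```

Under this model, lowering the MOI reduces the fraction of cells that are transduced at all but raises the share of surviving clones that carry a single integration, which is the trade-off a low-MOI protocol accepts.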

Protein extraction and immunoblotting
Total protein extracts were generated by lysing cells in SDS sample buffer (500 mM Tris-HCl pH 6.8, 8% SDS, 20% 2-mercaptoethanol, 0.004% bromophenol blue, 30% glycerol) followed by sonication and boiling with 50 mM DTT. Samples were run on NuPage 4-12% precast polyacrylamide gels (Invitrogen) and transferred to nitrocellulose membrane (Invitrogen). Membranes were blocked in PBS containing 0.1% Tween-20 and 5% non-fat dry milk for 1 hr at room temperature before overnight incubation at 4°C with primary antibodies in block solution. Membranes were then incubated for 1 hr at room temperature with horseradish peroxidase-conjugated secondary antibodies in block solution, and chemiluminescent substrate (Thermo Fisher Scientific) was used for detection on film.

Immunofluorescence
Cells were fixed in PBS containing 2% paraformaldehyde (Electron Microscopy Sciences, Hatfield, PA) for 7 min at room temperature and permeabilized for 10 min in PBS with 0.5% Triton X-100. Samples were then incubated overnight at 4°C with primary antibodies, followed by incubation with appropriate FITC- or TRITC-conjugated secondary antibodies for 1 hr at room temperature prior to DAPI counterstaining and imaging with a Zeiss Axiophot fluorescent microscope, AxioCam MRc digital camera and AxioVision 4.6 software (Carl Zeiss Microscopy, Thornwood, NY). ImageJ software (Schneider et al., 2012) was used for image analysis and quantification.

enChIP-qPCR
FLAG-dCas9 chromatin occupancy was analyzed as previously described (Fujita and Fujii, 2013) using chromatin extraction and fragmentation methods from Forsberg et al. (2000) and the following minor modifications. Five million trypsinized myoblasts were crosslinked with 1% formaldehyde (Thermo Fisher Scientific) for 10 min at room temperature. Chromatin was diluted to 0.5% SDS with IP Dilution Buffer (20 mM Tris pH 8.0, 2 mM EDTA, 150 mM NaCl, 1% Triton X-100, 0.01% SDS, cOmplete EDTA-free Protease Inhibitor Cocktail, 100 mM PMSF) and fragmented to an average length of 500 bp using a Fisher Scientific Model 500 Sonic Dismembrator probe tip sonicator. Soluble chromatin was diluted to 0.2% SDS with IP Dilution Buffer before pre-clearing with 5 µg of mouse IgG conjugated to 20 µl of Dynabeads-Protein G (Thermo Fisher Scientific) followed by immunoprecipitation with 5 µg of anti-FLAG M2 antibody conjugated to 50 µl of Dynabeads-Protein G.
Quantitative PCR was carried out on a QuantStudio 7 Flex (Applied Biosystems, Waltham, MA) using locus-specific primers and iTaq SYBR Green Supermix (Bio-Rad Laboratories, Hercules, CA). Primer sequences are listed in Supplementary file 4.

enChIP-MS
The enChIP-MS procedure was performed as described previously (Fujita and Fujii, 2013) using chromatin extraction and fragmentation methods from Forsberg et al. (2000) and the following minor modifications. Forty million myoblasts were harvested by trypsinization and lysed in Cell Lysis Buffer (10 mM Tris pH 8.0, 10 mM NaCl, 0.2% IGEPAL-CA630, cOmplete EDTA-free Protease Inhibitor Cocktail, 100 mM PMSF). The isolated nuclei were crosslinked with 1-2% formaldehyde at room temperature for 10-20 min and then lysed in Nuclei Lysis Buffer (50 mM Tris pH 8.0, 10 mM EDTA, 1% SDS, cOmplete EDTA-free Protease Inhibitor Cocktail, 100 mM PMSF). Chromatin was diluted to 0.5% SDS with IP Dilution Buffer and fragmented using a Fisher Scientific Model 500 Sonic Dismembrator probe tip sonicator to an average length of 3 kb. Sonicated chromatin was diluted to 0.2% SDS with IP Dilution Buffer, pre-cleared with 25 µg of mouse IgG conjugated to 100 µl of Dynabeads-Protein G and immunoprecipitated with 70 µg of anti-FLAG M2 antibody conjugated to 180 µl of Dynabeads-Protein G. An additional two Dynabead washes in Low Salt Wash Buffer replaced the high-salt washes. Eluted and precipitated samples were resuspended in SDS sample buffer, boiled and subjected to SDS-PAGE. Entire gel lanes were excised and proteins analyzed using an OrbiTrap Elite mass spectrometer (Thermo Fisher Scientific) coupled to an Easy-nLC II (Thermo Fisher Scientific) at the Fred Hutchinson Cancer Research Center Proteomics Shared Resource. The raw spectra were searched against a UniProt human protein database that also included common contaminants as defined in Mellacheruvu et al. (2013) using Proteome Discoverer 1.4 software (Thermo Fisher Scientific) to generate peptide-spectrum matches. The number of peptides that mapped to each protein was summarized to generate a 'pseudoquant' metric.
Proteins with at least one peptide-spectrum match in two experimental replicates were carried forward for further analysis, after filtering out common contaminants. Finally, the UniProt annotations for Function and Subcellular location were used to restrict the analysis to nuclear proteins, enriching for biologically relevant nuclear interactions. The R code used for the proteomics data analysis can be accessed via GitHub (https://github.com/sjaganna/2017-campbell_et_al) (Jagannathan, 2017). The gRNA sequences are listed in Supplementary file 4.
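The filtering logic described above can be illustrated with a minimal Python sketch. This is not the authors' R code from the linked repository; the input layout (a list of replicate/protein pairs, one per peptide-spectrum match), the function names and the example values are all hypothetical. The sketch also assumes that "in two experimental replicates" means "in at least two replicates".

```python
from collections import defaultdict


def pseudoquant(psms):
    """Count peptide-spectrum matches per (replicate, protein) pair.

    `psms` is a hypothetical input: an iterable of (replicate, protein)
    tuples, one tuple per peptide-spectrum match.
    """
    counts = defaultdict(int)
    for replicate, protein in psms:
        counts[(replicate, protein)] += 1
    return dict(counts)


def reproducible_proteins(psms, contaminants, min_replicates=2):
    """Keep non-contaminant proteins seen in at least `min_replicates` replicates."""
    replicates_seen = defaultdict(set)
    for replicate, protein in psms:
        if protein in contaminants:
            continue  # drop common contaminants before counting replicates
        replicates_seen[protein].add(replicate)
    return sorted(
        protein
        for protein, reps in replicates_seen.items()
        if len(reps) >= min_replicates
    )


# Made-up example data, for illustration only.
example_psms = [
    ("rep1", "CHD4"), ("rep1", "CHD4"), ("rep2", "CHD4"),
    ("rep1", "KRT1"), ("rep2", "KRT1"),
    ("rep1", "HDAC2"),
]
example_contaminants = {"KRT1"}
kept = reproducible_proteins(example_psms, example_contaminants)
```

In this toy input, only CHD4 passes: KRT1 is treated as a contaminant and HDAC2 appears in a single replicate. The subsequent restriction to nuclear proteins would require UniProt annotations and is not shown.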

GO category analysis
GO analysis was carried out with the PANTHER classification system (Mi et al., 2016) using the statistical overrepresentation test against all human genes and the complete GO Biological process annotation. p-Values were corrected for multiple hypothesis testing using the Bonferroni correction.
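For readers unfamiliar with the Bonferroni correction, the following is a minimal sketch of the adjustment: each raw p-value is multiplied by the number of tests and capped at 1. It illustrates the general procedure only and is not PANTHER's implementation; the function name and example values are hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Made-up raw p-values for four hypothetical GO terms.
raw = [0.125, 0.25, 0.5, 0.001]
adjusted = bonferroni(raw)
```

A GO term would then be called significant if its adjusted p-value falls below the chosen threshold (for example 0.05).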

ChIP-qPCR
The occupancy of NuRD complex components and acetyl-Histone H4 was determined using crosslinked ChIP coupled with micrococcal nuclease digestion as described previously (Skene and Henikoff, 2015). For acetyl-Histone H4 samples, the Lysis Buffer and IP Dilution Buffer were supplemented with 10 mM sodium butyrate. Quantitative PCR was carried out on a QuantStudio 7 Flex using locus-specific primers and iTaq SYBR Green Supermix. Primer sequences are listed in Supplementary file 4.

siRNA transfections
Flexitube and ON-TARGETplus duplex siRNAs were obtained from Qiagen (Hilden, Germany) or GE Dharmacon (Lafayette, CO), respectively. Transfections of siRNAs into myoblasts and iPS cells were carried out using Lipofectamine RNAiMAX (Invitrogen) according to the manufacturer's instructions. A double transfection protocol was followed in myoblasts to ensure efficient depletion of pre-existing proteins. Briefly, cells were seeded at ~30% confluence in six-well plates and transfected ~20 hr later with 6 µl Lipofectamine RNAiMAX and 25 pmol of either gene-specific siRNA(s) or a scrambled non-silencing control siRNA diluted in 125 µl Opti-MEM Reduced Serum Medium (Thermo Fisher Scientific). Forty-eight hours later, myoblasts were transfected a second time and harvested for RNA analysis 48-72 hr after that. In iPS cells, the same procedure was followed except cells were treated with 10 µM Y-27632 ROCK inhibitor (Miltenyi Biotec, Auburn, CA) for 24 hr before being trypsinized and seeded in mTeSR1 medium (STEMCELL Technologies, Vancouver, BC) at 1 × 10⁵ cells/well on Matrigel (Corning Life Science, Tewksbury, MA)-coated six-well plates, and were harvested 48 hr after a single transfection. The sequences of siRNAs are listed in Supplementary file 4.

RNA isolation and RT-qPCR
Total RNA was extracted from whole cells using the RNeasy Mini Kit (Qiagen) according to the manufacturer's instructions. The isolated RNA was treated with DNase I (Thermo Fisher Scientific), heat inactivated, and reverse transcribed into cDNA using Superscript III (Thermo Fisher Scientific) and oligo(dT) primers (Invitrogen) following the manufacturer's protocol. Quantitative PCR was carried out on a QuantStudio 7 Flex using primers specific for each mRNA and iTaq SYBR Green Supermix. The relative expression levels of target genes were normalized to that of the reference genes RPL27, RPL13A or GAPDH by using the delta-delta-Ct method (Livak and Schmittgen, 2001) after confirming equivalent amplification efficiencies of reference and target molecules. Primer sequences are listed in Supplementary file 4.
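As an illustration of the delta-delta-Ct calculation (Livak and Schmittgen, 2001), relative expression is 2^(-ΔΔCt), where ΔCt = Ct(target) - Ct(reference) and ΔΔCt = ΔCt(sample) - ΔCt(control). The sketch below assumes equal, near-100% amplification efficiencies, as the method requires; the function name and Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the delta-delta-Ct method.

    Assumes target and reference amplify with equivalent efficiency,
    so each PCR cycle corresponds to a two-fold change.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Made-up Ct values: the target crosses threshold three cycles earlier
# in the treated sample while the reference gene is unchanged.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=27.0, ct_ref_control=18.0,
)
```

With these made-up values, ΔΔCt is -3, which corresponds to an eight-fold increase in the target relative to the control condition.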

Additional information
Funding