DUX4 is a multifunctional factor priming human embryonic genome activation

Summary
Double homeobox 4 (DUX4) is expressed at the early pre-implantation stage in human embryos. Here we show that induced human DUX4 expression substantially alters the chromatin accessibility of non-coding DNA and activates thousands of newly identified transcribed enhancer-like regions, preferentially located within ERVL-MaLR repeat elements. CRISPR activation of transcribed enhancers by C-terminal DUX4 motifs results in the increased expression of target embryonic genome activation (EGA) genes ZSCAN4 and KHDC1P1. We show that DUX4 is markedly enriched in human zygotes, followed by intense nuclear DUX4 localization preceding and coinciding with minor EGA. DUX4 knockdown in human zygotes led to changes in the EGA transcriptome but did not terminate the embryos. We also show that the DUX4 protein interacts with the Mediator complex via the C-terminal KIX binding motif. Our findings contribute to the understanding of DUX4 as a regulator of the non-coding genome.


INTRODUCTION
Mammalian pre-implantation development commences with conversion of the differentiated gametes into a totipotent zygote. Successful reprogramming of the zygote involves prominent chromatin remodeling and changes in epigenetic landscapes (Conti and Franciosi, 2018; Jukam et al., 2017; Li et al., 2018). Chromatin of the human mature oocyte is essentially inaccessible and transcriptionally silent, whereas a progressive increase in chromatin accessibility commences soon after fertilization (Li et al., 2018; Liu et al., 2019; Wu et al., 2018). Embryonic genome activation (EGA) occurs in minor and major transcription waves. Minor EGA involves pervasive but low-level transcription that is necessary for pre-implantation development in mouse (Abe et al., 2015, 2018; Aoki et al., 1997; Zeng and Schultz, 2005). The minor and major EGA waves take place in humans at the 4-cell and 8-cell stages, respectively (Braude et al., 1988; Dobson et al., 2004; Tesarik et al., 1987; Tohonen et al., 2015). The gene expression profile at the time of maternal-to-zygotic transition differs from that of later embryonic stages, involving transcription from non-coding genomic loci that are predominantly expressed in cleavage stage embryos (Kigami et al., 2003; Peaston et al., 2004; Tohonen et al., 2015).
The conserved DUX-family transcription factors are expressed in several mammalian cleavage stage embryos, including mouse and primate (Whiddon et al., 2017). Recent findings have suggested that DUX may act as a pioneer transcription factor in mammals (De Iaco et al., 2017;Hendrickson et al., 2017) similar to Zelda in Drosophila melanogaster (Liang et al., 2008;McDaniel et al., 2019). Dux knockout mice can survive until adulthood (Chen and Zhang, 2019) but litter sizes from these animals are significantly reduced, indicating cumulative defects over generations (De Iaco et al., 2020). Ex vivo culture of Dux knockout mouse embryos revealed delayed development beyond the genome activation stage with only 65% of the knockout embryos reaching the blastocyst stage at E4.5 (De Iaco et al., 2020). DUX4 is expressed in early human embryos (De Iaco et al., 2017;Hendrickson et al., 2017) and the DUX4 binding motif is enriched at In addition to protein coding transcripts, DUX-family transcription factors activate transcription from noncoding repeat elements (Geng et al., 2012;Whiddon et al., 2017;Young et al., 2013). Mouse Dux and human DUX4 transcription factors diverge on their homeodomain structure, correlating with their species specificity on retrotransposon activation (Whiddon et al., 2017). DUX4 activates transcription from ACRO1 and HSATII satellite repeats, as well as from the long terminal repeat (LTR)-containing elements (De Iaco et al., 2017;Hendrickson et al., 2017;Liu et al., 2019;Whiddon et al., 2017). Accumulating data indicate that repeat loci have been evolutionarily co-opted as regulatory elements for gene expression (Feschotte, 2008;Gerdes et al., 2016;Pontis et al., 2019;Thompson et al., 2016) and that particular repeat families have contributed to the evolution of gene regulatory networks; for example, in placentation (Chuong, 2013) and pregnancy (Lynch et al., 2011). 
Although transcriptional activation of LTR elements in human embryos (Goke et al., 2015;Grow et al., 2015;Hashimoto et al., 2021) and their invocation as alternative promoters have been established (Franke et al., 2017;Whiddon et al., 2017), broader implications of the DUX4-activated repeat elements in the context of human embryo development are largely unexplored.
Enhancers are short DNA regions that are typically characterized by depletion of nucleosomes, overlap with DNAse I hypersensitivity sites (DHS), and being flanked by specific histone modifications (Murakawa et al., 2016). Active enhancers generate RNAs in a bidirectional manner and they are usually positive for H3K27ac and H3K4me1 (Andersson et al., 2014;Arner et al., 2015;Henriques et al., 2018;Hirabayashi et al., 2019;Hon et al., 2017). Transcribed enhancers have a higher tendency of being functionally validated in reporter experiments when compared to non-transcribed enhancers identified only by using histone modifications or DHSs (Andersson et al., 2014). Indeed, functional enhancer units are precisely defined by active transcription start sites (Tippens et al., 2020). Recent analyses show that distal accessible chromatin regions in human early embryos overlap with oocyte hypomethylated regions, transposable elements, and putative cis-regulatory elements (Wu et al., 2018). Here, we elucidated the dynamics and involvement of DUX4 during the human EGA process and shed light on how newly identified DUX4-activated cis-regulatory elements regulate human EGA transcripts.

DUX4 activates thousands of newly identified bidirectionally transcribed enhancer-like regions that are enriched for ERVL-MaLR repeats
To extend previous analyses on chromatin accessibility and repeat elements in human embryos (Goke et al., 2015; Hendrickson et al., 2017; Li et al., 2018; Liu et al., 2019; Whiddon et al., 2017; Wu et al., 2018), we first identified loci that are associated with DUX4 expression. To this end, we performed the assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) (Buenrostro et al., 2015) using doxycycline-inducible DUX4-TetOn human embryonic stem cells (hESCs) (Figures 1A and S1A-S1D). Our analyses revealed substantial changes in the chromatin landscape of DUX4-activated hESCs after only a 4-h doxycycline treatment. We detected 13,826 peaks that were accessible only in DUX4-activated cells, while 7,086 peaks were accessible only in control cells (Figure 1B). The majority of the DUX4-activated peaks overlapped intronic and intergenic regions, indicating that the non-coding genome had become accessible (Figures 1C and S2A). Gene ontology (GO) analysis for biological processes suggested that DUX4-activated peaks are associated with developmental processes including myotube differentiation (Figure S2B). Integration of the ATAC-seq peaks with repeat elements showed ~3-fold enrichment of ERVL-MaLR repeats (belonging to the LTR family) in DUX4-activated peaks but depletion in control peaks (Figures 1D and S2C). The notable enrichment of non-coding ERVL-MaLR elements prompted us to study bi-directionally transcribed enhancer-like regions using native elongating transcript-cap analysis of gene expression (NET-CAGE) with high-throughput sequencing (Figure S3A) (Hirabayashi et al., 2019) in DUX4-TetOn hESCs (Figure 1A). Altogether, we identified ~2M transcription start site (TSS) clusters, of which ~200,000 mapped to 5' ends of genes (also referred to as promoters) and ~1.3M mapped to intronic and intergenic regions (Figure S3B).
After excluding lowly expressed TSS clusters, we identified 84,946 promoters and 19,358 bi-directionally transcribed enhancer-like regions (Table S1) that correlated well between biological replicates (Figure S3C). Remarkably, only 10.4% of DUX4-activated putative enhancer-like regions were also observed in other cell types and tissues, indicating their cell-type-specific nature. Comparison of control and DUX4-activated hESCs showed significant upregulation (FDR < 0.05) of 801 promoters (Table S1), which included known EGA genes such as ZSCAN4, DUXA, and LEUTX as well as recently annotated genes such as KHDC1P1 (Tohonen et al., 2015) (Figure 1E). We also observed significant upregulation (FDR < 0.05) of 5,156 putative enhancer-like regions (Figures 1F, 1G and Table S1), of which ~50% also overlapped DUX4-activated ATAC-seq peaks (Figure S3D). Similar to DUX4-activated ATAC-seq peaks, significantly upregulated promoters and enhancer-like regions were also enriched for ERVL-MaLR repeat elements (Figures 1G, 1H, S3E and S3F). Consistent with previous findings (Geng et al., 2012; Liu et al., 2019; Whiddon et al., 2017; Young et al., 2013), our results highlight ERVL-MaLRs as repeat elements that potentially contribute accessible regulatory regions and transcripts for the human EGA genes.

Putative DUX4 target genes cloned from human 4-cell stage embryos
Purification of millions of cells for NET-CAGE (Hirabayashi et al., 2019) using fluorescence activated cell sorting (FACS) was not feasible. Therefore, we separated the DUX4-expressing (EmGFP+) and control (EmGFP-) hESCs by FACS and performed bulk RNA-seq using the modified single-cell tagged reverse transcription (STRT) method (Krjutskov et al., 2016b). Comparison of mRNA levels in DUX4-activated and control cells confirmed the significant upregulation of known EGA genes such as ZSCAN4, as well as of three recently annotated genes (KHDC1P1, RETT FINGER PROTEIN, and RING FINGER PROTEIN; Figure S4A and Table S2) (Tohonen et al., 2015), in the DUX4-positive cells. These annotated genes are expressed in cleavage stage human embryos (Tohonen et al., 2015). We cloned the predicted cDNAs from human 4-cell stage embryos (Figures S4B-S4D), confirming the presence of these transcripts in cleavage stage embryos.

Functional validation of DUX4-activated enhancer-like regions
The CAGE-based cap-trapping method (Murata et al., 2014) allowed us to pinpoint the TSSs of the ZSCAN4 (Figure S5A) and KHDC1P1 (Figure S5B) promoters at nucleotide resolution. Annotation of the bidirectionally transcribed enhancer-like regions that were significantly upregulated after DUX4 expression revealed a potential enhancer for ZSCAN4 (Figures 1F and 1G). The putative ZSCAN4 enhancer (Figure 1G) is located around 20 kb from the ZSCAN4 promoter (Figure S5A). The putative ZSCAN4 enhancer is also accessible in DUX4-activated cells but not in control cells, and it overlaps an ERVL-MaLR repeat element (Figure 1G). To test the functionality of enhancer-like regions using CRISPR activation, we first generated a dCas9-DUX4 C-terminal fusion protein, which contains the DUX4 C-terminal 9aaTAD and KIX-binding motif (KBM) (but not the DUX4 N-terminal DNA-binding homeodomains), fused with endonuclease-deficient dCas9 (hereafter dCas9-DUX4-C; Figures 1I and S5C). We used either dCas9-DUX4-C or the dCas9-VP192 construct, which contains conventional VP16 transactivator domains (Weltner et al., 2018) (Figures 1I and 1J), in combination with guide RNA (gRNA) pools to target the putative enhancer-like regions in HEK293 cells. We designed altogether five gRNAs (key resources table) for the ZSCAN4 enhancer region to experimentally test the capacity of this enhancer to activate expression of the putative target gene, ZSCAN4. Activation of the ZSCAN4 enhancer region, using either the dCas9-DUX4-C or the dCas9-VP192 construct with a pool of gRNAs, led to significant upregulation of the ZSCAN4 expression level in comparison with the respective controls, dCas9-DUX4-C with TdT guide RNA construct (p = 0.0008, two-tailed Student's t-test; Figure 1I) or dCas9-VP192 with TdT guide RNA construct (p = 0.0002, two-tailed Student's t-test; Figure 1J). Similarly, we also tested the KHDC1P1 enhancer region (Figures 1F and S5D). Activation of the KHDC1P1 enhancer region (using both constructs) led to significant upregulation of the KHDC1P1 expression level (Figures S5E and S5F). These findings reveal the functionality of specific DUX4-activated transcribed enhancers.
Figure 1 legend (partial). (E and F) Global differential expression analysis of DUX4-expressing (dox +) and control (dox -) hESCs for promoters (E) and putative enhancers (F). Log2 mean (counts per million, CPM) of four DUX4-expressing (dox +) and four control (dox -) replicates is shown. Orange and purple dots indicate significantly upregulated (FDR < 0.05) promoters (E) and putative enhancers (F), respectively. Black dots indicate promoters for known 4-cell stage embryo genome activation genes. White dots indicate enhancers validated using the CRISPR activation assay. Yellow dots indicate significantly downregulated (FDR < 0.05) promoters (E) and putative enhancers (F), respectively. Grey dots indicate non-significantly differentially expressed promoters (E) and putative enhancers (F).
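The Figure 1E and 1F legend reports expression as the log2 mean counts per million (CPM) across replicates. As a minimal sketch of that transformation, the Python snippet below normalizes raw counts to CPM and takes the log2 of the replicate mean. The function names, the toy counts, and the pseudocount of 1 are illustrative assumptions and are not taken from the authors' pipeline.

```python
from math import log2

def cpm(counts):
    """Scale raw counts (feature -> count) of one library to counts per million."""
    total = sum(counts.values())
    return {feature: value * 1e6 / total for feature, value in counts.items()}

def log2_mean_cpm(replicates, feature, pseudocount=1.0):
    """log2 of the mean CPM of one feature across replicate libraries.

    The pseudocount (assumed here to be 1) avoids log2(0) for unexpressed features.
    """
    values = [cpm(rep)[feature] for rep in replicates]
    return log2(sum(values) / len(values) + pseudocount)

# Hypothetical toy libraries: two replicates with one feature of interest.
replicates = [
    {"enhancer_A": 1, "other": 999_999},
    {"enhancer_A": 1, "other": 999_999},
]
value = log2_mean_cpm(replicates, "enhancer_A")
```

Plotting this value for the dox + replicates against the dox - replicates would give one point in a scatter of the kind described in the legend.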

DUX4 expression dynamics and localization of the DUX4 protein in human zygotes and early embryos
To study the expression of DUX4 in human embryos, we utilized our published STRT sequencing data (Tohonen et al., 2015) that identified 5' transcript far ends (TFEs) in human metaphase II (MII) oocytes, zygotes, 2-cell, 4-cell and 8-cell stage embryos, and observed enrichment of DUX4 mRNA in zygotes ([...] 2019), and that DUX4 activates ERVL-MaLR-enriched nascent enhancer RNAs, we next characterized DUX4 protein localization in early human embryos. We observed an overall increase in DUX4 antibody staining from the zygote to the 2-cell stage, and further to the 4-cell stage, followed by rapid clearance at the 8-cell stage (Figures 2B and 2C). DUX4 staining was observed both in the cytoplasm and nucleus, and we therefore quantified the nuclear DUX4 staining intensities from the three-dimensional confocal stacks. Quantifications revealed variable but increasing nuclear signals from zygotes up to 4-cell stage embryos, while only a weak signal was detected in the nuclei of 8-cell stage embryos (Figure 2C insets and 2D). Supplemental 3D movies of unprocessed immunofluorescence stainings show DUX4 localization in the nuclei over the developmental trajectory from zygotes to the 8-cell stage (Videos S1, S2, S3, and S4). Our analyses show that DUX4 transcripts become abundant after fertilization and decline rapidly in 2-cell and 4-cell stage embryos. Nuclear localization of the DUX4 protein peaked during the first two days of human embryo development, coinciding with the onset of EGA.
Figure 2 legend (partial). [...] (2015). A pseudo count of 1 was added. (D) A box plot showing quantification of the DUX4 staining intensity in the nucleus in 3D normalized to the intensity in the cytoplasm. The samples are as described in (B and C). In each box the median is indicated, the edges are the 25th and 75th percentiles, and the whiskers extend to the data points not considered outliers. See also Figure S6 and Videos S1, S2, S3, and S4.
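The nuclear DUX4 quantification described above normalizes the nuclear staining intensity to the cytoplasmic intensity and summarizes each stage by its median. A minimal Python sketch of that normalization follows. It assumes the mean voxel intensity is used for each compartment, which the text does not specify, and all function names and intensity values are hypothetical.

```python
from statistics import mean, median

def nuclear_to_cytoplasm_ratio(nuclear_voxels, cytoplasm_voxels):
    """Ratio of mean nuclear to mean cytoplasmic intensity for one cell.

    Using the mean of the voxel intensities is an assumption of this sketch.
    """
    return mean(nuclear_voxels) / mean(cytoplasm_voxels)

def stage_median(cells):
    """Median ratio over a list of (nuclear_voxels, cytoplasm_voxels) pairs."""
    return median(nuclear_to_cytoplasm_ratio(n, c) for n, c in cells)

# Hypothetical intensities for three cells of a single developmental stage.
cells = [
    ([10, 20, 30], [5, 5, 5]),   # ratio 4.0
    ([8, 12], [5, 5]),           # ratio 2.0
    ([30, 30], [10, 10]),        # ratio 3.0
]
summary = stage_median(cells)
```

A ratio above 1 would indicate nuclear enrichment relative to the cytoplasm of the same cell, which makes cells with different overall staining brightness comparable.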

DUX4 knockdown in human zygotes leads to minor changes in the embryonic transcriptome
Recent results have indicated that Dux is not necessary for mouse development (Chen and Zhang, 2019), although negative consequences of Dux knockout seem to accumulate over generations (De Iaco et al., 2020). We asked whether DUX4 affects the transcriptional program during EGA in human embryos, and approached this question using the best available material, human triploid (3PN) zygotes. We microinjected small interfering RNAs (siRNAs) targeting DUX4 (siDUX4) or control siRNAs (siControl) into human 3PN zygotes and monitored them until the third day of development, up to 8-cell-to-morula stage ( Figure 3A). Antibody staining of the DUX4 protein was positive in the siControl embryos but faint in the siDUX4 embryos, as observed 24 h after microinjection ( Figure 3B), indicating that the siRNAs targeting the DUX4 transcripts efficiently reduced DUX4 protein levels. The siDUX4 embryos proceeded through cleavages without differences when compared with the siControl embryos. The blastomeres from the microinjected embryos were dissociated and collected for STRT RNA-seq 48 h after microinjections, on the third day of development, when the majority of the EGA transcripts are highly expressed and the maternal transcripts are lowly expressed in humans (Braude et al., 1988;De Iaco et al., 2017;Liu et al., 2019;Tesarik et al., 1987;Tohonen et al., 2015). Comparison of 8,145 genes (Table S3) 3D). GO analysis for biological process suggested that upregulated genes were significantly associated with regulation of reproductive process while downregulated genes were significantly associated with translation and ribonucleoprotein complex biogenesis ( Figure S6E). Integration with a publicly available single-cell RNA-seq dataset (Yan et al., 2013) indicated that upregulated genes are usually expressed in oocytes, zygotes, 2-cell and 4-cell stages while downregulated genes are expressed in 8-cell, morula and late-blastocyst stages ( Figure 3E). 
These data suggest that the knockdown of DUX4 in human blastomeres leads to minor changes in the embryonic gene expression program.

DUX4 C-terminal KIX binding domain interacts with MED15
DUX4 has been suggested to function as a pioneer factor (Choi et al., 2016;Hendrickson et al., 2017;Whiddon et al., 2017), given its ability to bind MaLR-enriched condensed chromatin loci and to recruit H3K27 acetyltransferase EP300 leading to locus activation (Choi et al., 2016). We asked whether DUX4 interacts with other proteins that could be related to its ability to accomplish genome-wide transcriptional changes.
To this end, we utilized the MAC-tag affinity purification mass spectrometry (AP-MS) method to identify the DUX4 protein-protein interactome. As a negative control we used GFP with a nuclear localization signal in the same plasmid backbone as DUX4. MAC-tag allows identification of both stable (AP-MS) and dynamic (BioID-MS) protein-protein interactions gathered over the course of 20 h (Varjosalo et al., 2013). We identified 43 stable AP-MS and 158 transient BioID-MS high-confidence (BFDR < 0.05) DUX4 interactions, including the previously shown DUX4 interaction partners EP300 and cAMP-response element-binding protein (CREB)-binding protein (CBP) (Choi et al., 2016) (Figure S7 and Table S4, including the protein interactions of DUX4 and the negative control). Comparison of our list of DUX4-interacting proteins to the protein complex database (CORUM) yielded significant overrepresentation of the SWI/SNF chromatin remodeling complex, NSL and NuA4 histone acetyltransferase complexes, SRCAP histone exchanging complex, and the Core Mediator complex (FDR < 0.05, Fisher's exact test; Figure S7). In comparison to the protein-protein interactions of 110 transcription factors that were used as baits in the MAC-tag method (Göös et al., 2021), DUX4 stands out as a notable binding partner of the Mediator complex (Figure 4A). Indeed, out of the 26 known Mediator complex proteins, DUX4 interacted with 16. The majority of the DUX4 protein interactors, including the MED complex proteins, are expressed in human oocytes and pre-implantation embryos (Figure S8). The mammalian Mediator is a transcription coactivator that transduces regulatory signals from transcription factors to RNA polymerase II (Chen et al., 2021). It thus mediates interactions between context-dependent transcription factors, enhancers, and promoters (Soutourina, 2018). Mediator subunit 15 (MED15) was observed as a stable and transient DUX4 protein interactor, suggesting that DUX4 can potentially accomplish some of its suggested functions through interactions with MED15.
Figure 4 legend (partial). Conservation of residues in primates versus human sequences (green curve) C-terminal to residue G153 and sequence alignment of three conserved regions with a disorder value lower than 0.5 (red curve). Residue numbering from UniProt: Q9UBX2. Two helical regions are predicted within the C-terminal region, the first one (cyan helices) and the second one (salmon helix) both containing the amphipathic "ΦXXΦΦ" motif (Φ, bulky hydrophobic amino acid; X, any amino acid) found in several transcription factors reported to interact with KIX (Goto et al., 2002; Radhakrishnan et al., 1997; Wang et al., 2012). The positions of the 9aaTAD (blue letters) and KBM (KIX binding motif; red letters) sequences are indicated by black bars.
To elucidate the functional mechanism of DUX4, we next aimed to identify the protein domain of DUX4 that mediates the interaction with MED15. The DUX4 N-terminal DNA-binding homeodomains are followed by an intrinsically disordered region with three regions of predicted low disorder that are conserved in primates. Within these regions, two predicted amphipathic helices contain a nine amino acid transactivation domain (9aaTAD (Mitsuhashi et al., 2018)), which is also present in another EGA gene, LEUTX (Katayama et al., 2018), and a motif known to recruit the KIX domain (Piskacek et al., 2016) of CBP (Choi et al., 2016) (Figure 4B). DUX4 has previously been shown to interact with EP300/CBP through its C-terminus (Choi et al., 2016). Indeed, the deletion of the last 98 amino acids from the full-length DUX4 C-terminus abolished the ability of DUX4 to interact with either EP300 or CBP (Choi et al., 2016). The DUX4 C-terminus has also been shown to have a dominant negative activity toward full-length DUX4: co-transfection of full-length DUX4 and the C-terminus of DUX4 led to inhibition of DUX4-induced expression of its well-known target gene, [...] and performed a co-immunoprecipitation. While the V5-tagged MED15 was precipitated with the HA-tagged wildtype DUX4, no interactions were found in the presence of the DUX4-KBM mutant (Figure 4D). In summary, our analyses suggest that the 6 amino acid KBM at the end of the DUX4 C-terminus mediates interaction with MED15, alluding to DUX4 having all the attributes needed for rapid target activation.
We observed prominent DUX4 immunofluorescence signal in the cytoplasm of the human early embryos (Figures 2B and 2C). We thus asked whether the homeodomain1-linker-homeodomain2 structure would be stable as a unit without bound DNA and subjected the crystal structure of DUX4 (PDB: 6E8C (Lee et al., 2018), Data S2) to molecular dynamics simulations. Ten residues, highly conserved in primates, formed two interacting clusters (Figures S9D and S9E) stabilizing both domains even in the absence of DNA (Videos S5A and S5B). While predominantly the charge-charge interactions hold the two homeodomains together (Figures S9F-S9I [...] Hilton et al., 2015). Epigenetic pre-patterning of developmental gene expression has been shown to occur in zebrafish prior to EGA (Lindeman et al., 2011). Recent evidence also indicates that the human embryonic genome undergoes priming that involves the acquisition of a globally permissive chromatin state before major EGA (Xia et al., 2019). Of interest, distal candidate cis-regulatory elements are highly accessible in 4-cell stage embryos and may function as enhancers (Xia et al., 2019). Moreover, recent data also imply that evolutionarily young TEs expressed in the early human embryo can serve as enhancers, also for the genes that are required later in development (Pontis et al., 2019).
siRNA-mediated knockdown of DUX4 in human triploid zygotes did not lead to embryonic arrest by the third day of development, in agreement with what has been shown for DUX4 knockout mouse (Chen and Zhang, De Iaco et al., 2017). The siDUX4 blastomeres exhibited minor downregulation of the EGA transcriptome with several retained maternal genes. Maternal mRNA clearance takes place in at least two phases, during oocyte maturation and early embryo development (Vastenhouw et al., 2019), thus before and after EGA, respectively. Recent findings indicate that maternal mRNAs in human oocytes can be clustered based on their degradation rate, suggesting selective mRNA clearance during human maternal-to-zygotic transitions (Sha et al., 2020). Intriguingly, the clearance of a subset of maternal mRNAs was dependent on EGA (Sha et al., 2020). It remains to be elucidated whether DUX4 directly participates in the clearance of maternal mRNAs and if DUX4 is required for human embryo development. DUX4 was recently suggested to play a central role in the regulation of 'maternally biased genes' at the 4-to 8-cell stage in a study that investigated parent-of-origin effects in biparental and uniparental human early embryos (Leng et al., 2019). While the DUX4 binding motif was identified as the most enriched motif for maternally biased genes, many of the putative DUX4 targets were also involved in a transcriptional regulatory network, indicating that they could also be regulated by other factors, such as DUXA and NANOG (Leng et al., 2019). In agreement with these analyses, we anticipate that factors other than DUX4 also function as early regulators of EGA and may compensate for the reduced DUX4 activity in the siDUX4 embryos.
In addition to several chromatin modifiers, our DUX4 protein interactome analysis revealed contacts with RNA-binding proteins and mRNA splicing proteins (Ansseau et al., 2016). Further studies are required to elucidate whether cytoplasmic DUX4 protein interactions relate to the observed DUX4 protein localization in the cytoplasm of early embryos, and whether they are functionally important. DUX4 has previously been shown to recruit EP300/CBP (Choi et al., 2016). We revealed that the DUX4 C-terminal KIX-binding motif recruits the MED15 protein. This suggests that in addition to recruiting acetyltransferase EP300 and CBP, DUX4 also directly interacts with MED15, most likely associated with DUX4-induced transcription initiation. In conclusion, we characterize the dynamics of DUX4 RNA and protein expression in human zygotes and embryos and elucidate its potential functions in EGA. Our results expand the information about DUX4 as a multifunctional factor that regulates both the coding and non-coding genome.

Limitations of the study
We note that there are a few limitations to our study. Although we were able to achieve statistical significance for differentially expressed genes in the DUX4 knockdown experiment, the overall number of blastomeres included in the study was low. We also observed heterogeneity in the expression of genes within siControl and siDUX4 cells potentially due to the use of 3PN embryos. Another possible cause for heterogeneity in gene expression among the siDUX4 embryos is the timing of the microinjection with respect to zygotic enrichment of DUX4. The question whether DUX4 is an essential transcription factor in early human development remains to be resolved. Future studies with a higher number of zygotes and culturing embryos up to the blastocyst stage or 14 days of development following knockdown would provide a broad picture of the role of DUX4 in human development. Additionally, our study had technological limitations. It is currently not feasible to perform NET-CAGE in single cells in oocytes and embryos, owing to the large number of cells required for the library preparation (Hirabayashi et al., 2019). Therefore, the number of transcribed enhancers that are functionally active in human embryos is yet to be determined.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

Conti, M., and Franciosi, F. (2018). Acquisition of oocyte competence to develop as an embryo: integrated nuclear and cytoplasmic events. Hum. Reprod. Update 24, 245-266.

Materials availability
This study did not generate new unique reagents.

Data and code availability
Accession numbers for data generated in this paper and weblinks to the code have been listed in the key resources table. The ATAC-seq, CAGE/NET-CAGE, and bulk STRT data have been deposited in Gene Expression Omnibus (GEO: GSE171803). Cloned transcript sequences have been deposited in European Nucleotide Archive (ENA: LR694082-LR694089). Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

EXPERIMENTAL MODEL AND SUBJECT DETAILS
Collection and experiments on human oocytes and embryos were approved by the Helsinki University Hospital ethical committee, diary numbers 308/13/03/03/2015 and HUS/1069/2016. Human surplus zygotes and embryos were donated by couples that had undergone infertility treatments at the Reproduction Medicine Unit of the Helsinki University Hospital. The donations were done with an informed consent.

METHOD DETAILS
Human ESC culture
hESC lines H1 (WA01) and H9 (WA09) were maintained on Geltrex, hESC-qualified, reduced growth factor basement membrane matrix-coated tissue culture dishes in Essential 8 culture medium and passaged every three to five days by 3-5-min incubation with 0.5 mM EDTA (all from Thermo Fisher Scientific).

Generation of DUX4 TetOn human embryonic stem cells
hESCs were incubated with StemPro Accutase (Thermo Fisher Scientific) until the edges of the colonies started to curl up. The Accutase was aspirated, and the cells were gently detached in cold 5% FBS (Thermo Fisher Scientific) in 1× PBS (Corning) and counted. One million cells were centrifuged at 107 × g for 5 min and the pellet was transferred into 120 µL of R-buffer containing 1 µg of pB-tight-DUX4-ires-EmGFP-pA-PGK-Puro, 0.5 µg of pBASE (Wang et al., 2008) and 0.5 µg of rtTA-M2-IN plasmids (Takashima et al., 2014). 100 µL of the cell-plasmid suspension was electroporated with two pulses of 1100 V, 20 ms pulse width, using the Neon Transfection System (Thermo Fisher Scientific). The electroporated cells were plated on Geltrex-coated dishes in Essential 8 medium with 10 µM ROCK inhibitor Y27632 (Selleckchem). The following day, the medium was exchanged with fresh Essential 8 medium without ROCK inhibitor. The cells were selected with Puromycin at 0.3 µg/mL. The DUX4-TetOn hESC clones were picked manually on Geltrex-coated 96-well plates, expanded, and selected again with Puromycin. Appearance of the EmGFP reporter protein was tested using Doxycycline at concentrations ranging from 0.2 µg/mL to 1.0 µg/mL and detected using an EVOS FL Cell imaging system (Thermo Fisher Scientific [...]
[...] placed in a 48-well plate in which a universal primer, template-switching oligos, and a well-specific 8 bp barcode sequence were added to each well (Krjutskov et al., 2016a). The synthesized cDNAs from the samples were then pooled into one library and amplified by single-primer PCR with the universal primer sequence. The resulting amplified library was then sequenced using an Illumina NextSeq500 instrument. Alignment of raw reads to the hg19 reference genome, normalization, and differential expression (DE) analysis were performed as per the STRTprep pipeline (Krjutskov et al., 2016a).

cDNA cloning of previously unannotated genes
A cDNA library was prepared from a single human 4-cell embryo according to the protocol by Tang et al.,

KHDC1P1 and ZSCAN4 enhancer validation
Putative KHDC1P1 and ZSCAN4 enhancer regions were predicted from the DUX4-TetOn hESC NET-CAGE dataset. The guide RNAs targeting each of the putative enhancers were designed using the Benchling CRISPR tool (https://benchling.com), targeting them within ±200 base pairs of the putative enhancer midpoint. Guide sequences were selected according to their on- and off-target scores and position. Guide RNA oligos are shown in the key resources table. Guide RNA transcriptional units (gRNA-PCR) were prepared by PCR amplification with Phusion polymerase (Thermo Fisher), using as template U6 promoter and terminator PCR products amplified from pX335 together with a guide RNA sequence-containing oligo to bridge the gap. The oligos for guide RNA transcriptional units are as in Balboa et al. (2015). The PCR reaction contained 50 pmol forward and reverse primers, 2 pmol guide oligo, 5 ng U6 promoter and 5 ng terminator PCR products in a total reaction volume of 100 µL. The PCR reaction program was 98°C/10 s, 56°C/30 s, 72°C/12 s for 35 cycles. Amplified gRNA-PCRs were purified and transfected to HEK293 cells as described in Balboa et al. (2015).
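The guide placement rule above (guides within 200 base pairs on either side of the putative enhancer midpoint) can be sketched in a few lines of Python. This is only an illustration of the window arithmetic, not the Benchling workflow the authors used. The coordinates, guide names, and function names are hypothetical.

```python
def guide_window(enhancer_start, enhancer_end, flank=200):
    """Return (start, end) of the region within `flank` bp of the enhancer midpoint."""
    midpoint = (enhancer_start + enhancer_end) // 2
    return max(0, midpoint - flank), midpoint + flank

def guide_in_window(guide_start, guide_end, window):
    """True if the guide's target site lies entirely inside the window."""
    window_start, window_end = window
    return window_start <= guide_start and guide_end <= window_end

# Hypothetical enhancer coordinates and candidate 20-nt guide target sites.
window = guide_window(1_000, 1_500)  # midpoint 1250 -> (1050, 1450)
candidates = {"g1": (1_060, 1_080), "g2": (1_440, 1_460), "g3": (1_300, 1_320)}
kept = sorted(name for name, (s, e) in candidates.items()
              if guide_in_window(s, e, window))
```

In this toy case `g2` is dropped because its target site extends past the window; the on- and off-target scoring mentioned in the text would be applied to the remaining candidates.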

HEK cell transfections
HEK293 cells were seeded on tissue culture treated 24-well plates one day prior to transfection (5 × 10^4 cells/well). Cells were transfected using FuGENE HD transfection reagent (Promega) in fibroblast culture medium with 500 ng of either dCas9-DUX4-C or dCas9-VP192 transactivator encoding plasmid and 200 ng of guide RNA-PCR product or TdTomato guide RNA plasmid. Cells were cultured for 72 h post-transfection, after which samples were collected for qRT-PCR.

RNA isolation, reverse transcription and quantitative real-time PCR from DUX4 TetOn hESCs and HEK293 cells
Total RNA was isolated using the NucleoSpin RNA kit (Macherey Nagel). 1 µg of RNA was reverse transcribed by MMLV-RTase with oligo dT, dNTPs, and Ribolock in MMLV-RTase buffer (Thermo Fisher Scientific). 5× HOT FIREPol qPCR Mix (Solis Biodyne) was used to measure relative mRNA levels with LightCycler (Roche). The ΔΔCT method was followed to quantify the relative gene expression, with CYCLOPHILIN G (PPIG) used as the endogenous control. Relative expression of each gene was normalized to the expression without doxycycline treatment. The primer sequences are listed in the key resources table. Cells transfected with either dCas9-DUX4-C or dCas9-VP192 transactivator together with the TdTomato-targeting guide plasmid were used as controls.
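The relative quantification described above can be illustrated with a short worked example in Python. It uses the common 2^(-ΔΔCT) form of the method, which assumes roughly 100% amplification efficiency for both the target and the endogenous control (PPIG in the text). That efficiency assumption, the function name, and the CT values are illustrative and are not taken from the study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCT) calculation.

    dCT  = CT(target) - CT(endogenous control), computed per sample
    ddCT = dCT(treated) - dCT(control)
    Assumes ~100% PCR efficiency for both amplicons (an assumption of this sketch).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical CT values: target vs. endogenous control,
# in doxycycline-treated and untreated cells.
fold = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=20.0,   # dCT = 4
    ct_target_control=29.0, ct_ref_control=20.0,   # dCT = 9
)
# ddCT = 4 - 9 = -5, so fold = 2**5 = 32.0
```

Because the control sample is its own baseline, the untreated (or TdTomato guide) condition always evaluates to a fold change of 1.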

Data analyses on published single-cell tagged reverse transcription (STRT) data from human oocytes and embryos
We analysed single-cell RNA-sequencing data from Tohonen et al. (2015) for MII oocytes (n = 20), zygotes (n = 59), 2-cell (n = 4), 4-cell (n = 15) and 8-cell (n = 14) embryos. The expression of DUX4 is elusive due to a high number of identical or nearly identical copies present in the human genome.

Overrepresentation analysis of statistically significant interactions matching the protein complex database CORUM (Giurgiu et al., 2019) (https://mips.helmholtz-muenchen.de/corum/) and Gene Ontology terms was performed using the R package enrichR (Chen et al., 2013). Protein interaction networks were constructed from statistically significant (BFDR < 0.05) protein-protein interactions imported to Cytoscape 3.6.0 (Shannon et al., 2003). Known prey-prey interactions were obtained from the iRef database (http://irefindex.org). The negative control (GFP) samples were treated in the same way as the DUX4 samples (tetracycline induction in the case of AP-MS and biotin treatment in the case of BioID-MS). The data from the negative control MS runs are summarized in Table S4.
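To make the overrepresentation test concrete, the sketch below computes the one-sided Fisher's exact test p value as a hypergeometric upper tail, using only the Python standard library. It is not the enrichR implementation used in the study. The 16 of 26 Mediator subunits come from the text, whereas the background size and the number of interactors entered are hypothetical placeholders.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background proteins, K: members of the complex,
    n: bait interactors, k: interactors that belong to the complex.
    This equals the one-sided (enrichment) Fisher's exact test p value.
    """
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# 16 of the 26 Mediator subunits were DUX4 interactors (from the text).
# The background of 20,000 proteins and the 201 interactors are assumed values.
p_mediator = hypergeom_upper_tail(k=16, K=26, n=201, N=20_000)
```

In a full analysis, one such p value per complex would then be corrected for multiple testing before applying the FDR < 0.05 threshold.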

QUANTIFICATION AND STATISTICAL ANALYSIS
No statistical methods were applied to pre-determine sample sizes. Statistical analysis was performed using R version 3.6.1 or Microsoft Excel (t-test). The statistical test and the number of replicates for each analysis are described in the figure legends or the STAR Methods section. A p value < 0.05 was considered significant.