Hox-dependent coordination of mouse cardiac progenitor cell patterning and differentiation

Perturbation of addition of second heart field (SHF) cardiac progenitor cells to the poles of the heart tube results in congenital heart defects (CHD). The transcriptional programs and upstream regulatory events operating in different subpopulations of the SHF remain unclear. Here, we profile the transcriptome and chromatin accessibility of anterior and posterior SHF sub-populations at genome-wide levels and demonstrate that Hoxb1 negatively regulates differentiation in the posterior SHF. Spatial mis-expression of Hoxb1 in the anterior SHF results in hypoplastic right ventricle. Activation of Hoxb1 in embryonic stem cells arrests cardiac differentiation, whereas Hoxb1-deficient mouse embryos display premature cardiac differentiation. Moreover, ectopic differentiation in the posterior SHF of embryos lacking both Hoxb1 and its paralog Hoxa1 results in atrioventricular septal defects. Our results show that Hoxb1 plays a key role in patterning cardiac progenitor cells that contribute to both cardiac poles and provide new insights into the pathogenesis of CHD.


Introduction
Heart morphogenesis and patterning require precise temporal differentiation of distinct cardiac progenitor populations that arise from two early sources of mesoderm progenitors, the first heart field (FHF) and the second heart field (SHF) (Buckingham et al., 2005). The FHF originates from the anterior splanchnic mesoderm and forms the cardiac crescent. The SHF is a progenitor population originating in pharyngeal mesoderm that contributes to heart tube elongation through the progressive addition of cells from the dorsal pericardial wall to both poles of the forming heart. The SHF gives rise to right ventricular and outflow tract myocardium at the arterial pole, and to atrial myocardium including the dorsal mesenchymal protrusion (DMP) at the venous pole (Zaffran and Kelly, 2012). In the absence of SHF cell addition, impaired heart tube elongation and looping leads to early embryonic lethality, and perturbation of this process underlies a spectrum of common congenital heart defects (CHDs) (Prall et al., 2007;Cai et al., 2003). Lineage tracing analysis in mammals has revealed that the SHF is sub-divided into distinct anterior and posterior regions (aSHF and pSHF) (Domínguez et al., 2012;Lescroart et al., 2012;Vincent and Buckingham, 2010). Cardiac progenitors in the aSHF contribute to the right ventricular and outflow tract myocardium (Buckingham et al., 2005), whereas pSHF cells participate to the formation of the atrial and atrioventricular septation through development of the DMP that forms the muscular base of the primary atrial septum (Briggs et al., 2012). In addition, a sub-population of the pSHF has been shown to contribute to the inferior wall of the outflow tract which gives rise to the myocardium at the base of the pulmonary trunk (Bertrand et al., 2011).
A complex network of signaling inputs and transcriptional regulators is required to regulate SHF development (Rochais et al., 2009). Among these signaling molecules, retinoic acid (RA) has been shown to pattern the SHF (Stefanovic and Zaffran, 2017;Hochgreb et al., 2003). Specifically, RA signaling is required to define the limit between the anterior and posterior SHF, as indicated by the abnormal posterior expansion of the expression of aSHF markers genes, including Fgf8, Fgf10, and Tbx1, in Raldh2-mutant embryos (De Bono et al., 2018;Ryckebusch et al., 2008;Sirbu et al., 2008). Hoxa1, Hoxa3 and Hoxb1 are expressed in overlapping sub-populations of cardiac progenitor cells in the pSHF and downregulated prior to differentiation (Bertrand et al., 2011). Hoxb1-and Hoxa1-expressing progenitor cells located in the pSHF segregate to both cardiac poles, contributing to the inflow tract and the inferior wall of the outflow tract Bertrand et al., 2011). In contrast, cardiac progenitors that contribute to the superior wall of the outflow tract and right ventricle do not express Hox transcription factors. Hoxb1 is required for normal deployment of SHF cells during outflow tract development (Roux et al., 2015). TALE-superclass transcription factors (three-amino acid length extension) such as Pbx1-3 or Meis1-2, which are co-factors of anterior Hox proteins, are also expressed in cardiac progenitors, suggesting a wider role for HOX/TALE complexes during SHF development (Paige et al., 2012;Wamstad et al., 2012;Stankunas et al., 2008).
Identification of SHF-restricted regulatory elements has provided evidence that different transcriptional programs operate in distinct SHF sub-populations. Cells expressing Cre recombinase under the control of a SHF-restricted regulatory element from the Mef2c gene contribute widely to the outflow tract and right ventricle, as well as to a population of cells at the venous pole of the heart giving rise to the primary atrial septum and DMP Goddeeris et al., 2008;Verzi et al., 2005;Dodou et al., 2004). Although subdomains of the SHF prefigure and are essential to establish distinct structures within the mature heart, it is unclear how distinct sub-populations are defined. Here, we identify the genome-wide transcriptional profiles and chromatin accessibility maps of sub-populations of SHF cardiac progenitor cells using RNA-and ATAC-sequencing approaches on purified cells. Through gain and loss of function experiments we identify Hoxb1 as a key upstream player in SHF patterning and deployment. Mis-expression of Hoxb1 in the Hox-free domain of the SHF results in aberrant cellular identity of progenitor cells and arrested cardiac differentiation, leading ultimately to cell death. The addition of progenitor cells from the pSHF to the venous pole is also impaired in Hoxa1 -/-; Hoxb1 -/hearts, resulting in abnormal development of the DMP and consequent atrioventricular septal defects (AVSDs). Hoxb1 is thus a critical determinant of cardiac progenitor cell fate in vertebrates.

Results
Transcriptomic and epigenetic profiling of the SHF To identify the transcriptional profiles of distinct cardiac progenitor populations, we made use of two transgenic mouse lines, Hoxb1 GFP and Mef2c-AHF-Cre (Mef2c-Cre), that drive reporter gene expression in sub-domains of the SHF (Roux et al., 2015;Bertrand et al., 2011;Briggs et al., 2013;Verzi et al., 2005). At embryonic day (E) 9.5 (16 somites [s]), the GFP reporter of Hoxb1 GFP embryos is detectable in the posterior region of the SHF ( Figure 1A). Genetic lineage analysis of Hoxb1-expressing cells using the Hoxb1 IRES-Cre mouse line showed that Hoxb1 progenitors contribute to both atria, the DMP and the myocardium at the base of the pulmonary trunk at E11.5-E12.5 ( Figure 1B,C). Genetic lineage analysis of Mef2c-Cre-labelled cells using Mef2c-Cre;Rosa tdT mouse line showed that Tomato-positive (Tomato+) cells are detected in the arterial pole of the heart and the DMP at E9.5-E10.5 ( Figure 1D,E). At E12.5, the contribution of Mef2-Cre-expressing cells is observed in the great arteries (aorta and pulmonary trunk) and the right ventricle ( Figure 1F), consistent with previous observations (De Goddeeris et al., 2008;Verzi et al., 2005). To further characterize the expression pattern of these two reporter lines we performed RNA-FISH (RNAscope fluorescent in situ hybridization). At E8.5-9, RNA-FISH showed that expression of Osr1, Mef2c-Cre labeled cells (Tomato; red). (K) Cartoon summarizing the contribution of the Hoxb1-Cre (green) and Mef2c-Cre (red) lineages in the embryo at E9.5. Nuclei are stained with Hoechst. ao, aorta; at, atria; aSHF, anterior second heart field; ift, inflow tract; la, left atria; lv, left ventricle; oft, outflow Figure 1 continued on next page a gene expressed in pSHF progenitors (Zhou et al., 2015), largely overlapped with Hoxb1 expression ( Figure 1G and I), whereas Mef2c-Cre predominantly labeled a distinct progenitor cell population to Osr1 ( Figure 1G,H). Double whole-mount in situ hybridization identified a minor subset of cardiac progenitors co-labeled by Hoxb1 and Mef2c-Cre;Rosa tdT (Tomato+), likely corresponding to progenitor cells giving rise the DMP at the venous pole and inferior outflow tract wall at the arterial pole ( Figure 1I-K; Roux et al., 2015;Briggs et al., 2013;Bertrand et al., 2011;Verzi et al., 2005).
After micro-dissection and dissociation of the SHF region from Hoxb1 GFP (GFP+) and Mef2c-Cre; Rosa tdT (Tomato+) embryos at E9.5 (16s; n = 3 each), GFP+ and Tomato+ cells were purified by flow cytometry-activated cell sorting (FACS) and subsequently used for RNA-seq ( Figure 2A). FACS analysis showed that Tomato+ and GFP+ cells comprise, respectively, 33% and 23% of the total microdissected region ( Figure 2B,D). The enrichment was validated by qPCR for a set of genes known to be specifically expressed in the pSHF (Hoxa1, Hoxb1, Osr1, Tbx5 and Aldh1a2) ( Figure 2C,E). Although the pSHF markers Tbx5 and Osr1 were not detected in the Tomato+ cells, Hoxa1, Hoxb1 and Aldh1a2 transcripts were amplified in cells from that micro-dissected region ( Figure 2C). Detection of Hoxb1 mRNAs in the Tomato+ cells confirms the co-labeling observed between Hoxb1 and Mef2c-Cre;Rosa tdT (Tomato+) expression in a subset of cardiac progenitors that contribute to both poles of the heart tube ( Figure 1I-K). Principal component analysis (PCA) and calculation of the Euclidean distance between the regularized log (rlog)-transformed data for each sample using DESeq2 demonstrated the strong similarity between biological replicates (Figure 2-figure supplement 1A,B). We identified 11,970 genes expressed in both cell types. 2,133 genes were transcribed specifically in the GFP+ population ( Figure 2F and Figure 2-figure supplement 1C). Gene ontology (GO) enrichment analysis for the biological processes linked to these genes showed a significant enrichment of GO terms associated with 'heart development', 'epithelium development', 'cardiac chamber morphogenesis' and 'cell adhesion' ( Figure 2G). Included in the 'heart development' list we identified several genes previously described as being expressed in the posterior region of the SHF (e.g., Tbx5, Osr1, Tbx18, Foxf1, and Wnt2), as well as Bmp4, Nr2f2, Sema3c, Gata4 ( Figure  Given the small overlap between Hoxb1 and Tomato expression ( Figure 1J), we generated triple transgenic Hoxb1 GFP ;Mef2c-Cre;Rosa tdT embryos at stage E9.5 ( Figure 2I) and purified double positive (GFP+/Tomato+) and simple positive (GFP+ or Tomato+) cells. FACS analysis showed that the GFP+/Tomato+ gate comprised only 1% of the total micro-dissected region ( Figure 2J). Interestingly, transcriptional analysis revealed that both Hoxb1 and GFP transcripts were decreased in the GFP+/Tomato+ population compared to the GFP+ population. Conversely, Cre transcripts were equally expressed in both the Tomato+/GFP+ and Tomato+ populations suggesting that the Mef2c-AHF enhancer was still active in pSHF cells at this stage Rana et al., 2014;Briggs et al., 2012;Xie et al., 2012;Goddeeris et al., 2008).
To define accessible sites for transcriptional regulation in SHF sub-populations, we performed ATAC-seq (Buenrostro et al., 2015). We performed ATAC-seq on FACS-sorted Tomato+ and GFP+ cells from E9.5 (12-14 s) embryos. Samples were subjected to massively parallel sequencing and overlapping peaks from replicate samples were merged to identify high-confidence regions of open chromatin. The correlation heatmaps and PCA plot highlighted the differences in ATAC read concentrations between Tomato+ and GFP+ samples (Figure 3-figure supplement 1A,B). ATAC-seq peaks representing open chromatin were highly reproducible between biological replicates and showed a clear enrichment at putative regulatory elements (Figure 3-figure supplement 1C). We performed a stringent analysis to identify quantitative differences in chromatin accessibility. By comparing the signal for each peak in Tomato+ and GFP+ populations, we identified 1285 peaks that were exclusively accessible in the GFP+ population ( Figure 3A,B). Approximately 94% of peaks found in both the GFP+ and Tomato+ populations, while 3.5% of the peaks were exclusively present in the GFP+ population ( Figure 3B). We then asked whether DNA regions differentially accessible  between the Tomato+ and GFP+ populations were selectively associated with genes specifically expressed in the pSHF that were identified from our RNA-seq analysis. In order to assess differential chromatin accessibility at each consensus peaks we used an affinity analysis as a quantitative approach ( Figure 3C; see Materials and methods). Quantitative analysis confirmed the number of peaks identified by the qualitative approach for each GFP+ or Tomato+ population. In addition, quantitative differences in peak signals were observed between the Tomato+ and GFP+ populations (Figure 3-figure supplement 1D). We next asked if global differences between the accessible chromatin landscapes of the Tomato+ and GFP+ populations correlated with changes in gene expression. Thus, we mapped individual ATAC-seq peaks in the Tomato+ and GFP+ populations based on their distance to the nearest transcription start site (TSS) and examined the expression of the corresponding genes ( Figure 3D,E). Changes in chromatin accessibility did not correlate precisely with changes in gene expression and several peaks near differentially expressed genes were not differentially accessible. Such decoupling between enhancer accessibility and activity has been observed in other developmental contexts, including early cardiogenesis (Racioppi et al., 2019;Paige et al., 2012;Wamstad et al., 2012). However, we found 53 (Tomato+ population) and 65 (GFP+ population) peaks correlating with changes in gene expression ( Figure 3E). Interestingly, we observed an over-representation of TSS in the 'GFP+' peaks compared to the 'Tomato+' peaks suggesting that genes with open promoters are preferentially activated in the pSHF (  (Dodou et al., 2004) was marked by open chromatin in the Tomato+ population, but not the GFP+ population ( Figure 3-figure supplement 1F), confirming that ATAC-seq predominantly marks active promoters and enhancers in prominent compartment-specific patterns. Among the ATAC-seq peaks we observed a high read density within an intron of the Mef2c gene, approximately 2.5 kb or 4 kb upstream of the previously described Mef2c-AHF enhancers (Dodou et al., 2004;von Both et al., 2004).
Together this analysis indicates that our dataset can be used to identify regulatory elements in distinct SHF sub-populations. ATAC-seq in GFP+ cells identified several pSHF-specific peaks indicating that these regions may function as enhancers ( Figure 3F-H). We then used HOMER to identify enriched putative transcription factor motifs in the open chromatin regions of the GFP+ and Tomato + populations ( Figure 3I; Figure 3-figure supplement 1G). One of the most highly enriched motif in open chromatin regions of the GFP+ population was the consensus HOX motif ( Figure 3I; Fan et al., 2012). Other significantly enriched motifs include putative binding sites for PBX and MEIS proteins ( Figure 3I), TALE-class transcription factors interacting with HOX proteins Ladam and Sagerströ m, 2014). Members of the PBX and MEIS families have been previously identified as cofactors of Hoxb1 in mammalian cell lines and embryonic tissues, indicating a high level of functional conservation (Roux and Zaffran, 2016;Ladam and  (G) Gene ontology (GO) analysis of GFP+ progenitor cells performed with ClusterProfiler system showing enrichment of upregulated genes in the pSHF with ranked by -log 10 (p-value). The percentage corresponds to the 'BG ratio' obtained in the GO analysis. (H) Example of the heatmap of 'heart development' GO term associated genes analyzed by RNA-seq (n = 3 from GFP+ cells, n = 3 from Tomato+ cells). (I) Whole-mount fluorescence microscopy of triple transgenic Hoxb1 GFP ;Mef2c-Cre;Rosa tdT embryos at stage E9.5. (J) FACS profile of E9.5 cardiac progenitor cells isolated from Hoxb1 GFP ;Mef2c-Cre;Rosa tdT embryos. (K) Expression of pSHF markers (Hoxa1, Hoxb1, Osr1, Aldh1a2, Tbx5), GFP and the Cre recombinase were analyzed with real-time qPCR. Data were normalized to HPRT and expressed as folds of increase over negative population. aSHF, anterior SHF; oft, outflow tract; pSHF, posterior SHF; rv, right ventricle. Scale bars: 500 mm. The online version of this article includes the following figure supplement(s) for figure 2:    Mann et al., 2009). In addition, we found overrepresentation of motifs for GATA and TEA domain (TEAD) family transcription factors as well as for the NKX2-5 homeodomain. Because transcription factors function in a combinatorial manner, we identified combinations of multiple motifs that were most enriched at pSHF candidate enhancers relative to non-pSHF enhancers ( Figure 3I). Our computational analysis showed that the most enriched combinations contained HOX motifs adjacent to TALE transcription factor recognition sequences. Consistent with these observations, chromatin immunoprecipitation (ChIP)-sequence data for the cofactors of HOX proteins (Meis1, Nkx2-5, HDACs) revealed that these factors bind putative regulatory elements marked by open chromatin in the GFP+ but not in the Tomato+ population ( Figure 3F-H). In open chromatin regions of the Tomato+ population we observed enrichment of motifs for RUNX, LHX and TEAD transcription factors (Figure 3-figure supplement 1G). However, we did not identify combinations of multiple motifs that were enriched in these regions relative to non-specific enhancers.
Mis-expression of Hoxb1 in the Mef2c-AHF-Cre lineage disrupts right ventricular formation In order to investigate the role of Hoxb1 during heart development we generated a conditional activated Tg(CAG-Hoxb1-EGFP) 1Sza (Hoxb1 GoF ) transgenic mouse, in which Hoxb1 cDNA expression is activated upon Cre-mediated recombination. As previously reported Hoxb1 GoF mice without any Cre allele were viable and healthy . Hoxb1 GoF mice were crossed with to Mef2c-AHF-Cre (Mef2c-Cre) mice to mis-express Hoxb1 in Mef2c-AHF+ cells (Verzi et al., 2005). Hoxb1 GoF ;Mef2c-Cre embryos exhibited severe heart defects as early as E9.5, as observed by a looping defect and common ventricular chamber in transgenic compared to control embryos ( Figure 4A-B). Expression of GFP in the aSHF and its derivatives confirmed Mef2c-Cre-driven recombination ( Figure 4A',B'). Immunostaining of E10.5 control embryos revealed normal future right and left ventricular chambers with developing trabeculae ( Figure 4C). However, in Hoxb1 GoF ; Mef2c-Cre embryos, the heart was abnormally shaped with no clear distinction between right and left ventricular chambers ( Figure 4D). The phenotype was even more pronounced at E12.5, when Hematoxylin and Eosin-stained transverse sections showed a hypoplastic right ventricle with abnormal positioning of the ventricular septum ( Figure 4E and F). Hypoplasia of the right ventricle in Hoxb1 GoF ;Mef2c-Cre embryos was evident at E15.5 (Figure 4G-J; 8/8). At this stage, an abnormally thin right ventricular wall was observed (5/8). In addition, 50% of Hoxb1 GoF ;Mef2c-Cre embryos showed misalignment of the great arteries (4/8) and 63% displayed ventricular septal defects (VSD; 5/8). In this context, we never observed viable Hoxb1 GoF ;Mef2c-Cre pups with hypoplastic right ventricle. Overall, these results suggest that ectopic Hoxb1 expression in the Mef2c-AHF lineage disrupts the contribution of anterior cardiac progenitor cells to the forming heart.

Figure 3 continued
Tomato+ Mef2c-Cre labeled cells (red). Data represent merged technical and biological replicates for each cell type. The y-axis scales range from 0 to 80 in normalized arbitrary units. The tracks represent ATAC-seq, whereas the bar graphs represent RNA-seq. Boxed regions show cell-type-specific peaks around Aldh1a2, Osr1, and Sema3C gene loci. (E) Change in accessibility versus change in gene expression in GFP+ and Tomato+ cells. For each ATAC-seq peak, the log of the ratio of normalized ATAC-seq reads (GFP/Tomato) is plotted on the x-axis, and the log of the ratio normalized RNA-seq reads corresponding to the nearest gene is plotted in the y-axis. Peaks that are both significantly differentially accessible (FDR < 0.1) and significantly differentially expressed (FDR < 0.1) are colored green (more open in GFP+ cells, higher expression in GFP+ cells; 65 peaks) or red (more open in Tomato population, higher expression in Tomato; 53 peaks). (F-H) Browser views of Sema3c (F), Osr1 (G) and Aldh1a2 (H) gene loci with ATACseq profiles of GFP+ pSHF progenitor cells (green) and Tomato+ Mef2c-Cre labeled cells (red). Open chromatin profiles correlate with transcription factor binding at putative enhancers specific for cardiac progenitor cells. (I) pSHF enhancers were enriched in DNA binding motifs for HOX and known cardiac transcription factors such as PBX and MEIS. HOX recognition motifs were strongly enriched in a known motif search in pSHF enhancers. Other enriched matrices match binding sites of known cardiac regulators. HOX binding motifs are highly enriched at genomic regions bound by cardiac transcription factors. p-values were obtained using HOMER (Heinz et al., 2010). Combinations of 3 sequence motifs contained within 500 bp are shown. The online version of this article includes the following figure supplement(s) for figure 3:  To determine the origin of this decrease we performed proliferation and cell death assays at E9.5. More specifically, we determined the mitotic index (pHH3+ cells) of Isl1+ cardiac progenitor cells. There was a modest reduction in cell proliferation between control and Hoxb1 GoF ;Mef2c-Cre embryos ( Figure 4K-M). However, a significant increase of Caspase-3 and TUNEL-positive cells was detected in the aSHF (Isl1+) of Hoxb1 GoF ;Mef2c-Cre compared to control embryos, indicating increased apoptosis ( Figure 4N-Q). Hence, the severe right ventricular hypoplasia resulting from ectopic Hoxb1 expression was primarily due to extensive cell death and secondarily to reduced proliferation in the aSHF.
To further examine whether over-expression of Hoxb1 in pSHF cells can lead to heart defects, Hoxb1 GoF mice were crossed with Hoxb1 IRES-Cre (Hoxb1-Cre) mice (Bertrand et al., 2011). In order to better visualize the morphology of the forming heart, whole-mount RNA in situ hybridizations were performed on Hoxb1 GoF ;Hoxb1-Cre and control embryos using a myosin light chain 2 v (Mlc2v) riboprobe ( Read counts showed that Hoxb1 mRNA was upregulated less than two fold in Hoxb1 GoF ;Mef2c-Cre  compared to control embryos (Supplementary file 1). We identified 1378 genes upregulated and 1345 genes downregulated in the SHF of Hoxb1 GoF ;Mef2c-Cre embryos. GO enrichment analysis for the biological processes associated with the upregulated genes showed significant enrichment of the GO terms 'cell death' and 'apoptotic signaling pathway' ( Figure 5B and Figure 5-figure supplement 1C-D), consistent with the decrease of aSHF cell numbers and increase of Cas3+ or TUNEL + cells observed in Hoxb1 GoF ;Mef2c-Cre embryos ( Figure 4N-Q). These genes included regulators of programmed cell death (e.g., Bad, Bmf, Trp53, and Dapk3) as well as modulators of growth and RA signaling (e.g., Gsk3a, Crabp2, and Rara) ( Figure 5-figure supplement 1D). GO analysis of the downregulated genes revealed an enrichment in the GO terms 'heart development' and 'muscle cell differentiation' suggesting an inhibition of cardiac differentiation ( Figure 5B and Figure 5-figure supplement 1D). Fgf10, a well-characterized marker of the aSHF (Kelly et al., 2001), was among the most significant downregulated genes in Hoxb1 GoF ;Mef2c-Cre embryos (p=0.002) ( Figure 5A and We further examined whether ectopic expression of Hoxb1 in the aSHF can alter chromatin accessibility. After micro-dissection and dissociation of the aSHF region from Hoxb1 GoF ;Mef2c-Cre (Hoxb1 GoF ) and Mef2c-Cre;Rosa tdT (Tomato+) embryos at E9.5 (n = 3 each), Hoxb1 GoF and Tomato+ regions were used for ATAC-seq (

Hoxb1 loss of function leads to premature differentiation in the SHF
To complement our functional analysis, we determined if the cardiac differentiation program was affected in the absence of Hoxb1 function. As previously reported Hoxb1 mutant embryos have outflow tract defects leading to later anomalies including VSD and mis-alignment of the great arteries (Roux et al., 2015). RNA-seq transcriptional profiling was performed on progenitor regions isolated from E8.5 (6-8 s) wild-type and Hoxb1-deficient embryos (n = 2 each; Figure 5L; Figure 5-figure supplement 3A-B). Interestingly, GO term analysis of the upregulated genes revealed a significant enrichment of terms including 'heart development', 'cardiac muscle tissue development' and 'regulation of cell differentiation' (Figure 5M and Figure 5-figure supplement 3C-D). Upregulation of several myocardial-specific genes (e.g., Myl2, Myh7, Actn2, Myl3, Nppa (Anf), and Nppb (Bnf)), indicates a premature cardiac differentiation in the SHF of Hoxb1 -/mutant embryos ( Figure 5N; Figure 5. Hoxb1 regulates progenitor identity and differentiation in the pSHF. (A) Volcano plot of transcriptional profiling results with significantly dysregulated genes between Hoxb1 GoF ;Mef2c-Cre and control SHF. The y-axis corresponds to the mean expression value of log 10 (p-value), and the x-axis displays the log2 fold change value. The colored dots represent the significantly differential expressed transcripts (p<0.05); the black dots represent the transcripts whose expression levels did not reach statistical significance (p>0.05). Differential expression analysis performed using DESeq2 Figure 5 continued on next page Figure 5-figure supplement 3E). The upregulation of these myocardial genes was confirmed by qPCR ( Figure 5-figure supplement 3F). The GO term 'epithelial cell development' was significantly enriched in the downregulated genes (e.g., Cdh1, Llgl2, and Lrp5) ( Figure 5M and Figure 5figure supplement 3C-D), consistent with both the deregulation of the epithelial properties of SHF cells  and premature differentiation of cardiac progenitor cells (Soh et al., 2014). This loss-of-function analysis complements and supports the conclusions of our gain-of-function analysis and identifies a role for Hoxb1 in delaying differentiation and regulating progenitor cell identity in the SHF.
Abnormal development of the SHF results in AVSD in Hoxa1 -/-;Hoxb1 -/embryos The formation of a transcriptional boundary between arterial and venous pole progenitor cells in the SHF has recently been shown to reflect the dynamic expression of two genes encoding T-box transcription factors, Tbx1 and Tbx5 . Immunofluorescence analysis of E9.5 (22-23 s) embryos confirmed the complementary expression of Tbx1 and Tbx5 proteins in the SHF ( Figure 6A). Tbx5 expression is restricted to cells in the pSHF close to the inflow tract, whereas Tbx1 is detected in SHF cells close to the outflow tract. In Hoxb1 GoF ;Mef2c-Cre embryos the relative length of the Tbx1+ region close to the outflow tract revealed a significant reduction in the size of the Tbx1+ versus Tbx5+ domains ( Figure 6A,B), although the boundary between Tbx1 and Tbx5domains was established. Measurement of the length of each domain revealed that the Tbx1+ domain was significantly reduced in Hoxb1 GoF ;Mef2c-Cre compared to Mef2c-Cre (control) embryos ( Figure 6C). In addition, despite a general reduction of the SHF, the Tbx5+ domain is increased in Hoxb1 GoF ;Mef2c-Cre compared to control embryos ( Figure 6D). Therefore, our data are consistent with a shift of the boundary between Tbx1 and Tbx5-domains. We further analyzed the expression of Tbx1 and Tbx5 in Hoxb1-mutant embryos. The Tbx5+ domain appears slightly shorter in Hoxb1 -/embryos than in Hoxb1 +/littermates ( Figure 6E,F). Due to redundancy between Hoxa1 and Hoxb1 we performed RNA-FISH analysis in compound Hoxa1 -/-;Hoxb1 -/embryos. We found that the Tbx5+ domain was shorter in double Hoxa1 -/-;Hoxb1 -/compared to Hoxb1 +/littermate embryos ( Figure 6G,H). These experiments confirm that forced activation of Hoxb1 in the Mef2c-AHF+ domain perturbs development of the aSHF. We subsequently investigated posterior SHF contributions to the venous pole of Hoxa1 -/-;Hoxb1 -/hearts. Characterization of cardiac morphology in  H) and Tbx5 (G, I) on E8.5-9 Mef2c-Cre and Hoxb1 GoF ;Mef2c-Cre embryos. An anteriorly shifted expression of Osr1 and Tbx5 is detected in Hoxb1 GoF ;Mef2c-Cre embryos compared to their control littermates (same somite stage -15-16s for Osr1 and 12-13s for Tbx5). RNA-FISH against Osr1, Tbx5 and GFP on serial sections in Hoxb1 GoF ;Mef2c-Cre embryos (K, K', K'') compared to control littermates (J, J'; Serial sagittal sections). (L) Volcano plot showing differential expressed genes between Hoxb1 -/and wild-type samples. The y-axis corresponds to the mean expression value of log 10 (p-value), and the x-axis displays the log2 fold change value. Colored dots represent the significantly differential expressed transcripts (p<0.05); the black dots represent the transcripts whose expression levels did not reach statistical significance (p>0.05). We identified 249 genes upregulated, and 292 genes downregulated in Hoxb1 -/embryos. (M) GO analysis of genes deregulated in Hoxb1 -/embryos with ranked by -log 10 (p-value). (N) Chord plot showing a selection of genes upregulated in dissected pharyngeal mesoderm of Hoxb1 -/embryos present in the represented enriched GO terms. Outer ring shows log2 fold change or GO term grouping (right, key below). Chords connect gene names with GO term groups. Nuclei are stained with Hoechst. at, atria; BA1, branchial arch 1; oft, outflow tract; SHF, second heart field; V, ventricle.   Hoxa1 -/-;Hoxb1 -/hearts at fetal stages revealed lack of the DMP, a posterior SHF derivative, resulting in a primum type atrial septal defect, a class of AVSD (3/3; Figure 6I,J). Inappropriate differentiation of SHF cells may contribute to the loss of DMP formation in these mutants, providing the first evidence that Hoxa1 and Hoxb1 are required for atrioventricular septation.

Hoxb1 is a key regulator of cardiac differentiation
We next sought to investigate the function of Hoxb1 upon cardiac induction of mouse embryonic stem (mES) cells ( Figure 7A). Using a time-course gene expression analysis during cardiac differentiation of mES cells, we detected a peak of Hoxb1 expression at day 4-5 just before the onset of cardiac differentiation ( Figure 7B), as determined by the activation of specific cardiac markers such as Myh6, Myh7, Mlc2a and Mlc2v (Figure 7-figure supplement 1A). Consistent with the initial activation of Hoxb1 and Hoxa1 during heart development in the mouse (Bertrand et al., 2011), the peak of Hoxa1 expression was detected after day 5 ( Figure 7B).
Next, we challenged the system by inducing continuous Hoxb1 overexpression using the mES Teton/Hoxb1 line (Gouti and Gavalas, 2008). Permanent DOX (1 or 0.2 mg/ml) treatment from day 4 of direct cardiac induction onwards ( Figure 7C    these conditions, we found a decreased number of beating embryoid bodies (EBs) ( Figure 7D and Figure 7-figure supplement 2). Absence of upregulation of cell death markers attested that reduction of beating EBs was not caused by ES cell apoptosis ( Figure 7D and Figure 7-figure supplement 1C). Interestingly, we also observed an upregulation of Osr1, Tbx5 and Bmp4 expression under these conditions, suggesting that the cellular identity of differentiating EB cells is changed, consistent with our in vivo observations ( Figure 5C-K and Figure 7-figure supplement 1D). Therefore, our data suggest that Hoxb1 activation in mES cells results in arrest of cardiac differentiation and failure of proper identity commitment, consistent with in vivo results.

Hoxb1 represses the expression of the differentiation marker Nppa
To identify how Hoxb1 controls cardiac differentiation, we analyzed the regulation of Nppa and Nppb expression, two markers of chamber-specific cardiomyocytes (Houweling et al., 2005). RNAseq data showed a higher read count for Nppa and Nppb in the Hoxb1 -/compared to control embryos ( Figure 8A). At E9.5, RNA-FISH confirmed an ectopic expression of Nppa in the SHF of Hoxb1 -/- (Figure 8B,C) and Hoxa1 -/-Hoxb1 -/- (Figure 8D,E) embryos. In contrast, upon Hoxb1 induction, the expression of Nppa and Nppb was downregulated in EBs (Figure 7-figure supplement  1E). Therefore, we hypothesized that Nppa and Nppb may be negatively regulated by Hoxb1 in the pSHF. ChIP-seq data in mouse ES cell lines had shown that Hoxa1, a paralog of Hoxb1, and HDAC-1 and À2 bind the Nppa and Nppb loci (De Kumar et al., 2017;Whyte et al., 2012; Figure 8G). Coherent with these observations, we found potential HOX binding sites in a 0.7 kb Nppa fragment previously shown to be responsible for the expression of the Nppa in the developing heart (Habets et al., 2002). Thus, we hypothesized that Nppa and Nppb may be a direct Hoxb1 target genes in the pSHF. As described (Durocher et al., 1997), transfection of Nkx2-5 alone or co-transfection of Nkx2-5 and Gata4 resulted in strong activation of the 0.7 kb Nppa promoter containing an HOX-motif in both Cos-7 and NIH3T3 cells ( Figure 8I and Figure 8-figure supplement 1A). However, this activity decreased three-fold upon co-transfection with a Hoxb1 expression vector or co-transfection of Hoxb1 and Hoxa1, which are co-expressed in pSHF progenitor cells in vivo ( Figure 8I), demonstrating the repressive role of Hoxb1 on the 0.7 kb Nppa promoter. We next assessed the activity of the 0.7 kb Nppa promoter in cells treated with trichostatin-A (TSA) an inhibitor of the HDAC activity known to regulate HOX functions (McKinsey, 2012). HDACs inhibition increased the luciferase activity of the reporter constructs ( Figure 8J and Figure 8-figure supplement 1B). When co-transfected, Hoxb1 or Hoxa1 suppressed the TSA-mediated activation of the Nppa promoter in cell culture ( Figure 8J). We further examined whether Hoxb1 can also repress the transcription of another myocardial gene, Tnnt2 (cardiac troponin T, cTnT). As for Nppa and Nppb, RNA-seq data showed a higher read count for Tnnt2 in the Hoxb1 -/compared to control embryos ( Figure 8F). Analysis of ChIP-seq dataset showed that Hoxb1 as well as Meis1 bind to the Tnnt2 locus ( Figure 8H). We identified potential HOX binding sites in the promoter of the rat cTnT promoter (0.5 kb), which contains also MEF2 binding sites (Wang et al., 1994). We assessed the activity of the 0.5 kb Tnnt2 promoter in cells treated with TSA. As in the case of the Nppa promoter we observed that co-transfection with a Hoxb1 expression vector or co-transfection of Hoxb1 and Hoxa1 suppressed TSA-medicated activation of the Tnnt2 promoter in Cos7 cells ( Figure 8J). Together these results suggest that Hoxb1 inhibits differentiation in the pSHF by directly repressing myocardial gene transcription even under conditions of histone acetylation.

Discussion
In this study, we characterize the transcriptional profile of subpopulations of SHF progenitor cells contributing to the forming heart and identify central roles of Hoxb1 in the posterior SHF. We report that forced activation of Hoxb1 in the Mef2c-AHF lineage results in a hypoplastic right ventricle and show that Hoxb1 has a dual role in activating the posterior program of the SHF and inhibiting premature cardiac progenitor differentiation through the transcriptional repression of myocardial genes. Thus, Hoxb1 coordinates patterning and deployment of SHF cells during heart tube elongation and altered Hoxb1 expression contributes to CHD affecting both poles of the heart tube. Here, we present the first analysis of open chromatin in purified SHF progenitor cells using the ATAC-seq method. Such datasets are important to understand the tightly regulated genetic networks governing heart development and to determine how these networks become deregulated in CHD. Integration with transcriptomes of sorted cardiac progenitor cells led us to the discovery of novel pSHF markers illustrating how this study provides a large dataset and multiple new avenues of investigation for the future. Among genes enriched in the Hoxb1-expressing progenitors we found several genes known to be important for the inflow tract development, including Tbx5, Osr1,Foxf1,Bmp4,Wnt2,and Gata4. Importantly,Tbx5,Osr1,Foxf1,and Bmp4, have also been implicated in DMP, which derives from both Hoxb1 and Mef2c-AHF-lineages, (Zhou et al., 2017;Burns et al., 2016;Zhou et al., 2015;Hoffmann et al., 2014;Briggs et al., 2013;Xie et al., 2012). Our analysis of the double positive Hoxb1 GFP (GFP+) and Mef2c-Cre;Rosa tdT (Tomato+) population at E9.5 showed an activation of Osr1 and Adlh1a2 transcripts, whereas both Hoxb1 and Tbx5 transcripts are weakly expressed ( Figure 2K). These results suggest a progressive change in cell identity during SHF deployment to alternate cardiac poles.

Hox genes are required for atrioventricular septation
Although recent single-cell RNA-seq analyses identified specific pSHF and aSHF clusters, the subpulmonary and DMP specific sub-populations have not yet been characterized, probably due to the restricted number of these progenitors (de Soysa et al., 2019;Pijuan-Sala et al., 2019). The DMP protrudes into the atrial lumen to contribute to the atrioventricular mesenchymal complex and muscular base of the atrial septum. Perturbation of DMP development in embryos mutant for transcription factors and signaling molecules regulating posterior SHF deployment results in AVSDs Rana et al., 2014;Briggs et al., 2012;Xie et al., 2012;Goddeeris et al., 2008). We observed that either Hoxb1 or Hoxa1;Hoxb1 loss of function leads to premature differentiation in the SHF. Importantly, premature myocardial differentiation in the SHF is known to contribute to defective DMP development and leads to AVSD (Goddeeris et al., 2008). Consistent with these observations we found AVSD with absence of the DMP in Hoxa1 -/-;Hoxb1 -/mutant embryos, identifying Hox genes as upstream players in the etiology of this common form of CHD. Among the pSHF genes upregulated in Hoxb1 GoF ;Mef2c-Cre and downregulated in Hoxb1 -/mutant embryos were Bmp4 and Gata4 ( Figure 5). Bmp4 is expressed in the DMP (Briggs et al., 2016;Burns et al., 2016;Sun et al., 2015;Briggs et al., 2013) and mutations in BMP4 and the BMP receptor Alk2 have been implicated in atrial septal defects and AVSDs (Smith et al., 2011). Similarly, Gata4 is required for DMP development and atrioventricular septation (Liao et al., 2008, Zhou et al., 2017.  (Whyte et al., 2012). (I) Constructs were co-transfected with Nkx2-5, Gata4, Hoxa1 and Hoxb1 expression vectors into Cos-7 cells. Luciferase activity was determined and normalized as fold over the reporter alone (mean ± SEM, n = 3, *p<0.05 for Nkx2-5 and Hoxb1 versus Nkx2-5, using ANOVA). (J) Luciferase reporter assays on the À633/+87 bp region of the Nppa promoter and the À497/+1 bp region of the Tnnt2 promoter. Cos-7 cells co-transfected with Hoxa1 and Hoxb1 expressing vector or not were treated in the absence or presence of 30 nmol/l TSA. Bars represent mean ± SEM (n = 3). Statistical test was conducted using ANOVA ( * p<0.05 for Hoxa1, Hoxb1 and TSA treatment versus Hoxb1 or TSA treatment). oft, outflow tract; lv, left ventricle; ra, right atrium. The online version of this article includes the following figure supplement(s) for figure 8: Hoxb1 is required to repress cardiac differentiation in the pSHF Examination of ectopic expression of Hoxb1 in the Mef2c-Cre derivatives showed a hypoplastic right ventricle phenotype. RNA-seq analysis showed that ectopic expression of Hoxb1 results in mis-specification of the anterior program leading to the downregulation of genes involved in cardiac differentiation. Consistent results were observed in Hoxb1 -/mutant embryos where GO analysis showed upregulated genes related to 'cardiac muscle tissue development' and 'muscle system process'. This aberrant gene program prompted us to hypothesize that Hoxb1 blocks the differentiation process by inhibiting activation of a set of myocardial genes. Our observations in the mES cell system confirmed the repressive function of Hoxb1 during cardiac differentiation. Consistent with our previous work, the present study suggests that Hox genes delay or repress the differentiation of cardiac progenitor cells, thus allowing them to participate in heart formation at the right time and coordinating progressive heart tube elongation. Interestingly, a small number of apoptotic cells were found to be present in the aSHF in Hoxb1 GoF ;Mef2c-Cre embryos ( Figure 4N-Q). However, although we observed downregulation of cardiac differentiation markers after Dox treatment of mES Tet-on/Hoxb1 cells, no activation of the cell death program was detected in the beating EBs. These observations suggest a difference between the in vivo and in vitro models, which may be linked to culture conditions used for mES cells. Therefore, we speculate that the cell death within the aSHF could be due to lack of survival signals. Such survival signals may include FGFs since increase apoptosis of Isl1+ cardiac progenitor cells was reported in the double homozygous Fgf8;Fgf10 mutants (Watanabe et al., 2010), and conditional Fgf8 flox/mutants (Park et al., 2006). Accordingly, our RNA-seq analysis revealed a significant downregulation of Fgf10 mRNA level in Hoxb1 GoF ;Mef2c-Cre embryos ( Figure 5A; p=0.005). In vitro analysis in conjunction with our RNA-seq analysis suggest that Hoxb1 may act as transcriptional repressor of structural myocardial genes. Hoxb1 activity on Nppa and Tnnt2 promoters was consistent with our in vivo analysis. The 0.7 kb Nppa fragment is responsible for the developmental expression pattern of the Nppa gene and is itself regulated by Gata4 and Nkx2-5 (Habets et al., 2002). Our data provide molecular evidence that Hoxb1 functions to repress differentiation in the pSHF. Several studies have suggested that Hoxb1 plays a repressive role in differentiation of other cell types Bami et al., 2011). Indeed, HOX transcription factors can function as activators or repressors (Mann et al., 2009;Saleh et al., 2000) raising the question of how gene activation versus repression is determined. Our analysis of ATAC-seq in pSHF population demonstrated overrepresentation of specific binding motifs. Several transcription factors known to bind to these motifs, including Nkx2-5, Gata4, Pbx1/2/3 and Meis1/2, have been previously associated with cardiac differentiation. Nkx2-5 cooperates with Hox factors to regulate the timing of cardiac mesoderm differentiation (Behrens et al., 2013). TEAD proteins regulate a variety of processes including cell proliferation survival and heart growth (Lin et al., 2016;von Gise et al., 2012). Unexpectedly, our results implicate TEAD proteins in activating regulatory genes and potentially promoting the proliferation/outgrowth of cardiac progenitors in the pSHF, where nuclear accumulation of the TEAD partners YAP/TAZ has been reported (Francou et al., 2017). The HOX protein family has been previously shown to interact other with homeodomain proteins, including the TALE-class Pbx and Meis family members . This further supports the conclusion that Hoxb1 uses cofactors to activate or repress cardiac genes. Repression is mediated by direct recruitment of repressor complexes, such as NuRD, or through maintenance of repressed chromatin states, such as those mediated by Polycomb complexes functioning in an HDAC complex (Schuettengruber and Cavalli, 2009). Several HOX proteins reportedly bind HAT or HDACs enzymes (Ladam and Sagerströ m, 2014;Saleh et al., 2000). Meis proteins promote HAT recruitment by displacing HDACs to permit HAT binding (Shen et al., 2001). Our results suggest that Meis proteins might thus modulate HAT/HDAC accessibility at Hox-regulated regulatory sequences to delay differentiation in the SHF. Indeed, a recent study using live imaging of cell lineage tracing and differentiation status suggests that there is a discrete temporal lag between the first and second waves of differentiation that form the mouse heart (Ivanovitch et al., 2017). Such a delay of differentiation is essential to orchestrate early cardiac morphogenesis. Hoxb1 may thus contribute to controlling this differentiation delay, in particular through maintaining pSHF progenitor cells in an undifferentiated state until they are added to the venous pole or the inferior wall of the outflow tract. Future work will define how Hoxb1 expression is downregulated to release the cardiac differentiation process during SHF deployment.

Cell culture
Mouse ES Tet-On/Hoxb1 ES lines were generated by the Gavalas laboratory (Gouti and Gavalas, 2008). ES cells were cultured on primary mouse embryonic fibroblast feeder cells. ES cells medium was prepared by supplementing GMEM-BHK-21 (Gibco) with 7.5% FBS, 1% non-essential amino acids, 0.1 mM beta-mercaptoethanoland LIF conditioned medium obtained from pre-confluent 740 LIF-D cells that are stably transfected with a plasmid encoding LIF (Zeineddine et al., 2006). For cardiac differentiation, ES cells were re-suspended at 2.5 Â 10 4 cells/ml in GMEM medium supplemented with 20% fetal calf serum, 1% non-essential amino-acids, and 0.1 mM beta-mercaptoethanol in hanging drops (22 ml) plated on the inside lids of bacteriological dishes. After 48 hr EBs were transferred in 10 ml medium to 10 cm bacteriological dishes. At day 5 EBs were plated on tissue culture dishes coated with gelatin, allowed to adhere. Expression of Hoxb1 was induced by addition of doxycycline (DOX) (Sigma -1 or 0.2 mg/ml) from day four to the end of the experiment. The medium was changed every two days.

Cell sorting
E9.5 (16s) transgenic progenitor heart regions were dissected, pooled (n > 3 embryos for each genotype) and digested with 0.25% Trypsin/EDTA (Invitrogen), neutralized in DMEM (Invitrogen) containing 5% FBS and 10 mmol/L HEPES (Invitrogen), rinsed and resuspended in PBS, and passed through a 70 mm nylon cell strainer (Falcon). Samples were sorted on a FacsAria flow cytometer (BD) using FACSDiva 8.0.1 software. Samples were gated to exclude debris and cell clumps. The number of E9.5 Hoxb1 GFP progenitor cells and Mef2c-Cre;Rosa tdT progenitor cells per embryo obtained were typically 600 to 900, respectively. Fluorescent cells were collected into PBS and processed for RNA extraction or ATAC-seq.

Histological and immunostaining
Standard histological procedures were used (Roux et al., 2015). Heart from Hoxb1 GoF ;Mef2c-Cre and littermate controls were fixed in neutral-buffered 4% paraformaldehyde in PBS, rinsed, dehydrated, paraffin-embedded and tissue sections cut at 8 mm. Sections were stained with Harris' hematoxylin and eosin (H and E) (Sigma). For immunostaining embryos from Hoxb1 -/or Hoxb1 GoF ;Mef2c-Cre and littermate controls were fixed at 4˚C for 20 min in 4% paraformaldehyde, rinsed in PBS, equilibrated to 15% sucrose and embedded in O.C.T. Cryo-sections were cut at 12 mm, washed in PBS and pre-incubated in blocking solution (1%BSA, 1% Serum, 0.2% Tween20 in PBS). Primary antibodies were applied overnight at 4˚C, followed by secondary detection using Alexa Fluor conjugated (Molecular Probes) secondary antibodies. Sections were photographed using an AxioImager Z2 microscope (Zeiss) and photographed with an Axiocam digital camera (Zen 2011, Zeiss).
X-gal staining X-gal staining was performed on whole-mount embryos as previously described (Roux et al., 2015). For each experiment, a minimum of three embryos of each genotype was observed. Embryos were examined using an AxioZoom.V16 microscope (Zeiss) and photographed with an Axiocam digital camera (Zen 2011, Zeiss).

In situ hybridization
Whole-mount in situ hybridization (WISH) was performed as previously described (Roux et al., 2015). The riboprobe used in this study was Mlc2v. For WISH, hybridization signals were detected by alkaline phosphatase (AP)-conjugated anti-DIG antibodies (1/2000; Roche), followed by color development with NBT/BCIP (magenta) substrate (Promega). After staining, the samples were washed in PBS and post-fixed.

ATAC-seq
For each sample, 10,000 FACS-sorted cells were used (n > 3 embryos for each genotype). Cell preparation, transposition reaction, and library amplification were performed as previously described (Buenrostro et al., 2013). Paired-end deep sequencing was performed using a service from GenomEast platform (IGBMC, Strasbourg).
To evaluate reproducibility between replicates and retain peaks with high rank consistency, we applied the Irreproducible Discovery Rate (IDR; https://f1000.com/work) methodology from ENCODE. Only peaks with an IDR value lower than 0.1 were retained.
Narrow peaks were called with MACS 2.1.1. (https://f1000.com/work) BigWig files were generated from bedGraph files to visualize fold enrichment and p-value for all regions within UCSC genome browser.
A Volcano Plot was used to represent peaks in function of the log 2 ratio between two conditions, and the adjusted p-value scaled as -log10. A peak located on the upper right part of the plot corresponds to a significantly strongly enriched peak in the GFP+ (Hoxb1-GFP) population.
A MA plot (log 2 fold change vs. mean average) was used to visualize changes in chromatin accessibility for all peaks. MA plot depicts the differences between ATAC-seq peaks in the experimental samples by transforming the data onto M (log ratio) and A (mean average) scales, then plotting these values. Differential chromatin accessibility is expressed as a log fold change of at least two folds and a p-value of <0.1 and reveals the relative gain of chromatin regions in GFP+ cells (above the 0 threshold line) as compared to the gain in the Tomato+ cells (below the 0 threshold line).
Differential peaks between GFP+ (Hoxb1-GFP) and Tomato+ (Mef2c-Cre;Rosa tdT ) samples were identified using a bed file containing selected peaks from IDR methodology and the DiffBind R package (10.18129/B9.bioc.DiffBind). Peaks with FDR (False Discovery Rate) at 10% were kept. Differential peaks were annotated using the HOMER 7 software and genes in the vicinity of peaks (+ / -150 kb from summit) were selected.
In order to perform motif analysis, we generated a Fasta file containing all sequences surrounding peak summits (+ / -200 bp) and used the HOMER findMotifsGenome feature. Known and de novo motifs were identified. All possible order-3 combinations motifs were generated for each known and de novo motifs. To assess enrichment of motifs combinations, p-values were computed using 20,000 random sequences of 400 bp from the mouse genome. Fisher's exact test was applied to compare random and peaks sequences.

RNA-seq
Total RNA was isolated from the pharyngeal region and sorted cells with NucleoSpin RNA XS (Macherey-Nagel) following the protocol of the manufacturer. cDNA was generated and amplified with the Ovation RNAseq v2 kit (NuGEN). Briefly, 2 ng of total RNA were used for mixed random-/ polyA-primed first-strand cDNA synthesis. After second strand synthesis, the double-stranded cDNA was amplified by single primer isothermal amplification, and the amplified cDNA was bead-purified (AmpureXP, Beckman-Coulter). Paired-end deep-transcriptome sequencing was performed using a service from GenomEast platform (IGBMC, Strasbourg).
E9.5 Mef2c-Cre:Rosa tdTomato and Hoxb1 GFP embryos were dissected on ice-cold PBS to isolate the SHF and cells were FACS. After FACS Tomato+ and Tomato-as well as GFP+ and GFP-cells were homogenized in Trizol (Invitrogen) using a Tissue-lyzer (Qiagen). RNA was isolated from pharyngeal region of E8.5 wild-type and Hoxb1 -/-, and E9.5 control and Hoxb1 GoF ;Mef2c-Cre embryos (n = 3 embryos per each genotype). RNAs of each genotype were pooled to obtain one replicate. RNA was prepared using the standard Illumina TrueSeq RNASeq library preparation kit. Libraries were sequenced in a Hiseq Illumina sequencer using a 50 bp single end elongation protocol. For details of analyses of RNA-seq data see Supplementary file 1. Resulting reads were aligned and gene expression quantified using RSEM v1.2.3 (Li and Dewey, 2011) over mouse reference GRCm38 and Ensembl genebuild 70. Gene differential expression was analyzed using EdgeR R package (McCarthy et al., 2012;Robinson et al., 2010). Genes showing altered expression with adjusted p<0.05 were considered differentially expressed. For the set of differentially expressed genes a functional analysis was performed using Ingenuity Pathway Analysis Software and DAVID (Huang et al., 2009), and some of the enriched processes were selected according to relevant criteria related to the biological process studied.

RNA-Seq data processing
Raw RNA-seq reads were aligned using the STAR aligner version 2.5.2 (Dobin et al., 2013) on the reference GRCm38 mouse genome. Coverage visualization files (WIG) were generated with the STAR aligner software and were converted into BigWig files using UCSC wigToBigWig files to allow their visualization within the UCSC genome Browser.
In parallel, transcripts abundance was computed using HTSEQ-count 0.9.1 (Anders et al., 2015) and the background was estimated through read counts from intergenic regions using windows of 5 kb length.
Normalization and differential gene expression analysis between conditions were performed using R (R version 3.3.4) and DESEQ2 (Love et al., 2014).
For each sample, genes with null expression were removed and we set the 95th percentile of the intergenic read counts as the threshold of detection (log 2 (normalized count + 1)). Heatmap were generated with the Pheatmap R package.

Functional annotation
For RNA-seq and ATAC-seq, genes lists were annotated with the ClusterProfiler  system. Circos plots were generated with Goplot package (Walter et al., 2015).

In vitro reporter assays
Luciferase reporter constructs were co-transfected with expression constructs for human HOXB1, GATA4 (Stefanovic et al., 2014;Singh et al., 2009) andNKX2-5 (Singh et al., 2009). Cos-7 (ATCC CRL-1651) and NIH3T3 (ATCC CRL-1658) cell lines were culture in DMEM high glucose (4.5 g/L, D-glucose, 4 nM L-Glutamine, 1 mM sodium pyruvate, 10% fetal calf serum). Plasmid transfection was performed using PEI (Polyethylenimine) transfection reagent. 24 hr after transfection, cell extracts and luciferase assays were performed following the protocol of the manufacturer (Promega). Mean luciferase activities and standard deviations were plotted as fold activation when compared with the empty expression plasmid. Cos-7 and NIH3T3 are a fibroblast-like cell lines suitable for transfection. Their mycoplasma contamination status resulted negative.

Quantitative RT-PCR analysis
Total RNA was isolated from pharyngeal regions and sorted cells with NucleoSpin RNA XS (Macherey-Nagel) following the protocol of the manufacturer. cDNA was generated using the Affini-tyScript Multiple Temperature cDNA synthesis kit (Agilent). The expression level of different genes was assessed with quantitative real-time PCR. LightCycler 480 SYBR Green I Master mix (Thermo Fisher Scientific) was used for quantitative real-time qRT-PCR analysis with a LightCycler 480 (Roche Diagnostics) following the manufacturer's instructions. All primers are described in the Key Resources Table. Values were normalized to HPRT expression levels.

Statistics
Statistical analyses were performed using unpaired two-tailed t-test to assess differences between two groups. Data are presented as mean ± SD. A p-value of <0.05 was considered significant. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.