Loss of Extreme Long-Range Enhancers in Human Neural Crest Drives a Craniofacial Disorder

Summary Non-coding mutations at the far end of a large gene desert surrounding the SOX9 gene result in a human craniofacial disorder called Pierre Robin sequence (PRS). Leveraging a human stem cell differentiation model, we identify two clusters of enhancers within the PRS-associated region that regulate SOX9 expression during a restricted window of facial progenitor development at distances up to 1.45 Mb. Enhancers within the 1.45 Mb cluster exhibit highly synergistic activity that is dependent on the Coordinator motif. Using mouse models, we demonstrate that PRS phenotypic specificity arises from the convergence of two mechanisms: confinement of Sox9 dosage perturbation to developing facial structures through context-specific enhancer activity and heightened sensitivity of the lower jaw to Sox9 expression reduction. Overall, we characterize the longest-range human enhancers involved in congenital malformations, directly demonstrate that PRS is an enhanceropathy, and illustrate how small changes in gene expression can lead to morphological variation.


In Brief
Non-coding mutations over a megabase from SOX9 cause the craniofacial disorder Pierre Robin sequence (PRS). Long et al. leverage a human neural crest model to demonstrate that PRS is caused by loss of extreme long-range enhancers active during a restricted developmental window and explore mechanisms underlying the specificity of disease manifestations.

INTRODUCTION
Distal regulatory sequences called enhancers control gene transcription at a distance and play a critical role in directing developmental gene expression patterns (Long et al., 2016). Non-coding mutations are increasingly being implicated in human disease (Franke et al., 2016;Laugsch et al., 2019;Lupiáñ ez et al., 2015), and, in particular, perturbations of enhancers have been documented as being causative because of their effects on gene regulation during development (Spitz, 2016). Although mutations of protein-coding sequences often affect multiple tissues in which a given gene is active, mutations in non-coding regulatory regions can selectively perturb target gene expression in specific tissue contexts. For example, SOX9, an SRY (sex-determining region Y)-related HMG (high mobility group) box (SOX) transcription factor, plays numerous important roles during embryogenesis, including sex determination, chondrogenesis, and craniofacial development (Lee and Saint-Jeannet, 2011;Lefebvre and Dvir-Ginzberg, 2016). Heterozygous loss-of-function mutations in the SOX9 coding sequence cause a severe congenital disorder called campomelic dysplasia, which is associated with bowed long limbs, Cell Stem Cell 27, 765-783, November 5, 2020 ª 2020 The Authors. Published by Elsevier Inc. 765 This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). ll OPEN ACCESS disorders of sex determination, and craniofacial defects (Wagner et al., 1994). Interestingly, SOX9 is the sole protein-coding gene within an unusually large,~2-Mb topologically associating domain (TAD) (Bagheri-Fam et al., 2006;Gordon et al., 2009). Many non-coding mutations have been described within this gene desert, including large deletions, translocations, and duplications, that cause a range of defects that recapitulate distinct aspects, but not all features, of campomelic dysplasia, leading to the hypothesis that cell-type-specific enhancers are disrupted in these tissue-selective disorders (Baetens et al., 2017;Kurth et al., 2009;Sanchez-Castro et al., 2013). In some cases, the perturbed enhancers have been mapped and characterized; for example, an SRYresponsive regulatory element essential for sex determination (Gonen et al., 2018).
A cluster of large genomic deletions and translocation breakpoints at the centromeric far end of the SOX9 TAD are associated with isolated Pierre Robin sequence (PRS), a congenital craniofacial disorder characterized by a single primary phenotype: underdevelopment of the lower jaw or mandible (micrognathia) that leads to secondary phenotypes, including retraction of the tongue (glossoptosis), obstruction of the airway, and, with incomplete penetrance, horseshoeshaped cleft palate (Robin, 1994;Tan and Farlie, 2013). This sequence of anomalies, in turn, results in feeding and breathing difficulties and failure to thrive (Rathé et al., 2015). It has been proposed that PRS-associated mutations perturb the function of key SOX9 long-range enhancers active during craniofacial development (Amarillo et al., 2013;Benko et al., 2009;Gordon et al., 2009Gordon et al., , 2014; however, functional characterization of putative craniofacial enhancers and direct demonstration that SOX9 is the target gene are still lacking. Given the specificity of the developmental defects in PRS and the well-documented requirement for SOX9 function in the neural crest (Cheung and Briscoe, 2003;Mori-Akiyama et al., 2003;Spokony et al., 2002), we hypothesized that the centromeric far end of the SOX9 TAD harbors enhancers active in cranial neural crest cells (CNCCs), a transient population of multipotent progenitor cells that give rise to the majority of vertebrate craniofacial structures, including the jaw (Bronner and LeDouarin, 2012;Minoux and Rijli, 2010;Trainor et al., 2003).
Leveraging a well-characterized in vitro differentiation model of human CNCCs (hCNCCs) (Bajpai et al., 2010;Prescott et al., 2015;Rada-Iglesias et al., 2012), we uncover two clusters of hCNCC-specific enhancers overlapping PRS mutations and demonstrate that they regulate SOX9 transcription within a defined developmental window and over extremely large genomic distances of 1.45 Mb and 1.25 Mb, respectively. To model the sensitivity of craniofacial development to changes in Sox9 gene dosage, we generate an allelic series in mice with increasing severity of Sox9 perturbation. We propose a mechanism of disease etiology where two features of Sox9 regulation converge to confine disease phenotypes to the lower jaw. First, loss of the tissue-specific activity of PRS locus enhancers restricts Sox9 dosage perturbation to the developing facial structures, and second, heightened sensitivity of the lower jaw to Sox9 level reduction further confines PRS-associated malformations.

Three Clusters of Candidate Human Cranial Neural Crest Enhancers Overlap Sequences Lost in PRS
Many large non-coding deletions identified in PRS patients map to the SOX9 locus but are mostly non-overlapping, suggesting the presence of multiple regulatory elements with nonredundant functions whose loss leads to similar phenotypic outcomes (Amarillo et al., 2013;Benko et al., 2009;Gordon et al., 2014). Additionally, numerous translocation breakpoints have been identified that displace much of the distal SOX9 gene desert away from the remainder of the locus ( Figure 1A; Benko et al., 2009). To identify candidate hCNCC enhancer elements that map within regions of the SOX9 gene desert lost in PRS patients, we used chromatin immunoprecipitation sequencing (ChIP-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) datasets from in-vitro-derived hCNCCs (Prescott et al., 2015;this study;Figure 1A). Among the candidate enhancers identified within the SOX9 TAD, three enhancer clusters were located at the far centromeric end of the SOX9 gene desert upstream of the PRS translocation breakpoint region and overlapped with at least one of the large deletions seen in patients with PRS. Each cluster contained two or more discrete binding peaks for the general coactivator p300, was enriched for the active enhancer marks H3K27ac and H3K4me1, and corresponded to regions of open chromatin (Prescott et al., 2015;Figures 1A and 2A;Calo and Wysocka, 2013). All three putative enhancer clusters were located over 1 Mb upstream of the SOX9 gene ( Figure 1A) and were named to reflect their genomic arrangement: enhancer cluster 1.45 (EC1.45) is 1.45 Mb upstream of SOX9, EC1.35 is 1.35 Mb upstream of SOX9, and EC1.25 is 1.25 Mb upstream of SOX9 (Figures 1A and 2A).
Importantly, the three clusters of putative enhancers were not marked by active chromatin marks in human embryonic stem cells (hESCs) ( Figure 1A) or other available profiled cell types ( Figure S1A), except for human fetal craniofacial tissues ( Figure S1B; Wilderman et al., 2018), suggesting that the putative enhancers exhibit cell-type-specific activity in the neural crest and developing face. Indeed, activation of these putative enhancers coincided with a strong increase in SOX9 expression during transition from hESCs through neuroectodermal spheres (NECs) to hCNCCs (Figures S1C-S1E). Therefore, epigenomic signatures identified three putative hCNCC ECs overlapping sequences lost in PRS patients.

PRS Region Candidate ECs Make Long-Range Contacts with the SOX9 Promoter
To determine whether PRS region candidate ECs make contact with the SOX9 promoter over more than a megabase of genomic space, we performed SOX9 promoter-anchored Capture-C assays (Davies et al., 2016) in hESCs, NECs, early-migrating hCNCCs (hereafter called early hCNCCs), or late-passage hCNCCs (hereafter called late hCNCCs) (Prescott et al., 2015). In hESCs, the SOX9 promoter formed contacts that spanned the previously defined TAD (Dixon et al., 2012(Dixon et al., , 2015, with the majority of interactions confined ll OPEN ACCESS Article to the telomeric side, and also showed frequent interactions with CTCF (CCCTC-binding factor)/cohesin sites across the locus ( Figures 1A, 1B, and S1F). A strong shift in interaction frequencies was apparent during hCNCC differentiation. In particular, extreme long-range interactions with the EC1.45, EC1.35, and EC1.25 putative ECs at the far centromeric end of the TAD substantially increased in late hCNCCs compared with hESCs ( Figures 1B, S2A, and S2B), but the dynamics for each EC were distinct. Specifically, EC1.35 already contacted the SOX9 promoter in hESCs (Figures S2A and S2B) and, notably, was occupied by CTCF and cohesin in hESCs and hCNCCs. This was mirrored by a similarly bound CTCF site 2 kb upstream of the SOX9 promoter, suggesting that these genome-organizing proteins may facilitate a developmentally stable long-range interaction between the SOX9 promoter and the distal region of the TAD (Arzate-Mejía et al., 2018;Guo et al., 2015;Ren et al., 2017;Schoenfelder and Fraser, 2019;de Wit et al., 2015). In comparison, EC1.25 and EC1.45 did not interact frequently with the SOX9 promoter in hESCs by , and only in early and late hCNCCs did the contact frequency increase.
To confirm these cell-type-specific interactions, we performed reciprocal Capture-C experiments anchored at each of the three PRS region candidate ECs. Again, we observed an increase in contact frequency with the SOX9 promoter in late hCNCCs compared with hESCs ( Figure S2C). Importantly, Capture-C performed from other gene promoters in nearby TADs, including KCNJ2, COG1, and SDK2, did not reveal interaction with the PRS region putative ECs and did not cross the SOX9 TAD boundaries ( Figure S1F). The extreme longrange candidate enhancers at the PRS locus make selective contacts with the SOX9 gene promoter in a disease-relevant cell type.  (top) and P4 hCNCCs (bottom) at the human SOX9 locus. Three putative hCNCC-specific ECs overlap the PRS locus: EC1.45,EC1.35,and EC1.25. PRS patient deletions (red), translocation breakpoints (blue), topological domains (TADs) (Dixon et al., 2012), and protein-coding genes are shown. Centro, centromeric; telo, telomeric. (B) Capture-C from the SOX9 promoter (see anchor) in hESCs, neuroectodermal spheres (NECs), early (day 11) and late (P4)  To investigate the regulatory potential of the putative PRS region ECs, we tested their capacity to activate transcription in a luciferase assay. We cloned the entire human EC1.45 region, including both p300 peaks, upstream of a luciferase reporter gene with an SV40 minimal promoter. Given their greater size, we combined the two p300 peaks for EC1.35 (S1 and S2) and the four p300 peaks for EC1.25 (S3-S6) (Figure 2A). EC1.45 and EC1.25 were found to be extremely strong drivers of transcription in hCNCCs, rivalling the activity of the viral SV40 enhancer positive control ( Figure 2B, left panel). However, despite harboring epigenetic marks suggestive of active enhancer identity (albeit weaker than EC1.45 or EC1.25), EC1.35 was not active in the luciferase assay, suggesting that it is not a strong driver of transcription, at least in the examined context. As expected, none of the three ECs was active in hESCs ( Figure 2B, right panel).
To characterize the spatiotemporal activity of the PRS locus enhancers during development, we utilized an in vivo LacZ enhancer reporter assay at two mouse embryonic stages, embryonic days 9.5 (E9.5) and E11.5 ( Figure 2C). Human EC1.45 and a number of the constituent p300 peaks for human EC1.25 were active during mouse development, exhibiting reproducible activity patterns that mirrored distinct spatiotemporal subdomains of endogenous SOX9 craniofacial expression ( Figure 2D). This included activity in embryonic domains that will form the mandible: the first branchial arch at E9.5 (i.e., S3 and S5 of EC1.25) and the mandibular process at E11.5 (i.e., EC1.45 and S3 and S6 of EC1.25;; Tables S1 and S2). Mandibular activity of EC1.45 was further confirmed by high-resolution episcopic microscopy (HREM) (Figures 2F,S3G,and S3H). Similar to the in vitro luciferase assay, human EC1.35 did not display activity in the developing facial structures nor in any other tissues at either developmental stage ( Figure 2E).

Heterozygous Ablation of PRS Region ECs Causes an Allele-Specific Reduction in SOX9 Expression
To directly characterize the contribution of the human EC1.45 and EC1.25 enhancers to SOX9 gene regulation during hCNCC differentiation, we generated hESC lines with heterozygous deletions of EC1.45 or EC1.25 (Figures 3A and S4A-S4D;Ikeda et al., 2018). To determine the effect of these deletions on SOX9 expression, we developed an allele-specific reverse transcriptase digital droplet PCR (RT-ddPCR) assay that distinguished a single-nucleotide polymorphism (SNP) in the 3 0 UTR of the SOX9 gene (T or C) (Figures 3B and S4E) and linked this in cis to the presence or absence of EC1.45 or EC1.25 via genomewide phasing (Table S3).
During CNCC differentiation, the two alleles of SOX9 were expressed at nearly equivalent levels in wild-type cells (Figures 3C and S4F, green boxplots) regardless of changes in overall SOX9 expression (Figures S1C-S1E). In contrast, enhancer deletion was associated with a striking allelic skew in SOX9 expression, indicating that loss of EC1.45 or EC1.25 disrupted normal regulation of SOX9 ( Figures 3C and S4F, red boxplots). The effect of EC1.45 enhancer deletion on SOX9 expression was larger than that of EC1.25 deletion, especially in passage 3 (P3)-P4 late hCNCCs, where it led to 50%-55% lower expression of the mutated allele. This greater effect on SOX9 expression in late hCNCCs was in keeping with an observed~4.5-fold increase in EC1.45 activity between P2 and P4 hCNCCs ( Figure S4G). In summary, EC1.45 and EC1.25 are required for normal expression of SOX9 in the cranial neural crest, thus establishing some of the longest-range functional enhancer-gene interactions reported to date in the human genome.

PRS Region Enhancers Are Decommissioned in Cranial
Chondrocytes SOX9 has two sequential critical roles in development of the mandible; first in specification and migration of CNCCs and second during chondrogenesis and formation of Meckel's cartilage, the developmental precursor of the lower jaw (Amano et al., 2010;Wyganowska-Swiaţkowska and Przysta nska, 2011). We therefore tested whether EC1.45 and EC1.25 also regulate SOX9 expression in cranial chondrocytes derived from hCNCCs ( Figure 3A; differentiation validated in Figures S4H and S4I). Remarkably, cranial chondrocytes derived from hCNCCs heterozygous for EC1.45 or EC1.25 enhancer deletions did not show allelic skew in SOX9 expression, indicating that the requirement for these enhancers in regulation of SOX9 transcription is highly cell type restricted ( Figures 3C and S4F). In agreement, ATAC-seq analysis revealed that EC1.45 and EC1.25 enhancers lost hypersensitivity during differentiation of hCNCCs to chondrocytes (Figures 3D and S4J), and luciferase reporter assays further confirmed that the regulatory potential of EC1.45 and EC1.25 was sharply reduced in chondrocytes (Figure 3E). Together, these data reveal that, despite high SOX9 expression in chondrocytes, the PRS-associated enhancers have restricted and transient activity during CNCC development and become decommissioned during chondrogenesis, defining a developmental window for disease etiology.
Two Short Segments Act Synergistically to Drive the Majority of EC1.45 Enhancer Activity To interrogate sequence features critical for hCNCC-specific activity of the PRS ECs, we focused on EC1.45, whose deletion is associated with greater allelic imbalance in SOX9 expression in hCNCCs. First, we tested the enhancer activity of the two constituent EC1.45 p300 peaks (Peak1 and Peak2; Figure 4A) in luciferase reporter assays. Intriguingly, individually, the two p300 peaks exhibited only weak enhancer activity, whereas Peak1+Peak2 led to activation greater than the sum of the two  Figures S5A-S5C). To further refine regions of enhancer activity within EC1.45, we performed a tiling deletion screen across Peak1+Peak2 ( Figure S5A) and identified two minimal regions (overlapping deletions 3-4 in Peak1 and deletions 10-11 in Peak2) whose loss lead to a significant reduction in luciferase reporter activity, min1 and min2, respectively. Importantly, min1+min2 recapitulated the activity and synergy of Peak1+Peak2 and accounted for nearly the full activity of EC1.45 ( Figures 4B and S5B). Of interest, three of the constituent putative enhancers from EC1.25 also act independently as enhancers in late hCNCCs, whereas combination of all four individual elements appears to similarly drive synergistic activation of luciferase expression ( Figure S5C). Unsurprisingly, the two constituent EC1.35 p300 peaks were not active enhancers by luciferase assay ( Figure S5C). Therefore, we identify two core enhancer elements within EC1.45 (and three within EC1.25) that are weak enhancers individually but work together in a robustly synergistic manner to activate gene expression.
Coordinator Motifs Are Essential for Activity and Synergy of EC1.45 Enhancers In our previous study investigating sequence features associated with divergence of enhancer activity between human and chimpanzee CNCCs, we identified a long bipartite sequence that we called ''Coordinator' ' (Prescott et al., 2015;Figure 4C, top). Of all motifs tested, Coordinator had the greatest effect on surrounding chromatin features and affected the highest number of enhancers (Prescott et al., 2015), suggesting a privileged role in establishment of enhancer competence in CNCCs. Strikingly, there are seven Coordinator motifs within the EC1.45 Peak1+Peak2 region, four of which fall within min1+min2 ( Figure S5D). Mutations of all four motifs in min1+min2 diminished activity to the level of the empty vector, while mutation of the Coordinator sequence in min1 brought the activity of min1+min2 down to a level similar to min2 alone ( Figure 4B). Similarly, mutation of the three Coordinator sequences in min2 brought the activity of min1+min2 down to a level similar to min1 alone ( Figure 4B), indicating that the Coordinator motif is essential for the activity and synergistic function of the EC1.45 enhancers. A mutation screen of each of the seven motifs within the Peak1+Peak2 region further supported that the most substantial contribution of Coordinator motifs to overall enhancer activity is within the min1 and min2 regions and revealed that mutation of all seven Coordinator motifs led to a reduction of activity below the baseline level of the minimal promoter control (p = 0.027). Notably, this suggests that repressive sequence features exist within the enhancer region that are unmasked by loss of Coordinator sites ( Figure S5E) and may be harbored within the del1-del2 region ( Figure S5A; p < 0.0063).
Coordinator Motif Content in the Deeply Conserved Region of EC1.45 Correlates with Enhancer Activity across Species Interestingly, EC1.45 min2 is conserved at the sequence level from human to the lobe-finned fish coelacanth across~400 million years of evolution ( Figures S5F and S5G). To examine the relationship between the Coordinator motif content (estimated from Fimo; Figure S5H; Grant et al., 2011) of orthologous min2 regions and their enhancer activity, we cloned the min2 sequences from mouse, opossum, platypus, chicken, lizard, frog, and coelacanth downstream of the human min1 sequence and assessed their combined activity by luciferase assay ( Figure 4D). Strikingly, an increased Coordinator score was associated with increased enhancer activity ( Figure 4D, right panel). These changes in activity did not simply recapitulate the phylogenetic relationship between the examined species because, for example, the most distantly related coelacanth sequence was relatively high in Coordinator content and enhancer activity, suggesting that the presence of the Coordinator motif rather than merely evolutionary drift drive the observed changes in activity.

TWIST1 Regulates EC1.45 Enhancers in a Coordinator-Dependent Manner
We next sought to uncover the trans-regulatory inputs that control EC1.45 activity. The Coordinator sequence resembles an E-boxand Homeobox-like motif, separated by 6 bp, although the factors that bind are so far unknown. E-box motifs are recognized by basic helix-loop-helix (bHLH) transcription factors, and we noted that TWIST1 was among the most highly expressed bHLH factors in hCNCCs. TWIST1 levels tightly coincided with EC1.45 enhancer activity, being strongly upregulated during hCNCC differentiation (with highest expression in late hCNCCs) and downregulated during chondrogenesis ( Figure 4E; compare with to EC1.45 enhancer activity in Figure 3C). TWIST1 is also known to play an essential role in neural crest biology and craniofacial development (Bildsoe et al., 2009;Parsons et al., 2014;Qin et al., 2012). Furthermore, Twist1 inactivation in NCCs populating the mandibular arch in mice leads to micrognathia and cleft palate (Zhang et al., 2012), phenotypes overlapping those seen in PRS.
Genome-wide analysis of TWIST1 ChIP-seq performed in hCNCCs revealed that the top enriched sequence matched the Coordinator motif ( Figure 4C, bottom), followed by a canonical TWIST1 E-box motif ( Figure S5I), suggesting that, in hCNCCs, a substantial fraction of TWIST1 chromatin binding occurs in the context of the Coordinator motif. In keeping with the presence of multiple Coordinator motifs, TWIST1 binds to both EC1.45 constituent enhancers in hCNCCs ( Figure 4A). To assess whether this binding is dependent on the Coordinator sequence, we developed an episomal TWIST1 ChIP-ddPCR assay that Overview of differentiation, including early hCNCCs at day 11, passage 1-2 early hCNCCs, passage 3-4 late hCNCCs, and chondrocytes on days 5 and 9. (B) Schematic of allele-specific RT-ddPCR, indicating primers and LNA probes (HEX/FAM) for the T/C SNP (rs74999341) in the SOX9 3 0 UTR. Shown are wild-type (left) and heterozygous EC1.45 deletion (right). (C) RT-ddPCR for wild-type (green boxplot) and EC1.45 heterozygous deletion (red), plotting SOX9 C:T expression ratio. (D) ATAC-seq reveals hCNCC-specific accessibility for EC1.45. Shown are representative traces from 3-4 replicates. (E) Luciferase assay for late hCNCCs (left) and chondrocytes (right). A COL2A1 enhancer is active in both cell types, whereas EC1.45 and EC1.25 become inactive in chondrocytes. See also Figure S4 and  Figure 4F). In this assay, the strong TWIST1 binding observed for the wild-type min1+min2 sequence was greatly diminished by mutation of the Coordinator sequences ( Figure 4G). These results establish that TWIST1 binds to the min1 and min2 regulatory sequences in CNCCs in a Coordinator motif-dependent manner.  Figure 5B). Clefting was detected in the maxilla and palatine bones in 50% of mutant embryos, a phenotype in PRS patients thought to be a secondary consequence of mandibular hypoplasia (Tan and Farlie, 2013). Although this link remains to be established in the mouse, the observed cleft palate is most likely the cause of postnatal lethality because of feeding or breathing difficulties (Tan and Farlie, 2013). To further quantify craniofacial defects, we performed morphometric landmarking (Ho et al., 2015;Welsh et al., 2018), focusing first on the mandible because micrognathia is a diagnostic characteristic of PRS as well as a feature of campomelic dysplasia (Foster, 1996;Robin, 1994;Tan and Farlie, 2013;Figures S6C and S6D). From this analysis, we quantified a reduction in mandible length for Wnt1::Cre2;Sox9F/+ embryos and gross changes in the shape of the ramus, including a dramatically hypoplastic coronoid process and reduced condylar process width, recapitulating aspects of human patient phenotypes with SOX9 haploinsufficiency ( Figure 5C). Notably, these differences in mandibular shape and size were fully penetrant in all embryos analyzed, regardless of the presence of cleft palate, as illustrated by principal-component analysis (PCA) based on calculated Procrustes distance ( Figures 5C and 5D). The dra-matic changes in mandibular morphology can be illustrated by projecting mandibular landmarks onto a thin plate spline ( Figure 5E).

Mouse
We next analyzed the remaining skull morphology ( Figures  S6E-S6G), and interestingly, although Wnt1::Cre2;Sox9F/+ mutant embryos with a cleft palate displayed a number of measurable skull anomalies, we did not detect significant alterations in skull length or width or midfacial length in mutant embryos without a cleft ( Figures 5F and S6H). Indeed, PCA revealed that skull shapes of non-clefted Wnt1::Cre2;Sox9F/+ animals cluster with those of the wild-type embryos and away from the clefted heterozygotes ( Figure 5G), with no significant change in overall skull shape ( Figure 5H). Therefore, despite the broad expression and function of Sox9 throughout developing craniofacial structures, the mandible exhibits heightened and fully penetrant sensitivity to a 50% reduction of Sox9 gene dosage during mouse neural crest development compared with other craniofacial structures where phenotypes are of variable expressivity.
The Mouse Orthologous EC1.45 Sequence Exhibits Conserved Spatiotemporal Activity Pattern but Weakened Contribution to Sox9 Expression Relative to Human EC1.45 To assess whether the spatiotemporal activity of the EC1.45 and EC1.25 elements was conserved for the orthologous mouse sequences (located 1.21 Mb and 1.04 Mb from the mouse Sox9 promoter, respectively; for clarity, we refer to these regions as mEC1.45 and mEC1.25), we again utilized in vivo LacZ reporter assays. Similar to the human sequence, mEC1.45 was active in the frontonasal prominence at E9.5 and also in the maxillary and mandibular processes and limb buds at E11.5 (Figures 6A, 6B, S7A, and S7B). Of note, a sub-region of this sequence was tested previously in the VISTA Enhancer Browser (mm628) (Visel et al., 2007;Figures S7A and S7C). In contrast, for the three human EC1.25 constituent enhancers with craniofacial activity ( Figure 2E), there was no reproducible activity for mouse orthologous S3 and S5 enhancers and reduced craniofacial activity for the S6 ortholog at E11.5 in a domain not including precursors of the mandible ( Figure S7D). To compare the activity of human EC1.45 and its mouse ortholog in a more quantitative assay, we performed parallel luciferase assays in hCNCCs. Although mEC1.45 is indeed an active enhancer in hCNCCs, it is a much less potent activator of luciferase expression than the human ortholog (around 15-fold lower), indicating a divergence in enhancer strength ( Figure 6C), consistent with reduced Coordinator content for mouse min2 ( Figures 4D and S5H).
Based on the conserved, albeit weak, activity of mEC1.45, we performed pronuclear injection of CRISPR-Cas9 (F) Schematic of plasmids, primers, and probes for ChIP-ddPCR for wild-type (WT) and Coordinator mutant (4x mut) min1+min2 plasmids. F, forward; R, reverse. (G) TWIST1 ChIP-ddPCR for P4 late hCNCCs transfected with the plasmids in (F), normalized to input, and WT adjusted to 1. Two biological replicates are depicted. See also Figure S5. Article ribonucleoprotein (RNP) complexes to target the region for deletion ( Figure 6D) and established two distinct founder lines (Figures 6D,S7E,and S7F). To determine the effect of mEC1.45 deletion on Sox9 expression, we crossed mEC1.45del/+ FVB mice with wild-type C57BL/6J mice, dissected craniofacial processes from embryos at E11.5 and performed allele-specific RT-ddPCR for Sox9 utilizing strain-specific SNPs ( Figure S7G). Analysis of Sox9 expression in the combined medial nasal process (MNP) and lateral nasal process (LNP), maxillary process (MxP), and mandibular process (MdP) revealed that in wildtype embryos, Sox9 was expressed at similar levels from the FVB and C57BL/6J alleles ( Figure 6E, green boxplots). In contrast, in mEC1.45del/+ embryos, Sox9 expression was significantly reduced for the FVB allele carrying the enhancer deletion, with the greatest reduction observed for the MdP ( We initially weighed surviving pups up to weaning and observed a decreased growth rate for compound mutant (CFD) animals compared with Sox9 heterozygous (CFW) animals ( Figure 6G). We next performed microCT and landmarking for E18.5 mandibles from the same cross, and Procrustes analysis followed by a Hotelling test revealed a landmark at the condylar process as most morphometrically distinct between genotypes (p < 1e-04, Figure 6H). Indeed, PCA analysis using landmarks at the condylar process clearly separated mandibular morphology for CFW and CFD embryos ( Figure 6I). These results show that additional loss of the mEC1.45 enhancer exacerbates the changes in mandible morphology observed in the conditional heterozygous Sox9 mutant ( Figures 5E and 6J). Furthermore, quantification of condylar process length and width revealed a reduction for CFD compared with CFW embryos (p < 1.1eÀ4, ANOVA; Figure 6K), whereas overall mandible length was also reduced by 3%-5% (p < 0.007, ANOVA; Figure 6L). Therefore, ablation of a developmental EC that intersects a human disease locus exacerbates PRS-like phenotypes in a sensitized genetic background.
To determine whether mEC1.45 enhancer deletion alone, which, even in a homozygous setting, is expected to cause only a 13% reduction in Sox9 expression ( Figure 6E), results in altered jaw morphology, we performed microCT analysis for E18.5 embryos obtained from a cross between heterozygous mEC1.45del/+ animals. Using all 18 mandibular landmarks, we were able to separate the wild-type (WW) and mEC1.45 homozygous knockout (DD) embryos by PCA, indicating a reproducible phenotypic alteration of mandibular shape when the mEC1.45 enhancer is ablated ( Figure 6M). A Hotelling test again revealed that the ramus was the mandibular structure most affected by changes in Sox9 dosage ( Figure 6N). Although milder, these alterations in mandibular ramus morphology are reminiscent of phenotypes observed in PRS patients, as quantified by a number of studies (Bienstock et al., 2016;Chen et al., 2015;Chung et al., 2012;Suri et al., 2010;Susarla et al., 2017;Volk et al., 2020;Zellner et al., 2017; Table S4; Figure S6C). Finally, to address whether enhancer knockout results in failure to thrive, we weighed pups up to weaning age (P20-P25) and detected a reduction in weight gain for mEC1.45 knockout animals ( Figure 6O). Collectively, these data show that even a subtle reduction in gene dosage, caused by enhancer loss, can lead to alterations of craniofacial morphology and result in reduced ability of an organism to thrive.
(O) Boxplot of postnatal growth rate (P20-P25, g/day) for WW and DD embryos. Two replicate groups plotted as residuals of linear regression; ANOVA p = 0.01473. For PCA plots, different shape markers represent independent litters. See also Figure S7 and Tables S1 and S2.  , 2013). In this study, we shed light onto these long-standing questions, identifying and characterizing two clusters of enhancers 1.25 and 1.45 Mb upstream of the SOX9 gene that fall within the PRS locus, are active during craniofacial development, make long-range contacts with the SOX9 promoter, and dynamically regulate its expression during cranial neural crest development ( Figures 7A and 7B). Importantly,

OPEN ACCESS
Article these enhancers become inactive following hCNCC differentiation to chondrocytes, defining a developmental window for the etiology of craniofacial phenotypes observed in PRS ( Figure 7A).
In some enhanceropathies, a number of patient-specific mutations overlap to reveal a single minimal element that is disrupted in the disorder (e.g. Gonen et al., 2018;Kort€ um et al., 2011). In contrast, many of the PRS deletions described to date are non-overlapping and harbor one or the other EC identified here (Amarillo et al., 2013;Benko et al., 2009;Gordon et al., 2014). Interestingly, PRS patients with translocation breakpoints, in which both EC1.25 and EC1.45 are lost, appear to display more severe phenotypes (Benko et al., 2009). This suggests that loss of distinct enhancers that quantitatively affect SOX9 dosage in CNCCs can lead to similar disease outcomes, whereas loss of the broader regulatory region via chromosomal translocation can have an additive effect on both SOX9 gene dosage and lower jaw morphology. Indeed, our mouse modeling revealed that lower jaw development is sensitive to even small perturbations in Sox9 gene dosage, with a range of phenotypes of increasing severity observed over a range of reductions in Sox9 expression ( Figures 5 and 6).
In our analysis of the 1.45 Mb EC, the two constituent p300binding regions within EC1.45 are individually weak enhancers but display a striking combinatorial synergy far greater than the sum of the individual activities ( Figure 7A). Previous studies looking at the relationship between multiple enhancers within super ECs have supported additive or redundant rather than synergistic activity of the constituent enhancer elements (Dukler et al., 2016;Hay et al., 2016;Moorthy et al., 2017;Shin et al., 2016).
The regulatory elements identified in our study represent the longest-range developmental enhancers involved in congenital malformations that have been described to date, at a distance of nearly 1.5 Mb from the regulated gene promoter. These enhancers provide a valuable paradigm for continuing investigation of long-range developmental gene regulation and its perturbation in human disease, and they join a small class of documented extreme long-range regulatory sequences that activate transcription at a more than 1-Mb genomic interval, such as the Shh ZRS and the Myc BENC and MNE enhancers (e.g., Bahr et al., 2018;Herranz et al., 2014;Lettice et al., 2003;Uslu et al., 2014). The enormous genomic distance begs questions about how the PRS-associated ECs can communicate with the SOX9 promoter to drive tissue-specific regulation in a precise and robust manner. Interestingly, one of the PRS-associated candidate ECs, EC1.35, which does not harbor activity in reporter assays, contains a constitutive CTCF binding site and interacts with the SOX9 promoter already in hESCs. This interaction is further augmented in hCNCCs, along with contacts between all three PRS-associated enhancers and between EC1.45 and EC1.25 and the SOX9 promoter. Remarkably, a recent study where the centromeric Sox9 TAD boundary was deleted in mice showed no significant effect on Sox9 expression or examined phenotypes (Despang et al., 2019), suggesting that TAD integrity may not be required for these long-range interactions and Sox9 regulation. However, although the EC1.35 element may not act as a canonical enhancer, it may instead participate in organizing extreme long-range contacts at the SOX9 locus via formation of CTCF-cohesin-mediated chromatin loops. When attempting to model human non-coding mutations at the orthologous PRS locus in mice, there were a number of challenges to consider. First, because of extensive reshaping of mammalian genomes during evolution by transposons and other genomic forces, many functional human enhancer regions do not have orthologous sequences in mice (Chuong et al., 2017;Villar et al., 2015;Yue et al., 2014). Second, even when the orthologous sequence exists, its regulatory activity may not be conserved or can differ in strength or relative contribution to the target gene dosage (Denas et al., 2015;Shen et al., 2012). This second challenge is well illustrated by EC1.45; although the orthologous sequence is present in mice and its spatiotemporal activity is conserved, mEC1.45 is a substantially weaker enhancer compared with its human counterpart, perhaps compensated for by additional mouse-specific CNCC enhancers at the locus ( Figure S7I). Consequently, deletion of mEC1.45 results in only an~6%-13% decrease in Sox9 expression level compared with the 50%-55% SOX9 reduction seen for human EC1.45 deletion. Nonetheless, it is quite remarkable that even such a slight reduction in Sox9 gene dosage results in measurable changes in lower jaw shape and reduction in postnatal growth.
Despite the caveats outlined above, from our combined human and mouse results we can propose a model for the specificity of PRS manifestations where mutations at the far end of the SOX9 gene desert perturb broadly active craniofacial developmental ECs and affect SOX9 expression across the cranial neural crest. However, the heightened sensitivity of the mandible to SOX9 gene dosage further restricts the manifestations to micrognathia, which can, in sequence, lead to additional PRSassociated phenotypes ( Figure 7C). Our work raises the interesting question of why the mandible is more sensitive to Sox9 dosage perturbation despite the broad expression of Sox9 across craniofacial structures. We suggest two potential hypotheses. In the first, we note that distinct transcription factors and signaling components are expressed in the future upper and lower jaw during development; for example, high levels of Hand2 and Dlx5/6 are expressed in the mandibular but not MxP (Beverdam et al., 2002;Funato et al., 2016). Loss of this patterning through ablation of the upstream Edn1/Ednra signaling pathway leads to a striking jaw transformation (Minoux and Rijli, 2010). Therefore, if spatially restricted morpho-regulatory programs such as these are differentially sensitive to Sox9 activity, then this could lead to tissue-selective effects on craniofacial development. An alternate hypothesis for the observed mandibular sensitivity to Sox9 perturbation could be related to the differences in the trajectory of craniofacial skeletal development. In a somewhat atypical process, formation of the mandible is intimately associated with a cartilage ''template'' called Meckel's, whereas, in contrast, the midfacial skeleton forms strictly via intramembranous ossification independent of any cartilage precursor. Therefore, should perturbation of Sox9 expression in CNCCs affect the propensity or ability to differentiate into chondrocytes, it could account for the selective effect on mandibular development.
From an evolutionary standpoint, the mandible is extremely interesting because it is has widely divergent forms related to feeding and predation (Albertson and Kocher, 2006; Martinez et al., 2018). Furthermore, mandible shape evolution in hominins ll OPEN ACCESS Article appears to be exceptionally rapid compared with any other primate clade (Raia et al., 2018) and includes shape changes within the ramus, including the condylar and coronoid processes and gonial angle-structures that are especially sensitive to slight alterations in Sox9 expression in our mouse models (Meloro et al., 2015;Terhune et al., 2014Terhune et al., , 2018. It is therefore tempting to speculate that some of this morphological divergence could be mediated by regulatory changes leading to minor differences in SOX9 expression levels during CNCC development. In fact, EC1.45, featured in this study, overlaps a Neanderthal-specific hypomethylated region from bone samples (based on reconstructed DNA methylation maps; Gokhman et al., 2014Gokhman et al., , 2020Figures 7A, 7D, and S6I). Although somewhat speculative, this suggests that the Neanderthal enhancer element might have retained regulatory activity longer during development (because DNA methylation is generally associated with silencing) compared with the human enhancer, which becomes decommissioned during chondrogenesis and is hypermethylated in human bones of various origins ( Figures 3C-3E). Together, the PRS locus enhancers represent a fascinating locus for future investigation of extreme long-range gene regulation in development and disease, and across evolutionary time.

Limitations of Study
As outlined above, there are a number of challenges and limitations when attempting to model a human enhanceropathy in mice because of remodeling of the enhancer landscape across evolutionary time. In our study, this is exemplified by the weakened enhancer activity of mE1.45 compared with the human counterpart, and associated lower contribution to Sox9 expression. An additional limitation relates to the mouse strains used in the study; Wnt1::Cre and Sox9F/F mice are on an C57BL/6J background, whereas mEC1.45 deletion was generated on the FVB background. These different genetic backgrounds may cause a differential sensitivity to Sox9 perturbation because of other modifying variants in the genome.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

ACKNOWLEDGMENTS
We acknowledge the Stanford Functional Genomics Facility (SFGF) for nextgeneration sequencing (NIH S10 Shared Instrumentation Grant S10OD018220) and the Stanford Center for Innovation in In-Vivo Imaging (SCI3) for use of the Bruker Skyscan 1276 (NIH S10 Shared Instrumentation Grant 1S10OD02349701, PI Timothy C. Doyle). We thank Dr. David Gokhman for sharing hominin DNA methylation data, Fabian Suchy for ddPCR advice, Dr. Jaaved Mohammed for genomics advice, and Dr. Marta Losa Llabata and Dr. Antoine Zalc for help and expertise with mouse work. We thank Dr. Seungsoo Kim and Dr. Sahin Naqvi for critical reading of the manuscript.

Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Joanna Wysocka (wysocka@stanford.edu).

Materials Availability
DNA constructs and other research reagents generated by the authors will be distributed upon request to other researchers.

Data and Code Availability
The accession number for Gene Expression Ombnibus (GEO) where the ChIP-seq, ATAC-seq, and RNA-seq datasets generated in this study are available is: GSE145327 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE145327). The accession number for the Sequence Read Archive (SRA) BioProject where the 10X Genomics linked-read sequencing data is available is: PRJNA648128. For breeding, two female mice were introduced into a cage with a single male and monitored for timed pregnancies. To generate compound heterozygous mice carrying a deletion of mEC1.45 on one allele and a conditionally deletable Sox9 gene on the other allele, we crossed the mEC1.45 enhancer deletion lines to Wnt1::Cre2 driver females, and then crossed the resultant Wnt1::Cre2;-mEC1.45del/+ females to Sox9F/F males. Mice were genotyped by tail clipping, lysis with Proteinase K in tail buffer (0.2M NaCl, 0.2% SDS, 0.05M EDTA and 0.1M Tris-HCl pH 8.0), precipitation with isopropanol, and analytical PCR using genotype-specific primers and Dream Taq Master Mix (Thermo Fisher Scientific). To monitor weight-gain, mice were weighed from birth to post-natal day 25 (P0-P25) using a scale with 2 decimal places (accuracy ± 0.01 g). Males and females were included for embryonic assays at E9.5 and E11.5, and for post-natal weighing.

METHOD DETAILS
Mouse genome editing using CRISPR/Cas9 Mouse orthologous mEC1.45 was deleted in vivo using CRISPR-Cas9 editing in the FVB strain as previously described (Osterwalder et al., 2018). Briefly, pairs of sgRNAs were designed to target upstream and downstream of the enhancer sequence to be deleted using CHOPCHOP (Labun et al., 2016) and Benchling (https://www.benchling.com/). sgRNAs were generated using a modified version of a previously published oligo assembly protocol (Varshney et al., 2015) . In this process an oligo encoding a T7 promoter and the guide RNA sequence were annealed to a second, generic oligo, and Phusion polymerase (NEB) was used for extension. The guide RNA was synthesized using HiScribe T7 Quick High Yield RNA Synthesis kit (NEB) and purified using the MEGAclear Transcription Clean-up kit (Ambion) prior to quantification. A mix containing Cas9 protein (final concentration of 20 ng/ul; IDT Cat. No. 1074181) and four sgRNAs (12.5 ng/mL each) in an injection buffer (10 mM Tris, pH 7.5; 0.1 mM EDTA) was injected into the pronucleus of FVB mouse embryos at the single-cell stage. F0 founder mice were genotyped using primers spanning the desired deletion region and High-Fidelity Platinum Taq polymerase (Thermo Fisher Scientific) to identify deletion breakpoints, which were validated and mapped using Sanger sequencing. Deletions were validated in second generation F1 animals, and heterozygous animals were crossed to generate homozygous and heterozygous animals for breeding. Two founder lines were established with deletions differing by 58bp. Founder 1 has a 3572bp deletion, and Founder 2 has a 3630bp deletion.

Generation of CRISPR/Cas9 genome-edited cell lines
Human ESCs were targeted for enhancer deletion using two strategies. In the first strategy, H9 hESCs were transfected using Fu-GENE 6 (Promega) with a targeting construct containing Blasticidin selection cassette, flanked by FRT sites, and homology arms for either side of EC1.45 along with a plasmid encoding Cas9 plus single guide RNAs (sgRNAs) flanking EC1.45. Transfected hESCs were grown to confluency and split onto a new plate before selection with 1 mg/mL Blasticidin until all cells died on a mock/GFP transfected control well. Surviving colonies were picked into a 48-well plate, expanded, split and screened for enhancer deletion using a genomic primer and a primer in the targeting cassette. Heterozygous enhancer deleted clones were infected with 1E+08 pfu/mL Adeno-flippase (Ad5CMVFlpO, Fred Hutchinson Cancer Research Center) and clones screened for excision of the selection cassette by PCR. For screening, genomic DNA was extracted using QE buffer (Lucigen) and PCR was performed using Q5 polymerase (NEB). Heterozygous enhancer deletions were generated to be in keeping with the heterozygous deletions seen in PRS patients, and to enable allele-specific SOX9 gene expression analysis.
In the second targeting strategy, a scar-less editing methodology was performed (Ikeda et al., 2018). A targeting construct was designed to insert adjacent to EC1.25 in H9 hESCs, containing a hUbC promoter driving expression of an eGFP-T2A-tCD8 cassette. H9 hESCs were nucleofected with 3 mg this construct and 3 mg of a plasmid encoding Cas9 and a single guide RNA targeting one side of EC1.25 using an Amaxa 4D nucleofector (pulse code CB150). GFP positive cells were isolated by FACS after 7-12 days and plated onto a 6-well plate. GFP positive colonies were then picked, expanded and screened for integration of the targeting cassette using primers within the targeting cassette and flanking genomic sequence. Genomic DNA was extracted using QE buffer (Lucigen) and PCR screening was performed using PrimeSTAR GXL DNA Polymerase (Takara) and heterozygous targeted clones were isolated. A second cassette was designed to excise the first targeting construct as well as EC1.25 using enhancer-flanking homology arms with no extra exogenous sequences, to leave a scar-less deleted enhancer region. To generate matched wild-type clones, a wildtype homology template was used to excise the targeting cassette. Colonies were selected by screening for loss of tCd8 by magnetic activated sorting (MACS), plated onto a 6-well plate, and following dilute re-plating, colonies were picked into a 48-well plate. Clones were passaged and screened using primers flanking EC1.25 to identify positive clones with EC1.25 deleted on one allele, or excision of the targeting construct for the matched wild-type controls. Genomic DNA was extracted using QE buffer (Lucigen) and PCR screening was performed using Q5 polymerase (NEB).
To differentiate hCNCCs to chondrocytes, passage 3 hCNCCs were passaged to passage 4, and the following day were transitioned to chondrocyte media without TGFb3 (ChM). ChM: DMEM-HG, 5% FBS, 1x ITS premix, 1mM sodium pyruvate, 50 mg/mL ll OPEN ACCESS Article ascorbic acid, 0.1 mM dexamethasone and 1x antibiotic/antimycotic. The following day, cells were fed with chondrocyte media with TGFb3 (ChMT). ChMT: ChM + 10 ng/mL TGFb3. Cells were fed every subsequent 3 days with ChMT. Cells were harvested at day 5 and/or 9 of the differentiation for the majority of assays.
To evaluate the chondrogenic differentiation, we performed qRT-PCR for two independent experiments, and assessed the differentiation from P4 hCNCCs to Day 9 of chondrogenic differentiation. COL2A1 and Aggrecan (ACAN) are known to be directly regulated by SOX9 during chondrogenesis, COL2A1 is an early marker of the chondrocyte lineage, while ACAN is a marker of overtly differentiated chondrocytes. SOX5 and SOX6 are two SOX family transcription factors that are co-expressed with SOX9 in chondrocytes, and all three factors often form a trio at regulatory elements to promote chondrocyte differentiation. Notably, SOX5 is induced early during our in vitro differentiation. BMP2 is a marker of hypertrophic chondrocytes and is essential for chondrocyte proliferation and maturation.

Alcian Blue Staining
Alcian Blue stains extracellular matrix proteoglycan components associated with chondrocytes. hCNCCs were differentiated to chondrocytes and fixed with 4% PFA for 15 minutes. Cells were washed three times with 1x PBS, and incubated overnight in 20% sucrose at 4 C. The following day, cells were stained with Alcian Blue solution (pH 2.5) for 30 minutes. Alcian Blue solution was prepared by diluting 1g Alcian Blue, 8GX in 100mL 3% Acetic Acid solution, pH was adjusted to 2.5 with acetic acid. Following Alcian Blue staining, cells were washed with 3% acetic acid for 3 minutes, followed by several washes with 95% ethanol. Ethanol was removed and cells were imaged.
Capture-C Capture-C was performed as previously described (Davies et al., 2016). Briefly, cells were crosslinked for 10 min in 2% formaldehyde in PBS, quenched with 125mM glycine for 5 min, scraped, collected and pelleted by centrifugation (500 rcf., 5 min, 4 C). Cells were washed with 5mL cold PBS, pelleted and resuspended in 5mL cold lysis buffer (10mM Tris pH8, 10mM NaCl, 0.2% Ipegal CA-630 in water with cOmplete Protease Inhibitor Cocktail) for 20 min. Cells were pelleted, washed with 5mL cold PBS, pelleted, resuspended in 1mL cold PBS, flash frozen in liquid nitrogen and stored at À80 C. For 3C library preparation, samples were defrosted on ice, pelleted and resuspended in 1xDpnII digestion buffer before extended digestion with DpnII overnight with addition of extra enzyme. DpnII was then heat inactivated at 65 C and samples were ligated with T4 Ligase for~22 hours. DNA was extracted by sequential Proteinase K and RNase digestion followed by phenol-chloroform isoamylalcohol extraction and precipitation with ethanol and sodium acetate. Efficiency of digestion and ligation was assessed by gel electrophoresis and digestion efficiency was further assessed by qPCR. For addition of Illumina sequencing adaptors, samples were sheared by Covaris sonication and purified by XP SPRI bead clean-up. Sequencing adaptors were annealed using NEB Ultra II kit. Libraries were PCR amplified using Herculase II polymerase (Agilent) in duplicate to add indexing primers and purified by XP SPRI bead clean-up. Samples were pooled and quantified using Qubit dsDNA BR assay kit.
Biotinylated oligos for capture were designed using the Capsequm online tool (http://apps.molbiol.ox.ac.uk/CaptureC/cgi-bin/ CapSequm.cgi) and ordered from IDT. 1-2 mg indexed 3C library was mixed with 5 mg COT DNA, 1nmol TS-HE Universal Oligo and 1nmol of TS-HE Index Oligo (Nimblegen SeqCap EZ HE-oligo and Accessory kit) and dried by vacuum centrifugation at 55 C until completely dry. The 3C library plus blocking oligos were then carefully resuspended in 7.5 mL 2X hybridization buffer and 3 mL Hybridization buffer A and denatured at 95 C for 10min. The 3C library was then transferred into a preheated PCR tube at 47 C containing 4.5 mL of pooled Biotinylated oligonucleotide capture probes at 2.9 mM. After a brief mix and centrifugation, the 3C library oligo mix was incubated at 47 C for 18-20 hr. Following this incubation, the 3C library was washed and recovered by streptavidin bead (M-270 Dynabeads, Invitrogen) pull down using the Nimblegen SeqCap EZ Hybridization and wash kit. Following the recovery of the captured material, the captured DNA was amplified on the streptavidin beads using KAPA HiFi HotStart ReadyMix and POST-LM PCR oligo 1&2 and purified by XP SPRI bead clean-up. For improved capture efficiency, a second round of capture was performed on the total amplified DNA from the first capture. Following the second round of capture, the library was quantified using KAPA library quantification kit using the average size calculated from Bioanalyzer. Libraries were then sequenced on Illumina HiSeq-4000 (2x 150bp).

RNA isolation and preparation of RNA-seq libraries
Total RNA was extracted from a 6-well of hESC, early (P1 and 2) and late (P3 and P4) hCNCCs and day 9 chondrocytes differentiated from hCNCCs using Trizol reagent (Invitrogen) for four independent differentiations. 10 mg RNA was purified twice by Dyna1 oligo(dT) beads (Invitrogen) to enrich for poly(A) + mRNA. The mRNA was then fragmented using 10X fragmentation buffer (Ambion) for exactly 5 min and purified by ethanol precipitation with sodium acetate and RNase-free glycogen. First strand synthesis was performed using Random Hexamer Primers (Invitrogen) and SuperScript II (Invitrogen), followed by second strand synthesis using DNA PolI and RNa-seH (Invitrogen), and cDNA was purified using Nucleospin Gel and PCR Cleanup (Takara). All of the cDNA was used for library preparation by end repair, A-tailing and adaptor ligation (NEB). The samples were treated with USER enzyme, purified using XP SPRI beads then subjected to dual size selection using XP SPRI beads using bead ratios 0.55x to remove > 700bp followed by 0.85x to recover > 200bp sized cDNA. Size-selected cDNA was amplified using NEBNext HiFi 2X PCR mix and Dual Index Primers (NEB, E7600) for 7-10 cycles (as determined by qPCR). Libraries were then purified using XP SPRI beads, and quantified using Qubit dsDNA HS assay kit and pooled for sequencing using average library size (bp) from Bioanalyzer and concentration from KAPA quantification (Kapa Biosystems). Libraries were sequenced using NextSeq 500 (2x 75 bp).
10X Genomics Linked-Read sequencing High molecular weight genomic DNA (HMW gDNA) was generated from H9 hESCs by the salting out method (10x Genomics, manual CG000116) and quality was checked on a FEMTO pulse instrument (Agilent). Linked read libraries were prepared according to the manufacturer's instructions (10x Genomics, manual CG00043) and sequenced on a HiSeq 4000 instrument (2 lanes, 2x 150 bp).
cDNA preparation and reverse transcriptase digital droplet PCR (RT-ddPCR) Total RNA was extracted from early (Day 11, P1 and 2) and late (P3 and P4) hCNCCs and day 5 and 9 chondrocytes differentiated from hCNCCs for wild-type or enhancer mutant cell-lines using Trizol reagent (Invitrogen) for at least four differentiations, or from dissected craniofacial prominences from at least four embryos. 100ng -1ug RNA was used to generate cDNA using the SuperScript Vilo IV MasterMix with ezDNase enzyme (Invitrogen). Primers and locked nucleic acid (LNA) probes were designed by IDT's custom design service to the human SOX9 or mouse Sox9 3 0 UTR. For H9 hESC samples, LNA probes were centered on the rs74999341 T/C SNP -a HEX LNA probe detects the T-allele and FAM detects the C-allele. For mouse samples, LNA probes were centered on a G/C SNP in the C57BL/6J versus FVB mouse strains respectively -a HEX LNA probe detects the G-allele and FAM detects the C-allele. cDNA dilution factor was determined using qPCR with 1X PrimeTime Gene Expression Master Mix (IDT), 500nM primers and 250nM probes, run on LightCycler 480 (Roche). ddPCR reactions were performed using diluted cDNA (10-100X diluted), 900nM primers and 250nM probes and 1X ddPCR Supermix for probes (no dUTP, BioRad). ddPCR droplets were generated using the QX200 Droplet Generator (BioRad) and droplets were read using QX200 Droplet Reader (BioRad) and analyzed using the QuantaSoft Software (BioRad).
Chromatin immunoprecipitation (ChIP) 5-15 million cells were cross-linked per ChIP experiment in 2mL PBS per 6-well with 1% methanol-free formaldehyde for 5-10 min and quenched with a final concentration of 0.125M glycine for 5 min with nutation. Cross-linked cells were washed with PBS, scraped and pelleted by centrifugation, flash-frozen in liquid nitrogen and stored at À80 C. Samples were defrosted on ice and resuspended in 5mL LB1 (50 mM HEPES-KOH pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100, with 1X cOmplete Protease Inhibitor Cocktail and optionally 1mM PMSF) and rotated vertically for 10 min at 4 C. Samples were centrifuged for 5 min at 1350 x g at 4 C, and resuspended in 5mL LB2 (10 mM Tris, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, with 1X cOmplete Protease Inhibitor Cocktail and optionally 1mM PMSF) and rotated vertically for 10 min at 4 C. Samples were centrifuged for 5 min at 1350 x g at 4 C, and resuspended in 1mL LB3 per 10 million cells (maximum concentration of cells for Covaris sonication), or 1 mL per ChIP. Samples were sonicated in 1mL AFA tubes for 5 min on E220 evolution Covaris with settings Peak power = 140, Duty Factor = 10, Cycles per burst = 200 to achieve chromatin sized approximately 500-2000bp.
For library preparation, samples were quantified by Qubit dsDNA HS assay kit, and 10-30ng of ChIP DNA was used for library preparation with end repair, A-tailing, and adaptor ligation (NEB). Following USER enzyme treatment, samples were purified using Nucleospin Gel and PCR Cleanup (Takara) and separated by gel electrophoresis and size-selected for 220-500 bp by gel extraction. Libraries were then amplified to add indices using NEBNext HiFi 2X PCR mix and NEBNext Multiplex Oligos for Illumina kit (NEB, E7335S) with 9-15 cycles (as determined by qPCR). ChIP libraries were purified by two rounds of XP SPRI bead clean-up to deplete adaptors. Library concentration and quality was assessed by Bioanalyzer (to determine size) and KAPA qPCR was used to pool multiple libraries. Samples were sequenced using NextSeq or HiSeq 4000 platform (2x 75bp).
Episomal ChIP-ddPCR Around 5 million passage 4 hCNCCs were transfected with 1.5 mg wild-type (WT) and 1.5 mg Coordinator mutant (4x mut) luciferase min1-min2 reporter plasmid in 300 mL optimum with 9 mL Fugene-6. Cells were fixed after 24 hours and ChIP performed as described above. 9 mg TWIST antibody was used per ChIP (Abcam, ab50887). Primers were designed to amplify across the min2-plasmid backbone junction, and plasmid-specific probes were designed to distinguish between the wild-type and Coordinator mutant sequences.
ATAC-seq ATAC-seq was performed as described previously (Buenrostro et al., 2013;Corces et al., 2017) Briefly, cells were dissociated and treated with DNaseI (Worthington) and 50,000 viable cells were sorted as DAPI negative. Cells were pelleted at 500 RCF for 5 min at 4 C and resuspended in ATAC-resuspension buffer containing 0.1% NP40, 0.1% Tween20, and 0.01% Digitonin and incubated on ice for 3 minutes. Following wash-out with cold ATAC-Resuspension Buffer (RSB, 10 mM Tris-HCl pH 7.4, 10mM NaCl, 3mM MgCl2 in sterile water) containing 0.1% Tween20, cells were pelleted and resuspended in 50 mL transposition mix (25 mL 2x TD buffer, 2.5 mL transposase (100nM final), 16.5 mL PBS, 0.5 mL 1% digitonin, 0.5 mL 10% Tween20, 5 mL H 2 O) and incubated for 30 minutes at 37 C with shaking. The reaction was purified using the Zymo DNA Clean and Concentrator Kit, and amplified using PCR primers defined in Buenrostro et al. (2013). Libraries were purified using the Zymo DNA Clean and Concentrator Kit, quantified using the KAPA Library Quantification kit and quality assessed by Bioanalyzer (also used to determine average size for pooling libraries). Samples were sequenced on the HiSeq4000 platform (2x 75bp).
Cloning PRS locus enhancers for luciferase assay, including Coordinator mutant and vertebrate species min2 sequences EC1.45, as well as combined constituent enhancers for EC1.35 (S1 and S2) and EC1.25 (S3, S4, S5 and S6) were PCR-amplified from H9 hESC genomic DNA and cloned into the pGL3 luciferase reporter vector. To generate Coordinator mutant sequences, Coordinator motifs were identified in EC1.45 using fimo from the meme suite (Grant et al., 2011). Coordinator motifs were then mutated in silico at positions with greatest information content in the PWM, to resemble sequence changes associated with reduction in enhancer activity during human-chimpanzee divergence (Prescott et al., 2015). To synthesize orthologous min2 sequences, Multiz Alignments of 100 Vertebrates (UCSC) was used to identify orthologous sequences for mouse, opossum, platypus, lizard, chicken, frog and coelacanth which were then extended to each be 267 bp long. Sequences containing mutant EC1.45 Coordinator motifs and vertebrate min2 sequences were ordered from TWIST Bioscience (or IDT for coelacanth sequence) and cloned into the pGL3 luciferase reporter vector.

Luciferase assay
Luciferase assays were performed as described previously (Prescott et al., 2015). Briefly, H9 hESC were differentiated following the hCNCC differentiation protocol and passaged to passage 3. For hCNCCs, cells were transfected immediately following passaging to passage 4 in 24-well plates. For hESCs, cells were split the day before into a 24 well plate in ROCKi (Y27632) and transfected the following day. For chondrocytes, passage 3 hCNCCs were split into a 24-well plate, and the now passage 4 hCNCCs were transitioned to ChM media for 1 day, and ChMT media for 4 more days before being transfected for the luciferase assay. Transfections were performed in technical triplicate in a single experiment, with each well receiving 10ng of pGL3 plasmid, 0.5ng of control pRL firefly renilla plasmid, 89.5 mL carrier DNA (circularized pGEMT plasmid) and 0.3 mL Fugene 6 in 50 mL of optimum. The pGL3 plasmid contains the firefly luciferase gene driven by an SV40 promoter with either a control SV40 enhancer downstream, or a test enhancer sequence cloned upstream (Promega), the pRL plasmid acts as a transfection control with Renilla luciferase driven by an upstream CMV enhancer and CMV promoter (Promega). 24 hours after transfection, cells were washed in PBS, and lysed in 150 mL 1X passive lysis buffer (in PBS) for 15 min (Promega). 20 mL lysate was then transferred to an opaque flat-bottomed plate for reading with a luminometer (Veritas). An automated injector added 100 mL LARII reagent and the well was read using the following parameters: 2 s delay, 10 s integration. 100 mL Stop-and-Glow reagent was then injected into the well and read using the same parameters. Luciferase assays were repeated in biological duplicate or triplicate with at least two different DNA preps; empty vector and SV40 enhancers were included in each experiment as negative and positive controls, respectively.
LacZ in vivo reporter assay A second generation LacZ reporter vector was cloned with a Hsp68 promoter driving expression of LacZ-P2A-tdTomato flanked by core insulator sequences to minimize position effects of site of integration and with a multiple cloning site upstream for cloning enhancer sequences to be tested. EC1.45 was subcloned into this vector from the luciferase reporter, as was S1-S2 combined EC1.35 enhancer cluster. The constituent enhancers from EC1.25 (S3, S4, S5 and S6) were triplexed and inserted into the multiple cloning site. The reporter plasmids were linearized and injected into fertilized mouse oocytes, implanted into recipient females and allowed to develop to two distinct mouse embryonic stages, E9.5 or E11.5, that represent distinct periods of craniofacial development (Cyagen -for EC1.45, EC1.35 and EC1.25-S5; LBNL for remaining constructs). Embryos were harvested and processed for Xgal staining to reveal spatial expression pattern of the LacZ gene under the control of the putative cloned enhancer sequence. To be considered reproducible, craniofacial expression patterns had to be observed in at least three embryos (Visel et al., 2007). Embryos were imaged using a Leica M205 FA Stereo Microscope coupled to a Leica DFC7000T digital camera or a Leica MZ16 microscope coupled to a Leica DFC420 digital camera. Preparation of in situ hybridization probes Murine cDNA probes for Sox9 corresponding to nucleotides 2443-3588 of RefSeq NM_011448 were TA-cloned from E14.5 C57BL/6 cDNA. The Sox9 probe was linearized and used to transcribe a DIG-labeled antisense probe using an in vitro transcription kit (Roche), following the manufacturer's instructions.

In situ hybridization
In situ hybridization was performed as described previously (Welsh et al., 2018). Embryos (C57BL/6J) were collected at E9.5 and E11.5, dissected into cold PBS and fixed overnight in 4% PFA in PBS at 4 C. Embryos were dehydrated through a methanol-PBST (PBS + 0.1% Tween20) series and stored at À20 C in 100% methanol. For hybridization, embryos were bleached in 6% hydrogen peroxide in methanol for 1 hour, followed by rehydration, treatment with Proteinase K (10 mg/mL) for 10 minutes at 37 C, and then washed twice in ice cold PBST. Embryos were re-fixed in 4% PFA/0.2% glutaraldehyde in PBST for 20 minutes then washed twice in PBST. Embryos were then transferred to a prehybridization solution for 1 hour at 68 C followed by overnight incubation with the Sox9 riboprobe at 68 C. The following day, embryos were first rinsed with hybridization buffer, then washed 7 times (30 minutes each) with 2XSSC/50%formamide/0.1% Tween at 68 C. Subsequently, embryos were washed twice with TBST, twice with MABT, and blocked in 10% blocking solution (Roche) diluted in MABT for 1 hour at room temperature. Embryos were then incubated with anti-digoxigenin-AP antibody (Roche) at a concentration of 1:4000 in 1% blocking solution overnight at 4 C with rocking. The following day, embryos were washed extensively in TBST buffer at room temperature for 4 days. Colorimetric detection was performed with BM-Purple Chromogenic Reagent (Roche). Lastly, embryos were washed in PBST and post-fixed with 4% PFA and stored in PBS containing 0.05% sodium azide. Embryos were imaged using a Leica M205 FA Stereo Microscope and a Leica DFC7000T digital camera. A minimum of three embryos were processed per stage.
High resolution episcopic microscopy (HREM) HREM was performed for a representative LacZ reporter embryo for human EC1.45. In HREM, the embedding medium is made highly fluorescent via addition of eosin. The embryo signal is detected as suppression of this fluorescence, in addition chromogenic substrates, for example those used in our LacZ reporter embryos are detected at distinct wavelengths from tissue morphology (Mohun and Weninger, 2012a). The embryo was embedded in JB-4 embedding solution as previously described (Mohun and Weninger, 2012b), sectioned with an SM2500 motorized microtome (Leica) and the block face was concurrently imaged using custom imaging apparatus. Visualization of HREM images was performed in Amira.

MicroCT and mandibular morphometry
Mouse embryos were collected at E18.5 of development and fixed in 4% PFA. MicroCT was performed using a Bruker Skyscan at 15um resolution, 0.25mm Al filter, 415ms exposure, 2k resolution and images were acquired every 0.5 degrees for 180 degrees.

QUANTIFICATION AND STATISTICAL ANALYSIS
Analyzing and plotting luciferase data Multiple independent luciferase experiments were performed with independent plasmid preparations and distinct cell passages or differentiations. The number of independent replicates for each luciferase assay is indicated in the figure legend, or in the plot as n = .
For each biological replicate experiments, technical triplicate transfections were performed for each plasmid. Empty vector and SV40 enhancer were included in all experiments as negative and positive controls, respectively. When testing enhancer activity in chondrocytes, an additional COL2A1 intron1 enhancer was included which is active in both hCNCCs and chondrocytes. To plot multiple luciferase experiments on the same plot, linear regression was performed in R and residuals were plotted as boxplots. For ease of visualization, the median value for the empty vector control was set to 0. Significance of activity change were determined by t-test.
Fimo analysis was used to detect Coordinator motifs in the vertebrate min2 sequences, and the fimo scores were summed to estimate the relative number and match to consensus of Coordinator motifs for each species. The summed fimo score was compared to the luciferase signal and linear regression analysis performed to define the relationship between the two measures and ANOVA was used to determine the significance.
Analyzing and plotting ddPCR data Concentration of the two alleles of SOX9 or Sox9 was determined by RT-ddPCR using allele-specific HEX/FAM probes from the QuantaSoft Software. To plot allelic skew, concentration for the mutated allele was divided by the concentration for the wild-type allele and plotted as a ratio (red boxplots). For matched wild-type cells (green boxplots), the same ratio was plotted; this revealed the presence of an allelic skew even in unedited cells suggesting that polymorphisms between the two alleles can drive differences in allelic expression. For Figure 3C, samples were collected for two wild-type cell lines and two mutant EC1.45del/+ cell lines for two independent differentiations. For Figure S4F, samples were collected for five wild-type cell lines and five mutant EC1.25del/+ cell lines for three independent differentiations.
To calculate the % change in Sox9 expression upon mEC1.45 deletion, the mean allelic skew for the edited embryo facial tissues (mEC1.45del/+) was divided by the mean allelic skew from matched wild-type tissues to account for the normal level of allelic skew detected in unedited wild-type samples. For Figure 6E, four wild-type and four mEC1.45del/+ embryos were dissected at E11.5. For Figure S7H, six wild-type and seven mEC1.45del/+ embryos were dissected at E9.5 (from two litters). Significant differences in allelic skew were determined by t-test.
For TWIST1 ChIP-ddPCR, the concentration of wild-type or Coordinator mutant plasmid were calculated for ChIP and input samples from the QuantaSoft Software using plasmid-specific HEX/FAM probes and primers spanning the min2 sequence and the plasmid backbone. A ratio was calculated between the TWIST ChIP and input samples, and normalized to 1 for the wild-type plasmid, revealing that TWIST1 binding was dramatically reduced on the Coordinator mutant plasmid.

External datasets
External next generation sequencing data were downloaded from the Sequence Read Archive (SRA) and analyzed as below.
ChIP-seq analysis ChIP sequencing reads were trimmed using skewer and aligned to the human genome (hg19) using bowtie2. For normalization, bedgraph files were generated from aligned bam files using bedtools genomecov with -scale option to normalize to 1 million reads. For visualization, bigwig files were generated using bedgraphToBigWig. Peak calling was performed using macs1.4, for replicate experiments the intersect tool from bedtools was used to identify peaks present in both replicates.

ATAC-seq analysis
Nextera adaptor sequences were trimmed from ATAC sequencing reads using cutadapt and aligned to the human genome (hg19) using bowtie2. For normalization, bedgraph files were generated from aligned bam files using bedtools genomecov with -scale option to normalize to 1 million reads. For visualization, bigwig files were generated using bedgraphToBigWig. Peak calling was performed using macs1.4, for replicate experiments the intersect tool from bedtools was used to identify peaks present in both replicates.
RNA-seq analysis RNA sequencing reads were trimmed using cutadapt and aligned to human gene models (hg38) using HISAT2. Read counts per gene were quantified using featureCounts, gene expression was normalized to fragments per million and plotted in R.

Capture-C analysis
Capture-C analysis was performed using bespoke analysis scripts outlined here: http://userweb.molbiol.ox.ac.uk/public/telenius/ captureManual/oligofile.html. For comparison across samples, Capture-C profiles were plotted as the number of unique interactions per restriction fragment normalized to 10,000 interactions in cis. For quantification, normalized interactions in cis were extracted for DpnII fragments overlapping the feature of interest and plotted as a boxplot in R. Statistical significance was determined by t-test for successive stages of hCNCC differentiation.
10X Linked-Read analysis 10X Linked-Read sequencing data was analyzed using Long Ranger (longranger-2.2.2) and visualized using the 10X loupe genome browser.

Motif discovery
To identify de novo DNA sequence motifs enriched at TWIST1 binding sites, we called peaks from the TWIST1 ChIP-seq data using MACS2 and identified de novo motifs underlying these peaks using the SeqPos tool in Cistrome. Known motifs were identified using the TOM-TOM motif comparison tool (Gupta et al., 2007). Consensus DNA binding motifs were plotted using R package ggseqlogo and plotted using ggplot2.
MicroCT mandibular morphometry, quantification and plotting Reconstruction of microCT data was performed using Bruker Recon software. Hemimandible, pre-maxilla, maxilla, palatine and occipital bones were segmented and landmarks were placed using Amira software (Ho et al., 2015;Welsh et al., 2018). x-y-z coordinates were imported into the R Geomorph package that was used to calculate Procrustes distances, calculate inter-landmark absolute distances and perform ANOVA statistics to determine significance. For boxplots representing inter-landmark distance measurements, a significant reduction in size of the mutant mandibles is labeled as a percentage of the wild-type mandible. Hotelling tests were performed using Procrustes transformed data, and the hotelling.test function in R. Plotting was performed in R. Numbers of embryos analyzed is indicated in the figure legends.

Comparison to other datasets
To determine similarity of the epigenomic landscape at the SOX9 locus to that of other human cell-types, we downloaded over 50 publicly available H3K27ac ChIP-seq datasets from a number of cell-types (Prescott et al., 2015). We compared these datasets to our in vitro hCNCC H3K27ac ChIP-seq data and determined that activity of the PRS-associated enhancer clusters EC1.45, EC1.35 and EC1.25 was restricted to hCNCCs as they were not marked in other cell-types analyzed.

Analysis of differentially methylated regions (DMRs)
A Neanderthal-specific significantly hypomethylated region was identified at genomic coordinates chr17:68668482-68674772 (hg19) from previously published datasets (Gokhman et al., 2014(Gokhman et al., , 2020) that overlaps the EC1.45 enhancer cluster. A heatmap was generated representing DNA methylation at CpG dinucleotides spanning the DMR locus for one chimp (rib), seven anatomically modern humans (femur, crania, teeth), one Denisovan (finger) and two Neanderthal (femur and toe) bone samples. DNA methylation levels were determined by whole genome bisulfite sequencing (WGBS) or reconstructed using previously published methods which leverages spontaneous C / T deamination for ancient samples (Gokhman et al., 2014(Gokhman et al., , 2020. For reconstructed DNA methylation maps, the C / T ratio was calculated for each CpG dinucleotide and then translated via a linear transformation (based on modern fully hypomethylated or fully methylated sites) to a methylation percentage. All samples were smoothed using a sliding window of 25 CpG dinucleotides. DNA methylation is marked in the heatmap from green (0% methylation) to red (100% methylation), while white indicates no information.
ll OPEN ACCESS Figure S1, Related to Figure 1 A

PRS patient mutations
Genes Figure Figure S2. Capture-C interaction profiles for the PRS locus enhancer clusters, Related to Figure 1 (A) Capture-C from the viewpoint of the SOX9 promoter in hESC, neuroectodermal spheres (NEC), early (D11) and late (P4) hCNCCs, focusing on the PRS locus. Long-range interactions can be seen with the three putative enhancer clusters EC1.45, EC1.35 and EC1.25 in late hCNCCs (genomic location chr17:68,567,202-68,959,999, hg19). ChIP-seq data is shown for hESC and hCNCCs including marks of active enhancer (H3K27ac and p300 binding) in addition to CTCF and RAD21. Representative tracks are shown, and CTCF peaks are marked under the tracks as hESC-specific (yellow) or shared between hESC and hCNCC (dark blue).
Notably, a CTCF binding site is present in the EC1.35 enhancer cluster and upstream of the SOX9 promoter in both hESC and late hCNCCs. Enhancer clusters are highlighted in orange (EC1.45), green (EC1.35) and light blue (EC1.25).
(C) Schematic outlining the components of the face derived from the embryonic domains shown in (B) for P0 pup. Mx, maxillary region; Md, mandibular region.
(G) Sagittal, axial and coronal HREM sections of EC1.45 LacZ activity at E11.5. White arrow highlights activity in the mandibular process.
(H) HREM sections of EC1.45 LacZ activity at E11.5. LacZ signal appears most proximal in the most parasagittal sections (section 1), in the region that will give rise to the condylar process. White arrow highlights activity in the mandibular process.   (A) A two-step strategy was performed to delete EC1.45. In the first step, a selection cassette (EF-1a promoter driving mCherry-P2A-bsd; bsd = Blasticidin S deaminase), flanked by FRT sites, was inserted in place of EC1.45 by Cas9-mediated homologous recombination in a heterozygous manner. In a second step, Flippase was introduced to induce recombination of the flanking FRT sites, leaving an FRT and FRT3 site at the site of enhancer deletion.
(B) A two-step strategy was performed to generate scar-less deletion of enhancer EC1.25. First, a selection cassette (UBC promoter driving eGFP-T2A-tCD8) was inserted adjacent to the EC1.25 enhancer using Cas9mediated homologous recombination in a heterozygous manner. In a second step, the cassette was either removed to revert the allele to wildtype, or the cassette was removed along with the enhancer element using using Cas9-mediated homologous recombination.
(E) Overview of allele-specific RT-ddPCR for heterozygous EC1.25 deletion indicating primers and locked nucleic acid probes for a T/C SNP in the 3'UTR of the SOX9 gene in the H9 hESC line. The T-allele is shown above (in cis with the EC1.25 deletion), and the C-allele below. A HEX LNA probe will detect the T-allele, and a FAM LNA probe will detect the C-allele.
(F) ddPCR for EC1.25 heterozygous deletion and wildtype lines, using primers and probes as outlined in Figure   3B and (E). Relative expression between the two alleles is plotted (T:C ratio). Reduction in SOX9 expression for the allele with EC1.25 deletion is observed at all stages of hCNCC differentiation, while allelic expression differences are not detected for Day 9 chondrocytes.
(G) Luciferase reporter assays for EC1.45 at P2 and P4 of the CNCC differentiation indicating a 4.5 fold-change in activity between the two stages.     Figure S5. Identification of sequences central to EC1.45 enhancer cluster activity that act in a synergistic manner and are conserved down to lobe-finned fish, coelacanth, Related to Figure 4 (A) Luciferase assay for the entire EC1.45 enhancer cluster and subsets thereof. p300 Peak1 and Peak2 were assayed together and separately, and a tiling deletion screen of overlapping ~250bp regions was performed to identify critical regions for enhancer activity. Schematic of tested sequences are depicted on the left. Deletions 3 and 4 within Peak 1, and deletions 10 and 11 within Peak 2 dramatically reduce activity (see yellow boxes), while deletions 1 and 2 within Peak 1 increase activity. Percentage change in median value compared to Peak1+Peak2 is indicated for constructs with a significant change.
(B) Luciferase assay illustrating the synergy between EC1.45 constituent enhancer elements. A horizontal line indicates the expected signal for a sum (green, +) and multiplication (purple, x) of Peak1 and Peak2 or min1 and min2. In both cases the actual combined enhancer activity is greater than expected for multiplicative interaction.
Data from 11 independent experiments are plotted.
(D) Annotation of Coordinator motifs in EC1.45 enhancer detected by fimo. p300 Peak1 and Peak2, along with min1 and min2 sequences are indicated.
(E) Luciferase assay for EC1.45 p300 Peak1 and Peak2 with single or multiple Coordinator mutations. Individual mutations in Coordinator #2, #5, #6 or #7 reduce enhancer activity, and mutation of 4x Coordinator motifs in Peak 1 or 3x motifs in Peak 2 reduce activity to a level similar to Peak2 or Peak 1 respectively. Mutation of all 7 Coordinator sequences lead to a reduction of activity 1.84 fold lower that the empty vector control (t-test, pval = 0.027). A schematic on the left illustrates the constructs tested, with a pink vertical line indicating a Coordinator mutation. Percentage change in median value compared to Peak1+Peak2 is indicated for constructs with a significant change.
(F) Multiz alignment at the human EC1.45 enhancer sequence for selected species from mouse to coelacanth reveals deep conservation of min2 sequence, highlighted in pink.
(H) Detection of Coordinator sequences for human, mouse, opossum, platypus, chicken, lizard, frog and coelacanth min2 sequences by fimo, plotting score with p-value indicated by color and strand annotated as +/-.
(I) The top non-Coordinator sequence motif enriched at TWIST1 ChIP-seq peaks genome-wide in hCNCCs.
For luciferase assays, empty vector and SV40 enhancer are negative and positive controls, respectively.
Biological replicates are plotted as residuals of linear regression, normalized to the empty vector control.
(E-F) Sanger sequencing traces of genomic deletions from Founder 1 (E) and Founder 2 (F) mouse lines.
(G) Overview of allele-specific RT-ddPCR indicating primers and locked nucleic acid probes for a G/C SNP in the 3'UTR of the Sox9 gene for the C57BL/6 (G-allele) and FVB (C-allele) mouse strains. The G-allele is shown above, and the C-allele below, with wildtype alleles depicted on the left, and the heterozygous mEC1.45 deletion on the right. A HEX LNA probe will detect the G-allele, and a FAM LNA probe will detect the C-allele. In a cross between wildtype C57BL/6 and mEC1.45del/+ FVB animals, 50% of resulting embryos will be heterozygous null for the mEC1.45 enhancer cluster on the FVB allele (Sox9 C-allele).
Significant reduction in Sox9 expression is observed for the FVB C-allele (with mEC1.45 deleted) for the FNP and MdP compared to the C57BL/6 G-allele. Significance determined by t-test, * pval < 0.05; ** pval < 0.01.  The number of LacZ positive embryos from the LacZ reporter assay at E9.5 and E11.5. * No LacZ positive embryos were detected from 3 PCR positive embryos. Constructs that were triplexed are indicated with "x3".    (Go-Gn) SD, standard deviation; studies ordered by mean age at time of imaging; *, poor age-match