SORBS2 is a genetic factor contributing to cardiac malformation of 4q deletion syndrome patients

Chromosome 4q deletion is one of the most frequently detected genomic imbalance events in congenital heart disease (CHD) patients. However, a portion of CHD-associated 4q deletions without known CHD genes suggests unknown CHD genes within these intervals. Here, we have shown that knockdown of SORBS2, a 4q interval gene, disrupted sarcomeric integrity of cardiomyocytes and caused reduced cardiomyocyte number in human embryonic stem cell differentiation model. Molecular analyses revealed decreased expression of second heart field (SHF) marker genes and impaired NOTCH and SHH signaling in SORBS2-knockdown cells. Exogenous SHH rescued SORBS2 knockdown-induced cardiomyocyte differentiation defects. Sorbs2-/- mouse mutants had atrial septal hypoplasia/aplasia or double atrial septum (DAS) derived from impaired posterior SHF with a similar expression alteration. Rare SORBS2 variants were significantly enriched in a cohort of 300 CHD patients. Our findings indicate that SORBS2 is a regulator of SHF development and its variants contribute to CHD pathogenesis. The presence of DAS in Sorbs2-/- hearts reveals the first molecular etiology of this rare anomaly linked to paradoxical thromboembolism.


Introduction
Copy number variation (CNV) is a common structural variation in human genome and causes a variety of genetic syndromes. The identification of causal disease gene(s) within CNV intervals is crucial to understand the pathogenesis of the related disease. Chromosome 4q deletion syndrome is a genetic disease resulting from a chromosomal aberration that causes the missing of a portion of chromosome four long arm (Strehle and Bantock, 2003). Patients have a spectrum of clinical manifestations including craniofacial, cardiovascular, and gastrointestinal abnormalities, and mental and growth deficiencies (Strehle and Bantock, 2003). Congenital heart disease (CHD) is a common defect seen in about half of the 4q deletion patients. A previous study narrowed the cardiovascular critical region to 4q32.2-q34.3, which contains TLL1, HPGD, and HAND2 genes (Xu et al., 2012). Over-represented right-sided CHDs in 4q deletion syndrome patients suggest that HAND2, an essential regulator of the second heart field (SHF), is mainly responsible for the CHD phenotype (Xu et al., 2012;Huang et al., 2002). However, a part of terminal 4q deletions with CHDs that we and others have discovered does not cover HAND2 (Geng et al., 2014;Strehle et al., 2012;Tsai et al., 1999). SORBS2 within chromosomal 4q35.1 has been proposed as a candidate gene for CHD of terminal 4q deletion syndrome based on an unusual small interstitial deletion (Strehle et al., 2012). However, there has been no further evidence to substantiate it ever since. Here, we have presented evidence from in vitro cardiogenesis, animal model, and mutation analyses to demonstrate that SORBS2 is a genetic factor regulating cardiac development and contributing to cardiac malformation of the CHD population.

Results
SORBS2 is required for cardiomyocyte differentiation and the integrity of sarcomeric structure To recapitulate SORBS2 haploinsufficiency of 4q deletion, we knocked down SORBS2 in human embryonic stem cell lines (H1-hESC). We used two different short hairpin RNAs (shRNAs) to knock down SORBS2, and similar knockdown efficiencies (~40% of wild-type expression level) were achieved ( The cardiomyocyte differentiation efficiency (the proportion of cTnT + cells) was significantly decreased in SORBS2-knockdown group at D15 ( Figure 1A-B). Since SORBS2 is a structural component of sarcomeric Z-line and cardiomyopathy gene (Ding et al., 2020;Li et al., 2020;Sanger et al., 2010), we examined the myofibril structure of differentiated cardiomyocytes. Most cardiomyocytes in SORBS2-knockdown group presented a round or oval shape instead of polygonal or spindle-like outlines in control group, and a close lookup indicated that sarcomeric structure in cells with abnormal shapes was disrupted ( Figure 1C). The percentage of cardiomyocytes with wellorganized sarcomeres and a normal shape was much lower in SORBS2-knockdown group ( Figure 1D). Disrupted sarcomeric structures in SORBS2-knockdown cardiomyocytes were also present in transmission electron microscopy analysis ( Figure 1-figure supplement 3A). The expression of sarcomeric genes TNNT2, MYL7, MYH6, and MYH7 was significantly decreased (Figure 1-figure supplement 3B-3E).
The weakened beating force of SORBS2knockdown cardiomyocytes might be derived from abnormal electrophysiology. To this end, we examined the electrical activities of dissociated D30 cardiomyocytes by patch clamping. The dominant type of cardiomyocytes is ventricular-like in both control and SORBS2-knockdown groups (Figure 1-figure supplement 3F). Statistical analyses on action potential parameters of ventricular-like cells, including average action potential (AP) duration at 90% repolarization, average AP frequency, peak amplitude, and resting potential, showed no difference between two groups (Figure 1-figure supplement 3G-3J).

SORBS2 knockdown decreased the expression of SHF marker genes
Having shown the reduced efficiency of cardiomyocyte differentiation in SORBS2-knockdown group, we hypothesized that SORBS2 had an early role in cardiomyocyte differentiation. Expression dynamics showed that SORBS2 was up-regulated at cardiac progenitor stage D5 after a transient absence at mesodermal cell stage (D2-D3) ( Figure 1E). Consistently, SORBS2-knockdown group disrupted the expression of cardiac progenitor markers, whereas mesodermal markers remained unchanged ( Figure 1F-H, Figure 1-figure supplement 4A-4E). There are two sets of molecularly distinct cardiac progenitors during mammalian heart development, referred to as the first and second heart fields (FHF and SHF), which contribute to distinct anatomical structures of the heart (Srivastava, 2006). Interestingly, we found significantly increased expression of FHF markers (TBX5, HCN4, HAND1) (Figure 1-figure supplement 4C-4E) while significantly decreased expression of SHF markers (TBX1, ISL1, MEF2C) in SORBS2-knockdown cells ( Figure 1F-H). SHF gives rise to cardiac outflow, right ventricle, and inflow (Kelly, 2012). Any defect in these embryonic structures leads to CHDs commonly seen in 4q deletion syndrome.

SORBS2 knockdown decreased NOTCH and SHH signaling
To understand how SORBS2 regulates SHF progenitor commitment, we collected D5 cells for RNAseq. Using a stringent threshold (padj <0.05, |log 2 (fold change)|>1), we selected out 160 down-regulated and 104 up-regulated genes for gene ontology (GO) analysis ( Figure 1I, Supplementary file 1-2). Results showed that the up-regulated genes were enriched in biological processes like cell adhesion and so on ( Figure 1J), which might be a compensatory reaction to reduced SORBS2 as a cytoskeleton component. The down-regulated genes were enriched in biological processes like heart development and so on ( Figure 1K), suggesting that SORBS2 positively regulates cardiac development. Particularly, we noted the NOTCH signaling pathway in the down-regulated list ( Figure 1K-L). We verified the expression of NOTCH signaling target genes HEY1, HEYL, and NRARP by qPCR ( Figure 1-figure supplement 4F). In contrast, we did not see differential expression for NOTCH1 in RNA-seq ( Figure 1L), suggesting that the regulation of SORBS2 on NOTCH signaling might be through modulating protein level. SORBS2 can interact with the non-receptor tyrosine kinase c-ABL as SH3 domain-containing adaptor (Kioka et al., 2002). The binding of SORBS2 to c-ABL triggers the recruitment of ubiquitin ligase CBL and leads to the ubiquitination of c-ABL (Soubeyran et al., 2003). Indeed, we noted that c-ABL protein level was significantly elevated in SORBS2-knockdown cells ( Figure 1M). c-Abl can promote Notch endocytosis to modulate Notch signaling (Xiong et al., 2013). Consistently, NOTCH1 protein level decreased significantly in SORBS2-knockdown cells ( Figure 1N). Notch signaling is a well-known molecular mechanism enhancing cellular response to Shh (Stasiulewicz et al., 2015). We noted that the expression of SHH and SHH signaling targets, . **p<0.01; two-tailed Student's t test. (C) Immunostaining of D30 cells with anti-cardiac troponin I (cTnI, red) and anti-a-actinin (green) antibodies. Boxed areas are magnified in the lower panels. (D) Quantification of cardiomyocytes with wellorganized sarcomeres (control: n = 211, SORBS2-knockdown: n = 197). **p<0.01; two-tailed Student's t test. (E) qPCR quantification of SORBS2 expression dynamics (n = 3 for each time point). **p<0.01; two-tailed Student's t test. (F-H) qPCR quantification of second heart field (SHF) progenitor marker expression at different time points (n = 3 for each time point). *p<0.05, **p<0.01; two-tailed Student's t test. (I) Volcano plot illustrates the Figure 1 continued on next page PTCH1 and GLI1, was also reduced in SORBS2-knockdown cells (Figure 1-figure supplement 4G). We applied recombinant SHH protein to check whether it can rescue defects caused by SORBS2 knockdown. As expected, exogenous SHH activated PTCH1 and GLI1 expression ( Figure 1O). It also up-regulated the expression of SHF markers ISL1 and MEF2C in D5 SORBS2-shRNA1 cells ( Figure 1O) and rescued cardiomyocyte differentiation efficiency with more cells presenting a polygonal or spindle-like shape ( Figure 1P and Q).

Sorbs2 -/mice have atrial septal defect and defective dorsal mesenchyme protrusion
Since the entire SORBS2 gene is absent in terminal 4q deletion, we used Sorbs2 knockout mice to examine its role in cardiac development. In a previous report, about 40-60% Sorbs2 -/mice died within 1 week after birth (Zhang et al., 2016), indicating a possible structural heart defect(s). To this end, we collected 137 embryos at E18.5. The ratio of genotype distribution among embryos was consistent with Mendel's law (Supplementary file 3), suggesting no embryo loss in early development stage. We dissected 30 Sorbs2 -/embryos, and none of them showed conotruncal defect or ventricular septal defect (Figure 2A-B). However, we found that about 40% (12/30) Sorbs2 -/hearts had atrial septal defect (ASD) with 10 being the absence/hypoplaisa of primary septum and two being double atrial septum (DAS) ( Figure 2C, Supplementary file 3). The penetrance of ASD is similar to the ratio of reported postnatal lethality, indicating that ASD might contribute to early postnatal death.
A major part of atrial septum is derived from dorsal mesenchyme protrusion (DMP) originated from posterior SHF (Kelly, 2012). Our previous data indicates that SORBS2 knockdown impaired the in vitro differentiation of SHF progenitors. We also noted DMP malformation in 5 out of 15 E10.5 Sorbs2 -/embryos. The majority of them had DMP aplasia/hypoplasia (n = 4), whereas one of them had duplicated DMP ( Figure 2D). The dichotomy of DMP morphology is consistent with two opposite ASD phenotypes seen in E18.5 embryos. Overall, the in vivo phenotype of Sorbs2 -/mice further supports that SORBS2 haploinsufficiency in 4q deletion contributes to CHD pathogenesis through affecting SHF development.

SORBS2 deficiency-induced molecular changes are highly conserved in mice
To examine Sorbs2 expression pattern in early mouse embryos, we pooled publicly available singlecell transcriptomic profiles from E9.25 to 10.5 mouse embryonic hearts (de Soysa et al., 2019;Hill et al., 2019). We identified nine subgroups as cardiac progenitors or cardiomyocytes and noted that Sorbs2 is highly expressed in cardiomyocytes and in a subgroup of cardiac progenitors that also express Isl1 and Tbx1 (Figure 3-figure supplement 1).
To have a view of transcriptomic changes in Sorbs2 mutants, we performed RNA-seq on E10.5 wild-type Sorbs2 +/and Sorbs2 -/embryos. Principal component analysis (PCA) indicates that wildtype embryos are clustered together, whereas Sorbs2 -/embryos are more scattered (±), which is consistent with diverged cardiac phenotypes of Sorbs2 -/embryos. Interestingly, the majority of Sorbs2 +/samples are juxtaposed more closely to Sorbs2 -/embryos in PCA plot, indicating a molecular phenotype in heterozygous mutants. We selected genes significantly down-regulated in heterozygous mutants (log2(fold change)>0.25, p<0.05) to perform GO analysis and found that these genes are enriched in pathways involved in muscle development and embryonic morphogenesis ( Figure 3C, Supplementary file 4). Using the same threshold, we selected genes significantly downregulated in homozygous mutants to perform GO analysis. These genes are enriched in pathways regulating heart development, myofibril assembly, and cardiac septum development ( Figure 3D, Supplementary file 5). Particularly, the Notch signaling pathway was also in the enrichment list. Looking closer, we noted that genes involved in cardiac development, myofibril assembly, and contraction force were down-regulated in nearly all the homozygous mutants ( Figure 3E). The decrease   Figure 3 continued on next page of posterior SHF marker genes was less general but still clearly down-regulated in the majority of Sorbs2 -/embryos ( Figure 3F). Decreased Tbx5 (n = 3) and Hoxb1 (n = 3 out of 4) expression in posterior SHF of Sorbs2 -/embryos was validated by RNA in situ hybridization ( Figure 3G and H). A portion of embryos had obvious down-regulation of Notch and Shh signaling targets ( Figure 3I), which is consistent with a low penetrance of ASD in homozygous mutants. RNA in situ hybridization confirmed decreased Heyl expression in posterior SHF of Sorbs2 -/embryos (n = 1; Figure 3J). Next, we verified the upstream molecular changes of Notch signaling found in hESC differentiation model. Indeed, we noted that Notch1 expression level significantly decreased whereas c-ABL significantly increased in Sorbs2 -/embryos ( Figure 3K-L).

Rare SORBS2 variants are significantly enriched in CHD patients
Rare genetic variants play a significant role in CHD occurrence (Blue et al., 2017), hence we used a rare variant association to help identify genes within CNVs responsible for CHDs. Besides 23 candidate CNV genes containing SORBS2, the targeted panel also includes 81 known CHD genes from literature. Targeted sequencing was performed on 300 complex CHD cases (two cases removed due to low-quality data). 220 Han Chinese descents from the 1000 genome project were used as controls. The ethnicity of these two groups was matched by PCA ( Figure 4-figure supplement 1). A total of 1560 exonic variants from CHD and control groups passed quality control and were included for further analyses ( Figure 4A). Variant distribution in the breakdown categories of CHD and control groups is shown in Supplementary file 6.
Of the 847 nonsynonymous variants, 43.57% (n = 369) variants had a minor allele frequency (MAF) below 1% across ExAC database and were adjudicated as 'damaging' by at least two algorithms (PolyPhen2, SIFT, or MutationTaster). We applied gene-based statistic tests to evaluate the cumulative effects of rare damaging variants (MAF<1%) on CHDs. Genes with at least two rare damaging variants (n = 57) were included for analysis (Supplementary file 7). 4 out of 57 genes (SORBS2, KMT2D, EVC2, and SH3PXD2B) had a p-value lower than 0.05 (one-tailed Fisher's exact) and two of them (SORBS2 and KMT2D) had a statistically significant mutation burden after the correction for multiple testing (q<0.20) ( Figure 4B, Supplementary file 7). KMT2D is a well-known CHD gene (Ang et al., 2016;Jin et al., 2017). Our data indicate that rare SORBS2 variants have similar levels of enrichment in CHDs as the known CHD genes. The distribution of rare SORBS2 damaging variants in CHD patients spread throughout the gene ( Figure 4C, Supplementary file 8).
Although we didn't detect SORBS2 nonsense variants in CHD patients, missense mutations identified in our cohort caused protein aggregation in cells ( Figure 4D), suggesting an abnormal function of these variant proteins. A high prevalence (85%, 17/20) of ASD, the defect seen in Sorbs2 -/hearts, was observed in patients carrying SORBS2 variants (Supplementary file 9). In our CHD cohort, we noted a significant enrichment of SORBS2 rare damaging variants in patients with ASD (17 out of 183 ASD patients versus 3 out of 117 non-ASD patients, p=0.0306, Fisher's exact test). These data further support that SORBS2 contributes to CHD pathogenesis.

Discussion
The common CHDs in 4q deletion syndrome include ASD, ventricular septal defect (VSD), pulmonary stenosis/atresia, and tetralogy of Fallot and so on Strehle and Bantock, 2003;Lin et al., 1988. The affected structures are atrial septum and cardiac outflow tract, which are all derived from   (Kelly, 2012). Indeed, our data revealed that SORBS2 functions not only as a sarcomeric component to maintain cardiomyocyte function, but also as an adaptor protein to promote SHF progenitor commitment in in vitro cardiomyocyte differentiation. ASD is detected in Sorbs2 -/mouse hearts. It supports that Sorbs2 regulates SHF development in vivo and its role is conserved across species. In both models, we detected an increased protein level of NOTCH1 endocytosis facilitator c-ABL, a decreased NOTCH1 protein level, and impaired SHH signaling. Notch and Shh signaling is essential for SHF development (Paige et al., 2015). Notch signaling promotes Smo accumulation in cilia and enhances cellular response to Shh, placing Notch upstream of Shh signaling (Stasiulewicz et al., 2015;Kong et al., 2015). Therefore, SORBS2 might promote SHF progenitor fate through c-ABL/ NOTCH/SHH axis. In addition, Tbx5 was also significantly down-regulated in posterior SHF of Sorb2 mutants. Tbx5-Hh molecular network is an essential regulatory mechanism in SHF for atrial septation (Xie et al., 2012). It is likely that Tbx5 downregulation also contributes to the pathogenesis of ASD through its effect on Hh signaling. Consistently, adding recombinant SHH protein is sufficient to rescue SHF marker gene expression and cardiomyocyte differentiation efficiency. However, TBX5 was up-regulated in D5 SORBS2-knockdown cells. It is likely that in vitro cardiomyocyte differentiation is a simplified model that cannot fully recapitulate the in vivo spatial and temporal information of various cardiac progenitors and, therefore, has less regulatory layers in which TBX5 may predominantly function as an FHF regulator. Indeed, Tbx5 knockdown reduces FHF progenitors and has no effect on SHF progenitors in an in vitro cardiogenesis model (Andersen et al., 2018).
Unlike human 4q deletion patients, Sorbs2 -/mice have no conotruncal defect and the penetrance of ASD is only 40%, indicating a relatively small effect of SORBS2 in CHD pathogenesis. The high penetrance of conotruncal defects in human 4q deletion patients may be due to genetic modifiers that, together with SORBS2 haploinsufficiency, cause the developmental defect in cardiac outflow tract. An obvious genetic modifier is the HAND2 gene, which is co-missing with SORBS2 in large 4q deletions. Previous studies have shown that CHD is observed more frequently in patients with the terminal deletion at 4q31 than in patients with the terminal deletion at 4q34 or 4q35 (Lin et al., 1988). Since some terminal 4q deletions that do not cover HAND2 still manifest conotruncal defects, there may be another genetic modifier in the terminal deletion region. Helt, whose human homologous HELT is located within 4q35.1, encodes a Hey-related bHLH transcription factor that is expressed in both the brain and heart, and mediates Notch signaling (Nakatani et al., 2004). Therefore, both SORBS2 and HELT haploinsufficiency might synergistically impair NOTCH signaling and cause a cardiac outlfow tract (OFT) defect.
DAS, also called Cor triatriatum type C in the original report (Thilenius et al., 1976), is a very rare CHD characterized by an extra septal structure to the right side of primary atrial septum (Roberson et al., 2006). This anatomic abnormality is implicated as a cause of paradoxical thromboembolic event to stroke or heart attack (Breithardt et al., 2006). However, its etiology and pathogenesis are entirely unknown. Here, we have shown that Sorbs2 deficiency can cause this abnormality. Interestingly, Cor triatriatum, another type of abnormal extra atrial septation, has been reported in a patient with a terminal 4q34.3 deletion (Marcì et al., 2015), which includes SORBS2 but not HAND2. It has been speculated that DAS might result from the persistence of embryologic structures or abnormal duplication of atrial septum. The impaired cardiogenesis in Sorbs2 -/mice suggests that the latter scenario may be the underlying pathogenesis.

Mouse lines and breeding
Mice were housed under specific pathogen-free conditions at the animal facility of Shanghai Children's Medical Center. Sorbs2 flox/flox mice (Zhang et al., 2016) were a gift from Dr. Guoping Feng's lab (McGovern Institute for Brain Research, MIT, Cambridge). Sorbs2allele was obtained by breeding Sorbs2 flox allele into CMV-Cre mouse (Schwenk et al., 1995). The strains were backcrossed with C57BL/6 to maintain the lines ever since we obtained them. 2-to 6-month-old males and females were used for timed mating and embryos were collected at E18.5. Neither anesthetic nor analgesic agent was applied. CO 2 gas in a closed chamber was used for euthanasia of pregnant dam and cervical dislocation was followed. Isolated fetuses were euthanized by cervical dislocation. Animal care and use were in accordance with the NIH guidelines for the Care and Use of Laboratory Animals and approved by the Institutional Animal Care and Use Committee of Shanghai Children's Medical Center (SCMC-LAWEC-2017006).

Histological analysis
For cardiac phenotype analysis, embryos were collected at E18.5 and E10.5, and were fixed in 10% formalin overnight. Isolated hearts were processed for paraffin embedding, sectioned at a thickness of 4 mm, and stained with hematoxylin and eosin. Stained sections were imaged using a Leica DM6000 microscope.

RNA in situ hybridization
E10.5 embryos were collected and fixed in 4% paraformaldehyde (PFA) solution for 2 hr at room temperature, then dehydrated by 30% sucrose, and embedded in OCT (Thermo Fisher Scientific, 6502). 10 mm cryosections were used for RNA in situ hybridization (ISH) according to standard procedure. Tbx5 probe plasmid was a gift of Dr. Cecilia Lo (University of Pittsburgh). Hoxb1 and Heyl probes were generated in house through PCR amplification. Primers for Hoxb1 probe: F-5' TTCCTTTTTAGAGTACCCACTTTG, R-5' GTTTCTCTTGACCTTCATCCAGTC. Primers for Heyl probe: F-5' GCCAGGAGCATAGTCCCAAT, R-5' GGCCCTCAACCCACTCCATGAC.

Flow cytometric analysis
In brief, D15 cells were harvested in 0.25% trypsin/EDTA at 37℃ for 15 min and subsequently neutralized by 10% fetal bovine serum in Dulbecco's modified Eagle medium. Then cells were centrifuged at 1000 rpm for 5 min and resuspended in Invitrogen FIX and PERM solution and kept at 4˚C for 30 min. After washing, cells were incubated with anti-cTnT antibody (1:100, ab105439; Abcam) in washing buffer on ice in the dark for 45 min. Cells were centrifuged, washed, and resuspended for detection. Data were collected by BD FACSCanto flow cytometer and analyzed by BD FACS software.
Electron microscopy D30 cardiomyocytes were harvested using 0.25% trypsin/EDTA and prefixed with 2.5% glutaraldehyde in 0.2 M phosphate buffer overnight at 4˚C. Samples were washed and then post-fixed with 1% osmium tetroxide for 1.5 hr. Next, cells were routinely dehydrated in an ethanol series of 30, 50, 70, 80, and 95% for 15 min each, and 100% ethanol and acetone twice for 20 min each at room temperature, and then embedded in an epoxy resin. Sections (70 nm thick) were poststained in uranyl acetate and lead citrate and visualized on Hitachi 7650 microscope.

Electrophysiological recordings
D30 h1ESC-CMs were digested by Accutase (7920; STEMCELL Technologies, Canada), washed one time with a baseline extracellular fluid, and then moved to the stage of an inverted microscope (ECLIPSE Ti-U; Nikon, Japan) for patch-clamp recording. h1ESC-CMs were continuously perfused by an extracellular solution through a 'Y-tube' system with a solution exchange time of 1 min. Wholecell patch-clamp recordings were performed using Axopatch 700B (Axon Instruments, Inc, Union City, CA, USA) amplifiers under an invert microscope at room temperature (22-25˚C). Glass pipettes were prepared using borosilicate glasses with a filament (Sutter Instruments Co, Novato, CA) using the Flaming/Brown micropipette puller P97 (Sutter Instruments Co). The final resistance parameters of patch pipette tips were about 2-4 MW after heat polish and internal solution filling. After the formation of 'gigaseal' between the patch pipette and cell membranes, a gentle suction was operated to rupture the cell membrane and establish whole-cell configuration. All current signals were digitized with a sampling rate of 10 kHz and filtered at a cutoff frequency of 2 kHz (Digidata 1550A; Axon Instruments, Inc, Union City, CA). The spontaneous action potentials were recorded in a gap-free mode with a sampling rate of 1 kHz and filtered at a cutoff frequency of 0.5 kHz. If the series resistance was more than 10 MW or changed significantly during the experiments, the recordings were discarded from further analyses. The pipette internal solution for action potential recording contained (in mM) KCl 150, NaCl 5, CaCl 2 2, EGTA 5, HEPES 10, and MgATP 5 (pH 7.2, KOH), and baseline extracellular solution and extracellular solution for action potential recording contained (in mM) NaCl 140, KCl 5, CaCl 2 1, MgCl 2 1, glucose 10, and HEPES 10 (pH 7.4, NaOH). Data were analyzed by using Clampfit 10.5 and Origin 8.0 (OriginLab, Northampton, MA).

Quantitative real-time PCR analysis
Total RNA was extracted from D0, D2, D3, D5, and D10 cells or E10.5 embryos using Trizol reagent (15596018; Thermo Fisher Scientific). Reverse transcription was accomplished with Reverse Transcription Kit (Takara; RR037A) according to the manufacturer's instructions. qPCR was performed with the SYBR Fast qPCR Mix (Takara; RR430A) in the Applied Biosystems 7900 Real-Time PCR System. Primers are listed in Supplementary file 10.

RNA-Seq
Total RNA of D5 cells and E10.5 embryos was isolated using TRizol reagent (Thermo Fisher Scientific; 15596018). Library preparation and transcriptome sequencing on an Illumina HiSeq platform were performed by Novogene Bioinformatics Technology Co, Ltd to generate 100-bp paired-end reads. HTSeq v0.6.0 was used to count the read numbers mapped to each gene, and fragments per kilobase of transcript per million fragments mapped (FPKM) of each gene were calculated. We used FastQC to control the quality of transcriptome sequencing data. Next, we compared the sequencing data to the human reference genome (hg19) by STAR. The expression level of each gene under different treatment conditions is obtained by HTSeq-count after standardization. The differentially expressed genes were analyzed by DESeq2 package. Functional enrichment of differentially expressed genes was analyzed on Toppgene website.
GSE126128 and GSE131181 datasets were retrieved from Gene Expression Omnibus (GEO) database. Seurat (version 3.0) toolkit was used for scRNA-seq analysis. After data integration, batch effect elimination, normalization, and scaling, different cell populations were identified based on existing references. Gene expression was plotted using normalized read counts.

Network and GO analysis
From ENCODE database (Gerstein et al., 2012), 316 human fetal heart-specific genes including SORBS2 were selected and their expression coefficients were computed. The highly co-regulated transcriptional networks (correlation coefficient !0.8) were constructed and visualized with BioLayout Express3D. The interconnected gene clusters were detected using the MCL (Markov Cluster) algorithm and illustrated with different colors.

Patient samples
A total of 300 children with complex CHD, including ASD, conotruncal defect, and so on, were enrolled in our study from November, 2011 to January, 2014 in Shanghai Children's Medical Center (Supplementary file 11). Patients carrying 22q11.2 deletion and gross chromosomal aberrations were excluded from our study. The mean age of included probands was 10 months with a range of 3 days to 17 years. 188 (62.7%) were boys and 112 (37.3%) were girls. CHD diagnosis was confirmed by reviewing patient history, physical examinations, and medical records. Patients carrying 22q11.2 deletion and gross chromosomal aberrations (e.g., trisomy 21, trisomy 13, and trisomy 18) were excluded from our study. The study conformed to the principles outlined in the Declaration of Helsinki, and approval for human subject research was obtained from the Institutional Review Board of Shanghai Children's Medical Center (SCMC-201015). Written informed consents were obtained from parents or legal guardians of all patients.

Control cohort
Exome sequences for 220 control subjects of Han Chinese descent were derived from the 1000 genome project (http://www.internationalgenome.org/data). Raw sequence data in the form of fastq files were re-aligned and re-analyzed with the same bioinformatic pipelines with CHD patients. The ethnicity of cases and controls was investigated by performing PCA with single nucleotide polymorphism (SNP) genotype data from all the participants of this study.

Gene selection and targeted sequencing
We used a customized capture panel of 104 targeted genes, which included 81 known CHD genes (Supplementary file 12) and 23 CHD candidate genes (Supplementary file 13). The CHD genes were selected through a comprehensive literature search and had been reported by other research groups to be associated with CHD in either human patients or mouse models (Andersen et al., 2014;Barriot et al., 2010;Fahed et al., 2013). CHD candidate genes were prioritized from pathogenic CNVs and likely pathogenic CNVs identified in CHD patients in our previous study (Geng et al., 2014). The coding regions of selected genes and their flanking sequences were covered by the Agilent SureSelect Capture panel and sequenced on the Illumina Genome Analyzer IIx platform according to the protocols recommended by the manufacturers.

Variant calling and quality control
We used the best practice pipeline of Broad Institute's genome analysis toolkit (GATK) 3.7 to obtain genetic variants from the target sequencing data of 300 CHD cases together with the raw data of 220 Han Chinese control samples. Briefly, Burrows-Wheeler Aligner (BWA, version 0.7.17) was used to align the Fastq format sequences to human genome reference (hg38). De-duplication was performed using Picard, and the base quality score recalibration (BQSR) was performed to generate analysis-ready reads. HaplotypeCaller implemented in GATK was used for variant calling in genomic variant call format (GVCF) mode. All samples were then genotyped jointly. We then excluded the variants based on the following rules: (1) >2 alternative alleles; (2) low genotype call rate <90%; (3) deviation from Hardy-Weinberg Equilibrium in control samples (p<10 À7 ); (4) differential missingness between cases and controls (p<10 À6 ). After alignment and variant calling, we removed two subjects with low-quality data, leaving a total of 298 cases and 220 controls.

Variant enrichment analysis
Given that severe mutations are generally present at low frequencies in the population, we set the relevant variants filtering criteria as follows: (1) variants that located in exonic region, (2) excluding synonymous variants, (3) variants with an MAF below 1% according to the public control database (Exome Aggregation Consortium, ExAC), and (4) damaging missense variants predicted to be deleterious by at least two algorithms (Polyphen2 !0.95/MutationTaster_pred:D/SIFT 0.05). All relevant variants following these criteria were hereafter called 'rare damaging' variants. The number of rare damaging variant carriers in each gene was counted in CHD patients and controls. We hypothesized that rare damaging variants should be enriched in CHD patients, hence the carrier and non-carrier groups were compared between CHD patients and controls using a one-tailed Fisher's exact test. The odds ratio (OR) was calculated. Only genes with at least two variants were retained, and multiple testing correction was performed using Benjamini-Hochberg procedure (q-value, adjusted p-value after Benjamini-Hochberg testing).

Statistical analysis
Statistical significance was performed using a two-tailed Student's t test, c 2 test, or Fisher's exact test as appropriate. The combined difference of Notch and Shh signaling was tested by nonparametric PERMANOVA. Statistical significance is indicated by *p<0.05 and **p<0.01.

Additional files
Supplementary files . Supplementary file 1. Down-regulated genes in SORBS2-knockdown D5 cells for GO analysis.
. Supplementary file 2. Up-regulated genes in SORBS2-knockdown D5 cells for GO analysis.
. Supplementary file 6. Number of exonic variants detected in CHD and normal controls.
. Supplementary file 7. Carriers of rare damaging variants in CHD and normal controls. . Transparent reporting form

Data availability
Targeted sequencing raw data of CHD patients have been deposited in NCBI's Sequence Read Archive (PRJNA579193). RNA-seq data have been deposited in NCBI's Gene Expression Omnibus (GSE137090).
The following datasets were generated: