A complementary study approach unravels novel players in the pathoetiology of Hirschsprung disease

Hirschsprung disease (HSCR, OMIM 142623) involves congenital intestinal obstruction caused by dysfunction of neural crest cells and their progeny during enteric nervous system (ENS) development. HSCR is a multifactorial disorder; pathogenetic variants accounting for disease phenotype are identified only in a minority of cases, and the identification of novel disease-relevant genes remains challenging. In order to identify and to validate a potential disease-causing relevance of novel HSCR candidate genes, we established a complementary study approach, combining whole exome sequencing (WES) with transcriptome analysis of murine embryonic ENS-related tissues, literature and database searches, in silico network analyses, and functional readouts using candidate gene-specific genome-edited cell clones. WES datasets of two patients with HSCR and their non-affected parents were analysed, and four novel HSCR candidate genes could be identified: ATP7A, SREBF1, ABCD1 and PIAS2. Further rare variants in these genes were identified in additional HSCR patients, suggesting disease relevance. Transcriptomics revealed that these genes are expressed in embryonic and fetal gastrointestinal tissues. Knockout of these genes in neuronal cells demonstrated impaired cell differentiation, proliferation and/or survival. Our approach identified and validated candidate HSCR genes and provided further insight into the underlying pathomechanisms of HSCR.


Introduction
The neurocristopathy Hirschsprung disease (HSCR, OMIM 142623), also termed congenital intestinal aganglionosis, represents one of the main causes for neonatal intestinal obstruction. It is characterized by the absence of enteric ganglia (aganglionosis) in the distal bowel. Depending on the length of the affected segment, it is categorized as short-segment (S-HSCR; up to 80%), long-segment (L-HSCR; up to 20%), or total colonic aganglionosis (TCA; up to 8%). The diagnosis is often made within the first days after birth when meconium passage is delayed or a megacolon forms. Patients can also suffer from enterocolitis, constipation, abdominal pain, or emesis. So far, the only treatment is to surgically resect the affected aganglionic bowel segment [1][2][3]. The enteric nervous system (ENS) innervates the gastrointestinal (GI) tract, regulating blood flow, gut motility, peristalsis, ion and fluid homeostasis, and secretion of signaling mediators [4,5]. The ENS originates from vagal neural crest cells (NCCs) during embryonic development, with delamination between E8.5 and E9.5 in mice and before week 4 of gestation in humans [3,4]. NCC-derived progenitor cells that invade and colonize the developing gut are enteric neural crest-derived cells (ENCDCs). They give rise to enteric neurons and glia cells, which are organized into the submucosal and myenteric plexus ganglia [4,6,7]. Impaired migration, proliferation, survival, or differentiation of NCCs or ENCDCs in the developing lower GI tract may cause the enteric neuropathy HSCR [5,7].
Genetically, HSCR is a rare, multifactorial disorder with an incidence of 1:5,000 live births, showing male sex preponderance, incomplete penetrance, and variable expressivity [1,8,9]. To Hospital (https://www.heidelberg-universityhospital.com) (G. Rappold), and the Dres. Majic/ Majic Schlez Stiftung (T. Mederer). Tanja Mederer is a PhD fellow of HBIGS (http://www.hbigs.uniheidelberg.de) and was funded by the Studienstiftung des Deutschen Volkes (https:// www.studienstiftung.de). Cristina Martínez is supported by Instituto de Salud Carlos III, Subdirección General de Investigación Sanitaria, Ministerio de Ciencia, Innovación y Universidades (https://www.isciii.es) (CP18/00116). Salud Borrego was supported by Instituto de Salud Carlos III (https://www.isciii.es) through the project "PI16/0142" and "PI19/01550" (Co-funded by European Regional Development Fund/European Social Fund "A way to make Europe"/"Investing in your future"). The WES analysis from 443 HSCR cases and 493 controls of East Asian ethnicity was supported by the Theme-Based Research Scheme (https://www.ugc.edu.hk) (grant no. T12C-714/14-R.) (Paul Tam). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. date, more than 20 genes have been identified and replicated that affect signaling cascades crucial for ENS development. The major susceptibility locus is RET as mutations account for up to 50% of familial and 20% of sporadic HSCR cases. To date, mutations in other HSCR risk genes have been identified in about 5% of patients [9,10]. However, variants in these established genes account for only 30% of patients, but many more genes have been implicated in its pathoetiology [9,11]. Identifying disease-relevant variants remains a major challenge.
In the present study, we used an evidence-based procedure to select HSCR candidate genes identified by whole exome sequencing (WES). To prioritize and validate candidate genes, we evaluated bioinformatically filtered WES data based on literature and database searches, and on expression analyses of embryonic ENS tissues. To accumulate further evidence for the potential disease-relevance of our selected candidate genes, WES datasets of further HSCR patients as well as of individuals presenting mostly with a neurological phenotype were taken into account, allowing the identification of further rare variants. Candidate genes were functionally analyzed using gene-specific CRISPR/Cas9-based knockout (KO) cell clones. We focused on properties relevant to HSCR pathoetiology, such as migration, proliferation, cell survival, and differentiation.
Our approach effectively and comprehensively evaluated novel candidate HSCR genes and may serve as a blueprint for prospective studies aiming to discover and validate novel diseasecausing HSCR genes.

Candidate gene identification and selection
Coding variants of two sporadic L-HSCR patients and their non-affected parents were assessed by trio WES. Bioinformatic filtering identified 369 potential disease-causing, rare, coding, non-synonymous SNVs and indels in 357 genes for patient I and 373 in 356 genes for patient II (Fig 1A, S1 Fig). Obtained variants were filtered based on the disease model (including de novo for autosomal dominant, homozygous or compound heterozygous for autosomal recessive, and hemizygous for X-linked), resulting in 16 candidate variants in 11 genes for patient I and 19 candidate variants in 14 genes for patient II (Fig 1A, S1 Table, S2 Table, S1 Fig). Of note, none of the filtered candidates were known HSCR genes. However, targeted screening for ENS-relevant and HSCR risk genes (S3 Table) revealed three heterozygous variants for patient I (ECE1 (rs141146885), SERPINI1 (rs61750375), and SUFU (rs34135067)) and one heterozygous variant for patient II (EDNRB (rs780841273)).

Prioritization of novel HSCR candidate genes
Literature and database searches were performed to prioritize candidate genes. Further, genes presenting with variants of CADD scores above 13 were taken into account, which indicate that the variant is among the 5% most deleterious substitutions in the human genome [12]. 14 out of 16 variants of patient I as well as 16 out of 19 variants of patient II reached this CADD score threshold (S1 Table, S2 Table). Moreover, we considered central nervous system (CNS) phenotypes, finally leading to 4 candidates in patient I and 9 candidates in patient II (S1 Table,  S2 Table,  In the filtered list of candidate genes, two genes per patient were prioritized for detailed characterization, and variants were confirmed by Sanger sequencing: ATP7A (ATPase COPPER TRANS-PORTING ALPHA) and SREBF1 (STEROL REGULATORY ELEMENT-BINDING PROTEIN 1) for patient I, and ABCD1 (ATP-BINDING CASSETTE SUBFAMILY D MEMBER 1) and PIAS2 (PROTEIN INHIBITOR OF ACTIVATED STAT 2) for patient II (S1 Table, S2 Table,

Genetic evaluation
Moreover, screening of WES datasets for all four candidate genes of 767 HSCR patients revealed additional rare variants, although the variants present in patients I and II were not seen. Based on our filtering criteria, 6 non-synonymous variants for ATP7A, 10 for SREBF1, 2 and 5 for ABCD1 and PIAS2 respectively could be found (Fig 2A-2D

Candidate gene validation
IPA network analysis. To better understand the biological context of selected candidates, we performed a network analysis using the IPA software tool. All four candidates showed multiple direct and indirect protein interactions with ENS-relevant and HSCR risk factors ( Fig  2E). ATP7A is indirectly connected to RBP4 via the LDL complex. The very long chain fatty  relevance of all candidate genes was examined by evaluating genetic data from additional patients with HSCR. Variants shown in red were detected in index cases of this study while filtered variants from other cohorts are shown in black (# variant was found in both). More information about the filtered rare variants is given in S4 Table. Protein domain structures and the localization of filtered rare variants are given for ATP7A (A), SREBF1b (B), ABCD1 (C), and PIAS2ß (D). Functional domains in specific isoforms were annotated based on database entries (Pfam, NCBI, Uniprot) or additional literature findings [53,54]. HMA: heavy metal associated, TM: transmembrane, A: actuator, N: nucleotide, P: phosphorylation, HLH: helix loop helix, LMNA: LAMIN A/C, ABC: ATP-binding cassette, SAP: scaffold attachment factor-A/B, acinus and PIAS, PINIT: Pro-Ile-Asn-Ile-Thr, Zn-SP-RING: zinc binding -Siz/Pias-really interesting new gene, SUMO: small ubiquitin-related modifier, NLS: nuclear localization signal. (E) IPA core analysis was performed using lists of filtered variants detected in both index cases (S1 and S2 Tables), of ENS-relevant and known HSCR risk factors (S3 Table). Indirect interactions were only kept if linked to candidate gene products. ENS-relevant factors are highlighted in green while validated HSCR risk factors are lettered in red. Candidates are colored in yellow, the four selected most promising candidates are marked by red lettering. Protein interactions between candidates and factors of interest are marked in red. Proteins are arranged according to their subcellular localization. https://doi.org/10.1371/journal.pgen.1009106.g002

PLOS GENETICS
acid (VLCFA) transporter ABCD1 is directly connected to the HSCR factor DNMT3B, while several direct connections for SREBF1 and PIAS2 were found. Among others, both candidates are directly connected to the HSCR-associated factor NAV2. Additionally, PIAS2 was identified as a SUMOylation mediator of PTEN ( Fig 2E, S1 Fig).
Screening of additional WES and WGS data. To investigate, whether our selected candidates might be involved in ENS-and CNS-related phenotypes alike, we subsequently applied the same search and filtering strategy to a set of approximately 15,500 cases submitted for clinical WES and WGS (S1 Fig). This cohort is clinically diverse, with a majority of cases being submitted for neurological indications. In total, we identified 114 individuals harboring coding non-synonymous or indel variants in one of our four selected candidate genes. Of these, 37 individuals carried variants in ATP7A, 31 Table).
GTEx database analysis. In line with this, we also considered expression data in the GTEx database (human brain, colon tissue) to verify their relevance in the brain and the gut and could detect a positive expression in both tissues (S7 Table,  Expression analyses in murine and human tissues. Immunofluorescence analyses of murine embryos at relevant developmental stages (E9.5, E10.5, E11.5, E13.5) was performed and revealed broad protein expression of all candidates in ENS-related tissue structures and cells (S6 Table,  All validation steps are summarized in S7 Table. Determining the neuronal-specific role of candidates Generation of gene-specific KO clones. To determine the neuronal functions of selected genes, we generated gene-specific KO cell clones of the human neuroblastoma cell line SHSY5Y. The differentiation, migration, proliferation, and survival of these cells were assessed. A KO clone of the major HSCR gene RET was analyzed as a proof-of-principle control, and all clones were also compared with a mock control clone. KO clones either harbored homozygous (RET KO, ABCD1 KO) or compound heterozygous genome modifications (ATP7A KO, SREBF1 KO) (S5 Fig). Off-target effects were largely excluded (S1 Data). However, genome editing did not work for PIAS2 as no KO clone with neuronal morphology could be generated, despite several attempts with sgRNAs targeting two different exons. While the genetically modified exon 6 harboring a homozygous one-nucleotide insertion escaped KO by alternative splicing, sgRNAs targeting exon 2 did not yield a suitable clone.
Neuronal differentiation of KO clones. KO clones were differentiated into a neuronallike phenotype to investigate the functional impact of candidate genes on neuronal development (KO vs mock control).
All four KO clones showed impaired development of a neuronal-like phenotype ( Fig 3A). The neuronal morphology (elongated cell bodies carrying several neurite-like outgrowths) became apparent from the undifferentiated to the 7d differentiated state in all clones, except for the ATP7A KO clone, where a confluent cell monolayer was present after 7d (Fig 3A"'). After 7+14d, only the mock control clone had a highly organized network of interconnected neuronal-like clusters (Fig 3A'). KO clones for RET (Fig 3A"), SREBF1 (Fig 3A""), and ABCD1 ( Fig 3A""') showed delayed network formation. After 7+28d, all genome-edited clones ( Fig  3A), except the ATP7A KO clone (Fig 3A"'), showed neuronal-like clusters, which were connected by semi-adherent neurite-like connections.
Throughout differentiation, mRNA expression profiles of various cell type markers were assessed (Fig 3B, S6 Fig). Expression of these markers increased in the mock control clone with advancing neuronal-like maturation. While MAP2, TAU, and SYP were only marginally expressed, TUBB3 and GAP43 were prominently expressed. Marker expression was significantly different in all gene-specific KO clones during neuronal differentiation at various differentiation stages. MAP2 and GAP43 expression was significantly reduced at several stages, while SYP expression was upregulated after 7+21d in the RET KO clone. For the ATP7A KO clone, TUBB3 and TAU expression was significantly higher than in the undifferentiated control, but TAU expression dropped over time. At 7+7d, MAP2 and GAP43 expression was lower. During early differentiation, most markers showed increased expression in the SREBF1 KO clone, except for GAP43 after 7+7d. In contrast, expression of all markers was significantly higher in the ABCD1 KO clone during later neuronal development (Fig 3B).
In addition, expression of P75NTR and NES (neuronal progenitor markers) and UCHL1 and ASCL1 (pan-neuronal markers) was investigated in all cell clones (S6 Fig). These analyses are described in detail in the Supplementary (S1 Data).
Complementary immunofluorescence analyses confirmed TUBB3, MAP2, TAU, SYP, and GAP43 marker expression in most cell clones differentiated for 7+14d ( Fig 3C). Additionally, cellular morphologies were comparable to previous analyses ( Fig 3A). Corresponding to their functions, signals were detectable in all KO clones in neurite-like projections (TUBB3, MAP2, GAP43 and TAU), while for the synapse-marker SYP, a dotted-like staining pattern was observed along these structures ( Fig 3C). However, for the ATP7A KO clone, no neuronal-like clusters and no distinct staining pattern for TAU could be detected at this maturation stage (Fig 3C"'). Numerical raw data for Fig 3B and S6 Fig are given in S13 Table. Further comparative functional in vitro analyses. Cell migration (Fig 4A), proliferation (Fig 4B), and survival ( Fig 4C) were compared in undifferentiated and differentiated KO clones and the mock control clone. Cell migration was significantly reduced in the undifferentiated RET KO clone (Fig 4A'), but no difference was observed in differentiated clones (Fig 4A"). In addition, significantly fewer proliferative cells (BrdU + ) were detected in the undifferentiated ABCD1 KO clone (Fig 4B'), whereas no difference was observed in differentiated (7+1d) candidate clones (Fig 4B"). Apoptosis was higher in the undifferentiated ABCD1 KO clone (Fig 4C'), and the differentiated (7+7d) SREBF1 KO clone (Fig 4C"). Respective numerical data are summarized in S13 Table.

Discussion
Identifying and validating disease-relevant genetic variants in complex disorders such as HSCR remains challenging. To overcome these obstacles, we established a complementary approach to select candidate genes, to collect gene-related information on multiple levels, and finally to determine their disease relevance in functional readouts.
WES data revealed several hundreds of promising HSCR candidate variants in patient I and II. After prioritizing specific mutation types, we further took mRNA expression profiles in murine embryonic tissues, CADD scores (�13), published literature, and database information into account. We considered extra-intestinal neurological phenotypes to be relevant, since the CNS and ENS are structurally and functionally similar [4,[14][15][16], and interconnections between CNS disorders and HSCR/ENS-relevant phenotypes have already been reported [17,18]. Recently, genes involved in HSCR and ENS development (such as CHD8 [19] and NLGN3 [20,21]) were also found to be implicated in neurodevelopmental disorders indicating that genetic neurodevelopmental processes might be conserved between the CNS and ENS. Of note, there is growing evidence suggesting that variants predisposing to ASD by affecting CNS structure and function might also impact ENS development culminating in disturbed GI structure and function [22].
Finally, we ended up with 4 genes for patient I and 9 genes for patient II with potentially relevant variants, that all would equally have qualified for further evaluation. However, since the envisioned functional part of the pipeline was highly demanding, we decided on two candidates per patient: ATP7A and SREBF1 for patient I and ABCD1 and PIAS2 for patient II. Having now proven this study pipeline as feasible, it represents a perfect basis for an automated high-throughput approach. Once established, we aim to apply the adapted pipeline to additional candidates in the future.
ATP7A encodes a copper transporter which regulates intracellular ion levels and has previously been implicated in Menkes disease [23]. The VLCFA transporter ABCD1 is involved in X-linked adrenoleukodystrophy (X-ALD), and impaired ABCD1 function causes substrate accumulation and neurodegeneration [24]. SREBF1 encodes a crucial regulator of sterol and cholesterol biosynthesis, while PIAS2 encodes an E3-SUMO ligase and has multiple targets besides the STAT family [25,26]. These candidates have also been implicated in neurodegenerative disorders including Alzheimer and Parkinson disease [27,28].
We validated the disease relevance of our selected candidates using multiple approaches. In silico network analyses revealed direct and indirect interactions between the candidates and ENS-/HSCR-relevant factors [29][30][31], suggesting the selected candidates modify ENS-relevant genes, or participate in uncharacterized HSCR susceptibility pathways [1,32]. The disease relevance of our candidates was strengthened by additional variants in other HSCR patients. Importantly, another non-synonymous variant in ATP7A was also found in a patient with pseudo-obstruction (rs201788154, c.2452A>G, p.T818A), underlining its potential significance in functional GI disorders.
As already mentioned, neurodevelopmental processes seem to be conserved between the ENS and the CNS. Thus, genetic alterations might affect both systems likewise. In order to get an idea, whether our candidates might be involved in ENS-/CNS-related phenotypes, we screened clinical WES/WGS data of a non-HSCR, neurodevelopmental disease-focused clinical cohort. Obtained results are in line with this, as patients carrying variants in the selected candidates presented not only with neurological phenotypes but also with a high prevalence of GI phenotypes.
We analyzed the protein expression of candidate genes in murine embryonic ENS-relevant tissues and human fetal colon tissue. Although the human specimens did not match the stages examined in murine tissue, expression data from human tissue was valuable because HSCR is a prenatal human disorder. The transcript and protein expression profiles of the four candidates are already known from different species and tissues at various developmental stages [24,[33][34][35][36][37]. However, to the best of our knowledge, ENS-specific analyses have not been performed so far.
To correlate candidate gene function with HSCR-causing pathomechanisms, gene-specific KO cell clones were generated from SHSY5Y human neuroblastoma cells. Although this CNSderived cell source does not mimic the ENS, we hypothesized that neuronal differentiation, migration, proliferation, and survival are conserved between the CNS and ENS. This assumption is supported by the previous finding that the same genes are involved in ENS and CNS development. We successfully generated KO clones for ATP7A, SREBF1, ABCD1 and RET, but PIAS2 escaped all editing approaches. It is tempting to speculate that PIAS2 has an essential cellular function [38], particularly because the edited neuronal-like clone escaped CRISPR--Cas9-based KO by alternative splicing and expressed a shorter, unknown PIAS2 isoform.
To assess phenotypic alterations resembling HSCR pathogenesis in vitro, we also studied neuronal differentiation, migration, proliferation, and survival in KO clones compared with a mock control and RET KO clone. Most striking phenotypic alterations for the RET KO clone were shown for its neuronal differentiation and migration capacity. These findings were in agreement with two previous studies-one using induced pluripotent stem (iPS) cell-derived ENCDCs from a HSCR patient with a RET mutation and one using neuronal precursors from Ret-deficient mice [39,40]. In addition, neuronal markers were differentially expressed in the RET KO clone, underlining the suitability of our in vitro approach in assessing neuronal-specific impairments. We did not validate the neuronal markers on the protein level, but the mRNA expression profiles (e.g. GAP43) were consistent with results from previous studies [41]. How RET deficiency influences marker expression in the differentiating KO clone and the functional relevance of this dysregulation should be elucidated in future studies.
Candidate gene-specific KO clones displayed altered cellular functions. The most striking morphological difference was seen in the ATP7A KO clone, which displayed high cell densities in line with neuronal-like maturation. Although we did not detect changes in proliferation, the higher cell densities may have been caused by the inverse relationship between ATP7A expression and proliferation in neuroblastoma cells [33]. In vivo analyses using a murine disease model demonstrated that Atp7a might be involved in axonal outgrowth and synaptogenesis [42]. This may explain the significantly reduced MAP2 and GAP43 expression in differentiating cells.
The SREBF1 KO clone showed altered differentiation and survival compared with the mock control clone. Srebf1 is involved in lipogenesis, so may be crucial for dendritogenesis [43]. Others have shown that Srebf1c KO causes peripheral neuropathy in mice [44] and that SREBF1 is crucial for cancer cell growth and viability [45,46]. The observed defects in our SREBF1 KO clone are thus in line with its known protein functions.
Further, ABCD1 deficiency reduced proliferation and increased apoptosis in undifferentiated cells. In contrast, neuronal-like network formation was only slightly altered and mRNA expression of many neuronal markers was mostly upregulated during differentiation. These findings may be explained by a compensatory effect of ABCD2 in the KO clone which might be launched during neuronal cell fate induction [47,48]. In the X-ALD zebrafish abcd1 KO model, oligodendrocytes showed increased apoptotic rates [36]. We observed a comparable defect in our neuronal-like ABCD1 KO clone.
Along with the difficulty pinpointing HSCR-relevant candidate genes, the complex pathogenesis of HSCR is multifactorial [8,10]. To get a complete pathophysiological picture in future studies, rare coding variants (SNVs) cannot be studied alone; structural alterations (CNVs), common non-coding risk alleles, and rare coding variants affecting ENS-relevant genes should also be considered. To further strengthen the importance of our four selected candidate genes in HSCR, transcriptome analyses and protein analyses in genome-edited cell clones must be performed. In addition, genome-edited primary ENS-derived cells or patient-derived iPS cells could reveal how the four genes contribute to underlying disease-causing pathomechanisms. Finally, rescue experiments using isogenic iPS cell controls could functionally validate selected candidate genes.
Taken together, we have identified four novel HSCR risk genes based on two patients with L-HSCR by evaluating trio WES data and focusing on genes that harbor rare SNVs. We corroborated the potential disease-relevance of three selected genes using various in vitro analyses ( Fig 5) and suggest that each of the identified variants might contribute to disease risk based on the multifactorial origin of HSCR. Our approach represents a suitable tool identifying candidate genes and evaluating their disease relevance in enteric neuropathies such as HSCR. We plan to adapt our approach to larger studies of candidate genes by up-scaling and automating certain steps. In this way, we hope to understand the complex genetic architecture underlying HSCR and to facilitate the development of novel therapies.

Ethics statement
The parents provided written informed consent for genetic and molecular analyses, including WES. The study was approved by the Ethical Board of the Medical Faculty of the University Hospital Heidelberg (S509/2012).

Candidate gene identification and selection
HSCR patients. Both HSCR patients were full-term neonates with a birthweight of 3,350 g (patient I) and 3,470 g (patient II), and presented with long-segment aganglionosis (L-HSCR) at least up to the colon transversum. In patient I, laparoscopic-assisted transanal endorectal pullthrough (TERPT) with simultaneous colostomy closure was performed at 14 weeks of age at the Pediatric Surgery Clinic (University Hospital Heidelberg, Germany). Patient II underwent an assisted pull-through Swenson procedure 8 months after diagnosis in another clinic. After colostomy closure, the child suffered persistently from intestinal obstruction with severe constipation. TERPT was repeated in Heidelberg at one year of age after persistent aganglionosis was diagnosed.
WES and variant calling. In a trio design, patients with HSCR and their non-affected parents were analyzed by WES. The Agilent SureSelect V4 kit (Agilent Technologies, Santa Clara, USA) was used to prepare the sequencing library. Sequencing was performed on the Illumina HiSeq 2500 platform (Illumina, San Diego, USA) as described in the Supplementary (S1 Text). Filtered variants for both patients (patient I: 11 genes, patient II: 14 genes) are shown in S1 Table and S2 Table. Transcriptomics analysis in murine tissues. For microarray profiling, embryonic regions of interest (pre-migratory vagal NCCs at stages E8.75 and E9.5; embryonic gut at E13.5, without stomach) were prepared and collected in 1 mL of TRIzol reagent (Thermo Fisher Scientific, Waltham, USA). Embryonic tissues from each litter were pooled and three litters were analyzed per stage. Total RNA was extracted according to the manufacturer's instructions. For transcriptomics analyses, 500 ng of total RNA was used for evaluation with the mouse Clariom S array (Thermo Fisher Scientific) as described in the Supplementary (S1 Text).

Prioritization of novel HSCR candidate genes
CADD scores of single nucleotide variants were calculated using the CADD model GRCh37-v1.4 (https://cadd.gs.washington.edu/snv), while CADD version 1.3 was installed locally for indels (S1 Table, S2 Table,   The relevance of filtered candidate genes in neurological diseases was assessed based on published literature and the Genetic Association and the Disease Gene databases (https:// geneticassociationdb.nih.gov/; https://www.disgenet.org/).

Genetic evaluation of candidate genes
Available WES and whole genome sequencing (WGS) data of 767 HSCR patients [8,49] from groups of the International HSCR Consortium (Stanislas Lyonnet, http://www.erare.eu/ financed-projects/hscr) were analyzed for the presence of rare exonic variants in the candidate genes (ATP7A, SREBF1, ABCD1, PIAS2). Common variants/SNPs (MAF >1%) were already excluded in individual studies. First, all variants, except for non-synonymous (missense mutations, stop loss and stop gain), were removed. Next, remaining variants were compared with variant data deposited in the gnomAD browser (population-matched comparison). Variants were kept, if the MAF was < 1%. To assess the putative functional relevance of the filtered variants, CADD scores were calculated using the CADD model GRCh37-v1.4 (https://cadd.gs. washington.edu/snv).

Candidate gene validation
IPA network analysis. To determine the biological context of selected candidate genes, a network analysis was performed. A combined list of filtered WES datasets (n = 25, final candidate genes from both patients), ENS-relevant genes (n = 117), and HSCR-relevant genes (n = 25) [10,16,50] was uploaded to the Ingenuity Pathway Analysis (IPA) software (Qiagen, Venlo, The Netherlands) (S3 Table) and analyzed (S1 Text).
Screening of additional WES and WGS data. To assess the potential involvement of selected candidate genes in ENS-and CNS-related phenotypes, clinical WES and WGS data of approximately 15.500 cases at Baylor Genetics Laboratories (Houston, TX, USA) were screened according to the previously applied filtering criteria. WES had been performed corresponding to previously described methods [51].
GTEx database analysis. Furthermore, publicly available human brain and colon mRNA expression profiles (GTEx database, https://www.gtexportal.org/home/) of the four selected candidates were considered.
Protein expression analyses in human tissues. For expression analyses in human specimens, formalin-fixed paraffin-embedded (FFPE) colon tissue of a fetus acardius amorphus (25 th week of gestation) was used [52]. FFPE sections were stained by immunohistochemistry (S1 Text, S9 Table, S10 Table). A summary of the complete study approach is illustrated schematically in S1 Fig.

Candidate gene characterization
Candidate genes were characterized using SHSY5Y cells (cultivated and processed as indicated in the Supplementary (S1 Text, S8 Table, S11 Table, S12 Table)), including CRISPR/Cas9 genome editing and functional analyses.
Expression analyses of genome-edited clones. For mRNA expression analyses of SHSY5Y clones, cells were harvested and collected in TRIzol. Total RNA was extracted using a modified version of the RNAqueous-Micro Total RNA Isolation Kit protocol (Thermo Fisher Scientific). Protein and mRNA expression analyses are described in the Supplementary (S1 Text, S8 Table, S9 Table, S10 Table).
Functional in vitro assays. Migration of undifferentiated and differentiated (7+1d) SHSY5Y cells was investigated using Boyden chamber assays as outlined in the Supplementary (S1 Text).
Proliferation of undifferentiated and differentiated (7+1d) SHSY5Y clones was assessed using the BrdU incorporation Kit (Roche, Basel, Switzerland). Cells were labeled with BrdU (10 μM BrdU) for 10 h under standard cultivation conditions. The percentage of apoptotic cells in undifferentiated and differentiated (7+7d) SHSY5Y clones was determined using the in situ cell death detection Kit (TUNEL, Roche). In both assays, nuclei were counterstained with Hoechst 33342 prior to mounting with Vectashield and imaged with the automated inverted microscope DMI4000B. Ten fields of view per cell clone (20 × magnification) were analyzed by ImageJ software 1.52i (National Institutes of Health, USA). The mean percentage of BrdU + or TUNEL + cells/field of view was calculated (n�3).