High-risk neuropsychiatric copy number variants are associated with convergent transcriptomic changes in human brain cells

Large, recurrent copy number variants (CNVs) are among the strongest risk factors for neuropsychiatric conditions, contributing to multiple phenotypes with overlapping psychiatric and cognitive symptoms. However, the molecular basis of this convergent risk remains unknown. We evaluated the human brain transcriptome in carriers of nine high-risk neuropsychiatric CNVs and matched non-carriers using single nucleus RNA-sequencing. Brain tissue from carriers displayed widespread disruptions of gene expression, with deletions showing greater changes than the reciprocal duplications. Functional enrichment analysis revealed changes in mitochondrial energy metabolism and synaptic function that converged across CNVs and cell types. For mirror CNVs, the direction of effects was often reversed between deletions and duplications and showed correlation with CNV gene dosage. These findings suggest that a shared pathophysiology underlies risk for convergent brain phenotypes across CNVs and point toward promising therapeutic targets.


Introduction
Copy number variants (CNVs) are chromosomal alterations consisting of a deletion or duplication of specific chromosomal segments [1,2]. This type of variant is widespread across the genome and is a significant source of functional genetic variation [1,3,4]. CNVs can range in size from a few kilobases to several megabases, and their effects on gene dosage can lead to various phenotypic outcomes [5]. Some large, rare but recurrent CNVs are associated with substantial risk for various neuropsychiatric conditions, such as autism spectrum disorder, intellectual disability, schizophrenia, and mood disorders [6-10]. Despite this association with a broad range of diagnoses, carriers of neuropsychiatric CNVs exhibit symptoms in the domains of cognition, perception, and mood. Overlapping diagnoses and symptoms among carriers of distinct CNVs suggest some shared pathophysiology [9,11], but the molecular basis of this clinical risk remains largely undiscovered.
Highly penetrant CNVs offer a unique opportunity to explore the impact of dosage changes in specific sets of genes within CNV regions. While many neuropsychiatric CNVs have been extensively studied with cell culture, mouse model, and neuroimaging methods [12-14], little is known about the cellular and molecular effects of CNVs within the human brain. Studies in human tissue have so far been limited to bulk tissue analysis [15,16]. Through advances in single-nucleus RNA-sequencing, it is now possible to interrogate the molecular effects of CNVs in individual cell types, revealing cell type-specific changes that may be masked in bulk tissue analyses.
In this study, we used single-nucleus RNA-sequencing of post-mortem human brain from individuals harboring high-risk CNVs to explore the neurobiological mechanisms through which these CNVs confer risk for neuropsychiatric illnesses. We analyzed human brain tissue samples from donors with pathogenic deletions and duplications on chromosomes 22q11.2, 16p11.2, 15q11.2, or 1q21.1, and deletions on 7q11.23, along with matched non-carriers, derived from three brain banks. This unique collection encompassed some of the most penetrant genetic risk factors for neuropsychiatric illnesses. The inclusion of four "mirror" CNVs allowed us to compare reciprocal deletions and duplications. By leveraging the resolution of our dataset, we uncovered cell type-specific gene expression changes in the human brain. These findings shed light on biological perturbations associated with these CNVs and reveal convergent mechanisms in mitochondrial energy metabolism and synapse-related function that may underlie overlapping patterns of psychopathology.

Single-Nucleus RNA-Seq Resource for High-Risk CNVs in Human Brain Cell Types
We performed single-nucleus RNA-sequencing of post-mortem human brain tissue from individuals carrying high-risk neuropsychiatric CNVs along with matched controls (Figure S1). Carriers included 13 adults (aged >10 years) and 2 infants (aged <1 year), harboring deletions or duplications on chromosomes 22q11.2, 16p11.2, 15q11.2, or 1q21.1, or a deletion on 7q11.23. CNV carriers were carefully matched with two non-carrier individuals based on brain bank, age, sex, ethnicity, and psychiatric diagnosis, resulting in a total of 28 (24 adult and four infant) non-carrier control individuals. Whenever possible, tissue from dorsolateral prefrontal cortex (dlPFC) and anterior cingulate cortex (ACC) was collected for sequencing (Figure 1A,C).
Starting with the adult samples, strict, iterative quality control was applied to remove poor-quality nuclei and samples (Methods; Figure S2,S3). We clustered the 295,299 nuclei that remained after filtering and resolved nine cell types for downstream analysis: CGE- and MGE-derived inhibitory neurons, upper- and lower-layer excitatory neurons, astrocytes, microglia, oligodendrocytes, oligodendrocyte precursor cells (OPCs), and vascular and leptomeningeal cells (VLMC)-endothelial cells (Figure 1C-D and Figure S4A-C). The distribution of cell types was consistent across brain regions, aligning with the high degree of concordance expected for the dlPFC and ACC in broad cell types [17] (Figure S5B). The infant dataset underwent a similar pipeline but was processed separately due to differences in cell type composition (Methods; Figure S6A-D). After filtering, the dataset comprised a total of 66 adult and 10 infant samples.
For differential gene expression (DGE), we used dreamlet [18], which can incorporate mixed models and accounts for differences in sequencing depths and cell type counts across samples (Figure S5C and Table S5,S6). Our DGE model included match group, brain region, and subject as random effects and CNV as a fixed effect. "Match group" represents the grouping of carriers and matched non-carrier controls and was correlated with several other demographic variables (Methods; Figure S7A). For comparison, an alternative model accounting for additional quality-related variables, such as RNA integrity number (RIN), gene detection rate, and cell type count, was highly concordant and yielded similar results (Figure S7D-E).

Figure 1: [...] pipeline. Dorsolateral prefrontal cortex (dlPFC) and anterior cingulate cortex (ACC) were dissected, and nuclei were captured and underwent single-nucleus RNA-sequencing with the 10x Genomics platform. Sequenced data were analyzed with iterative clustering and quality control. Resolved clusters were annotated via Azimuth [19,20] and expression of marker genes into nine well-resolved cell types. Differential gene expression analysis was conducted between carriers and non-carriers, accounting for brain region, brain sample number, and match group as random effects, using dreamlet [18].

DGE model captures expected changes in CNV regions
We first examined DGE within each CNV region. This step served as an important checkpoint to test whether gene expression changes reflected CNV gene dosage, as expected. We extracted all genes within each CNV region (Methods; Table S1,S2) and evaluated the DGE summary statistics from each model. Mean log fold-change (logFC) across cell types confirmed reduced expression of most genes within deletions and increased expression of most genes within duplications (Figure 2 and Figure S8). Thus, this model captured the anticipated direction of gene dosage change for each deletion and duplication studied.
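As a rough guide to the magnitude of logFC that copy number alone would predict, the following minimal sketch computes the expected value under two assumptions that are not stated in the text above: that logFC is on a log2 scale, and that expression scales linearly with the number of gene copies. The function name `expected_log2fc` is hypothetical.

```python
import math


def expected_log2fc(copy_number: int, reference_copies: int = 2) -> float:
    """Expected log2 fold-change if expression is proportional to copy number.

    Assumption (not from the paper): a simple linear dosage model relative
    to the usual two autosomal copies.
    """
    if copy_number <= 0 or reference_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log2(copy_number / reference_copies)


# Heterozygous deletion: one copy instead of two.
deletion_logfc = expected_log2fc(1)
# Duplication: three copies instead of two.
duplication_logfc = expected_log2fc(3)

print(f"deletion: {deletion_logfc:.3f}, duplication: {duplication_logfc:.3f}")
# prints "deletion: -1.000, duplication: 0.585"
```

Under this simple model the expected shift is asymmetric: a deletion halves expression (logFC of -1), whereas a duplication raises it by only about 0.58 on the log2 scale.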
To test if the DGE model captured the expected CNV gene dosage effects, we compared the difference in mean t-statistic between expression of genes within a CNV and those elsewhere in the transcriptome. This difference was significant for all CNVs (two-sided Mann-Whitney U-test, p-value <0.05) (Figure S9). This finding further affirms the reliability of the DGE model results.

Figure 2: Differential gene expression results within CNV regions. CNV regions were taken from ClinGen [21]. Genomic coordinates were taken from GRCh38; respective gene starts are shown on the x-axis. The mean logFC (y-axis) was calculated for each CNV gene across nine cell types from the respective DGE results. Genes are colored to indicate those within deletions (orange-red) and duplications (green), showing that, with few exceptions, genes from deletion models are downregulated (negative logFC) and genes from duplication models are upregulated (positive logFC). Dotted lines indicate expected logFC based on copy number.
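The comparison of t-statistics for genes inside a CNV against genes elsewhere in the transcriptome, described above as a two-sided Mann-Whitney U-test, can be sketched in plain Python as follows. This is an illustration only: it uses the normal approximation with a tie correction and no continuity correction, which may differ from the implementation the authors used, and the t-statistics in the example are invented for demonstration.

```python
import math


def average_ranks(values):
    """Return 1-based average ranks and the size of each tie group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover all values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:  # every value is identical, so there is no evidence of a shift
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


# Invented t-statistics: genes inside a deletion vs. genes elsewhere.
cnv_gene_t = [-4.1, -3.2, -2.8, -3.9, -2.5, -3.0]
other_gene_t = [0.3, -0.5, 1.2, 0.1, -1.1, 0.8, -0.2, 0.6, 1.5, -0.9]

u_stat, p = mann_whitney_u(cnv_gene_t, other_gene_t)
print(f"U = {u_stat}, significant at 0.05: {p < 0.05}")
```

A rank-based test is a natural fit here because it makes no assumption about the shape of the t-statistic distribution in either gene set.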
Previous work has also suggested that large CNVs can affect expression of nearby genes on the same chromosome arm as the CNV [22]. To explore this question in our data set, we examined the DGE t-statistics of genes in proximity to each CNV region: those on the same chromosome, on the same chromosome arm, within 5 megabases (Mb) of the CNV, and within 1 Mb of the CNV (Figure S9). We observed some evidence of disrupted gene expression near CNV regions. These effects were more prominent in duplications than in deletions. However, most of the DEGs lay far outside of the CNV region and surrounding areas. Thus, DGE results captured widespread effects beyond the CNV and surrounding genomic regions.

Deletions have greater impact on the transcriptome than reciprocal duplications
Having confirmed that the DGE model captured expected differences in gene copy number, we proceeded to investigate changes across the entire transcriptome. Our DGE analysis revealed thousands of differentially expressed genes (DEGs) spanning cell types and CNVs. Across these dimensions, we compared the number of DEGs, at an FDR threshold of 0.1, as a measure of transcriptomic impact for each CNV. We consistently observed higher numbers of DEGs in deletions compared to the corresponding duplications. The largest numbers of DEGs were detected in carriers of deletions on 22q11.2 (4692 genes), 7q11.23 (3876 genes), 16p11.2 (1028 genes), and 1q21.1 (9990 genes). In contrast, the lowest numbers of DEGs were detected in samples carrying duplications on 15q11.2 (15 genes), 1q21.1 (8 genes), 22q11.2 (58 genes), and 16p11.2 (23 genes) (Figure 3A). Comparing deletions and reciprocal duplications, we found significantly greater numbers of DEGs in deletions at FDR <0.1 across all cell types (Figure 3B). A greater impact of deletions aligns with phenotypic expectations given the typically higher penetrance of deletions compared to duplications [9,23,24] and with the more extensive structural changes reported in neuroimaging studies of individuals with deletion CNVs [25].
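Counting DEGs at an FDR threshold of 0.1, as done above, amounts to adjusting per-gene p-values for multiple testing and counting those below the cutoff. The sketch below assumes the Benjamini-Hochberg procedure, which the text does not name explicitly; the function names and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_degs(p_values, fdr_threshold=0.1):
    """Count genes whose adjusted p-value falls below the FDR threshold."""
    return sum(q < fdr_threshold for q in benjamini_hochberg(p_values))


# Invented per-gene p-values for one CNV in one cell type.
example_p = [0.0001, 0.004, 0.019, 0.03, 0.2, 0.5, 0.7, 0.9]
print(count_degs(example_p))  # prints 4
```

Running the same count for every CNV and cell type would give the grid of DEG numbers that the deletion versus duplication comparison is based on.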
Our single-nucleus RNA-seq approach also allowed us to delineate the impact of CNVs on distinct cell types. For each CNV, we investigated the number of DEGs detected within each of the nine cell types we studied. Notably, we observed the greatest differences across cell types in the 7q11.23 and 22q11.2 deletions. Astrocytes from the 22q11.2 deletion showed a DEG count up to 9.08 times greater than that of the other eight cell types, and 18.59% of genes that passed expression filtering were differentially expressed (Table S7). This was the greatest proportion of genes differentially expressed across all models studied, with the exception of the 1q21.1 deletion. Upper- and lower-layer excitatory neurons from the 7q11.23 deletion showed 3.23-22.05 times more DEGs than the other seven cell types, and 12.40% of genes that passed quality control were differentially expressed (Table S7). The pronounced variation in DGE across cell types in the 22q11.2 and 7q11.23 deletion carriers suggests cell type-specific disruptions in these high-risk deletions, implicating astrocytes and excitatory neurons, respectively.
Investigating the intersection of DEGs across the dataset, we found that most DEGs were unique to each cell type (Figure 3C). However, neurons exhibited a higher intersection of DEGs with other neuron types. Across all CNVs, upper- and lower-layer excitatory neurons shared the largest intersection of DEGs as well as the largest total number of DEGs. Astrocytes showed the greatest number of unique DEGs (n = 1246), but still shared some DEGs with other glial populations, such as oligodendrocytes, microglia, and OPCs (Figure 3C). The large number of cell type-specific DEGs highlights the relevance of using single-cell genomics to elucidate the complex impact of these high-risk variants.

Disruption of the transcriptome correlates with deletion size
To quantify the relationship between the genes within each CNV region and the disruption throughout the rest of the transcriptome, we scored CNVs for gene content and loss-of-function intolerance. When exploring deletions and duplications separately, deletions showed a correlation between the number of genes in the CNV region and the number of DEGs (Figure S10A). Duplication size did not correlate with the number of DEGs (Figure S10B). We excluded the 1q21.1 CNV from this analysis due to variable involvement of the proximal TAR region, leading to an inflated number of DEGs in the 1q21.1 deletion carriers we studied.
To refine this test, we utilized a metric known as the loss-of-function observed over expected upper bound fraction (LOEUF), which is a continuous measure of genetic tolerance to loss-of-function mutations [26]. A higher inverse-LOEUF score (1/LOEUF) indicates intolerance to loss of function and suggests that the gene is essential for normal function and survival [27,28]. For each of our DGE models, we took the sum of the inverse-LOEUF scores of all genes retained from the CNV region after expression filtering, thereby ensuring that the genes included in the score were well expressed in that cell type. We then correlated this sum with the number of DEGs, separately for deletions and duplications. Among deletions (again excluding 1q21.1), we observed a positive correlation between the number of DEGs and the summed inverse-LOEUF scores (Figure S10C). In contrast, no correlation was detected among duplications. Thus, deletions with greater aggregated constraint metrics are associated with larger transcriptomic disruptions.
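The aggregated constraint score described above, the sum of 1/LOEUF over CNV genes that pass expression filtering, can be illustrated with a short sketch. The gene identifiers, LOEUF values, and DEG counts below are invented, and the use of a Pearson coefficient is an assumption, since the type of correlation is not specified in the text.

```python
import math


def summed_inverse_loeuf(loeuf_by_gene, expressed_genes):
    """Sum 1/LOEUF over CNV genes retained after expression filtering.

    Genes without a LOEUF value are skipped (an assumption of this sketch).
    """
    return sum(1.0 / loeuf_by_gene[g] for g in expressed_genes if g in loeuf_by_gene)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical LOEUF values for genes in one CNV region.
loeuf = {"GENE_A": 0.5, "GENE_B": 0.25, "GENE_C": 2.0}
# Only GENE_A and GENE_B pass expression filtering in this cell type.
score = summed_inverse_loeuf(loeuf, ["GENE_A", "GENE_B"])
print(score)  # prints 6.0

# Hypothetical per-model scores and DEG counts across several deletions.
scores = [6.0, 12.5, 20.0, 31.0]
deg_counts = [40, 150, 900, 2100]
print(round(pearson_r(scores, deg_counts), 3))
```

Because the score is recomputed per cell type from the genes that pass filtering, the same CNV can carry a different aggregated constraint value in each DGE model.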

Functional enrichments reveal convergence on mitochondrial energetics and synaptic terms across CNVs
A key objective of our study was to assess the functional impact of changes in gene dosage across CNVs and cell types and to explore functional convergences. To accomplish this task, we performed rank-based functional enrichment testing via GSEA for each of the 81 DGE models (9 CNVs by 9 cell types). To more comprehensively interrogate convergent signals, we evaluated significant gene ontology (GO) terms in three ways: across all 81 models (Figure 4A), for each CNV across cell types (Figure S11A), and for each cell type across CNVs (Figure 5A).
To uncover major convergent functional themes, we first explored the most frequently implicated terms with FDR<0.1 from GSEA across all 81 models. The most frequently implicated terms included "mitochondrial ATP synthesis coupled electron transport", "cellular respiration", "oxidative phosphorylation" (OXPHOS), "regulation of synaptic plasticity", and "synapse organization" (Figure 4A). Nine of the top ten most frequent terms were related to two major themes: mitochondrial energy metabolism and synaptic function.
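The tally of how often each term reaches FDR<0.1 across the 81 models can be sketched as below. The record layout (CNV, cell type, term, FDR) and the example rows are hypothetical and stand in for real GSEA output.

```python
from collections import Counter


def most_frequent_terms(gsea_results, fdr_threshold=0.1, top_n=10):
    """Count, per GO term, the number of models in which it is significant.

    gsea_results: iterable of (cnv, cell_type, term, fdr) tuples, where each
    (cnv, cell_type) pair identifies one DGE model.
    """
    counts = Counter()
    seen = set()
    for cnv, cell_type, term, fdr in gsea_results:
        key = (cnv, cell_type, term)
        # Count each term at most once per model, and only if significant.
        if fdr < fdr_threshold and key not in seen:
            seen.add(key)
            counts[term] += 1
    return counts.most_common(top_n)


# Invented example rows, not results from the study.
example_results = [
    ("22q11.2 del", "Astrocytes", "oxidative phosphorylation", 0.01),
    ("16p11.2 del", "Microglia", "oxidative phosphorylation", 0.04),
    ("7q11.23 del", "OPCs", "oxidative phosphorylation", 0.001),
    ("22q11.2 dup", "Astrocytes", "synapse organization", 0.03),
    ("1q21.1 del", "OPCs", "synapse organization", 0.08),
    ("15q11.2 del", "Microglia", "synapse organization", 0.6),
]

print(most_frequent_terms(example_results))
# prints [('oxidative phosphorylation', 3), ('synapse organization', 2)]
```

Restricting the input rows to one CNV or one cell type before counting would give the per-CNV and per-cell-type views of the same tally.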
Within these themes, we found that the direction of enrichment often diverged between deletions and reciprocal duplications, suggesting a CNV gene dosage effect. Among synapse-related terms, CNVs on 22q11.2 and 1q21.1 showed the most prominent directional divergence between the respective deletion and duplication: synaptic terms were upregulated in the 22q11.2 deletion and 1q21.1 duplication and downregulated in the 22q11.2 duplication and 1q21.1 deletion (Figure 4B). Mitochondrial energy metabolism terms showed the most distinct reversal of direction in the 16p11.2 CNV: the deletion showed downregulation and the duplication showed upregulation of genes enriched for these terms. The 7q11.23 deletion showed the strongest upregulation of energy metabolism-related terms across cell types of any individual CNV.
Notably, OXPHOS was the most frequently implicated term, with FDR<0.1 in 47 of the 81 models across CNVs and cell types. OXPHOS was significantly enriched in at least one cell type for all nine CNVs, and the direction of enrichment stayed mostly consistent across cell types for each CNV. The strongest OXPHOS enrichments were associated with the 7q11.23 deletion, with consistent upregulation across cell types, and the 16p11.2 deletion, with consistent downregulation across most cell types. These two deletions showed OXPHOS enrichment with an FDR<0.001 for most cell types (Figure 4C). Enrichment testing of genes within four of the classical complexes of the electron transport chain (complex II was excluded because it consists of only four genes, which limits the power to test for enrichment) showed that OXPHOS enrichments extended across all four complexes tested. Specifically, the 7q11.23 and 16p11.2 deletions showed up- and downregulation, respectively, for complexes I-V, excluding complex II, across all neuronal and many glial cell types (Figure S12A-B).
To ensure that these and other findings were not a consequence of confounding or quality-related factors, we compared these results to our quality-covariate-adjusted model, which also included RNA integrity number (RIN), mean gene detection rate, and cell count of each cell type by sample as fixed effects (Methods; Figure S7D). We observed highly correlated transcriptome-wide t-statistics between the original and the quality-covariate-adjusted model (Figure S7E). Importantly, OXPHOS enrichments remained highly significant and consistent across CNVs and cell types (Figure S13B), demonstrating that the OXPHOS signal was not driven by quality-related variables.
To capture the most pervasive disruptions across cell types for individual CNVs, we next examined the most consistent functional enrichments for each individual CNV (Figure S11A). Themes of synaptic disruption and OXPHOS emerged for most individual CNVs. Some CNVs, specifically the 22q11.2 duplication and 15q11.2 deletion, showed synaptic terms as the most frequently implicated terms across cell types. On the other hand, OXPHOS was one of the most frequently implicated terms across cell types for six of the nine CNVs. For example, the most frequently implicated terms across cell types for the 16p11.2 and 7q11.23 deletions were related to mitochondrial energy metabolism, consistent with the strong enrichment signal for OXPHOS observed for these deletions (Figure 4C). We also noted that a related theme of hypoxia-related terms emerged for 22q11.2 across cell types, with the most frequent terms including "response to oxygen levels" and "cellular response to hypoxia" (Figure S11A). Lastly, although OXPHOS was not significantly enriched for 22q11.2 in most cell types, another term related to energy metabolism, "glycolytic processes", was frequently implicated across cell types (Figure S11B).

Figure 4: Convergence of functional enrichments across CNVs on synapse and mitochondrial energy metabolism. (A) Barplot showing the intersection of the top ten most frequently implicated terms (y-axis) across the 81 tests (nine cell types and nine CNVs). Nine of the top ten terms were categorized into two key themes: synapse-related terms and mitochondrial energy metabolism-related terms. One term did not fit into these categories, so it was left out of groupings. (B) Boxplot depicting normalized enrichment scores (NES) (x-axis) of synapse-related terms (aggregating three GO terms) and those related to mitochondrial energy metabolism (aggregating five GO terms). NES values are arranged on the x-axis and colored by CNV. There were observable CNV gene dosage effects, with several deletions and reciprocal duplications having opposite directions of enrichment, e.g. synapse-related terms are positive in the 22q11.2 deletion and negative in the 22q11.2 duplication. (C) Heatmap showing the OXPHOS (GO:0006119) functional enrichment results, which was the most frequent term across the nine cell types (x-axis) and nine CNVs (y-axis). The grid is colored by NES and annotated by FDR significance thresholds (FDR <0.1 = *, FDR <0.01 = **, FDR <0.001 = ***). OXPHOS is significant in all CNVs, often across cell types.

Disruption in mitochondrial energetics extends to infant CNV carriers
We also analyzed a smaller infant dataset, which included samples from the dlPFC and ACC from infant carriers of the 7q11.23 (WBS) or 15q11.2 (BP1-BP2) deletion, along with matched infant non-carriers. A parallel approach for cell type annotation and clustering (Methods; Figure S6A-D), as well as DGE via dreamlet and functional enrichment via GSEA, was applied to the infant samples (Figure S6,S15). DGE models in infants showed lower residual variance, suggesting that infant brain may be less influenced by external factors than adult brain samples (Figure S14B). Genes within the deletions were downregulated, as expected (Figure S15A). In the 7q11.23 deletion, we observed that GO terms implicated in the adult carriers were also implicated in infants, including OXPHOS (Figure S15C-D). Genes involved in OXPHOS were highly significantly (FDR <0.001) downregulated across all four neuron types in the 7q11.23 deletion in the infant brain and upregulated across several cell types in the 15q11.2 deletion (Figure S15D). Thus, while the direction of enrichment was reversed between the infant and adult samples, gene expression changes implicated OXPHOS changes in both adult and infant carriers of the 7q11.23 and 15q11.2 deletions.

Cell type-specific disruptions shared across multiple CNVs
Having explored the most consistent terms across the entire dataset and for each individual CNV across cell types, our focus next turned to identifying common signals among cell types across distinct CNVs. This analysis contextualizes alterations specific to each cell type, which are masked in bulk-tissue studies, while also revealing convergent patterns across distinct high-risk variants. As before, we examined the most frequent terms with FDR<0.1 within each cell type.
For neurons, we observed that the top three most frequently implicated terms were distinct for each neuronal cell type, but largely related to regulation of the synapse, protein quality control, and cellular communication. For CGE-derived inhibitory neurons, enrichments involved cellular communication and signaling, e.g. "signal release", "negative regulation of transport", and "cell-cell adhesion via plasma membrane adhesion molecules". MGE-derived inhibitory neurons showed enrichments in "proteasomal protein catabolic processes" and "vesicle-mediated transport in synapse". Both upper- and lower-layer excitatory neurons displayed functional enrichments associated with synaptic transmission modulation, synapse organization, and mitochondrial function (Figure 5A).
For glial cell types, many of the convergent terms also involved energy metabolism. The most frequently implicated enrichments for astrocytes included "actin cytoskeleton organization", "autophagy", and ATP metabolic processes. Microglia-specific enrichments involved cytokine responses and terms related to OXPHOS, suggesting aberrant energy metabolism and alterations in gene expression involved in inflammation and cell survival. Oligodendrocytes and OPCs exhibited enrichments in neurotransmitter transport, cell projection morphogenesis, synapse organization, and OXPHOS, which demonstrates that the pervasive disruption of synaptic and metabolic terms across CNVs extended to major glial populations (Figure 5A). Interestingly, we observed that the directions of functional enrichments again often diverged between deletions and duplications. This was most pronounced in excitatory neurons, especially in the "synapse organization" and "signal release" terms for upper- and lower-layer excitatory neurons, respectively.
Given the widespread disruptions in OXPHOS and synaptic functions across CNVs, we hypothesized that changes in gene expression for OXPHOS may coincide with changes in gene expression for the synapse in neurons. Using the SynGO database to select synapse-related terms, we found that presynapse was the most convergent term across the dataset, with an FDR<0.1 in 42 of the 81 models (Figure S16A,S16C). We observed a significant, linear relationship between the enrichment scores for OXPHOS and presynapse (Figure 5B). Postsynaptic density was also among the most convergent SynGO terms, with FDR<0.1 in 37 of the models (Figure S16A). Similarly, we observed a significant, linear relationship between the enrichment scores for OXPHOS and postsynaptic density, although this relationship was less pronounced than for the presynapse (Figure 5C).

Mirror CNVs show gene dosage effects in synaptic and energy metabolism terms
Our collection contains four "mirror" CNVs, allowing us to test gene dosage effects between deletions and reciprocal duplications. Evaluating each CNV independently, we found that several significant functional enrichments were shared, but diverged in their direction for deletions and the reciprocal duplications. Previous work has demonstrated reciprocal changes in clinical phenotypes, brain morphology, and gene expression between carriers of deletions and duplications of the same genomic segments [28-31]. These "mirror" phenotypes suggest gene dosage-dependent effects for these mirror CNVs across the transcriptome.
To directly examine CNV gene dosage effects, we constructed gene dosage models for each mirror CNV, assigning a value of 1 for deletion carriers, 2 for non-carriers, and 3 for duplication carriers (Figure 6A). These modified DGE models test for genes that are positively or negatively correlated with CNV gene dosage for each CNV, followed by functional enrichments to evaluate terms most frequently associated with CNV gene dosage across cell types. The 22q11.2 CNV showed the largest number of CNV dosage-sensitive genes with FDR<0.1 (Figure 6B). In the 22q11.2 gene dosage model, similar to the individual CNV models, astrocytes showed the largest number of DEGs, with 1245 genes at FDR <0.1, roughly 8.5 times more than the cell type with the next greatest number of DEGs (CGE-derived inhibitory neurons with 146 DEGs).
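The dosage coding described above (1 for deletion carriers, 2 for non-carriers, 3 for duplication carriers) turns carrier status into a numeric predictor. The sketch below fits an ordinary least-squares slope of one gene's expression on that predictor. It is a deliberate simplification: the models in this study are mixed models fitted with dreamlet, including random effects that are ignored here, and the expression values are invented.

```python
# Copy-number coding taken from the text; the dictionary name is hypothetical.
DOSAGE = {"deletion": 1, "non-carrier": 2, "duplication": 3}


def dosage_slope(carrier_status, expression):
    """Least-squares slope of expression on CNV copy number.

    A positive slope means expression rises with copy number; a negative
    slope means it falls. Random effects are not modeled in this sketch.
    """
    x = [DOSAGE[status] for status in carrier_status]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(expression) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("need at least two distinct dosage levels")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, expression))
    return sxy / sxx


# Invented log-scale expression values for one gene in six samples.
status = ["deletion", "deletion", "non-carrier", "non-carrier",
          "duplication", "duplication"]
expr = [4.0, 4.2, 5.0, 5.1, 5.9, 6.1]

slope = dosage_slope(status, expr)
print(f"{slope:.2f}")  # prints "0.95"
```

Coding dosage as a single numeric variable tests for a monotonic trend across deletion, non-carrier, and duplication samples, which is what lets one model capture mirrored effects.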
Functional enrichments of DEGs detected under the gene dosage models largely recapitulated our previous results for individual CNVs. The 22q11.2 gene dosage model was enriched for terms related to hypoxia response and glycolytic processes, which were both negatively correlated with gene dosage. Additionally, "regulation of synaptic plasticity" was enriched in many cell types, all negatively correlated with copy number (Figure 6C). This aligns with our previous analysis of convergent synapse-related terms for the 22q11.2 deletion and duplication (Figure 4B). Among carriers of the 16p11.2 CNV, almost all of the most frequently implicated terms across cell types were related to energy metabolism: "aerobic respiration", "OXPHOS", "mitochondrial ATP synthesis", etc. (Figure 6C). This aligns with our previous analysis of convergent terms related to mitochondrial energy metabolism in the 16p11.2 deletion and duplication (Figure 4A). Our gene dosage model thus confirmed that OXPHOS was significantly positively correlated with 16p11.2 gene dosage for most cell types. We observed comparable levels of significance, but a negative correlation with 16p11.2 gene dosage, for microglia.

Discussion
We present a unique dataset which, for the first time, investigates the impacts of several high-risk CNVs in human brain with cell type resolution. We systematically delineate major transcriptomic effects of nine recurrent, high-penetrance neuropsychiatric CNVs across cell types, with the overarching goal of shedding light on cellular pathophysiology that underlies shared phenotypes. Leveraging single-nucleus transcriptomics, we attain cellular resolution from human brain tissue masked in previous bulk tissue studies, allowing us to interrogate impacts of each CNV and converging signals within and across cell types. Using this resource, we discovered that differentially expressed genes in high-risk neuropsychiatric CNVs converged on mitochondrial energetics and synaptic terms. We also found cell type-specific changes, with the most pronounced effects in astrocytes and neurons. These findings provide valuable insights into the shared molecular underpinnings of high-risk CNVs, with major implications for future research and therapeutic targets.
Across all DGE models, we observed a consistent upregulation for duplications and downregulation for deletions of genes within the CNV regions, affirming the reliability of the DGE models we employed. We noted some disrupted gene expression in the vicinity of CNV regions, particularly in duplications, which is consistent with previous findings that large CNVs can influence the expression of nearby genes [22]. Despite some localized disruption, the majority of DEGs were located outside the vicinity of the CNVs, indicating that high-risk CNVs have broad, widespread impacts on gene expression beyond their proximal effects.
The transcriptome-wide number of DEGs detected in CNV carriers often echoed the expected level of phenotypic risk associated with those CNVs. For instance, deletions exerted a greater influence on the transcriptome than the corresponding duplications of the same genomic region. This aligns with clinical and neuroimaging studies that have shown that individuals with pathogenic deletions tend to exhibit more severe phenotypes and greater penetrance of neuropsychiatric and neurodevelopmental disorders than those with the reciprocal duplications [1,9,13,23,24]. Larger deletions, which encompass more genes and are known to be more penetrant than smaller deletions and duplications [9], exhibited the highest number of DEGs across the dataset, especially evident from the 22q11.2 and 7q11.23 deletions. We observed that the number of DEGs across the transcriptome correlated with the number of genes in the CNV region for deletions. One exception to this was the 1q21.1 deletion, the sole carrier of which showed the highest number of DEGs despite the moderate size of the distal 1q21.1 region. 1q21.1 CNV carriers in our study varied in the involvement of the proximal TAR region, with the deletion spanning both the proximal and distal region, a larger subtype of the 1q21.1 deletion often associated with more severe phenotypes [32]. Additionally, one of the three 1q21.1 distal duplications was also accompanied by a deletion in the proximal region. Due to these irregularities, the 1q21.1 CNV was not included in the CNV size and constraint analyses.
After removing the 1q21.1 CNV, the number of DEGs was significantly correlated with the number of deleted genes as well as their aggregated evolutionary constraint scores (sum of 1/LOEUF), highlighting the extensive impacts of large, rare deletions on the transcriptome of human brain cell types.
A core goal of our study was to elucidate the convergent effects of high-risk CNVs across diverse cell types in the human brain, to provide biological insight into their shared pathophysiology. Through functional enrichment analyses, we found that gene expression changes converged on functions related to energy metabolism and synaptic activity. The most frequently implicated term, both with and without correction for quality-related covariates, was OXPHOS. Gene expression changes in the OXPHOS pathway were widespread across the complexes of the electron transport chain and present across CNVs and cell types. The most pronounced effects were seen in the 16p11.2 and 7q11.23 deletions, and the direction of the changes was often correlated with CNV gene dosage, especially in the case of 16p11.2. These findings are consistent with previous studies that implicated disruptions of mitochondrial function and OXPHOS in cell culture and mouse models of the 16p11.2 deletion [33,34], 7q11.23 deletion [35,36], and 22q11.2 deletion [37,38]. Notably, disruptions in mitochondrial energetics have also been observed in CNVs not covered in our study, but with similar risk profiles, for instance in cell culture models of the 3q29.2 deletion, a significant genetic risk factor for schizophrenia and bipolar disorder [39]. Our findings, taken together with the aforementioned studies of individual CNVs, point to energy metabolism as a common thread among carriers of high-risk neuropsychiatric CNVs. This suggests that, despite potential differences in underlying mechanisms, mitochondrial dysfunction could contribute to the convergent pathophysiology of these CNVs.
The other major theme of convergence arose from synapse-related functions enriched across CNVs. Across all nine CNVs, both neuronal and glial cell types showed functional convergence on the synapse, with terms related to synaptic function and cellular communication for neurons and those related to synaptic support for glia. These results are in line with previous studies that described synaptic disruptions in model systems of the 22q11.2 deletion [60][61][62], 16p11.2 deletion 34,43,44, and 7q11.23 deletion 35,45. We found that the direction and strength of DGE signal across neurons for genes encoding components of the pre-synapse and, to a lesser extent, post-synaptic density were correlated with those of OXPHOS. Synaptic activity is highly energy demanding and relies on proper mitochondrial function 46. This correlation could therefore be explained by the critical role mitochondria play in maintaining synaptic transmission through local ATP production via oxidative phosphorylation [47][48][49][50]. We believe that the convergent changes in synaptic gene expression can at least in part be explained by correlated deficits in mitochondrial energy metabolism 51.
Leveraging the power of single-nucleus RNA-seq, we also discovered notable cell-type specific effects of high-risk CNVs, particularly in excitatory neurons and astrocytes. Excitatory neurons showed a markedly higher number of DEGs than other cell types, with the greatest number of total DEGs as well as the greatest proportion of the transcriptome differentially expressed. The 7q11.23 deletion exhibited notably larger effects in excitatory neurons compared to other cell types. This result aligns with previous reports that have described marked impacts on excitatory neurons carrying the 7q11.23 deletion in cellular and mouse models 35,52. Astrocytes also stood out, with the greatest number of unique DEGs among the nine cell types. Across the nine CNVs, functional signals in astrocytes converged on terms indicative of cellular stress responses, such as actin cytoskeleton organization and autophagy. The enrichment in actin cytoskeleton organization suggests that astrocytes may be undergoing structural remodeling 53. Enrichment in autophagy-related processes in astrocytes could be a result of metabolic stress from changes in mitochondrial energy metabolism. These structural changes in astrocytes may serve to modulate neuroinflammatory responses, protect against oxidative stress, and influence neuronal survival [54][55][56]. Astrocytes engage in essential cellular communication with neurons and are critical for supporting neuronal survival and synaptic activity 57,58. A recent study described a linked expression pattern of synaptic genes between astrocytes and neurons in human brain in the context of schizophrenia 59. The astrocyte-specific changes in our study might represent an adaptive response to the pronounced changes in energy metabolism and synaptic function in neurons.
The CNV that showed the greatest cell-type specific effects in astrocytes was the 22q11.2 deletion, which exhibited a substantially higher number of DEGs in astrocytes compared to other cell types for both the individual CNV model and the 22q11.2 gene dosage model. The 22q11.2 deletion showed significant upregulation of terms related to hypoxia-induced stress and glycolysis, enriched not only in astrocytes but also across other cell types. Increased energy demand, e.g. from increased neuronal excitability or reduced oxygen conditions, is known to induce a shift from aerobic (OXPHOS) to anaerobic energy metabolism (glycolysis), particularly in astrocytes 60,61. The upregulation of genes related to hypoxia stress response and glycolysis indicates that this may be the case in the 22q11.2 deletion. This observed upregulation in glycolysis aligns with previous work that found a higher activity of glycolysis compared to OXPHOS in children carrying the 22q11.2 deletion 62. Our findings are additionally in line with previous studies that described increased neuronal excitability 63, impaired mitochondrial function 37,38,64, and increased glycolysis biomarkers in the 22q11.2 deletion 62. Taken together with the above-mentioned widespread changes in OXPHOS, these findings further support the notion of convergent energy metabolism changes in high-risk CNV carriers.
This study has several important limitations. Although we screened more than a thousand individuals across three brain banks, the number of carriers was limited by the relative rarity of neuropsychiatric CNVs in the population. While the results likely capture the most prominent gene expression changes, the sample is underpowered to reliably detect changes unique to individual CNVs. Although carriers were selected based on overlap with known recurrent high-risk CNVs, the exact breakpoints cannot be precisely determined based on SNP array data alone. This does not affect most DGE analyses but may complicate those that depend on gene location relative to the CNV. Our DGE analysis was limited in resolution to nine major cell classes selected for frequency and even representation across the samples. Rarer cell types were present but were not resolved in differential gene expression analyses. While gene expression changes in post-mortem tissue provide valuable insight, the observed changes may include both those that are caused by the CNV and those that result from the associated illnesses and treatments. By matching carriers and non-carriers on psychiatric diagnosis and treatment, we attempted to control for those variables, but such matching is never perfect. Studies of post-mortem tissue will require further experimental validation to robustly disentangle gene expression changes caused by the CNV from secondary effects.
Our study provides in-depth analysis of transcriptomic effects of nine recurrent high-risk neuropsychiatric CNVs, utilizing single-nucleus transcriptomics to achieve cell-type resolution in human brain. These findings offer valuable insights into shared pathophysiological mechanisms underlying these high-risk CNVs and their broader impacts on gene expression beyond proximal effects. Thousands of genes were differentially expressed across CNVs, with most DEGs located outside of the respective CNV regions and greater transcriptomic disruptions in deletions compared to reciprocal duplications. We found that gene expression changes associated with high-risk CNVs converge on mitochondrial energy metabolism, most markedly in the OXPHOS pathway, and on synaptic functions. Within neurons, changes in OXPHOS and synapse-related gene expression were correlated, although our data do not allow for a mechanistic explanation. These findings advance the understanding of the neurobiology associated with CNVs and provide insight into potential convergent biology that could explain the phenotypic convergence of these CNVs through shared pathophysiology.
Future research will need to functionally validate these findings of convergent mitochondrial disruptions across cell types and their connection to the synapse. Larger cohorts are essential to further validate and elucidate the complex genetic and molecular landscape of these rare, large CNVs. Additionally, it would be important to expand this work beyond the two cortical brain regions described in this study. Neuroimaging studies have identified many changes in subcortical regions tied to CNVs 65,66, but our dataset was limited to dlPFC and ACC. Studying other brain regions could yield further insights and provide a better basis for integration with neuroimaging work, which would seek to answer how molecular changes may alter brain physiology and lead to clinical phenotypes. A further avenue to interrogate these CNVs would be to incorporate multi-omic approaches, e.g. proteomics or ATAC-seq. Multi-omics could offer promising insights into the landscape of CNV impacts, i.e. by pairing gene expression with other genomic assays. This future work will provide a deeper understanding of CNV biology and the relationship between mitochondrial and synaptic dysfunction, paving the way for the development of biomarkers and more effective diagnostic and therapeutic strategies for individuals with high-risk neuropsychiatric CNVs.

Matching approach and demographics
Each individual harboring a known high-risk neuropsychiatric CNV, as defined above, was matched as closely as possible to at least two presumptive non-carrier individuals from the same brain bank with similar demographic characteristics, based on sex, ethnicity, age at death, psychiatric diagnosis, and drug profile in toxicological screening (when available). In cases where two CNV carriers had very similar demographic features, the same non-carrier individuals served as matches for both. This matching approach created groups of three or more individuals with similar characteristics, which we refer to as 'match groups'. Demographic variables across the dataset are summarized in Table S3.

Regions of interest
Whenever available, dorsolateral prefrontal cortex (dlPFC) and anterior cingulate cortex (ACC) were dissected. 50 mg of frozen tissue were used for nuclei isolation and single-nucleus RNA-sequencing, while another 50 mg were used for RNA isolation, followed by bulk RNA-sequencing and measurement of RNA integrity numbers (RIN). Quality-related variables across the dataset are summarized in Table S4.

Nuclei isolation
50 mg of frozen brain tissue were dissociated using Singulator 100 (S2 Genomics) in a prechilled nuclei isolation cartridge, following the Extended Nuclei Isolation protocol provided by the manufacturer. Nuclei were collected by centrifugation at 500g for 5 min at 4°C in a 15 mL tube, washed once with 3 mL nuclei storage reagent (NSR), then washed with 3 mL washing buffer (PBS + 1% BSA + 0.2 U/μL RNase inhibitor) and filtered through a 30 μm MACS SmartStrainer (Miltenyi Biotec) twice, resuspended in 800 μL washing buffer, and filtered through a 40 μm Flowmi cell strainer (Bel-Art) before loading.

Nuclei capture and library construction
Nuclei were captured and single-nucleus RNA-sequencing libraries were prepared following the 10X Genomics protocol CG000315 RevC: Chromium Next GEM Single Cell 3' Reagent Kits v3.1 (Dual Index). After washing and nuclei counting, approximately 8,000 nuclei from each sample were loaded on the 10X Genomics Chromium Controller. Reverse transcription and cDNA amplification were done following the manufacturer's protocol. 3' mRNA-seq gene expression libraries were prepared from 100 ng cDNA on a Sciclone G3 Liquid Handling Workstation (Revvity).

Next Generation Sequencing
Libraries were pooled and sequenced on a NovaSeq S4 100-cycle flow cell with 2% PhiX spiked in. The paired-end dual index sequencing run was set up as Read 1 (28 cycles), i7 Index (10 cycles), i5 Index (10 cycles), and Read 2 (90 cycles).

Single-Nucleus RNA-Sequencing Data Processing
Using the 10X Genomics Cell Ranger software (v 7.0.0) 78, we demultiplexed the raw base call (BCL) files. The FASTQ files generated by Cell Ranger were aligned to the human reference genome GRCh38.p14 to create raw feature-barcode count matrices.

Quality Control, Nuclei Filtering, and Clustering
All analyses were performed with R version 4.2.2. Data from adult (>10 years old) and infant (<1 year old) brain samples were processed separately. Starting with the adult data, we converted the single-nucleus RNA-sequencing counts into a Seurat object, which we then iteratively filtered and visualized with uniform manifold approximation and projection (UMAP). To ensure data quality and reliability, we implemented a series of rigorous quality control and filtering steps (Figure S2). Utilizing SoupX (v 1.6.2) 79, we eliminated ambient RNA using the default values provided by the software. Sample outliers were determined by Pearson correlation analysis of the normalized count matrix after pseudobulking across the 36,601 mapped genes. We removed two non-carrier samples (26 and 67) as they showed correlation values ≤0.7, the lowest overall correlations compared to the other 67 samples (Figure S3A).
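The sample-outlier check described above can be illustrated with a minimal Python sketch (the study itself was run in R). It sums nuclei into one pseudobulk vector per sample, applies a log-CPM normalization, and flags samples whose median Pearson correlation with the other samples is at or below 0.7. The log-CPM normalization, the use of the median as the summary statistic, and all function names are assumptions made for illustration; only the 0.7 cutoff comes from the text.

```python
import math
import statistics

def pseudobulk(nuclei_counts):
    """Sum per-nucleus count vectors (one list per nucleus) into one per-sample vector."""
    return [sum(gene_column) for gene_column in zip(*nuclei_counts)]

def log_cpm(counts):
    """Log2 counts-per-million with a pseudocount of 1 (assumed normalization)."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + 1.0) for c in counts]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def flag_outliers(samples, threshold=0.7):
    """Return IDs of samples whose median correlation with all others is <= threshold.

    samples: dict mapping sample ID -> normalized expression vector.
    """
    flagged = []
    for sample_id, vector in samples.items():
        correlations = [pearson(vector, other)
                        for other_id, other in samples.items()
                        if other_id != sample_id]
        if statistics.median(correlations) <= threshold:
            flagged.append(sample_id)
    return flagged
```

In practice the vectors passed to `flag_outliers` would be the output of `log_cpm(pseudobulk(...))` for each sample.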
We employed scDblFinder (v 1.12.0) 68 to identify and filter out doublets in our data. scDblFinder was run on each sample using the recommended default values. We did not observe a noticeable concentration of doublets in any cluster (all <50% doublets) (Figure S2A). We thus removed any cell classified as a doublet according to scDblFinder.
We established the following criteria for initial nuclei inclusion: nuclei with >200 unique genes detected, >500 transcripts, and a fraction of mitochondria-encoded genes (percent.mt) <5.0% (Figure S2C). Using the Seurat package (v 4.3.0) 19, the data underwent normalization, dimensionality reduction, and unsupervised clustering with a Louvain clustering resolution of 1.0.
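As a minimal illustration of these inclusion criteria, the Python sketch below applies the three thresholds to a single nucleus. Computing percent.mt as the share of counts from genes whose names start with "MT-" follows common practice and is an assumption here, as are the function names.

```python
def percent_mt(counts_by_gene):
    """Percentage of a nucleus's counts that come from 'MT-' genes (assumed definition)."""
    total = sum(counts_by_gene.values())
    if total == 0:
        return 0.0
    mito = sum(count for gene, count in counts_by_gene.items()
               if gene.startswith("MT-"))
    return 100.0 * mito / total

def passes_qc(n_genes, n_transcripts, pct_mt,
              min_genes=200, min_transcripts=500, max_pct_mt=5.0):
    """True if a nucleus has >200 genes, >500 transcripts, and percent.mt <5.0%."""
    return (n_genes > min_genes
            and n_transcripts > min_transcripts
            and pct_mt < max_pct_mt)
```

All three comparisons are strict, matching the ">" and "<" signs in the criteria above.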

Harmony Integration and Additional Nuclei Filtering
To address potential batch effects biasing clustering across the dataset, we applied the Harmony algorithm (v 0.1.1) 80. We used Harmony to control for match group, as it discerned cell type clusters with the best resolution. This was expected, since match groups were designed to control for the commonly observed confounding effects of sex, ethnicity, psychiatric diagnosis, and age. Utilizing match group as our integration variable with 50 dimensions, we improved cluster resolution between distinct cell types.
However, we still encountered noticeable bridging between major cell classes in the UMAP visualization, suggestive of persistent debris or poor-quality nuclei. The bridged clusters were also the clusters with the highest percent.mt, suggesting that this bridging was an artifact of debris with high mitochondrial content. We therefore iterated quality filtering and Harmony integration, removing the clusters with the highest mean percent.mt. We started with clusters exceeding 2.5%, then removed clusters with percent.mt exceeding 2.0%, and reran Harmony after both steps. These iterations of cluster removal by percent.mt resulted in fewer bridged clusters and better cell type resolution (Figure S2B).
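The two-step cluster removal can be sketched in Python as follows. The sketch computes the mean percent.mt per cluster and drops whole clusters above each threshold in turn. The `recluster` callback stands in for rerunning Harmony and clustering, which are not reproduced here; it and the record layout are hypothetical.

```python
from collections import defaultdict

def mean_percent_mt_by_cluster(nuclei):
    """Mean percent.mt per cluster; nuclei are dicts with 'cluster' and 'percent_mt'."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for nucleus in nuclei:
        sums[nucleus["cluster"]] += nucleus["percent_mt"]
        counts[nucleus["cluster"]] += 1
    return {cluster: sums[cluster] / counts[cluster] for cluster in sums}

def drop_high_mt_clusters(nuclei, threshold):
    """Remove every nucleus belonging to a cluster whose mean percent.mt exceeds threshold."""
    means = mean_percent_mt_by_cluster(nuclei)
    return [n for n in nuclei if means[n["cluster"]] <= threshold]

def iterative_filter(nuclei, thresholds=(2.5, 2.0), recluster=None):
    """Apply thresholds in order; optionally re-integrate and recluster after each step."""
    for threshold in thresholds:
        nuclei = drop_high_mt_clusters(nuclei, threshold)
        if recluster is not None:
            # Placeholder for rerunning integration and clustering (not implemented here).
            nuclei = recluster(nuclei)
    return nuclei
```

Filtering at the cluster level, rather than per nucleus, removes low-quality populations as a unit while leaving the per-nucleus 5% cutoff unchanged.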

Cell Type Annotations and Sample Filtering
We utilized multiple approaches to annotate cell types and checked for consistency across approaches. First, we explored the expression of top marker genes associated with expected cell types. Then we employed Azimuth software (v 0.4.6) to annotate cell types, using the human motor cortex data set from the Allen Institute for Brain Science as a reference 19,20.
In picking cell type resolution, we aimed to achieve a high level of biologically meaningful resolution, while ensuring robust representation across all samples. Cells were annotated by identifying the predominant cell type within each Seurat cluster, based on the subclass provided by Azimuth. Although Azimuth allowed us to resolve cell types in fine detail, we chose to limit our analysis to nine broad cell types based on cell type counts across samples, grouping some of the smaller populations of neurons and glia (Figure S4A). We settled on nine well-represented cell types: astrocytes, microglia, oligodendrocytes, OPCs, vascular leptomeningeal cells (VLMCs) and endothelial cells, upper- and lower-layer excitatory neurons, and medial ganglionic eminence (MGE)- and caudal ganglionic eminence (CGE)-derived inhibitory neurons (Figure 1E). VLMCs and endothelial cells were grouped together as they both had very low cell counts and serve similar biological functions. Low cell counts of VLMC and endothelial cell types are not unexpected in dlPFC and ACC 81. We compared these annotations to marker gene expression to ensure confidence (Figure S4A-C). We observed similar proportions of the nine broad cell types between dlPFC and ACC, after normalizing by the number of samples from each brain region (Figure S5A-B), consistent with prior work 17.
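The cluster-level annotation described above amounts to a majority vote over per-nucleus reference labels, followed by grouping into broad classes. The Python sketch below shows one way this could look. The subclass strings and the `BROAD_GROUPS` mapping are illustrative placeholders, not the labels or grouping table used in the study.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from fine subclass labels to broad classes.
BROAD_GROUPS = {
    "VLMC": "VLMC_Endo",
    "Endo": "VLMC_Endo",
}

def annotate_clusters(nuclei):
    """Assign each cluster the most frequent predicted subclass among its nuclei.

    nuclei: iterable of (cluster_id, predicted_subclass) pairs.
    """
    votes = defaultdict(Counter)
    for cluster_id, subclass in nuclei:
        votes[cluster_id][subclass] += 1
    return {cluster_id: counter.most_common(1)[0][0]
            for cluster_id, counter in votes.items()}

def to_broad(subclass):
    """Map a subclass to its broad class; unmapped labels are returned unchanged."""
    return BROAD_GROUPS.get(subclass, subclass)
```

Assigning one label per cluster smooths over individual mislabelled nuclei, at the cost of hiding rare subtypes that never form the majority of a cluster.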
After classifying cell types, we filtered out samples with low neuron numbers. We removed sample 30, which contained fewer than 20 neurons across all four neuron types (Figures S3C and S5C). Following the same criteria for outlier detection as used before, we analyzed Pearson correlations between samples on the normalized pseudobulked expression matrix across the 36,601 genes (Figure S3B). Sample 30 also had low correlations across the 67 samples and, given the underrepresentation of neuronal cell types, was excluded on this basis. After removing the outlier sample, each cell type had more than 10 cells in each sample, except VLMCs.

Testing of Cell type Proportion
To rule out major cell type shifts and detect potential biases in the relative cell type abundance, we calculated the proportion of each cell type using the total cell count for each sample as the denominator and the respective cell counts for each of the nine cell types as the numerator. Within each cell type, we ran a linear mixed model using the lme4 package (linear mixed-effects models) 82, following the formulas:

celltype_count ~ CNV + (1|BrainRegion) + (1|BrainNumber) + (1|Match_group)
We used square root and log transformations on the cell type proportions and counts, respectively, for the models to meet normality. The transformation for each cell type and model was chosen based on the Shapiro-Wilk test, requiring a p-value >0.05 for the residuals. We tested for the effect of CNV with match group, brain number, and region as random effects, since they are nested in the study design. We did not observe significant cell type shifts across most CNVs. One exception was the 15q11.2 deletion, which showed significant differences in cell type proportion for most cell types. The cell type counts for the 15q11.2 deletion, however, only showed significantly higher numbers of microglia and oligodendrocytes. These two cell types likely drive the significant results across proportions. Additionally, OPCs in the 22q11.2 deletion were significant for both cell type proportion and counts, with fewer OPCs in carriers than non-carriers (Table S5, S6). Given the limited sample size, we cannot conclude whether differences in cell type proportion were due to biological or technical factors. However, the involvement of oligodendrocytes, OPCs, and microglia suggests that they may be attributable to the ratio of grey and white matter in the dissection.
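The response variables for these models can be sketched in Python as below: per-sample proportions with the total cell count as the denominator, a square-root transform for proportions, and a log transform for counts. The mixed model itself (fitted with lme4 in R) is not reproduced. Using log(1 + n) to tolerate zero counts is an assumption, since the text does not state how zeros were handled, and the function names are illustrative.

```python
import math

def cell_type_proportions(counts):
    """Proportion of each cell type in one sample; counts maps cell type -> nuclei count."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}

def sqrt_transform(proportions):
    """Square-root transform applied to proportions before model fitting."""
    return {cell_type: math.sqrt(p) for cell_type, p in proportions.items()}

def log_transform(counts):
    """Log transform applied to counts; log(1 + n) is assumed so that zeros are defined."""
    return {cell_type: math.log1p(n) for cell_type, n in counts.items()}
```

The transformed values would then serve as the left-hand side of the mixed-model formula, one model per cell type.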

Differential Gene Expression
Differential gene expression (DGE) analysis was performed using dreamlet (v 0.99.6) 18. Dreamlet can fit mixed models, which effectively addresses the nested nature of our data. Dreamlet adopts a pseudobulking strategy for single-nucleus data, which has shown improved performance compared to single-cell approaches 83. Dreamlet incorporates two levels of precision weights, accommodating variations in both cell count and sequencing depth. This was beneficial to our study design to account for the observed variability in cell type counts described above.
Utilizing matched non-carriers enabled us to account for known confounders despite limited sample size and better target transcriptomic changes driven by each CNV. However, it was important to delineate how well our non-carrier controls were matched and determine correction for additional covariates as needed. To explore if or which other covariates should be included in the DGE model, we used variance partitioning 69 and linear regression analyses where we iteratively added in covariates. Using variance partitioning, we assessed correlations among the covariates and found that match group was significantly correlated with age, sex, ethnicity, etc. as expected (Figure S7A). Exploring the variance explained by each covariate indicated that CNV status and match group account for the largest amount of variance (Figure S7B).
We also performed principal component analysis and used linear regression to test for association of known covariates with the top principal components (PC), finding that match group was the most significant covariate (Figure S7C).
After residualizing for match group, brain region, and subject ID as random effects, we found that no covariate was significantly associated with PC1 at a p-value < 0.05 in all cell types (Figure S7D).
We therefore chose the following DGE model in dreamlet:

~CNV + (1|BrainRegion) + (1|BrainNumber) + (1|MatchGroup)
We followed default settings in dreamlet for gene and sample filtering, which require a minimum of five reads for a gene to be considered expressed in a sample, a minimum of four samples passing the cutoff for a cell type to be retained, and a minimum of 40% of retained samples with non-zero counts for a gene to be retained. We also removed highly abundant mitochondrial-encoded and ribosomal genes that are often associated with sample quality: gene names starting with "MT-", "RPL", and "RPS".
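A simplified Python reading of two of these gene-level filters is sketched below: removal of genes whose names start with "MT-", "RPL", or "RPS", and retention of genes with non-zero counts in at least 40% of samples. This is not dreamlet's implementation, the read-count and sample-count cutoffs are omitted, and the function and argument names are assumptions.

```python
QUALITY_PREFIXES = ("MT-", "RPL", "RPS")

def is_quality_associated(gene):
    """True for mitochondrial-encoded or ribosomal genes, matched by name prefix."""
    return gene.startswith(QUALITY_PREFIXES)

def filter_genes(counts, min_prop=0.4):
    """Keep genes that pass the prefix filter and are non-zero in >= min_prop of samples.

    counts: dict mapping gene name -> list of pseudobulk counts, one per sample.
    """
    kept = {}
    for gene, per_sample in counts.items():
        if is_quality_associated(gene):
            continue
        nonzero_fraction = sum(1 for c in per_sample if c > 0) / len(per_sample)
        if nonzero_fraction >= min_prop:
            kept[gene] = per_sample
    return kept
```

Note that a pure prefix match also removes any other gene whose name begins with these letters, which is the behavior stated in the text.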
No covariate was significantly associated (p-value <0.05) with PC1 for all nine cell types; however, some, such as cell type count, were significant in several cell types. While dreamlet corrects for differences in cell type counts with precision weights, we created a more rigorous quality-covariate-adjusted model for comparison to ensure that differences in cell type count and sample quality between carriers and non-carriers did not influence DGE results. We selected quality covariates with a p-value < 0.01 in PC1 for at least one cell type to be added to a quality-covariate-adjusted model.
Three covariates met this criterion: RNA integrity number (RIN), mean number of genes per cell by sample (nFeature), and cell count of each cell type by sample.

~CNV + (1|BrainRegion) + (1|BrainNumber) + (1|MatchGroup) + RIN + nFeature + Cell.Type.Counts
We conducted a parallel analysis with these three additional fixed effects. We observed high correlations of t-statistics between the original model and the quality-covariate-adjusted model, with the exception of oligodendrocytes in the 15q11.2 deletion. This was consistent with the significant differences in cell-type proportions in this sample, which had suggested more white matter in that dissection (Figure S7E). Given that this is likely driven by dissection, we believe the 15q11.2 oligodendrocyte results should be interpreted with caution.
To test for CNV gene dosage effects, we split carriers into groups by CNV, each comprising the deletions and reciprocal duplications of that CNV together with all non-carriers. We then classified samples by CNV gene dosage, with deletions coded as 1, non-carriers as 2, and duplications as 3. This continuous gene dosage classification was then used as the fixed effect in the model, keeping the same grouping factors:

Functional Enrichments
Rank-based functional enrichment testing approaches were used to mitigate noise and unreliable results due to arbitrary p-value thresholds, known to influence threshold-based functional enrichment testing results 70,84. GSEA was our primary functional enrichment testing method as it is widely used and acknowledged to be a strong candidate for discovering biological insights for RNA-seq data analysis 70,85.
We tested for functional enrichments using the Gene Ontology (GO) Biological Processes (BP) database 71,[87][88][89]. Mitocarta 74 was used to obtain more detailed curated terms related to mitochondrial functions, including OXPHOS complexes. We also used SynGO to select terms related to the synapse 75 (Figure S16).
Results were adjusted for multiple testing using the Benjamini-Hochberg procedure. We applied a filtering criterion for functional enrichments, retaining only those with false discovery rate (FDR) < 0.1 as a threshold for significance. To assess convergence across the dataset, we explored terms with the highest overlap in significant terms across both CNVs and cell types, in all 81 models (DGE across 9 CNVs and 9 cell types). Terms with the same intersection size were then ranked by FDR for selection of top terms. To explore direction of enrichment, we utilized the normalized enrichment score (NES). This value accounts for the differences in gene set size.
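The multiple-testing adjustment and the convergence ranking can be sketched in Python as follows. The first function is the standard Benjamini-Hochberg step-up adjustment. The second counts, for each term, the number of models in which it passes FDR < 0.1 and sorts by that intersection size. Breaking ties by the mean FDR across the significant models is an assumption about the exact tie-breaking statistic, and the data layout is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

def rank_convergent_terms(results, fdr_cutoff=0.1):
    """Rank terms by the number of models in which they are significant.

    results: dict mapping model name -> dict of term -> FDR.
    Returns a list of (term, intersection_size), largest first; ties are
    broken by the smaller mean FDR across significant models (assumed).
    """
    significant = {}
    for term_fdrs in results.values():
        for term, fdr in term_fdrs.items():
            if fdr < fdr_cutoff:
                significant.setdefault(term, []).append(fdr)
    ranked = sorted(significant.items(),
                    key=lambda item: (-len(item[1]), sum(item[1]) / len(item[1])))
    return [(term, len(fdrs)) for term, fdrs in ranked]
```

With 9 CNVs and 9 cell types, `results` would hold 81 entries, one per DGE model.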
Additionally, we explored convergence at the level of the CNV and at the level of the cell type, by extracting terms with the highest intersection size (1) across cell types for each CNV and conversely (2) across CNVs within each cell type. When terms had the same number of intersections, they were ranked by decreasing summed FDR value.

LOEUF Scores
To characterize genes within the CNV regions, the loss-of-function observed/expected upper bound fraction (LOEUF score) was used as a metric for risk. LOEUF scores are a continuous metric that quantifies the intolerance of a gene to loss-of-function variants and were obtained from the gnomAD database 26 (v4.1.0). To aggregate constraint scores for each CNV, inverse LOEUF scores (1/LOEUF) were summed for all CNV genes with available constraint metrics for each DGE model, as previously described 27.
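The aggregation described above is a sum of inverse LOEUF scores over the genes in a CNV region, skipping genes without a score. A minimal Python sketch, with hypothetical gene names and an assumed dictionary layout for the scores:

```python
def aggregate_constraint(cnv_genes, loeuf_scores):
    """Sum 1/LOEUF over CNV genes that have an available, positive LOEUF score.

    cnv_genes: iterable of gene names in the CNV region.
    loeuf_scores: dict mapping gene name -> LOEUF score (None if unavailable).
    """
    total = 0.0
    for gene in cnv_genes:
        score = loeuf_scores.get(gene)
        if score is not None and score > 0:
            total += 1.0 / score
    return total
```

Because lower LOEUF means stronger constraint, taking the inverse lets highly constrained genes contribute most to the aggregate.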

Infant Samples Clustering, Cell Type Annotations
We also analyzed a smaller infant dataset, which included samples from the dlPFC and ACC of infant carriers of the 7q11.23 (WBS) or 15q11.2 (BP1-BP2) deletion, along with matched infant non-carriers (Figure 1A). The infant data underwent the same pipeline as the adult data, including the same QC steps, with Harmony integration and removal of the clusters with the highest percent.mt. The nuclei were then annotated for cell type using Azimuth software (v 0.4.6) 19,20, with the human motor cortex data set from the Allen Institute for Brain Science as a reference 19 (Figure S6A-D). The only change in cell type classification was that oligodendrocytes and OPCs were merged into a single category labeled OPC_Oligo, due to the low number of oligodendrocytes in infant brain samples and the biological similarities between the two cell types.
We followed the same analysis pipeline as for the adult dataset for differential gene expression with exploration of metadata by variance partitioning and linear regression of covariates (Figure S14A-B). The final model stayed consistent with the adult dataset as:

Figure 1: Overview of single-nucleus transcriptomic dataset to explore the effects of high-risk CNVs in human brain cells. (A) Outline of study design. Each of the 15 carriers (13 adults and 2 infants) across nine CNVs was matched with two non-carrier individuals, which comprised the match groups. (B) Schematic of dataset, 9 CNVs and 9 cell types, leading to 81 differential gene expression (DGE) tests. (C) Overview of data analysis. (D) Stacked violin plot showing normalized expression of selected marker genes for each major cell type. (E) UMAP visualization of single nuclei colored by annotated cell types. Rigorous quality control and filtering were used to refine cluster resolution. Decisions about resolution of cell types aimed to retain resolution of biologically meaningful cell types with sufficient coverage across samples. (F) Barplot depicting total number of nuclei for each of the nine cell types across the dataset. (G) Barplot depicting cell-type proportions across CNV carrier and non-carrier groups. [Created with Biorender.com].

Figure 3: Number of differentially expressed genes (DEGs) across CNVs and cell types. (A) Number of DEGs for each CNV in all nine cell types, at p-value < 0.05 (nominal) and FDR < 0.1.

Figure 5: Cell type-specific enrichments across CNVs.(A) GSEA results of the top 5 most frequently implicated terms enriched at FDR <0.1 across CNVs were selected for each of the nine cell types, ordered in reverse alphabetical order.Heatmap is colored by NES, with red indicating positive NES and blue indicating negative NES. FDR significance is marked with asterisks (FDR <0.1 = *, FDR <0.01 = **, FDR < 0.001 = ***).(B) Association of OXPHOS and presynapse DGE statistics.Mean t-statistics across all genes in the OXPHOS term for each neuronal CNV model (n=36) (x-axis) and across all genes in presynapse term (y-axis), which was log-transformed for normality.(C) Association of OXPHOS and postsynaptic density DGE statistics.Same x-axis as described above and mean t-statistic across genes in postsynaptic density term for CNV models for four neuronal cell types.

Figure 6: Differential gene expression testing for association with CNV gene dosage in two 'mirror' CNVs.(A) Schematic showing gene dosage testing.DGE was run separately for each CNV with groups of deletions, reciprocal duplication, and all non-carriers.(B) Number of DEGs (FDR <0.1) within cell types for each gene dosage model.Following the same