Genes associated with gray matter volume alterations in schizophrenia

Although both schizophrenia and gray matter volume (GMV) show high heritability, however, genes accounting for GMV alterations in schizophrenia remain largely unknown. Based on risk genes identified in schizophrenia by the genome-wide association study of the Schizophrenia Working Group of the Psychiatric Genomics Consortium, we used transcription-neuroimaging association analysis to test that which of these genes are associated with GMV changes in schizophrenia. For each brain tissue sample, the expression profiles of 196 schizophrenia risk genes were extracted from six donated normal brains of the Allen Human Brain Atlas, and GMV differences between patients with schizophrenia and healthy controls were calculated based on five independent case-control structural MRI datasets (276 patients and 284 controls). Genes associated with GMV changes in schizophrenia were identified by performing cross-sample spatial correlations between expression levels of each gene and case-control GMV difference derived from the five MRI datasets integrated by harmonization and meta-analysis. We found that expression levels of 98 genes consistently showed significant cross-sample spatial correlations with GMV changes in schizophrenia. These genes were functionally enriched for chemical synaptic transmission, central nervous system development, and cell projection. Overall, this study provides a set of genes possibly associated with GMV changes in schizophrenia, which could be used as candidate genes to explore biological mechanisms underlying the structural impairments in schizophrenia.


Introduction
Schizophrenia is a common chronic debilitating disorder characterized by hallucinations, delusions and cognitive impairments ( Purcell et al., 2009 ). A considerable number of studies have reported both microscopic (e.g., synaptic reduction ( Fromer et al., 2014 ;Glantz and Lewis, 2000 ) and neurotransmitter dysfunction ( Howes et al., 2012 )) and macroscopic (e.g., gray matter volume (GMV) reduction ( van Haren et al., 2011 ;Warland et al., 2020 ) and anatomical and functional dysconnectivity ( Liu et al., 2019 ;Romme et al., 2017 ;van den Heuvel et al., 2013 )) changes in schizophrenia. However, we still know little about associations between microscopic and macroscopic changes in schizophrenia, which are critical for understanding the pathophysiological mechanisms of this disorder. For example, we know little about genetic mechanisms underlying GMV reduction in schizophrenia, which is a common neuroimaging finding in this disorder ( Haijma et al., 2013 ;Owens et al., 2012 ).
Schizophrenia shows an estimated heritability of about 80% ( Chen et al., 2019 ;Mowry and Gratten, 2013 ;Kochunov et al., 2014 ), which has inspired researchers to identify risk genetic loci of schizophrenia using genome-wide association study (GWAS). The Schizophrenia Working Group of the Psychiatric Genomics Consortium (PGC) has identified 108 schizophrenia-related loci ( Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014 ), which are either relevant to the hypothesis of the etiology and treatment of schizophrenia or its involvement in schizophrenia-related biological processes (e.g., synaptic plasticity, calcium channel, glutamatergic neurotransmission, and neurodevelopment). In an exploratory study (777 discovery and 609 replication samples) with multivariate analytic methods ( Owens et al., 2012 ), 39 single-nucleotide polymorphisms (SNPs) from 1,402 SNPs mapped from the PGC 108 loci were found to be related to GMV reduction in inferior parietal and superior temporal regions in schizophrenia.
In contrast to the need for the large sample in a genome-wide neuroimaging association study, transcription-neuroimaging association study can identify genes associated with neuroimaging changes in brain disorders in a relatively small sample. Based on the densely sampled gene expression data of six post-mortem brains from the Allen Human Brain Atlas (AHBA; http://human.brain-map.org ), several studies have identified genes associated with altered structural brain network ( Liu et al., 2019 ), anatomical and functional disconnectivity ( Romme et al., 2017 ), and abnormal morphometric similarity ( Morgan et al., 2019 ) in schizophrenia. However, no transcriptionneuroimaging association study has been performed to identify genes associated with GMV alterations in schizophrenia.
In this study, we used transcription-neuroimaging association analysis to test which schizophrenia-related genes identified by the PGC-GWAS study are associated with GMV changes in schizophrenia. For each sample, gene expression data were derived from the AHBA data, Fig. 1. Pipeline of data processing. (A) Calculating gray matter volume (GMV) difference for each tissue sample. Individual voxel-wise GMV map is created based on structural MRI data of schizophrenia patients (SCZ) and healthy controls (HC) from five independent datasets. Two sample t -test is used to create voxel-wise GMV difference maps for these datasets, and meta-analysis is used to summarize the overall GMV difference for each tissue sample. After harmonizing individual GMV maps derived from the five sites, we obtain an overall voxel-wise GMV difference map, from which we calculate GMV difference for each sample. (B) Generating a gene expression matrix. Gene expression values of genes in tissue samples are obtained in six donated brains from the Allen Human Brain Atlas. (C) Identifying genes associated with GMV alterations in schizophrenia. Gene-wise cross-sample spatial correlations are performed between gene expression and GMV differences derived from the two integration methods, respectively. The intersected significant genes are defined as genes associated with GMV alterations in schizophrenia. and case-control GMV difference was analyzed based on GMV maps of schizophrenia patients and healthy controls from five independent datasets integrated by both harmonization and meta-analysis. We then performed cross-sample spatial correlations between gene expression levels of the donated AHBA brains and case-control GMV differences ( tvalues) of the living human brains. Finally, functional annotation was conducted to explore the functional relevance of the identified genes associated with GMV changes in schizophrenia. A schematic summary of the processing pipeline is shown in Fig. 1 .

Gene expression data processing
Gene expression data were acquired from the AHBA ( Hawrylycz et al., 2012 ), which provides normalized microarray gene expression data of six postmortem adult brains. These data were processed using a new pipeline that was proposed to link whole-brain gene expression profiles to neuroimaging data ( Arnatkevic Iute et al., 2019 ). Considering that only two donors had tissue samples of the right hemisphere, while six donors had the left hemisphere, we ultimately used 1,028 cortical samples from the left hemisphere. Briefly, we reassigned probes to genes by utilizing the latest NCBI database; we excluded probes with expression intensity lower than the background signal in more than 50% samples; we chose the probe with the highest correlation with the RNA-seq data, and we scaled robust sigmoid to normalize expression data. These processing procedures generated normalized expression data of 10,185 genes for each tissue sample. Here, schizophrenia risk genes were defined based on a recent PGC-GWAS study, in which the identified 108 schizophrenia risk loci were mapped to 348 genes ( Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014 ). Only 196 schizophrenia associated risk genes with qualified AHBA expression data were included in this study and further analysis.

Participants for neuroimaging analysis
This study included 5 independent structural MRI datasets of schizophrenia patients and healthy controls from Tianjin Medical University General Hospital (TMUGH) and other four public datasets from the SchizConnect database ( Wang et al., 2016 ). The inclusion and exclusion criteria, and imaging acquisition details are provided in Supplementary materials. The image quality was checked by two radiologists, and we excluded subjects with brain abnormalities, incomplete whole brain coverage, imaging artifacts, or poor image segmentation or normalization ( Supplementary Fig. 1). After qualifying for the structural MRI data, 276 patients with schizophrenia and 284 healthy controls were included in this study. All experimental procedures were approved by the local Ethics Committee, and informed written consent was provided by each participant. Schizophrenia was diagnosed according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders IV (DSM-IV). Specifically, 89 patients with schizophrenia and 91 healthy controls were from the TMUGH; and other participants were downloaded from the SchizConnect database ( Wang et al., 2016 ), including BrainGluSchi Demographic data of participants included in the neuroimaging analysis were analyzed using the Statistical Package for the Social Sciences version 22.0 (SPSS, Chicago, IL, USA). The two-sample t -test was used to compare age difference between patients with schizophrenia and health controls from each dataset, and the chi-square test was used to test gender difference.

GMV calculation
GMV images were calculated using the standard pipeline of the CAT12 toolbox (version r1278, http://dbm.neuro.uni-jena.de/cat ), which is an extension of the SPM12 ( http://www.fil.ion.ucl.ac.uk/spm ). After correcting for bias-field inhomogeneity and affine registration to standard space, the structural images were segmented into gray matter, white matter, and cerebrospinal fluid. The segmented images were further normalized using the diffeomorphic anatomical registration through the exponentiated lie algebra (DARTEL) technique and then resampled to a voxel size of 1.5mm × 1.5mm × 1.5mm ( Ashburner, 2007 ). The normalized gray matter images were modulated by multiplying the voxel values with the Jacobian determinant (both linear and nonlinear components) derived from the spatial normalization. Finally, the obtained GMV maps were smoothed with a Gaussian kernel of 8mm × 8mm × 8mm full-width at half maximum.

Case-control GMV difference created by harmonization
To obtain a reliable GMV difference map between schizophrenia patients and healthy controls, we used ComBat ( https://github.com/ Jfortin1/ComBatHarmonization ) to harmonize individual GMV maps from the five independent datasets. This technique can improve statistical power, recover the true effect sizes and be robust to small sample sizes ( Fortin et al., 2017 ). After harmonization, the two-sample t -test was used to compare voxel -wise GMV differences between all patients with schizophrenia and all healthy controls, controlling for the effects of age, sex and total intracranial volume. These comparisons were performed within a cortical gray matter mask (303,120 voxels). A familywise error (FWE) method was employed to correct for multiple comparisons with a significance threshold of P < 0.05. Based on the MNI coordinate of each tissue sample (n = 1,028), we drew a sphere with a radius of 4.5mm (i.e., 3 times of the voxel size) on the uncorrected case-control GMV difference map and defined the mean t -value of voxels within this sphere as the t -statistic value of GMV difference of this sample.

Case-control GMV difference created by meta-analysis
To obtain reliable case-control GMV differences of tissue samples, meta-analysis was also used to summarize an overall GMV alteration of each sample from GMV difference maps derived from the five datasets. For each dataset, voxel -wise GMV differences between patients with schizophrenia and healthy controls were compared within the same cortical gray matter mask by two -sample t -test, while controlling for the effects of age, sex, and total intracranial volume. Then a sphere with a radius of 4.5mm was drawn centered at the MNI coordinate of each tissue sample (n = 1,028) on the uncorrected case-control GMV difference map from each dataset, and the mean t -value of voxels within this sphere was defined as the t -statistic value of GMV difference of this sample in this dataset. For each sample, we performed a meta-analysis to integrate the t -statistic values of GMV differences derived from the five independent datasets ( van Assen et al., 2015 ). Specifically, we applied the Stouffer Z method to obtain aggregate P and Z value based on the significance level of tests weighed by case-control sample size ( Darlington and Hayes, 2000 ).
To verify the consistency of the case-control GMV differences derived from the two integration methods (harmonization and meta-analysis), the cross-sample correlation was conducted between the t -statistic values of GMV differences obtained by the two methods.

Identifying genes associated with GMV alterations in schizophrenia
Based on the expression values of the 196 schizophrenia-related genes in 1,028 samples derived from AHBA data and the two sets of case-control GMV differences ( t -statistic values) in these samples derived from the harmonization and meta-analysis methods, cross-sample (n = 1,028) non-parametric Spearman rank was performed to determine the correlations between gene expression and GMV changes in schizophrenia. Multiple comparisons were corrected using permutation test. Specifically, we randomly shuffled the sample labels 10,000 times and then re-performed the correlation analysis for each permutation to build null distribution, based on which we inferred the significance ( P < 0.05). The number of comparisons (n = 196) at the gene level was further corrected with a significance threshold of P < 2.55 × 10 − 4 = 0.05/196 (Bonferroni correction). Finally, genes associated with GMV alterations in schizophrenia were defined as those whose expression values were significantly correlated with case-control GMV differences derived from both harmonization and meta-analysis methods.

Functional annotation
Toppogene ( https://toppgene.cchmc.org/ ) was used to perform enrichment analysis for genes consistently associated with GMV alterations in schizophrenia ( Chen et al., 2009 ). The Benjamini and Hochberg method for false discovery rate (FDR-BH correction) ( P < 0.05) was a P value was obtained by a chi-square test. b P value was obtained by a two-sample t -test. Abbreviations: NC, normal control; SCZ, schizophrenia; SD, standard deviation; TIV, total intracranial volume. used to correct for multiple comparisons. We also created proteinprotein interaction (PPI) networks based on genes which were consistently associated with GMV alterations in schizophrenia by STRING ( Szklarczyk et al., 2019 ) version 11.0 ( https://string-db.org/ ), with the medium confidence value of 0.4.

Participants
We ultimately recruited 276 patients with schizophrenia and 284 healthy subjects from the five datasets after quality control procedures. Demographic and clinical data of patients and controls are shown in Table 1 . Although there were no significant intergroup differences in age and gender in most of the five datasets ( P > 0.05), there was a significant gender difference between the patient and control groups from the BrainGluSchi dataset ( P = 0.002). Thus, we controlled for these two factors throughout GMV analyses.

GMV alterations of schizophrenia
After harmonizing individual GMV maps from the five independent datasets, we compared voxel -wise GMV differences ( P < 0.05, FWE corrected) between all patients with schizophrenia and all the healthy controls, while controlling the effects of age, sex, and total intracranial volume. Compared with healthy controls, patients with schizophrenia mainly showed reduced GMV in the medial and lateral prefrontal cortex, anterior cingulate cortex, lateral temporal cortex, temporal pole and insula ( Fig. 2 A). However, no brain regions showed a significant GMV increase in patients with schizophrenia.
The reliability and reproducibility of GMV differences between schizophrenia patients and healthy controls in the tissue samples are critically important for following the association analyses with gene expression. Thus, we used two methods (harmonization and metaanalysis) to summarize the overall GMV differences in these tissue samples derived from the five independent datasets. To further test the consistency of the integrated results derived from these two methods, we performed cross-sample (n = 1,028) correlation between t -statistic values of GMV differences computed by the two different integration methods, and found a high correlation ( R = 0.99, P = 2.2 × 10 − 16 ) ( Fig. 2 B).
To test the potential effect of antipsychotic use on correlation analysis in TMUGH dataset, we performed correlation analysis between GMV difference maps with and without adjusting for medication use in schizophrenia patients and found a relatively high correlation ( R = 0.77; Supplementary Fig. 2). Detailed methods were mentioned in Supplementary materials.

Genes related to GMV alterations in schizophrenia
After reannotation and probe selection, we finally obtained normalized expression data of 10,185 genes for each of 1,028 samples from the AHBA data. For each of 196 selected schizophrenia risk genes, a sample-wise spatial correlation was performed between gene expression and GMV changes in schizophrenia. Based on case-control GMV differences derived from harmonizing GMV maps of the five datasets, 107 genes were found to have significant association with GMV alterations in schizophrenia ( P < 0.05, Bonferroni corrected). According to case-control GMV differences derived from a meta-analysis of GMV differences in the five datasets, we found that 99 genes are significantly associated with GMV alterations in schizophrenia ( P < 0.05, Bonferroni corrected). More importantly, there were 98 genes ( Supplementary Fig.  3) that consistently showed significant correlations with GMV changes in schizophrenia derived from the two integration methods. Their correlation coefficients were displayed in Supplementary Table 1. Furthermore, we selected six representative genes to present their correlation scatterplots between gene expression values and the t -statistic values of case-control GMV differences ( Fig. 3 ). Here, the positive correlation means that brain samples with greater GMV reduction in schizophrenia show higher gene expression ( Fig. 3 A), and the negative correlation means that brain samples with greater GMV reduction in schizophrenia show lower gene expression ( Fig. 3 B).

PPI network analysis
Based on 98 genes which showed significantly stable correlations with GMV alterations in schizophrenia, we performed the PPI network analysis and identified a network consisting of 60 connected proteins and 72 edges, which is significantly higher than the expected 42 edges ( P = 1.99 × 10 − 5 ) ( Fig. 5 A). The proteins coded by GRIA1, GRIN2A, DRD2, KCNB1 and NRGN were hub proteins in the PPI network ( Fig. 5 B). Enrichment analysis indicated that they were mainly involved in chemical synaptic transmission, protein binding, and cell projection.

Discussion
Although the PGC-GWAS identified 108 schizophrenia risk loci (348 genes) ( Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014 ), we still know little about which genes are associated with GMV changes in schizophrenia. This study addressed the important gap by analyzing spatial correlations between the expression of these genes and GMV changes in schizophrenia. In the current study, we found that expression levels of 98 genes showed reproducible correlations Fig. 2. GMV differences between schizophrenia patients and healthy controls. (A) Brain regions with significant GMV reduction ( P < 0.05, FWE corrected) in schizophrenia derived from voxel-wise comparisons between the harmonized GMV maps of schizophrenia patients and healthy controls. The color bar represents t -values and a positive t -value indicates a greater GMV in healthy controls than in schizophrenia patients. (B) Scatterplot shows cross-sample (n = 1,028) correlation ( R = 0.99, P < 0.001) between case-control GMV differences derived from the harmonization (x-axis) and meta-analysis (y-axis) methods. Fig. 3. Cross-sample (n = 1,028) correlations between expression values of the six representative genes and GMV alterations in schizophrenia. GMV differences between schizophrenia patients and healthy controls are calculated using the integration methods of harmonization (A) and meta-analysis (B), respectively. The x-axis is the t -statistic of GMV difference between schizophrenia patients and healthy controls, and the y-axis is the gene expression value. The name of each representative gene is displayed at the top of each scatterplot, and the correlation coefficient and P value are displayed in the upper left corner.
with GMV alterations in schizophrenia and schizophrenia-related biological processes, such as synaptic transmission, transmembrane transportation, CNS development and cell projection. These findings may improve our understanding about the molecular mechanisms associated with GMV changes in schizophrenia and related genes.
In consistent with previous studies reporting GMV reduction in schizophrenia ( Todtenkopf et al., 2005 ;Job et al., 2005 ;Pantelis et al., 2003 ;Harrison, 1999 ;Douaud et al., 2007 ;Choi et al., 2005 ;Sim et al., 2006 ;Wolf et al., 2014 ), we found that the anterior cingulate cortex, prefrontal cortex, and temporal cortex showed significant GMV reduction in patients with schizophrenia compared to that of healthy controls. GMV reduction in schizophrenia is theoretically related to the alterations in size, morphology, and number of the cellular and non-cellular components of the cerebral cortex. Pathologic studies on postmortem brains of patients with schizophrenia have revealed that various micro-structural changes occur which are possibly responsible for GMV reduction in this  disorder ( Torrey et al., 2005 ;Kang et al., 2009 ;Mahmoudi et al., 2019 ). For instance, compared with control subjects, patients with schizophrenia show: reduced number of interneurons in the anterior cingulate and prefrontal cortex ( Roberts et al., 2015 ); smaller neuronal cell bodies in the prefrontal cortex; reduced number of axons, dendrites, and synapses in the anterior cingulate and prefrontal cortex ( Roberts et al., 2015 ). However, the genetic mechanisms resulting in these micro-structural changes remain largely unknown.
Although identifying genes associated with GMV changes in schizophrenia by assessing the similarity of spatial distribution patterns between gene expression levels in AHBA brains and case-control GMV changes in living human brains is an indirect method, we still believe that this method can provide meaningful information as there is no such large sample of patients with schizophrenia who have both neuroimaging and gene expression data of the brain ( Richiardi et al., 2015 ). In contrast to a prior study that performed parallel independent component analysis and only reported a few genes in 6p22.1 that are associated with GMV reduction in schizophrenia ( Chen et al., 2019 ). In the current study, we performed transcription-neuroimaging association analysis and found that half (98/196) of the schizophrenia-related genes are under investigation which are associated with GMV reduction in schizophrenia. Hence, it indicates that GMV reduction in schizophrenia is caused by the joint-effects of many schizophrenia-related genes. The involvement of multiple genes in the GMV reduction in schizophrenia is more consistent with polygenetic inheritance of the complex phenotypes of GMV and schizophrenia. Enrichment analyses linked the identified genes associated with GMV reduction in schizophrenia to CNS development, synaptic transmission, and cell projection. All of these are potentially associated with GMV variation. In terms of genes associated with CNS development, for example, INA , a member of the intermediate filament family, specifically expressed in the brain is involved in the morphogenesis of neurons and in the composition of cytoskeletal structure ( Cairns et al., 2004 ;Yuan et al., 2006 ;Lodato et al., 2014 ;Yuan et al., 2015 ;Wang et al., 2018 ). PTPRF regulates biological processes of cell growth, differentiation, and migration, also the cytoskeletal rearrangement ( Zeng et al., 2003 ;Bernabeu et al., 2006 ). These functional properties support a link of these genes to GMV variation. In terms of genes associated with synaptic transmission, NRGN encodes a postsynaptic protein kinase substrate and DRD2 encodes dopamine receptor D2. Genetic variations of these genes were associated with GMV reduction in schizophrenia, especially in the prefrontal cortex and anterior cingulate cortex ( Bertolino et al., 2010 ;Glatt et al., 2009 ;Wu et al., 2017 ;Xu et al., 2016 ;Montag et al., 2010 ;Di Giorgio et al., 2014 ;Walton et al., 2013 ;Ohi et al., 2012 ;Ruano et al., 2008 ).
Since all gene expression values were positive and most of the t -statistic values were also positive (indicating GMV reduction in schizophrenia), positive correlations between gene expression and GMV difference indicated that brain regions with greater GMV reduction in schizophrenia show higher gene expression, and negative correlations mean that brain regions with greater GMV reduction in schizophrenia show lower gene expression. For example, we found a positive correlation for GRIA1, which encodes the AMPA receptor subunit and is associated with the long-term depression of synaptic transmission ( Sanderson et al., 2011 ). The higher expression level of GRIA1 in brain regions with more GMV reduction in schizophrenia is possibly attributed to its inhibitory effect on synaptic transmission, which is consistent with the function of this gene in synaptic plasticity and glutamatergic neurotransmission ( Sanderson et al., 2011 ;Bygrave et al., 2019 ). These findings are also consistent with a schizophrenia-related micro-deletion model that shows deficits in synaptic transmission and long-term plasticity ( Fenelon et al., 2013 ;Pockett, 1995 ). In contrast, HAPLN4 showed a negative correlation, i.e., the expression of this gene is lower in brain regions with more GMV reduction in schizophrenia. As a member of hyaluronan and proteoglycan binding link proteins family, HAPLN4 modulates synaptic plasticity and neurodevelopment processes of cell migration, neurite outgrowth and neuronal adhesion ( Kahler et al., 2011 ;Faralli et al., 2016 ;Cicanic et al., 2018 ;Edamatsu et al., 2018 ). The low expression level of HAPLN4 might lead to GMV reduction by influencing extracellular matrix and perineuronal nets ( Kahler et al., 2011 ).
Several limitations are worth mentioning in this study. First, transcriptional and neuroimaging data were not from the same individuals, which will lead to miss genes of interest with large expression variation across individuals in the transcription-neuroimaging association study. Second, GMV differences between patients with schizophrenia and healthy controls may be influenced by antipsychotic use. In the TMUGH dataset, we found a high positive correlation between GMV differences with and without regressing medication dosage, indicating a limited effect of medication use in our results. Third, transcriptome data were derived from the left cerebral hemispheres of six normal post-mortem AHBA brain data, and thus the small sample size and hemisphere selection may bias our findings. Finally, our analysis is exploratory and the transcription-neuroimaging association study cannot provide causal effects.
In conclusion, we used transcription-neuroimaging association analysis to identify which of the PGC-GWAS identified genes are associated with GMV changes in schizophrenia. We found 98 genes consistently showing significant correlations between gene expression and GMV changes in schizophrenia, which could be used as candidate genes to explore molecular mechanisms of structural impairments in schizophrenia.
The gene expression data that support the findings are available in 2010 Allen Institute for Brain Science. Allen Human Brain Atlas was available from: human.brain-map.org.
Toppogene was used to perform enrichment analysis for genes from: https://toppgene.cchmc.org/ Schizophrenia risk genes were defined based on a recent PGC-GWAS study: Biological insights from 108 schizophrenia-associated genetic loci. DOI: 10.1038/nature13595