Pathway-Based Polygenic Risk Scores for Schizophrenia and Associations With Reported Psychotic-like Experiences and Neuroimaging Phenotypes in the UK Biobank

Background Schizophrenia is a heritable psychiatric disorder with a polygenic architecture. Genome-wide association studies have reported that an increasing number of risk-associated variants and polygenic risk scores (PRSs) explain 17% of the variance in the disorder. Substantial heterogeneity exists in the effect of these variants, and aggregating them based on biologically relevant functions may provide mechanistic insight into the disorder. Methods Using the largest schizophrenia genome-wide association study conducted to date, we associated PRSs based on 5 gene sets previously found to contribute to schizophrenia pathophysiology—postsynaptic density of excitatory synapses, postsynaptic membrane, dendritic spine, axon, and histone H3-K4 methylation—along with respective whole-genome PRSs, with neuroimaging (n > 29,000) and reported psychotic-like experiences (n > 119,000) variables in healthy UK Biobank subjects. Results Several variables were significantly associated with the axon gene-set (psychotic-like communications, parahippocampal gyrus volume, fractional anisotropy thalamic radiations, and fractional anisotropy posterior thalamic radiations (β range −0.016 to 0.0916, false discovery rate–corrected p [pFDR] ≤ .05), postsynaptic density gene-set (psychotic-like experiences distress, global surface area, and cingulate lobe surface area [β range −0.014 to 0.0588, pFDR ≤ .05]), and histone gene set (entorhinal surface area: β = −0.016, pFDR = .035). From these, whole-genome PRSs were significantly associated with psychotic-like communications (β = 0.2218, pFDR = 1.34 × 10−7), distress (β = 0.1943, pFDR = 7.28 × 10−16), and fractional anisotropy thalamic radiations (β = −0.0143, pFDR = .036). Permutation analysis revealed that these associations were not due to chance. Conclusions Our results indicate that genetic variation in 3 gene sets relevant to schizophrenia may confer risk for the disorder through effects on previously implicated neuroimaging variables. Because associations were stronger overall for whole-genome PRSs, findings here highlight that selection of biologically relevant variants is not yet sufficient to address the heterogeneity of the disorder.

polymorphisms (SNPs) implicated in glutamatergic signaling were associated with attention, a cognitive process known to be impaired in schizophrenia.Yao et al (7) found that PRSs calculated based on neural microRNA-137 (MIR137) explained a disproportionately larger schizophrenia risk variance than genomic control PRSs when accounting for gene-set size (w2% MIR137 = w1000 genes; w10%, whole-genome [WG] = w20,000 genes).Therefore, it is possible to interrogate genetic risk aggregated to biologically relevant gene sets to gain insight into the association between aggregated genetic risk and disorder-specific traits of interest.
Previous evidence has also shown associations of schizophrenia PRSs with disruptions in white matter microstructure and global and regional brain volumes, although results have been inconsistent (8,9).As such, it has been suggested that biologically relevant gene sets may reveal stronger associations with neuroimaging phenotypes, as demonstrated recently (10).Grama et al. (10) investigated whether behaviorand neuronal-related gene sets, previously implicated in schizophrenia, were associated with subcortical volumes.They found that one gene set, "abnormal behavior," was associated with right thalamic volume, and this association was robust across different p-value thresholds (10).This methodology can be applied to other psychiatric disorders.For instance, PRSs calculated for a previously established biological pathway (NETRIN1) in relation to major depressive disorder were associated with relevant neuroimaging phenotypes, shedding light on links between biology and neuroimaging (11).This indicates that meaningful associations with traits of interest may be revealed when applying genomic methods that address relevant parts of the genome.
Based on this evidence, we hypothesized that schizophrenia PRSs aggregated in biologically relevant pathways previously shown to play a role in schizophrenia would be associated with structural neuroimaging and reported psychotic-like experience (PLE) phenotypes in a sample of adults in the general population.Identifying associations with specific neuroimaging phenotypes may provide an opportunity to disentangle the heterogeneity of the disorder, in terms of both genetic risk and inconsistent previous neuroimaging findings.Therefore, we selected 5 gene sets previously identified by the PGC (5): postsynaptic density (PSD), postsynaptic membrane (PSM), dendritic spine, axon, and histone H3-K4 methylation.These 5 cellular components and biological processes have been associated with schizophrenia in a number of studies investigating human and animal models (12)(13)(14).We calculated PRSs for each gene-set-specific set of SNPs and for SNPs excluded from the gene sets for paired comparisons (gene-set SNPs vs. WG minus gene-set SNP PRSs).Then, we tested their association with brain structural phenotypes (cortical volume, thickness, and surface area; white matter microstructure indexed by fractional anisotropy [FA] and mean diffusivity [MD]; and subcortical volumes) and reported PLEs (Mental Health Questionnaire [MHQ]) in the UK Biobank (UKB), utilizing the most up-to-date genetic, mental health, and imaging data.We hypothesized distinctive roles in the pathophysiology of schizophrenia for the different biologically relevant pathways tested, after adjustment for WG PRSs (excluding SNPs in each gene set), highlighting important mechanistic processes underlying the different phenotypes associated with schizophrenia.We expected this methodology to partly address the heterogeneity in schizophrenia through the identification of biologically relevant mechanisms.

Study Population
The UKB comprises of 502,411 community-dwelling individuals recruited between 2006 and 2010 in the United Kingdom (https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=2 00) (15).The UKB received ethical approval from the research ethics committee (Reference 11/NW/0382).This study was approved by the UKB Access Committee (Project Nos.4844 and 16124).Written informed consent was obtained from all participants.This study was conducted using the latest release of UKB neuroimaging data (n = 29,791 cortical, n = 29,536 subcortical, n = 27,917 white matter) and n = 119,947 with MHQ data.We excluded individuals with a diagnosis of schizophrenia as indicated by ICD-10 (F20, https://biobank.ndph.ox.ac.uk/showcase/field.cgi?id=41270) due to the small proportion of diagnosed individuals (n = 989) and the fact that our analyses focused on associations in the general population.

Gene-Set Selection
The top 5 gene sets associated with schizophrenia were selected from The Network and Pathway Analysis Subgroup of the PGC (5) and identified on Gene Ontology (GO) by searching for each pathway's GO identifier in the PGC study (5) [see also (16)]: PSD (GO:0014069), PSM (GO:0045211), dendritic spine (GO:0043197), axon (GO:0030424), and histone H3-K4 methylation (GO:0051568).The gene sets were selected based on their robust associations with schizophrenia in this and previous studies (5,13,17,18).Further details on gene sets are included in the Supplement and Supplementary Excel File 1.
We used SNPs from the largest available GWAS of schizophrenia (N = 320,404, n = 76,755 cases), which was carried out by Trubetskoy et al. (4).They identified schizophrenia associations with common variants at 287 distinct loci, with PRSs explaining w17% of the variance in a European ancestry target sample (4).The GWAS sample utilized here did not include any individuals in UKB.The full SNP quality control protocol is detailed in Supplement and was based on Choi et al. (20) (https://choishingwan.github.io/PRS-Tutorial/base/).In addition, we ensured that our sample consisted of unrelated, White British participants with no overlap with the PGC sample (21) when combining with the reported-PLEs and imaging samples (see below).Following functional annotation (22) (see the Supplement), SNPs located within each gene set were extracted.See Table S1 in the Supplement and Supplementary Excel File 1 for the number of genes and SNPs within each gene set.From these lists, gene-set PRSs were computed for each individual in the UKB (see the Supplement) using PRSice (23) at 5 p-value thresholds (.01, .05,.1,.5, 1).Each gene-set PRS had its respective WG PRS (i.e., each gene-set SNP list was input as an exclusion flag to create WG PRSs that did not include the specific gene sets).The primary analysis in this manuscript comprises SNPs that met a significance level of .1 (4).Analyses at other thresholds are included in Tables S5 to S11 in Supplementary Excel File 2.

Phenotypes
Psychotic-like Experiences.An MHQ was sent to par- ticipants who provided an email address between July 2016 and July 2017 (n = w157,000).This included 4 unusual and psychotic experience items, the dichotomized responses to which were used to create the 4 lifetime PLEs utilized in this study, which will be referred to as "conspiracies," "communications," "voices," and "visions" based on their clinical significance (see the Supplement for the list of questions).We also examined distress by selecting individuals who reported PLEs as distressing (as opposed to neutral or positive) and those who reported no PLEs to determine associations with experiencing stress (24)(25)(26), which will be referred to as "distress."Frequencies for all items are noted in Table 1.
Neuroimaging Phenotypes.A brain magnetic resonance imaging (MRI) scan was conducted for a subset of participants (27,28), and imaging-derived phenotypes of T1-weighted and diffusion MRI images were used in this study.MRI acquisition, preprocessing, and quality control protocols can be found in the Supplement.Individuals with global values .3SD from the sample mean, as ascertained by conducting principal component (PC) analysis on each modality's imaging variables to derive a global value, were excluded (range n excluded based on imaging modality = 105-232 participants) (distribution plots created using "ggplot" in R are shown in Figure S1 in the Supplement) (29,30).A list of neuroimaging phenotypes investigated is included in Table S2 in the Supplement.
For white matter microstructure, we derived global and regional (association and projection fibers, thalamic radiations) measures of FA and MD by conducting PC analysis on sets of tracts and extracting scores of the first unrotated principal component.For cortical ROIs, we derived global and lobar measures by summing up all (global) or lobar-specific ROIs, as in previous studies (34,35).Sample size and descriptive statistics for neuroimaging phenotypes are presented in Table 2.

Statistical Analysis
All analyses were conducted using R (version 4.1.0;R Core Team) in a Linux environment.We used linear mixed-effects models (function "lme" in package "nlme") for bilateral neuroimaging phenotypes, with age, age 2 , sex, 15 genetic PCs, scan site, 3 MRI head position coordinates (lateral brain position X https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=25756, transverse brain position Y https://biobank.ctsu.ox.ac.uk/ crystal/field.cgi?id=25757, and longitudinal brain position Z https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=25757), and genotype array set as covariates.Hemisphere was included as a within-subject covariate in all mixed-effects models.Intracranial volume was included as a covariate for gray matter phenotypes.We used general linear models (function "lm") for unilateral, regional, and global neuroimaging phenotypes, using the same covariates as above without hemisphere included.Finally, we used logistic regression for dichotomized PLEs, with age, sex, 15 genetic PCs, and genotype array as covariates.All models included gene-set PRS and each gene set's respective WG (excluding SNPs in each gene set) PRS as predictor variables.False discovery rate (FDR) was used to correct for multiple testing and was applied separately for each neuroimaging and reported-PLEs phenotype and across all p-value thresholds (36) (see Table S3 in the Supplement for detailed protocol).Effect sizes from linear models were standardized throughout.To establish that the effect of gene-set PRSs on neuroimaging and reported-PLEs phenotypes was not due to chance (because the SNP set sizes in all gene sets were much smaller than the WG), permutation analysis was carried out using a method developed by Cabrera et al. (37) for all significant associations (see the Supplement).

RESULTS
All results presented below concern PRSs at a p-value threshold , .1.Descriptive statistics for the reported-PLEs and neuroimaging phenotypes are noted in Tables 1 and 2, respectively.Analyses involving other thresholds are included in  in Supplementary Excel File 2.

Associations With PLEs
Gene set-specific significant findings and corresponding WG (excluding SNPs in each gene set) results are noted in Figure 1 ).WG PRSs (excluding SNPs in each gene set) were associated with all 4 MHQ items as well as distress, and effect sizes were of greater magnitude than associations with geneset PRSs (see Table S5 in Supplementary Excel File 2).Effect sizes for gene set-specific PRSs ranged from 1 3 10 24 to 0.092 and for WG PRSs (excluding SNPs in each gene set) ranged from 0.087 to 0.300.Results concerning MHQ items at other p-value thresholds are provided in Table S5 in Supplementary Excel File 2.

Permutation Analysis
For gene-set PRSs that were significantly associated with reported PLEs or neuroimaging measures, we performed circular genomic permutation analysis.We found that the significant associations with the gene-set PRSs in these phenotypes were not due to chance based on a permutation p value that was computed by comparing t values from the real associations with t values from the permuted associations (Table 3).

DISCUSSION
In this study, we investigated associations between biologically relevant pathway PRSs and 119,947 reported-PLEs and w29,000 neuroimaging phenotypes after adjustment for respective WG PRSs (excluding SNPs in each gene set).We calculated PRSs using the largest, most recent schizophrenia GWAS to date (4).We found significant associations of axon, PSD, and histone H3-K4 methylation gene sets with several cortical regions and white matter tracts, and with reported PLEs, specifically psychotic-like communications and distress associated with a reported PLE.Associations with reported PLEs were stronger for WG PRSs (excluding SNPs in each gene set), while associations with neuroimaging variables were stronger for gene-set PRSs.The 3 gene sets with significant associations identified here have previously been linked to schizophrenia.The PGC (5) identified pathways implicated in schizophrenia through robust computational analyses by aggregating gene sets from a large number of databases.The most significant genetic contribution to schizophrenia was found in genes encoding proteins located in excitatory synapses, in particular the proteome of the PSD, followed by histone methylation-related processes.The PSD of excitatory synapses comprises protein complexes that assemble glutamate-sensitive neurotransmitter receptors to intracellular proteins (38,39).Postsynaptic networks have long been implicated in schizophrenia (40)(41)(42) because they play a role in cognition and synaptic plasticity, features known to be disrupted in schizophrenia (13,43).In addition, both postsynaptic and presynaptic networks were uncovered through gene prioritization in the latest schizophrenia GWAS, highlighting the consistency of associations with this pathway and further rationale to investigate the pathway here (4).
We identified associations between the PSD gene set and decreased global and cingulate lobe SA, as well as higher PLEassociated distress.The cingulate lobe is part of the limbic system and comprises 4 cortical regions: rostral, caudal, posterior, and isthmus cingulate.The region is involved in behavior, emotion regulation, and cognitive processes including memory, attention, and motivation, all of which are disrupted in schizophrenia (44).Volumes in this region were reduced in previous studies investigating their association with schizophrenia, and a systematic review concluded that hypoactivity of the cingulate cortex underlies the manifestation of negative symptoms in many patients, although the studies analyzed provided inconsistent results (44,45).Interestingly, WG PRSs have not shown associations with cingulate or global SA in previous studies (46,47).Associations identified here indicate that narrowing the genome to a biologically relevant gene set may reveal associations that are not observed genome-wide.
The axon gene set, a cellular component that conducts electrical signals to presynaptic boutons that store and release neurotransmitters, has also been found to play an important role in schizophrenia.Specifically, disruption in axon guidance and axonal growth have been associated with increased schizophrenia risk in both human and mouse models (48,49).A recent study showed that individuals at high genetic risk for schizophrenia had hemispherical asymmetry in whole-brain structural networks, indicating that genetic susceptibility to schizophrenia modulated white matter network abnormalities.In addition, gene-set enrichment analysis found that genes participating in the PRS threshold used were involved in multiple relevant pathways, including axonal growth and axon guidance (18).Therefore, the axon gene set is a highly relevant route of investigation in the context of schizophrenia.Here, the axon gene set was associated with thalamic radiations, white matter microstructural tracts that link the thalamus to the rest of the cerebral cortex (50), and volume of the parahippocampal gyrus, a cortical region that plays a role in memory processes such as encoding and retrieval (51).Both neuroimaging phenotypes have been implicated in schizophrenia (52) and have recently been associated with state anhedonia and PRSs for anhedonia, a core negative symptom of schizophrenia ( 53).An opposite directionality of effect for WG PRSs and gene-set PRSs on parahippocampal gyrus volume could denote differential pathway-specific action on this region.The findings here indicate that genes conferring risk for schizophrenia that are aggregated in the axon gene set are strengthening evidence for brain structural regions that have already been implicated in schizophrenia psychopathology.
Finally, the histone H3-K4 methylation gene set was associated with entorhinal SA here.Histone H3-K4 methylation involves the modification of histone H3 by the addition of one or more methyl groups to lysine at position 4 of the histone.Histones, and specifically H3, are used to package DNA, and modifications lead to changes in gene expression (54).Epigenetic processes, including histone and DNA methylation, have been associated with schizophrenia through candidate gene regulation (HDAC1, GAD67) and epigenome-wide studies, lending support to the investigation of epigenetic modifications in schizophrenia (55)(56)(57).A key feature of schizophrenia is that it has its age of onset in young adulthood, and the transcriptional regulation of schizophrenia risk genes encoding synaptic proteins occurs at the age of onset, potentially through mechanisms involving H3-K4 methylation (42).
The entorhinal cortex plays a role in the integration of multisensory information between cortical and subcortical structures.It is strongly connected to the hippocampus and is involved in memory processes such as formation and consolidation (58).In studies investigating a neurodevelopmental animal model of schizophrenia, lesions in the entorhinal cortex, created early in the developmental period, were associated with enhanced mesolimbic dopamine release at a later time point, which was expressed through increased locomotor activity.This indicates that structural disruptions in this area are relevant for schizophrenia-like presentations (59).Interestingly, in an animal model, H3-K4 methylation was found to be upregulated in the hippocampus 1 hour after induction of contextual fear conditioning, a memory-related task that aims to rapidly create context-related fear memories (17).
Although not directly related to the entorhinal cortex, this finding indicates that H3-K4 methylation may be a useful candidate in the investigation of some schizophrenia-related symptoms in a subcortical structure closely linked to the entorhinal region.Lastly, in a recent human study investigating cortical thickness, SA, and folding index of the entorhinal cortex, Schultz et al. (58) uncovered a link between left SA and folding index and increased psychotic symptom severity, further supporting the role of the region in schizophrenia.Notably, we observed an opposite directionality of effect for H3-HK methylation WG PRSs and gene-set PRSs on entorhinal SA, which we hypothesize could be due to differential action of the pathway SNPs versus the WG SNPs.
Interestingly, the dendritic spine and PSM gene sets were not associated with any of the phenotypes that were investigated.Both cellular components have been linked to schizophrenia previously and are closely related to the other biologically relevant pathways investigated here (i.e., PSD, axon gene sets) (60,61).This may indicate that, while these gene sets are relevant in schizophrenia, we may need additional information [e.g., interaction with other relevant biological pathways or investigation of copy number variants in relation to these cellular components (62)] to identify associations with these reported PLE and structural brain features.Despite correlation analyses of PRSs across participants revealing a strong positive correlation between PSD-and PSM-pathway PRSs (r = 0.63, p # .01),this did not translate to similarly strong associations within the main analyses.
All PLEs, as well as distress, were significantly associated with WG PRSs (excluding SNPs in each gene set), as expected.This indicates that the PRSs generated from the schizophrenia GWAS were able to predict schizophreniarelated traits and therefore are a valuable tool.In addition, a number of neuroimaging phenotypes were associated with WG PRSs (irrespective of which gene set-specific SNPs were removed prior to calculation), including white matter microstructure and cortical and subcortical volume regions (see Table S4 in the Supplement for list of regions).These areas have been associated with WG PRSs in previous studies, calculated both with the GWAS summary statistics utilized here and with those from older GWASs (63,64).These findings confirm previous results and provide additional evidence of an association between increased genetic risk for schizophrenia and disruptions in gray and white matter.However, the finding that WG PRSs (excluding SNPs in each gene set) more accurately predicted schizophrenia-relevant phenotypes over pathway PRSs for these measures at this stage qualifies our aim to address the disorder's heterogeneity by selecting biologically relevant pathways.Future studies could incorporate additional factors such as expressionquantitative trait loci to PRSs to determine whether pathway-specific gene expression on brain tissue can better differentiate variation in specific phenotypes over nonpathway expression trends.
The current study has a number of strengths.We utilized the largest, most recent GWAS of schizophrenia with a large sample comprising reported-PLEs and neuroimaging datasets.Our findings were further tested utilizing permutation analysis, which indicated that results were not due to chance.Finally, we were able to identify associations that could be utilized in the future to identify stratified patient populations, potentially leading to earlier diagnosis and applied interventions.
Limitations include the investigation of these associations in a general population sample, in which participants are generally healthier and wealthier than the general population, as shown by Fry et al. (65).Due to the low number of ICD-10 diagnosed schizophrenia participants with reported-PLEs (n = 123) and imaging (n = 49) measures, we were unable to attempt a replication and highly encourage that this be done in larger cohorts.This may partially explain the low effect sizes we observed throughout, and it is expected that effect sizes will increase as clinical samples with available genetic and neuroimaging data increase in number.We are unable to generalize our findings to populations of non-European ancestry because both our GWAS and target samples consisted exclusively of individuals of European descent.However, efforts are continually being made to collect data from people of other ancestries, and the associations identified here could be explored in these cohorts.A further limitation includes the use of the MHQ, which measures lifetime occurrence of PLEs rather than PLEs that occurred at a specific time point.Therefore, our results may differ from those obtained in samples of participants in other age groups.Furthermore, responses were self-reported, and it is unclear whether reported PLEs arose from normal experiences or were linked to a psychiatric disorder.The strong associations with schizophrenia for WG PRSs (excluding SNPs in each gene set) indicate that the items measure some psychosis-related features.However, although PLEs are a hallmark of schizophrenia, they may be shared by a number of mental health disorders and nonclinical phenotypes (66,67).As such, schizophrenia-implicated pathways may not fully capture the genetic basis for PLEs.A multidisorder study focused on the severity of PLEs encompassing pathways associated with other mental health disorders could disentangle potentially shared genetic pathways associated with PLEs.
The differential directionality of effect for a number of pathway PRSs and WG (excluding SNPs in each gene set) PRSs may signal the presence of other disorder-contributing factors which may have differential pathway and WG effects that were not controlled for here.Future studies could include a wider range of covariates encompassing both environmental and genomic factors, such as smoking status and methylation.
Lastly, several methodological limitations should be noted.Reference-based approaches, such as the pathway-of-interest selection employed here, may limit the strength of their findings (68) by lowering the threshold required for a pathway to be significantly associated with a phenotype.An omnibus study utilizing all available pathways could be useful in contextualizing the strength of these findings in relation to the larger pool of gene ontologies not previously associated with schizophrenia.Although FreeSurfer is a standard methodological approach to analyzing structural MRI data across psychiatry, it may have limitations compared with manual tracing approaches, especially concerning segmentation variability (69).Future studies could utilize a combination of image processing software, such as FreeSurfer and voxel-based morphometry, to validate their findings.Finally, window-based approaches for SNP-gene mapping are limited by their chosen arbitrary distances and may be less informative than methodologies using gene expression data.Ultimately, this study is intrinsically limited by its inclusion of only 5 biologically relevant pathways and a limited number and type of reported-PLEs and neuroimaging measures and the methods used to obtain these.Future studies could complement this investigation by diversifying these parameters, especially in the case of utilizing functional imaging measures, as well as by replicating our findings in other large cohorts.
The pathways investigated here were previously found to be implicated in schizophrenia in both animal and human studies (4,17,53).In this study, we identified structural neuroimaging and reported-PLEs phenotypes that were associated with biologically informed PRSs, and these were shown to be more strongly associated with neuroimaging phenotypes than WG PRSs.However, because associations were stronger overall for WG PRSs, our findings also indicate that genetic risk aggregated to biologically relevant pathways is not yet more informative than genome-wide risk, but it may be still of relevance to future studies attempting to address heterogeneity and stratify individuals by genetic risk.

Figure 1 .Figure 2 .
Figure 1.Significant associations between the axon and postsynaptic density gene-set PRSs and reported-PLEs phenotypes.The x-axis indicates standardized effect sizes; * indicates significant associations after correction for multiple comparisons.PLE, psychotic-like experience; PRS, polygenic risk score; WG, whole-genome (excluding single nucleotide polymorphisms in each gene set).

Figure 3 .
Figure 3. Cortical and white matter microstructure phenotypes that were associated with the axon, postsynaptic density, and histone gene-set polygenic risk scores.The thalamic radiations tract category, comprising the anterior, superior, and posterior thalamic radiations, was associated with the axon gene-set polygenic risk score; the cingulate lobe, comprising caudal anterior, rostral anterior, posterior and isthmus cingulate, was associated with postsynaptic density gene-set polygenic risk scores.Global surface area (i.e., the entire brain's surface area) was also associated with postsynaptic density, but this is not indicated in the graph above.
. The final genetic sample consisted of N = 365,125 participants and N = 5,974,990 SNPs, which was further reduced

Table 2 .
Demographic Characteristics of the Neuroimaging Samples n is different for each variable because outlier exclusion (3 SDs from mean) was applied individually to each phenotype.Schizophrenia Pathway PRSs and Brain Phenotypes816Biological Psychiatry: Global Open Science October 2023; 3:814-823 www.sobp.org/GOS