Interactome overlap between schizophrenia and cognition

Cognitive impairments constitute a core feature of schizophrenia, and a genetic overlap between schizophrenia and cognitive functioning in healthy individuals has been identified. However, due to the high polygenicity and complex genetic architecture of both traits, overlapping biological pathways have not yet been identified between schizophrenia and normal cognitive ability. Network medicine offers a framework to study underlying biological pathways through protein-protein interactions among risk genes. Here, established network-based methods were used to characterize the biological relatedness of schizophrenia and cognition by examining the genetic link between schizophrenia risk genes and genes associated with cognitive performance in healthy individuals, through the protein interactome. First, network separation showed a profound interactome overlap between schizophrenia risk genes and genes associated with cognitive performance (SAB = -0.22, z-score = -6.80, p = 5.38e-12). To characterize this overlap, network propagation was thereafter used to identify schizophrenia risk genes that are close to cognition-associated genes in the interactome network space (n = 140, of which 54 were part of the direct genetic overlap). Schizophrenia risk genes close to cognition were enriched for pathways including long-term potentiation and Alzheimer's disease, and included genes with a role in neurotransmitter systems important for cognitive functioning, such as glutamate and dopamine. These results pinpoint a subset of schizophrenia risk genes that are of particular interest for further examination in schizophrenia patient groups, of which some are druggable genes with potential as candidate targets for cognitive enhancing drugs.


Introduction
Schizophrenia is a severe neuropsychiatric disorder that affects about 1% of the population. Although the disease is diagnosed based on positive and negative symptoms, cognitive impairments are considered a third core aspect of the disease, affecting about 80% of the patients (Tsuang et al., 1990;Young et al., 2015). Further, cognitive impairment has been observed before the onset of clinical symptoms, indicating that cognitive symptoms in schizophrenia are not related to secondary effects of the disease such as medication (Gold, 2004;Khandaker et al., 2011). While cognitive symptoms have a higher impact on functional outcomes of the disease, as well as patients' ability to reintegrate into society (Green, 2006) than the positive symptoms (Galderisi et al., 2016;Vita and Barlati, 2018), currently available medications primarily treat the positive symptoms by reducing dopamine (Rampino et al., 2019). Thus, cognitive impairment constitutes an important unmet treatment need for patients with schizophrenia as well as other neuropsychiatric disorders. To develop new drugs and treatments, further understanding of the underlying mechanisms behind specific symptoms is warranted.
Genetics is known to play a crucial role in the development of schizophrenia, which has an estimated heritability of 80% (Sullivan et al., 2003), but inheritance is complex, with a large number of identified risk genes that do not functionally relate to one single biological system (Neale and Sklar, 2015; Purcell et al., 2009). Several neurotransmitter systems have been suggested to be involved in the disease pathology, such as dopamine, serotonin, and glutamate (Stępnicki et al., 2018), but none can explain all aspects of the disease, and to what extent the different domains of clinical symptoms are a result of shared or separate underlying biological mechanisms remains to be revealed (Neale and Sklar, 2015; Purcell et al., 2009; Stępnicki et al., 2018). Clinically, it has been shown that the degree of cognitive impairment in schizophrenia correlates with negative symptom severity but not with positive symptoms (Carbon and Correll, 2014). Genetic correlation analyses have identified a genetic overlap between schizophrenia and cognitive functioning in healthy individuals (Hubbard et al., 2016; Smeland et al., 2017), and increased genetic risk for schizophrenia has been related to decreased cognitive performance in healthy individuals (Germine et al., 2016; Hubbard et al., 2016; Kauppi et al., 2015). However, cognitive impairment in patients has not been predicted by genetic risk scores for the disease, but rather by genetic profile scores for cognitive performance (Richards et al., 2019). It is possible that a subset of schizophrenia risk genes that are involved in biological pathways shared with cognition-related genes is related to cognitive symptoms in patients (Hubbard et al., 2016).
In this study, we examined the genetic overlap between schizophrenia and cognition from a perspective of biological gene networks to identify schizophrenia risk genes that are related to the cognitive symptoms in schizophrenia. We utilized information from the human protein interactome (Menche et al., 2015), together with identified genes associated with schizophrenia (Ripke et al., 2014) and cognitive functioning (Lee et al., 2018) from large-scale genome wide association studies (GWAS). The protein products of risk genes for many polygenic disorders are often involved in similar cellular processes, and therefore tend to cluster together in the interactome, forming a "disease module" or "trait module" when examining a trait such as cognitive functioning. It has been shown that diseases whose modules overlap often share clinical characteristics as well as biological and functional similarities (Menche et al., 2015). Such network-based mapping is biologically more informative than examining shared risk genes between the two traits (Smeland et al., 2017). The identification of the genetic overlap between schizophrenia and cognition may help to reveal the biological relatedness of cognitive functioning and schizophrenia, and may suggest candidate proteins for new drugs targeting cognitive symptoms.

Disease/trait genes
To identify genes related to schizophrenia and cognitive functioning, summary statistics from large-scale GWASs were used. Schizophrenia risk genes were derived from the discovery sample of a GWAS performed by the Psychiatric Genomics Consortium (PGC) including 35,476 cases and 46,839 controls, which identified 108 independent loci associated with schizophrenia (Ripke et al., 2014). For genes associated with cognitive functioning, we used a large multicenter GWAS on cognitive performance measured across at least three domains of cognition including 257,841 individuals (Lee et al., 2018). For control analysis, a GWAS of osteoporosis including 142,487 individuals was used, in which 159 loci significantly associated with osteoporosis have been identified (Kemp et al., 2017). For all three phenotypes, genes linked to genome-wide significant single nucleotide polymorphisms (SNPs) were identified using the "SNPtoGene" function of the web-based software Functional Mapping and Annotation of Genome-Wide Association Studies (FUMA) (Watanabe et al., 2017). First, variants from the extended major histocompatibility complex (MHC) region (25-34 Mb on chromosome 6 on the hg19 assembly) were excluded because of the high linkage disequilibrium (LD) in this region, which contains hundreds of genes for which it is not known which gene(s) have a causal role in the disease/trait. Thereafter, genes were selected that were located within an LD threshold of r² < 0.6, and a maximum distance of 250 kb, from genome-wide significant SNPs (p < 5 × 10⁻⁸) with a minor allele frequency (MAF) of ≥0.01 (positional mapping with default settings in FUMA).

Interactome databases, network terminology and illustrations
For the main analyses, we used the protein-protein interaction (PPI) network created by Menche et al. (2015), here referred to as the human interactome. This interactome consists of 13,460 proteins interconnected by 141,296 interactions (see Supplementary materials for details). To validate the results from the human interactome, the STRING database (von Mering et al., 2005) was used for comparison (see Supplementary materials for details). For these analyses, we used combined confidence scores of >0.7 for the human STRING interactome, version 11.0, giving 17,161 proteins interconnected by 841,068 interactions. Network figures were created using Cytoscape (Shannon et al., 2003), where nodes refer to genes and edges refer to interactions between two genes through identified PPIs between gene products (proteins).

Network localization
Using network localization (Menche et al., 2015), we examined if genes associated with cognitive functioning are significantly localized in the network space, as we previously reported for schizophrenia risk genes (Kauppi et al., 2018). For each cognition-associated gene, we calculated the number of interaction steps, d_s, to the next closest trait gene and the corresponding frequency distribution of d_s across all cognition-associated genes. To determine whether cognition-associated genes are more localized than expected by chance, 1000 randomly selected sets of genes with the same number of genes as the trait were generated for comparison with d_s for cognition-associated genes to calculate test statistics (see Menche et al. (2015) for details).
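The localization test can be illustrated with a minimal sketch on a toy network. The graph, the gene labels, the function names and the use of an empirical p-value over uniformly sampled random gene sets are all assumptions made for illustration; the published procedure follows Menche et al. (2015) and may differ in its test statistic.

```python
from collections import deque
import random


def bfs_distances(adj, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_distances(adj, genes):
    """d_s for each gene: steps to the nearest *other* gene of the same set."""
    gene_set = set(genes)
    d_s = {}
    for gene in gene_set:
        dist = bfs_distances(adj, gene)
        others = [dist[g] for g in gene_set if g != gene and g in dist]
        if others:
            d_s[gene] = min(others)
    return d_s


def mean_ds(adj, genes):
    """Mean d_s over the gene set (infinite if no gene reaches another)."""
    d_s = closest_distances(adj, genes)
    return sum(d_s.values()) / len(d_s) if d_s else float("inf")


def localization_p(adj, genes, n_random=1000, seed=0):
    """Empirical p-value: share of random sets at least as localized."""
    rng = random.Random(seed)
    observed = mean_ds(adj, genes)
    nodes = sorted(adj)
    hits = sum(
        mean_ds(adj, rng.sample(nodes, len(genes))) <= observed
        for _ in range(n_random)
    )
    return observed, (hits + 1) / (n_random + 1)


# Hypothetical toy interactome: a simple chain A-B-C-D-E-F-G.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "G")]
toy_adj = {}
for u, v in edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)

clustered = mean_ds(toy_adj, ["A", "B", "C"])   # neighbouring genes
scattered = mean_ds(toy_adj, ["A", "D", "G"])   # genes spread along the chain
```

In this toy example the clustered set has a smaller mean d_s than the scattered set, which is the property the randomization test quantifies.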

Disease module overlap between schizophrenia and cognition
To examine if the disease/trait modules of schizophrenia and cognition, as well as our control disease, were separated or overlapped in the interactome network space, we used a method called network separation (Menche et al., 2015). We calculated the network-based separation (S_AB) of disease/trait pairs (A and B) by comparing the shortest distance between proteins within each disease/trait (d_AA and d_BB) to the shortest distance between the disease/trait pairs (d_AB). Thus, the network-based separation of a disease/trait pair, A and B, is calculated using S_AB = d_AB − (d_AA + d_BB)/2, where a positive S_AB indicates separation and a negative S_AB value implies that there is an overlap between the two disease/trait modules (Menche et al., 2015). As node degree (number of edges of a node) in the interactome is biased toward well-studied genes, we generated a node degree-preserved random distribution from 1000 random repetitions following Guney et al. (2016) to calculate test statistics for the observed S_AB value. This method uses a data binning approach that groups nodes within a certain degree interval (≥100 nodes in each bin) and then the same number of nodes from each bin is selected for the randomly selected gene lists as in the disease/trait gene list. We repeated the network-separation analysis in a brain-specific interactome (Kitsak et al., 2016). Using gene expression data (RNA-seq) from the human protein atlas version 19.3 (Uhlén et al., 2015), we excluded proteins that were not expressed in the brain from the human interactome, resulting in a brain-specific interactome (in terms of its nodes) containing 136,006 protein interactions and 12,740 proteins.
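As an illustration of the separation measure, the following sketch computes S_AB = d_AB − (d_AA + d_BB)/2 on a toy chain network. It assumes the mean-of-nearest-distance definitions of Menche et al. (2015), with genes shared by both sets contributing a distance of 0 to d_AB; the graph, gene sets and function names are hypothetical, and the degree-preserving randomization is not included.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def nearest(adj, source, targets, exclude_self):
    """Distance from source to its nearest target (None if unreachable)."""
    dist = bfs_distances(adj, source)
    values = [
        dist[t] for t in targets
        if t in dist and not (exclude_self and t == source)
    ]
    return min(values) if values else None


def mean_nearest(adj, sources, targets, exclude_self):
    """Mean nearest-target distance over all sources that reach a target."""
    values = [nearest(adj, s, targets, exclude_self) for s in sources]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


def separation(adj, set_a, set_b):
    """S_AB = d_AB - (d_AA + d_BB) / 2; negative values indicate overlap."""
    d_aa = mean_nearest(adj, set_a, set_a, exclude_self=True)
    d_bb = mean_nearest(adj, set_b, set_b, exclude_self=True)
    between = [nearest(adj, a, set_b, False) for a in set_a]
    between += [nearest(adj, b, set_a, False) for b in set_b]
    between = [v for v in between if v is not None]
    d_ab = sum(between) / len(between)
    return d_ab - (d_aa + d_bb) / 2


# Hypothetical toy interactome: a chain of eight nodes 1-2-...-8.
chain = {i: set() for i in range(1, 9)}
for i in range(1, 8):
    chain[i].add(i + 1)
    chain[i + 1].add(i)

s_far = separation(chain, {1, 2}, {7, 8})            # distant modules
s_overlap = separation(chain, {1, 2, 3}, {2, 3, 4})  # modules sharing genes
```

Here the distant modules give a positive S_AB and the modules sharing two genes give a negative S_AB, matching the interpretation described above.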

Network propagation
To identify the disease modules of schizophrenia and cognitive functioning, as well as their disease module overlap, we used the method network propagation (Carlin et al., 2017;Köhler et al., 2008;Vanunu et al., 2010), implemented in the Cytoscape application Diffusion (Carlin et al., 2017). Starting with a chosen set of input proteins, information from all their PPIs (referred to as heat) is transferred to their neighbors and received from them through an iterative process. The strength of association of proteins is scored (0-1), where higher diffusion output (heat) values correspond to higher relatedness to the input query proteins (Carlin et al., 2017;Köhler et al., 2008;Vanunu et al., 2010). All network propagation analyses were performed in the human interactome where we excluded self-interactions, resulting in 138,427 interactions. To estimate network closeness between schizophrenia genes and cognition-associated genes, we performed network propagation using all cognition genes as input query.
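The idea of propagating heat from a set of query proteins can be sketched with a random walk with restart, in the spirit of Köhler et al. (2008). This is a swapped-in illustration, not the algorithm of the Cytoscape Diffusion application, whose kernel and parameters may differ; the restart probability, the toy graph and all function names are assumptions.

```python
def propagate(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Random walk with restart; returns a score per node that sums to 1."""
    nodes = sorted(adj)
    seeds = [s for s in seeds if s in adj]
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(max_iter):
        # Each step: restart mass goes back to the seeds ...
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                # ... and the rest is split equally among a node's neighbours.
                share = (1.0 - restart) * score[n] / len(adj[n])
                for neighbor in adj[n]:
                    updated[neighbor] += share
            else:
                # Isolated nodes keep their own mass.
                updated[n] += (1.0 - restart) * score[n]
        change = sum(abs(updated[n] - score[n]) for n in nodes)
        score = updated
        if change < tol:
            break
    return score


# Hypothetical toy interactome: chain a-b-c-d-e with "a" as the only query gene.
toy = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
heat = propagate(toy, seeds={"a"})
ranked = sorted(heat, key=heat.get, reverse=True)
```

In this toy chain the score decreases with distance from the query gene, which mirrors how higher heat values are read as higher relatedness to the input query.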
To investigate which genes are close to cognition genes, we first chose a random set of non-cognition-associated genes with equal number and comparable node degree as the cognition-associated genes (n = 479). Next, we performed a receiver operating characteristic (ROC) analysis using the heat value from all cognition-associated genes and the heat value from the randomly selected genes to predict their gene class ("randomly selected" versus "cognition-associated").
Here, sensitivity refers to cognition-associated genes (positives) correctly classified as cognition-associated genes, and 1-specificity refers to randomly selected genes (negatives) falsely classified as cognition-associated genes. To define a cut-off giving equal importance to sensitivity and specificity, we calculated the Youden index (Youden, 1950). This procedure was repeated 1000 times and the mean value from these 1000 cut-offs was used as the final cut-off for the heat values. Correspondingly, we performed network propagation analysis with all schizophrenia risk genes as input query and defined a cut-off from 1000 ROC analyses with the heat values from all schizophrenia risk genes (positives) and the heat value from non-schizophrenia random genes (negatives) to define which genes are close to schizophrenia. These analyses were performed in R version 3.5.1 (R Core Team, 2018).
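The cut-off procedure can be sketched as follows. The original analyses were performed in R; this Python version is only an illustration, the function names and heat values are hypothetical, and the negatives are drawn uniformly from a background list rather than matched on node degree as in the study.

```python
import random


def youden_cutoff(positives, negatives):
    """Threshold maximizing J = sensitivity + specificity - 1.

    A gene is called positive when its heat value is >= the threshold.
    Ties are resolved in favour of the lowest threshold.
    """
    best_j, best_t = -1.0, None
    for t in sorted(set(positives) | set(negatives)):
        sensitivity = sum(x >= t for x in positives) / len(positives)
        specificity = sum(x < t for x in negatives) / len(negatives)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j


def mean_cutoff(positive_heat, background_heat, n_repeats=1000, seed=0):
    """Average Youden cut-off over repeated random negative sets."""
    rng = random.Random(seed)
    cutoffs = []
    for _ in range(n_repeats):
        negatives = rng.sample(background_heat, len(positive_heat))
        cutoffs.append(youden_cutoff(positive_heat, negatives)[0])
    return sum(cutoffs) / len(cutoffs)


# Hypothetical heat values for trait genes and for one random comparison set.
threshold, j_index = youden_cutoff(
    [0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65]
)
```

Averaging the cut-off over many random negative sets, as in `mean_cutoff`, reduces the dependence of the final threshold on any single random draw.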

Gene ontology and pathway enrichment analyses, and drug-gene interactions
To describe the function of specific gene sets, ToppGene (Chen et al., 2009) (last updated 2019-12-03) was used to examine enrichment in gene ontology annotations for schizophrenia risk genes determined as close to cognition-associated genes within the human interactome based on network propagation analyses. From ToppGene, which uses hypergeometric distribution with Bonferroni correction for determining statistical significance, we used the gene ontology annotation categories molecular function, biological process and cellular component as well as pathways. A Bonferroni-corrected p-value threshold of 0.05 was used. Additionally, the drug-gene interaction database (DGIdb) (Cotto et al., 2018) (v3.0.2, last updated 2018-01-25) was used to search for druggable genes and potential druggability among the schizophrenia risk genes defined as close to cognition. In addition, we used GUILDify (v2.0) (Aguirre-Plans et al., 2019) to describe biological functions and potential drugs within the genetic overlap between schizophrenia and cognition defined by the Netscore algorithm in the BIANA (Biological Interactions And Network Analysis) network that consists of 13,090 proteins connected by 320,337 interactions derived from external databases characterized by attributes like detection method and reliability (Garcia-Garcia et al., 2010). The Netscore algorithm is comparable to the Network propagation algorithm, but defines closeness between two traits by first defining top ranking genes within each trait within the whole interactome and then defining the overlap between the top-ranking genes (Aguirre-Plans et al., 2019).
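The type of test described for the enrichment analysis, a hypergeometric test with Bonferroni correction, can be sketched as below. This is not ToppGene's implementation: the background size, the annotation sets and the function names are hypothetical and serve only to show the calculation.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def enrichment(query_genes, gene_sets, background_size, alpha=0.05):
    """Return {term: (overlap, corrected p)} for terms passing alpha."""
    query = set(query_genes)
    terms = sorted(gene_sets)
    raw = []
    for term in terms:
        overlap = len(query & set(gene_sets[term]))
        raw.append(
            hypergeom_sf(overlap, background_size, len(gene_sets[term]), len(query))
        )
    corrected = bonferroni(raw)
    return {
        term: (len(query & set(gene_sets[term])), p)
        for term, p in zip(terms, corrected)
        if p < alpha
    }


# Hypothetical example: 5 query genes, background of 1000 annotated genes.
toy_sets = {
    "term_1": ["g1", "g2", "g3", "g4", "x1", "x2"],
    "term_2": ["y%d" % i for i in range(200)] + ["g5"],
}
significant = enrichment(["g1", "g2", "g3", "g4", "g5"], toy_sets, 1000)
```

In this toy example only the first term remains after correction, because four of its six members are in the query list, whereas a single hit in a large term is expected by chance.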

Defining disease/trait modules for cognition and schizophrenia
Using FUMA (Watanabe et al., 2017), 313 schizophrenia risk genes were identified from the schizophrenia GWAS summary statistics (Ripke et al., 2014). Of these, 232 were included in the human interactome (Supplementary Table 1). For cognitive functioning, 621 associated genes were identified from the corresponding GWAS summary statistics (Lee et al., 2018). Of these, 479 were included in the human interactome (Supplementary Table 1). The largest connected component comprised 22 and 136 genes for schizophrenia and cognition, respectively. The STRING interactome, which was used for comparison, includes 201 of the GWAS-identified schizophrenia risk genes and 432 of the genes related to cognitive functioning.
Genes associated with cognition were significantly localized in the human interactome with a mean shortest path length of 1.64 edges between two cognition-associated genes compared to 1.69 edges between two randomly selected genes (p = 0.04). Using the STRING interactome, the mean shortest path length between cognition gene pairs was 1.55 compared to 1.60 edges between two randomly selected genes, again significantly shorter (p = 0.02), indicating a trait module for cognitive performance in the interactome, as was previously shown for schizophrenia (Kauppi et al., 2018). To see which genes are most central in each of those disease/trait modules, we performed network propagation analyses using all genes associated with schizophrenia (Fig. S1A) or cognition (Fig. S1B) as input query. These analyses take all interactions in the interactome into account to identify global relatedness between genes.

Overlap between schizophrenia and cognition
Risk genes for schizophrenia and cognition-associated genes had a negative network separation, indicating network-based overlap, which was significantly different from the separation between schizophrenia and degree-matched randomized gene sets (S_AB = −0.22, z-score = −6.80, p = 5.38e−12). In the STRING interactome, a significantly more negative network separation between schizophrenia risk genes and cognition-associated genes was seen compared to randomly selected gene sets (S_AB = −0.08, z-score = −2.86, p = 0.002), which confirms a significant overlap between schizophrenia and cognition. The network-separation analysis within the brain-specific interactome also showed a significant overlap between schizophrenia and cognition (S_AB = −0.21, z-score = −5.65, p = 8.13e−09).
In both the human interactome and STRING, there was no overlap between risk genes for osteoporosis, our control disease, and schizophrenia risk genes or cognition-associated genes (see Supplementary materials for details).
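How an observed separation can be compared with a random distribution is sketched below. The text does not state how the p-values were derived from the z-scores; a one-sided (lower-tail) normal approximation is assumed here purely for illustration, and the function names and the example random values are hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def z_score(observed, random_values):
    """Standardize an observed statistic against a random distribution."""
    return (observed - mean(random_values)) / stdev(random_values)


def lower_tail_p(z):
    """One-sided p-value P(Z <= z) under a standard normal distribution."""
    return 0.5 * erfc(-z / sqrt(2))


# Hypothetical random separations and an observed value below all of them.
example_z = z_score(1.0, [2.0, 3.0, 4.0])
example_p = lower_tail_p(example_z)
```

A strongly negative z-score, meaning an observed S_AB far below the random expectation, therefore maps to a very small lower-tail p-value.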

Characterizing the overlap between schizophrenia and cognition
To identify which specific schizophrenia risk genes are close to cognition-associated genes, we used network propagation analyses to define a subset of genes in the whole interactome that are close to cognition (n = 5471) with all cognition-associated genes as input query and determined a cut-off from the heat value (0.02088248). The network distance (heat value) of each specific schizophrenia risk gene to cognition-associated genes and vice versa is reported in Supplementary Table 2 and shown in more detail in Fig. S2A (schizophrenia risk genes) and Fig. S2B (cognition-associated genes), together with all the within-trait/disease interactions. Schizophrenia risk genes were significantly overrepresented among genes defined as close to cognition (n = 140, z-value = 6.16, p-value = 1.8e−9), even when we excluded genes that are both schizophrenia and cognition-associated (n = 54, z-value = 2.16, p-value = 0.02). We also determined a subset of genes that are close to schizophrenia (n = 1443, cut-off = 0.02128962), of which 93 were cognition-associated genes. Fig. 1 shows all schizophrenia risk genes and cognition-associated genes, and their network-based closeness to each other.

Gene ontology and pathway enrichment analysis, and drug-gene interactions
Gene ontology and pathway enrichment analysis was performed for the 140 schizophrenia risk genes defined as close to cognition. These genes were enriched for gene sets implicated in Alzheimer's disease and long-term potentiation, identified from the KEGG pathway database (Fig. 2). The most significant molecular functions were peptidase activity and drug binding, and enriched biological processes included regulation of catabolic processes, chromosome organization, cell signaling and neuron differentiation (Supplementary Table 3). The schizophrenia risk genes not defined as close to cognition were enriched for gene sets implicated in nicotinic acetylcholine receptor activity. Among the 140 schizophrenia risk genes defined as close to cognition, 51 are drug targets (mostly for Alzheimer's disease, inflammation, diabetes, epilepsy, and cancer), and among those genes that are not yet used as drug targets, 45 are druggable. While these numbers were not extreme events (two-sample proportion test, drug targets: z-value = 1.29, p-value = 0.10; druggable genes: z-value = 0.99, p-value = 0.16), these genes may serve as suggestions for repurposing and as potential new drug targets, respectively (Fig. 2, Supplementary Table 3). Using all schizophrenia risk genes (n = 313) and cognition-associated genes (n = 621) as input in GUILDify (Aguirre-Plans et al., 2019), we first identified a significant genetic overlap between schizophrenia and cognition (p = 7.0e−09) defined by the Netscore algorithm as common genes among the 1% top-ranking genes within the BIANA network (Garcia-Garcia et al., 2010). These common genes (n = 12) were enriched for T cell activation and positive regulation of neuron apoptotic processes. The potential drugs targeting the common genes were mostly antidiabetics and anti-cancer drugs.

Discussion
Using established network-based methods, we have identified and characterized an overlap between schizophrenia and cognition through biological gene networks of protein interactions among gene products. Overlapping parts of the interactome were linked to cognition-related pathways, such as long-term potentiation, and contained druggable genes that may be of interest as drug targets to treat the cognitive symptoms in schizophrenia.
In a recent publication we showed that schizophrenia risk genes are significantly localized in the interactome, forming a disease module (Kauppi et al., 2018). Here, we first tested whether cognition-associated genes were also significantly localized in the interactome network space, which was the case. To study the biological link between schizophrenia risk genes and cognition-associated genes, we examined their relatedness in the interactome. We found that the trait/disease modules of schizophrenia and cognition significantly overlap in the network space, indicating shared biological processes (Menche et al., 2015). This was also seen in the brain-specific interactome, strengthening the biological validity of the results.
Next, using network propagation (Carlin et al., 2017; Köhler et al., 2008; Vanunu et al., 2010), and an unbiased cut-off value for network-based closeness, we defined genes close to cognition-associated genes in the whole interactome. Among genes close to cognition, we found an overrepresentation of schizophrenia risk genes (n = 140), even when excluding schizophrenia risk genes that are also associated with cognition (n = 54, as those were part of the input gene list). Those results show that the biological overlap between schizophrenia and cognition extends beyond the direct genetic overlap (n = 54) to also include schizophrenia risk genes that are close to cognition-related genes in the protein interactome (n = 140). In contrast to network-based measures of shortest distance, the network propagation algorithm has the advantage of expressing the global network proximity based on the number of interactions with other genes, thus not favouring genes with a generally large node degree (Köhler et al., 2008). From those analyses we also defined trait/disease modules of schizophrenia and cognition based on their heat value for relatedness to other genes from the same trait/disease module (Fig. S1A and B).
Gene ontology analyses of the 140 schizophrenia risk genes close to cognition revealed strongest enrichment for the KEGG pathways Alzheimer's disease and long-term potentiation, based on genes such as calcium-channel (CACNA1C), and glutamate-receptor genes (GRIA, GRIN2A), suggesting a plausible route of those genes to cognitive impairments. Another set of genes that have been associated with the cognitive symptoms in schizophrenia are the nicotinic acetylcholine receptor genes (Friedman, 2004). Notably, these were not among the 140 schizophrenia risk genes that are close to cognition. However, clinical evidence suggests that acetylcholine esterase inhibitors do not improve the cognitive symptoms in schizophrenia patients (Kishi et al., 2018;Santos et al., 2018). Current medications available for schizophrenia patients have considerable limitations such as serious side effects and treatment resistance in about one third of patients with psychosis (Rampino et al., 2019;Stępnicki et al., 2018). To identify drug targets to also treat the negative and cognitive symptoms, novel approaches in schizophrenia drug design aim to cover new signaling mechanisms including various neurotransmitter systems beyond dopamine (Rampino et al., 2019).
To identify genes with potential as new drug targets or for drug repurposing, we examined the interactome overlap between schizophrenia and cognition through the druggable genome, as listed in Supplementary Table 3. Fifty-one drugs already targeted gene products of schizophrenia risk genes defined as close to cognition, which could potentially be used for repurposing to treat the cognitive symptoms in schizophrenia. Among them are drugs developed to treat Alzheimer's disease, anti-inflammatory and immunosuppressant drugs, anticonvulsant medications and antidiabetic medications, but also anticancer and other drugs (Supplementary Table 2). Using Netscore as an alternative algorithm to define the genetic overlap between schizophrenia and cognition, we found that genes within this overlap were mostly targeted by anti-diabetic and oncology drugs. Interestingly, oncology drugs (Araki, 2013), anti-diabetic drugs (Yarchoan and Arnold, 2014), and anti-inflammatory drugs (Wang et al., 2015; Zhang et al., 2018) have been suggested for repurposing to treat the cognitive impairment in Alzheimer's disease, and could be considered for further examination as cognitive enhancers in schizophrenia. Indeed, it has been shown that diseases that share drugs also tend to share biological pathways, which could be targeted to make use of novel drug repurposing opportunities (Aguirre-Plans et al., 2018).
While previous research has shown a genetic overlap between cognition and schizophrenia (Hubbard et al., 2016;Ohi et al., 2018), it has been difficult to relate schizophrenia gene variants to cognitive symptoms (Goff et al., 2012;Richards et al., 2019), which may be a result of the biological heterogeneity of schizophrenia (Goff et al., 2012). Studies investigating schizophrenia symptoms in relation to patients' IQ found that there may be a subgroup of schizophrenia patients with high IQ who do not show any cognitive deficits (Cernis et al., 2015;Maccabe et al., 2012) and who have fewer negative symptoms than typical schizophrenia patients (Cernis et al., 2015). In a recent study, Bansal et al. (2018) suggested that schizophrenia can be divided into two disease subtypes where one resembles bipolar disorder and high IQ, and one that is a cognitive disorder independent of bipolar disorder. Thus, our study contributes to unravel the complexity of schizophrenia by defining a group of schizophrenia risk genes that may be related to a biological disease subgroup that resembles a cognitive disorder. To further explore this hypothesis, risk genes that we identified as close to cognition should be further examined in relation to cognitive symptoms in patients.
A major limitation of the human interactome used in this study is its incompleteness, as it covers only about 20% of all estimated PPIs. However, Menche et al. showed that the incomplete network could be used to successfully identify discrete disease modules of 226 complex diseases, allowing the systematic investigation of disease mechanisms and relationships between diseases (Menche et al., 2015). To address this limitation, we also used the less conservative STRING interactome, for which results remained consistent. Moreover, human interactomes are potentially biased toward well-studied proteins. In the human interactome, this bias is addressed by including PPIs derived from high-throughput data sets (Rolland et al., 2014; Venkatesan et al., 2009; Yu et al., 2011). To minimize this bias, we used node degree-preserved methods.
In summary, we have identified a biological overlap between genes related to schizophrenia and cognitive ability in healthy individuals, through the human protein interactome that extends beyond previously identified shared risk genes. This overlap was characterized by schizophrenia risk genes related to long-term potentiation and Alzheimer's disease, and contained important risk genes with a role in neurotransmitter systems such as glutamate and dopamine, as well as calcium channels, which were not also GWAS-identified cognition genes. The results pinpoint schizophrenia risk genes of particular interest for further examination in relation to cognitive symptoms in schizophrenia patient groups. In addition, many druggable genes were found among genes constituting the overlap, some of which may be potentially suitable as candidates for drugs targeting cognitive symptoms of schizophrenia.

Role of the funding source
The funding source had no involvement in any part of the current work or in the submission of the manuscript.

Contributors
E.K. was responsible for the analyses and drafting the manuscript. B.R. and A.L. were responsible for the statistical methods. K.K. and C-H.C. were responsible for study concept and design. All authors edited and approved the final version of the manuscript.

Declaration of competing interest
The authors have no conflicts of interest to declare.