Genetics of schizophrenia

Schizophrenia is a common psychiatric disorder with a strong genetic component. Recent studies applying new genomic technology to large samples have yielded substantial advances in identifying specific, associated DNA variants as well as clarifying the underlying genetic architecture of the disorder. The genetic liability of schizophrenia is now established as polygenic, with risk alleles in many genes existing across the full allelic frequency spectrum. It has also become apparent that schizophrenia shares risk alleles with other neuropsychiatric phenotypes, such as bipolar disorder, major depressive disorder, autism spectrum disorder, intellectual disability and attention-deficit hyperactivity disorder. These risk variants aggregate in several sets of functionally related genes, thereby providing novel insights into disease pathogenesis and opportunities for research into discovering new treatments.


Introduction
Schizophrenia is a debilitating psychiatric disorder, characterised by hallucinations, delusions, thought disorder and cognitive deficits, and has a lifetime prevalence of around 1%. Evidence for a substantial genetic contribution comes from family, twin and adoption studies [1], but the underlying causes and pathogenesis of the disorder remain unknown. The past few years have witnessed marked progress in our understanding of genetic risk at the level of DNA variation, which has been largely driven by applying advanced genomic technologies to very large samples. There is evidence that risk variants occur across the full allelic frequency spectrum, many of which are associated with other neuropsychiatric disorders. Moreover, genetic associations involving different classes of mutations have now implicated specific biological pathways in disease pathogenesis. This review will cover recent advances in schizophrenia genetics from studies of de novo mutation, rare copy number variation (CNV), rare single nucleotide variant (SNV, defined as point mutations with a frequency less than 1%) and small insertion/deletion (indel) mutations, and single nucleotide polymorphisms (SNPs, defined as point mutations with a frequency greater than 1%) (Figure 1).

De novo mutations
High heritability estimates for schizophrenia suggest that much of the risk is inherited [2]. However, alleles which are not inherited, i.e. newly arising (de novo) mutations, have also been shown to contribute to risk. In addition, increased paternal age at conception, which is correlated with the number of de novo mutations observed in an individual [3,4], has been associated with increased schizophrenia risk [5]. The first molecular evidence associating de novo mutation with schizophrenia came from studies of CNVs [6][7][8]. Across studies, the CNV de novo mutation rate was found to be significantly elevated in schizophrenia (~5%) versus controls (~2%), with some evidence for a higher rate among patients with no family history of the disorder [6][7][8]. The median size of de novo CNVs > 100 Kb found in schizophrenia cases (574 Kb [6][7][8]) is also larger than that in controls (337 Kb [6][7][8][9]). Selection coefficients (s) between 0.12 and 0.88 have been estimated for CNVs robustly associated with schizophrenia (a selection coefficient of 1 being reproductively lethal) [10]. With this intensity of selection, de novo CNVs at schizophrenia-associated loci are purged from the population in less than five generations [10].
Studying gene-sets overrepresented for being disrupted by de novo mutation in schizophrenia has provided novel insights into biological pathways underlying the disorder. For example, genes disrupted by schizophrenia de novo CNVs are enriched for those in the postsynaptic density proteome [6]. This association is largely driven by genes encoding members of the N-methyl-D-aspartate receptor (NMDAR) and neuronal activity-regulated cytoskeleton-associated protein (ARC) complexes, both of which are involved in synaptic plasticity [6].
More recently, exome sequencing studies have permitted the evaluation of de novo SNV mutations and indels in schizophrenia. In contrast to studies of de novo CNVs in schizophrenia, the exome-wide rate of de novo SNV/indel mutations is not increased in cases compared with the population expectation [11]. Some smaller studies have reported slightly elevated rates of de novo SNV mutations, as well as a greater proportion of nonsynonymous de novo mutations, in schizophrenia compared with controls [12][13][14], but these findings were not observed in the largest study to date [11]. However, loss-of-function de novo SNV/indel mutations are enriched among patients with poor educational attainment (these cases did not have intellectual disability) [11]. Multiple schizophrenia loss-of-function de novo SNV/indel mutations have been observed in two genes (TAF13, SETD1A) [11,15], suggesting they are likely to be relevant to the disorder.
The products of genes disrupted by damaging de novo mutations in schizophrenia show greater connectivity in protein-protein interaction (PPI) networks than expected by chance [13] or compared with controls [14]. Nonsense de novo mutations in schizophrenia have also been shown to occur preferentially in genes subject to haploinsufficiency [12], suggesting many are likely to be pathogenic. Despite the lack of an increased exome-wide rate of de novo SNV/indel mutation in schizophrenia, these mutations are enriched among cases in previously associated sets of biologically related genes. Specifically, the ARC and NMDAR postsynaptic protein complexes, associated with schizophrenia in studies of de novo CNVs, have been further implicated through significant enrichments in cases for nonsynonymous and loss-of-function de novo mutations [11]. Brain-expressed genes targeted by fragile X mental retardation protein (FMRP) also show evidence for significant enrichments of de novo SNV/indel mutations in schizophrenia [11], following an earlier observation of a similar enrichment for de novo mutations in autism spectrum disorder (ASD) [16]. Other sets reported to be enriched for de novo mutations include those related to the assembly of actin filament bundles [11], genes related to epigenetic regulation, specifically chromatin remodelling [12,13,15], and genes disrupted by de novo mutations in ASD and intellectual disability (ID) [11].

Rare copy number variations
Studies of rare (<1%) CNVs in schizophrenia have now reported several reproducible associations. It is established that patients with schizophrenia have a significantly increased genome-wide burden of rare CNVs compared with controls, with the strongest effect usually seen for large (>500 Kb) deletions [17][18][19][20]. Since the discovery of a deletion at 22q11.2 as the first schizophrenia-associated CNV [21,22], analyses of rare CNVs involving >20,000 cases have revealed associations at more than 15 loci [20,23,24] (Figure 2). The majority of these CNVs substantially increase the risk of developing schizophrenia, with odds ratios (OR) between two and 60 [24]. As their frequency among patients is often less than one in 500, their individual contribution to the total population variation in schizophrenia genetic liability is small [25], although collectively they are found in around 2.5% of patients [24]. Most schizophrenia-associated CNVs are large and recurrent, meaning multiple mutation events have occurred at the exact same, or near identical, genomic location. The breakpoints of recurrent CNVs are usually flanked by repetitive genomic elements such as low copy repeats (LCRs), which mediate mutation through nonallelic homologous recombination [26]. Ten recurrent CNVs have been associated with schizophrenia at a level of statistical support that survives correction for the multiple testing of 120 potential recurrent CNV loci in the human genome (Figure 2). Drawing biological insights from recurrent CNVs remains a challenge, largely because multiple genes and regulatory elements are often disrupted. However, single-gene disrupting non-recurrent CNVs have also been associated with schizophrenia at NRXN1, VIPR2 and PAK7. These mutations have the potential to offer clearer insights into disease pathogenesis, although only the NRXN1 association survives correction for the multiple testing of all human genes (~20,000). NRXN1 encodes the synaptic cell adhesion molecule neurexin 1, which links presynaptic and postsynaptic neurons [27].
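The multiple-testing corrections mentioned above can be made concrete with a small sketch. It assumes a Bonferroni correction at a family-wise error rate of 0.05, which is an assumption made purely for illustration; the cited studies may have used a different procedure. The helper name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value threshold under a Bonferroni correction.

    Bonferroni and alpha = 0.05 are assumptions for illustration,
    not the documented procedure of the cited studies.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Numbers of tests taken from the text: 120 potential recurrent CNV loci
# and roughly 20,000 human genes.
cnv_locus_threshold = bonferroni_threshold(120)
gene_threshold = bonferroni_threshold(20_000)

print(f"Recurrent CNV loci: P < {cnv_locus_threshold:.2e}")
print(f"All human genes:    P < {gene_threshold:.2e}")
```

Under these assumptions, an association at a single gene has to clear a threshold more than a hundred times stricter than one at a pre-specified recurrent CNV locus, which is consistent with only one of the three single-gene associations surviving correction.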
Gene-set analyses have shown rare CNVs in schizophrenia to be enriched among biological pathways previously implicated in schizophrenia, such as the NMDAR and metabotropic glutamate receptor 5 (mGluR5) components of the postsynaptic density (PSD), calcium channel signalling (see single nucleotide polymorphisms below) and FMRP targets [20]. Additional gene-sets recently implicated in rare CNV studies include signalling components within the immune system, chromatin remodelling complexes and targets of microRNA miR-10a [20].
Schizophrenia-associated CNVs have been shown to increase risk for additional neuropsychiatric disorders [28,29]. For example, schizophrenia-associated duplications of the Williams-Beuren and Prader-Willi/Angelman syndrome regions are also implicated in ASD [9,30], deletions of 15q11.2 and 15q13.3 in epilepsy [31,32] and duplications of 16p13.11 in attention-deficit hyperactivity disorder (ADHD) [33]. Up to 72 pathogenic CNVs, which include the majority of those presented in Figure 2, are enriched in large cohorts of patients with early onset neurodevelopmental phenotypes, such as ID, ASD and congenital malformations (CM) [34,35]. It has been suggested that individuals carrying more than one pathogenic CNV are at greater risk of developing an earlier onset neurodevelopmental disorder (ID/ASD/CM) compared with schizophrenia [36]. In some instances, reciprocal CNVs (i.e. deletions and duplications at the same locus) appear to have different phenotypic effects. For example, deletions and duplications at 16p11.2 are associated with obesity and low body mass index, respectively [37]. In schizophrenia, duplications at 22q11.2 are significantly less common than they are in controls, whereas the deletion of this locus is one of its strongest risk factors [38].
The CNVs in Figure 2 are considered to have fairly high, but incomplete, penetrance for schizophrenia and for other neurodevelopmental disorders, most having lower penetrance for schizophrenia than for the other disorders [28]. However, the incomplete penetrance of these CNVs has recently been questioned in a large study which showed the level of cognitive performance in non-affected carriers of schizophrenia-associated CNVs to be intermediate between that observed in schizophrenia patients and population controls [39].

Rare single nucleotide variant and insertion/deletion mutations
Over the past few years, several publications have used new sequencing technology to investigate rare inherited (as opposed to de novo) alleles in schizophrenia. Intriguing findings have been reported from some studies [40,41], although their results largely remain inconclusive owing to small sample size. Only one schizophrenia study to date has employed exome sequencing in large samples (2536 cases and 2543 controls) [42]. No single rare allele (MAF < 0.1%) was associated at genome-wide levels of significance, and overall, the exome-wide burden of rare variation was not increased in cases. However, a significantly increased burden of rare, disruptive alleles was observed in a set of 2546 genes selected for a higher probability of being associated with schizophrenia. This burden was distributed across a large number of genes. As in the de novo CNV and SNV studies, significant enrichments for rare disruptive SNVs and indels were found in proteins affiliated with ARC and NMDAR genes, and FMRP targets, but also for voltage-gated calcium channels [42]. This work demonstrates a contribution of ultra-rare damaging alleles spread across a large number of genes in schizophrenia, although larger samples are required for robust associations to be made to specific genes/alleles.

Single nucleotide polymorphisms
Genome-wide association studies (GWAS) of SNPs have now identified a number of common schizophrenia risk alleles [43][44][45]. Individually, these alleles have a weak effect on schizophrenia risk, with ORs generally < 1.2, although collectively they are estimated to account for between a third and a half of the variation in schizophrenia genetic liability [43,46,47]. Given the modest effect size of these alleles, very large samples have been required to obtain the necessary statistical power for associations to be made at genome-wide levels of significance (P < 5 × 10⁻⁸). Until recently, the number of individual alleles identified at genome-wide levels of significance was small (n < 30) [43], although en masse analysis of GWAS data had established the disorder to have a polygenic component likely to involve over a thousand common alleles [43,46,48]. Recent successes in the identification of schizophrenia common allele associations can largely be attributed to the Schizophrenia Working Group of the Psychiatric Genomics Consortium (PGC), which was created with the aim of maximising sample size by combining GWAS data from multiple international research groups [49]. The latest data from the PGC identified 128 linkage disequilibrium (LD)-independent genome-wide significant associations in 108 distinct loci [45]. The most significant allelic association in schizophrenia is in the extended major histocompatibility complex (MHC) on the short arm of chromosome 6 [45]. Identifying candidate genes from this association is a major challenge, as the existence of strong LD across this region of about 8 Mb makes it difficult to localise the association to one, or even a few, of the hundreds of genes at the locus. The MHC's involvement in immunity suggests that immune dysfunction might play a role in schizophrenia, although non-immune genes are also found in this region [50]. Additional genome-wide significant associations are found in genes long believed to play a major role in schizophrenia, such as the dopamine receptor D2 gene, which encodes the therapeutic target of most antipsychotic drugs [45]. This suggests that biological insights gained from other novel common allele associations have the potential to identify new drug targets. Gene-set analyses have not yet shown any biological pathway to be significantly enriched for the 128 schizophrenia genome-wide significant associations after correction for multiple testing, and a definitive analysis is awaited [45]. However, the associations are enriched for enhancers expressed in brain, and also for enhancers in tissues involved with immunity [45].
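To give a feel for what an OR below 1.2 means for an individual carrier, the sketch below converts an odds ratio into an absolute risk. It is a naive illustration only: it assumes that the lifetime prevalence of around 1% quoted in the Introduction can serve as the baseline risk, and that a case-control OR applies directly to lifetime odds. Neither assumption comes from the cited studies, and `risk_from_odds_ratio` is a hypothetical helper.

```python
def risk_from_odds_ratio(odds_ratio, baseline_risk=0.01):
    """Convert an odds ratio into an absolute risk.

    baseline_risk defaults to the ~1% lifetime prevalence from the
    Introduction; using it as the non-carrier risk is an assumption.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    carrier_odds = odds_ratio * baseline_odds
    # Convert odds back to a probability.
    return carrier_odds / (1.0 + carrier_odds)


# Upper bound on the OR of a typical common risk allele, as stated in the text.
common_allele_risk = risk_from_odds_ratio(1.2)
print(f"Absolute risk at OR 1.2: {common_allele_risk:.2%}")
```

Under these assumptions, a single common risk allele moves absolute risk from 1% to roughly 1.2%, which illustrates why such alleles are only informative when considered collectively.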
Schizophrenia has been shown to share common risk alleles with other psychiatric disorders, such as bipolar disorder (BD), major depressive disorder (MDD), ASD and ADHD [51]. The most powerful demonstration of this comes from the en masse effects of SNPs, which have revealed a high genetic overlap between schizophrenia and BD, a moderate overlap between schizophrenia and MDD, and a small but significant overlap between schizophrenia and ASD [46,48]. Combining GWAS data from schizophrenia and BD has proved fruitful in identifying common risk alleles [52,53], although polygenic risk scores have also been able partly to distinguish between these disorders, suggesting that some risk alleles may confer more specific effects at the level of the psychiatric phenotype [53]. Schizophrenia polygenic risk scores have also been shown to predict lower cognitive ability [54], suggesting these alleles contribute to the cognitive deficits associated with the disorder.
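Polygenic risk scores of the kind mentioned above are commonly computed as a weighted sum of risk-allele counts, with each weight taken as the log odds ratio estimated in a discovery GWAS. The sketch below illustrates that general idea only. The SNP identifiers, odds ratios and genotypes are made up for illustration, and `polygenic_risk_score` is a hypothetical helper rather than the method of any cited study.

```python
import math


def polygenic_risk_score(genotypes, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by ln(OR).

    SNPs without an odds ratio in `odds_ratios` are ignored.
    """
    score = 0.0
    for snp, allele_count in genotypes.items():
        if allele_count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {allele_count}")
        if snp in odds_ratios:
            score += allele_count * math.log(odds_ratios[snp])
    return score


# Hypothetical discovery-sample effect sizes (all ORs < 1.2, as in the text).
odds_ratios = {"snp_a": 1.10, "snp_b": 1.05, "snp_c": 1.15}

# Hypothetical risk-allele counts for two individuals.
person_1 = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
person_2 = {"snp_a": 0, "snp_b": 0, "snp_c": 1}

for name, genotypes in (("person_1", person_1), ("person_2", person_2)):
    print(name, round(polygenic_risk_score(genotypes, odds_ratios), 4))
```

In practice such scores are built from many thousands of SNPs and are compared between groups rather than interpreted for a single person; the three-SNP example is only meant to show the arithmetic.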

Conclusion
It is now established that the genetic architecture of schizophrenia involves rare, common and de novo risk alleles distributed across a large number of genes. Despite substantial genetic heterogeneity, different classes of mutation have been shown to converge onto common biological pathways, implicating neuronal calcium signalling, components of the postsynaptic density, synaptic plasticity, epigenetic regulation and the immune system in the disorder. It has also become clear that schizophrenia shares risk alleles with other neuropsychiatric disorders, with evidence of a gradient of mutational severity with intellectual disability and schizophrenia at the most extreme and moderate ends of this spectrum, respectively [55]. It is inevitable that further increases in sample size in both GWAS and sequencing studies will identify additional risk alleles and whole-genome sequencing will allow for more complex types of genetic variation to be examined, while permitting the investigation of rare alleles in regulatory elements.

Figure 1 (Current Opinion in Behavioral Sciences). Schizophrenia risk alleles. Panels: rare alleles, common alleles. Bullet points summarise some of the key findings covered in this review. MAF = minor allele frequency, OR = odds ratio, CNV = copy number variation, SNV = single nucleotide variant, indel = insertion/deletion, ARC = activity-regulated cytoskeleton-associated protein, NMDAR = N-methyl-D-aspartate receptor, FMRP = fragile X mental retardation protein, BD = bipolar disorder, MDD = major depressive disorder, ASD = autism spectrum disorder.