Selection in the dopamine receptor 2 gene: New candidate SNPs for disease-related studies

Contents: Abstract (English version); Abstract (German version); Introduction; Material and Methods; Results; Discussion; Acknowledgements; References; Supplementary Material; Curriculum vitae


Abstract (English version):
Dopamine is a major neurotransmitter in the human brain and is associated with various diseases.
Schizophrenia, for example, is treated by blocking dopamine type 2 receptors. In 2009, Shaner, Miller and Mintz proposed that schizophrenia is the low-fitness variant of a highly variable mental trait. We therefore explored whether the dopamine receptor 2 gene (DRD2) has undergone selection. We acquired genotype data from the 1000 Genomes Project (phase I), which contains 1093 individuals from 14 populations. We included only single nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) above 0.05, which left 151 SNPs for DRD2. We used two different approaches (an outlier approach and a Bayesian approach) to detect loci under selection. Combining the results of both approaches yielded nine candidate SNPs under balancing selection. While directional selection strongly favours one allele over all others, balancing selection favours more than one allele. All candidates lie in intronic regions of the gene, and only one (rs12574471) has been mentioned in the literature. Two of our candidate SNPs are located in specific regions of the gene: rs80215768 lies within a promoter flanking region and rs74751335 lies within a transcription factor binding site.
We strongly encourage research on our candidate SNPs and their possible phenotypic effects.

Introduction:
The catecholamine dopamine is a neurotransmitter in the human brain. Dopaminergic neurons can be divided into four major pathways: nigrostriatal, mesolimbic, mesocortical and tuberoinfundibular (Andén et al., 1964; Dahlstroem and Fuxe, 1964). These neurons play an important role in voluntary movement, feeding, reward and learning, as well as certain other functions.
Outside of the brain, dopamine plays a physiological role in cardiovascular function, hormonal regulation, renal function and other processes (Snyder et al., 1970; Missale et al., 1998; Sibley, 1999; Carlsson, 2001; Iversen and Iversen, 2007). Because of this involvement in many different processes and systems, dopamine is also related to a variety of diseases. Parkinson's disease, which is caused by a loss of dopaminergic innervation in the striatum, is a prominent example (Ehringer and Hornykiewicz, 1960). Additionally, an association between the dopaminergic system and schizophrenia is expected because various dopamine receptor 2 blockers are used as antipsychotics in treating that condition (Snyder et al., 1970; Creese et al., 1976; Seeman et al., 1976; Carlsson et al., 2001). Further relationships with dopamine dysregulation are expected in Tourette's syndrome and attention deficit hyperactivity disorder (ADHD) (Mink, 2006; Swanson et al., 2007; Gizer et al., 2009). The strong involvement of dopamine in the reward system suggests an association with drug abuse and addiction (Hyman et al., 2006; Di Chiara and Bassareo, 2007; Koob and Volkow, 2010). Many more diseases and conditions are expected to involve dopamine dysfunction (as reviewed by Beaulieu & Gainetdinov, 2011).
In humans, five different dopamine receptors exist. They are classified into two categories based on their structure and their pharmacological and biochemical properties. The D1-class includes the dopamine receptors 1 and 5, while the D2-class consists of the dopamine receptors 2, 3 and 4 (Andersen et al., 1990; Niznik and Van Tol, 1992; Sibley and Monsma, 1992; Sokoloff et al., 1992a; Civelli et al., 1993; Vallone et al., 2000). The focus of our study is on the dopamine receptor 2 and its gene DRD2. The dopamine receptor 2 gene lies on the long arm of chromosome 11 (11q23.1).
It spans positions 113,280,317 to 113,346,413, a total of 66,096 base pairs (bp) (information accessed on NCBI in the GRCh37 assembly). For the gene card, see Figure 1 in Results. DRD2 has six introns (Gingrich and Caron, 1993). Alternative splicing of an 87 bp exon between introns 4 and 5 generates two variants of the dopamine receptor 2. The difference between D2S (short) and D2L (long) is a 29-amino-acid stretch in the third intracellular loop of the protein (Giros et al., 1989; Monsma et al., 1989). While the short form (D2S) is mainly expressed presynaptically, the long form (D2L) is expressed postsynaptically (Usiello et al., 2000; De Mei et al., 2009). D2S receptors act mainly as autoreceptors, i.e. they reduce dopamine release when activated. This provides an important negative feedback mechanism (Wolf and Roth, 1990; Missale et al., 1998; Sibley, 1999) (again, as reviewed by Beaulieu & Gainetdinov, 2011).
Among the many single nucleotide polymorphisms (SNPs) of DRD2, one prominent example is rs6277, also known as C957T. It has been associated with schizophrenia in Han Chinese in Taiwan (Glatt et al., 2009), in Russians (Monakhov et al., 2008) and in Bulgarians (Betcheva et al., 2009). Together with the -141C allele, the 957T allele is associated with the diagnosis of anorexia nervosa (Bergen et al., 2005). A meta-analysis showed that the Ser311Cys polymorphism (rs1801028) in DRD2 is a risk factor for schizophrenia. Both the heterozygotes (Ser/Cys) and the Cys homozygotes were at elevated risk for schizophrenia compared with the Ser/Ser genotype (Glatt and Jönsson, 2006). In a study with alcoholic patients and controls, the A allele of rs1076560 was more frequent in alcoholic patients (Sasabe et al., 2007). In 2012, Mileva-Seitz et al. conducted a study with Caucasian mothers and their infants. They taped mother-infant behaviour and genotyped various SNPs of DRD2 and DRD1.
Both rs1799732 and the previously mentioned rs6277 were associated with the mother's direct vocalization towards the infant.
The body of literature on SNPs and their possible effects is growing rapidly. Considering the influence those SNPs could have on human behaviour, and bearing in mind the different ecological habitats of Homo sapiens, we aimed to explore whether DRD2 has undergone selection.
In 2009, Shaner, Miller and Mintz proposed that schizophrenia is the low-fitness variant of a highly variable mental trait. Because of the connection between dopamine receptor 2 and schizophrenia, as stated above, we focused our analysis on DRD2.
To reduce false positives, we used two selection detection algorithms to explore DRD2. This is an exploratory ("hypothesis-free") approach in which we aim to find candidate SNPs that have been under selection. Our analysis is based on the 1000 Genomes Project samples.

Material and Methods:
We acquired data from the 1000 Genomes Project (phase I) through SPSmart engine v5.1.1 (http://spsmart.cesga.es/engines.php; Amigo et al., 2008), using the search term "DRD2". We included all single nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) greater than 0.05. Over the span of the whole DRD2 gene (113,280,317-113,346,413 in the GRCh37.p13 primary assembly; gene card shown in Results, Figure 1), this amounts to 151 SNPs. In total, we included the following populations in our analysis. The data were converted by hand into the CONVERT format. All further format conversions were performed with PGD Spider 2.0.5.2 (Lischer and Excoffier, 2012).

[Table: populations included in the analysis, grouped by superpopulation]
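The MAF filter described above can be illustrated with a minimal sketch. It assumes biallelic SNPs given as pooled allele counts; the function names and the example counts are hypothetical and are not taken from the SPSmart output used in this study.

```python
def minor_allele_frequency(ref_count, alt_count):
    """Return the frequency of the rarer allele at a biallelic SNP."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed")
    alt_freq = alt_count / total
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(allele_counts, threshold=0.05):
    """Keep SNP identifiers whose MAF is strictly greater than the threshold."""
    return [
        snp
        for snp, (ref, alt) in allele_counts.items()
        if minor_allele_frequency(ref, alt) > threshold
    ]


# Hypothetical (reference, alternative) allele counts; not real data.
# 1093 diploid individuals correspond to 2186 alleles per SNP.
example_counts = {
    "snp_a": (1500, 686),  # alternative allele is the minor one, MAF about 0.31
    "snp_b": (2150, 36),   # MAF about 0.016, below the threshold
    "snp_c": (100, 2086),  # reference allele is the minor one, MAF about 0.046
}
kept = filter_by_maf(example_counts)
```

Taking the minimum of the two allele frequencies keeps the filter independent of which allele is labelled as the reference.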
Two different programs were used to detect selection. Both use FST approaches to detect outliers.
The program LOSITAN (Antao et al., 2008) runs the FDIST method, which uses FST and the expected heterozygosity. It assumes an island model of migration with neutral markers. An expected distribution of Wright's inbreeding coefficient is calculated, and outliers are then identified. The program computed a neutral mean FST before the 50,000 simulations were performed. The infinite alleles model was used. To avoid false-positive detections, we set the significance level to p < 0.01 (P(Simulation FST < sample FST)).
BayeScan (Foll and Gaggiotti, 2008) is a Bayesian statistics program. In essence, it evaluates two alternative models for every locus, one that includes selection and one that does not. We used a threshold of a posterior probability above 0.99 and a log10 posterior odds (log10(PO)) of 2 or higher. This threshold is labelled "Decisive" by BayeScan (see the program manual at http://cmpg.unibe.ch/software/BayeScan/files/BayeScan2.1_manual.pdf).
To compute linkage disequilibrium (LD) between the SNPs, we used the R package "genetics" (http://cran.r-project.org/web/packages/genetics/genetics.pdf; Warnes et al., 2013), with D' as the measure of LD. In most populations, one or more SNPs had to be excluded for the computation to run successfully. The population IBS was excluded entirely from this computation: it is a very small population (n = 14), and 30 SNPs caused the computation to fail. For a detailed list of all excluded SNPs, see Table S1 in the supplementary material.
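The two parts of the BayeScan threshold mentioned above are closely related, because posterior odds are the ratio of the posterior probabilities of the two models. The following sketch shows the conversion; it is not BayeScan code, and the function names are hypothetical.

```python
import math


def log10_posterior_odds(posterior_prob):
    """Convert a posterior probability P into log10(P / (1 - P))."""
    if not 0.0 < posterior_prob < 1.0:
        raise ValueError("posterior probability must lie strictly between 0 and 1")
    return math.log10(posterior_prob / (1.0 - posterior_prob))


def posterior_from_log10_po(log10_po):
    """Convert log10 posterior odds back into a posterior probability."""
    odds = 10.0 ** log10_po
    return odds / (1.0 + odds)


def is_decisive(posterior_prob, min_log10_po=2.0):
    """Apply the log10(PO) >= 2 criterion to a posterior probability."""
    return log10_posterior_odds(posterior_prob) >= min_log10_po
```

Under this relation, log10(PO) = 2 corresponds to a posterior probability of 100/101, or about 0.9901, so the odds criterion is marginally stricter than P > 0.99 alone.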

Results:
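The combined results of LOSITAN and BayeScan yielded nine candidate SNPs under balancing selection. For a detailed view of the results of LOSITAN and BayeScan for all SNPs, see Table S2 in the supplementary material. Figure 1 provides a gene view of DRD2 with labels for the candidate SNPs. Two candidates lie in specific regions of the gene: rs80215768 (4) lies within a promoter flanking region and rs74751335 (7) lies within a transcription factor binding site.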
However, we found no known associations for those two SNPs.
The FST values of these nine loci indicate low overall genetic differentiation, as well as low differentiation between populations (Table 2). This is in accordance with balancing selection acting on the gene. The differences in FST values between the two programs stem from the different algorithms they use. For the exact values, see Table S2 in the supplementary material.
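To illustrate why a low FST indicates low differentiation between populations, the following sketch computes Wright's FST for one biallelic locus as (HT - HS) / HT from population allele frequencies. It assumes equally weighted populations and is a textbook simplification, not the estimator used by LOSITAN or BayeScan; the function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(freqs):
    """Wright's FST = (HT - HS) / HT with equally weighted populations."""
    if not freqs:
        raise ValueError("at least one population is required")
    mean_p = sum(freqs) / len(freqs)
    h_total = expected_heterozygosity(mean_p)
    if h_total == 0.0:
        # The locus is fixed for the same allele everywhere: no differentiation.
        return 0.0
    h_within = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in three populations; not real data.
similar = fst_biallelic([0.48, 0.50, 0.52])    # frequencies nearly equal
divergent = fst_biallelic([0.10, 0.50, 0.90])  # frequencies far apart
```

When allele frequencies are nearly equal across populations, as expected when balancing selection maintains more than one allele everywhere, HS approaches HT and FST approaches zero.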
Linkage disequilibrium was measured with D'. The heat maps for the 13 analysed populations (all except IBS) are shown in the supplementary material (Figures S1-S13). The relative positions of the marked SNPs change because different populations had different SNPs excluded (see Table S1 for the list).
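As a reminder of what D' measures, the following sketch computes it for two biallelic loci from known haplotype and allele frequencies. This is a simplification: the R package works on unphased genotypes and has to estimate haplotype frequencies first. The function name and the input values are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute normalised linkage disequilibrium |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0.0:
        # At least one locus is monomorphic, so LD is undefined; report 0.
        return 0.0
    return abs(d) / d_max


complete_ld = d_prime(0.5, 0.5, 0.5)   # alleles A and B always occur together
no_ld = d_prime(0.25, 0.5, 0.5)        # haplotype frequency equals p_a * p_b
```

Dividing D by its maximum possible value given the allele frequencies makes D' comparable between SNP pairs with different allele frequencies.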

Discussion:
We found nine SNPs to be candidates for balancing selection and none for directional selection.
Checking those SNPs with Genome Browser reveals that they are all intronic region variants.
rs80215768 (4) lies within a promoter flanking region and rs74751335 (7) lies within a transcription factor binding site (TFBS). There have been many studies on the possible effects of mutations in such regions (Hayashi, Watanabe and Kawajiri, 1991; In et al., 1997) [...]
The question is whether these conditions could potentially affect fitness. Bassett et al. (1996) showed that reproductive fitness is reduced in groups of familial schizophrenia, which suggests a selection process. Puzzlingly enough, they also found some evidence for an increased fitness of a [...]
Lastly, it is also noteworthy that we found candidates only for balancing selection. We extended our analysis to the other four dopamine receptor genes and again found candidates only for balancing selection and none for directional selection (data not shown). Note that this applies only to SNPs. There is evidence for directional selection in DRD4 from the analysis of a tandem repeat in the coding region (Ding et al., 2001). Recently, DRD1 was analysed as one of ten neurochemical genes that are responsible for the Social-Decision Making System (O'Connell & Hofmann, 2012). That study found that the system underwent very little change during evolution. The authors wanted to include DRD2, but too few data for different species were available. The fact that we find only balancing selection and no directional selection acting upon DRD2 hints that it is a conserved region within the human genome.
To untangle the possible effects of our SNPs, we propose a study in which our candidate SNPs are investigated in schizophrenic and non-schizophrenic persons. A simple comparison of the SNPs and the different haplotypes between the two groups should efficiently test our findings.
If this proposed study finds differences between the two groups, the mechanisms of those SNPs and their possible haplotypes must be investigated.
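One simple form such a comparison could take is an allelic chi-square test on a 2 x 2 table of allele counts per SNP. The sketch below is only an illustration under that assumption; the function name and all counts are hypothetical, and a real study would additionally need to correct for multiple testing and population structure.

```python
def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square statistic (1 degree of freedom) for a 2 x 2 allele table.

    case_counts and control_counts are (allele_1, allele_2) counts.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_allele_1, col_allele_2 = a + c, b + d
    if 0 in (row_cases, row_controls, col_allele_1, col_allele_2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (
        row_cases * row_controls * col_allele_1 * col_allele_2
    )


# Hypothetical allele counts for one SNP; not real data.
statistic = allelic_chi_square((30, 70), (50, 50))
# With 1 degree of freedom, values above 3.841 correspond to p < 0.05.
differs_at_5_percent = statistic > 3.841
```

The same table layout extends to haplotype counts when each haplotype is compared against all others.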