Genome wide association study of incomplete hippocampal inversion in adolescents

Incomplete hippocampal inversion (IHI), also called hippocampal malrotation, is an atypical presentation of the hippocampus present in about 20% of healthy individuals. Here we conducted the first genome-wide association study (GWAS) in IHI to elucidate the genetic underpinnings that may contribute to the incomplete inversion during brain development. A total of 1381 subjects contributed to the discovery cohort obtained from the IMAGEN database. The incidence rate of IHI was 26.1%. Loci with P < 1e-5 were followed up in a validation cohort comprising 161 subjects from the PING study. Summary statistics from the discovery cohort were used to compute IHI heritability as well as genetic correlations with other traits. A locus on 18q11.2 (rs9952569; OR = 1.999; Z = 5.502; P = 3.755e-8) showed a significant association with the presence of IHI. A functional annotation of the locus implicated genes AQP4 and KCTD1. However, neither this locus nor the other 16 suggestive loci reached a significant p-value in the validation cohort. The h2 estimate was 0.54 (sd: 0.30) and was significant (Z = 1.8; P = 0.036). The top three genetic correlations of IHI were with traits representing either intelligence or education attainment and reached nominal P ≤ 0.013.


Introduction
The human hippocampi are small structures, one in each temporal lobe, that belong to the brain's limbic system and are known to be mainly involved in memory processes such as long-term memorisation and spatial navigation [1]. The limbic system and the hippocampus influence the activity of the hypothalamic-pituitary-adrenocortical (HPA) axis, a major neuroendocrine mediator of stress, and thus play a role in emotional stress responses [2]. The hippocampus is accordingly implicated, with evidence of morphological changes, in a variety of neurological pathologies and psychiatric disorders, such as Alzheimer's disease, where hippocampal atrophy increases with the pathology [3]; major depressive disorder, where hippocampal volume can predict the response to antidepressants [4,5], is related to suicide attempts [6], and is linked to cortisol disruption (highlighting the implication of the hippocampus in the HPA axis) [7]; schizophrenia, where patients have smaller hippocampi [8]; or temporal lobe epilepsy, the most frequent form of chronic focal epilepsy in adults, linked to hippocampal sclerosis [9]. Furthermore, during brain development, the growth of the left and the right hippocampi shows distinct responses to postnatal maternal stress [10]. Anatomically, there is a variation to the typical presentation of the hippocampi in normal subjects: the incomplete hippocampal inversion (IHI), also referred to as hippocampal malrotation (Fig 1). This anatomical variant was initially observed in healthy subjects by [11] and then mostly observed in patients with epilepsy [12,13]. IHIs are mainly left-sided and characterized by a rounded or vertical shape, a medial positioning and a deep collateral sulcus [13][14][15], and are present in around 20% of the normal population [15].
It has been reported that IHI impacts the hippocampal volume: subjects with incomplete inversions appear to have smaller hippocampi [16], and more specifically, the hippocampal subfield CA1 seems to be related to the IHI severity [17]. Also it has been suggested that IHI might interfere with the quality of hippocampal segmentation for volumetric analysis [16,18], which may be clinically relevant, since the hippocampal volume can predict the response to antidepressant in patients without IHI [4,5]. Additionally, a sulcal morphometry analysis suggested that morphological changes associated with IHI are not confined to the hippocampus [15]; significant differences in cortical sulci located along the limbic system are shown between participants with and without complete inversion. Several studies suggest that IHI have their origin in developmental processes [19,20]. For example, [21] observed that during the rotational growth of the hemispheres, the major portion of the hippocampus is carried  dorso-laterally and then ventrally to lie in the medial part of the temporal lobe. As the neocortex expands and evolves, the allocortex (the 3 layers cortex) is displaced inferiorly, medially and internally into the temporal horn. This rotational growth of the cortex implies an inversion of the hippocampus during normal development, which in some cases may remain incomplete. Following this hypothesis, [22] conducted a study using foetal MRI and found a correlation between the degree of in-folding and the number of gestational weeks. In a recent study [15] described detailed criteria to evaluate IHI, ultimately making the IHI evaluation more reproducible. In the same study, the introduced criteria had been applied to assess the IHI status of 2000 adolescents without neurological disorders. Results showed a prevalence of about 20% of IHI among this normal population. The majority of the IHI cases were left-sided (17% on left side). 
The lateral preference of left-sided over right-sided IHI may be rooted in the observation of asymmetrical hippocampus development in neonates with the right hippocampus developing faster than the left one [23]. In addition to these developmental observations, IHI has been reported to be associated with genetic changes. For instance, IHI was observed at higher prevalence in subjects with chromosome 22q11.2 microdeletion [24], which leads to DiGeorge syndrome. Given that recent evidence implicates developmental processes in the aetiology of IHI and the observation that the structure and shape of subcortical structures, including the hippocampus, are under genetic control [25], we aimed at elucidating specific genetic variants contributing to IHI. To this end we conducted the first genome-wide association study on the genetics of incomplete hippocampal inversion.

Subjects
Subjects were investigated from two cohorts: IMAGEN [26] and PING [27]. The IMAGEN cohort comprises >2000 subjects collected at eight sites across Europe [26], and local ethics committee approved the study (see at the end of the paper for details and study [26]). At the time of baseline data collection and study inclusion all participants were 14 years of age. The second cohort was obtained from the Pediatric Imaging Neurocognition and Genetics (PING) Study database (http://www.chd.ucsd.edu/research/ping-study.html). PING was launched in 2009 by the National Institute on Drug Abuse (NIDA) and the Eunice Kennedy Shriver National Institute Of Child Health & Human Development (NICHD) as a 2-year project of the American Recovery and Reinvestment Act. The primary goal of PING has been to create a data resource of highly standardized and carefully curated magnetic resonance imaging (MRI) data, comprehensive genotyping data, and developmental and neuropsychological assessments for a large cohort of developing children aged 3 to 20 years. The scientific aim of the project is, by openly sharing these data, to amplify the power and productivity of investigations of healthy and disordered development in children, and to increase understanding of the origins of variation in neurobehavioral phenotypes. Access to the dataset was granted through a Federal Wide Assurance (FWA). For up-to-date information, see http://www.chd.ucsd.edu/research/ping-study.html and [27]. All methods were performed in accordance with relevant guidelines and regulations.

Image data processing and IHI scoring
The procedure for scoring IHI, which has been previously described in detail and shown to have good intra- and inter-rater reproducibility [15], was applied to the subjects used in this study (from IMAGEN and PING). Inter- and intra-rater variability were assessed in a previous publication [15] on 42 participants from the discovery cohort using the kappa statistic. In all cases, intra- and inter-rater agreements were beyond substantial (κ ≥ 0.64). Very strong agreements (κ ≥ 0.8) were observed in the majority of comparisons (14/20). Rating on the validation cohort was conducted afterwards by a single rater (CC); thus the rater was not blinded to whether subjects were from the discovery or validation cohort. In brief, the IHI score is composed of four different criteria: (1) assessing the roundness of the hippocampal body; (2) evaluating the verticality of the collateral sulcus, which is located between the 4th and the 5th temporal lobe convolution (Fig 2); (3) the mediality of the hippocampal body; and (4) the depth of the fusiform gyrus, separating the 4th and the 3rd convolution of the temporal lobe (Fig 2). Each criterion is assessed from a coronal point of view after registering the subjects' T1 weighted MRI into the standard MNI space using FSL's affine transformation tool FLIRT [28,29]. Evaluation was carried out using an in-house Java interface (https://github.com/cclairec/viewerIHI_java). During scoring each criterion received a score between 0.0 and 2.0. The first three criteria have a step size of 0.5, the fourth criterion is binary (0 or 2), and the 5th criterion, assessed between 0 and 2, has a step size of 1.0. The sum of those criteria forms the overall IHI score ranging from 0.0 to 10.0. This is a semi-continuous score (with a step of 0.5), where an IHI score of 0.0 indicates the total absence of IHI, and a score of 8.0 represents a very pronounced presentation of IHI.
In their previous study, [15] established an optimal cut-off (at 3.75) of the overall IHI score to indicate the presence or absence of IHI. The cut-off was obtained by maximising the accuracy of the classification against a global criterion (rated blind to the individual criteria and IHI scores) that indicates whether or not a given hippocampus presents an IHI; an intermediate class for partial IHI was present but not used in the estimation of the optimal cut-off. Because the score moves in steps of 0.5, hippocampi without IHI correspond to IHI scores < 4.0 and hippocampi with IHI correspond to IHI scores ≥ 4.0. For this genetic study, the phenotype was IHI in either the left or the right hippocampus. To determine IHI, we applied the same cut-off of 4.0 for left and right hippocampi and used, for the IMAGEN cohort, the previously processed data from the IMAGEN study [15].
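The phenotype definition above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' scoring software: the function names `ihi_score` and `has_ihi` and the example criterion values are hypothetical, and the sketch simply assumes each hippocampus comes with a list of criterion scores between 0.0 and 2.0.

```python
from typing import Sequence

IHI_CUTOFF = 4.0  # cut-off on the overall IHI score stated in the text


def ihi_score(criteria: Sequence[float]) -> float:
    """Overall IHI score of one hippocampus: the sum of its criterion scores."""
    for value in criteria:
        if not 0.0 <= value <= 2.0:
            raise ValueError(f"criterion score out of range [0, 2]: {value}")
    return float(sum(criteria))


def has_ihi(left: Sequence[float], right: Sequence[float],
            cutoff: float = IHI_CUTOFF) -> bool:
    """Binary phenotype: IHI present if either hippocampus scores >= cutoff."""
    return ihi_score(left) >= cutoff or ihi_score(right) >= cutoff


# Hypothetical subject: pronounced left-sided presentation, typical right side.
left_criteria = [1.0, 1.5, 0.5, 2.0, 0.0]   # sums to 5.0
right_criteria = [0.0, 0.5, 0.0, 0.0, 0.0]  # sums to 0.5
print(has_ihi(left_criteria, right_criteria))
```

Taking the logical OR over hemispheres mirrors the statement that the phenotype was IHI in either the left or the right hippocampus.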

SNP genotyping and pre-processing
IMAGEN subjects were genotyped from blood samples on 610-Quad SNP and 660-Quad SNP arrays from Illumina. Genetic data was available for 1,841 subjects. In a first round of quality control (QC) we performed subject-level QC by removing subjects with mismatching self-reported sex and genotype-inferred sex (N = 10) or where more than 10% of SNPs were missing (N = 0). Next, we performed ancestry matching based on the HapMap3 data [30]. Population outliers were defined as subjects exhibiting more than five standard deviations distance from the CEU and TSI population in any of the first five principal components. Based on these criteria, 220 subjects were excluded from further analysis (S1 Fig). For the remaining subjects the genetic relationship matrix (GRM) was computed on common SNPs (minor allele frequency [MAF] > 5%) after LD pruning using GCTA [31]. Another 18 subjects were removed due to relatedness (i.e., PI_HAT > 0.05), leaving a total of 1593 subjects for the analysis. The raw genotyping data were prepared for imputation using a series of scripts (http://www.well.ox.ac.uk/~wrayner/tools/). Haplotype reference consortium (HRC) v1.1 [32] SNPs were imputed on the Sanger imputation server (https://imputation.sanger.ac.uk) using EAGLE2 [33] for pre-phasing and PBWT [34] for imputation. Data from the two different genotyping chips were imputed independently. Genotypes were hard called based on the maximal genotype posterior probability with a threshold of 0.9. That is, if none of the three genotypes reached a posterior probability of at least 0.9, then the SNP was set to missing in the corresponding subject. Finally, an additional round of QC was conducted on SNP level based on imputation quality (INFO score > 0.3), missingness (< 5%), minor allele frequency (MAF > 1%) and deviation from Hardy-Weinberg equilibrium (p < 1e-6), leaving 6,742,645 SNPs across the autosomes for the association analysis.
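The hard-calling rule described above can be illustrated with a short Python sketch. It is not the pipeline used in the study; the function name `hard_call` is hypothetical, and the sketch assumes that the three posterior probabilities are ordered by the number of copies of the counted allele (0, 1, 2).

```python
from typing import Optional, Sequence


def hard_call(posteriors: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the genotype (0, 1 or 2 allele copies) with the maximal posterior
    probability, or None (missing) if no genotype reaches the threshold."""
    if len(posteriors) != 3:
        raise ValueError("expected posterior probabilities for three genotypes")
    best = max(range(3), key=lambda g: posteriors[g])
    return best if posteriors[best] >= threshold else None


print(hard_call([0.02, 0.95, 0.03]))  # confident heterozygous call
print(hard_call([0.50, 0.45, 0.05]))  # no genotype reaches 0.9, set to missing
```

Calls set to missing by this rule then count towards the SNP-level missingness filter (< 5%) applied in the final QC round.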
PING subjects were genotyped from saliva samples on Human660W-Quad arrays from Illumina. After QC, genetic data for 1,391 participants was suitable for analysis. Individual SNPs of the PING dataset were accessed through the PING data portal (ping-dataportal.ucsd.edu). Ancestry and admixture proportions in the PING participants were based on the ADMIXTURE software [35] and downloaded through the data portal (for details see [27]). We restricted the validation cohort to participants of at least 12 years of age and of European ancestry (minimum 90% European ancestry as per ADMIXTURE; N = 197).

Genome wide association study
The genome wide association study was carried out with Plink v1.9 [36] assuming an additive genetic model and computing for every SNP a logistic regression while correcting for sex, age at imaging (in days) and five principal components for population structure. Phenotype or covariate information was missing for 212 participants. Thus, the discovery GWAS comprised 1,381 unrelated subjects. The genome-wide statistical significance threshold was set to the standard threshold of p<5e-8 and regional association plots were generated with LocusZoom [37]. SNPs exceeding the threshold for suggestive association with IHI (p<1e-5) were followed up in an independent cohort of adolescents (PING). In case the top SNP was not genotyped in PING, LDlink [38] (https://analysistools.nci.nih.gov/LDlink/) was used to identify a proxy in LD (r2) within +/-50kb of the top SNP's location. Association with single SNPs was tested in R using the glm function; the logistic model was corrected for age and sex.
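To make the per-SNP test concrete, the following Python sketch fits a logistic regression of case status on an additively coded genotype plus covariates and derives the odds ratio, Wald Z and two-sided P for the genotype term. It is a didactic, standard-library-only sketch and not the Plink or R `glm` implementation used in the study; the function names (`solve`, `logistic_fit`, `snp_association`) and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def _score_and_information(X, y, beta):
    """Gradient of the log-likelihood and Fisher information at beta."""
    p = len(beta)
    score = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for j in range(p):
            score[j] += (yi - mu) * xi[j]
            for k in range(p):
                info[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
    return score, info


def logistic_fit(rows: Sequence[Sequence[float]], y: Sequence[int],
                 max_iter: int = 50, tol: float = 1e-10) -> Tuple[List[float], List[float]]:
    """Newton-Raphson fit with an intercept; returns coefficients and standard errors."""
    X = [[1.0] + [float(v) for v in r] for r in rows]
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        score, info = _score_and_information(X, y, beta)
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    _, info = _score_and_information(X, y, beta)
    p = len(beta)
    # Standard errors are the square roots of the diagonal of the inverse information.
    se = [math.sqrt(solve(info, [1.0 if k == j else 0.0 for k in range(p)])[j])
          for j in range(p)]
    return beta, se


def snp_association(genotypes, covariates, y):
    """Odds ratio, Wald Z and two-sided P for the additive genotype term."""
    rows = [[g] + list(c) for g, c in zip(genotypes, covariates)]
    beta, se = logistic_fit(rows, y)
    z = beta[1] / se[1]
    return math.exp(beta[1]), z, math.erfc(abs(z) / math.sqrt(2.0))


# Toy data without covariates: 10/40 cases among non-carriers, 20/40 among carriers.
g = [0] * 40 + [1] * 40
y = [0] * 30 + [1] * 10 + [0] * 20 + [1] * 20
print(snp_association(g, [[] for _ in g], y))
```

In a real analysis each covariate row would hold sex, age at imaging and the principal components; the pure-Python solver is only meant to keep the sketch dependency-free and would be far too slow for millions of SNPs.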

Functional annotation of GWA summary statistics
The GWA summary statistics were annotated using the web-based version of the FUnctional Mapping and Annotation (FUMA) tool [39] (http://fuma.ctglab.nl/). In order to elucidate the functional consequences of genetic risk loci, FUMA approaches the mapping in two separate steps: first, lead SNPs are identified and mapped to relevant genes on the basis of strand proximity, expression quantitative trait loci (eQTL) and chromatin interaction; second, the reprioritized genes returned by the first step are annotated with respect to expression levels and overrepresentation in differentially expressed gene sets among a wide range of human tissues.
For the purposes of this study, SNP-to-gene mapping was performed according to the following parameters: SNPs with p < 5e-8 were identified as lead SNPs, and genomic risk loci were constructed by including SNPs in linkage disequilibrium with independent lead SNPs (LD r2 > 0.6 in the 1000 Genomes Phase 3 EUR panel) and with a minimum MAF of 1%. Positional mapping was performed by linking lead SNPs to genes in a 50kb window. Mapping based on eQTL was performed by using only SNP-gene pairs significant at FDR < 0.05 in all tissues/cell types from 4 data repositories (GTEx [40], the Westra blood eQTL dataset [41], the BIOS QTL browser [42] and BRAINEAC [43]); the available data only covers cis-eQTLs with up to 1 Mb distance between SNP and gene. Chromatin interaction mapping was also performed to take into account potential long-range interactions between risk loci and genes due to chromatin folding. We based mapping on interactions significant at FDR < 1e-6 in 14 tissue types and seven cell lines from [44]. We also based mapping on tissue/cell type specific enhancer or promoter regions annotated in 111 epigenomes from the Roadmap Epigenomics Project [45]. The Major Histocompatibility Complex (MHC) was excluded from annotations, and mapping to all functional gene classes (protein-coding, non-coding RNA, long intergenic ncRNA, processed transcripts, pseudogenes) was enabled.
After mapping lead SNPs to relevant genes, we performed annotation of the prioritized genes in biological context, mainly with respect to tissue-specific expression levels. Average expression levels (log2(RPKM+1), where RPKM denotes Reads Per Kilobase per Million) of protein-coding genes in 53 tissues from GTEx v6 were visualized through heat maps, allowing for comparison of expression level across genes and tissue types. Candidate genes were tested for overrepresentation in sets of differentially expressed genes (DEG), as well as sets of genes up- and down-regulated, across 53 specific tissue types from GTEx v6 using hypergeometric tests. The same gene set enrichment analysis strategy was applied to test for overrepresentation of biological functions among the prioritized genes, using gene sets from the Molecular Signatures Database version 5.2 [46], WikiPathways [47] and the GWAS catalog [48], and applying the Benjamini-Hochberg multiple testing correction procedure.
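The hypergeometric overrepresentation test mentioned above can be sketched as follows. This is a generic illustration of the test, not FUMA's code; the function name `hypergeom_enrichment_p` and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_in_set: int,
                           n_candidates: int, n_overlap: int) -> float:
    """Upper-tail hypergeometric P: probability of observing at least n_overlap
    gene-set members among n_candidates genes drawn without replacement from a
    background of n_background genes, of which n_in_set belong to the gene set."""
    upper = min(n_in_set, n_candidates)
    total = comb(n_background, n_candidates)
    tail = sum(comb(n_in_set, k) * comb(n_background - n_in_set, n_candidates - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Toy example: 10 background genes, 4 in the gene set, 3 candidates, 2 overlapping.
print(hypergeom_enrichment_p(10, 4, 3, 2))
```

With only a handful of candidate genes, a single additional overlapping gene changes the tail probability substantially, so results for such small sets are sensitive to the exact inputs.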

Heritability analysis and genetic correlation
We used LD score regression [49] in order to estimate IHI heritability from the GWAS summary statistics data. Next, we computed partitioned heritability estimates using the LD score method described in [50] and [51]. Heritability estimates were partitioned into 53 overlapping functional categories, derived from 24 main annotations, from [50]. Stratified LD score regression was also used to test for heritability enrichment in genes specifically expressed in a number of tissues or cell types. For this analysis, we used the specifically expressed gene lists compiled by [51] for the following datasets: expression levels from RNA-seq experiments in the 53 GTEx tissues and cell types, as well as only the 13 GTEx brain regions; the Cahoy dataset, comprising microarray expression data from three cell types (astrocyte, neuron, oligodendrocyte) in the mouse brain [52]; the Franke dataset, comprising microarray expression data in 152 tissues and cell types from human, mouse and rat [53]; and the Immunological Genome Project Consortium dataset, comprising microarray expression data for 292 immune cell types in the mouse [54]. Enrichment p-values were corrected for multiple comparisons using the Benjamini-Hochberg procedure.
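For readers unfamiliar with the Benjamini-Hochberg procedure used for the enrichment p-values, a minimal Python sketch of the adjustment is given here. The function name `benjamini_hochberg` and the example p-values are hypothetical; this is the textbook step-up procedure, not code taken from the LD score regression software.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value and enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A category is declared enriched at a given FDR level when its adjusted p-value falls below that level.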
Given the reported higher prevalence of IHI in patients with epilepsy, we used LD score regression to compute the genetic correlation [55] between IHI and epilepsy susceptibility based on a recent GWAS [56]. Finally, we conducted an exploratory analysis of genetic correlation between IHI and traits from 832 GWASs using the LD hub [57] (http://ldsc.broadinstitute.org/).

Gene expression in the developing human brain
In order to explore the transcription pattern of two candidate genes, we downloaded their expression values from BrainSpan through the web interface (http://www.brainspan.org/rnaseq/search/index.html). The data comprises post-mortem gene expression data of 42 subjects at ages spanning from prenatal development (eight post-conception weeks) to adulthood (40 years). Brains were sampled across 26 brain structures. Gene expression was measured using RNA-sequencing and expression levels for each gene were provided as reads per kilobase of exon model per million mapped reads (RPKM). We analysed this data using a linear mixed effects model implemented in the lme4 package in R. In these analyses, the gene expression level was the target variable, subject ID and structure ID were random effects, while an indicator variable for age less than 25 weeks post conception was the fixed effect. We tested for a significant effect of age < 25 post-conception weeks (pcw) on gene expression. This threshold was selected based on the estimated occurrence of hippocampal inversion between pcw 20 and 30 [20].

Subjects: Cohort statistics
In IMAGEN 1381 subjects had genotyping and all phenotype and confounding information available. Incidence rate of IHI was 26.1%. In PING, for the 197 European subjects aged 12 years or older, we could successfully access and score 161 T1 weighted MR images for analysis, and IHI incidence rate was 23.6%; both at the 4.0 cutoff. There was a higher incidence rate of IHI in the left hemisphere in both cohorts. Summary statistics for both cohorts can be found in Table 1.

Genome wide association study and functional annotation
We tested each of the 6.7 million SNPs for an association with the presence of IHI. In the discovery dataset comprising subjects from the IMAGEN study, 17 loci passed the threshold for suggestive association (Fig 3; Table 2). One locus on chromosome 18 reached genome-wide significance (top SNP: rs9952569; OR = 1.999; Z = 5.502; P = 3.755e-8; Fig 4). There was no inflation [...] Fig, left). In fact, the top associated SNP is located in an intron of KCTD1. Brain gene expression based on GTEx shows high and brain-specific expression for AQP4 and moderate to high expression levels for KCTD1 (S4 Fig, right). The gene set enrichment analysis showed that the four protein-coding genes prioritized by FUMA (AQP4, AQP4-AS, KCTD1 and U3) were statistically significantly enriched (p < 0.000314; Bonferroni corrected for 3 × 53 tests) in overexpressed genes in hippocampal and caudate tissue from GTEx v6 (S5 Fig). We tested the top SNP within each of the 17 suggestive loci in the PING cohort for an association with IHI. In PING, genotyped SNPs or SNPs in near-perfect LD (r2 > 0.9) were available for seven loci, intermediate proxies (0.25 < r2 < 0.9) for five loci and only weak proxies (r2 < 0.25) for the remaining five loci. None of the selected candidate SNPs showed a nominally significant association (uncorrected p < 0.05) with IHI in this cohort. The top SNP from the discovery cohort, rs9952569, was not significant in the validation sample and showed an effect in the opposite direction (OR = 0.597; P = 0.29; Table 2).
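The reported statistics for the top SNP and the enrichment threshold can be checked with a few lines of arithmetic. The sketch below assumes that the reported Z is a Wald statistic (log odds ratio divided by its standard error) with a two-sided normal P-value; this is an assumption about how the numbers relate, not a statement taken from the text, and the function names are hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def se_from_or_and_z(odds_ratio: float, z: float) -> float:
    """Standard error of the log odds ratio implied by a Wald Z statistic."""
    return math.log(odds_ratio) / z


# Values reported for rs9952569 in the discovery cohort.
p_top = two_sided_p(5.502)             # close to the reported 3.755e-8
se_top = se_from_or_and_z(1.999, 5.502)

# Bonferroni threshold for 3 x 53 enrichment tests at alpha = 0.05.
bonferroni = 0.05 / (3 * 53)           # close to the reported 0.000314

print(f"P = {p_top:.3e}, SE(log OR) = {se_top:.4f}, threshold = {bonferroni:.6f}")
```

Under these assumptions the implied standard error of the log odds ratio is roughly 0.126.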

Heritability analysis and genetic correlation
Heritability of IHI was estimated from the GWAS summary statistics using LD score regression. The h2 estimate was 0.54 (0.30) and statistically significant using a one-sided test (Z = 1.8; P = 0.036). We next sought to identify genomic regions or cell type marker genes that show enriched heritability. However, none of the tested genomic regions or gene sets showed statistically significant enrichment after FDR correction (S6 Fig). Motivated by the reported increased prevalence of IHI in persons with epilepsy, we computed the genetic correlation (rg) between IHI and epilepsy susceptibility. The estimate of rg was -0.0854 (0.2612), which did not reach statistical significance (Z = -0.3269; P = 0.7437).
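The test statistics quoted above follow directly from the estimates and their standard errors. The following Python sketch reproduces them, assuming normally distributed Z statistics, a one-sided test for h2 (as stated) and a two-sided test for rg (an assumption consistent with the reported P-value); the function name `upper_tail` is hypothetical.

```python
import math


def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Heritability: estimate 0.54 with standard error 0.30, one-sided test.
z_h2 = 0.54 / 0.30
p_h2 = upper_tail(z_h2)

# Genetic correlation with epilepsy: estimate -0.0854 with standard error 0.2612.
z_rg = -0.0854 / 0.2612
p_rg = 2.0 * upper_tail(abs(z_rg))

print(f"h2: Z = {z_h2:.2f}, one-sided P = {p_h2:.3f}")
print(f"rg: Z = {z_rg:.4f}, two-sided P = {p_rg:.4f}")
```

A two-sided test of the same h2 estimate would give roughly twice the one-sided P-value and would therefore not pass the 0.05 level.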
LDhub was used to compute rg between IHI and 832 GWAS summary statistics; the computation was successful for 749 GWAS (S1 Table). None of the traits survived the FDR-corrected p-value threshold (P_FDR < 0.05). A total of 20 traits reached nominal significance (p < 0.05, Table 3). The top three positively correlated traits were: intelligence [58], College or University degree based on a UK BioBank (UKBB) GWAS and Years of Schooling [59]. Among the 20 nominally significant genetic correlations was also Fluid Intelligence Score (UKBB).

Gene expression in the developing human brain
We extracted the gene expression levels in the developing brain for AQP4 and KCTD1. The summary of the data up to post-conception week (pcw) 100 is depicted in S7 Fig. Expression of KCTD1 remains rather stable across the entire time frame, while the expression of AQP4 starts very low, increases with the progression of brain maturation and reaches its peak around the time of term birth (pcw 37-40). Gene expression was significantly different before and after pcw 25 for both KCTD1 (p = 3.115e-04) and AQP4 (p = 2.424e-07) when limited to data acquired before pcw 40.

Discussion
The incidence rate of IHI was consistently around 25% in both the discovery and the validation cohorts. This was comparable to previous reports of 18-19% [60,61], especially considering that the IHI score at the 4.0 cutoff includes not only strong IHI (as in the cited studies), but also lighter IHI [15], therefore increasing the IHI rate. In both cohorts there was a higher incidence rate of IHI in the left hippocampus, which agrees well with the observations that the right hippocampus matures faster and thereby inverts correctly. The GWAS highlighted one genome-wide significant locus on chromosome 18, which is linked through chromatin interaction maps (S4 Fig, left) to six genes, two of which show substantial expression in brain tissue: KCTD1 and AQP4. Of note, the locus showed consistently strong association with continuous scales of the IHI phenotype. Furthermore, a genome-wide screen of those continuous scales revealed two genome-wide significant loci (S3 Fig), one of which exceeded the suggestive threshold in the original GWAS (Table 2; rs35806781, OR = 4.889, Z = 4.798, P = 1.603e-06). The other SNP (rs186025034) had a low minor allele frequency (about 1%) and missed the suggestive threshold in the original GWAS (OR = 5.159, Z = 4.34, P = 1.423e-05). Overall, we observed consistency across IHI definitions and their genetic associations. The Potassium Channel Tetramerization Domain Containing 1 (KCTD1) gene negatively regulates the AP-2 family of transcription factors and the Wnt signalling pathway, which controls normal embryonic development, cellular proliferation and growth [62]. Interestingly, mutations in KCTD1 have been linked to Scalp-Ear-Nipple syndrome [63], which is a rare, autosomal-dominant disorder characterized by cutis aplasia of the scalp as well as minor anomalies of the external ears, digits, nails, and malformations of the breast. Clearly, KCTD1 has the ability to influence developmental processes.
Thus, it is conceivable that more benign variation in KCTD1 may play a role in the generation of IHI. Furthermore, KCTD1 is a potassium channel gene and various members of the potassium channel gene family have been linked as causes of epilepsy [64][65][66]. In a recent GWAS for epilepsy susceptibility SNPs in the KCTD1 gene reached p-values as low as 0.0003758 (S8 Fig) [56].
Aquaporin-4 (AQP4) is a bidirectional water channel that is found on astrocytes throughout the central nervous system (S4 Fig). However, while AQP4 expression in brain tissue is in general high in children and adults, its expression is quite low before post-conception week 20 (S7 Fig). MRI studies of IHI during brain development [20,23] show lack of hippocampal inversion during the early phases of development <25 post-conception weeks, which coincides with the time point of increased AQP4 expression. Furthermore, AQP4 has been linked through various lines of evidence to epilepsy, e.g., the lack of aquaporin-4 water channels increased seizure threshold and seizure duration in mice [67,68] and AQP4 expression among chronic temporal lobe epilepsy patients is increased almost twofold in the hippocampus of the affected hemisphere compared to the contralateral hemisphere [69]. Taken together, astroglial AQP4 may modulate neuronal excitability by regulating the extraneuronal and extrasynaptic environments and thereby affect the epileptogenesis. This may explain the observed increased rates of IHI in persons with epilepsy. Interestingly, the four protein-coding genes prioritized by FUMA were also enriched in genes overexpressed in hippocampal and caudate tissue (S5 Fig). However, enrichment results tend to be unstable when only a small gene set is tested for enrichment, thus, this result should be considered with caution regarding its interpretation.
We attempted to validate the genome-wide significant locus in a second independent cohort of adolescents. However, neither the genome-wide significant locus, nor any of the suggestive loci reached nominal significance in the validation cohort. One major contributor to this lack of replication was the limited sample size of the validation cohort. Despite the equally large set of participants in both studies, age and ethnicity restrictions severely limited the available sample size for the validation cohort (N = 161) and drastically lowered the statistical power to detect differences (power of 35% for just the top variant). Larger validation cohorts are needed to confirm the association of the identified locus with IHI: e.g., to validate the top variant with 80% power, at least N = 500 subjects are required. Although there are growing imaging and genetic datasets, e.g., the UKBB that aims at 100,000 participants with genetics and brain imaging data, few studies focus on healthy younger subjects (children, adolescents or young adults), which is beneficial for the validation in order to exclude confounding by disease processes or age-related atrophy. One such option is the Philadelphia Neurodevelopmental Cohort (PNC) [70]. However, it is important to keep in mind that the evaluation of IHI is not restricted to adolescents. IHI can be observed in children and adults too, without extra difficulties. The present study focuses on adolescents, since the discovery cohort is the dataset used for the reference study [15]. Still, in older patients, even though IHI still exist, their detection may be more difficult and less reliable due to the confounding effect of hippocampal atrophy due to ageing, hippocampal sclerosis or neurodegeneration. Also, scoring bigger databases (such as UKBB) will be feasible only after automatic methods for IHI scoring have been developed.
We estimated heritability of IHI based on the state-of-the-art LD score regression method that operates on the GWAS summary statistics. The inferred heritability was substantial with h2 = 0.54; the estimate was subject to high uncertainty as reflected by the high standard deviation of 0.3, which is likely a direct reflection of the low sample size of the discovery cohort. Analysis on twin data, such as the healthy young adult twins participating in the Queensland Twin IMaging (QTIM) study [71], can be used to confirm this preliminary heritability estimate, but would require a significant effort in manually scoring IHI in these large cohorts. In addition, the magnitude of h2 is comparable to recently published estimates on the heritability of hippocampal volume and shape from more than 3600 subjects [25]; h2 ranges from 0.08 to 0.337 depending on hemisphere and structural measure, with heritability of volume being generally the lowest. Given the impact of IHI on hippocampal shape and appearance together with the high prevalence of IHI in nearly 25% of healthy subjects, it is likely that the observed heritability of hippocampal shape reported by [25] was in part due to IHI.
We used the generated genome-wide summary statistics for two additional explorations. First, we sought to investigate if heritability was enriched in any particular region of the genome, characterized by its function or by marker genes for specific cell types. None of the investigated categories achieved statistical significance after FDR correction. Second, we computed genetic correlations with other traits. We hypothesized that there may be genetic link with epilepsy, however the resulting correlation was non-significant and negative, i.e., people with IHI were less likely to be affected by epilepsy, thereby contradicting earlier reports. The exploratory analysis with 832 additional traits highlighted a positive genetic correlation between IHI and intelligence and education attainment. There are various reports highlighting the contribution of the hippocampus and its subregions to various mental aspects that collectively are referred to as intelligence, e.g., spatial processing [72] and working memory [73]. Moreover, one recent study linked hippocampal shape to cognitive performance [74]. In particular, in males the radial distance of the hippocampus correlated with better test scores (e.g., general factor of intelligence, abstract-fluid intelligence, and the rotation of solid figures). In females, the effect was reversed. Therefore, a genetic correlation of IHI with intelligence in the broad sense is conceivable.
In conclusion, we presented the first genome-wide association study of IHI, where we identified a genome-wide significant locus. Additional exploration of the resulting summary statistics revealed a high heritability and suggested positive genetic correlation of IHI with traits linked to intelligence and education attainment.
All informed consents have been obtained from a parent and/or legal guardian. Written informed assent and consent were obtained, respectively, from all adolescents and their parents after complete description of the study. For the PING dataset, written parental informed consent was obtained for all PING subjects below the age of 18, and child assent was also obtained for all participants between the ages of 7 and 17. Written informed consent was obtained directly from all participants aged 18 years or older.
The y-axis depicts the -log10(p-value) of the association between SNP and presence of IHI assuming an additive model in the discovery cohort. The SNPs tested in the study are ordered along their chromosomal position on the x-axis. The red horizontal line denotes genome-wide significance at the Bonferroni threshold (P = 5e-8), while the blue horizontal line marks the threshold for suggestive association (P = 1e-5).
Upper plot: Result for the sum of the five criteria, taking the maximum over left and right hippocampus. Two loci exceed the genome-wide significance threshold: on chromosome 6 rs35806781 (beta = 1.478, Z = 5.766, P = 1.006e-08) and on chromosome 9 rs186025034 (beta = 1.867, Z = 6.202, P = 7.408e-10). Lower plot: Result for the global criterion presenting 3 classes (0 = non IHI, 1 = partial IHI, 2 = IHI), taking the maximum over left and right hippocampus. One locus exceeds genome-wide significance on chromosome 9: rs186025034 (beta = 0.8251, Z = 5.575, P = 2.98e-08).