Joint influence of small-effect genetic variants on human longevity

The results of genome-wide association studies of complex traits, such as life span or age at onset of chronic disease, suggest that such traits are typically affected by a large number of small-effect alleles. Individually, such alleles have little predictive value and have therefore usually been excluded from further analyses. The results of our study strongly suggest that alleles with small individual effects on longevity may jointly influence life span, so that the resulting influence can be both substantial and significant. We show that this joint influence can be described by a relatively simple “genetic dose - phenotypic response” relationship.

Genome-wide association studies (GWAS) were introduced to perform exhaustive analyses of genetic influence on complex traits. A number of recent publications emphasize that the approach has not entirely met expectations: although GWAS have provided important insights into the genetics of particular disorders [1], they have failed to detect a major portion of the genetic influence on traits of interest [1][2][3][4][5]. In most cases, the genetic variants found in GWAS cannot explain the heritability estimates calculated for such traits in the pre-genomic era. An important conclusion that emerged from many such studies is that complex traits are typically affected by a large number of common alleles, each of little predictive value, with small or statistically non-significant effects [1][2][3][4][5]. A recent suggestion to focus on the search for rare alleles with significant phenotypic effects in small population subgroups [6] requires new SNP data with minor allele frequencies (MAF) below 1%. (Traditional GWAS deal with MAF > 1%.) More results could be obtained by sequencing selected areas of the genome [7,8]. In this paper we show that an extended approach to GWAS makes it possible to address the issue of lost genetic influence on complex traits by analysing regularities in the joint action of many small-effect, low-significance alleles. Using the longevity trait as an example, we show that the results of our analyses bring important insights into the mechanisms of genetic regulation of this trait. In this approach we hypothesized that the value of the complex trait (life span) depends on the number of small-effect "longevity" alleles contained in individual genomes, and tested this hypothesis using genome-wide data on 550K SNPs from the original cohort of the Framingham Heart Study (FHS). The results show that the joint influence of small-effect alleles on life span is both significant and substantial and can be described as a "genetic dose - phenotypic response" relationship.
The existence of such a relationship brings a new perspective to GWAS of complex traits and can at least partly justify the sizable efforts and resources that have recently been invested in GWAS.
We evaluated associations between 550,000 SNPs and life spans in 1,173 genotyped participants of the Framingham Heart Study (FHS) original cohort. After performing a standard quality control procedure [9] (call rate ≥ 80%; MAF > 1%; HWE > 10⁻⁷), for each SNP we estimated the parameters of a linear regression model in which an individual's life span is a function of SNP genotype (a categorical variable), coded "0" for homozygotes for the major allele, "1" for heterozygotes, and "2" for homozygotes for the minor allele. The SAS procedure PROC REG (© SAS Institute, Inc.) was used for this purpose. SNPs for which the estimate of the slope parameter was positive with p ≤ 10⁻⁶ were selected as "longevity" SNPs. Note that this threshold is larger than the 10⁻⁷ used in traditional GWAS with correction for multiple comparisons in data samples of similar size. This procedure resulted in the selection of 169 "longevity" SNPs.
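The per-SNP selection step can be illustrated with a minimal Python sketch. It is not the SAS PROC REG analysis used in the study: the function names, the dictionary layout of the genotype data, and the toy values are assumptions made for illustration, and the p-value uses a normal approximation to the t statistic, which is reasonable only for samples as large as the one analysed here.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y = a + b*x.

    Returns (a, b, p), where p is a two-sided p-value for b based on a
    normal approximation (an assumption; PROC REG uses the t distribution).
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # monomorphic SNP: the slope is not identifiable
        return my, 0.0, 1.0
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:  # perfect fit
        return a, b, (0.0 if b != 0 else 1.0)
    z = b / se
    return a, b, math.erfc(abs(z) / math.sqrt(2))

def select_longevity_snps(genotypes, life_spans, p_threshold=1e-6):
    """Keep SNPs whose slope is positive and whose p-value is <= p_threshold.

    genotypes maps a SNP identifier to a list of 0/1/2 codes (copies of the
    minor allele), one per individual, in the same order as life_spans.
    """
    selected = []
    for snp, codes in genotypes.items():
        _, slope, p = simple_regression(codes, life_spans)
        if slope > 0 and p <= p_threshold:
            selected.append(snp)
    return selected

# Toy data (hypothetical identifiers and values, for illustration only).
toy_life_spans = [70, 72, 75, 80, 82, 85, 88, 90]
toy_genotypes = {
    "rsA": [0, 0, 0, 1, 1, 1, 2, 2],  # more minor alleles, longer life
    "rsB": [1, 0, 2, 0, 1, 2, 0, 1],  # no clear trend
    "rsC": [2, 2, 2, 1, 1, 1, 0, 0],  # negative slope, never selected
}
toy_selected = select_longevity_snps(toy_genotypes, toy_life_spans)
```

In this toy example only the first SNP satisfies both conditions; a SNP with a strong but negative slope is discarded because the procedure keeps positive slopes only.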
To evaluate the joint effect of genetic variants on life span, we calculated the number of longevity SNPs (from the selected set of 169 SNPs) contained in the genome of each individual in the study and performed regression analyses considering life span as a linear function of the number of longevity SNPs contained in a person's genome. The estimates of both the intercept and the slope were positive and highly statistically significant (Figure 1).
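A minimal Python sketch of this "genetic dose" calculation follows. It assumes that an individual "contains" a longevity SNP when at least one copy of the minor allele is present; counting allele copies instead would be an equally plausible reading, and the text does not settle the point. Function names and toy data are hypothetical.

```python
def genetic_dose(genotypes, selected):
    """Per individual, count the selected SNPs carried in at least one copy.

    genotypes maps a SNP identifier to a list of 0/1/2 codes, one per person.
    The 'at least one copy' rule is an assumption, not taken from the study.
    """
    n = len(next(iter(genotypes.values())))
    return [sum(1 for snp in selected if genotypes[snp][i] > 0)
            for i in range(n)]

def dose_response(dose, life_spans):
    """Fit life_span = intercept + slope * dose; return (intercept, slope, R^2)."""
    n = len(dose)
    mx = sum(dose) / n
    my = sum(life_spans) / n
    sxx = sum((d - mx) ** 2 for d in dose)
    syy = sum((y - my) ** 2 for y in life_spans)
    if sxx == 0 or syy == 0:
        raise ValueError("dose and life span must both vary across individuals")
    sxy = sum((d - mx) * (y - my) for d, y in zip(dose, life_spans))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy * sxy / (sxx * syy)  # share of variance explained
    return intercept, slope, r_squared

# Toy example with hypothetical SNP identifiers.
toy_genotypes = {"rs1": [0, 1, 2, 0], "rs2": [0, 0, 1, 1], "rs3": [2, 2, 2, 2]}
toy_dose = genetic_dose(toy_genotypes, ["rs1", "rs2"])
toy_fit = dose_response(toy_dose, [70, 75, 80, 75])
```

The third value returned by `dose_response` is the share of life span variance explained by the fitted line.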
The estimated dependence explained 21% of the variance in life span. This estimate seems reasonable given that the narrow-sense heritability of life span is estimated at about 25% [10]. The estimated relationship between life span and the number of longevity SNPs shown in Figure 1 is the main result of this paper. It shows that in studies of genetic determinants of longevity the joint influence of many small-effect genetic variants may be substantial. We suggest that a similar "genetic dose" - "phenotypic response" relationship is likely to characterize genetic influence on many other complex traits.
Two aspects of the performed analyses require additional testing. The first is the use of data on all genotyped individuals from the original FHS cohort, which includes first-degree relatives from 618 families. The second is the fact that the two procedures, (i) selection of longevity SNPs and (ii) testing for their joint influence on life span, used data on the same individuals. To check whether excluding relatives from the list of study subjects modifies the results, we randomly selected 618 individuals, one from each family, identified a set of "longevity" SNPs using the procedure described above, and estimated the dependence of life span on the number of selected longevity SNPs in these individuals. To diminish the effect of sampling, we repeated this procedure 10 times. In each such analysis, the estimates of slope and intercept were positive and highly statistically significant, with p ≤ 10⁻¹⁹. These results suggest that the conclusion about the joint influence of longevity SNPs on life span does not depend on the presence or absence of relatives among the study subjects. To take into account the variants selected in each experiment, we took the union of the sets of longevity SNPs selected in the 10 experiments. This procedure resulted in a set of 70 genetic variants.
Note that reducing the number of study subjects (by excluding genetically dependent individuals) increases the chance of selecting false-positive variants. To diminish the number of such variants, we intersected the set of 70 SNPs with the set of 169 SNPs selected earlier using data on the entire FHS cohort. This procedure resulted in 39 longevity SNPs.
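The resampling logic (one randomly chosen individual per family, repeated selection, union across repeats, intersection with the full-cohort set) can be sketched in Python as follows. The function names and the `select_fn` callback are hypothetical placeholders: `select_fn` stands for whatever SNP selection procedure is applied to a list of individuals.

```python
import random

def one_per_family(family_of, rng):
    """Draw one individual at random from each family.

    family_of maps a person identifier to a family identifier.
    """
    by_family = {}
    for person, family in family_of.items():
        by_family.setdefault(family, []).append(person)
    return sorted(rng.choice(sorted(members)) for members in by_family.values())

def consensus_snps(family_of, select_fn, full_cohort_snps, n_repeats=10, seed=0):
    """Union of the SNP sets selected over repeated one-per-family samples,
    then its intersection with the set selected on the entire cohort."""
    rng = random.Random(seed)
    union = set()
    for _ in range(n_repeats):
        sample = one_per_family(family_of, rng)
        union |= set(select_fn(sample))
    return union, union & set(full_cohort_snps)

# Toy usage with a stand-in selection function (hypothetical identifiers).
toy_families = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 3}

def toy_select(people):
    return {"rs1", "rs2"} if "a" in people else {"rs1", "rs3"}

toy_union, toy_kept = consensus_snps(toy_families, toy_select, {"rs1", "rs9"})
```

Intersecting with the full-cohort set drops any variant that appears only in the smaller samples, which is the rationale given above for reducing false positives.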
This set of 39 SNPs was then used in regression analyses where life span was considered as a linear function of the number of longevity SNPs contained in a person's genome. The result is shown in Figure 2. The analyses showed that the estimates of both the intercept and the slope are highly statistically significant. The estimated dependence of life span on the number of longevity SNPs explains 19% of the variance in life span, which is close to the 21% estimated earlier. Thus, the presence of relatives in the population used for selecting longevity SNPs does not affect the conclusion about the presence of a "genetic dose" - "phenotypic response" relationship. The fact that the 39 selected SNPs explained almost the same percentage of life span variance as the 169 SNPs selected earlier (19% vs 21%) indicates that this set of SNPs deserves further analysis. Table 1 shows how the selected SNPs are related to known genes.
The second aspect mentioned above deals with prediction and replication. If the procedures described above do select longevity variants, and if the detected pattern of joint influence of such variants on life span is a property of a biological mechanism, then genetic variants selected using data on one population should be able to predict life spans in another, genetically independent population of individuals who experienced similar environmental and living conditions. To test this, we randomly divided all 618 families into two groups. Data on individuals from the first 309 families plus data on 162 individuals with missing family identities were used for selecting SNPs having an effect on life span. Then, for each individual in the second (genetically independent) group, we identified the number of such SNPs contained in the person's genome. We estimated the parameters of the linear regression model considering life span as a function of the number of longevity variants contained in the genomes of individuals from the same (first) group and from the second (independent) group. To replicate the result, longevity SNPs selected from data on the second population were used for evaluating the linear "genetic dose" - "life span response" relationship on the same population, as well as on the first population of individuals, genetically independent from the second one. To reduce the sampling effect, the procedure of randomly dividing the 618 families into two groups, with subsequent selection of longevity variants and estimation of regression coefficients in the "genetic dose - phenotypic response" relationship, was repeated 10 times. The results are shown in Table 2.
The possibility of using a straight line to approximate the "number of longevity SNPs - life span" relationship indicates the presence of a substantial additive component in the genetic contribution to longevity.
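The split-sample design described above (select variants in one half of the families, evaluate the fit in both halves, then swap roles) can be sketched in Python. The callbacks `select_fn` and `fit_fn` are hypothetical placeholders for the selection and regression steps, and the sketch omits the 162 individuals without family identities, who in the study were added to the first group.

```python
import random

def split_families(family_ids, rng):
    """Randomly divide the distinct family identifiers into two halves."""
    ids = sorted(set(family_ids))
    rng.shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])

def cross_evaluate(family_of, select_fn, fit_fn, n_repeats=10, seed=0):
    """Repeat the split; select SNPs in each half and fit in both halves.

    family_of maps a person identifier to a family identifier.
    select_fn(people) returns the SNPs selected on those people.
    fit_fn(snps, people) returns the regression result on those people.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        fam_a, fam_b = split_families(family_of.values(), rng)
        group_a = sorted(p for p, f in family_of.items() if f in fam_a)
        group_b = sorted(p for p, f in family_of.items() if f in fam_b)
        snps_a = select_fn(group_a)
        snps_b = select_fn(group_b)
        results.append({
            "a_on_a": fit_fn(snps_a, group_a),  # same-sample fit
            "a_on_b": fit_fn(snps_a, group_b),  # independent-sample fit
            "b_on_b": fit_fn(snps_b, group_b),
            "b_on_a": fit_fn(snps_b, group_a),
        })
    return results

# Toy usage: 8 people in 4 families, with stand-in callbacks.
toy_family_of = {f"p{i}": i // 2 for i in range(8)}
toy_results = cross_evaluate(
    toy_family_of,
    select_fn=lambda people: ["rs1"],
    fit_fn=lambda snps, people: len(people),
    n_repeats=3,
)
```

Because whole families are assigned to one group or the other, no relatives are shared between the selection sample and the independent evaluation sample.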
It is relevant to note that additive genetic effects were the subject of numerous studies in quantitative genetics of the pre-genomic era. Many genetic calculations (e.g., estimates of narrow-sense heritability of complex traits) were based on the assumption that the genetic component of phenotypic variation is additive. The availability of genome-wide data now allows such effects to be evaluated directly. Moreover, evaluating non-additive (non-linear) joint genetic influence (epistasis) also becomes possible with the use of more sophisticated patterns of the "dose - response" relationship.
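The additive relationship can be written compactly. The notation is introduced here for illustration and is an assumption, not part of the original analysis: L_i is the life span of individual i, n_i is the number of longevity SNPs in that individual's genome, α and β are the intercept and slope, and ε_i is a residual. R² is the share of life span variance explained by the fitted line. The quadratic term with coefficient γ is only one hypothetical example of a non-linear "dose - response" pattern.

```latex
% Linear (additive) dose-response model
L_i = \alpha + \beta\, n_i + \varepsilon_i

% Share of variance explained by the fitted line
R^2 = 1 - \frac{\sum_i \bigl(L_i - \hat{\alpha} - \hat{\beta}\, n_i\bigr)^2}
               {\sum_i \bigl(L_i - \bar{L}\bigr)^2}

% One possible non-linear extension (hypothetical illustration)
L_i = \alpha + \beta\, n_i + \gamma\, n_i^2 + \varepsilon_i
```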
While the replication of findings has become a standard requirement in GWAS, the results of our analyses suggest that this practice needs to be revised when studying the joint effect of many alleles. Our analyses show that one should not expect exactly the same sets of genetic variants to contribute to the "genetic dose - phenotypic response" relationship evaluated using data on another population. One reason for this may be gene-environment interaction: differences in populations' exposure to external conditions are likely to produce differences in the genetic regulation of the trait in these populations. Identification of genetic variants "sensitive" to specific external signals will open new opportunities for studying the role of genetic and non-genetic factors in complex traits.
Table 2. The results of 10 experiments in which genetic variants individually affecting life span (longevity SNPs) were selected twice, using data on two populations of genetically independent genotyped individuals in the original Framingham Heart Study (FHS) cohort for whom life span data are available. The longevity SNPs selected from data on the first population were used for evaluating the linear "genetic dose" - "life span response" relationship on the same population, as well as on the second population of individuals. In turn, longevity SNPs selected from data on the second population were used for evaluating the linear "genetic dose" - "life span response" relationship on the same population, as well as on the first population of individuals. Column "#" shows the experiment's number.