Identification and replication of novel genetic variants of ABO gene to reduce the incidence of diseases and promote longevity by modulating lipid homeostasis

Genes related to human longevity have not yet been studied thoroughly and warrant further investigation. This study explores the relationship among ABO gene variants, lipid levels, and the longevity phenotype in individuals (≥90 years old) without adverse outcomes. A genotype-phenotype study was performed on 5803 longevity subjects and 7026 younger controls from the Chinese Longitudinal Healthy Longevity Survey (CLHLS). Four ABO gene variants associated with healthy longevity (rs8176719 C, rs687621 G, rs643434 A, and rs505922 C) were identified and replicated in the CLHLS GWAS data and were significantly more frequent in longevity individuals than in controls. The Bonferroni-adjusted p-values ranged from 0.013 to 0.020 and the ORs from 1.126 to 1.151. Linkage disequilibrium (LD) analysis showed that the four variants formed a block in the ABO gene (D' = 1, r² range = 0.585-0.995). In the genetic model analysis, carriers of genotypes rs687621 GG, rs643434 AX, or rs505922 CX (p range = 2.728 × 10⁻¹⁰⁷ to 5.940 × 10⁻¹⁴; OR range = 1.004-4.354) and of haplotype CGAC/XGXX (p = 2.557 × 10⁻²⁷; OR = 2.255) were strongly associated with longevity. In the genotype and metabolic phenotype analysis, longevity individuals with rs687621 GG, rs643434 AX, and rs505922 CX showed a positive association with normal levels of HDL-c, LDL-c, TC, and TG (p range = 2.200 × 10⁻⁵ to 0.036, OR range = 1.546-1.709) and with normal BMI (p range = 2.690 × 10⁻⁴ to 0.026, OR range = 1.530-1.997). Finally, two pathways, one involving vWF/ADAMTS13 and one involving the inflammatory markers sE-selectin/ICAM1, were speculated to co-regulate lipid levels through glycosylation and mutual interaction. In conclusion, the identified longevity-associated ABO variants were associated with a healthier lipid profile, and these findings may help in maintaining normal lipid metabolic phenotypes in the longevity population.


INTRODUCTION
A healthy life span is a complex phenotype that is influenced by both genetic and environmental factors. It has been observed that the influence of genetic factors increases with age [1]. Based on recent genetic studies, more than 50 different genes are associated with longevity in different populations [2][3][4][5][6]. Many reported studies have revealed that individuals with a life span of ≥ 90 years had several healthy genetic variants, indicating the importance of genetic contribution to a longer life span. Some of these variants were found to be associated with plasma lipid homeostasis that could delay the onset or prevent diseases and promote a longer life span [4].
The balance between metabolism and plasma lipids is vital for physiological turnover. The results of the Long Life Family Study (LLFS), an international collaborative study, showed that individuals with a longer life span had a better lipid profile [7,8]. The molecular composition and concentration of lipid species are indicative of their cellular localization, metabolism, and, consequently, their impact on age-related diseases and a healthy life span [9]. Previous studies have identified a few loci associated with longevity and involved in lipid metabolism, such as APOE ε2, TOMM40 rs2075650, FOXO3A rs2802292, CETP rs5882, and HLA-DQB1 rs1049107 and rs1049100, in individuals with an exceptionally long life span [10][11][12][13].
Recently, our group has successively reported several lipid metabolism-related genetic variants associated with a healthy life span. However, the overall genetic basis of these variants has not been identified, and there may be more as-yet unexplained genetic variants whose cumulative influence increases longevity by altering and maintaining lipid homeostasis [10][11][12][13]. Multiple gene interaction networks in the body together maintain physiological balance, including lipid metabolism. We therefore sought additional genetic variants that promote longevity and metabolic balance, and aimed to explain their biological significance through multi-gene network interactions.
Several studies have linked the ABO gene to longevity [14][15][16]. Fortney et al. (2015) evaluated and replicated five loci, including rs514659 in ABO, in Caucasians by applying informed genome-wide association studies (iGWAS) [17]. Timmers et al. used a genome-wide association (GWA) study of 1 million parental lifespans of genotyped subjects, together with data on mortality risk factors, to identify and replicate rs2519093 in ABO in the English population [18]. However, the role of ABO variants in longevity remains unclear in other populations, such as the Chinese population. It is therefore important to carry out this study in a Chinese cohort to confirm which ABO variants are associated with human longevity.
In addition, other ABO SNPs that are potential causal loci related to lipid homeostasis and health were subsequently discovered using NGS. Previous research suggested that individuals with the ABO genotype rs8176719 CC had improved overall cardiovascular health and increased longevity via plasma lipid levels [14,19,20,21]. According to a meta-analysis of the LURIC and YFS cohorts, the minor A allele of rs657152 in the ABO gene was significantly associated with greater cholesterol absorption, which disrupts healthy aging [22]. Another study found that the major rs644234*T allele of the ABO gene was associated with decreased levels of apolipoprotein E (ApoE), a multifunctional protein involved in lipid metabolism and longevity [23][24][25]. Hence, loci in the ABO gene that are associated with both longevity and lipid metabolism need to be identified. So far, there are few reports on genetic variants of the ABO gene and plasma lipids associated with healthy longevity, and the genetic mechanism by which ABO gene variants protect against lipid metabolic disorders and promote healthy aging is unknown.
Hence, the current study explored ABO gene variants that maintain plasma lipid homeostasis and promote healthy longevity. Based on the CLHLS, a population genetic analysis was conducted in the Chinese population to find ABO gene variants linked to a long life span and normal plasma lipid levels. We used genome-wide association studies (GWAS), metabolic phenomics technology, and combined analysis to identify possibly beneficial variants by comparing longevity and age-specific control groups in these population cohorts. The results offer a new perspective on healthy long life and aging.

Identification of new longevity-associated ABO variations
First, raw data were collected from GWAS phases I and II, and data quality control procedures were applied for sample screening. After quality control, 5803 longevity subjects and 7026 young controls with genotype data remained. Then, based on the chromosomal position of the ABO gene (chromosome 9: 136125788-136150617), 80% of the participants, including 4437 longevity individuals and 5627 young controls with genotype data, were randomly selected to identify variants in the ABO gene.
Seven variants in the ABO gene (rs8176722, rs8176719, rs687621, rs2519093, rs514659, rs643434, and rs505922) were genotyped, and four of them were associated with longevity (p ≤ 0.05), as shown in Figure 1A. The flowchart of the sequential analysis steps is shown in Figure 2.

Genotype-phenotype study of longevity-associated variants and plasma lipid or BMI
There were 2,527 longevity subjects with an average age of 96.06 years and 3,259 young controls with an average age of 70.00 years in the samples with integrated epidemiological data. The longevity subjects comprised 1455 nonagenarians and 1072 centenarians. Sex, disease history, BMI, plasma lipids, blood pressure, and blood glucose were compared between the age groups. We found statistically significant differences in the distribution of sex (p = 1.575 × 10⁻⁵⁷), BMI (p = 2.359 × 10⁻³), and lipid levels (p = 8.000 × 10⁻⁵) between longevity subjects and young controls (Supplementary Table 4).
Note: X represents the major allele or minor allele of the corresponding SNP.

Relationship between longevity-associated variants and plasma lipid homeostasis
Analysis of the lipid metabolism indices (HDL-c, LDL-c, TG, and TC) showed that the longevity samples had lower levels of LDL-c (p = 1.700 × 10⁻⁵), TG (p = 1.275 × 10⁻²²), and TC (p = 0.011). There were significant differences in the levels of LDL-c (p = 7.669 × 10⁻⁷), TG (p = 2.522 × 10⁻¹⁶), and TC (p = 6.400 × 10⁻⁵) between nonagenarians and the young controls. Only two indices, TG (p = 2.941 × 10⁻¹³) and HDL-c (p = 0.049), differed significantly between centenarians and the young controls. TG differed significantly in every comparison between the age groups (Supplementary Table 5).

Identification of longevity-associated variants and haplotypes
Longevity is a highly complex phenotype that is influenced by genetic as well as environmental factors. Various cut-offs have been used to define longevity, including 85+, 90+, and 100+ years, and the impact of these differences has been addressed in Broer's paper (2015) [3]. In this study, the longevity phenotype refers to individuals (≥90 years old) without major health complications, including CVD, cancer, diabetes, and hypertension. Individuals who have a longer life span with a lower risk of aging-associated diseases are regarded as a model of healthy aging. Our previous genetic research has identified some longevity-associated factors, such as FOXO3 [27], IGFBP-3 [28], CETP [29], SIRT1 [30], and HLA-DQB1 [10].
According to reported studies, ABO has been associated with blood transfusions, organ transplants, and diseases such as cancer and coronary heart disease (CHD), as well as with lower circulating cholesterol levels [31][32][33][34]. However, after multiple GWAS database analyses, Fortney et al. proposed that ABO may be associated with longevity [17]. We hypothesized that some ABO variants are associated with longevity in the Chinese population.
In our cohort, we identified and replicated four SNPs in the ABO gene that were associated with healthy aging and longevity (rs8176719, rs687621, rs643434, and rs505922); three of these variants have not been identified in previous studies on longevity. All four variants differed significantly in frequency between the longevity subjects and the young controls, which suggests that they are longevity-associated genetic variants that could extend lifespan through healthy aging. Next, by analyzing 5803 longevity subjects and 7026 young controls, we showed that a single-nucleotide insertion in codon 87 (rs8176719) formed a strong linkage disequilibrium (LD) block (r² = 0.944) with rs687621, rs643434, and rs505922 in the ABO gene. This is the first study to report that rs8176719 C, rs687621 G, rs643434 A, and rs505922 C (Bonferroni-adjusted p range = 0.013-0.020; OR range = 1.126-1.151) are associated with healthy longevity. Our study focused on ABO variants associated with longevity in the Chinese population. We identified three novel variants (rs687621, rs643434, and rs505922) of the ABO gene that differ from those reported in Caucasians and replicated one previously reported ABO allele (rs8176719) [17,18]. These results indicate that ABO gene variants are associated with human longevity, but that the specific variants differ among populations.
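The two LD statistics quoted for the ABO block (D' and r²) can be illustrated with a minimal sketch for a pair of biallelic SNPs. The function name `ld_stats` and the haplotype frequencies below are hypothetical and are not taken from the CLHLS data; the study itself used Haploview for this step.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    All inputs here are illustrative, not study data.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium coefficient
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # Squared correlation between the two loci
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical case: allele B (freq 0.3) only ever occurs together with A (freq 0.4)
d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.4, p_b=0.3)
print(round(d_prime, 3), round(r2, 3))
```

With these made-up frequencies D' is 1 while r² is about 0.64, which shows how a block can have D' = 1 and still span a range of r² values when the allele frequencies of the SNPs differ.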

Longevity variants associated with lipid homeostasis in individuals with a longer life span
Many longevity-associated variants were found that were potentially associated with maintaining the balance of plasma lipids. Several observational studies have found that increases in TG levels are associated with an increase in the risk of morbidity and mortality related to aging-associated diseases [35,36]. In the Leiden Longevity Study (LLS), lower levels of TG, one of the biomarkers of healthy aging, were found to decrease morbidity associated with aging-related disorders [37,38].
Hence, we identified ABO variants associated with two phenotypes in the Chinese population: longevity and normal lipid levels. Because the selection of longevity and local control individuals for analysis could introduce bias, we compared the major demographic characteristics of the included participants (2527 longevity, 3259 controls) with those of the excluded participants (3276 longevity, 3767 controls). We also compared the included participants with the total sample (5803 longevity, 7026 controls). No pairwise comparison showed a statistically significant difference (Supplementary Table 6). We therefore consider the included subjects (2527 longevity, 3259 controls) to be representative of all participants. In addition, stratified analyses of lipid metabolism by genotype and age showed no evidence of selection bias (Supplementary Tables 7-10). We therefore hypothesized that ABO variants are significantly correlated with longevity and normal lipid levels in the Chinese population, which needs further investigation.

Functional analysis of the new healthy-associated variants in ABO
The ABO gene (chromosome 9q34.2) determines the presence of antigens on the surface of red blood cells. Except for rs8176719, the three novel SNPs identified in our study (rs687621, rs643434, and rs505922) were all located in intronic regions. Data from ENCODE showed that rs687621 lies in a region marked by enhancer histone marks and could act as an expression quantitative trait locus (eQTL). It is possible that ABO expression is increased by other variants proxied by rs687621 [39]. The other two SNPs, rs643434 and rs505922, located in intron 1 of the ABO gene, were in strong LD (r² = 0.994). The noncoding transcript exon variant rs8176719 is a frameshift mutation in exon 6. Because it lies in a potential open chromatin region with several epigenetic markers, a transcription factor binding site, and evolutionary conservation, the combined predictions from ENCODE, ChIP-seq, and UCSC suggest that rs8176719 might be crucial for gene regulation [40].
The glycosylation of soluble cell adhesion molecules links the ABO blood group antigens to E-selectin ligand-1 and P-selectin glycoprotein ligand-1 [41]. ABO SNPs alter lipid levels by acting on the clearance and glycosylation of membrane molecules, including biomarkers such as the soluble cell adhesion molecules sE-selectin, sP-selectin, and ICAM1 [42]. Glycosylation can occur on the ligand itself, on the receptor, and on key signaling enzymes and effector proteins. Regarding lipids, O-linked glycosylation, which is generally initiated by the addition of the monosaccharide N-acetylgalactosamine to the hydroxyl group of serine or threonine residues (GalNAcα1-O-Ser/Thr), is critical for LDL receptor stability and for stable expression of very low-density lipoprotein receptors on the cell surface. Gene interaction analysis revealed an interaction between ABO and ADAMTS13, as represented in Figure 1 (Supplementary Figure 2). Some studies indicated that individuals carrying rs8176719 CC have plasma levels of von Willebrand factor (VWF) 25% lower than individuals carrying the rs8176746 A allele, owing to increased proteolysis and clearance of VWF at the Tyr1605-Met1606 bond by ADAMTS13 [20,43]; this specifically inhibits platelet deposition and inflammation and reduces the risk of death [41].
In individuals carrying rs687621, rs643434, and rs505922, TG concentrations may be altered through glycosylation of target molecules via the O-linked sugar domain; this may stabilize circulating inflammatory markers and lipid levels by promoting healthy lipid metabolism, thus contributing to healthy longevity (Figure 4).
Hence, we identified and replicated four longevity-associated variants in our cohort, as well as a new haplotype in the ABO gene linked to longevity. The analysis of genotype and metabolic phenotypes then showed that longevity individuals with rs687621 GG, rs643434 AX (AG+AA), and rs505922 CX (CT+CC) were associated with normal lipid levels and BMI. Lastly, two pathways involving vWF/ADAMTS13 and the inflammatory markers were speculated to co-regulate lipid levels through glycosylation and mutual interaction. We therefore infer that individuals with longevity-associated variants have an improved cardiovascular profile, which may lower the risk of aging-related disorders and maintain good physical condition, resulting in a longer life. Although we showed a relationship between the ABO blood group and healthy longevity, the mechanism linking ABO blood group antigens and lipid metabolism is so far supported only by several indirect lines of evidence. In addition, healthy longevity could be studied in cell or animal models using new technologies such as single-cell sequencing, CRISPR/Cas9, and 3D organ models. A better understanding of the mechanisms of longevity is needed to achieve healthy aging for all human communities.

CONCLUSIONS
The present study revealed that rs8176719 C, rs687621 G, rs643434 A, and rs505922 C of the ABO gene were not only longevity-associated genetic variants but also lipid homeostasis-associated variants in our cohort. These variants probably altered triglyceride concentrations through glycosylation of target molecules via the O-linked sugar domain and promoted healthy lipid metabolism, thereby contributing to longevity. Our results showed that the ABO longevity-associated genotypes (rs687621 GG, rs643434 AX, and rs505922 CX) could promote lipid homeostasis. Further functional and mechanistic studies should be conducted to better understand the molecular mechanism of longevity associated with ABO and lipid homeostasis.

Subjects
All experimental procedures were reviewed and approved by the Ethics Committee of Beijing Hospital, Ministry of Health, China. We obtained written consent from all participants before study initiation. All clinical investigations were conducted following the principles of the Declaration of Helsinki.
We interviewed all consenting long-lived individuals in the sampled counties and cities. Young and middle-aged controls (30-85 years old) were recruited in the same county/city as the long-lived individuals and had to satisfy one specific criterion: no family history of longevity (no lineal family members within three generations aged above 85) [4,44].

DNA extraction and genotyping
DNA was extracted from whole blood and hybridized following the manufacturer's instructions. A total of 257 longevity individuals (aged 102.04±2.05 years) were genotyped using Illumina HumanOmniZhongHua-8 BeadChips, which were created by strategically selecting optimized tag SNP content from all three HapMap phases and the 1000 Genomes Project (1 kGP). The chip represents a state-of-the-art choice for GWAS in Asian populations to maximize international compatibility [4]. After standard GWAS quality-control filtering for subjects, we obtained a total of 818048 genotyped SNPs.
Phase II of the GWAS comprised 5546 longevity subjects (aged 97.66±4.96 years) and 7026 young controls (aged 67.72±13.65 years) from CLHLS. Based on previous research, phase II used a custom SNP chip with 27,656 selected longevity- and disease-related SNPs for targeted genotyping [44].

Genome-wide association analysis
We combined the raw data from GWAS phases I and II and completed the sample filtering and data quality control procedures. After filtering, 5803 longevity subjects and 7026 young controls with genotype data remained. We then randomly assigned 80% of the genotyped participants to the discovery set and 20% to the validation set.
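A random discovery/validation split of this kind can be sketched as follows. This is only an illustration assuming simple random sampling without stratification; the function name, the seed, and the integer sample IDs are hypothetical, and the authors' actual sampling procedure is not described in more detail in the text.

```python
import random


def split_discovery_validation(sample_ids, discovery_fraction=0.8, seed=0):
    """Randomly split sample IDs into discovery and validation sets.

    The fixed seed is an assumption added here so that the split is reproducible.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)  # random permutation of the samples
    cut = round(len(ids) * discovery_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical IDs standing in for genotyped participants
discovery, validation = split_discovery_validation(range(1000))
print(len(discovery), len(validation))
```

Splitting cases and controls separately with the same function would keep the case/control ratio equal in both sets, which is one possible design choice.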
Laboratory parameters and genotypic data for the present GWAS were from CLHLS and were provided by the Center for Healthy Aging and Development Studies, National School of Development, Peking University.

Selection of variations and genotyping
We identified the variants associated with longevity from 80% of the GWAS phase I + II samples (4437 longevity subjects and 5627 young controls) based on the chromosomal location of the ABO gene (chromosome 9: 136125788-136150617). Four variants in the ABO gene with a minor allele frequency (MAF) greater than 10% were identified as candidate variants. We included duplicate samples from 1128 longevity subjects and 1397 young controls as quality controls to verify the reliability of the variants (Supplementary Table 11). Finally, four genetic variants were identified as longevity-associated genetic variants. Bonferroni correction was applied to the multiple comparisons in the CLHLS GWAS phase I and II study, and p-values ≤ 0.025 were considered significant. Haploview software was used for haplotype analysis. The 3D Genome Browser (http://3dgenome.fsm.northwestern.edu/) was used to examine three-dimensional genome interactions.
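As a reminder of how a Bonferroni correction works, the sketch below divides the nominal significance level by the number of comparisons, or equivalently multiplies each p-value by that number. The threshold of 0.025 is consistent with a nominal level of 0.05 and two comparisons, but the number of comparisons is an assumption here, since the text does not state it; the function names and example p-values are likewise hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Assumed: nominal alpha of 0.05 and two comparisons
print(bonferroni_threshold(0.05, 2))

# Hypothetical raw p-values, not taken from the study
print(bonferroni_adjust([0.01, 0.5, 0.004]))
```

Comparing raw p-values against the reduced threshold and comparing adjusted p-values against the nominal level give the same decisions.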

Association of variation and metabolic genotype in longevity
In CLHLS, there were 2527 longevity individuals (aged 90-114 years) and 3259 young controls (aged 38-85 years). Both groups had integrated epidemiological questionnaire data as well as biochemical indexes.

Genetic model analysis
Long-lived individuals carry specific variants associated with longevity; that is, the base sequence of the gene differs (partially or completely) from that of typical individuals. The degree of association between genotype and phenotype is known to vary between risk and non-risk SNPs. Therefore, following Mendelian modes of inheritance, we compared the frequencies of carriers and non-carriers of each variant between longevity subjects and controls. The strength of association between genotype and phenotype was estimated using the odds ratio (OR). A p-value ≤ 0.05 was considered statistically significant and p ≤ 0.01 was considered highly significant.
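The carrier versus non-carrier comparison can be sketched as follows: genotype counts are collapsed under a dominant model (for example AX = AA + AG versus GG) or a recessive model (for example GG versus the rest), and an odds ratio with a 95% confidence interval is computed from the resulting 2 × 2 table. The Woolf (logit) interval used here is an assumption, since the text does not say which interval method was applied, and all counts and function names are hypothetical.

```python
import math


def collapse(counts, model):
    """Collapse (n_AA, n_Aa, n_aa) into (carriers, non_carriers); A is the tested allele."""
    n_AA, n_Aa, n_aa = counts
    if model == "dominant":   # AX carriers (AA + Aa) versus aa
        return n_AA + n_Aa, n_aa
    if model == "recessive":  # AA versus Aa + aa
        return n_AA, n_Aa + n_aa
    raise ValueError("model must be 'dominant' or 'recessive'")


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Woolf 95% CI for a 2x2 table.

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Hypothetical genotype counts (AA, Aa, aa), not study data
cases = collapse((120, 480, 400), "dominant")
controls = collapse((100, 400, 500), "dominant")
or_, lower, upper = odds_ratio_ci(cases[0], cases[1], controls[0], controls[1])
print(round(or_, 3), round(lower, 3), round(upper, 3))
```

An interval that excludes 1 corresponds to a significant association at the chosen level; cells with zero counts would need a continuity correction, which this sketch omits.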

Statistical analysis
The Statistical Package for Social Sciences (SPSS) for Windows, version 19.0 (SPSS Inc., Chicago, IL, USA) was used for statistical analysis. Gene counting was used to determine the differences in the distribution of genotype and allele frequencies, and the χ² goodness-of-fit test was used to test for deviations from Hardy-Weinberg equilibrium (HWE) for all SNPs. The odds ratio (OR) with its 95% confidence interval (95% CI) was used to estimate the strength of association between variables. p ≤ 0.05 was considered statistically significant. The mean and standard deviation (SD) were used to describe the normally distributed plasma lipid levels as continuous variables.
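The HWE goodness-of-fit test can be sketched in a few lines. This is an illustration with hypothetical genotype counts and a hypothetical function name rather than the SPSS procedure used in the study; it assumes one degree of freedom, for which the χ² tail probability equals erfc(√(χ²/2)).

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic")
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the first set is exactly in HWE, the second lacks heterozygotes
print(hwe_chi_square(250, 500, 250))
print(hwe_chi_square(300, 400, 300))
```

A small p-value indicates deviation from HWE, which in practice is often used as a genotyping quality check in control samples.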

Availability of data and material
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

ACKNOWLEDGMENTS
We thank all participants involved in the study. We thank all subjects who offered their genomic DNA and clinical information for this study and appreciate the work of all clinicians who helped evaluate samples and data.
