Identification of novel genetic variants associated with cardiorespiratory fitness

a Cardiac Exercise Research Group (CERG) at Dept. of Circulation and Medical Imaging, Faculty of Medicine, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
b Department of Cardiology, St. Olavs Hospital, Trondheim, Norway
c Genomics Core Facility (GCF), Norwegian University of Science and Technology (NTNU), Trondheim, Norway
d Institute for Experimental Medical Research, Oslo University Hospital and University of Oslo, Oslo, Norway
e School of Human Movement & Nutrition Sciences, University of Queensland, Brisbane, Australia


Introduction
Low aerobic fitness, quantified as maximal oxygen uptake (VO2max), is a strong and independent predictor of all-cause and cardiovascular mortality in healthy individuals and in patients with cardiovascular disease (CVD). 1-4 VO2max is determined by a combination of genetic and environmental factors, and the genetic contribution is suggested to be ~50%. 5,6 Identification of genes and genomic variations associated with VO2max would lead to a better understanding of this complex trait, and provide possible links between VO2max and CVD. Previously, a few genes and genomic loci have been associated with VO2max. 7-9 However, most studies are limited in size and employ the conventional hypothesis-driven approach of searching for pre-specified genomic associations, which limits the discovery of new genetic loci. Hence, the scientific community has called for a large-scale systematic screening of genetic variants associated with directly measured VO2max in a large, well-characterized population. 10 By taking advantage of one of the world's largest databases of directly measured VO2max, we report the first large-scale systematic screening for genetic variants associated with VO2max. Furthermore, we explore potential associations between VO2max-related single-nucleotide polymorphisms (SNPs) and CVD risk factors, and their potential biological implications, by using in silico tools and genotype-phenotype databases.

Study participants
The Nord-Trøndelag Health Study (HUNT) is one of the largest health studies ever performed. It includes a unique database of questionnaire data, clinical measurements and biological samples. During the past 35 years, 120,000 individuals have contributed throughout four waves of the HUNT study (HUNT1 in 1984-86, HUNT2 in 1995-97, HUNT3 in 2006-08 and HUNT4 in 2017) in Norway. Participants in the present study attended a sub-project during the third wave of HUNT (HUNT3 Fitness Study) designed to directly measure maximal oxygen uptake (VO2max) in a healthy adult population. 11 Exclusion criteria for the HUNT3 Fitness Study were present or previous heart disease, stroke, angina, lung disease (asthma, chronic bronchitis, chronic obstructive pulmonary disease, and sarcoidosis), cancer, current pregnancy, orthopedic limitations and use of antihypertensive medication. In total, 3470 participants who reached a true VO2max were selected for genotyping after excluding first- and second-degree relatives (siblings, parents, children, grandparents, aunts, uncles or grandchildren). Close relatives were excluded both by using data from Statistics Norway and by searching for segmental sharing using PLINK. 12 In the validation cohort, DNA samples were analyzed from 718 participants from the Generation 100 Study. 13 This cohort includes both men and women, aged 70-77 years, who performed a gold-standard VO2max test and reached a true VO2max using the same criteria as the HUNT3 Fitness Study. Exclusion criteria were present or previous heart or lung disease, cancer, and medical contraindication or orthopedic limitation to exercise. This study was approved by the Regional Committee for Medical Research Ethics (4.2008.2792), the Nord-Trøndelag Health Study, the Norwegian Data Inspectorate, and by the National Directorate of Health. The study was in conformity with Norwegian laws and the Helsinki declaration, and signed informed consent was obtained from all participants.

Clinical measurements
Weight and height were measured on a combined scale (Model DS-102, Arctic Heating AS, Nøtterøy, Norway), and body mass index (BMI) was calculated as weight divided by height squared (kg/m²). Fat percentage, muscle percentage and visceral fat were obtained using the InBody 720 scale (Biospace, Seoul, Korea).

Testing maximal oxygen uptake (VO2max)
An individualized protocol was applied to measure VO2max. 14 Each test subject was familiarized with treadmill walking during the warm-up of 8-10 min, which also served to ensure safety and to avoid handrail grasp when this was not absolutely necessary. Oxygen uptake kinetics were measured directly by a portable mixing chamber gas analyzer (Cortex MetaMax II, Cortex, Leipzig, Germany) with the participants wearing a tight face mask (Hans Rudolph, Germany) connected to the MetaMax II device. The system has previously been found reliable and valid in our laboratory. Heart rate was measured by radio telemetry (Polar S610i, Polar Electro Oy, Kempele, Finland). From the warm-up pace, the load was regularly increased. When the participants reached an oxygen consumption that was stable for more than 30 s, treadmill inclination (1-2%) or velocity (0.5-1 km/h) was increased stepwise until the participants were exhausted. A maximal test was achieved when the respiratory quotient reached >1.05 or when the oxygen uptake did not increase by >2 ml/kg/min despite increased workload. VO2max was measured as liters of oxygen per minute (l/min), and subsequently calculated as VO2max relative to body mass (ml/kg/min) and VO2max scaled (ml/kg^0.75/min).
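The unit conversions and the maximal-test criterion described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the thresholds are read from the text, assuming that "did not increase by >2 ml/kg/min" means an increase of at most 2 ml/kg/min.

```python
def relative_vo2max(vo2_l_min: float, body_mass_kg: float) -> float:
    """Convert absolute VO2max (l/min) to ml/kg/min."""
    return vo2_l_min * 1000.0 / body_mass_kg


def scaled_vo2max(vo2_l_min: float, body_mass_kg: float, exponent: float = 0.75) -> float:
    """Convert absolute VO2max (l/min) to allometrically scaled ml/kg^0.75/min."""
    return vo2_l_min * 1000.0 / body_mass_kg ** exponent


def is_maximal_test(respiratory_quotient: float, vo2_increase_ml_kg_min: float) -> bool:
    """Maximal test: respiratory quotient above 1.05, OR oxygen uptake rising
    by no more than 2 ml/kg/min despite an increased workload (assumed reading)."""
    return respiratory_quotient > 1.05 or vo2_increase_ml_kg_min <= 2.0


# Hypothetical participant: 3.5 l/min at 80 kg body mass.
print(relative_vo2max(3.5, 80.0))  # 43.75 ml/kg/min
print(scaled_vo2max(3.5, 80.0))    # scaled value in ml/kg^0.75/min
```

Scaling by body mass to the power 0.75 reduces the penalty that a simple per-kilogram ratio places on heavier participants, which is presumably why both forms are reported.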

Questionnaire-based information
Physical activity is likely to be the most important behavioral factor influencing VO2max, and has to be adjusted for to isolate the genetic contribution to the phenotype. Physical activity was registered based on the responses to a self-administered questionnaire. 15 The questionnaire included three questions, and the values assigned to each participant's responses (i.e. the numbers in brackets) were multiplied to calculate a physical activity index score (Kurtze score): Question 1: "How frequently do you exercise?", with the response options: "Never" (0), "Less than once a week" (0), "Once a week" (1), "2-3 times per week" (2.5) and "Almost every day" (5). Question 2: "If you exercise as frequently as once or more times a week: How hard do you push yourself?", with the response options: "I take it easy without breaking a sweat or losing my breath" (1), "I push myself so hard that I lose my breath and break into sweat" (2) and "I push myself to near exhaustion" (3). Question 3: "How long does each session last?", with the response options: "Less than 15 minutes" (0.1), "16-30 minutes" (0.38), "30 minutes to 1 hour" (0.75) and "More than 1 hour" (1.0). As the second and third questions only addressed people who exercised at least once a week, both "Never" and "Less than once a week" yielded an index score of zero. Participants with a zero score were categorized as inactive, 0.05-1.5 as low activity, 1.51-3.75 as medium activity, and 3.76-15.0 as high activity.
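The index calculation described above can be sketched as a product of three lookup values. The weights and category cut-offs are taken from the text; the dictionary keys and function names are hypothetical shorthand introduced only for this illustration.

```python
# Response weights from the questionnaire; the keys are hypothetical shorthand.
FREQUENCY = {"never": 0.0, "<1/week": 0.0, "1/week": 1.0, "2-3/week": 2.5, "almost daily": 5.0}
INTENSITY = {"easy": 1.0, "breathless and sweating": 2.0, "near exhaustion": 3.0}
DURATION = {"<15 min": 0.1, "16-30 min": 0.38, "30-60 min": 0.75, ">60 min": 1.0}


def kurtze_score(frequency, intensity=None, duration=None):
    """Multiply the three response weights; non-exercisers score zero."""
    freq = FREQUENCY[frequency]
    if freq == 0.0:
        # Questions 2 and 3 only apply to those exercising at least weekly.
        return 0.0
    return freq * INTENSITY[intensity] * DURATION[duration]


def activity_category(score):
    """Map an index score to the four activity categories used in the study."""
    if score == 0.0:
        return "inactive"
    if score <= 1.5:
        return "low"
    if score <= 3.75:
        return "medium"
    return "high"


score = kurtze_score("2-3/week", "breathless and sweating", "30-60 min")
print(score, activity_category(score))  # 3.75 medium
```

The lowest possible non-zero score is 1 × 1 × 0.1 = 0.1 and the highest is 5 × 3 × 1.0 = 15, which matches the upper bound of the high-activity category.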

Blood analysis
Standard biochemical analyses were performed on fresh venous non-fasting blood samples at Levanger Hospital, Norway. Non-fasting glucose was analyzed by hexokinase/G-6-PDH methodology reagent kit 3L82-20/3L82-40 Glucose (Abbott Diagnostics, Illinois, US), high-density lipoprotein (HDL) cholesterol by the Accelerator selective detergent methodology reagent kit 3K33-20 Ultra HDL (Abbott Diagnostics), total cholesterol by enzymatic cholesterol esterase methodology reagent kit 7D62-20 Cholesterol (Abbott Diagnostics), triglycerides by Glycerol Phosphate Oxidase methodology reagent kit 7D74 Triglyceride (Abbott Diagnostics) and C-reactive protein (CRP) by the Aeroset CRP Vario kit (Abbott Diagnostics). Triglycerides and CRP were measured in only 80% of the HUNT population. Low-density lipoprotein (LDL) cholesterol was calculated based on information on total cholesterol, HDL-cholesterol and triglycerides.

Genotyping of exploration cohort
Deoxyribonucleic acid (DNA) was extracted from blood samples stored in the HUNT biobank as described elsewhere. 16 DNA samples were analyzed by the custom-made Cardio-Metabochip including approximately 210,000 SNPs (Illumina, CA, US). The annotation on the chip is based on Genome build 36.3. The Cardio-Metabochip was designed by representatives of the following genome-wide association studies (GWAS) meta-analysis consortia: CARDIoGRAM (coronary artery disease), DIAGRAM (type 2 diabetes), GIANT (height and weight), MAGIC (glycemic traits), Lipids (lipids), ICBP-GWAS (blood pressure), and QT-IGC (QT interval). The candidate SNPs were selected according to five sets of criteria: (I) individual SNPs displaying evidence for association in GWAS meta-analysis to diseases and traits relevant to metabolic and atherosclerotic-cardiovascular endpoints, (II) detailed fine mapping of loci validated at genome-wide significance from these meta-analyses, (III) all SNPs associated at genome-wide significance with any human trait, (IV) "wildcards" selected by each consortium for consortium-specific purposes, and (V) other useful content, including SNPs that tag common copy number polymorphisms, SNPs in the human leukocyte antigen region, SNPs marking the X and Y chromosomes and mitochondrial DNA, and SNPs for sample fingerprinting.
The study was designed as a quantitative trait approach with VO2max as a continuous variable, as this provides the best statistical power. The genotyping raw data were subjected to systematic quality control using the statistical software PLINK. 12 Individuals with a low genotype call rate (<90%) were excluded. SNPs with a genotype call rate <95% or a minor allele frequency <1% were also excluded. Furthermore, SNPs that clearly deviated from the expected Hardy-Weinberg equilibrium were excluded (p < 10^-7). Individuals who showed gender discrepancies based on the heterozygosity rate from chromosome X were also excluded. After pre-processing the raw data, 123,545 SNPs were left for association analyses.
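The SNP-level call-rate and minor-allele-frequency filters described above can be illustrated with a small sketch. The study performed these steps in PLINK; the pure-Python functions below are hypothetical and only show the logic of the two thresholds. Genotypes are assumed to be coded as the number of copies of one allele (0, 1 or 2), with None for a missing call, and the Hardy-Weinberg and sex-check filters are not reproduced here.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call at this SNP."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among called genotypes (0/1/2 coding)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    allele_freq = sum(called) / (2 * len(called))
    return min(allele_freq, 1.0 - allele_freq)


def snp_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Keep a SNP only if call rate >= 95% and minor allele frequency >= 1%."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical SNP genotyped in 20 samples, one call missing.
snp = [0] * 10 + [1] * 6 + [2] * 3 + [None]
print(snp_passes_qc(snp))  # True
```

A monomorphic SNP (all genotypes 0) fails the frequency filter, and a SNP missing in 2 of 20 samples (call rate 90%) fails the call-rate filter.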

Genotyping of validation cohort
Candidate SNPs from the exploration cohort, as well as 7 wild-card SNPs not included on the Cardio-Metabochip, were genotyped using the Agena Bioscience MassARRAY® platform (Agena Bioscience, San Diego, CA, US). SNP multiplexes were designed using Assay Design Suite v.1.0 software (Agena Bioscience). Genotyping was performed according to the manufacturer's protocol using the iPLEX Gold assay (Agena Bioscience) and analyzed using the MassARRAY Analyzer 4 platform. Mass signals for the different alleles were captured with high accuracy by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Genotype clustering and individual sample genotype calls were generated using Sequenom TyperAnalyzer v.4.0 software (Agena Bioscience).

In silico analysis of transcription starting sites
To determine whether validated SNPs were located in transcription factor binding sites, we performed in silico analysis of predicted transcription factor binding sites using the software PROMO. 17

Genotype-Tissue Expression (GTEx) database
To explore the relationship between validated SNPs and gene expression in different human tissues/organs, we used the GTEx database. The database includes ~900 post-mortem donors and opens the possibility for studying the effects of genetic variation in multiple human reference tissues. 18

BXD mouse database
The BXD database is an open-access web service for systems genetics (www.genenetwork.org) to explore the genetic control of multiple phenotypes. 19 The database includes >2000 phenotypes across a large panel of isogenic but diverse strains of mice (BXD type). The database contains phenotypes such as heart rate, oxygen consumption and blood parameters, such as hematocrit and iron levels, that are highly relevant for exploring the functional importance of VO2max-related genes. We tested potential correlations between expression levels of genes under the influence of VO2max-SNPs and relevant phenotypes in mice on both chow diet and high-fat diet, independently.

Statistical analyses
The associations between the final 123,545 SNPs and VO2max were analyzed by linear regression using PLINK. The main covariates for the VO2max phenotype were gender, age (years) and physical activity level (Kurtze score). The cut-off for significance was set to p < 5.0 × 10^-4 in the exploration cohort, as findings reaching the traditional genome-wide significance were considered unlikely given the low number of available cases and the fact that VO2max is a complex trait. To compensate for the use of a moderately stringent p-value, validation of the findings in a separate cohort was necessary. In the validation cohort, associations between VO2max and candidate SNPs were tested using the same statistical analyses as in the exploration cohort. A nominal p-value was considered significant (p < 0.05). A genetic score was created using a combination of 9 SNPs associated with VO2max. Each participant was scored according to the sum of high-VO2max genotypes carried. The differences in VO2max between participants with increasing numbers of favorable genotypes were calculated by one-way ANOVA using the LSD post hoc test.
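The genetic score described above is an unweighted count of favorable genotypes. A minimal sketch follows; the function and variable names are hypothetical, and only two of the nine SNPs are shown. The favorable genotype GG for rs3757354 is stated in the Results; treating GG and GT as favorable for rs3803357 is an assumption inferred from the lower VO2max reported for TT homozygotes. Missing genotypes are assumed to contribute zero.

```python
# Favorable (high-VO2max) genotypes per SNP. Only two of the nine SNPs are
# listed; the full mapping would come from Table 2 (not reproduced here).
FAVORABLE = {
    "rs3757354": {"GG"},        # stated in the text as the high-VO2max genotype
    "rs3803357": {"GG", "GT"},  # assumption: TT is the low-VO2max genotype
}


def normalize(genotype):
    """Sort alleles so that 'TG' and 'GT' are treated as the same genotype."""
    return "".join(sorted(genotype))


def genetic_score(genotypes, favorable=FAVORABLE):
    """Count SNPs at which the participant carries a favorable genotype.
    SNPs without a genotype call score 0 (an assumption of this sketch)."""
    score = 0
    for snp, good in favorable.items():
        call = genotypes.get(snp)
        if call is not None and normalize(call) in good:
            score += 1
    return score


participant = {"rs3757354": "GG", "rs3803357": "TG"}
print(genetic_score(participant))  # 2
```

Because each SNP contributes equally, the score ignores differences in effect size between loci; a weighted score would require per-SNP effect estimates.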

Results
Characteristics of the participants in the exploration cohort (HUNT3 Fitness Study) and the validation cohort (Generation 100 Study) are shown in Table 1.
After filtration of genotyping data, 123,545 SNPs were tested for their association with VO2max. Forty-one SNPs were significantly associated with VO2max in the exploration cohort after adjusting for age, gender and physical activity level (p < 5.0 × 10^-4). Locus zoom plots can be found in Supplementary Fig. 1. The candidate SNPs were subsequently genotyped in a validation cohort, in addition to 7 wild-card SNPs not included on the chip used for the exploration cohort. The associations between VO2max and six novel SNPs were replicated in the validation cohort (p < 0.05, Table 2). The SNP in the promoter region of the Myosin Regulatory Light Chain Interacting Protein (MYLIP) (rs3757354) did not pass the significance threshold in the validation cohort; however, sub-analyses for each gender showed a highly significant association in women, and the SNP was therefore included in Table 2. Three of the 7 wild-card SNPs, in the genes beta-3 adrenergic receptor (ADRB3), alpha-actinin-3 (ACTN3) and endothelin 1 (EDN1), were associated with VO2max in men, women or both genders (p < 0.05, Table 2). Candidate SNPs that failed to be replicated in the validation cohort can be found in Supplementary Table 1.
Considering that VO2max is a complex trait influenced by multiple genetic factors, 20 we assessed whether a cumulative effect existed between the number of favorable genotypes and VO2max. By using a combination of the 9 SNPs from Table 2, and scoring the high VO2max-associated genotypes 1 and low VO2max genotypes 0, we calculated a genetic score for each participant estimating inborn VO2max. In the validation cohort, VO2max ranged from 63 ml/kg^0.75/min to 98 ml/kg^0.75/min for participants scoring 1 or 7, respectively (Fig. 1A). This corresponded to unscaled VO2max values ranging from 22.3 ml/kg/min to 32.7 ml/kg/min for participants scoring 1 or 7, respectively. To illustrate that the power of this allele combination was independent of physical activity levels, we split the participants into two subgroups: participants below (inactive) and above (active) the median physical activity level. Interestingly, the proposed score appears to be robust even with the reduced statistical power of this sub-analysis (Fig. 1B).
Using the same SNPs as a basis, we also found a cumulative effect between the number of favorable SNPs and the decline in several CVD risk factors, e.g. waist circumference, visceral fat, fat %, cholesterol and BMI (Fig. 2). In addition, among the participants with 1-4 favorable SNPs, 36% were on treatment for hypertension, compared to 23% of those with >4 favorable SNPs (p < 0.05). Among participants reporting little or no physical activity (Kurtze score < 3.75, n = 235), those with 1-4 favorable SNPs had higher fat percentage (+2%), visceral fat (+9%), total cholesterol (+5%) and LDL-cholesterol (+6%) compared to those with >4 favorable SNPs (p < 0.05).
To explore and predict physiological consequences of the VO2max-SNPs, we used in silico tools and genotype-phenotype databases. The non-synonymous SNP rs3803357, located in the first exon of the Bromo adjacent homology domain containing 1 (BAHD1) gene, causes a shift from the amino acid glycine to lysine. The group of participants homozygous for the rs3803357 minor allele (TT) (24%) had a 3 ml/kg^0.75/min lower VO2max than the group carrying the heterozygous genotype (GT) (50%) or the common allele homozygotes (GG) (26%) (Fig. 3A). In the validation cohort, the group of participants homozygous for the rs3803357 minor allele (TT) (24%) had 4% and 7% lower VO2max compared to those harboring the (GT) and (GG) variants, respectively (Fig. 3B). SNPs located outside the promoter region or within introns and exons may influence transcription of proximal genes. Using the Genotype-Tissue Expression (GTEx) database, rs3803357 was found to be associated with differential expression of BAHD1 in the left ventricle (p = 9.0 × 10^-9) (Fig. 3C). By using the BXD mouse population, we found significant negative correlations between cardiac expression of Bahd1 and basal VO2 (in an untrained state), as well as with myocardial mass (Fig. 3D).
Another SNP that was found to be associated with VO2max in women, rs3757354, is located within the 2 kb upstream region of MYLIP. Women homozygous for the rs3757354 common allele (GG) (56%) had a 3 ml/kg^0.75/min higher VO2max than the group carrying the heterozygous genotype (AG) (37%) or the minor allele homozygotes (AA) (7%) (Fig. 4A). To determine whether rs3757354 could interfere with transcription factor binding, we performed in silico analysis to discover possible transcription factor binding sites. The analysis predicted that having the A allele at rs3757354 creates a perfect binding site for the estrogen receptor alpha (ER-α) targeting the sequence TGACC, whereas having the G allele at rs3757354 is likely to disable the binding of ER-α, potentially reducing estrogen-induced expression of MYLIP (Fig. 4B). Using the GTEx database, we found that rs3757354 was associated with differential expression of MYLIP in adipose tissue, skeletal muscle and the heart (p < 0.05). By using the BXD mouse population, we found significant negative correlations between cardiac expression of Mylip and heart mass (Fig. 4C). Participants harboring the high-VO2max genotype (GG) had significantly lower waist circumference, BMI, visceral fat, fat percentage and CRP levels, as well as significantly higher HDL-cholesterol, compared to the low-VO2max genotypes (AA) and (AG) (Fig. 4D). Furthermore, among those with the low-VO2max genotypes (AA) and (AG), significantly more of the participants were on treatment for hypercholesterolemia (10%) compared to those with the high-VO2max genotype (GG) (3%) (Fig. 4E).

Discussion
Here we report the first large-scale screening for genetic variants associated with VO2max. So far, the lack of large studies directly measuring VO2max has limited the possibilities for large genetic association studies for this phenotype. In the present study, we validated 6 new SNPs associated with VO2max, and replicated associations with 3 SNPs previously associated with fitness-related traits. 10,21,22 Based on these nine SNPs we proposed a genetic score reflecting inborn VO2max. The mean difference in VO2max between those with 1 favorable SNP and those with 7 favorable SNPs was 10.4 ml/kg/min, which is equal to a difference of 3 metabolic equivalents (METs) (as 1 MET ≈ 3.5 ml/kg/min). In other prospective studies, it has been suggested that a decrease of 1 MET is associated with increased risk of diabetes, hypertension and the metabolic syndrome, 23-25 whereas a corresponding increase has been associated with lower risk of all-cause and CVD mortality. 2,26 Interestingly, the number of favorable VO2max-SNPs carried correlated negatively with several CVD risk factors, such as waist circumference, BMI, visceral fat, fat percentage, total cholesterol and LDL-cholesterol. Furthermore, among the participants with 1-4 favorable SNPs, significantly more participants were on treatment for hypertension, compared to those with 5 or more favorable SNPs (p < 0.05). In addition, sedentary participants with 1-4 favorable SNPs had higher fat percentage, more visceral fat, higher total cholesterol and higher LDL-cholesterol compared with those with 5 or more favorable SNPs. This indicates that inborn high VO2max is associated with decreased CVD risk.
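The MET conversion quoted above can be checked with a few lines of arithmetic. This is only a worked illustration using the unscaled VO2max values reported in the Results (22.3 and 32.7 ml/kg/min) and the approximation 1 MET ≈ 3.5 ml/kg/min; the function name is hypothetical.

```python
ML_PER_KG_MIN_PER_MET = 3.5  # conventional approximation used in the text


def vo2_to_mets(vo2_ml_kg_min):
    """Convert an oxygen uptake (or a difference in uptake) to METs."""
    return vo2_ml_kg_min / ML_PER_KG_MIN_PER_MET


# Difference between participants scoring 7 and 1 on the genetic score.
difference = 32.7 - 22.3
mets = vo2_to_mets(difference)
print(f"{difference:.1f} ml/kg/min is about {mets:.1f} METs")
# prints "10.4 ml/kg/min is about 3.0 METs"
```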
Since VO2max is a strong predictor of cardiovascular health, 2,3,24,27 SNPs associated with VO2max may provide a physiological explanation for the link between VO2max and CVD. In the present study, a significant association was found between VO2max and a missense mutation in an exon of BAHD1 (rs3803357), which results in different amino acids being encoded depending on genotype. According to data from the GTEx database, rs3803357 is associated with differences in BAHD1 gene expression in adipose tissue, skeletal muscle and left ventricle, but also with differential expression of proximal genes in different tissues. Furthermore, the BXD database indicates that cardiac Bahd1 levels correlate with basal VO2 and heart mass in mice. A previous study has shown that BAHD1 acts as a transcription repressor that, among other things, is involved in epigenetic repression of different cardiac growth factors. 28 BAHD1 is known to repress insulin-like growth factor 2 (IGF2) expression by binding to its promoter and recruiting heterochromatin proteins. 28 Interestingly, we have previously shown that Igf2 is one of the most significantly upregulated genes in the left ventricle of rats with inherited high VO2max. 29 In addition to the links to cardiac phenotype, we also found trends toward lower fat percentage and total cholesterol levels in participants with the high-VO2max genotypes of BAHD1. Further studies are needed to explore the links between these genomic loci and VO2max. Another interesting SNP found to be associated with VO2max was located in an intron of the vasoactive intestinal peptide receptor 2 (VIPR2). VIPR2 encodes a neuropeptide receptor that is expressed in the heart and the coronary arteries. 30 In the heart, VIPR2 regulates cardiomyocyte contractility in response to binding of vasoactive intestinal peptide (VIP). 31 The release of VIP also increases coronary artery vasodilatation. 30
Interestingly, several studies have shown that physical activity induces the release of VIP; hence, VIPR2 is likely to be important for cardiovascular adaptations during exercise. 32,33 Furthermore, in rats and humans with cardiomyopathy, the levels of VIPR2 are reduced both in heart and serum, suggesting a link also between VIPR2 and CVD. 30,34 One of the validated SNPs (rs3757354) was located in the promoter region of MYLIP, potentially interfering with transcription factor binding sites. SNPs in promoter regions may cause loss of transcription factor binding sites or formation of novel binding sites, which may influence how the gene is transcribed upon different stimuli. 35 The SNP in the promoter region of MYLIP was significantly associated with VO2max in women in the validation cohort, but not in men, indicating that this genotype influences VO2max in a gender-specific manner. Interestingly, in silico analysis using the software PROMO indicated that rs3757354 is located in the transcription factor binding site of the estrogen receptor alpha (ER-α). In fact, having the G allele at that locus is predicted to disable the binding of ER-α, thus abolishing estrogen-induced expression of MYLIP. For participants carrying the high-VO2max genotype GG at this locus, the in silico analysis predicted that ER-α is unable to bind and induce expression of MYLIP. In contrast, participants with the low-VO2max genotype AA are predicted to harbor intact binding sites for ER-α targeting the sequence TGACC. As ER-α is activated by estrogen, this may explain why this SNP is only important for VO2max in women. This was further supported by evidence from the GTEx database, showing that rs3757354 was associated with differential expression of MYLIP in adipose tissue, skeletal muscle and the heart (p < 0.05). Interestingly, using the BXD mouse database, we found significant negative correlations between cardiac expression of Mylip and myocardial mass.
Furthermore, a previous transcriptome characterization of estrogen-treated human myocardium identified MYLIP as a sex-specific element influencing contractile function, more specifically showing a negative correlation between cardiac expression of Mylip and contractile function. 36 In line with our data, several other studies have reported gender-specific associations with MYLIP genotypes. 36,37 For instance, Yan et al. report that G allele-carrying women from the Bai Ku Yao population had higher levels of HDL-cholesterol than the non-carriers. Furthermore, G allele-carrying women from the Han population had decreased levels of total cholesterol and apolipoprotein A1 (ApoA1) compared to non-carriers. None of these associations were seen in men. 37 In our study, rs3757354 was also found to be significantly associated with HDL-cholesterol, and with several other CVD risk factors such as waist circumference, BMI, visceral fat, fat percentage and high-sensitivity CRP levels. Furthermore, G-allele homozygotes were less likely to be on cholesterol treatment, suggesting that these women are less prone to hypercholesterolemia. Studies in mice show that increased liver expression of Mylip promotes degradation of the LDL receptor (LDLR) and thereby increases circulating LDL-cholesterol. 38 Induction of MYLIP expression by the liver X receptor (LXR) transcription factors is important for cholesterol homeostasis. 38 Upon stimulation by LXRs or LXR agonists, MYLIP degrades the LDLR, apolipoprotein E receptor 2 (ApoER2) and the very low-density lipoprotein receptor (VLDLR), thereby raising circulating LDL-cholesterol. Furthermore, cells lacking Mylip exhibit markedly elevated levels of LDLR and increased rates of LDL uptake. 39 Overall, the literature provides compelling evidence suggesting important physiological consequences of MYLIP genetic variation.
Based on the gender-specific associations of rs3757354, and the previously reported associations with longevity and CVD, 37,40,41 these findings may shed new light on the gender differences in CVD and the influence of sex-specific hormones. 42 Furthermore, the location of rs3757354 in a potential transcription factor binding site that is under the control of estrogen supports this hypothesis.
The potassium voltage-gated channel subfamily Q, member 1 (KCNQ1) is a well-characterized gene involved in potassium handling in cardiomyocytes. The SNP rs2074238, located in an intron of KCNQ1, was associated with VO2max in both the exploration and the validation cohort. In a previous meta-analysis, the minor allele T of this particular SNP was associated with a shortening of the QT interval, a measure of myocardial repolarization time. 43 Prolongation of the QT interval duration is a risk factor for drug-induced arrhythmias and sudden cardiac death. 44 Other studies have also reported that this particular SNP influences the QT interval in healthy Europeans. 45,46 In our study, participants harboring the genotype previously associated with prolonged QT interval had a significantly higher VO2max compared to the other genotypes. This may shed new light on the U-shaped association between risk of arrhythmias and VO2max. 47 However, only mechanistic studies will be able to identify the true functional consequences of rs2074238.
Because of the large differences in human physiology between men and women, and because gender is a major determinant of VO2max, 48 it is likely that some genetic variants have stronger effects in one gender than in the other. In our study, the lack of similar dependency among men and women for several of the reported SNPs indicates that they may influence VO2max in a gender-specific manner. A previous study suggests that androgenic hormones are likely to make a significant contribution to VO2max in men; hence, the relative effect of the VO2max-related SNPs may be lower in men than in women. 49 Since our approach only covers a part of the genome, we do not have sufficient evidence to fully evaluate the genetic contribution to VO2max in men compared to women. As DNA-sequencing technology becomes more accessible, future studies will hopefully be able to explain gender differences with greater confidence.

Limitations
There are some limitations related to this study not discussed previously. First, the age distribution of the validation cohort is different from that of the exploration cohort; hence, we may have failed to validate some of the SNPs from the exploration cohort because their importance differs between stages of life. Next, as this study only includes individuals of Caucasian descent, the results are not necessarily valid for other ethnicities, and would have to be validated in other cohorts. Furthermore, estimation of physical activity level is an important source of bias, as this parameter is included as a covariate in the genetic association analyses. Nevertheless, as regular physical activity has a large influence on VO2max, this was considered a necessary covariate despite the use of self-reported data.

Conclusion
This is the first large genetic association study on directly measured VO2max. We discovered and validated new genetic loci associated with VO2max and explored their physiological importance using genotype-phenotype databases and in silico tools. We proposed a genetic signature of inborn VO2max consisting of 9 SNPs that could distinguish high- vs. low-fitness individuals based on simultaneous carriage of multiple favorable alleles. Interestingly, the number of favorable SNPs correlated negatively with the presence of several CVD risk factors. Future studies combining several large cohorts with directly measured VO2max are needed to identify more SNPs associated with this complex phenotype.

Statement of conflict of interest
None of the authors have any conflicts of interest with regard to this publication.