Introduction

Blood pressure (BP) is a classical heritable quantitative trait in humans.1 When elevated and the primary cause is unknown, it is termed essential hypertension (EH) and is a leading susceptibility factor for stroke, end-stage renal disease, and coronary artery disease. Hypertension affects 33.6% of Americans (2004 prevalence estimate)2 and a similar proportion of the entire world population. Worldwide, it is the third leading risk factor for all morbidity and mortality explaining 4.4% of the global burden of disease.3 Consequently, EH entails huge morbidity, mortality, and cost when untreated. Despite progress on a number of fronts, the pathophysiology of hypertension remains largely obscure.

Ever since the classic work of Pickering and co-workers,1 hypertension has been considered a multifactorial trait. It is commonly assumed that EH is in substantial part due to the effects of many genes and the interaction of these genes with the environment, each of these factors imparting a small contribution to risk. The elucidation of the molecular details of rare monogenic forms of hypertension by Lifton et al.4 has described pathways for BP regulation, but the genes identified have not been shown to explain a substantial fraction of EH risk in the general population. Genetic studies of EH have identified numerous genetic variants in one or several populations; however, few variants can be consistently detected in multiple studies. The Family Blood Pressure Program (FBPP) is a National Heart, Lung, and Blood Institute (NHLBI)-funded multicenter study with coordinated clinical protocols and a pooled data resource that is publicly accessible.5 The four FBPP networks have recruited a total of 17 129 individuals including four major United States ethnic groups: African Americans (AA), European Americans (EA), Hispanic Americans (HA), and Asian Americans.

Recent developments in genotyping technology permit association and family-based studies using hundreds of thousands of markers in thousands of individuals. This is an exciting opportunity to assess a major part of the human genome in an unbiased fashion for genes associated with EH. The Wellcome Trust Case Control Consortium (WTCCC) has published the largest genome-wide association study on hypertension to date, with 2000 cases of EH and 3000 controls.6 Although no single SNP examined reached genome-wide significance (P<5 × 10−7), the six variants most significantly (P<10−5) associated with EH constitute a priority list for follow-up in other well-characterized hypertension samples. We have analyzed these six SNPs in 11 433 individuals from the FBPP, using systolic blood pressure (SBP), diastolic blood pressure (DBP), and phenotypes derived from them as quantitative phenotypes, as well as hypertension status.

Methods

FBPP and phenotyping

Three of four FBPP networks participated in this replication study: GenNet, GENOA (Genetic Epidemiology of Arteriosclerosis Network), and HyperGEN (Hypertension Genetic Epidemiology Network). The respective institutional review boards approved all studies and all individuals included gave written informed consent. Information on population characteristics is listed in Table 1. Detailed information on inclusion procedures and participant characteristics is published elsewhere.5 BP measurements were carried out according to standard operating procedures in a sitting position after a resting period.5 For GenNet, the average of two manual BP measurements was used. For GENOA and HyperGEN, the average of up to three Dinamap measurements was used where available, and the average of up to two Omron measurements was used if no Dinamap measurement was recorded. For the current analysis, raw SBP and DBP values were used, without correction for treatment. Analysis of treatment-corrected SBP and DBP values (adding 10 mm Hg to SBP and 5 mm Hg to DBP if treatment was present) gave very similar results. Correction by addition has been shown to perform well in simulation studies, but all approaches to this problem are necessarily approximations.7 Hypertension status was defined as SBP≥140 mm Hg and/or DBP≥90 mm Hg and/or antihypertensive treatment.
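The phenotype definitions above can be illustrated with a minimal Python sketch. The record layout (`Reading`) and the function names are hypothetical and are not taken from the study's analysis code; only the thresholds (140/90 mm Hg) and the treatment offsets (10 and 5 mm Hg) come from the text.

```python
from dataclasses import dataclass
from typing import Tuple

# Offsets quoted in the text for participants on treatment (mm Hg).
SBP_TREATMENT_OFFSET = 10.0
DBP_TREATMENT_OFFSET = 5.0


@dataclass
class Reading:
    """One participant's averaged blood pressure (hypothetical record layout)."""
    sbp: float      # systolic BP, mm Hg
    dbp: float      # diastolic BP, mm Hg
    treated: bool   # currently on antihypertensive treatment


def treatment_corrected(r: Reading) -> Tuple[float, float]:
    """Return (SBP, DBP) with the fixed offsets added when treatment is present."""
    if r.treated:
        return r.sbp + SBP_TREATMENT_OFFSET, r.dbp + DBP_TREATMENT_OFFSET
    return r.sbp, r.dbp


def is_hypertensive(r: Reading) -> bool:
    """SBP >= 140 and/or DBP >= 90 and/or antihypertensive treatment (raw values)."""
    return r.sbp >= 140 or r.dbp >= 90 or r.treated
```

Note that under this definition a treated participant with normal measured BP still counts as hypertensive, whereas the additive correction only shifts the quantitative values.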

Table 1 Basic characteristics of study participants

Genotyping

We used the Taqman technology with fluorescent probes for genotyping (Applied Biosystems, Foster City, CA, USA). Genotypes were called manually using the manufacturer's software. Each network performed genotyping separately. Extensive DNA quality control has been carried out previously on all samples. GenNet typed 1% repeat samples and 1% HapMap samples with known genotypes for the six SNPs. HyperGEN typed 7% HapMap samples. GENOA typed 3.5% repeats. No discrepancy between repeats or between the HapMap sample genotype and the reference was detected, except for the GENOA repeats for rs2398162, which showed a concordance rate of 99.4%.

Power calculations

To estimate power in our study we used Purcell's genetic power calculator8 (http://pngu.mgh.harvard.edu/~purcell/gpc/), using a case–control sampling for threshold-selected quantitative traits. The following parameters were used: type I error=5%; percent variance explained=0.5% or 1%, additive genetic effects model; number of cases=2500 for AA and EA, 1250 for HA; cases assumed to be ascertained from >1 SD of the phenotype distribution and controls from the entire population; control:case ratio=1.

Analysis methods

Data file preparation and descriptive statistics were carried out using custom code written in R version 2.6.0 (The R Foundation for Statistical Computing) and Perl scripts. For pedigree verification and Hardy–Weinberg tests, the Pedstats software package was used.9 For regression of the phenotype on age, sex, and BMI, the residuals of the following model were used for further analysis: y=α+β1 × age(years)+β2 × sex(1 for male subject, 2 for female subject)+β3 × BMI(kg/m2)+ɛ. For regression-based tests of association, the Merlin software package (assoc option)10 and the XTGEE procedure (GEE in the following) of STATA (StataCorp LP, College Station, TX, USA) were used for continuous phenotypes, and the Lamp software package11 was used for dichotomous phenotypes. Merlin assumes an additive model and Lamp was set to an additive model. For TDT-based tests, the FBAT/PBAT software package (hbat–p option)12 and the QTDT software package13 were used. P-values were not adjusted for multiple testing. The transcription factor binding site prediction was carried out using the AliBaba 2.1 software package with the TRANSFAC database (Biological Databases, Wolfenbüttel, Germany).
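The covariate adjustment can be sketched as follows. This is an illustrative ordinary least-squares fit in plain Python, not the R code used in the study, and it ignores family structure; the function names are hypothetical. It fits y=α+β1 × age+β2 × sex+β3 × BMI+ɛ and returns the residuals that would be carried forward as the adjusted phenotype.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear covariates?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def covariate_residuals(
    y: Sequence[float],
    age: Sequence[float],
    sex: Sequence[int],      # coded 1 = male, 2 = female, as in the text
    bmi: Sequence[float],
) -> List[float]:
    """Residuals of y = alpha + b1*age + b2*sex + b3*BMI + e, fitted by OLS."""
    rows = [[1.0, float(a), float(s), float(b)] for a, s, b in zip(age, sex, bmi)]
    k = 4  # intercept plus three covariates
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(bj * xj for bj, xj in zip(beta, r)) for r, yi in zip(rows, y)]
```

Because the model includes an intercept, the residuals sum to zero; a phenotype that is an exact linear function of the three covariates yields residuals of zero.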

Results

The FBPP data set

The combined FBPP data (three networks: GenNet, GENOA, and HyperGEN) comprise a large data set with phenotype data available on 12 593 individuals. It is composed of 39% AA, 40% EA, and 21% HA participants (Table 1). Genotyping was carried out on the 11 433 participants (91%) for whom DNA was available. The most salient differences among the three networks are the younger age of participants in GenNet, their lower BMI, family ascertainment based on elevated BP rather than hypertension, and a lower frequency of antihypertensive treatment (for details see Online Data Supplement).

Family-based association tests

To maximize power, we analyzed networks jointly, but stratified by ethnicity. SBP and DBP were used as quantitative traits and residuals after regression on age, sex, and BMI were also tested to control for the most important confounders. We used the regression-based Merlin software (see ‘Methods’) for the primary analysis on quantitative traits. We also tested one other regression-based analysis method (GEE; see ‘Methods’) and two TDT-based tests (FBAT/PBAT and QTDT). P-values calculated by Merlin were found to be closely correlated with the P-values calculated with GEE (r2=0.91, 286 data points); P-values from the FBAT/PBAT analyses were found to be weakly correlated with the P-values calculated by QTDT (r2=0.27, 140 data points) and uncorrelated with the regression-based tests. Given the close correlation between the regression-based tests and the greater power of these methods (all samples, not only the informative ones, are considered), we used Merlin for primary testing. To estimate power we used the genetic power calculator of Purcell et al.8 Power was close to 0.7 for AA and EA and close to 0.4 for HA, assuming a total QTL effect of 1% (rs2820037 and rs1937506) (see ‘Methods’ and Online Data Supplement for parameters and details).

Table 2 depicts the results of the analysis by Merlin for two systolic, two diastolic phenotypes, and hypertension status. In addition to SBP and DBP, their residuals after regression on age, sex, and BMI were analyzed to investigate effects due to these confounders. The tests carried out are not two independent tests because the parent variable and the residuals remain highly correlated (see phenotype r2 in Table 2). Additionally, SBP is correlated with DBP (r2=0.45, 0.38, and 0.42 for AA, EA, and HA, respectively). We obtained significant results with one SNP (rs1937506) for more than one phenotype tested. This significance reproduces for both systolic phenotypes in EA and HA (Table 2). The effect size due to this variant is large for systolic phenotypes in both EA and HA, but the effect is in opposite directions. The effect sizes and their direction are depicted in Figure 1 for the analysis of SBP residuals. Each additional G allele of rs1937506 is associated with a 24.9 mm Hg decrease of SBP in EA and with a 27.7 mm Hg increase in HA. Considering DBP residuals, each additional allele is associated with an 8.7 mm Hg decrease of DBP in EA, whereas the results are nonsignificant for HA. There is no significant result for this SNP in AA.
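To make the additive model concrete, the following Python sketch converts a genotype into a count of G alleles and multiplies it by the per-allele effect on SBP residuals quoted above. The function names are hypothetical, and writing the non-G allele as 'A' is a placeholder assumption rather than a statement about the actual alleles of rs1937506.

```python
# Per-allele effects on SBP residuals quoted in the text (mm Hg per G allele).
PER_ALLELE_EFFECT_MMHG = {"EA": -24.9, "HA": 27.7}


def g_allele_count(genotype: str) -> int:
    """Count copies of the G allele in a two-letter genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'AG'")
    return genotype.upper().count("G")


def expected_shift(genotype: str, group: str) -> float:
    """Additive model: expected shift = per-allele effect x number of G alleles."""
    return PER_ALLELE_EFFECT_MMHG[group] * g_allele_count(genotype)
```

Under this coding a heterozygote is shifted by one per-allele effect and a G homozygote by two, in opposite directions for EA and HA.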

Table 2 Analysis of gene effects by ethnicity (regression-based tests)
Figure 1

Effect sizes of each additional copy of the G allele in rs1937506 on age, sex, and BMI-regressed systolic blood pressure (SBP) in African-American (AA), European-American (EA), and Hispanic-American (HA) samples. Only the significant effect sizes are depicted. NS, nonsignificant.

We also investigated association with hypertension status using the Lamp software for discrete traits (see ‘Methods’). No significant result was obtained in any population studied (Table 2), although several variants show a trend.

Genomic position of rs1937506 and binding motif analysis

The SNP rs1937506 is located in a 500 kb gene desert on chromosome 13q21. The adjacent genes are a predicted olfactory receptor and the protocadherin 9 isoform 1 precursor genes. Neither gene has been reported to be involved in EH. Sequence analysis reveals an Oct-1 and an ICSBP binding site in close proximity to rs1937506.

Discussion

New genomic tools now enable geneticists to assess genetic variability as the basis of common disease in an unbiased fashion. The results of the WTCCC study on EH using 2000 cases and 3000 controls are the first such example in the field of hypertension. None of the 469 557 SNPs the WTCCC tested (post-genotyping quality control) was significantly associated with hypertension status after correction for multiple testing. Six SNPs with moderate evidence for association were identified (5 × 10−7<P<1 × 10−5) based on a Bayesian interpretation of detection thresholds. We attempted to replicate these six SNPs in the participants of the FBPP in this study.

The design of the WTCCC genome-wide association study on hypertension has been criticized by the WTCCC investigators and others.6 Although cases were recruited at the extremes of the BP distribution, were diagnosed before 60 years of age, and were mostly non-obese (BMI<30 kg/m2) – all of which are believed to increase power by focusing on more genetic forms – analysis was based on hypertension status as opposed to BP levels. A quantitative phenotype is generally thought to improve power considerably, but only as long as medication does not confound BP or can be adequately controlled for. The absence of genome-wide significant findings in the WTCCC study on hypertension might also be due to the following reasons: (1) variants associated with EH were assayed but the results are not significant because of small effect sizes or the experimental design (eg, controls are from the general population and will likely contain a substantial proportion of untreated hypertensives); (2) the variants associated with EH were not assayed because of insufficient density of markers or because the causal variants are of intermediate or rare frequency. Quantile–quantile plots of the SNPs in the WTCCC study show no data point differing significantly from the expected χ2-distribution, so that even the most significant variants could simply be chance findings. In addition, the true genome-wide significance level in the WTCCC study might be considerably lower than 5 × 10−7, which is derived from a Bayesian argument rather than a frequentist argument. If this is the case, the probability of the findings being false positives would be even greater.
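The multiple-testing argument can be made concrete with simple arithmetic. The Python sketch below uses a Bonferroni correction as one possible frequentist threshold; this choice, the 5% family-wise error rate, and the function names are assumptions made for illustration and are not taken from the WTCCC analysis. Only the SNP count (469 557) and the P<10−5 cutoff come from the text.

```python
N_SNPS = 469_557       # SNPs passing quality control in the WTCCC scan (from the text)
ALPHA = 0.05           # family-wise type I error rate (assumed for illustration)
SUGGESTIVE_P = 1e-5    # upper bound of the "moderate evidence" band (from the text)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def expected_null_hits(p_cutoff: float, n_tests: int) -> float:
    """Expected number of tests below p_cutoff if every test were null."""
    return p_cutoff * n_tests


if __name__ == "__main__":
    print(f"Bonferroni threshold: {bonferroni_threshold(ALPHA, N_SNPS):.3g}")
    print(f"Expected null hits at P<1e-5: {expected_null_hits(SUGGESTIVE_P, N_SNPS):.2f}")
```

Under these assumptions the Bonferroni threshold is roughly 1.1 × 10−7, below the 5 × 10−7 level used by the WTCCC, and about 4.7 SNPs would be expected to fall below P<10−5 by chance alone if no SNP were truly associated, compared with the six that were observed.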

In this replication study we identify one of the six SNPs as potentially associated with SBP and DBP in 4979 EA and 2646 HA from the FBPP (rs1937506). The variant is not significantly associated with hypertension status. This might appear surprising because the WTCCC study used hypertension status as primary phenotype, but several important differences might explain the result: (1) the WTCCC study focused on non-obese individuals and the FBPP has many obese participants (Table 1); this is in line with our observation that regression of SBP or DBP on BMI leads to lower P-values in EA for rs1937506. (2) Using a quantitative phenotype can increase power.

Considering SBP and DBP as phenotypes, the effect sizes are large but they are in opposite directions in EA and HA. Some less significant results are only observed in EA and not in HA, and this might be related to the lower power in HA given the smaller sample size. At this point it is unclear whether rs1937506 is the causal variant for the association observed, and it is also unclear whether ethnicity-specific effects in opposite directions exist. Molecular biology experiments will be needed to confirm or refute this hypothesis. These studies will also need to take into account the genomic position of this SNP in a gene desert. Other variants associated with disease have also been identified in gene deserts by genome-wide association studies.14