A genome-wide association study identifies risk loci for spirometric measures among smokers of European and African ancestry

Pulmonary function decline is a major contributor to morbidity and mortality among smokers. Post bronchodilator FEV1 and FEV1/FVC ratio are considered the standard assessment of airflow obstruction. We performed a genome-wide association study (GWAS) in 9919 current and former smokers in the COPDGene study (6659 non-Hispanic Whites [NHW] and 3260 African Americans [AA]) to identify associations with spirometric measures (post-bronchodilator FEV1 and FEV1/FVC). We also conducted meta-analysis of FEV1 and FEV1/FVC GWAS in the COPDGene, ECLIPSE, and GenKOLS cohorts (total n = 13,532). Among NHW in the COPDGene cohort, both measures of pulmonary function were significantly associated with SNPs at the 15q25 locus [containing CHRNA3/5, AGPHD1, IREB2, CHRNB4] (lowest p-value = 2.17 × 10−11), and FEV1/FVC was associated with a genomic region on chromosome 4 [upstream of HHIP] (lowest p-value = 5.94 × 10−10); both regions have been previously associated with COPD. For the meta-analysis, in addition to confirming associations to the regions near CHRNA3/5 and HHIP, genome-wide significant associations were identified for FEV1 on chromosome 1 [TGFB2] (p-value = 8.99 × 10−9), 9 [DBH] (p-value = 9.69 × 10−9) and 19 [CYP2A6/7] (p-value = 3.49 × 10−8) and for FEV1/FVC on chromosome 1 [TGFB2] (p-value = 8.99 × 10−9), 4 [FAM13A] (p-value = 3.88 × 10−12), 11 [MMP3/12] (p-value = 3.29 × 10−10) and 14 [RIN3] (p-value = 5.64 × 10−9). In a large genome-wide association study of lung function in smokers, we found genome-wide significant associations at several previously described loci with lung function or COPD. We additionally identified a novel genome-wide significant locus with FEV1 on chromosome 9 [DBH] in a meta-analysis of three study populations.


Background
In the United States, chronic obstructive pulmonary disease (COPD) is the third leading cause of death [1]. The major environmental risk factor for COPD is cigarette smoking, but only a minority of smokers will develop COPD during their lifetime [2,3]. COPD risk is most likely the cumulative result of genetic factors, environmental factors such as cigarette smoking, developmental factors, and gene-by-environment interactions [4].
A diagnosis of COPD is based on post bronchodilator spirometric measures of the forced expiratory volume in the first second (FEV 1 ) and the forced vital capacity (FVC), the total volume of air expired after a maximal inhalation [5]. The ratio of FEV 1 /FVC is a widely used measure of airflow obstruction [3]. Understanding the genetics underlying these spirometric measurements may help increase our knowledge of the genetics of COPD.
Initial genome wide analyses of spirometric measures of pulmonary function using family-based linkage analyses identified broad genomic regions on chromosomes 1, 2, 4, 8, and 18 [6]. Subsequent genome wide association studies in the Framingham cohort, a population based sample, identified the HHIP gene as a susceptibility locus for FEV 1 /FVC [7]. The Framingham cohort was combined with several other population based cohorts to form the CHARGE consortium of more than 20,000 individuals. This sample, along with the SpiroMeta consortium, another population based sample of over 20,000 individuals, provided the basis for a series of meta-analyses: one used CHARGE as a discovery population with subsequent replication in SpiroMeta [8]; a second used SpiroMeta as the discovery population with selected genotyping in an additional 32,000 individuals and a pooled meta-analysis with the CHARGE consortium [9]; and a third combined CHARGE and SpiroMeta in the discovery phase (n = 48,201) with SNP replication in an additional combined population based sample of 46,411 individuals [10]. The first two of these meta-analyses confirmed HHIP as a susceptibility locus for FEV 1 /FVC and identified multiple additional loci that were significantly associated with spirometric measures of pulmonary function. The third meta-analysis identified 16 new loci for pre bronchodilator pulmonary function in addition to 10 previously reported loci [10].
To examine the role of smoking on the genetic susceptibility to spirometric measures of pulmonary function, the CHARGE/SpiroMeta samples with additional European ancestry samples totalling more than 30,000 individuals were stratified by smoking status (ever versus never smokers) [11]. Among smokers, a novel signal on chromosome 15q25 (CHRNA5/A3/B4) was identified for airflow obstruction, defined as pre bronchodilator FEV 1 and FEV 1 /FVC below the lower limit of normal. This is the major genomic region for nicotine dependence, smoking exposure, and related traits [12]. More recently, a genome wide study of pulmonary function identified the CYP2U1 gene [involved in nicotine metabolism]; however, this was not replicated in the CHARGE/SpiroMeta sample [13]. Given that smoking is the major environmental determinant of pulmonary function decline, we performed a GWAS of the full quantitative range of post bronchodilator pulmonary function in 9919 current and former smokers of the COPDGene study with complete data and genome wide genotyping. We hypothesized that we would identify novel genetic loci and replicate known genomic regions affecting pulmonary function by performing a GWAS of post bronchodilator spirometric measures in the COPDGene study, a multi-center observational study designed to identify genetic factors associated with risk of COPD. In addition, to ensure the GWAS results are generalizable beyond a single study, we performed a meta-analysis of post bronchodilator FEV 1 and FEV 1 /FVC ratio over three similar studies: the COPDGene, ECLIPSE, and GenKOLS studies. Characteristics of the 3 studies (COPDGene, ECLIPSE, and GenKOLS) are given in Table 1. We also assessed whether different genomic regions were associated with spirometric measures of pulmonary function separately among non-Hispanic white (NHW) and African-American (AA) COPDGene subjects.

COPDGene GWAS in Non-Hispanic Whites
For FEV 1 among all NHW COPDGene participants, several SNPs at the 15q25 locus [near CHRNA3, CHRNA5, CHRNB4, AGPHD1, and IREB2] reached genome-wide significance. For FEV 1 /FVC among all NHW COPDGene subjects, several SNPs in the same region on chromosome 15 reached genome-wide significance. Tables 2 and 3 show p-values for the most significant SNPs in these regions. Additional file 1: Tables S5-S6 list all SNPs with a p-value less than 5 × 10 −6 .

COPDGene GWAS in African Americans
For both FEV 1 and FEV 1 /FVC among all AA COPDGene subjects, there were a few SNPs that were genome-wide significant, but these SNPs were all imputed, with low minor allele frequency (<5 %), and in a region with no other non-imputed SNPs. Therefore, we are not confident that these signals are valid associations. The top SNPs for these analyses with p-values less than 5 × 10 −6 can be found in the Additional file 1: Tables S1 and S2.

Results from the Meta-Analysis: COPDGene participants combined with the ECLIPSE and GenKOLS cohorts
For FEV 1 and FEV 1 /FVC, most of the genome-wide significant results were again at the 15q25 locus, on chromosome 4 in FAM13A, and on chromosome 4 near HHIP.

Case-only analyses among NHW in COPDGene, AA in COPDGene, and within the meta-analysis
In addition, we examined the genetic susceptibility to variation in these pulmonary function phenotypes in COPD cases only (GOLD stages 2-4) (2820 NHW in COPDGene, 821 AA in COPDGene, 1764 NHW in ECLIPSE, and 863 NHW in GenKOLS). While no SNPs reached genome-wide significance for either FEV 1 /FVC or FEV 1 in the case-only analyses, Additional file 1: Tables S3-S4, S7-S8, and S11-S12 show the top SNPs for these analyses. Some of the regions that met genome-wide significance in the entire study population had at least nominal evidence for association in COPD cases only. In addition to the well-established COPD genetic loci near HHIP, FAM13A, and CHRNA3/5, the MMP3/MMP12 and DBH regions had P < 0.001 evidence for association to lung function levels within COPD cases. Power may have been limited in the case-only analysis, but it is also possible that genetic determinants are more important for the presence/absence of COPD than for the severity of airflow obstruction within COPD cases.

Note that FEV 1 /FVC was measured on the proportion scale (0-1) and not the percentage scale (0-100). The "Coded Allele" column refers to the reference allele, where the first reference allele is for the COPDGene NHW cohort, the second reference allele is for the COPDGene AA cohort, and the third reference allele is for the meta-analysis of COPDGene NHW, COPDGene AA, ECLIPSE and GenKOLS.

Note that there is a similar but narrower and less significant signal in AA subjects for both spirometric measures of pulmonary function. While no single SNP reaches genome-wide significance in this region among the AA subjects, there is a region near CHRNA3 with several SNPs that have p-values in the range of 5 × 10 −6 . It appears that the results among AA are similar to those in NHW but may not have the power to reach genome-wide significance due to the smaller sample size. Figure 2 shows qualitatively different association results in the NHW (highly associated) and AA (not associated) subjects for the region on chromosome 4 near HHIP, although reduced power in the AA subjects could contribute to reduced evidence for association.

Comparison to previously published spirometry GWA studies
We considered genome-wide significant results from a previously published spirometry GWA analysis in general population samples, which combined the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and SpiroMeta studies [10]. Table 4 shows p-values of these loci from the CHARGE/SpiroMeta analyses in the COPDGene subpopulations analysed above. Except for HHIP and FAM13A, none of the regions achieved genome-wide significance (p-value < 5 × 10 −8 ) in the COPDGene cohort or in the meta-analysis of COPDGene, GenKOLS, and ECLIPSE. However, a number of SNPs had a signal in the same direction as CHARGE/SpiroMeta and met a nominal level of significance using Bonferroni correction (p-value < 0.0018 for the 28 regions tested) in the meta-analysis of these three study populations, including SNPs in or near AGER, MFAP2, RARB, GSTCD, NPNT, SPATA9, ADAM19, THSD4, and CFDP1.
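The replication criterion described above (same direction of effect, p-value below a Bonferroni-corrected threshold) can be sketched as follows. This is an illustrative sketch only: the family-wise alpha of 0.05 is an assumption (the text reports only the resulting threshold of 0.0018 for 28 regions), and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def replicates(p_value: float, beta: float, reference_beta: float,
               threshold: float) -> bool:
    """True when the effect has the same sign as in the reference study
    and the p-value falls below the corrected threshold."""
    same_direction = beta * reference_beta > 0
    return same_direction and p_value < threshold


# Assumed family-wise alpha of 0.05 across the 28 regions tested.
threshold = bonferroni_threshold(0.05, 28)
print(f"{threshold:.4f}")  # 0.0018
```

With an assumed alpha of 0.05, the corrected threshold 0.05/28 rounds to the 0.0018 quoted in the text.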

A comparison of post and pre bronchodilator FEV 1 and FEV 1 /FVC among NHW in the COPDGene cohort
Post bronchodilator pulmonary function makes it possible to separate individuals with reversible pulmonary function impairment, which is indicative of asthma, from those individuals whose pulmonary function is not reversible with a bronchodilator. Thus, measuring post bronchodilator spirometry provides a phenotype that is more homogeneous with respect to the nature of the pulmonary function impairment. We hypothesized that a GWA of post bronchodilator FEV 1 and FEV 1 /FVC would be similar to or more powerful than a GWA of pre bronchodilator FEV 1 and FEV 1 /FVC. To test this hypothesis, we performed a GWA of pre bronchodilator FEV 1 and FEV 1 /FVC in the COPDGene cohort among NHW. The correlation between pre and post bronchodilator FEV 1 is 0.95 and the correlation between pre and post bronchodilator FEV 1 /FVC is 0.98. Additional file 1: Tables S13 and S14 show the genome wide significant results for these analyses. The results for the GWA of pre bronchodilator FEV 1 are similar to those of post bronchodilator FEV 1 among NHW in the COPDGene cohort. [HHIP] are significantly associated with these measures. These comparisons suggest that previous GWAS [7-13] are not biased due to the inclusion of individuals with bronchodilator reversibility. There appears to be only a modest loss in signal in GWAS of pre bronchodilator FEV 1 compared to post bronchodilator FEV 1 , and no apparent difference in the signal between pre and post bronchodilator FEV 1 /FVC ratio.

Discussion
To the best of our knowledge, this is the first GWAS of post bronchodilator pulmonary function. These analyses were performed in a large cohort of current and former smokers with the full range of pulmonary function from normal values to severely impaired. We identified multiple loci that were genome wide significant for post bronchodilator FEV 1 and FEV 1 /FVC in both the COPDGene cohort and in the combined meta-analyses. The most significant association for both FEV 1 and FEV 1 /FVC among NHW in the COPDGene cohort and in the combined meta-analysis was on chromosome 15q25 [CHRNA3]. This region contains a cluster of nicotinic receptors that are associated with nicotine dependence, COPD case status, lower limit of normal for pre bronchodilator airway obstruction, lung cancer, and other smoking related traits [14][15][16][17][18][19][20][21][22][23]. A recent analysis by our group suggested this region may both directly and indirectly affect COPD affection status through nicotine dependence [24]. Other genes within this region in linkage disequilibrium also demonstrate significant associations with post bronchodilator FEV 1 and FEV 1 /FVC ratio, including CHRNA5/B4, IREB2, AGPHD1, and ADAMTS7.
Our results suggest common genetic susceptibility to post bronchodilator FEV 1 and FEV 1 /FVC, pre bronchodilator measures of pulmonary function, and COPD affection status. We confirmed previously reported GWA associations with HHIP, TGFB2, FAM13A, MMP12, MMP3, CYP2A7, CYP2A6, and RIN3 in our results on post bronchodilator FEV 1 and FEV 1 /FVC.

The SNP with the lowest p value within each region or gene from the CHARGE/SpiroMeta consortium is listed, ordered by chromosome number [10]. Quite a few SNPs met a nominal level of significance using Bonferroni correction (P < 0.0018 for the 28 regions tested). *In Table 4, while this SNP is not significant in our cohort and meta-analysis, rs12048582 in the TGFB2 gene was genome wide significant (p-value = 6.28E-09).

Significant Spirometry Results associated with COPD Affection status in COPDGene
In this study, we found that multiple genetic determinants of COPD affection status were associated with spirometric measures of lung function in COPDGene. These results are not surprising, since COPD is defined by lung function thresholds. In the GWAS results of COPD and severe COPD affection status in COPDGene [25], there was a significant association between affection status and SNPs in the 15q25 region [CHRNA3, CHRNA5, CHRNB4, AGPHD1, and IREB2], HHIP on chromosome 4, FAM13A on chromosome 4, TGFB2 on chromosome 1, MMP3/MMP12 on chromosome 11, and RIN3 on chromosome 14. We found evidence for all of these regions as genetic determinants of spirometric measures in COPDGene as well.
The RIN3 locus on chromosome 14 has not been previously associated with lung function, although SNPs in RIN3 were associated with COPD affection status in the COPDGene study [25]. We identified an association near RIN3 with FEV 1 /FVC in the COPDGene cohort. RIN3, a Rab5 GTPase binding protein, is expressed in many tissues, including the lung [31,32].

Significant spirometry GWA results not significantly associated with affection status in COPDGene
While not previously associated with lung function, DBH on chromosome 9 has been associated with smoking intensity [33]. Our finding represents the first evidence of association of this locus with lung function. Although the SNP identified in this study (rs1108581) does not cause amino acid residue changes in DBH, gene expression may be modified either directly or through other variant(s) in strong LD. This view is supported by evidence that a genetic variant (C-1021T or rs1611115), located upstream of the DBH translational start site, accounts for 51 % of the variation in plasma-DBH activity in NHW [33]. SNPs near CYP2A7 and CYP2A6 on chromosome 19 have been associated with lung cancer, cigarette smoking, and COPD [34,35]. Notably, both of these loci were significant despite adjustment for cigarette smoking status.

Novel nature of COPDGene study
The COPDGene study is novel in several ways. There are many subjects with severe and very severe COPD (GOLD spirometry grades 3-4). There were sufficient numbers of both AA and NHW subjects to allow reasonable power to detect a genetic association with quantitative spirometric measures in these stratified samples. COPDGene has carefully collected standardized spirometric measures and post-bronchodilator spirometry. In addition, all COPDGene subjects were former or current smokers.

Potential limitations
The COPDGene cohort was ascertained based on smoking status and GOLD stage. Analysing secondary phenotypes in a case-control study can be biased due to this ascertainment condition. However, this is only an issue for SNPs associated with both the ascertainment condition and the secondary phenotype. Since our analysis focused on measures of pulmonary function (one of the primary ascertainment conditions) and adjusted for smoking status, our analysis should be robust against this sampling bias [36]. While COPDGene includes both AA and NHW, the sample size of AA subjects was considerably smaller and therefore had limited statistical power.

Conclusions
The GWA of lung function measures in COPDGene identified a novel locus on chromosome 9/DBH among NHW as being associated with common spirometric measures, and replicated multiple previously reported genetic loci for lung function. Further research will be required to determine the functional genetic variants within these regions of association.

COPDGene study subjects
COPDGene is a multi-center study performed in the United States that has been described in detail previously [37]. The COPDGene study included 10,192 current and former smoking participants with at least 10 pack-years of cigarette smoking history and ages ranging from 45 to 81 years. Table 1 details characteristics of the COPDGene participants included in the genome-wide association analysis. We excluded from the analysis subjects with severe alpha-1 antitrypsin deficiency, genotyping failure, or spirometric tests that failed quality control review. This resulted in 9919 subjects (6659 NHW and 3260 AA). Among these subjects, there were 3641 COPD cases (2820 NHW and 821 AA) defined as having Global Initiative for Chronic Obstructive Lung Disease (GOLD) spirometry stages 2-4 with post-bronchodilator FEV1/FVC < 0.70 and FEV1 < 80 % of predicted values.
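The spirometric groupings used in this study can be illustrated with a short sketch. The cut-offs (FEV1/FVC of 0.70 and FEV1 of 80 % predicted) are taken from the case definition above and from the group definitions in the Statistical analyses section; the function name and group labels are hypothetical and chosen only for illustration.

```python
def spirometry_group(fev1_fvc: float, fev1_pct_predicted: float) -> str:
    """Assign a post-bronchodilator spirometry group.

    fev1_fvc is on the proportion scale (0-1);
    fev1_pct_predicted is on the percentage scale (0-100+).
    """
    obstructed = fev1_fvc < 0.70
    reduced_fev1 = fev1_pct_predicted < 80.0
    if obstructed and reduced_fev1:
        return "GOLD 2-4"           # COPD cases in this study
    if obstructed:
        return "GOLD 1"             # obstruction with preserved FEV1
    if reduced_fev1:
        return "unclassified"       # reduced FEV1 without obstruction
    return "normal spirometry"      # smoking controls


print(spirometry_group(0.62, 55.0))  # GOLD 2-4
print(spirometry_group(0.75, 92.0))  # normal spirometry
```

The case-only analyses correspond to subjects for whom this sketch returns "GOLD 2-4".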

Post bronchodilator spirometry measurements
Spirometry in COPDGene was performed using a standardized spirometer (EasyOne by ndd Medical Technologies, Inc, Andover, MA). Spirometry was performed at baseline and repeated approximately 20 min after two puffs (180 mcg) of albuterol administered through a spacer. The analyses in this manuscript focused on the post-bronchodilator spirometric values. Pulmonary function measurements were collected according to the American Thoracic Society/European Respiratory Society guidelines [38]. The methodology for spirometric measures has been described in detail previously [37].

Genotyping, quality control and imputation
All COPDGene subjects were genotyped using the HumanOmniExpress array by Illumina (San Diego, CA). Details of genotyping quality control have been previously described [25]. Imputation on the COPDGene cohorts was performed using MaCH and minimac [39,40]. Prephasing and imputation were both performed using 30 rounds and 200 states, with regions divided into 1 megabase and 500 kb flanks. Reference panels for the NHW and AA subjects were the 1000 Genomes Phase I v3 European (EUR) and cosmopolitan reference panels, respectively [41]. Variants with an R-squared value of ≤ 0.3 were removed from further analysis. SNPs with minor allele frequency less than 1 % were excluded. Further details concerning genotyping, quality control, and imputation are posted on the COPDGene website (http://www.copdgene.org). All SNP genomic locations are based on the NCBI37/hg19 assembly.
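The two variant-level filters stated above (imputation R-squared ≤ 0.3 removed; minor allele frequency below 1 % excluded) can be expressed as a simple predicate. This is a minimal sketch; the `Variant` record, its field names, and the example identifiers are hypothetical and do not correspond to the output format of MaCH or minimac.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    snp_id: str   # hypothetical identifier, for illustration only
    rsq: float    # imputation R-squared
    maf: float    # minor allele frequency (0-0.5)


def passes_qc(v: Variant, min_rsq: float = 0.3, min_maf: float = 0.01) -> bool:
    """Keep variants with R-squared above 0.3 and MAF of at least 1 %."""
    return v.rsq > min_rsq and v.maf >= min_maf


variants = [
    Variant("snp_a", rsq=0.95, maf=0.20),   # kept
    Variant("snp_b", rsq=0.30, maf=0.20),   # removed: R-squared <= 0.3
    Variant("snp_c", rsq=0.80, maf=0.005),  # removed: MAF < 1 %
]
kept = [v.snp_id for v in variants if passes_qc(v)]
print(kept)  # ['snp_a']
```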

Meta-analysis study populations: ECLIPSE and GenKOLS
The Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE) was a longitudinal, prospective, observational study conducted at 46 clinical centers in 12 countries with genome-wide SNP data available from 1764 COPD cases and 178 current or ex-smoking controls [42]. The GenKOLS GWAS cohort consists of 863 COPD cases and 808 controls from Bergen, Norway. Genotyping methods and study descriptions for the GenKOLS and ECLIPSE cohorts have been described previously [43,44]. We limited our analysis in both studies to current or ex-smokers of European descent.
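As a consistency check, the cohort sizes reported in this manuscript can be tallied against the stated meta-analysis total of 13,532. The sketch below assumes that all genotyped ECLIPSE and GenKOLS cases and controls listed above entered the meta-analysis; the dictionary labels are illustrative.

```python
# Sample sizes as reported in the text.
cohorts = {
    "COPDGene NHW": 6659,
    "COPDGene AA": 3260,
    "ECLIPSE cases": 1764,
    "ECLIPSE controls": 178,
    "GenKOLS cases": 863,
    "GenKOLS controls": 808,
}

total = sum(cohorts.values())
print(total)  # 13532
```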

Statistical analyses
GWA analyses were performed in PLINK (v1.07) and were stratified by race [45]. Linear regression analyses of FEV 1 and the ratio of FEV 1 /FVC were adjusted for age, gender, pack-years, height and genetic ancestry (as summarized by principal components) by including these covariates in the model. The primary analyses were performed on the whole cohort (including smoking controls with normal spirometry and GOLD stages 2 to 4 COPD in all cohorts, and additionally individuals with unclassified spirometry [FEV 1 < 80 % predicted but FEV 1 /FVC ≥0.7], and GOLD stage 1 COPD with FEV 1 ≥ 80 % predicted but FEV 1 /FVC < 0.7 in COPDGene). A secondary analysis was limited to only GOLD 2-4 cases.
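For a single SNP, the covariate-adjusted linear regression described above can be sketched with ordinary least squares. This is not the PLINK implementation; it is an illustrative NumPy sketch under an additive allele-dosage coding, and the function name and the simulated data are assumptions made for the example.

```python
import numpy as np


def snp_association(phenotype, dosage, covariates):
    """Regress a phenotype on SNP dosage plus covariates by OLS.

    covariates is an (n, k) array, e.g. age, sex, pack-years, height
    and ancestry principal components. Returns (beta, se, t) for the SNP.
    """
    y = np.asarray(phenotype, dtype=float)
    g = np.asarray(dosage, dtype=float)
    c = np.asarray(covariates, dtype=float)
    X = np.column_stack([np.ones_like(y), g, c])   # intercept, SNP, covariates
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                      # residual degrees of freedom
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of estimates
    se = np.sqrt(cov[1, 1])
    return coef[1], se, coef[1] / se


# Simulated data for illustration only (not study data).
rng = np.random.default_rng(0)
n = 1000
dosage = rng.binomial(2, 0.3, n).astype(float)
covars = rng.normal(size=(n, 3))
fev1 = 2.5 - 0.1 * dosage + covars @ np.array([0.4, -0.2, 0.1]) \
    + rng.normal(scale=0.5, size=n)
beta, se, t = snp_association(fev1, dosage, covars)
print(f"beta={beta:.3f}, se={se:.3f}, t={t:.2f}")
```

A genome-wide scan repeats this fit for each SNP and converts the t statistic to a p-value, which is then compared against the genome-wide threshold of 5 × 10 −8.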

Ethics
The COPDGene, ECLIPSE, and GenKOLS studies were all approved by the respective clinical center institutional review boards. The COPDGene, ECLIPSE, and GenKOLS studies met IRB protocol approved by the NHLBI for human subjects research. For the COPDGene study and the meta-analysis study conducted in this manuscript, we have obtained IRB approval from the Colorado Multiple Institutional Review Board (COMIRB) at the University of Colorado, Colorado School of Public Health.

Consent
We have obtained written informed consent from the subjects to participate in these studies. We have obtained written informed consent to publish from the participants of the COPDGene, ECLIPSE and GenKOLS studies and no individual patient data or individual clinical data is presented in this manuscript.

Availability of data and materials
The datasets used in this paper can be found at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000179.v1.p1

Additional file
Additional file 1: The supplement for this manuscript contains the following information, tables, and figures. The COPD Foundation funding and a list of the COPDGene and ECLIPSE investigators are given in the supplement. Tables S1-S12 list the top SNPs with a p-value <5E-06 for FEV 1 and FEV 1 /FVC for all subjects and cases only among AA in the COPDGene study, NHW in the COPDGene study, and in the meta-analysis of the COPDGene, ECLIPSE and the GenKOLS studies. Tables S13 and S14 provide a comparison of genome-wide significant results for pre and post bronchodilator FEV 1 and FEV 1 /FVC among NHW in the COPDGene study. Figures S1 and S2 are region plots for the genome-wide significant results for FEV 1 and FEV 1 /FVC, respectively, in the meta-analysis. (DOC 15181 kb)

Abbreviations
COPD: Chronic obstructive pulmonary disease; SNP: Single nucleotide polymorphism; MAF: Minor allele frequency; AA: African American; NHW: Non-Hispanic White; FEV 1 : Forced expiratory volume in the first second; FVC: Forced vital capacity, the total volume of air expired after a maximal inhalation; GWAS: Genome wide association study.

Competing interests
Regarding conflicts of interest, in the past three years, Edwin K. Silverman received honoraria and consulting fees from Merck and grant support and consulting fees from GlaxoSmithKline. Craig Hersh received lecture fees from Novartis and consulting fees from CSL Behring. David Lomas is a consultant and has received grant support and honoraria from GlaxoSmithKline. He chairs the Respiratory Therapy Area Board at GlaxoSmithKline. No other authors reported conflicts of interest. The funding sources played no role in the design of the study or the decision to submit the manuscript for publication.