Genome-Wide Analysis of DNA Methylation and Fine Particulate Matter Air Pollution in Three Study Populations: KORA F3, KORA F4, and the Normative Aging Study

Background: Epidemiological studies have reported associations between particulate matter (PM) concentrations and cancer and respiratory and cardiovascular diseases. DNA methylation has been identified as a possible link, but so far it has been analyzed only at candidate sites. Objectives: We studied the association between DNA methylation and short- and mid-term air pollution exposure using genome-wide data and identified potential biological pathways for additional investigation. Methods: We collected whole blood samples from three independent studies, KORA F3 (2004–2005) and F4 (2006–2008) in Germany and the Normative Aging Study (1999–2007) in the United States, and measured genome-wide DNA methylation proportions with the Illumina 450k BeadChip. PM concentration was measured daily at fixed monitoring stations, and three trailing averages (2-day, 7-day, and 28-day) were regressed against DNA methylation. Meta-analysis was performed to pool the study-specific results. Results: Random-effect meta-analysis revealed 12 CpG (cytosine-guanine dinucleotide) sites associated with PM concentration (1 for the 2-day average, 1 for the 7-day, and 10 for the 28-day) at a genome-wide Bonferroni significance level (p ≤ 7.5E-8); 9 of these 12 sites showed increased methylation. When homogeneity across the studies was assessed through estimation of I2, 4 of these sites (annotated to NSMAF, C1orf212, MSGN1, NXN) showed p > 0.05 and I2 < 0.5: the site from the 7-day average results and 3 from the 28-day average results. Applying the false discovery rate (FDR), p < 0.05 was observed for 8 and 1,819 additional CpGs at 7- and 28-day average PM2.5 exposure, respectively. Conclusion: The PM-related CpG sites found in our study suggest novel plausible systemic pathways linking ambient PM exposure to adverse health effects through variations in DNA methylation.
Citation: Panni T, Mehta AJ, Schwartz JD, Baccarelli AA, Just AC, Wolf K, Wahl S, Cyrys J, Kunze S, Strauch K, Waldenberger M, Peters A. 2016. A genome-wide analysis of DNA methylation and fine particulate matter air pollution in three study populations: KORA F3, KORA F4, and the Normative Aging Study. Environ Health Perspect 124:983–990; http://dx.doi.org/10.1289/ehp.1509966


Introduction
Ambient air pollution has been associated with total mortality, as well as with cardiorespiratory disease morbidity and mortality (Brook et al. 2010; Hoek et al. 2013). Recently, associations between long-term exposure to ambient air pollution, benzene and nitrogen dioxide, and lung cancer have been reported in North America and Europe (Puett et al. 2014; Raaschou-Nielsen et al. 2013; Villeneuve et al. 2014). Fine particulate matter [PM < 2.5 μm (PM2.5)] in particular is believed to be responsible for these associations. The World Health Organization (WHO) estimates that ambient air pollution caused 3.7 million premature deaths worldwide in 2012 (WHO 2014).
Findings based on animal models suggest that oxidative stress and inflammatory responses initiated upon deposition of fine PM in the alveoli may be key pathophysiologic mechanisms linking exposure to ambient fine particles to both respiratory and cardiovascular diseases in humans (Cassee et al. 2013). Oxidative stress and inflammation have also been proposed as underlying mechanisms linking PM and cancer, including lung cancer (Soberanes et al. 2012; Zhao et al. 2013). Despite these findings, the extent to which systemic effects are elicited by ambient particles, and the detailed pathways activated, are still under debate (Peters 2012). Novel molecular approaches such as genome-wide methylation assays allow a hypothesis-free assessment of changes in the regulation of blood leukocytes, which are involved in the development of cardiovascular disease (CVD).
Changes in global methylation as well as in candidate genes (Bind et al. 2014) were observed in individuals with high occupational exposure, such as foundry workers in a small study, or in response to ambient PM concentrations a few hours before the study visit. However, it is difficult to determine the exact time window associated with methylation.
Genome-wide methylation assays allow taking advantage of advances in biological technologies in epidemiological studies (Christensen and Marsit 2011) and studying in particular the role of ambient fine particles.
The objective of the analyses presented here is to identify and investigate DNA methylation at CpG (cytosine-guanine dinucleotide) sites in association with short- and mid-term ambient PM2.5 exposure. In addition, we consider biological pathways that might mediate associations between PM2.5 and health outcomes, based on the specific CpG sites identified.

Methods
Three independent cohort studies formed the basis for the analyses presented here. Uniform methods were applied for fine particle measurements and methylation profiling.

Study Populations
The KORA F3 and F4 cohorts are follow-up studies of the previous KORA S3 and S4 surveys, which enrolled inhabitants of German nationality between the ages of 25 and 74 years from the region of Augsburg, South Germany, in accordance with the principles of the Declaration of Helsinki (World Medical Association Declaration of Helsinki 2008). They included 3,988 participants from F3 and 4,227 participants from F4; data were collected between 2004 and 2005 (F3) and between 2006 and 2008 (F4). These two studies have been described in detail previously (Holle et al. 2005; Wichmann et al. 2005). Methylation profiles were evaluated for a total of 500 KORA F3 participants and 1,799 F4 participants. There is no sample overlap between F3 and F4. All participants provided written informed consent under procedures approved by the Ethics Committee of the Bavarian Medical Association.
The Veteran Affairs (VA) Normative Aging Study (NAS) is an ongoing longitudinal study of aging, which began in 1963; details of this study have been published previously (Bell et al. 1972). Briefly, the NAS is a closed cohort of 2,280 male volunteers from the Greater Boston area who were 21-80 years old at entry. They were enrolled after an initial health screening determined that they were free of known chronic medical conditions. The present study was approved by the Department of Veterans Affairs Boston Healthcare System, and written informed consent was obtained from subjects prior to participation. The NAS participants have been reevaluated every 3-5 years using detailed on-site physical examinations and questionnaires. Blood samples were provided from 657 participants and for most of them a second sample was drawn (1,119 samples in total) between 1999 and 2007.
We restricted the current analysis to white participants (n = 657) in order to increase comparability across the three cohorts.

Profiling of DNA Methylation
We used the Illumina 450k BeadChip (following the Illumina Infinium HD Methylation Protocol) to assess DNA methylation at more than 480,000 CpG sites throughout the genome (Zeilinger et al. 2013). Detailed validation and evaluation of this technology are provided by Sandoval et al. (2011) and Dedeurwaerder et al. (2011). The outputs of the chip are beta values that represent the percentage of methylation for every CpG target. Because the microarray measures each CpG site with one of two technically distinct types of probes, the distribution of the resulting methylation values differs between probe types. We used the following approach to preprocess the data: a) data quality: removal of records according to functional beads, detection p-value and SNP frequency; b) data correction: background subtraction and dye bias adjustment; c) probe type adjustment: beta-mixture quantile normalization (BMIQ; Teschendorff et al. 2013). The normalization procedure was chosen based on review papers (Wu et al. 2014).

Environmental Measurement
In KORA, PM2.5 mass concentration in ambient air and temperature were measured hourly at one monitoring station approximately 1 km southeast of the city center of Augsburg for the entire study period from 2004 to 2008 (Birmili et al. 2010; Pitz et al. 2008) with the tapered element oscillating microbalance device (TEOM® series 1400A, Thermo Electron Corporation, East Greenbush, NY, USA) as described by Patashnick and Rupprecht (1990). Forty-four days were missing in KORA in 2004-2008 and were excluded from the calculation of trailing averages.
In NAS, ambient PM2.5 concentration was monitored in downtown Boston, 1 km from the VA medical center. We measured hourly PM2.5 concentrations with the same device as in Augsburg. Hourly temperature data were obtained from the Boston Logan Airport (Boston, MA) weather station (12 km from the medical center). Sampling, processing of samples, analysis and reporting were conducted according to standard operating procedures (Dockery et al. 2005). Missing hourly concentration data for PM2.5 were imputed using regression modeling, including a long-term time trend, day of week, hour of day, temperature, relative humidity, barometric pressure, and nitrogen dioxide (NO2) concentrations as predictors.
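The trailing averages used as exposure metrics can be illustrated with a minimal sketch. It assumes one daily mean per calendar day, a window that counts backwards from and includes the day of the visit, and that missing days are simply skipped; the function name, the example dates and the concentrations are hypothetical and are not study data.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def trailing_average(daily: Dict[date, float], visit: date, window: int) -> Optional[float]:
    """Mean of the daily values over `window` days ending on the visit day.

    Days absent from `daily` are treated as missing and skipped (assumption);
    None is returned if no day in the window has a value.
    """
    days = [visit - timedelta(days=k) for k in range(window)]
    values = [daily[d] for d in days if d in daily]
    return sum(values) / len(values) if values else None


# Hypothetical daily PM2.5 means in ug/m3; 3 March has no measurement.
start = date(2006, 3, 1)
raw = [10.0, 14.0, None, 20.0, 16.0, 12.0, 18.0]
pm25 = {start + timedelta(days=k): v for k, v in enumerate(raw) if v is not None}

visit_day = date(2006, 3, 7)
averages = {w: trailing_average(pm25, visit_day, w) for w in (2, 7, 28)}
```

The same function would apply to daily temperature, so that the temperature window always matches the exposure window. A real analysis would probably also require a minimum number of valid days per window; no such rule is stated in the text.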

Statistical Analysis
An Epigenome-Wide Analysis Study (EWAS) was conducted in each of the three studies. Based on previous knowledge (Brüske et al. 2010; Steenhof et al. 2014; Zeilinger et al. 2013), we defined an a priori model with the following covariates: age, personal income (education years for NAS, in which information on income was not available), alcohol intake, BMI, temperature (trailing average always matching the PM exposure window), and the proportions of five white blood cell types (monocytes, B cells, CD8 T cells, CD4 T cells, and NK cells) [estimated with a method developed by Houseman et al. (2012)] as continuous variables; and sex, smoking status (never, former, current and, for KORA only, passive-only smokers), day of the week, and season (according to the astronomical definition) as categorical variables (Table 1). To investigate the association between short- and mid-term PM2.5 and DNA methylation, we considered three averaging periods (2-, 7- and 28-day) counted backwards from the day of the visit, decided a priori based on Bind et al. (2014), Schwartz (2000) and Rückerl et al. (2007). For KORA, multivariable linear regression models (Equation 1) were used to investigate the association between PM2.5 exposure and methylation values:

$$Y_i = \beta_0 + \beta_1\,\mathrm{PM}_{2.5,i} + \beta_2\,\mathrm{Temp}_i + \beta_3 X_{3i} + \dots + \beta_p X_{pi} + \varepsilon_i \qquad (1)$$

where $Y_i$ is the methylation measurement for subject $i$, $\beta_0$ is the intercept, $\beta_1$ and $\beta_2$ are the coefficients of the trailing average values for exposure and temperature during the specific time window, $X_{3i}$ to $X_{pi}$ are the $p-2$ covariates and $\varepsilon_i$ is the error. Effect estimates represent the difference in methylation associated with a 10 μg/m3 increase in PM2.5.
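The per-CpG regression and the scaling of the exposure coefficient to a 10 μg/m3 increment can be sketched as follows. This is an illustration only: it fits ordinary least squares through the normal equations in plain Python, uses a single stand-in covariate (age) instead of the full covariate set, and runs on noise-free toy values that are not study data. The helper names `solve` and `ols` are hypothetical.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(y: Sequence[float], columns: Sequence[Sequence[float]]) -> List[float]:
    """Least-squares coefficients [b0, b1, ...] of y on an intercept plus the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve(xtx, xty)


# Toy data for one CpG: trailing-average PM2.5 (ug/m3), matching temperature, age.
pm = [8.0, 12.0, 15.0, 20.0, 25.0, 30.0]
temp = [2.0, 10.0, 5.0, 18.0, 12.0, 22.0]
age = [50.0, 61.0, 47.0, 70.0, 58.0, 65.0]
# Methylation proportions generated without noise from known coefficients.
beta_values = [0.50 + 0.0004 * a - 0.001 * t + 0.0005 * g for a, t, g in zip(pm, temp, age)]

coef = ols(beta_values, [pm, temp, age])
effect_per_10 = 10.0 * coef[1]  # difference in methylation per 10 ug/m3 PM2.5
```

In an epigenome-wide setting the same model would be fitted once per CpG site, which is why a vectorized linear-algebra routine would normally replace the hand-written solver used here.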
For NAS data, we fitted generalized mixed-effect models (Equation 2) to account for the repeated measurements; time-variant covariates were assessed at both the first and the second visit, and a random participant effect ($u_i$) was included to take the data collection at two different time points into account:

$$Y_{ij} = \beta_0 + u_i + \beta_1\,\mathrm{PM}_{2.5,ij} + \beta_2\,\mathrm{Temp}_{ij} + \beta_3 X_{3ij} + \dots + \beta_p X_{pij} + \varepsilon_{ij} \qquad (2)$$

where $j$ indexes the visit of participant $i$ and the remaining terms are defined as in Equation 1.

We then pooled cohort-specific estimates, when available for all three studies, for each exposure window by random-effect meta-analysis (428,415 CpG targets). A Bonferroni threshold (fixed at 7.5E-08) and the false discovery rate [FDR; Benjamini-Hochberg criterion (Benjamini and Hochberg 1995)] were used to adjust fixed-effect p-values for multiple comparisons. The I-squared test on fixed-effect estimates was used to assess heterogeneity, and CpGs with p-values > 0.05 and I2 < 0.5 were labeled as homogeneous.

Finally, a number of sensitivity analyses were performed. We repeated the a priori models with additional adjustment for average annual exposure during the year before the visit to assess potential confounding by long-term exposure. In addition, we ran models adjusted only for age and sex, and models adjusted only for age, sex, and white blood cell proportions. All analyses were performed using the statistical software R (version 2.14; R Core Team 2014).
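The pooling and homogeneity assessment for a single CpG can be sketched as below. The text specifies a random-effect meta-analysis and I2 but not the estimator, so the DerSimonian-Laird between-study variance used here is an assumption, as are the function name and the three study-specific estimates, which are invented for illustration. The heterogeneity p-value uses the closed form of the chi-square survival function for 2 degrees of freedom, which is only valid for exactly three studies.

```python
import math
from typing import Dict, Sequence


def random_effects_meta(betas: Sequence[float], ses: Sequence[float]) -> Dict[str, float]:
    """DerSimonian-Laird random-effects pooling of three study estimates (assumed estimator)."""
    if len(betas) != 3 or len(ses) != 3:
        raise ValueError("this sketch handles exactly three studies (df = 2)")
    w = [1.0 / s ** 2 for s in ses]                      # inverse-variance weights
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))  # Cochran's Q
    df = len(betas) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                        # between-study variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0        # I2 as a proportion
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    p_value = math.erfc(abs(pooled / se) / math.sqrt(2.0))  # two-sided normal p-value
    q_pvalue = math.exp(-q / 2.0)                        # chi-square survival function, df = 2
    return {"pooled": pooled, "se": se, "p": p_value, "tau2": tau2,
            "i2": i2, "q": q, "q_p": q_pvalue}


# Hypothetical study-specific effects per 10 ug/m3 PM2.5 and their standard errors.
result = random_effects_meta([0.004, 0.005, 0.003], [0.001, 0.002, 0.0015])
# Homogeneity rule from the text: heterogeneity p > 0.05 and I2 < 0.5.
homogeneous = result["q_p"] > 0.05 and result["i2"] < 0.5
```

When the estimated between-study variance is zero, as in this toy input, the random-effects estimate coincides with the fixed-effect estimate.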

Results
Data from three independent cohort studies were available (Table 1). Specifically, cross-sectional data from two independent subsamples of the KORA study (KORA F3, n = 500 participants; F4, n = 1,799) and cohort data collected as part of the NAS (n = 657) formed the basis of the analyses presented here. The NAS included only men, with an average age of 72 years, while KORA F3 and F4 participants (52% and 49% male) were on average 53 and 61 years old. While body mass index was rather similar, substantial differences were observed for years of education (mean of 15.1 in NAS versus 11.7 and 11.5 in KORA F3 and F4, respectively) and alcohol consumption (19.7% drinkers in NAS versus 59.2% and 57.7% in KORA F3 and F4, respectively). Regarding smoking, KORA F3 consisted mostly of never and current smokers and KORA F4 of former and current smokers, whereas approximately two-thirds of NAS participants were former smokers. The NAS, on average, had lower particle concentrations the day before the visit but higher average temperatures than the KORA studies. During the study period, PM2.5 exceeded the daily U.S. Environmental Protection Agency (EPA) standard of 35 μg/m3 on 7.5% of the days in F3 (2004-2005), 5.9% in F4 (2006-2008) and 2.9% in NAS (1999-2007) (U.S. EPA 2004). Consistent methylation averages were observed between the three studies, with relatively small standard deviations (Tables 2-3).
The meta-analyses identified genome-wide significant (p < 7.5E-08) associations between single CpG sites and PM2.5 exposure averaged over 2 days up to 4 weeks (Figure 1). DNA methylation at one CpG site (cg25575464 within NEURL4, chromosome 17) reached genome-wide significance in association with 2-day trailing average PM2.5, with a positive association indicating higher methylation per 10-μg/m3 increase in exposure (Table 2; see also Figure S1). Although study-specific associations were all positive, there was significant heterogeneity among the studies. For 7-day average PM2.5 concentration, the association with one CpG site, cg19963313 (NSMAF, chr 8), reached genome-wide significance (Table 2), and study-specific estimates were positive and homogeneous (I2 = 0.0, p-value 0.59) (Figure 2). Associations between 7-day PM2.5 and cg02608596 (MPND, chr 9) were also positive and homogeneous among the three studies, but the p-value was slightly above the alpha level for genome-wide significance. For 28-day average PM2.5, 10 CpG sites reached genome-wide significance (Table 3). Study-specific associations were homogeneous for cg23276912, cg11046593, and cg26003785 (Figure 3), but heterogeneous for the other CpGs (see Figure S1). When we considered all associations with FDR p < 0.05, a total of 1,829 CpG sites were associated with 28-day average PM2.5 (see Excel File S1), including 5 in genes with at least one Bonferroni significant CpG (also shown in Table 3): cg16856342 (SERBP1, chr 1), cg02795981 (ZMIZ1, chr 10), cg24101979 and cg26283240 (NXN, chr 17) and cg06004017 (MN1, chr 22). One CpG with a significant FDR for 28-day PM2.5 reached genome-wide significance for 7-day PM2.5 (cg19963313, NSMAF, chr 8).
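The two significance criteria used in these results, the fixed genome-wide Bonferroni level and the Benjamini-Hochberg FDR, can be sketched as follows. The threshold of 7.5E-08 is taken from the text; the six p-values are invented for illustration, and `benjamini_hochberg` is a hypothetical helper name.

```python
from typing import List

BONFERRONI_THRESHOLD = 7.5e-8  # genome-wide level stated in the text


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical meta-analysis p-values for six CpG sites.
pvals = [2.0e-9, 3.0e-6, 0.004, 0.03, 0.2, 0.6]
adj = benjamini_hochberg(pvals)

bonferroni_hits = [p <= BONFERRONI_THRESHOLD for p in pvals]
fdr_hits = [q < 0.05 for q in adj]
```

In this toy input one site passes the Bonferroni level while four pass the FDR criterion, mirroring the pattern in the results, where FDR identifies additional CpGs beyond the genome-wide significant ones.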

Sensitivity Analysis
Genome-wide significant CpGs at 28-day were also adjusted for long-term exposure and resulted in consistent estimates and p-values, except for cg20680669, whose estimate moved from β = -0.0049 with p = 2.09E-08 (without long-term adjustment) to β = -0.0020 with p = 2.36E-03, and cg26003785, whose estimate moved from β = 0.0038 with p = 9.53E-09 to β = 0.0033 with p = 1.10E-06 (see Table S2). Furthermore, we checked for potential influences of outliers (see Figures S2-S4). Cg11046593 was of concern in these plots, and 22 values were excluded for F4, 1 for F3 and 12 for NAS. However, the association remained significant: the estimate moved from 0.016 to 0.012 and the p-value from 1.12E-08 to 5.48E-08.
Table 3 notes: FDR by the Benjamini-Hochberg (1995) method, significance level at 0.05. d Non-Bonferroni significant but FDR significant CpGs located in a gene with a Bonferroni significant CpG. e Shown in Figure 3. f FDR significant and Bonferroni significant at 7-day PM2.5. g NA: no annotated gene. Note: The 28-day trailing average starts from the day of the visit. A complete list of all CpGs that meet genome-wide significance or FDR significance for 28-day PM2.5 is provided in Excel File S1.

Discussion
This meta-analysis of three cohort studies identified 12 CpGs that were genome-wide significantly associated with ambient fine particulate matter concentrations at different exposure times, based on Bonferroni correction. Based on previous knowledge (Bind et al. 2014; Rückerl et al. 2007; Schwartz 2000), we considered three cumulative exposure windows (2, 7 and 28 days), and we observed that the number of associations was larger for the longest exposure window. Nine CpG sites displayed increased methylation and three displayed decreased methylation after exposure to fine ambient particle concentrations. All identified methylation sites displayed little overall variation (the average coefficient of variation was 15%) within the study populations. Four of them showed homogeneous changes across the three studies. Applying FDR, 7 and 1,819 additional CpGs were found significant at 7- and 28-day average PM2.5 exposure, respectively. The CpG site (cg19963313) identified with the 7-day trailing average showed homogeneity among the studies. Cg19963313 is located in the gene NSMAF, which is linked with the 55kD tumor necrosis factor receptor because it encodes a WD-repeat protein that binds the receptor's cytoplasmic sphingomyelinase activation domain (Montfort et al. 2010). Moreover, it participates in the same reaction within a pathway as SMPD2 (Wu et al. 2010), which has been demonstrated in primary cells to be linked to oxidative stress (Byon et al. 2008; Jana and Pahan 2007). Furthermore, it has been identified in the cellular response to hyperosmolar stress (Robciuc et al. 2012). Hyperosmolarity is well known to impose remarkable stress on membranes, especially those in direct contact with the environment (Hallows et al. 1996), but it has never been associated with air pollution.
Furthermore, we identified three CpG sites significantly and homogeneously associated with the 28-day average exposure to fine particles: cg26003785, cg11046593 and cg23276912, annotated to NXN, MSGN1 and C1orf212, respectively, which are protein-coding genes.
Specifically, NXN has been reported as a partner of phosphofructokinase (PFK) 1, a glycolytic enzyme that has been reported to contribute to systemic metabolic conditions and also to cancerous processes (Mor et al. 2011; Yi et al. 2012).
Increased methylation was detected at cg11046593, located in the promoter of MSGN1; promoter methylation has been shown to lead to transcriptional repression (Jones and Takai 2001). Domain databases also indicate a protein domain shared with AHR (aryl hydrocarbon receptor) and ARNT (aryl hydrocarbon receptor nuclear translocator), which are involved in the regulation of inflammatory responses (Scrivo et al. 2011; Ukena et al. 2010). These two genes have been found to regulate chemokine responses, mostly by relating AHR and ARNT to the nuclear factor-κB (NF-κB) family, in which the p65/p50 dimer is pivotal in the regulation of inflammatory responses (Øvrevik et al. 2014; Vogel and Matsumura 2009). AHR and particulate matter exposure have already been associated through nongenotoxic events and Th17 polarization (Andrysik et al. 2011; van Voorhis et al. 2013), but here we observed an epigenetic factor as a possible mediator. Even without a direct association, the discovery of MSGN1 provides a novel hypothesis for the path between exposure to endogenous factors and immune system responses, and future studies are needed to verify and eventually clarify the possible role of ARNT.

Temporal Variation within Short- and Mid-Term Range
For cumulative exposures over 28 days, 10 CpG sites were genome-wide significant. Larger datasets are needed to better understand the optimal exposure time window and to confirm the hypothesis that it may be CpG site-specific. The cases of cg25575464 (Bonferroni significant at 2-day, FDR significant at 7-day and non-significant at 28-day average) and cg19963313 (non-significant at 2-day, Bonferroni significant at 7-day and FDR significant at 28-day average) might be consistent with the hypothesis that some CpGs are associated with exposure over shorter time periods but not over longer ones.
One of the genes we highlighted, ZMIZ1, has already been connected to skin tumors in mouse models (Rogers et al. 2013), and our results independently link it to PM exposure via DNA methylation, reinforcing the hypothesized role of epigenetics in the pathways to tumor development (Laird 2005). We observed mostly positive effect estimates in this genome-wide methylation study, in contrast with previous results (Guo et al. 2014) showing a negative association between short-term PM exposure and DNA methylation in tandem repeats. Zeilinger et al. (2013) observed decreased methylation as a consequence of active smoking in a cross-sectional study. Their most striking and significant CpG belongs to AHRR (aryl hydrocarbon receptor repressor), which represses AHR, and we observed increased methylation in a gene that shares a protein domain with AHR. Possible relations and implications need to be verified in the future.

Strengths and Limitations
The data presented here combine evidence from three independent studies, each including data from at least 500 participants, which is essential for identifying differentially methylated CpG sites that have very little variability. We also adjusted our models for important variables that may otherwise confound associations with ubiquitous exposures such as ambient air pollution. Finally, we used daily averages of temperature to calculate the same trailing averages and to adjust appropriately for weather conditions. We performed a number of sensitivity analyses. Overall, the results of the a priori chosen model were considered a conservative estimate. The observed associations between PM2.5 and CpG sites were independent of long-term exposure at the residence and were not influenced by potential outliers. This study also has limitations. There is a consensus in the scientific community that a background station measuring mass concentrations of particulate matter with aerodynamic diameter ≤ 2.5 μm (PM2.5) can be regarded as representative for larger urban areas (Monn 2001). Considering that no coal power plant is in operation in proximity to the participants and that only a small percentage of them live close to a major road, we had to rely on ambient air pollution measurements because personal exposures were not available. Measurement error from using a single site in this study is expected to be primarily Berkson-type measurement error (Zeger et al. 2000), which would bias the standard errors but not the estimated associations. We also acknowledge that the study included only whites, and generalizability to other populations is uncertain. While KORA was cross-sectional, the NAS assessed the role of ambient particles longitudinally over time. Nevertheless, we had no comparable exposure estimates available to assess the long-term effect of ambient particles. Finally, the Illumina 450k does not completely cover the entire epigenome.
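The statement that Berkson-type error leaves the estimated association unbiased can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the study's analysis: personal exposure is taken to equal the monitor reading plus independent noise, the outcome is linear in personal exposure, and all numerical values (slope, noise levels, sample size, seed) are invented.

```python
import random
from typing import Sequence


def slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Slope of the simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 20000
true_slope = 0.004  # hypothetical change in methylation per ug/m3

# Monitor reading shared by everyone on a given day (ug/m3).
measured = [rng.uniform(5.0, 35.0) for _ in range(n)]
# Berkson error: personal exposure scatters around the monitor value.
personal = [w + rng.gauss(0.0, 5.0) for w in measured]
# Outcome depends on the personal exposure, which is not observed in practice.
outcome = [0.5 + true_slope * x + rng.gauss(0.0, 0.02) for x in personal]

slope_from_monitor = slope(measured, outcome)   # what a fixed-site study can estimate
slope_from_personal = slope(personal, outcome)  # unattainable reference
```

Both slopes should land close to the true value. The regression on the monitor values carries extra residual scatter from the unobserved personal deviations, which affects the precision of the estimate rather than its expected value.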

Conclusions
In conclusion, in this epigenome-wide investigation of CpG dinucleotide methylation, we highlighted several CpG sites associated with cumulative exposure to ambient particles over periods of up to a month. The number of significant associations tended to increase with the length of the averaging period, and the majority showed an increase in methylation. The identified genetic loci suggest novel biological pathways that may link ambient particulate matter to health outcomes such as tumor development and also to gene regulation, inflammatory stimuli, pulmonary disorders and glucose metabolism. Future mechanistic studies are needed to establish whether these epigenetic changes could explain the evidence found for ambient fine particles and lung cancer incidence.